Search results for: predictive data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25287

Search results for: predictive data mining

24297 Assessment of Chromium Concentration and Human Health Risk in the Steelpoort River Sub-Catchment of the Olifants River Basin, South Africa

Authors: Abraham Addo-Bediako

Abstract:

Many freshwater ecosystems are facing immense pressure from anthropogenic activities, such as agricultural, industrial and mining. Trace metal pollution in freshwater ecosystems has become an issue of public health concern due to its toxicity and persistence in the environment. Trace elements pose a serious risk not only to the environment and aquatic biota but also humans. Chromium is one of such trace elements and its pollution in surface waters and groundwaters represents a serious environmental problem. In South Africa, agriculture, mining, industrial and domestic wastes are the main contributors to chromium discharge in rivers. The common forms of chromium are chromium (III) and chromium (VI). The latter is the most toxic because it can cause damage to human health. The aim of the study was to assess the contamination of chromium in the water and sediments of two rivers in the Steelpoort River sub-catchment of the Olifants River Basin, South Africa and human health risk. The concentration of Cr was analyzed using inductively coupled plasma–optical emission spectrometry (ICP-OES). The concentration of the metal was found to exceed the threshold limit, mainly in areas of high human activities. The hazard quotient through ingestion exposure did not exceed the threshold limit of 1 for adults and children and cancer risk for adults and children computed did not exceed the threshold limit of 10-4. Thus, there is no potential health risk from chromium through ingestion of drinking water for now. However, with increasing human activities, especially mining, the concentration could increase and become harmful to humans who depend on rivers for drinking water. It is recommended that proper management strategies should be taken to minimize the impact of chromium on the rivers and water from the rivers should properly be treated before domestic use.

Keywords: land use, health risk, metal pollution, water quality

Procedia PDF Downloads 65
24296 Enhancing Project Performance Forecasting using Machine Learning Techniques

Authors: Soheila Sadeghi

Abstract:

Accurate forecasting of project performance metrics is crucial for successfully managing and delivering urban road reconstruction projects. Traditional methods often rely on static baseline plans and fail to consider the dynamic nature of project progress and external factors. This research proposes a machine learning-based approach to forecast project performance metrics, such as cost variance and earned value, for each Work Breakdown Structure (WBS) category in an urban road reconstruction project. The proposed model utilizes time series forecasting techniques, including Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) networks, to predict future performance based on historical data and project progress. The model also incorporates external factors, such as weather patterns and resource availability, as features to enhance the accuracy of forecasts. By applying the predictive power of machine learning, the performance forecasting model enables proactive identification of potential deviations from the baseline plan, which allows project managers to take timely corrective actions. The research aims to validate the effectiveness of the proposed approach using a case study of an urban road reconstruction project, comparing the model's forecasts with actual project performance data. The findings of this research contribute to the advancement of project management practices in the construction industry, offering a data-driven solution for improving project performance monitoring and control.

Keywords: project performance forecasting, machine learning, time series forecasting, cost variance, earned value management

Procedia PDF Downloads 23
24295 Three-Stage Mining Metals Supply Chain Coordination and Product Quality Improvement with Revenue Sharing Contract

Authors: Hamed Homaei, Iraj Mahdavi, Ali Tajdin

Abstract:

One of the main concerns of miners is to increase the quality level of their products because the mining metals price depends on their quality level; however, increasing the quality level of these products has different costs at different levels of the supply chain. These costs usually increase after extractor level. This paper studies the coordination issue of a decentralized three-level supply chain with one supplier (extractor), one mineral processor and one manufacturer in which the increasing product quality level cost at the processor level is higher than the supplier and at the level of the manufacturer is more than the processor. We identify the optimal product quality level for each supply chain member by designing a revenue sharing contract. Finally, numerical examples show that the designed contract not only increases the final product quality level but also provides a win-win condition for all supply chain members and increases the whole supply chain profit.

Keywords: three-stage supply chain, product quality improvement, channel coordination, revenue sharing

Procedia PDF Downloads 170
24294 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor

Authors: Tayyaba Azim, Bibi Amina

Abstract:

The increase in the popularity of opinion mining with the rapid growth in the availability of social networks has attracted a lot of opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet for analyzing an individual’s opinion and explore the effectiveness of published facts. The main theme of this research is to account the public opinion on the most crucial and extensively discussed development projects, China Pakistan Economic Corridor (CPEC), considered as a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through the ML approach. This research aims to demonstrate the use of ML approaches to spontaneously analyze the public sentiment on Twitter tweets particularly about CPEC. Support Vector Machine SVM is used for classification task classifying tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, a comparison of the trained model on manually labelled tweets and automatically generated lexicon is performed. The contributions of this work are: Development of a sentiment analysis system for public tweets on CPEC subject, construction of an automatic generation of the lexicon of public tweets on CPEC, different themes are identified among tweets and sentiments are assigned to each theme. It is worth noting that the applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not been explored and practised in Pakistan region on CPEC yet.

Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec

Procedia PDF Downloads 132
24293 Patterns in Fish Diversity and Abundance of an Abandoned Gold Mine Reservoirs

Authors: O. E. Obayemi, M. A. Ayoade, O. O. Komolafe

Abstract:

Fish survey was carried out for an annual cycle covering both rainy and dry seasons using cast nets, gill nets and traps at two different reservoirs. The objective was to examined the fish assemblages of the reservoirs and provide more additional information on the reservoir. The fish species in the reservoirs comprised of twelve species of six families. The results of the study also showed that five species of fish were caught in reservoir five while ten fish species were captured in reservoir six. Species such as Malapterurus electricus, Ctenopoma kingsleyae, Mormyrus rume, Parachanna obscura, Sarotherodon galilaeus, Tilapia mariae, C. guntheri, Clarias macromystax, Coptodon zilii and Clarias gariepinus were caught during the sampling period. There was a significant difference (p=0.014, t = 1.711) in the abundance of fish species in the two reservoirs. Seasonally, reservoirs five (p=0.221, t = 1.859) and six (p=0.453, t = 1.734) showed there was no significant difference in their fish populations. Also, despite being impacted with gold mining the diversity indices were high when compared to less disturbed waterbodies. The study concluded that the environments recorded low abundant fish species which suggests the influence of mining on the abundance and diversity of fish species.

Keywords: Igun, fish, Shannon-Wiener Index, Simpson index, Pielou index

Procedia PDF Downloads 77
24292 The Structure and Function Investigation and Analysis of the Automatic Spin Regulator (ASR) in the Powertrain System of Construction and Mining Machines with the Focus on Dump Trucks

Authors: Amir Mirzaei

Abstract:

The powertrain system is one of the most basic and essential components in a machine. The occurrence of motion is practically impossible without the presence of this system. When power is generated by the engine, it is transmitted by the powertrain system to the wheels, which are the last parts of the system. Powertrain system has different components according to the type of use and design. When the force generated by the engine reaches to the wheels, the amount of frictional force between the tire and the ground determines the amount of traction and non-slip or the amount of slip. At various levels, such as icy, muddy, and snow-covered ground, the amount of friction coefficient between the tire and the ground decreases dramatically and considerably, which in turn increases the amount of force loss and the vehicle traction decreases drastically. This condition is caused by the phenomenon of slipping, which, in addition to the waste of energy produced, causes the premature wear of driving tires. It also causes the temperature of the transmission oil to rise too much, as a result, causes a reduction in the quality and become dirty to oil and also reduces the useful life of the clutches disk and plates inside the transmission. this issue is much more important in road construction and mining machinery than passenger vehicles and is always one of the most important and significant issues in the design discussion, in order to overcome. One of these methods is the automatic spin regulator system which is abbreviated as ASR. The importance of this method and its structure and function have solved one of the biggest challenges of the powertrain system in the field of construction and mining machinery. That this research is examined.

Keywords: automatic spin regulator, ASR, methods of reducing slipping, methods of preventing the reduction of the useful life of clutches disk and plate, methods of preventing the premature dirtiness of transmission oil, method of preventing the reduction of the useful life of tires

Procedia PDF Downloads 65
24291 Geochemical Baseline and Origin of Trace Elements in Soils and Sediments around Selibe-Phikwe Cu-Ni Mining Town, Botswana

Authors: Fiona S. Motswaiso, Kengo Nakamura, Takeshi Komai

Abstract:

Heavy metals may occur naturally in rocks and soils, but elevated quantities of them are being gradually released into the environment by anthropogenic activities such as mining. In order to address issues of heavy metal water and soil pollution, a distinction needs to be made between natural and anthropogenic anomalies. The current study aims at characterizing the spatial distribution of trace elements and evaluate site-specific geochemical background concentrations of trace elements in the mine soils examined, and also to discriminate between lithogenic and anthropogenic sources of enrichment around a copper-nickel mining town in Selibe-Phikwe, Botswana. A total of 20 Soil samples, 11 river sediment, and 9 river water samples were collected from an area of 625m² within the precincts of the mine and the smelter. The concentrations of metals (Cu, Ni, Pb, Zn, Cr, Ni, Mn, As, Pb, and Co) were determined by using an ICP-MS after digestion with aqua regia. Major elements were also determined using ED-XRF. Water pH and EC were measured on site and recorded while soil pH and EC were also determined in the laboratory after performing water elution tests. The highest Cu and Ni concentrations in soil are 593mg/kg and 453mg/kg respectively, which is 3 times higher than the crustal composition values and 2 times higher than the South African minimum allowable levels of heavy metals in soils. The level of copper contamination was higher than that of nickel and other contaminants. Water pH levels ranged from basic (9) to very acidic (3) in areas closer to the mine/smelter. There is high variation in heavy metal concentration, eg. Cu suggesting that some sites depict regional natural background concentrations while other depict anthropogenic sources.

Keywords: contamination, geochemical baseline, heavy metals, soils

Procedia PDF Downloads 138
24290 Risk Assessment of Trace Metals in the Soil Surface of an Abandoned Mine, El-Abed Northwestern Algeria

Authors: Farida Mellah, Abdelhak Boutaleb, Bachir Henni, Dalila Berdous, Abdelhamid Mellah

Abstract:

Context/Purpose: One of the largest mining operations for lead and zinc deposits in northwestern Algeria in more than thirty years, El Abed is now the abandoned mine that has been inactive since 2004, leaving large amounts of accumulated mining waste under the influence of Wind, erosion, rain, and near agricultural lands. Materials & Methods: This study aims to verify the concentrations and sources of heavy metals for surface samples containing randomly taken soil. Chemical analyses were performed using iCAP 7000 Series ICP-optical emission spectrometer, using a set of environmental quality indicators by calculating the enrichment factor using iron and aluminum references, geographic accumulation index and geographic information system (GIS). On the basis of the spatial distribution. Results: The results indicated that the average metal concentration was: (As = 30,82),(Pb = 1219,27), (Zn = 2855,94), (Cu = 5,3), mg/Kg,based on these results, all metals except Cu passed by GBV in the Earth's crust. Environmental quality indicators were calculated based on the concentrations of trace metals such as lead, arsenic, zinc, copper, iron and aluminum. Interpretation: This study investigated the concentrations and sources of trace metals, and by using quality indicators and statistical methods, lead, zinc, and arsenic were determined from human sources, while copper was a natural source. And based on the spatial analysis on the basis of GIS, many hot spots were identified in the El-Abed region. Conclusion: These results could help in the development of future treatment strategies aimed primarily at eliminating materials from mining waste.

Keywords: soil contamination, trace metals, geochemical indices, El Abed mine, Algeria

Procedia PDF Downloads 55
24289 An ANN-Based Predictive Model for Diagnosis and Forecasting of Hypertension

Authors: Obe Olumide Olayinka, Victor Balanica, Eugen Neagoe

Abstract:

The effects of hypertension are often lethal thus its early detection and prevention is very important for everybody. In this paper, a neural network (NN) model was developed and trained based on a dataset of hypertension causative parameters in order to forecast the likelihood of occurrence of hypertension in patients. Our research goal was to analyze the potential of the presented NN to predict, for a period of time, the risk of hypertension or the risk of developing this disease for patients that are or not currently hypertensive. The results of the analysis for a given patient can support doctors in taking pro-active measures for averting the occurrence of hypertension such as recommendations regarding the patient behavior in order to lower his hypertension risk. Moreover, the paper envisages a set of three example scenarios in order to determine the age when the patient becomes hypertensive, i.e. determine the threshold for hypertensive age, to analyze what happens if the threshold hypertensive age is set to a certain age and the weight of the patient if being varied, and, to set the ideal weight for the patient and analyze what happens with the threshold of hypertensive age.

Keywords: neural network, hypertension, data set, training set, supervised learning

Procedia PDF Downloads 377
24288 Diabetes Mellitus and Blood Glucose Variability Increases the 30-day Readmission Rate after Kidney Transplantation

Authors: Harini Chakkera

Abstract:

Background: Inpatient hyperglycemia is an established independent risk factor among several patient cohorts with hospital readmission. This has not been studied after kidney transplantation. Nearly one-third of patients who have undergone a kidney transplant reportedly experience 30-day readmission. Methods: Data on first-time solitary kidney transplantations were retrieved between September 2015 to December 2018. Information was linked to the electronic health record to determine a diagnosis of diabetes mellitus and extract glucometeric and insulin therapy data. Univariate logistic regression analysis and the XGBoost algorithm were used to predict 30-day readmission. We report the average performance of the models on the testing set on five bootstrapped partitions of the data to ensure statistical significance. Results: The cohort included 1036 patients who received kidney transplantation, and 224 (22%) experienced 30-day readmission. The machine learning algorithm was able to predict 30-day readmission with an average AUC of 77.3% (95% CI 75.30-79.3%). We observed statistically significant differences in the presence of pretransplant diabetes, inpatient-hyperglycemia, inpatient-hypoglycemia, and minimum and maximum glucose values among those with higher 30-day readmission rates. The XGBoost model identified the index admission length of stay, presence of hyper- and hypoglycemia and recipient and donor BMI values as the most predictive risk factors of 30-day readmission. Additionally, significant variations in the therapeutic management of blood glucose by providers were observed. Conclusions: Suboptimal glucose metrics during hospitalization after kidney transplantation is associated with an increased risk for 30-day hospital readmission. Optimizing the hospital blood glucose management, a modifiable factor, after kidney transplantation may reduce the risk of 30-day readmission.

Keywords: kidney, transplant, diabetes, insulin

Procedia PDF Downloads 66
24287 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data

Authors: Muthukumarasamy Govindarajan

Abstract:

Detection of unwanted, unsolicited mails called spam from email is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient email filtering approach based on ensemble methods is addressed for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM) and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier was performing with accuracy greater than individual classifiers, and also hybrid model results are found to be better than the combined models for the e-mail dataset. The proposed ensemble-based classifiers turn out to be good in terms of classification accuracy, which is considered to be an important criterion for building a robust spam classifier.

Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine

Procedia PDF Downloads 123
24286 Predicting Radioactive Waste Glass Viscosity, Density and Dissolution with Machine Learning

Authors: Joseph Lillington, Tom Gout, Mike Harrison, Ian Farnan

Abstract:

The vitrification of high-level nuclear waste within borosilicate glass and its incorporation within a multi-barrier repository deep underground is widely accepted as the preferred disposal method. However, for this to happen, any safety case will require validation that the initially localized radionuclides will not be considerably released into the near/far-field. Therefore, accurate mechanistic models are necessary to predict glass dissolution, and these should be robust to a variety of incorporated waste species and leaching test conditions, particularly given substantial variations across international waste-streams. Here, machine learning is used to predict glass material properties (viscosity, density) and glass leaching model parameters from large-scale industrial data. A variety of different machine learning algorithms have been compared to assess performance. Density was predicted solely from composition, whereas viscosity additionally considered temperature. To predict suitable glass leaching model parameters, a large simulated dataset was created by coupling MATLAB and the chemical reactive-transport code HYTEC, considering the state-of-the-art GRAAL model (glass reactivity in allowance of the alteration layer). The trained models were then subsequently applied to the large-scale industrial, experimental data to identify potentially appropriate model parameters. Results indicate that ensemble methods can accurately predict viscosity as a function of temperature and composition across all three industrial datasets. Glass density prediction shows reliable learning performance with predictions primarily being within the experimental uncertainty of the test data. Furthermore, machine learning can predict glass dissolution model parameters behavior, demonstrating potential value in GRAAL model development and in assessing suitable model parameters for large-scale industrial glass dissolution data.

Keywords: machine learning, predictive modelling, pattern recognition, radioactive waste glass

Procedia PDF Downloads 102
24285 Quantum Statistical Machine Learning and Quantum Time Series

Authors: Omar Alzeley, Sergey Utev

Abstract:

Minimizing a constrained multivariate function is the fundamental of Machine learning, and these algorithms are at the core of data mining and data visualization techniques. The decision function that maps input points to output points is based on the result of optimization. This optimization is the central of learning theory. One approach to complex systems where the dynamics of the system is inferred by a statistical analysis of the fluctuations in time of some associated observable is time series analysis. The purpose of this paper is a mathematical transition from the autoregressive model of classical time series to the matrix formalization of quantum theory. Firstly, we have proposed a quantum time series model (QTS). Although Hamiltonian technique becomes an established tool to detect a deterministic chaos, other approaches emerge. The quantum probabilistic technique is used to motivate the construction of our QTS model. The QTS model resembles the quantum dynamic model which was applied to financial data. Secondly, various statistical methods, including machine learning algorithms such as the Kalman filter algorithm, are applied to estimate and analyses the unknown parameters of the model. Finally, simulation techniques such as Markov chain Monte Carlo have been used to support our investigations. The proposed model has been examined by using real and simulated data. We establish the relation between quantum statistical machine and quantum time series via random matrix theory. It is interesting to note that the primary focus of the application of QTS in the field of quantum chaos was to find a model that explain chaotic behaviour. Maybe this model will reveal another insight into quantum chaos.

Keywords: machine learning, simulation techniques, quantum probability, tensor product, time series

Procedia PDF Downloads 451
24284 Visual Template Detection and Compositional Automatic Regular Expression Generation for Business Invoice Extraction

Authors: Anthony Proschka, Deepak Mishra, Merlyn Ramanan, Zurab Baratashvili

Abstract:

Small and medium-sized businesses receive over 160 billion invoices every year. Since these documents exhibit many subtle differences in layout and text, extracting structured fields such as sender name, amount, and VAT rate from them automatically is an open research question. In this paper, existing work in template-based document extraction is extended, and a system is devised that is able to reliably extract all required fields for up to 70% of all documents in the data set, more than any other previously reported method. The approaches are described for 1) detecting through visual features which template a given document belongs to, 2) automatically generating extraction rules for a given new template by composing regular expressions from multiple components, and 3) computing confidence scores that indicate the accuracy of the automatic extractions. The system can generate templates with as little as one training sample and only requires the ground truth field values instead of detailed annotations such as bounding boxes that are hard to obtain. The system is deployed and used inside a commercial accounting software.

Keywords: data mining, information retrieval, business, feature extraction, layout, business data processing, document handling, end-user trained information extraction, document archiving, scanned business documents, automated document processing, F1-measure, commercial accounting software

Procedia PDF Downloads 110
24283 Bayesian Prospective Detection of Small Area Health Anomalies Using Kullback Leibler Divergence

Authors: Chawarat Rotejanaprasert, Andrew Lawson

Abstract:

Early detection of unusual health events depends on the ability to detect rapidly any substantial changes in disease, thus facilitating timely public health interventions. To assist public health practitioners to make decisions, statistical methods are adopted to assess unusual events in real time. We introduce a surveillance Kullback-Leibler (SKL) measure for timely detection of disease outbreaks for small area health data. The detection methods are compared with the surveillance conditional predictive ordinate (SCPO) within the framework of Bayesian hierarchical Poisson modeling and applied to a case study of a group of respiratory system diseases observed weekly in South Carolina counties. Properties of the proposed surveillance techniques including timeliness and detection precision are investigated using a simulation study.

Keywords: Bayesian, spatial, temporal, surveillance, prospective

Procedia PDF Downloads 297
24282 From Text to Data: Sentiment Analysis of Presidential Election Political Forums

Authors: Sergio V Davalos, Alison L. Watkins

Abstract:

User generated content (UGC) such as website post has data associated with it: time of the post, gender, location, type of device, and number of words. The text entered in user generated content (UGC) can provide a valuable dimension for analysis. In this research, each user post is treated as a collection of terms (words). In addition to the number of words per post, the frequency of each term is determined by post and by the sum of occurrences in all posts. This research focuses on one specific aspect of UGC: sentiment. Sentiment analysis (SA) was applied to the content (user posts) of two sets of political forums related to the US presidential elections for 2012 and 2016. Sentiment analysis results in deriving data from the text. This enables the subsequent application of data analytic methods. The SASA (SAIL/SAI Sentiment Analyzer) model was used for sentiment analysis. The application of SASA resulted with a sentiment score for each post. Based on the sentiment scores for the posts there are significant differences between the content and sentiment of the two sets for the 2012 and 2016 presidential election forums. In the 2012 forums, 38% of the forums started with positive sentiment and 16% with negative sentiment. In the 2016 forums, 29% started with positive sentiment and 15% with negative sentiment. There also were changes in sentiment over time. For both elections as the election got closer, the cumulative sentiment score became negative. The candidate who won each election was in the more posts than the losing candidates. In the case of Trump, there were more negative posts than Clinton’s highest number of posts which were positive. KNIME topic modeling was used to derive topics from the posts. There were also changes in topics and keyword emphasis over time. Initially, the political parties were the most referenced and as the election got closer the emphasis changed to the candidates. The performance of the SASA method proved to predict sentiment better than four other methods in Sentibench. The research resulted in deriving sentiment data from text. In combination with other data, the sentiment data provided insight and discovery about user sentiment in the US presidential elections for 2012 and 2016.

Keywords: sentiment analysis, text mining, user generated content, US presidential elections

Procedia PDF Downloads 171
24281 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does Data. Utilizing massive data sets enable companies to get competitive advantages over their adversaries. Out of many area of Big Data usage, logistics has significance role in both commercial sector and military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 626
24280 Comparison of Various Policies under Different Maintenance Strategies on a Multi-Component System

Authors: Demet Ozgur-Unluakin, Busenur Turkali, Ayse Karacaorenli

Abstract:

Maintenance strategies can be classified into two types, which are reactive and proactive, with respect to the time of the failure and maintenance. If the maintenance activity is done after a breakdown, it is called reactive maintenance. On the other hand, proactive maintenance, which is further divided as preventive and predictive, focuses on maintaining components before a failure occurs to prevent expensive halts. Recently, the number of interacting components in a system has increased rapidly and therefore, the structure of the systems have become more complex. This situation has made it difficult to provide the right maintenance decisions. Herewith, determining effective decisions has played a significant role. In multi-component systems, many methodologies and strategies can be applied when a component or a system has already broken down or when it is desired to identify and avoid proactively defects that could lead to future failure. This study focuses on the comparison of various maintenance strategies on a multi-component dynamic system. Components in the system are hidden, although there exists partial observability to the decision maker and they deteriorate in time. Several predefined policies under corrective, preventive and predictive maintenance strategies are considered to minimize the total maintenance cost in a planning horizon. The policies are simulated via Dynamic Bayesian Networks on a multi-component system with different policy parameters and cost scenarios, and their performances are evaluated. Results show that when the difference between the corrective and proactive maintenance cost is low, none of the proactive maintenance policies is significantly better than the corrective maintenance. However, when the difference is increased, at least one policy parameter for each proactive maintenance strategy gives significantly lower cost than the corrective maintenance.

Keywords: decision making, dynamic Bayesian networks, maintenance, multi-component systems, reliability

Procedia PDF Downloads 115
24279 Modelling Spatial Dynamics of Terrorism

Authors: André Python

Abstract:

To this day, terrorism persists as a worldwide threat, exemplified by the recent deadly attacks in January 2015 in Paris and the ongoing massacres perpetrated by ISIS in Iraq and Syria. In response to this threat, states deploy various counterterrorism measures, the cost of which could be reduced through effective preventive measures. In order to increase the efficiency of preventive measures, policy-makers may benefit from accurate predictive models that are able to capture the complex spatial dynamics of terrorism occurring at a local scale. Despite empirical research carried out at country-level that has confirmed theories explaining the diffusion processes of terrorism across space and time, scholars have failed to assess diffusion’s theories on a local scale. Moreover, since scholars have not made the most of recent statistical modelling approaches, they have been unable to build up predictive models accurate in both space and time. In an effort to address these shortcomings, this research suggests a novel approach to systematically assess the theories of terrorism’s diffusion on a local scale and provide a predictive model of the local spatial dynamics of terrorism worldwide. With a focus on the lethal terrorist events that occurred after 9/11, this paper addresses the following question: why and how does lethal terrorism diffuse in space and time? Based on geolocalised data on worldwide terrorist attacks and covariates gathered from 2002 to 2013, a binomial spatio-temporal point process is used to model the probability of terrorist attacks on a sphere (the world), the surface of which is discretised in the form of Delaunay triangles and refined in areas of specific interest. Within a Bayesian framework, the model is fitted through an integrated nested Laplace approximation - a recent fitting approach that computes fast and accurate estimates of posterior marginals. Hence, for each location in the world, the model provides a probability of encountering a lethal terrorist attack and measures of volatility, which inform on the model’s predictability. Diffusion processes are visualised through interactive maps that highlight space-time variations in the probability and volatility of encountering a lethal attack from 2002 to 2013. Based on the previous twelve years of observation, the location and lethality of terrorist events in 2014 are statistically accurately predicted. Throughout the global scope of this research, local diffusion processes such as escalation and relocation are systematically examined: the former process describes an expansion from high concentration areas of lethal terrorist events (hotspots) to neighbouring areas, while the latter is characterised by changes in the location of hotspots. By controlling for the effect of geographical, economical and demographic variables, the results of the model suggest that the diffusion processes of lethal terrorism are jointly driven by contagious and non-contagious factors that operate on a local scale – as predicted by theories of diffusion. Moreover, by providing a quantitative measure of predictability, the model prevents policy-makers from making decisions based on highly uncertain predictions. Ultimately, this research may provide important complementary tools to enhance the efficiency of policies that aim to prevent and combat terrorism.

Keywords: diffusion process, terrorism, spatial dynamics, spatio-temporal modeling

Procedia PDF Downloads 337
24278 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu

Authors: Ammarah Irum, Muhammad Ali Tahir

Abstract:

Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well-suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representation from Transformer (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing BiLSTM and CNN architecture. The proposed and baseline techniques are applied on Urdu Customer Support data set and IMDB Urdu movie review data set by using pre-trained Urdu word embedding that are suitable for sentiment analysis at the document level. Results of these techniques are evaluated and our proposed model outperforms all other deep learning techniques for Urdu sentiment analysis. BiLSTM-SLMFCNN outperformed the baseline deep learning models and achieved 83%, 79%, 83% and 94% accuracy on small, medium and large sized IMDB Urdu movie review data set and Urdu Customer Support data set respectively.

Keywords: urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language

Procedia PDF Downloads 51
24277 Redefining the Croatian Economic Sentiment Indicator

Authors: Ivana Lolic, Petar Soric, Mirjana Cizmesija

Abstract:

Based on Business and Consumer Survey (BCS) data, the European Commission (EC) regularly publishes the monthly Economic Sentiment Indicator (ESI) for each EU member state. ESI is conceptualized as a leading indicator, aimed ad tracking the overall economic activity. In calculating ESI, the EC employs arbitrarily chosen weights on 15 BCS response balances. This paper raises the predictive quality of ESI by applying nonlinear programming to find such weights that maximize the correlation coefficient of ESI and year-on-year GDP growth. The obtained results show that the highest weights are assigned to the response balances of industrial sector questions, followed by questions from the retail trade sector. This comes as no surprise since the existing literature shows that the industrial production is a plausible proxy for the overall Croatian economic activity and since Croatian GDP is largely influenced by the aggregate personal consumption.

Keywords: business and consumer survey, economic sentiment indicator, leading indicator, nonlinear optimization with constraints

Procedia PDF Downloads 446
24276 Deep Learning for Qualitative and Quantitative Grain Quality Analysis Using Hyperspectral Imaging

Authors: Ole-Christian Galbo Engstrøm, Erik Schou Dreier, Birthe Møller Jespersen, Kim Steenstrup Pedersen

Abstract:

Grain quality analysis is a multi-parameterized problem that includes a variety of qualitative and quantitative parameters such as grain type classification, damage type classification, and nutrient regression. Currently, these parameters require human inspection, a multitude of instruments employing a variety of sensor technologies, and predictive model types or destructive and slow chemical analysis. This paper investigates the feasibility of applying near-infrared hyperspectral imaging (NIR-HSI) to grain quality analysis. For this study two datasets of NIR hyperspectral images in the wavelength range of 900 nm - 1700 nm have been used. Both datasets contain images of sparsely and densely packed grain kernels. The first dataset contains ~87,000 image crops of bulk wheat samples from 63 harvests where protein value has been determined by the FOSS Infratec NOVA which is the golden industry standard for protein content estimation in bulk samples of cereal grain. The second dataset consists of ~28,000 image crops of bulk grain kernels from seven different wheat varieties and a single rye variety. In the first dataset, protein regression analysis is the problem to solve while variety classification analysis is the problem to solve in the second dataset. Deep convolutional neural networks (CNNs) have the potential to utilize spatio-spectral correlations within a hyperspectral image to simultaneously estimate the qualitative and quantitative parameters. CNNs can autonomously derive meaningful representations of the input data reducing the need for advanced preprocessing techniques required for classical chemometric model types such as artificial neural networks (ANNs) and partial least-squares regression (PLS-R). A comparison between different CNN architectures utilizing 2D and 3D convolution is conducted. These results are compared to the performance of ANNs and PLS-R. Additionally, a variety of preprocessing techniques from image analysis and chemometrics are tested. These include centering, scaling, standard normal variate (SNV), Savitzky-Golay (SG) filtering, and detrending. The results indicate that the combination of NIR-HSI and CNNs has the potential to be the foundation for an automatic system unifying qualitative and quantitative grain quality analysis within a single sensor technology and predictive model type.

Keywords: deep learning, grain analysis, hyperspectral imaging, preprocessing techniques

Procedia PDF Downloads 82
24275 Role of Macro and Technical Indicators in Equity Risk Premium Prediction: A Principal Component Analysis Approach

Authors: Naveed Ul Hassan, Bilal Aziz, Maryam Mushtaq, Imran Ameen Khan

Abstract:

Equity risk premium (ERP) is the stock return in excess of risk free return. Even though it is an essential topic of finance but still there is no common consensus upon its forecasting. For forecasting ERP, apart from the macroeconomic variables attention is devoted to technical indicators as well. For this purpose, set of 14 technical and 14 macro-economic variables is selected and all forecasts are generated based on a standard predictive regression framework, where ERP is regressed on a constant and a lag of a macroeconomic variable or technical indicator. The comparative results showed that technical indicators provide better indications about ERP estimates as compared to macro-economic variables. The relative strength of ERP predictability is also investigated by using National Bureau of Economic Research (NBER) data of business cycle expansion and recessions and found that ERP predictability is more than twice for recessions as compared to expansions.

Keywords: equity risk premium, forecasting, macroeconomic indicators, technical indicators

Procedia PDF Downloads 290
24274 Modelling of Moisture Loss and Oil Uptake during Deep-Fat Frying of Plantain

Authors: James A. Adeyanju, John O. Olajide, Akinbode A. Adedeji

Abstract:

A predictive mathematical model based on the fundamental principles of mass transfer was developed to simulate the moisture content and oil content during Deep-Fat Frying (DFF) process of dodo. The resulting governing equation, that is, partial differential equation that describes rate of moisture loss and oil uptake was solved numerically using explicit Finite Difference Technique (FDT). Computer codes were written in MATLAB environment for the implementation of FDT at different frying conditions and moisture loss as well as oil uptake simulation during DFF of dodo. Plantain samples were sliced into 5 mm thickness and fried at different frying oil temperatures (150, 160 and 170 ⁰C) for periods varying from 2 to 4 min. The comparison between the predicted results and experimental data for the validation of the model showed reasonable agreement. The correlation coefficients between the predicted and experimental values of moisture and oil transfer models ranging from 0.912 to 0.947 and 0.895 to 0.957, respectively. The predicted results could be further used for the design, control and optimization of deep-fat frying process.

Keywords: frying, moisture loss, modelling, oil uptake

Procedia PDF Downloads 427
24273 The Concentration of Natural Alpha Emitters Radionuclides in Fish and Their Contribution to the Internal Dose

Authors: Wagner Pereira, Alphonse Kelecom

Abstract:

Mining can impact the environment, and the major impact of some mining activities is the radiological impact. In human populations, such impact is well studied and regulated. For biota, this assessment always had as focus the protection of human food chain. The protection of biota itself is a new approach, still developing. In order to contribute to this new approach, fish collecting was carried out in areas of naturally occurring radioactive materials (NORM), where a uranium mine is in decommissioning phase. The activity concentrations were analyzed, in Bq/kg wet weight, for Uranium (Unat), Th-232 and Ra-226 in the lambari fish Astyanax bimaculatus L. (omnivorous fish) and in the traíra fish Hoplias malabaricus Bloch, 1794 (carnivorous fish). Seven composite samples (that is: a sufficient number of individuals to reach at least 2 kg of fresh weight) were collected every six months between 2013 and 2015. The mean activity concentrations (AC) for uranium ranged from 1.12 (lambari) to 0.60 (lungfish). For Th, variations ranged from 0.30 to 0.05 (lambari and traíra, respectively). Finally, the Ra-226 means ranged between 0.08 and 0.03. No temporal trends of accumulation could be identified. Systematically, the AC values of radionuclides were higher in omnivorous fish when compared to the carnivore ones.

Keywords: biota dose, NORM, fish, environmental protection

Procedia PDF Downloads 236
24272 The Role of HPV Status in Patients with Overlapping Grey Zone Cancer in Oral Cavity and Oropharynx

Authors: Yao Song

Abstract:

Objectives: We aimed to explore the clinicodemographic characteristics and prognosis of grey zone squamous cell cancer (GZSCC) located in the overlapping or ambiguous area of the oral cavity and oropharynx and to identify valuable factors that would improve its differential diagnosis and prognosis. Methods: Information of GZSCC patients in the Surveillance, Epidemiology, and End Results (SEER) database was compared to patients with an oral cavity (OCSCC) and oropharyngeal (OPSCC) squamous cell carcinomas with corresponding HPV status, respectively. Kaplan-Meier method with log-rank test and multivariate Cox regression analysis were applied to assess associations between clinical characteristics and overall survival (OS). A predictive model integrating age, gender, marital status, HPV status, and staging variables was conducted to classify GZSCC patients into three risk groups and verified internally by 10-fold cross validation. Results: A total of 3318 GZSCC, 10792 OPSCC, and 6656 OCSCC patients were identified. HPV-positive GZSCC patients had the best 5-year OS as HPV-positive OPSCC (81% vs. 82%). However, the 5-year OS of HPV-negative/unknown GZSCC (43%/42%) was the worst among all groups, indicating that HPV status and the overlapping nature of tumors were valuable prognostic predictors in GZSCC patients. Compared with the strategy of dividing GZSCC into two groups by HPV status, the predictive model integrating more variables could additionally identify a unique high-risk GZSCC group with the lowest OS rate. Conclusions: GZSCC patients had distinct clinical characteristics and prognoses compared with OPSCC and OCSCC; integrating HPV status and other clinical factors could help distinguish GZSCC and predict their prognosis.

Keywords: GZSCC, OCSCC, OPSCC, HPV

Procedia PDF Downloads 64
24271 Role of Imaging in Predicting the Receptor Positivity Status in Lung Adenocarcinoma: A Chapter in Radiogenomics

Authors: Sonal Sethi, Mukesh Yadav, Abhimanyu Gupta

Abstract:

The upcoming field of radiogenomics has the potential to upgrade the role of imaging in lung cancer management by noninvasive characterization of tumor histology and genetic microenvironment. Receptor positivity like epidermal growth factor receptor (EGFR) and anaplastic lymphoma kinase (ALK) genotyping are critical in lung adenocarcinoma for treatment. As conventional identification of receptor positivity is an invasive procedure, we analyzed the features on non-invasive computed tomography (CT), which predicts the receptor positivity in lung adenocarcinoma. Retrospectively, we did a comprehensive study from 77 proven lung adenocarcinoma patients with CT images, EGFR and ALK receptor genotyping, and clinical information. Total 22/77 patients were receptor-positive (15 had only EGFR mutation, 6 had ALK mutation, and 1 had both EGFR and ALK mutation). Various morphological characteristics and metastatic distribution on CT were analyzed along with the clinical information. Univariate and multivariable logistic regression analyses were used. On multivariable logistic regression analysis, we found spiculated margin, lymphangitic spread, air bronchogram, pleural effusion, and distant metastasis had a significant predictive value for receptor mutation status. On univariate analysis, air bronchogram and pleural effusion had significant individual predictive value. Conclusions: Receptor positive lung cancer has characteristic imaging features compared with nonreceptor positive lung adenocarcinoma. Since CT is routinely used in lung cancer diagnosis, we can predict the receptor positivity by a noninvasive technique and would follow a more aggressive algorithm for evaluation of distant metastases as well as for the treatment.

Keywords: lung cancer, multidisciplinary cancer care, oncologic imaging, radiobiology

Procedia PDF Downloads 109
24270 Biosorption of Gold from Chloride Media in a Simultaneous Adsorption-Reduction Process

Authors: Shafiq Alam, Yen Ning Lee

Abstract:

Conventional hydrometallurgical processing of metals involves the use of large quantities of toxic chemicals. Realizing a need to develop sustainable technologies, extensive research studies are being carried out to recover and recycle base, precious and rare earth metals from their pregnant leach solutions (PLS) using green chemicals/biomaterials prepared from biomass wastes derived from agriculture, marine and forest resources. Our innovative research showed that bio-adsorbents prepared from such biomass wastes can effectively adsorb precious metals, especially gold after conversion of their functional groups in a very simple process. The highly effective ‘Adsorption-coupled-Reduction’ phenomenon witnessed appears promising for the potential use of this gold biosorption process in the mining industry. Proper management and effective use of biomass wastes as value added green chemicals will not only reduce the volume of wastes being generated every day in our society, but will also have a high-end value to the mining and mineral processing industries as those biomaterials would be cheap, but very selective for gold recovery/recycling from low grade ore, leach residue or e-wastes.

Keywords: biosorption, hydrometallurgy, gold, adsorption, reduction, biomass, sustainability

Procedia PDF Downloads 361
24269 Copyright Clearance for Artificial Intelligence Training Data: Challenges and Solutions

Authors: Erva Akin

Abstract:

– The use of copyrighted material for machine learning purposes is a challenging issue in the field of artificial intelligence (AI). While machine learning algorithms require large amounts of data to train and improve their accuracy and creativity, the use of copyrighted material without permission from the authors may infringe on their intellectual property rights. In order to overcome copyright legal hurdle against the data sharing, access and re-use of data, the use of copyrighted material for machine learning purposes may be considered permissible under certain circumstances. For example, if the copyright holder has given permission to use the data through a licensing agreement, then the use for machine learning purposes may be lawful. It is also argued that copying for non-expressive purposes that do not involve conveying expressive elements to the public, such as automated data extraction, should not be seen as infringing. The focus of such ‘copy-reliant technologies’ is on understanding language rules, styles, and syntax and no creative ideas are being used. However, the non-expressive use defense is within the framework of the fair use doctrine, which allows the use of copyrighted material for research or educational purposes. The questions arise because the fair use doctrine is not available in EU law, instead, the InfoSoc Directive provides for a rigid system of exclusive rights with a list of exceptions and limitations. One could only argue that non-expressive uses of copyrighted material for machine learning purposes do not constitute a ‘reproduction’ in the first place. Nevertheless, the use of machine learning with copyrighted material is difficult because EU copyright law applies to the mere use of the works. Two solutions can be proposed to address the problem of copyright clearance for AI training data. The first is to introduce a broad exception for text and data mining, either mandatorily or for commercial and scientific purposes, or to permit the reproduction of works for non-expressive purposes. The second is that copyright laws should permit the reproduction of works for non-expressive purposes, which opens the door to discussions regarding the transposition of the fair use principle from the US into EU law. Both solutions aim to provide more space for AI developers to operate and encourage greater freedom, which could lead to more rapid innovation in the field. The Data Governance Act presents a significant opportunity to advance these debates. Finally, issues concerning the balance of general public interests and legitimate private interests in machine learning training data must be addressed. In my opinion, it is crucial that robot-creation output should fall into the public domain. Machines depend on human creativity, innovation, and expression. To encourage technological advancement and innovation, freedom of expression and business operation must be prioritised.

Keywords: artificial intelligence, copyright, data governance, machine learning

Procedia PDF Downloads 67
24268 Predictive Analysis of Personnel Relationship in Graph Database

Authors: Kay Thi Yar, Khin Mar Lar Tun

Abstract:

Nowadays, social networks are so popular and widely used in all over the world. In addition, searching personal information of each person and searching connection between them (peoples’ relation in real world) becomes interesting issue in our society. In this paper, we propose a framework with three portions for exploring peoples’ relations from their connected information. The first portion focuses on the Graph database structure to store the connected data of peoples’ information. The second one proposes the graph database searching algorithm, the Modified-SoS-ACO (Sense of Smell-Ant Colony Optimization). The last portion proposes the Deductive Reasoning Algorithm to define two persons’ relationship. This study reveals the proper storage structure for connected information, graph searching algorithm and deductive reasoning algorithm to predict and analyze the personnel relationship from peoples’ relation in their connected information.

Keywords: personnel information, graph storage structure, graph searching algorithm, deductive reasoning algorithm

Procedia PDF Downloads 436