Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2231

Search results for: prediction%20models

1031 Representativity Based Wasserstein Active Regression

Authors: Benjamin Bobbia, Matthias Picard

Abstract:

In recent years active learning methodologies based on the representativity of the data seems more promising to limit overfitting. The presented query methodology for regression using the Wasserstein distance measuring the representativity of our labelled dataset compared to the global distribution. In this work a crucial use of GroupSort Neural Networks is made therewith to draw a double advantage. The Wasserstein distance can be exactly expressed in terms of such neural networks. Moreover, one can provide explicit bounds for their size and depth together with rates of convergence. However, heterogeneity of the dataset is also considered by weighting the Wasserstein distance with the error of approximation at the previous step of active learning. Such an approach leads to a reduction of overfitting and high prediction performance after few steps of query. After having detailed the methodology and algorithm, an empirical study is presented in order to investigate the range of our hyperparameters. The performances of this method are compared, in terms of numbers of query needed, with other classical and recent query methods on several UCI datasets.

Keywords: active learning, Lipschitz regularization, neural networks, optimal transport, regression

Procedia PDF Downloads 80

1030 Extended Strain Energy Density Criterion for Fracture Investigation of Orthotropic Materials

Authors: Mahdi Fakoor, Hannaneh Manafi Farid

Abstract:

In order to predict the fracture behavior of cracked orthotropic materials under mixed-mode loading, well-known minimum strain energy density (SED) criterion is extended. The crack is subjected along the fibers at plane strain conditions. Despite the complicities to solve the nonlinear equations which are requirements of SED criterion, SED criterion for anisotropic materials is derived. In the present research, fracture limit curve of SED criterion is depicted by a numerical solution, hence the direction of crack growth is figured out by derived criterion, MSED. The validated MSED demonstrates the improvement in prediction of fracture behavior of the materials. Also, damaged factor that plays a crucial role in the fracture behavior of quasi-brittle materials is derived from this criterion and proved its dependency on mechanical properties and direction of crack growth.

Keywords: mixed-mode fracture, minimum strain energy density criterion, orthotropic materials, fracture limit curve, mode II critical stress intensity factor

Procedia PDF Downloads 165

1029 The Role and Importance of Genome Sequencing in Prediction of Cancer Risk

Authors: M. Sadeghi, H. Pezeshk, R. Tusserkani, A. Sharifi Zarchi, A. Malekpour, M. Foroughmand, S. Goliaei, M. Totonchi, N. Ansari–Pour

Abstract:

The role and relative importance of intrinsic and extrinsic factors in the development of complex diseases such as cancer still remains a controversial issue. Determining the amount of variation explained by these factors needs experimental data and statistical models. These models are nevertheless based on the occurrence and accumulation of random mutational events during stem cell division, thus rendering cancer development a stochastic outcome. We demonstrate that not only individual genome sequencing is uninformative in determining cancer risk, but also assigning a unique genome sequence to any given individual (healthy or affected) is not meaningful. Current whole-genome sequencing approaches are therefore unlikely to realize the promise of personalized medicine. In conclusion, since genome sequence differs from cell to cell and changes over time, it seems that determining the risk factor of complex diseases based on genome sequence is somewhat unrealistic, and therefore, the resulting data are likely to be inherently uninformative.

Keywords: cancer risk, extrinsic factors, genome sequencing, intrinsic factors

Procedia PDF Downloads 268

1028 A Resource Optimization Strategy for CPU (Central Processing Unit) Intensive Applications

Authors: Junjie Peng, Jinbao Chen, Shuai Kong, Danxu Liu

Abstract:

On the basis of traditional resource allocation strategies, the usage of resources on physical servers in cloud data center is great uncertain. It will cause waste of resources if the assignment of tasks is not enough. On the contrary, it will cause overload if the assignment of tasks is too much. This is especially obvious when the applications are the same type because of its resource preferences. Considering CPU intensive application is one of the most common types of application in the cloud, we studied the optimization strategy for CPU intensive applications on the same server. We used resource preferences to analyze the case that multiple CPU intensive applications run simultaneously, and put forward a model which can predict the execution time for CPU intensive applications which run simultaneously. Based on the prediction model, we proposed the method to select the appropriate number of applications for a machine. Experiments show that the model can predict the execution time accurately for CPU intensive applications. To improve the execution efficiency of applications, we propose a scheduling model based on priority for CPU intensive applications. Extensive experiments verify the validity of the scheduling model.

Keywords: cloud computing, CPU intensive applications, resource optimization, strategy

Procedia PDF Downloads 276

1027 Prediction of Maximum Inter-Story Drifts of Steel Frames Using Intensity Measures

Authors: Edén Bojórquez, Victor Baca, Alfredo Reyes-Salazar, Jorge González

Abstract:

In this paper, simplified equations to predict maximum inter-story drift demands of steel framed buildings are proposed in terms of two ground motion intensity measures based on the acceleration spectral shape. For this aim, the maximum inter-story drifts of steel frames with 4, 6, 8 and 10 stories subjected to narrow-band ground motion records are estimated and compared with the spectral acceleration at first mode of vibration Sa(T1) which is commonly used in earthquake engineering and seismology, and with a new parameter related with the structural response known as INp. It is observed that INp is the parameter best related with the structural response of steel frames under narrow-band motions. Finally, equations to compute maximum inter-story drift demands of steel frames as a function of spectral acceleration and INp are proposed.

Keywords: intensity measures, spectral shape, steel frames, peak demands

Procedia PDF Downloads 390

1026 Evaluating the Diagnostic Accuracy of the ctDNA Methylation for Liver Cancer

Authors: Maomao Cao

Abstract:

Objective: To test the performance of ctDNA methylation for the detection of liver cancer. Methods: A total of 1233 individuals have been recruited in 2017. 15 male and 15 female samples (including 10 cases of liver cancer) were randomly selected in the present study. CfDNA was extracted by MagPure Circulating DNA Maxi Kit. The concentration of cfDNA was obtained by Qubit™ dsDNA HS Assay Kit. A pre-constructed predictive model was used to analyze methylation data and to give a predictive score for each cfDNA sample. Individuals with a predictive score greater than or equal to 80 were classified as having liver cancer. CT tests were considered the gold standard. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for the diagnosis of liver cancer were calculated. Results: 9 patients were diagnosed with liver cancer according to the prediction model (with high sensitivity and threshold of 80 points), with scores of 99.2, 91.9, 96.6, 92.4, 91.3, 92.5, 96.8, 91.1, and 92.2, respectively. The sensitivity, specificity, positive predictive value, and negative predictive value of ctDNA methylation for the diagnosis of liver cancer were 0.70, 0.90, 0.78, and 0.86, respectively. Conclusions: ctDNA methylation could be an acceptable diagnostic modality for the detection of liver cancer.

Keywords: liver cancer, ctDNA methylation, detection, diagnostic performance

Procedia PDF Downloads 148

1025 Digital Platform of Crops for Smart Agriculture

Authors: Pascal François Faye, Baye Mor Sall, Bineta Dembele, Jeanne Ana Awa Faye

Abstract:

In agriculture, estimating crop yields is key to improving productivity and decision-making processes such as financial market forecasting and addressing food security issues. The main objective of this paper is to have tools to predict and improve the accuracy of crop yield forecasts using machine learning (ML) algorithms such as CART , KNN and SVM . We developed a mobile app and a web app that uses these algorithms for practical use by farmers. The tests show that our system (collection and deployment architecture, web application and mobile application) is operational and validates empirical knowledge on agro-climatic parameters in addition to proactive decision-making support. The experimental results obtained on the agricultural data, the performance of the ML algorithms are compared using cross-validation in order to identify the most effective ones following the agricultural data. The proposed applications demonstrate that the proposed approach is effective in predicting crop yields and provides timely and accurate responses to farmers for decision support.

Keywords: prediction, machine learning, artificial intelligence, digital agriculture

Procedia PDF Downloads 79

1024 Profit-Based Artificial Neural Network (ANN) Trained by Migrating Birds Optimization: A Case Study in Credit Card Fraud Detection

Authors: Ashkan Zakaryazad, Ekrem Duman

Abstract:

A typical classification technique ranks the instances in a data set according to the likelihood of belonging to one (positive) class. A credit card (CC) fraud detection model ranks the transactions in terms of probability of being fraud. In fact, this approach is often criticized, because firms do not care about fraud probability but about the profitability or costliness of detecting a fraudulent transaction. The key contribution in this study is to focus on the profit maximization in the model building step. The artificial neural network proposed in this study works based on profit maximization instead of minimizing the error of prediction. Moreover, some studies have shown that the back propagation algorithm, similar to other gradient–based algorithms, usually gets trapped in local optima and swarm-based algorithms are more successful in this respect. In this study, we train our profit maximization ANN using the Migrating Birds optimization (MBO) which is introduced to literature recently.

Keywords: neural network, profit-based neural network, sum of squared errors (SSE), MBO, gradient descent

Procedia PDF Downloads 473

1023 Transfer Learning for Protein Structure Classification at Low Resolution

Authors: Alexander Hudson, Shaogang Gong

Abstract:

Structure determination is key to understanding protein function at a molecular level. Whilst significant advances have been made in predicting structure and function from amino acid sequence, researchers must still rely on expensive, time-consuming analytical methods to visualise detailed protein conformation. In this study, we demonstrate that it is possible to make accurate (≥80%) predictions of protein class and architecture from structures determined at low (>3A) resolution, using a deep convolutional neural network trained on high-resolution (≤3A) structures represented as 2D matrices. Thus, we provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function. We investigate the impact of the input representation on classification performance, showing that side-chain information may not be necessary for fine-grained structure predictions. Finally, we confirm that high resolution, low-resolution and NMR-determined structures inhabit a common feature space, and thus provide a theoretical foundation for boosting with single-image super-resolution.

Keywords: transfer learning, protein distance maps, protein structure classification, neural networks

Procedia PDF Downloads 134

1022 Estimation of Coefficient of Discharge of Side Trapezoidal Labyrinth Weir Using Group Method of Data Handling Technique

Authors: M. A. Ansari, A. Hussain, A. Uddin

Abstract:

A side weir is a flow diversion structure provided in the side wall of a channel to divert water from the main channel to a branch channel. The trapezoidal labyrinth weir is a special type of weir in which crest length of the weir is increased to pass higher discharge. Experimental and numerical studies related to the coefficient of discharge of trapezoidal labyrinth weir in an open channel have been presented in the present study. Group Method of Data Handling (GMDH) with the transfer function of quadratic polynomial has been used to predict the coefficient of discharge for the side trapezoidal labyrinth weir. A new model is developed for coefficient of discharge of labyrinth weir by regression method. Generalized models for predicting the coefficient of discharge for labyrinth weir using Group Method of Data Handling (GMDH) network have also been developed. The prediction based on GMDH model is more satisfactory than those given by traditional regression equations.

Keywords: discharge coefficient, group method of data handling, open channel, side labyrinth weir

Procedia PDF Downloads 159

1021 Capability of Available Seismic Soil Liquefaction Potential Assessment Models Based on Shear-Wave Velocity Using Banchu Case History

Authors: Nima Pirhadi, Yong Bo Shao, Xusheng Wa, Jianguo Lu

Abstract:

Several models based on the simplified method introduced by Seed and Idriss (1971) have been developed to assess the liquefaction potential of saturated sandy soils. The procedure includes determining the cyclic resistance of the soil as the cyclic resistance ratio (CRR) and comparing it with earthquake loads as cyclic stress ratio (CSR). Of all methods to determine CRR, the methods using shear-wave velocity (Vs) are common because of their low sensitivity to the penetration resistance reduction caused by fine content (FC). To evaluate the capability of the models, based on the Vs., the new data from Bachu-Jianshi earthquake case history collected, then the prediction results of the models are compared to the measured results; consequently, the accuracy of the models are discussed via three criteria and graphs. The evaluation demonstrates reasonable accuracy of the models in the Banchu region.

Keywords: seismic liquefaction, banchu-jiashi earthquake, shear-wave velocity, liquefaction potential evaluation

Procedia PDF Downloads 235

1020 Instability Index Method and Logistic Regression to Assess Landslide Susceptibility in County Route 89, Taiwan

Authors: Y. H. Wu, Ji-Yuan Lin, Yu-Ming Liou

Abstract:

This study aims to set up the landslide susceptibility map of County Route 89 at Ren-Ai Township in Nantou County using the Instability Index Method and Logistic regression. Seven susceptibility factors including Slope Angle, Aspect, Elevation, Distance to fold, Distance to River, Distance to Road and Accumulated Rainfall were obtained by GIS based on the Typhoon Toraji landslide area identified by Industrial Technology Research Institute in 2001. To calculate the landslide percentage of each factor and acquire the weight and grade the grid by means of Instability Index Method. In this study, landslide susceptibility can be classified into four grades: high, medium high, medium low and low, in order to determine the advantages and disadvantages of the two models. The precision of this model is verified by classification error matrix and SRC curve. These results suggest that the logistic regression model is a preferred method than instability index in the assessment of landslide susceptibility. It is suitable for the landslide prediction and precaution in this area in the future.

Keywords: instability index method, logistic regression, landslide susceptibility, SRC curve

Procedia PDF Downloads 289

1019 Automatic Classification of Periodic Heart Sounds Using Convolutional Neural Network

Authors: Jia Xin Low, Keng Wah Choo

Abstract:

This paper presents an automatic normal and abnormal heart sound classification model developed based on deep learning algorithm. MITHSDB heart sounds datasets obtained from the 2016 PhysioNet/Computing in Cardiology Challenge database were used in this research with the assumption that the electrocardiograms (ECG) were recorded simultaneously with the heart sounds (phonocardiogram, PCG). The PCG time series are segmented per heart beat, and each sub-segment is converted to form a square intensity matrix, and classified using convolutional neural network (CNN) models. This approach removes the need to provide classification features for the supervised machine learning algorithm. Instead, the features are determined automatically through training, from the time series provided. The result proves that the prediction model is able to provide reasonable and comparable classification accuracy despite simple implementation. This approach can be used for real-time classification of heart sounds in Internet of Medical Things (IoMT), e.g. remote monitoring applications of PCG signal.

Keywords: convolutional neural network, discrete wavelet transform, deep learning, heart sound classification

Procedia PDF Downloads 346

1018 Urban Runoff Modeling of Ungauged Volcanic Catchment in Madinah, Western Saudi Arabia

Authors: Fahad Alahmadi, Norhan Abd Rahman, Mohammad Abdulrazzak, Zulikifli Yusop

Abstract:

Runoff prediction of ungauged catchment is still a challenging task especially in arid regions with a unique land cover such as volcanic basalt rocks where geological weathering and fractures are highly significant. In this study, Bathan catchment in Madinah western Saudi Arabia was selected for analysis. The aim of this paper is to evaluate different rainfall loss methods; soil conservation Services curve number (SCS-CN), green-ampt and initial-constant rate. Different direct runoff methods were evaluated: soil conservation services dimensionless unit hydrograph (SCS-UH), Snyder unit hydrograph and Clark unit hydrograph. The study showed the superiority of SCS-CN loss method and Clark unit hydrograph method for ungauged catchment where there is no observed runoff data.

Keywords: urban runoff modelling, arid regions, ungauged catchments, volcanic rocks, Madinah, Saudi Arabia

Procedia PDF Downloads 403

1017 On Hyperbolic Gompertz Growth Model (HGGM)

Authors: S. O. Oyamakin, A. U. Chukwu,

Abstract:

We proposed a Hyperbolic Gompertz Growth Model (HGGM), which was developed by introducing a stabilizing parameter called θ using hyperbolic sine function into the classical gompertz growth equation. The resulting integral solution obtained deterministically was reprogrammed into a statistical model and used in modeling the height and diameter of Pines (Pinus caribaea). Its ability in model prediction was compared with the classical gompertz growth model, an approach which mimicked the natural variability of height/diameter increment with respect to age and therefore provides a more realistic height/diameter predictions using goodness of fit tests and model selection criteria. The Kolmogorov-Smirnov test and Shapiro-Wilk test was also used to test the compliance of the error term to normality assumptions while using testing the independence of the error term using the runs test. The mean function of top height/Dbh over age using the two models under study predicted closely the observed values of top height/Dbh in the hyperbolic gompertz growth models better than the source model (classical gompertz growth model) while the results of R2, Adj. R2, MSE, and AIC confirmed the predictive power of the Hyperbolic Monomolecular growth models over its source model.

Keywords: height, Dbh, forest, Pinus caribaea, hyperbolic, gompertz

Procedia PDF Downloads 439

1016 First Principle Calculations of the Structural and Optoelectronic Properties of Cubic Perovskite CsSrF3

Authors: Meriem Harmel, Houari Khachai

Abstract:

We have investigated the structural, electronic and optical properties of a compound perovskite CsSrF3 using the full-potential linearized augmented plane wave (FP-LAPW) method within density functional theory (DFT). In this approach, both the local density approximation (LDA) and the generalized gradient approximation (GGA) were used for exchange-correlation potential calculation. The ground state properties such as lattice parameter, bulk modulus and its pressure derivative were calculated and the results are compared whit experimental and theoretical data. Electronic and bonding properties are discussed from the calculations of band structure, density of states and electron charge density, where the fundamental energy gap is direct under ambient conditions. The contribution of the different bands was analyzed from the total and partial density of states curves. The optical properties (namely: the real and the imaginary parts of the dielectric function ε(ω), the refractive index n(ω) and the extinction coefficient k(ω)) were calculated for radiation up to 35.0 eV. This is the first quantitative theoretical prediction of the optical properties for the investigated compound and still awaits experimental confirmations.

Keywords: DFT, fluoroperovskite, electronic structure, optical properties

Procedia PDF Downloads 476

1015 Using the Technology Acceptance Model to Examine Seniors’ Attitudes toward Facebook

Authors: Chien-Jen Liu, Shu Ching Yang

Abstract:

Using the technology acceptance model (TAM), this study examined the external variables of technological complexity (TC) to acquire a better understanding of the factors that influence the acceptance of computer application courses by learners at Active Aging Universities. After the learners in this study had completed a 27-hour Facebook course, 44 learners responded to a modified TAM survey. Data were collected to examine the path relationships among the variables that influence the acceptance of Facebook-mediated community learning. The partial least squares (PLS) method was used to test the measurement and the structural model. The study results demonstrated that attitudes toward Facebook use directly influence behavioral intentions (BI) with respect to Facebook use, evincing a high prediction rate of 58.3%. In addition to the perceived usefulness (PU) and perceived ease of use (PEOU) measures that are proposed in the TAM, other external variables, such as TC, also indirectly influence BI. These four variables can explain 88% of the variance in BI and demonstrate a high level of predictive ability. Finally, limitations of this investigation and implications for further research are discussed.

Keywords: technology acceptance model (TAM), technological complexity, partial least squares (PLS), perceived usefulness

Procedia PDF Downloads 344

1014 The Role of Brand Loyalty in Generating Positive Word of Mouth among Malaysian Hypermarket Customers

Authors: S. R. Nikhashemi, Laily Haj Paim, Ali Khatibi

Abstract:

Structural Equation Modeling (SEM) was used to test a hypothesized model explaining Malaysian hypermarket customers’ perceptions of brand trust (BT), customer perceived value (CPV) and perceived service quality (PSQ) on building their brand loyalty (CBL) and generating positive word-of-mouth communication (WOM). Self-administered questionnaires were used to collect data from 374 Malaysian hypermarket customers from Mydin, Tesco, Aeon Big and Giant in Kuala Lumpur, a metropolitan city of Malaysia. The data strongly supported the model exhibiting that BT, CPV and PSQ are prerequisite factors in building customer brand loyalty, while PSQ has the strongest effect on prediction of customer brand loyalty compared to other factors. Besides, the present study suggests the effect of the aforementioned factors via customer brand loyalty strongly contributes to generate positive word of mouth communication.

Keywords: brand trust, perceived value, Perceived Service Quality, Brand loyalty, positive word of mouth communication

Procedia PDF Downloads 481

1013 Classification of Health Risk Factors to Predict the Risk of Falling in Older Adults

Authors: L. Lindsay, S. A. Coleman, D. Kerr, B. J. Taylor, A. Moorhead

Abstract:

Cognitive decline and frailty is apparent in older adults leading to an increased likelihood of the risk of falling. Currently health care professionals have to make professional decisions regarding such risks, and hence make difficult decisions regarding the future welfare of the ageing population. This study uses health data from The Irish Longitudinal Study on Ageing (TILDA), focusing on adults over the age of 50 years, in order to analyse health risk factors and predict the likelihood of falls. This prediction is based on the use of machine learning algorithms whereby health risk factors are used as inputs to predict the likelihood of falling. Initial results show that health risk factors such as long-term health issues contribute to the number of falls. The identification of such health risk factors has the potential to inform health and social care professionals, older people and their family members in order to mitigate daily living risks.

Keywords: classification, falls, health risk factors, machine learning, older adults

Procedia PDF Downloads 146

1012 Bankruptcy Prediction Analysis on Mining Sector Companies in Indonesia

Authors: Devina Aprilia Gunawan, Tasya Aspiranti, Inugrah Ratia Pratiwi

Abstract:

This research aims to classify the mining sector companies based on Altman’s Z-score model, and providing an analysis based on the Altman’s Z-score model’s financial ratios to provide a picture about the financial condition in mining sector companies in Indonesia and their viability in the future, and to find out the partial and simultaneous impact of each of the financial ratio variables in the Altman’s Z-score model, namely (WC/TA), (RE/TA), (EBIT/TA), (MVE/TL), and (S/TA), toward the financial condition represented by the Z-score itself. Among 38 mining sector companies listed in Indonesia Stock Exchange (IDX), 28 companies are selected as research sample according to the purposive sampling criteria.The results of this research showed that during 3 years research period at 2010-2012, the amount of the companies that was predicted to be healthy in each year was less than half of the total sample companies and not even reach up to 50%. The multiple regression analysis result showed that all of the research hypotheses are accepted, which means that (WC/TA), (RE/TA), (EBIT/TA), (MVE/TL), and (S/TA), both partially and simultaneously had an impact towards company’s financial condition.

Keywords: Altman’s Z-score model, financial condition, mining companies, Indonesia

Procedia PDF Downloads 527

1011 Early Impact Prediction and Key Factors Study of Artificial Intelligence Patents: A Method Based on LightGBM and Interpretable Machine Learning

Authors: Xingyu Gao, Qiang Wu

Abstract:

Patents play a crucial role in protecting innovation and intellectual property. Early prediction of the impact of artificial intelligence (AI) patents helps researchers and companies allocate resources and make better decisions. Understanding the key factors that influence patent impact can assist researchers in gaining a better understanding of the evolution of AI technology and innovation trends. Therefore, identifying highly impactful patents early and providing support for them holds immeasurable value in accelerating technological progress, reducing research and development costs, and mitigating market positioning risks. Despite the extensive research on AI patents, accurately predicting their early impact remains a challenge. Traditional methods often consider only single factors or simple combinations, failing to comprehensively and accurately reflect the actual impact of patents. This paper utilized the artificial intelligence patent database from the United States Patent and Trademark Office and the Len.org patent retrieval platform to obtain specific information on 35,708 AI patents. Using six machine learning models, namely Multiple Linear Regression, Random Forest Regression, XGBoost Regression, LightGBM Regression, Support Vector Machine Regression, and K-Nearest Neighbors Regression, and using early indicators of patents as features, the paper comprehensively predicted the impact of patents from three aspects: technical, social, and economic. These aspects include the technical leadership of patents, the number of citations they receive, and their shared value. The SHAP (Shapley Additive exPlanations) metric was used to explain the predictions of the best model, quantifying the contribution of each feature to the model's predictions. The experimental results on the AI patent dataset indicate that, for all three target variables, LightGBM regression shows the best predictive performance. Specifically, patent novelty has the greatest impact on predicting the technical impact of patents and has a positive effect. Additionally, the number of owners, the number of backward citations, and the number of independent claims are all crucial and have a positive influence on predicting technical impact. In predicting the social impact of patents, the number of applicants is considered the most critical input variable, but it has a negative impact on social impact. At the same time, the number of independent claims, the number of owners, and the number of backward citations are also important predictive factors, and they have a positive effect on social impact. For predicting the economic impact of patents, the number of independent claims is considered the most important factor and has a positive impact on economic impact. The number of owners, the number of sibling countries or regions, and the size of the extended patent family also have a positive influence on economic impact. The study primarily relies on data from the United States Patent and Trademark Office for artificial intelligence patents. Future research could consider more comprehensive data sources, including artificial intelligence patent data, from a global perspective. While the study takes into account various factors, there may still be other important features not considered. In the future, factors such as patent implementation and market applications may be considered as they could have an impact on the influence of patents.

Keywords: patent influence, interpretable machine learning, predictive models, SHAP

Procedia PDF Downloads 47

1010 Feature-Based Summarizing and Ranking from Customer Reviews

Authors: Dim En Nyaung, Thin Lai Lai Thein

Abstract:

Due to the rapid increase of Internet, web opinion sources dynamically emerge which is useful for both potential customers and product manufacturers for prediction and decision purposes. These are the user generated contents written in natural languages and are unstructured-free-texts scheme. Therefore, opinion mining techniques become popular to automatically process customer reviews for extracting product features and user opinions expressed over them. Since customer reviews may contain both opinionated and factual sentences, a supervised machine learning technique applies for subjectivity classification to improve the mining performance. In this paper, we dedicate our work is the task of opinion summarization. Therefore, product feature and opinion extraction is critical to opinion summarization, because its effectiveness significantly affects the identification of semantic relationships. The polarity and numeric score of all the features are determined by Senti-WordNet Lexicon. The problem of opinion summarization refers how to relate the opinion words with respect to a certain feature. Probabilistic based model of supervised learning will improve the result that is more flexible and effective.

Keywords: opinion mining, opinion summarization, sentiment analysis, text mining

Procedia PDF Downloads 330

1009 Modelling Fluoride Pollution of Groundwater Using Artificial Neural Network in the Western Parts of Jharkhand

Authors: Neeta Kumari, Gopal Pathak

Abstract:

Artificial neural network has been proved to be an efficient tool for non-parametric modeling of data in various applications where output is non-linearly associated with input. It is a preferred tool for many predictive data mining applications because of its power , flexibility, and ease of use. A standard feed forward networks (FFN) is used to predict the groundwater fluoride content. The ANN model is trained using back propagated algorithm, Tansig and Logsig activation function having varying number of neurons. The models are evaluated on the basis of statistical performance criteria like Root Mean Squarred Error (RMSE) and Regression coefficient (R2), bias (mean error), Coefficient of variation (CV), Nash-Sutcliffe efficiency (NSE), and the index of agreement (IOA). The results of the study indicate that Artificial neural network (ANN) can be used for groundwater fluoride prediction in the limited data situation in the hard rock region like western parts of Jharkhand with sufficiently good accuracy.

Keywords: Artificial neural network (ANN), FFN (Feed-forward network), backpropagation algorithm, Levenberg-Marquardt algorithm, groundwater fluoride contamination

Procedia PDF Downloads 548

1008 Geothermal Prospect Prediction at Mt. Ciremai Using Fault and Fracture Density Method

Authors: Rifqi Alfadhillah Sentosa, Hasbi Fikru Syabi, Stephen

Abstract:

West Java is a province in Indonesia which has a number of volcanoes. One of those volcanoes is Mt. Ciremai, located administratively at Kuningan and Majalengka District, and is known for its significant geothermal potential in Java Island. This research aims to assume geothermal prospects at Mt. Ciremai using Fault and Fracture Density (FFD) Method, which is correlated to the geochemistry of geothermal manifestations around the mountain. This FFD method is using SRTM data to draw lineaments, which are assumed associated with fractures and faults in the research area. These faults and fractures were assumed as the paths for reservoir fluids to reached surface as geothermal manifestations. The goal of this method is to analyze the density of those lineaments found in the research area. Based on this FFD Method, it is known that area with high density of lineaments located on Mt. Kromong at the northern side of Mt. Ciremai. This prospect area is proven by its higher geothermometer values compared to geothermometer values calculated at the south area of Mt. Ciremai.

Keywords: geothermal prospect, fault and fracture density, Mt. Ciremai, surface manifestation

Procedia PDF Downloads 366

1007 Synoptic Analysis of a Heavy Flood in the Province of Sistan-Va-Balouchestan: Iran January 2020

Authors: N. Pegahfar, P. Ghafarian

Abstract:

In this research, the synoptic weather conditions during the heavy flood of 10-12 January 2020 in the Sistan-va-Balouchestan Province of Iran will be analyzed. To this aim, reanalysis data from the National Centers for Environmental Prediction (NCEP) and National Center for Atmospheric Research (NCAR), NCEP Global Forecasting System (GFS) analysis data, measured data from a surface station together with satellite images from the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) have been used from 9 to 12 January 2020. Atmospheric parameters both at the lower troposphere and also at the upper part of that have been used, including absolute vorticity, wind velocity, temperature, geopotential height, relative humidity, and precipitation. Results indicated that both lower-level and upper-level currents were strong. In addition, the transport of a large amount of humidity from the Oman Sea and the Red Sea to the south and southeast of Iran (Sistan-va-Balouchestan Province) led to the vast and unexpected precipitation and then a heavy flood.

Keywords: Sistan-va-Balouchestn Province, heavy flood, synoptic, analysis data

Procedia PDF Downloads 101

1006 On the Influence of Sleep Habits for Predicting Preterm Births: A Machine Learning Approach

Authors: C. Fernandez-Plaza, I. Abad, E. Diaz, I. Diaz

Abstract:

Births occurring before the 37th week of gestation are considered preterm births. A threat of preterm is defined as the beginning of regular uterine contractions, dilation and cervical effacement between 23 and 36 gestation weeks. To author's best knowledge, the factors that determine the beginning of the birth are not completely defined yet. In particular, the incidence of sleep habits on preterm births is weekly studied. The aim of this study is to develop a model to predict the factors affecting premature delivery on pregnancy, based on the above potential risk factors, including those derived from sleep habits and light exposure at night (introduced as 12 variables obtained by a telephone survey using two questionnaires previously used by other authors). Thus, three groups of variables were included in the study (maternal, fetal and sleep habits). The study was approved by Research Ethics Committee of the Principado of Asturias (Spain). An observational, retrospective and descriptive study was performed with 481 births between January 1, 2015 and May 10, 2016 in the University Central Hospital of Asturias (Spain). A statistical analysis using SPSS was carried out to compare qualitative and quantitative variables between preterm and term delivery. Chi-square test qualitative variable and t-test for quantitative variables were applied. Statistically significant differences (p < 0.05) between preterm vs. term births were found for primiparity, multi-parity, kind of conception, place of residence or premature rupture of membranes and interruption during nights. In addition to the statistical analysis, machine learning methods to look for a prediction model were tested. In particular, tree based models were applied as the trade-off between performance and interpretability is especially suitable for this study. C5.0, recursive partitioning, random forest and tree bag models were analysed using caret R-package. Cross validation with 10-folds and parameter tuning to optimize the methods were applied. In addition, different noise reduction methods were applied to the initial data using NoiseFiltersR package. The best performance was obtained by C5.0 method with Accuracy 0.91, Sensitivity 0.93, Specificity 0.89 and Precision 0.91. Some well known preterm birth factors were identified: Cervix Dilation, maternal BMI, Premature rupture of membranes or nuchal translucency analysis in the first trimester. The model also identifies other new factors related to sleep habits such as light through window, bedtime on working days, usage of electronic devices before sleeping from Mondays to Fridays or change of sleeping habits reflected in the number of hours, in the depth of sleep or in the lighting of the room. IF dilation < = 2.95 AND usage of electronic devices before sleeping from Mondays to Friday = YES and change of sleeping habits = YES, then preterm is one of the predicting rules obtained by C5.0. In this work a model for predicting preterm births is developed. It is based on machine learning together with noise reduction techniques. The method maximizing the performance is the one selected. This model shows the influence of variables related to sleep habits in preterm prediction.

Keywords: machine learning, noise reduction, preterm birth, sleep habit

Procedia PDF Downloads 145

1005 A Neural Network Modelling Approach for Predicting Permeability from Well Logs Data

Authors: Chico Horacio Jose Sambo

Abstract:

Recently neural network has gained popularity when come to solve complex nonlinear problems. Permeability is one of fundamental reservoir characteristics system that are anisotropic distributed and non-linear manner. For this reason, permeability prediction from well log data is well suited by using neural networks and other computer-based techniques. The main goal of this paper is to predict reservoir permeability from well logs data by using neural network approach. A multi-layered perceptron trained by back propagation algorithm was used to build the predictive model. The performance of the model on net results was measured by correlation coefficient. The correlation coefficient from testing, training, validation and all data sets was evaluated. The results show that neural network was capable of reproducing permeability with accuracy in all cases, so that the calculated correlation coefficients for training, testing and validation permeability were 0.96273, 0.89991 and 0.87858, respectively. The generalization of the results to other field can be made after examining new data, and a regional study might be possible to study reservoir properties with cheap and very fast constructed models.

Keywords: neural network, permeability, multilayer perceptron, well log

Procedia PDF Downloads 402

1004 Exploring Tweet Geolocation: Leveraging Large Language Models for Post-Hoc Explanations

Authors: Sarra Hasni, Sami Faiz

Abstract:

In recent years, location prediction on social networks has gained significant attention, with short and unstructured texts like tweets posing additional challenges. Advanced geolocation models have been proposed, increasing the need to explain their predictions. In this paper, we provide explanations for a geolocation black-box model using LIME and SHAP, two state-of-the-art XAI (eXplainable Artificial Intelligence) methods. We extend our evaluations to Large Language Models (LLMs) as post hoc explainers for tweet geolocation. Our preliminary results show that LLMs outperform LIME and SHAP by generating more accurate explanations. Additionally, we demonstrate that prompts with examples and meta-prompts containing phonetic spelling rules improve the interpretability of these models, even with informal input data. This approach highlights the potential of advanced prompt engineering techniques to enhance the effectiveness of black-box models in geolocation tasks on social networks.

Keywords: large language model, post hoc explainer, prompt engineering, local explanation, tweet geolocation

Procedia PDF Downloads 24

1003 Traction Behavior of Linear Piezo-Viscous Lubricants in Rough Elastohydrodynamic Lubrication Contacts

Authors: Punit Kumar, Niraj Kumar

Abstract:

The traction behavior of lubricants with the linear pressure-viscosity response in EHL line contacts is investigated numerically for smooth as well as rough surfaces. The analysis involves the simultaneous solution of Reynolds, elasticity and energy equations along with the computation of lubricant properties and surface temperatures. The temperature modified Doolittle-Tait equations are used to calculate viscosity and density as functions of fluid pressure and temperature, while Carreau model is used to describe the lubricant rheology. The surface roughness is assumed to be sinusoidal and it is present on the nearly stationary surface in near-pure sliding EHL conjunction. The linear P-V oil is found to yield much lower traction coefficients and slightly thicker EHL films as compared to the synthetic oil for a given set of dimensionless speed and load parameters. Besides, the increase in traction coefficient attributed to surface roughness is much lower for the former case. The present analysis emphasizes the importance of employing realistic pressure-viscosity response for accurate prediction of EHL traction.

Keywords: EHL, linear pressure-viscosity, surface roughness, traction, water/glycol

Procedia PDF Downloads 381

1002 Gender Differences in the Prediction of Smartphone Use While Driving: Personal and Social Factors

Authors: Erez Kita, Gil Luria

Abstract:

This study examines gender as a boundary condition for the relationship between the psychological variable of mindfulness and the social variable of income with regards to the use of smartphones by young drivers. The use of smartphones while driving increases the likelihood of a car accident, endangering young drivers and other road users. The study sample included 186 young drivers who were legally permitted to drive without supervision. The subjects were first asked to complete questionnaires on mindfulness and income. Next, their smartphone use while driving was monitored over a one-month period. This study is unique as it used an objective smartphone monitoring application (rather than self-reporting) to count the number of times the young participants actually touched their smartphones while driving. The findings show that gender moderates the effects of social and personal factors (i.e., income and mindfulness) on the use of smartphones while driving. The pattern of moderation was similar for both social and personal factors. For men, mindfulness and income are negatively associated with the use of smartphones while driving. These factors are not related to the use of smartphones by women drivers. Mindfulness and income can be used to identify male populations that are at risk of using smartphones while driving. Interventions that improve mindfulness can be used to reduce the use of smartphones by male drivers.

Keywords: mindfulness, using smartphones while driving, income, gender, young drivers

Procedia PDF Downloads 169