Search results for: predictive data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25287

Search results for: predictive data mining

24747 A Hybrid Approach for Thread Recommendation in MOOC Forums

Authors: Ahmad. A. Kardan, Amir Narimani, Foozhan Ataiefard

Abstract:

Recommender Systems have been developed to provide contents and services compatible to users based on their behaviors and interests. Due to information overload in online discussion forums and users diverse interests, recommending relative topics and threads is considered to be helpful for improving the ease of forum usage. In order to lead learners to find relevant information in educational forums, recommendations are even more needed. We present a hybrid thread recommender system for MOOC forums by applying social network analysis and association rule mining techniques. Initial results indicate that the proposed recommender system performs comparatively well with regard to limited available data from users' previous posts in the forum.

Keywords: association rule mining, hybrid recommender system, massive open online courses, MOOCs, social network analysis

Procedia PDF Downloads 278
24746 The Role of Strategic Alliances, Innovation Capability, Cost Reduction in Enhancing Customer Loyalty and Firm’s Competitive Advantage

Authors: Soebowo Musa

Abstract:

Mining industries are known to be very volatile due to their sensitive nature toward changes in the environment, particularly coal mining. Heavy equipment distributors and coal mining contractors are among heavily affected by such volatility. They are facing more uncertainty on the sustainability of the coal mining industry. Strategic alliances and organizational capabilities such as innovation capability have long been seen as ways to stay competitive with a focus more on the strategic alliances partner-to-partner in serving their customers. In today’s rapid change in the environment, a shift in consumer behaviors, and the human-centric business approach, this study looks at the strategic alliance partner-to-customer relationship in both the industrial organization and resource-based theories. This study was conducted based on 250 respondents from the strategic alliances partner-to-customer between heavy equipment distributors and coal mining contractors in Indonesia. This study finds strategic alliances have the highest association toward cost reduction, a proxy of operational efficiency followed by its association toward innovation capability. Further, strategic alliances and innovation capability have a positive relationship with customer loyalty, while innovation capability and customer loyalty have no significant relationships toward the firm’s competitive advantage. This study also indicates that cost reduction is not a condition to develop customer loyalty in the strategic alliance partner-to-customer relationship. It confirms strategic alliances are a strategy that creates a firm’s operational efficiency, innovation capability that develops customer loyalty, and competitive advantage.

Keywords: strategic alliance, innovation capability, cost reduction, customer loyalty, competitive advantage

Procedia PDF Downloads 99
24745 Predicting Emerging Agricultural Investment Opportunities: The Potential of Structural Evolution Index

Authors: Kwaku Damoah

Abstract:

The agricultural sector is characterized by continuous transformation, driven by factors such as demographic shifts, evolving consumer preferences, climate change, and migration trends. This dynamic environment presents complex challenges for key stakeholders including farmers, governments, and investors, who must navigate these changes to achieve optimal investment returns. To effectively predict market trends and uncover promising investment opportunities, a systematic, data-driven approach is essential. This paper introduces the Structural Evolution Index (SEI), a machine learning-based methodology. SEI is specifically designed to analyse long-term trends and forecast the potential of emerging agricultural products for investment. Versatile in application, it evaluates various agricultural metrics such as production, yield, trade, land use, and consumption, providing a comprehensive view of the evolution within agricultural markets. By harnessing data from the UN Food and Agricultural Organisation (FAOSTAT), this study demonstrates the SEI's capabilities through Comparative Exploratory Analysis and evaluation of international trade in agricultural products, focusing on Malaysia and Singapore. The SEI methodology reveals intricate patterns and transitions within the agricultural sector, enabling stakeholders to strategically identify and capitalize on emerging markets. This predictive framework is a powerful tool for decision-makers, offering crucial insights that help anticipate market shifts and align investments with anticipated returns.

Keywords: agricultural investment, algorithm, comparative exploratory analytics, machine learning, market trends, predictive analytics, structural evolution index

Procedia PDF Downloads 47
24744 A Neural Network Modelling Approach for Predicting Permeability from Well Logs Data

Authors: Chico Horacio Jose Sambo

Abstract:

Recently neural network has gained popularity when come to solve complex nonlinear problems. Permeability is one of fundamental reservoir characteristics system that are anisotropic distributed and non-linear manner. For this reason, permeability prediction from well log data is well suited by using neural networks and other computer-based techniques. The main goal of this paper is to predict reservoir permeability from well logs data by using neural network approach. A multi-layered perceptron trained by back propagation algorithm was used to build the predictive model. The performance of the model on net results was measured by correlation coefficient. The correlation coefficient from testing, training, validation and all data sets was evaluated. The results show that neural network was capable of reproducing permeability with accuracy in all cases, so that the calculated correlation coefficients for training, testing and validation permeability were 0.96273, 0.89991 and 0.87858, respectively. The generalization of the results to other field can be made after examining new data, and a regional study might be possible to study reservoir properties with cheap and very fast constructed models.

Keywords: neural network, permeability, multilayer perceptron, well log

Procedia PDF Downloads 380
24743 Correlation Between Different Radiological Findings and Histopathological diagnosis of Breast Diseases: Retrospective Review Conducted Over Sixth Years in King Fahad University Hospital in Eastern Province, Saudi Arabia

Authors: Sadeem Aljamaan, Reem Hariri, Rahaf Alghamdi, Batool Alotaibi, Batool Alsenan, Lama Althunayyan, Areej Alnemer

Abstract:

The aim of this study is to correlate between radiological findings and histopathological results in regard to the breast imaging-reporting and data system scores, size of breast masses, molecular subtypes and suspicious radiological features, as well as to assess the concordance rate in histological grade between core biopsy and surgical excision among breast cancer patients, followed by analyzing the change of concordance rate in relation to neoadjuvant chemotherapy in a Saudi population. A retrospective review was conducted over 6-year period (2017-2022) on all breast core biopsies of women preceded by radiological investigation. Chi-squared test (χ2) was performed on qualitative data, the Mann-Whitney test for quantitative non-parametric variables, and the Kappa test for grade agreement. A total of 641 cases were included. Ultrasound, mammography, and magnetic resonance imaging demonstrated diagnostic accuracies of 85%, 77.9% and 86.9%; respectively. magnetic resonance imaging manifested the highest sensitivity (72.2%), and the lowest was for ultrasound (61%). Concordance in tumor size with final excisions was best in magnetic resonance imaging, while mammography demonstrated a higher tendency of overestimation (41.9%), and ultrasound showed the highest underestimation (67.7%). The association between basal-like molecular subtypes and the breast imaging-reporting and data system score 5 classifications was statistically significant only for magnetic resonance imaging (p=0.04). Luminal subtypes demonstrated a significantly higher percentage of speculation in mammography. Breast imaging-reporting and data system score 4 manifested a substantial number of benign pathologies in all the 3 modalities. A fair concordance rate (k= 0.212 & 0.379) was demonstrated between excision and the preceding core biopsy grading with and without neoadjuvant therapy, respectively. The results demonstrated a down-grading in cases post-neoadjuvant therapy. In cases who did not receive neoadjuvant therapy, underestimation of tumor grade in biopsy was evident. In summary, magnetic resonance imaging had the highest sensitivity, specificity, positive predictive value and accuracy of both diagnosis and estimation of tumor size. Mammography demonstrated better sensitivity than ultrasound and had the highest negative predictive value, but ultrasound had better specificity, positive predictive value and accuracy. Therefore, the combination of different modalities is advantageous. The concordance rate of core biopsy grading with excision was not impacted by neoadjuvant therapy.

Keywords: breast cancer, mammography, MRI, neoadjuvant, pathology, US

Procedia PDF Downloads 70
24742 Non-Destructive Prediction System Using near Infrared Spectroscopy for Crude Palm Oil

Authors: Siti Nurhidayah Naqiah Abdull Rani, Herlina Abdul Rahim

Abstract:

Near infrared (NIR) spectroscopy has always been of great interest in the food and agriculture industries. The development of predictive models has facilitated the estimation process in recent years. In this research, 176 crude palm oil (CPO) samples acquired from Felda Johor Bulker Sdn Bhd were studied. A FOSS NIRSystem was used to tak e absorbance measurements from the sample. The wavelength range for the spectral measurement is taken at 1600nm to 1900nm. Partial Least Square Regression (PLSR) prediction model with 50 optimal number of principal components was implemented to study the relationship between the measured Free Fatty Acid (FFA) values and the measured spectral absorption. PLSR showed predictive ability of FFA values with correlative coefficient (R) of 0.9808 for the training set and 0.9684 for the testing set.

Keywords: palm oil, fatty acid, NIRS, PLSR

Procedia PDF Downloads 195
24741 Multicriteria for Optimal Land Use after Mining

Authors: Carla Idely Palencia-Aguilar

Abstract:

Mining in Colombia represents around 2% of the GDP (USD 8 billion in 2018), with main productions represented by coal, nickel, gold, silver, emeralds, iron, limestone, gypsum, among others. Sand and Gravel had been decreasing its participation of the GDP with a reduction of 33.2 million m3 in 2015, to 27.4 in 2016, 22.7 in 2017 and 15.8 in 2018, with a consumption of approximately 3 tons/inhabitant. However, with the new government policies it is expected to increase in the following years. Mining causes temporary environmental impacts, once restoration and rehabilitation takes place, social, environmental and economic benefits are higher than the initial state. A way to demonstrate how the mining interventions had contributed to improve the characteristics of the region after sand and gravel mining, the NDVI (Normalized Difference Vegetation Index) from MODIS and ASTER were employed. The histograms show not only increments of vegetation in the area (8 times higher), but also topographies similar to the ones before the intervention, according to the application for sustainable development selected: either agriculture, forestry, cattle raising, artificial wetlands or do nothing. The decision was based upon a Multicriteria analysis for optimal land use, with three main variables: geostatistics, evapotranspiration and groundwater characteristics. The use of remote sensing, meteorological stations, piezometers, sunphotometers, geoelectric analysis among others; provide the information required for the multicriteria decision. For cattle raising and agricultural applications (where various crops were implemented), conservation of products were tested by means of nanotechnology. The results showed a duration of 2 years with no chemicals added for preservation and concentration of vitamins of the tested products.

Keywords: ASTER, Geostatistics, MODIS, Multicriteria

Procedia PDF Downloads 111
24740 Modeling of Tool Flank Wear in Finish Hard Turning of AISI D2 Using Genetic Programming

Authors: V. Pourmostaghimi, M. Zadshakoyan

Abstract:

Efficiency and productivity of the finish hard turning can be enhanced impressively by utilizing accurate predictive models for cutting tool wear. However, the ability of genetic programming in presenting an accurate analytical model is a notable characteristic which makes it more applicable than other predictive modeling methods. In this paper, the genetic equation for modeling of tool flank wear is developed with the use of the experimentally measured flank wear values and genetic programming during finish turning of hardened AISI D2. Series of tests were conducted over a range of cutting parameters and the values of tool flank wear were measured. On the basis of obtained results, genetic model presenting connection between cutting parameters and tool flank wear were extracted. The accuracy of the genetically obtained model was assessed by using two statistical measures, which were root mean square error (RMSE) and coefficient of determination (R²). Evaluation results revealed that presented genetic model predicted flank wear over the study area accurately (R² = 0.9902 and RMSE = 0.0102). These results allow concluding that the proposed genetic equation corresponds well with experimental data and can be implemented in real industrial applications.

Keywords: cutting parameters, flank wear, genetic programming, hard turning

Procedia PDF Downloads 166
24739 Lead Removal From Ex- Mining Pond Water by Electrocoagulation: Kinetics, Isotherm, and Dynamic Studies

Authors: Kalu Uka Orji, Nasiman Sapari, Khamaruzaman W. Yusof

Abstract:

Exposure of galena (PbS), tealite (PbSnS2), and other associated minerals during mining activities release lead (Pb) and other heavy metals into the mining water through oxidation and dissolution. Heavy metal pollution has become an environmental challenge. Lead, for instance, can cause toxic effects to human health, including brain damage. Ex-mining pond water was reported to contain lead as high as 69.46 mg/L. Conventional treatment does not easily remove lead from water. A promising and emerging treatment technology for lead removal is the application of the electrocoagulation (EC) process. However, some of the problems associated with EC are systematic reactor design, selection of maximum EC operating parameters, scale-up, among others. This study investigated an EC process for the removal of lead from synthetic ex-mining pond water using a batch reactor and Fe electrodes. The effects of various operating parameters on lead removal efficiency were examined. The results obtained indicated that the maximum removal efficiency of 98.6% was achieved at an initial PH of 9, the current density of 15mA/cm2, electrode spacing of 0.3cm, treatment time of 60 minutes, Liquid Motion of Magnetic Stirring (LM-MS), and electrode arrangement = BP-S. The above experimental data were further modeled and optimized using a 2-Level 4-Factor Full Factorial design, a Response Surface Methodology (RSM). The four factors optimized were the current density, electrode spacing, electrode arrangements, and Liquid Motion Driving Mode (LM). Based on the regression model and the analysis of variance (ANOVA) at 0.01%, the results showed that an increase in current density and LM-MS increased the removal efficiency while the reverse was the case for electrode spacing. The model predicted the optimal lead removal efficiency of 99.962% with an electrode spacing of 0.38 cm alongside others. Applying the predicted parameters, the lead removal efficiency of 100% was actualized. The electrode and energy consumptions were 0.192kg/m3 and 2.56 kWh/m3 respectively. Meanwhile, the adsorption kinetic studies indicated that the overall lead adsorption system belongs to the pseudo-second-order kinetic model. The adsorption dynamics were also random, spontaneous, and endothermic. The higher temperature of the process enhances adsorption capacity. Furthermore, the adsorption isotherm fitted the Freundlish model more than the Langmuir model; describing the adsorption on a heterogeneous surface and showed good adsorption efficiency by the Fe electrodes. Adsorption of Pb2+ onto the Fe electrodes was a complex reaction, involving more than one mechanism. The overall results proved that EC is an efficient technique for lead removal from synthetic mining pond water. The findings of this study would have application in the scale-up of EC reactor and in the design of water treatment plants for feed-water sources that contain lead using the electrocoagulation method.

Keywords: ex-mining water, electrocoagulation, lead, adsorption kinetics

Procedia PDF Downloads 136
24738 Applying Big Data Analysis to Efficiently Exploit the Vast Unconventional Tight Oil Reserves

Authors: Shengnan Chen, Shuhua Wang

Abstract:

Successful production of hydrocarbon from unconventional tight oil reserves has changed the energy landscape in North America. The oil contained within these reservoirs typically will not flow to the wellbore at economic rates without assistance from advanced horizontal well and multi-stage hydraulic fracturing. Efficient and economic development of these reserves is a priority of society, government, and industry, especially under the current low oil prices. Meanwhile, society needs technological and process innovations to enhance oil recovery while concurrently reducing environmental impacts. Recently, big data analysis and artificial intelligence become very popular, developing data-driven insights for better designs and decisions in various engineering disciplines. However, the application of data mining in petroleum engineering is still in its infancy. The objective of this research aims to apply intelligent data analysis and data-driven models to exploit unconventional oil reserves both efficiently and economically. More specifically, a comprehensive database including the reservoir geological data, reservoir geophysical data, well completion data and production data for thousands of wells is firstly established to discover the valuable insights and knowledge related to tight oil reserves development. Several data analysis methods are introduced to analysis such a huge dataset. For example, K-means clustering is used to partition all observations into clusters; principle component analysis is applied to emphasize the variation and bring out strong patterns in the dataset, making the big data easy to explore and visualize; exploratory factor analysis (EFA) is used to identify the complex interrelationships between well completion data and well production data. Different data mining techniques, such as artificial neural network, fuzzy logic, and machine learning technique are then summarized, and appropriate ones are selected to analyze the database based on the prediction accuracy, model robustness, and reproducibility. Advanced knowledge and patterned are finally recognized and integrated into a modified self-adaptive differential evolution optimization workflow to enhance the oil recovery and maximize the net present value (NPV) of the unconventional oil resources. This research will advance the knowledge in the development of unconventional oil reserves and bridge the gap between the big data and performance optimizations in these formations. The newly developed data-driven optimization workflow is a powerful approach to guide field operation, which leads to better designs, higher oil recovery and economic return of future wells in the unconventional oil reserves.

Keywords: big data, artificial intelligence, enhance oil recovery, unconventional oil reserves

Procedia PDF Downloads 269
24737 A Configurational Approach to Understand the Effect of Organizational Structure on Absorptive Capacity: Results from PLS and fsQCA

Authors: Murad Ali, Anderson Konan Seny Kan, Khalid A. Maimani

Abstract:

Based on the theory of organizational design and the theory of knowledge, this study uses complexity theory to explain and better understand the causal impacts of various patterns of organizational structural factors stimulating absorptive capacity (ACAP). Organizational structure can be thought of as heterogeneous configurations where various components are often intertwined. This study argues that impact of the traditional variables which define a firm’s organizational structure (centralization, formalization, complexity and integration) on ACAP is better understood in terms of set-theoretic relations rather than correlations. This study uses a data sample of 347 from a multiple industrial sector in South Korea. The results from PLS-SEM support all the hypothetical relationships among the variables. However, fsQCA results suggest the possible configurations of centralization, formalization, complexity, integration, age, size, industry and revenue factors that contribute to high level of ACAP. The results from fsQCA demonstrate the usefulness of configurational approaches in helping understand equifinality in the field of knowledge management. A recent fsQCA procedure based on a modeling subsample and holdout subsample is use in this study to assess the predictive validity of the model under investigation. The same type predictive analysis is also made through PLS-SEM. These analyses reveal a good relevance of causal solutions leading to high level of ACAP. In overall, the results obtained from combining PLS-SEM and fsQCA are very insightful. In particular, they could help managers to link internal organizational structural with ACAP. In other words, managers may comprehend finely how different components of organizational structure can increase the level of ACAP. The configurational approach may trigger new insights that could help managers prioritize selection criteria and understand the interactions between organizational structure and ACAP. The paper also discusses theoretical and managerial implications arising from these findings.

Keywords: absorptive capacity, organizational structure, PLS-SEM, fsQCA, predictive analysis, modeling subsample, holdout subsample

Procedia PDF Downloads 316
24736 Model Predictive Control (MPC) and Proportional-Integral-Derivative (PID) Control of Quadcopters: A Comparative Analysis

Authors: Anel Hasić, Naser Prljača

Abstract:

In the domain of autonomous or piloted flights, the accurate control of quadrotor trajectories is of paramount significance for large numbers of tasks. These adaptable aerial platforms find applications that span from high-precision aerial photography and surveillance to demanding search and rescue missions. Among the fundamental challenges confronting quadrotor operation is the demand for accurate following of desired flight paths. To address this control challenge, among others, two celebrated well-established control strategies have emerged as noteworthy contenders: Model Predictive Control (MPC) and Proportional-Integral-Derivative (PID) control. In this work, we focus on the extensive examination of MPC and PID control techniques by using comprehensive simulation studies in MATLAB/Simulink. Intensive simulation results demonstrate the performance of the studied control algorithms.

Keywords: MATLAB, MPC, PID, quadcopter, simulink

Procedia PDF Downloads 29
24735 Investigation of Yard Seam Workings for the Proposed Newcastle Light Rail Project

Authors: David L. Knott, Robert Kingsland, Alistair Hitchon

Abstract:

The proposed Newcastle Light Rail is a key part of the revitalisation of Newcastle, NSW and will provide a frequent and reliable travel option throughout the city centre, running from Newcastle Interchange at Wickham to Pacific Park in Newcastle East, a total of 2.7 kilometers in length. Approximately one-third of the route, along Hunter and Scott Streets, is subject to potential shallow underground mine workings. The extent of mining and seams mined is unclear. Convicts mined the Yard Seam and overlying Dudley (Dirty) Seam in Newcastle sometime between 1800 and 1830. The Australian Agricultural Company mined the Yard Seam from about 1831 to the 1860s in the alignment area. The Yard Seam was about 3 feet (0.9m) thick, and therefore, known as the Yard Seam. Mine maps do not exist for the workings in the area of interest and it was unclear if both or just one seam was mined. Information from 1830s geological mapping and other data showing shaft locations were used along Scott Street and information from the 1908 Royal Commission was used along Hunter Street to develop an investigation program. In addition, mining was encountered for several sites to the south of the alignment at depths of about 7 m to 25 m. Based on the anticipated depths of mining, it was considered prudent to assess the potential for sinkhole development on the proposed alignment and realigned underground utilities and to obtain approval for the work from Subsidence Advisory NSW (SA NSW). The assessment consisted of a desktop study, followed by a subsurface investigation. Four boreholes were drilled along Scott Street and three boreholes were drilled along Hunter Street using HQ coring techniques in the rock. The placement of boreholes was complicated by the presence of utilities in the roadway and traffic constraints. All the boreholes encountered the Yard Seam, with conditions varying from unmined coal to an open void, indicating the presence of mining. The geotechnical information obtained from the boreholes was expanded by using various downhole techniques including; borehole camera, borehole sonar, and downhole geophysical logging. The camera provided views of the rock and helped to explain zones of no recovery. In addition, timber props within the void were observed. Borehole sonar was performed in the void and provided an indication of room size as well as the presence of timber props within the room. Downhole geophysical logging was performed in the boreholes to measure density, natural gamma, and borehole deviation. The data helped confirm that all the mining was in the Yard Seam and that the overlying Dudley Seam had been eroded in the past over much of the alignment. In summary, the assessment allowed the potential for sinkhole subsidence to be assessed and a mitigation approach developed to allow conditional approval by SA NSW. It also confirmed the presence of mining in the Yard Seam, the depth to the seam and mining conditions, and indicated that subsidence did not appear to have occurred in the past.

Keywords: downhole investigation techniques, drilling, mine subsidence, yard seam

Procedia PDF Downloads 299
24734 Investigating Dynamic Transition Process of Issues Using Unstructured Text Analysis

Authors: Myungsu Lim, William Xiu Shun Wong, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Namgyu Kim

Abstract:

The amount of real-time data generated through various mass media has been increasing rapidly. In this study, we had performed topic analysis by using the unstructured text data that is distributed through news article. As one of the most prevalent applications of topic analysis, the issue tracking technique investigates the changes of the social issues that identified through topic analysis. Currently, traditional issue tracking is conducted by identifying the main topics of documents that cover an entire period at the same time and analyzing the occurrence of each topic by the period of occurrence. However, this traditional issue tracking approach has limitation that it cannot discover dynamic mutation process of complex social issues. The purpose of this study is to overcome the limitations of the existing issue tracking method. We first derived core issues of each period, and then discover the dynamic mutation process of various issues. In this study, we further analyze the mutation process from the perspective of the issues categories, in order to figure out the pattern of issue flow, including the frequency and reliability of the pattern. In other words, this study allows us to understand the components of the complex issues by tracking the dynamic history of issues. This methodology can facilitate a clearer understanding of complex social phenomena by providing mutation history and related category information of the phenomena.

Keywords: Data Mining, Issue Tracking, Text Mining, topic Analysis, topic Detection, Trend Detection

Procedia PDF Downloads 386
24733 Using the Transtheoretical Model to Investigate Stages of Change in Regular Volunteer Service among Seniors in Community

Authors: Pei-Ti Hsu, I-Ju Chen, Jeu-Jung Chen, Cheng-Fen Chang, Shiu-Yan Yang

Abstract:

Taiwan now is an aging society Research on the elderly should not be confined to caring for seniors, but should also be focused on ways to improve health and the quality of life. Senior citizens who participate in volunteer services could become less lonely, have new growth opportunities, and regain a sense of accomplishment. Thus, the question of how to get the elderly to participate in volunteer service is worth exploring. Apply the Transtheoretical Model to understand stages of change in regular volunteer service and voluntary service behaviour among the seniors. 1525 adults over the age of 65 from the Renai district of Keelung City were interviewed. The research tool was a self-constructed questionnaire and individual interviews were conducted to collect data. Then the data was processed and analyzed using the IBM SPSS Statistics 20 (Windows version) statistical software program. In the past six months, research subjects averaged 9.92 days of volunteer services. A majority of these elderly individuals had no intention to change their regular volunteer services. We discovered that during the maintenance stage, the self-efficacy for volunteer services was higher than during all other stages, but self-perceived barriers were less during the preparation stage and action stage. Self-perceived benefits were found to have an important predictive power for those with regular volunteer service behaviors in the previous stage, and self-efficacy was found to have an important predictive power for those with regular volunteer service behaviors in later stages. The research results support the conclusion that community nursing staff should group elders based on their regular volunteer services change stages and design appropriate behavioral change strategies.

Keywords: seniors, stages of change in regular volunteer services, volunteer service behavior, self-efficacy, self-perceived benefits

Procedia PDF Downloads 407
24732 Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Kyung Bae Park, Sung Ho Ha

Abstract:

Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and a pressing need to handle the overwhelmingly plethora amount of various UGC is one of the paramount issues for management. However, a current approach (e.g., sentiment analysis) is often ineffective for leveraging textual information to detect the problems or issues that a certain management suffers from. In this paper, we employ text mining of Latent Dirichlet Allocation (LDA) on a popular online review site dedicated to complaint from users. We find that the employed LDA efficiently detects customer complaints, and a further inspection with the visualization technique is effective to categorize the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given the limited amount of resources. The findings provide managerial insights into how analytics on social media can help maintain and improve their reputation management. Our interdisciplinary approach also highlights several insights by applying machine learning techniques in marketing research domain. On a broader technical note, this paper illustrates the details of how to implement LDA in R program from a beginning (data collection in R) to an end (LDA analysis in R) since the instruction is still largely undocumented. In this regard, it will help lower the boundary for interdisciplinary researcher to conduct related research.

Keywords: latent dirichlet allocation, R program, text mining, topic model, user generated contents, visualization

Procedia PDF Downloads 173
24731 A New Approach for Improving Accuracy of Multi Label Stream Data

Authors: Kunal Shah, Swati Patel

Abstract:

Many real world problems involve data which can be considered as multi-label data streams. Efficient methods exist for multi-label classification in non streaming scenarios. However, learning in evolving streaming scenarios is more challenging, as the learners must be able to adapt to change using limited time and memory. Classification is used to predict class of unseen instance as accurate as possible. Multi label classification is a variant of single label classification where set of labels associated with single instance. Multi label classification is used by modern applications, such as text classification, functional genomics, image classification, music categorization etc. This paper introduces the task of multi-label classification, methods for multi-label classification and evolution measure for multi-label classification. Also, comparative analysis of multi label classification methods on the basis of theoretical study, and then on the basis of simulation was done on various data sets.

Keywords: binary relevance, concept drift, data stream mining, MLSC, multiple window with buffer

Procedia PDF Downloads 568
24730 Discovering the Effects of Meteorological Variables on the Air Quality of Bogota, Colombia, by Data Mining Techniques

Authors: Fabiana Franceschi, Martha Cobo, Manuel Figueredo

Abstract:

Bogotá, the capital of Colombia, is its largest city and one of the most polluted in Latin America due to the fast economic growth over the last ten years. Bogotá has been affected by high pollution events which led to the high concentration of PM10 and NO2, exceeding the local 24-hour legal limits (100 and 150 g/m3 each). The most important pollutants in the city are PM10 and PM2.5 (which are associated with respiratory and cardiovascular problems) and it is known that their concentrations in the atmosphere depend on the local meteorological factors. Therefore, it is necessary to establish a relationship between the meteorological variables and the concentrations of the atmospheric pollutants such as PM10, PM2.5, CO, SO2, NO2 and O3. This study aims to determine the interrelations between meteorological variables and air pollutants in Bogotá, using data mining techniques. Data from 13 monitoring stations were collected from the Bogotá Air Quality Monitoring Network within the period 2010-2015. The Principal Component Analysis (PCA) algorithm was applied to obtain primary relations between all the parameters, and afterwards, the K-means clustering technique was implemented to corroborate those relations found previously and to find patterns in the data. PCA was also used on a per shift basis (morning, afternoon, night and early morning) to validate possible variation of the previous trends and a per year basis to verify that the identified trends have remained throughout the study time. Results demonstrated that wind speed, wind direction, temperature, and NO2 are the most influencing factors on PM10 concentrations. Furthermore, it was confirmed that high humidity episodes increased PM2,5 levels. It was also found that there are direct proportional relationships between O3 levels and wind speed and radiation, while there is an inverse relationship between O3 levels and humidity. Concentrations of SO2 increases with the presence of PM10 and decreases with the wind speed and wind direction. They proved as well that there is a decreasing trend of pollutant concentrations over the last five years. Also, in rainy periods (March-June and September-December) some trends regarding precipitations were stronger. Results obtained with K-means demonstrated that it was possible to find patterns on the data, and they also showed similar conditions and data distribution among Carvajal, Tunal and Puente Aranda stations, and also between Parque Simon Bolivar and las Ferias. It was verified that the aforementioned trends prevailed during the study period by applying the same technique per year. It was concluded that PCA algorithm is useful to establish preliminary relationships among variables, and K-means clustering to find patterns in the data and understanding its distribution. The discovery of patterns in the data allows using these clusters as an input to an Artificial Neural Network prediction model.

Keywords: air pollution, air quality modelling, data mining, particulate matter

Procedia PDF Downloads 243
24729 Prediction Factor of Recurrence Supraventricular Tachycardia After Adenosine Treatment in the Emergency Department

Authors: Chaiyaporn Yuksen

Abstract:

Backgroud: Supraventricular tachycardia (SVT) is an abnormally fast atrial tachycardia characterized by narrow (≤ 120 ms) and constant QRS. Adenosine was the drug of choice; the first dose was 6 mg. It can be repeated with the second and third doses of 12 mg, with greater than 90% success. The study found that patients observed at 4 hours after normal sinus rhythm was no recurrence within 24 hours. The objective of this study was to investigate the factors that influence the recurrence of SVT after adenosine in the emergency department (ED). Method: The study was conducted retrospectively exploratory model, prognostic study at the Emergency Department (ED) in Faculty of Medicine, Ramathibodi Hospital, a university-affiliated super tertiary care hospital in Bangkok, Thailand. The study was conducted for ten years period between 2010 and 2020. The inclusion criteria were age > 15 years, visiting the ED with SVT, and treating with adenosine. Those patients were recorded with the recurrence SVT in ED. The multivariable logistic regression model developed the predictive model and prediction score for recurrence PSVT. Result: 264 patients met the study criteria. Of those, 24 patients (10%) had recurrence PSVT. Five independent factors were predictive of recurrence PSVT. There was age>65 years, heart rate (after adenosine) > 100 per min, structural heart disease, and dose of adenosine. The clinical risk score to predict recurrence PSVT is developed accuracy 74.41%. The score of >6 had the likelihood ratio of recurrence PSVT by 5.71 times Conclusion: The clinical predictive score of > 6 was associated with recurrence PSVT in ED.

Keywords: clinical prediction score, SVT, recurrence, emergency department

Procedia PDF Downloads 137
24728 A Posterior Predictive Model-Based Control Chart for Monitoring Healthcare

Authors: Yi-Fan Lin, Peter P. Howley, Frank A. Tuyl

Abstract:

Quality measurement and reporting systems are used in healthcare internationally. In Australia, the Australian Council on Healthcare Standards records and reports hundreds of clinical indicators (CIs) nationally across the healthcare system. These CIs are measures of performance in the clinical setting, and are used as a screening tool to help assess whether a standard of care is being met. Existing analysis and reporting of these CIs incorporate Bayesian methods to address sampling variation; however, such assessments are retrospective in nature, reporting upon the previous six or twelve months of data. The use of Bayesian methods within statistical process control for monitoring systems is an important pursuit to support more timely decision-making. Our research has developed and assessed a new graphical monitoring tool, similar to a control chart, based on the beta-binomial posterior predictive (BBPP) distribution to facilitate the real-time assessment of health care organizational performance via CIs. The BBPP charts have been compared with the traditional Bernoulli CUSUM (BC) chart by simulation. The more traditional “central” and “highest posterior density” (HPD) interval approaches were each considered to define the limits, and the multiple charts were compared via in-control and out-of-control average run lengths (ARLs), assuming that the parameter representing the underlying CI rate (proportion of cases with an event of interest) required estimation. Preliminary results have identified that the BBPP chart with HPD-based control limits provides better out-of-control run length performance than the central interval-based and BC charts. Further, the BC chart’s performance may be improved by using Bayesian parameter estimation of the underlying CI rate.

Keywords: average run length (ARL), bernoulli cusum (BC) chart, beta binomial posterior predictive (BBPP) distribution, clinical indicator (CI), healthcare organization (HCO), highest posterior density (HPD) interval

Procedia PDF Downloads 190
24727 A Hybrid Recommendation System Based on Association Rules

Authors: Ahmed Mohammed Alsalama

Abstract:

Recommendation systems are widely used in e-commerce applications. The engine of a current recommendation system recommends items to a particular user based on user preferences and previous high ratings. Various recommendation schemes such as collaborative filtering and content-based approaches are used to build a recommendation system. Most of the current recommendation systems were developed to fit a certain domain such as books, articles, and movies. We propose a hybrid framework recommendation system to be applied on two-dimensional spaces (User x Item) with a large number of Users and a small number of Items. Moreover, our proposed framework makes use of both favorite and non-favorite items of a particular user. The proposed framework is built upon the integration of association rules mining and the content-based approach. The results of experiments show that our proposed framework can provide accurate recommendations to users.

Keywords: data mining, association rules, recommendation systems, hybrid systems

Procedia PDF Downloads 443
24726 Feature Selection for Production Schedule Optimization in Transition Mines

Authors: Angelina Anani, Ignacio Ortiz Flores, Haitao Li

Abstract:

The use of underground mining methods have increased significantly over the past decades. This increase has also been spared on by several mines transitioning from surface to underground mining. However, determining the transition depth can be a challenging task, especially when coupled with production schedule optimization. Several researchers have simplified the problem by excluding operational features relevant to production schedule optimization. Our research objective is to investigate the extent to which operational features of transition mines accounted for affect the optimal production schedule. We also provide a framework for factors to consider in production schedule optimization for transition mines. An integrated mixed-integer linear programming (MILP) model is developed that maximizes the NPV as a function of production schedule and transition depth. A case study is performed to validate the model, with a comparative sensitivity analysis to obtain operational insights.

Keywords: underground mining, transition mines, mixed-integer linear programming, production schedule

Procedia PDF Downloads 154
24725 Effect of Bacillus Pumilus Strains on Heavy Metal Accumulation in Lettuce Grown on Contaminated Soil

Authors: Sabeen Alam, Mehboob Alam

Abstract:

The research work entitled “Effect of Bacillus pumilus strains on heavy metal accumulation in lettuce grown on contaminated soil” focused on functional role of Bacillus pumilus strains inoculated with lettuce seed in mitigating heavy metal in chromite mining soil. In this experiment, factor A was three Bacillus pumilus strains (sequence C-2PMW-8, C-1 SSK-8 and C-1 PWK-7) while soil used for this experiment was collected from Prang Ghar mining site and lettuce seeds were grown in three levels of chromite mining soil (2.27, 4.65 and 7.14 %). For mining soil minimum days to germinate noted in lettuce grown on garden soil inoculated with sequence. Maximum germination percentage noted was for C-1 SSK-8 grown on garden soil, maximum lettuce height for sequence C-2 PWM-8, fresh leaf weight for C-1 PWK-7 inoculated lettuce, dry weight of lettuce leaf for lettuce inoculated with C-1 SSK-8 and C-1 PWK-7 strains, number of leaves per plant for lettuce inoculated with C-1 SSK-8, leaf area for C-2 PMW-8 inoculated lettuce, survival percentage for C-1 SSK-8 treated lettuce and chlorophyll content for C-2 PMW-8. Results related to heavy metals accumulation showed that minimum chromium was in lettuce and in soil for all three sequences, cadmium (Cd) in lettuce and in soil for all three sequences, manganese (Mn) in lettuce and in soil for three sequences, lead (Pb) in lettuce and in soil for three sequences. It can be concluded that chromite mining soil significantly reduced the growth and survival of lettuce, but when lettuce was inoculated with Bacillus.pumilus strains, it enhances growth and survival. Similarly, minimum heavy metal accumulation in plant and soil, regardless of type of Bacillus pumilus used, all three sequences has same mitigating effect on heavy metal in both soil and lettuce. All the three Bacillus pumilus strains ensured reduction in heavy metals content (Mn, Cd, Cr) in lettuce, below the maximum permissible limits of WHO 2011.

Keywords: bacillus pumilus, heavy metals, permissible limits, lettuce, chromite mining soil, mitigating effect

Procedia PDF Downloads 37
24724 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models

Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti

Abstract:

In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.

Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics

Procedia PDF Downloads 29
24723 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are a unit of nice connection as a result of the supply a straightforward and fast management of all aspects relating to a patient, not essentially medical. What is more, there are unit additional and additional cases of pathologies during which diagnosing and treatment may be solely allotted by victimization medical imaging techniques. With associate ever-increasing prevalence, medical pictures area unit directly acquired in or regenerate into digital type, for his or her storage additionally as sequent retrieval and process. Data Mining is the process of extracting information from large data sets through using algorithms and Techniques drawn from the field of Statistics, Machine Learning and Data Base Management Systems. Forecasting may be a prediction of what's going to occur within the future, associated it's an unsure method. Owing to the uncertainty, the accuracy of a forecast is as vital because the outcome foretold by foretelling the freelance variables. A forecast management should be wont to establish if the accuracy of the forecast is within satisfactory limits. Fuzzy regression strategies have normally been wont to develop shopper preferences models that correlate the engineering characteristics with shopper preferences relating to a replacement product; the patron preference models offer a platform, wherever by product developers will decide the engineering characteristics so as to satisfy shopper preferences before developing the merchandise. Recent analysis shows that these fuzzy regression strategies area units normally will not to model client preferences. We tend to propose a Testing the strength of Exponential Regression Model over regression toward the mean Model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 261
24722 Evaluation of the Accuracy of a ‘Two Question Screening Tool’ in the Detection of Intimate Partner Violence in a Primary Healthcare Setting in South Africa

Authors: A. Saimen, E. Armstrong, C. Manitshana

Abstract:

Intimate partner violence (IPV) has been recognised as a global human rights violation. It is universally under diagnosed and the institution of timeous multi-faceted interventions has been noted to benefit IPV victims. Currently, the concept of using a screening tool to detect IPV has not been widely explored in a primary healthcare setting in South Africa, and it was for this reason that this study has been undertaken. A systematic random sampling of 1 in 8 women over a period of 3 months was conducted prospectively at the OPD of a Level 1 Hospital. Participants were asked about their experience of IPV during the past 12 months. The WAST-short, a two-question tool, was used to screen patients for IPV. To verify the result of the screening, women were also asked the remaining questions from the WAST. Data was collected from 400 participants, with a response rate of 99.3%. The prevalence of IPV in the sample was 32%. The WAST-short was shown to have the following operating characteristics: sensitivity 45.2%, specificity 98%,positive predictive value 98%, negative predictive value 79%. The WAST-short lacks sufficient sensitivity and therefore is not an ideal screening tool for this setting. Improvement in the sensitivity of the WAST-short in this setting may be achieved by lowering the threshold for a positive result for IPV screening, and modification of the screening questions to better reflect IPV as understood by the local population.

Keywords: domestic violence, intimate partner violence, screening, screening tools

Procedia PDF Downloads 290
24721 The Human Right to a Safe, Clean and Healthy Environment in Corporate Social Responsibility's Strategies: An Approach to Understanding Mexico's Mining Sector

Authors: Thalia Viveros-Uehara

Abstract:

The virtues of Corporate Social Responsibility (CSR) are explored widely in the academic literature. However, few studies address its link to human rights, per se; specifically, the right to a safe, clean and healthy environment. Fewer still are the research works in this area that relate to developing countries, where a number of areas are biodiversity hotspots. In Mexico, despite the rise and evolution of CSR schemes, grave episodes of pollution persist, especially those caused by the mining industry. These cases set up the question of the correspondence between the current CSR practices of mining companies in the country and their responsibility to respect the right to a safe, clean and healthy environment. The present study approaches precisely such a bridge, which until now has not been fully tackled in light of Mexico's 2011 constitutional human rights amendment and the United Nation's Guiding Principles on Business and Human Rights (UN Guiding Principles), adopted by the Human Rights Council in 2011. To that aim, it initially presents a contextual framework; it then explores qualitatively the adoption of human rights’ language in the CSR strategies of the three main mining companies in Mexico, and finally, it examines their standing with respect to the UN Guiding Principles. The results reveal that human rights are included in the RSE strategies of the analysed businesses, at least at the rhetoric level; however, they do not embrace the right to a safe, clean and healthy environment as such. Moreover, we conclude that despite the finding that corporations publicly express their commitment to respect human rights, some operational weaknesses that hamper the exercise of such responsibility persist; for example, the systematic lack of human rights impact assessments per mining unit, the denial of actual and publicly-known negative episodes on the environment linked directly to their operations, and the absence of effective mechanisms to remediate adverse impacts.

Keywords: corporate social responsibility, environmental impacts, human rights, right to a safe, clean and healthy environment, mining industry

Procedia PDF Downloads 313
24720 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve safety and reliability of manufacturing operations and reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods are evaluated using real data. The results shows that the calibration model based on supervised probabilistic model yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 341
24719 Comparison between Transient Elastography (FibroScan) and Liver Biopsy for Diagnosis of Hepatic Fibrosis in Chronic Hepatitis C Genotype 4

Authors: Gamal Shiha, Seham Seif, Shahera Etreby, Khaled Zalata, Waleed Samir

Abstract:

Background: Transient Elastography (TE; FibroScan®) is a non-invasive technique to assess liver fibrosis. Aim: To compare TE and liver biopsy in hepatitis C virus (HCV) patients, genotype IV and evaluate the effect of steatosis and schistosomiasis on FibroScan. Methods: The fibrosis stage (METAVIR Score) TE, was assessed in 519 patients. The diagnostic performance of FibroScan is assessed by calculating the area under the receiver operating characteristic curves (AUROCs). Results: The cut-off value of ≥ F2 was 8.55 kPa, ≥ F3 was 10.2 kPa and cirrhosis = F4 was 16.3 kPa. The positive predictive value and negative predictive value were 70.1% and 81.7% for the diagnosis of ≥ F2, 62.6% and 96.22% for F ≥ 3, and 27.7% and 100% for F4. No significant difference between schistosomiasis, steatosis degree and FibroScan measurements. Conclusion: Fibroscan could accurately predict liver fibrosis.

Keywords: chronic hepatitis C, FibroScan, liver biopsy, liver fibrosis

Procedia PDF Downloads 388
24718 State Estimation Based on Unscented Kalman Filter for Burgers’ Equation

Authors: Takashi Shimizu, Tomoaki Hashimoto

Abstract:

Controlling the flow of fluids is a challenging problem that arises in many fields. Burgers’ equation is a fundamental equation for several flow phenomena such as traffic, shock waves, and turbulence. The optimal feedback control method, so-called model predictive control, has been proposed for Burgers’ equation. However, the model predictive control method is inapplicable to systems whose all state variables are not exactly known. In practical point of view, it is unusual that all the state variables of systems are exactly known, because the state variables of systems are measured through output sensors and limited parts of them can be only available. In fact, it is usual that flow velocities of fluid systems cannot be measured for all spatial domains. Hence, any practical feedback controller for fluid systems must incorporate some type of state estimator. To apply the model predictive control to the fluid systems described by Burgers’ equation, it is needed to establish a state estimation method for Burgers’ equation with limited measurable state variables. To this purpose, we apply unscented Kalman filter for estimating the state variables of fluid systems described by Burgers’ equation. The objective of this study is to establish a state estimation method based on unscented Kalman filter for Burgers’ equation. The effectiveness of the proposed method is verified by numerical simulations.

Keywords: observer systems, unscented Kalman filter, nonlinear systems, Burgers' equation

Procedia PDF Downloads 136