Search results for: data mining analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25601

Search results for: data mining analytics

25151 Predictive Analytics Algorithms: Mitigating Elementary School Drop Out Rates

Authors: Bongs Lainjo

Abstract:

Educational institutions and authorities that are mandated to run education systems in various countries need to implement a curriculum that considers the possibility and existence of elementary school dropouts. This research focuses on elementary school dropout rates and the ability to replicate various predictive models carried out globally on selected Elementary Schools. The study was carried out by comparing the classical case studies in Africa, North America, South America, Asia and Europe. Some of the reasons put forward for children dropping out include the notion of being successful in life without necessarily going through the education process. Such mentality is coupled with a tough curriculum that does not take care of all students. The system has completely led to poor school attendance - truancy which continuously leads to dropouts. In this study, the focus is on developing a model that can systematically be implemented by school administrations to prevent possible dropout scenarios. At the elementary level, especially the lower grades, a child's perception of education can be easily changed so that they focus on the better future that their parents desire. To deal effectively with the elementary school dropout problem, strategies that are put in place need to be studied and predictive models are installed in every educational system with a view to helping prevent an imminent school dropout just before it happens. In a competency-based curriculum that most advanced nations are trying to implement, the education systems have wholesome ideas of learning that reduce the rate of dropout.

Keywords: elementary school, predictive models, machine learning, risk factors, data mining, classifiers, dropout rates, education system, competency-based curriculum

Procedia PDF Downloads 175
25150 Implementation of Knowledge and Attitude Management Based on Holistic Approach in Andragogy Learning, as an Effort to Solve the Environmental Problems of Post-Coal Mining Activity

Authors: Aloysius Hardoko, Susilo

Abstract:

The root cause of the problem after the environmental damage due to coal mining activities defined as the province of East Kalimantan corridor masterplan economic activity accelerated the expansion of Indonesia's economic development (MP3EI) is the behavior of adults. Adult behavior can be changed through knowledge management and attitude. Based on the root of the problem, the objective of the research is to apply knowledge management and attitude based on holistic approach in learning andragogy as an effort to solve environmental problems after coal mining activities. Research methods to achieve the objective of using quantitative research with pretest postes group design. Knowledge management and attitudes based on a holistic approach in adult learning are applied through initial learning activities, core and case-based cover of environmental damage. The research instrument is a description of the case of environmental damage. The data analysis uses t-test to see the effect of knowledge management attitude based on holistic approach before and after adult learning. Location and sample of representative research of adults as many as 20 people in Kutai Kertanegara District, one of the districts in East Kalimantan province, which suffered the worst environmental damage. The conclusion of the research result is the application of knowledge management and attitude in adult learning influence to adult knowledge and attitude to overcome environmental problem post-coal mining activity.

Keywords: knowledge management and attitude, holistic approach, andragogy learning, environmental Issue

Procedia PDF Downloads 207
25149 Data Mining to Capture User-Experience: A Case Study in Notebook Product Appearance Design

Authors: Rhoann Kerh, Chen-Fu Chien, Kuo-Yi Lin

Abstract:

In the era of rapidly increasing notebook market, consumer electronics manufacturers are facing a highly dynamic and competitive environment. In particular, the product appearance is the first part for user to distinguish the product from the product of other brands. Notebook product should differ in its appearance to engage users and contribute to the user experience (UX). The UX evaluates various product concepts to find the design for user needs; in addition, help the designer to further understand the product appearance preference of different market segment. However, few studies have been done for exploring the relationship between consumer background and the reaction of product appearance. This study aims to propose a data mining framework to capture the user’s information and the important relation between product appearance factors. The proposed framework consists of problem definition and structuring, data preparation, rules generation, and results evaluation and interpretation. An empirical study has been done in Taiwan that recruited 168 subjects from different background to experience the appearance performance of 11 different portable computers. The results assist the designers to develop product strategies based on the characteristics of consumers and the product concept that related to the UX, which help to launch the products to the right customers and increase the market shares. The results have shown the practical feasibility of the proposed framework.

Keywords: consumers decision making, product design, rough set theory, user experience

Procedia PDF Downloads 313
25148 Mining Scientific Literature to Discover Potential Research Data Sources: An Exploratory Study in the Field of Haemato-Oncology

Authors: A. Anastasiou, K. S. Tingay

Abstract:

Background: Discovering suitable datasets is an important part of health research, particularly for projects working with clinical data from patients organized in cohorts (cohort data), but with the proliferation of so many national and international initiatives, it is becoming increasingly difficult for research teams to locate real world datasets that are most relevant to their project objectives. We present a method for identifying healthcare institutes in the European Union (EU) which may hold haemato-oncology (HO) data. A key enabler of this research was the bibInsight platform, a scientometric data management and analysis system developed by the authors at Swansea University. Method: A PubMed search was conducted using HO clinical terms taken from previous work. The resulting XML file was processed using the bibInsight platform, linking affiliations to the Global Research Identifier Database (GRID). GRID is an international, standardized list of institutions, including the city and country in which the institution exists, as well as a category of the main business type, e.g., Academic, Healthcare, Government, Company. Countries were limited to the 28 current EU members, and institute type to 'Healthcare'. An article was considered valid if at least one author was affiliated with an EU-based healthcare institute. Results: The PubMed search produced 21,310 articles, consisting of 9,885 distinct affiliations with correspondence in GRID. Of these articles, 760 were from EU countries, and 390 of these were healthcare institutes. One affiliation was excluded as being a veterinary hospital. Two EU countries did not have any publications in our analysis dataset. The results were analysed by country and by individual healthcare institute. Networks both within the EU and internationally show institutional collaborations, which may suggest a willingness to share data for research purposes. Geographical mapping can ensure that data has broad population coverage. Collaborations with industry or government may exclude healthcare institutes that may have embargos or additional costs associated with data access. Conclusions: Data reuse is becoming increasingly important both for ensuring the validity of results, and economy of available resources. The ability to identify potential, specific data sources from over twenty thousand articles in less than an hour could assist in improving knowledge of, and access to, data sources. As our method has not yet specified if these healthcare institutes are holding data, or merely publishing on that topic, future work will involve text mining of data-specific concordant terms to identify numbers of participants, demographics, study methodologies, and sub-topics of interest.

Keywords: data reuse, data discovery, data linkage, journal articles, text mining

Procedia PDF Downloads 115
25147 Financial Assessment of the Hard Coal Mining in the Chosen Region in the Czech Republic: Real Options Methodology Application

Authors: Miroslav Čulík, Petr Gurný

Abstract:

This paper is aimed at the financial assessment of the hard coal mining in a given region by real option methodology application. Hard coal mining in this mine makes net loss for the owner during the last years due to the long-term unfavourable mining conditions and significant drop in the coal prices during the last years. Management is going to shut down the operation and abandon the project to reduce the loss of the company. The goal is to assess whether the shutting down the operation is the only and correct solution of the problem. Due to the uncertainty in the future hard coal price evolution, the production might be again restarted if the price raises enough to cover the cost of the production. For the assessment, real option methodology is applied, which captures two important aspect of the financial decision-making: risk and flexibility. The paper is structured as follows: first, current state is described and problem is analysed. Next, methodology of real options is described. At last, project is evaluated by applying real option methodology. The results are commented and recommendations are provided.

Keywords: real option, investment, option to abandon, option to shut down and restart, risk, flexibility

Procedia PDF Downloads 548
25146 Improve Student Performance Prediction Using Majority Vote Ensemble Model for Higher Education

Authors: Wade Ghribi, Abdelmoty M. Ahmed, Ahmed Said Badawy, Belgacem Bouallegue

Abstract:

In higher education institutions, the most pressing priority is to improve student performance and retention. Large volumes of student data are used in Educational Data Mining techniques to find new hidden information from students' learning behavior, particularly to uncover the early symptom of at-risk pupils. On the other hand, data with noise, outliers, and irrelevant information may provide incorrect conclusions. By identifying features of students' data that have the potential to improve performance prediction results, comparing and identifying the most appropriate ensemble learning technique after preprocessing the data, and optimizing the hyperparameters, this paper aims to develop a reliable students' performance prediction model for Higher Education Institutions. Data was gathered from two different systems: a student information system and an e-learning system for undergraduate students in the College of Computer Science of a Saudi Arabian State University. The cases of 4413 students were used in this article. The process includes data collection, data integration, data preprocessing (such as cleaning, normalization, and transformation), feature selection, pattern extraction, and, finally, model optimization and assessment. Random Forest, Bagging, Stacking, Majority Vote, and two types of Boosting techniques, AdaBoost and XGBoost, are ensemble learning approaches, whereas Decision Tree, Support Vector Machine, and Artificial Neural Network are supervised learning techniques. Hyperparameters for ensemble learning systems will be fine-tuned to provide enhanced performance and optimal output. The findings imply that combining features of students' behavior from e-learning and students' information systems using Majority Vote produced better outcomes than the other ensemble techniques.

Keywords: educational data mining, student performance prediction, e-learning, classification, ensemble learning, higher education

Procedia PDF Downloads 108
25145 The Role of Technology in Transforming the Finance, Banking, and Insurance Sectors

Authors: Farid Fahami

Abstract:

This article explores the transformative role of technology in the finance, banking, and insurance sectors. It examines key technological trends such as AI, blockchain, data analytics, and digital platforms and their impact on operations, customer experiences, and business models. The article highlights the benefits of technology adoption, including improved efficiency, cost reduction, enhanced customer experiences, and expanded financial inclusion. It also addresses challenges like cybersecurity, data privacy, and the need for upskilling. Real-world case studies demonstrate successful technology integration, and recommendations for stakeholders emphasize embracing innovation and collaboration. The article concludes by emphasizing the importance of technology in shaping the future of these sectors.

Keywords: banking, finance, insurance, technology

Procedia PDF Downloads 72
25144 High School Gain Analytics From National Assessment Program – Literacy and Numeracy and Australian Tertiary Admission Rankin Linkage

Authors: Andrew Laming, John Hattie, Mark Wilson

Abstract:

Nine Queensland Independent high schools provided deidentified student-matched ATAR and NAPLAN data for all 1217 ATAR graduates since 2020 who also sat NAPLAN at the school. Graduating cohorts from the nine schools contained a mean 100 ATAR graduates with previous NAPLAN data from their school. Excluded were vocational students (mean=27) and any ATAR graduates without NAPLAN data (mean=20). Based on Index of Community Socio-Educational Access (ICSEA) prediction, all schools had larger that predicted proportions of their students graduating with ATARs. There were an additional 173 students not releasing their ATARs to their school (14%), requiring this data to be inferred by schools. Gain was established by first converting each student’s strongest NAPLAN domain to a statewide percentile, then subtracting this result from final ATAR. The resulting ‘percentile shift’ was corrected for plausible ATAR participation at each NAPLAN level. Strongest NAPLAN domain had the highest correlation with ATAR (R2=0.58). RESULTS School mean NAPLAN scores fitted ICSEA closely (R2=0.97). Schools achieved a mean cohort gain of two ATAR rankings, but only 66% of students gained. This ranged from 46% of top-NAPLAN decile students gaining, rising to 75% achieving gains outside the top decile. The 54% of top-decile students whose ATAR fell short of prediction lost a mean 4.0 percentiles (or 6.2 percentiles prior to correction for regression to the mean). 71% of students in smaller schools gained, compared to 63% in larger schools. NAPLAN variability in each of the 13 ICSEA1100 cohorts was 17%, with both intra-school and inter-school variation of these values extremely low (0.3% to 1.8%). Mean ATAR change between years in each school was just 1.1 ATAR ranks. This suggests consecutive school cohorts and ICSEA-similar schools share very similar distributions and outcomes over time. Quantile analysis of the NAPLAN/ATAR revealed heteroscedasticity, but splines offered little additional benefit over simple linear regression. The NAPLAN/ATAR R2 was 0.33. DISCUSSION Standardised data like NAPLAN and ATAR offer educators a simple no-cost progression metric to analyse performance in conjunction with their internal test results. Change is expressed in percentiles, or ATAR shift per student, which is layperson intuitive. Findings may also reduce ATAR/vocational stream mismatch, reveal proportions of cohorts meeting or falling short of expectation and demonstrate by how much. Finally, ‘crashed’ ATARs well below expectation are revealed, which schools can reasonably work to minimise. The percentile shift method is neither value-add nor a growth percentile. In the absence of exit NAPLAN testing, this metric is unable to discriminate academic gain from legitimate ATAR-maximizing strategies. But by controlling for ICSEA, ATAR proportion variation and student mobility, it uncovers progression to ATAR metrics which are not currently publicly available. However achieved, ATAR maximisation is a sought-after private good. So long as standardised nationwide data is available, this analysis offers useful analytics for educators and reasonable predictivity when counselling subsequent cohorts about their ATAR prospects.  

Keywords: NAPLAN, ATAR, analytics, measurement, gain, performance, data, percentile, value-added, high school, numeracy, reading comprehension, variability, regression to the mean

Procedia PDF Downloads 68
25143 A Framework of Product Information Service System Using Mobile Image Retrieval and Text Mining Techniques

Authors: Mei-Yi Wu, Shang-Ming Huang

Abstract:

The online shoppers nowadays often search the product information on the Internet using some keywords of products. To use this kind of information searching model, shoppers should have a preliminary understanding about their interesting products and choose the correct keywords. However, if the products are first contact (for example, the worn clothes or backpack of passengers which you do not have any idea about the brands), these products cannot be retrieved due to insufficient information. In this paper, we discuss and study the applications in E-commerce using image retrieval and text mining techniques. We design a reasonable E-commerce application system containing three layers in the architecture to provide users product information. The system can automatically search and retrieval similar images and corresponding web pages on Internet according to the target pictures which taken by users. Then text mining techniques are applied to extract important keywords from these retrieval web pages and search the prices on different online shopping stores with these keywords using a web crawler. Finally, the users can obtain the product information including photos and prices of their favorite products. The experiments shows the efficiency of proposed system.

Keywords: mobile image retrieval, text mining, product information service system, online marketing

Procedia PDF Downloads 359
25142 Customer Preference in the Textile Market: Fabric-Based Analysis

Authors: Francisca Margarita Ocran

Abstract:

Underwear, and more particularly bras and panties, are defined as intimate clothing. Strictly speaking, they enhance the place of women in the public or private satchel. Therefore, women's lingerie is a complex garment with a high involvement profile, motivating consumers to buy it not only by its functional utility but also by the multisensory experience it provides them. Customer behavior models are generally based on customer data mining, and each model is designed to answer questions at a specific time. Predicting the customer experience is uncertain and difficult. Thus, knowledge of consumers' tastes in lingerie deserves to be treated as an experiential product, where the dimensions of the experience motivating consumers to buy a lingerie product and to remain faithful to it must be analyzed in detail by the manufacturers and retailers to engage and retain consumers, which is why this research aims to identify the variables that push consumers to choose their lingerie product, based on an in-depth analysis of the types of fabrics used to make lingerie. The data used in this study comes from online purchases. Machine learning approach with the use of Python programming language and Pycaret gives us a precision of 86.34%, 85.98%, and 84.55% for the three algorithms to use concerning the preference of a buyer in front of a range of lingerie. Gradient Boosting, random forest, and K Neighbors were used in this study; they are very promising and rich in the classification of preference in the textile industry.

Keywords: consumer behavior, data mining, lingerie, machine learning, preference

Procedia PDF Downloads 90
25141 Employing a Knime-based and Open-source Tools to Identify AMI and VER Metabolites from UPLC-MS Data

Authors: Nouf Alourfi

Abstract:

This study examines the metabolism of amitriptyline (AMI) and verapamil (VER) using a KNIME-based method. KNIME improved workflow is an open-source data-analytics platform that integrates a number of open-source metabolomics tools such as CFMID and MetFrag to provide standard data visualisations, predict candidate metabolites, assess them against experimental data, and produce reports on identified metabolites. The use of this workflow is demonstrated by employing three types of liver microsomes (human, rat, and Guinea pig) to study the in vitro metabolism of the two drugs (AMI and VER). This workflow is used to create and treat UPLC-MS (Orbitrap) data. The formulas and structures of these drugs' metabolites can be assigned automatically. The key metabolic routes for amitriptyline are hydroxylation, N-dealkylation, N-oxidation, and conjugation, while N-demethylation, O-demethylation and N-dealkylation, and conjugation are the primary metabolic routes for verapamil. The identified metabolites are compatible to the published, clarifying the solidity of the workflow technique and the usage of computational tools like KNIME in supporting the integration and interoperability of emerging novel software packages in the metabolomics area.

Keywords: KNIME, CFMID, MetFrag, Data Analysis, Metabolomics

Procedia PDF Downloads 119
25140 Virtual Dimension Analysis of Hyperspectral Imaging to Characterize a Mining Sample

Authors: L. Chevez, A. Apaza, J. Rodriguez, R. Puga, H. Loro, Juan Z. Davalos

Abstract:

Virtual Dimension (VD) procedure is used to analyze Hyperspectral Image (HIS) treatment-data in order to estimate the abundance of mineral components of a mining sample. Hyperspectral images coming from reflectance spectra (NIR region) are pre-treated using Standard Normal Variance (SNV) and Minimum Noise Fraction (MNF) methodologies. The endmember components are identified by the Simplex Growing Algorithm (SVG) and after adjusted to the reflectance spectra of reference-databases using Simulated Annealing (SA) methodology. The obtained abundance of minerals of the sample studied is very near to the ones obtained using XRD with a total relative error of 2%.

Keywords: hyperspectral imaging, minimum noise fraction, MNF, simplex growing algorithm, SGA, standard normal variance, SNV, virtual dimension, XRD

Procedia PDF Downloads 158
25139 Choosing an Optimal Epsilon for Differentially Private Arrhythmia Analysis

Authors: Arin Ghazarian, Cyril Rakovski

Abstract:

Differential privacy has become the leading technique to protect the privacy of individuals in a database while allowing useful analysis to be done and the results to be shared. It puts a guarantee on the amount of privacy loss in the worst-case scenario. Differential privacy is not a toggle between full privacy and zero privacy. It controls the tradeoff between the accuracy of the results and the privacy loss using a single key parameter called

Keywords: arrhythmia, cardiology, differential privacy, ECG, epsilon, medi-cal data, privacy preserving analytics, statistical databases

Procedia PDF Downloads 152
25138 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used to classify into groups. The result of the analysis is the pattern which can be used for identification of data set without the need to obtain input data used for creation of this pattern. An important requirement in this process is careful data preparation validation of model used and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of the genetic diversity. In case of missing pedigree information, other methods can be used for traceability of animal´s origin. Genetic diversity written in genetic data is holding relatively useful information to identify animals originated from individual countries. We can conclude that the application of data mining for molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.

Keywords: genetic data, Pinzgau cattle, supervised learning, machine learning

Procedia PDF Downloads 550
25137 Mining Coupled to Agriculture: Systems Thinking in Scalable Food Production

Authors: Jason West

Abstract:

Low profitability in agriculture production along with increasing scrutiny over environmental effects is limiting food production at scale. In contrast, the mining sector offers access to resources including energy, water, transport and chemicals for food production at low marginal cost. Scalable agricultural production can benefit from the nexus of resources (water, energy, transport) offered by mining activity in remote locations. A decision support bioeconomic model for controlled environment vertical farms was used. Four submodels were used: crop structure, nutrient requirements, resource-crop integration, and economic. They escalate to a macro mathematical model. A demonstrable dynamic systems framework is needed to prove productive outcomes are feasible. We demonstrate a generalized bioeconomic macro model for controlled environment production systems in minesites using systems dynamics modeling methodology. Despite the complexity of bioeconomic modelling of resource-agricultural dynamic processes and interactions, the economic potential greater than general economic models would assume. Scalability of production as an input becomes a key success feature.

Keywords: crop production systems, mathematical model, mining, agriculture, dynamic systems

Procedia PDF Downloads 77
25136 A Data-Mining Model for Protection of FACTS-Based Transmission Line

Authors: Ashok Kalagura

Abstract:

This paper presents a data-mining model for fault-zone identification of flexible AC transmission systems (FACTS)-based transmission line including a thyristor-controlled series compensator (TCSC) and unified power-flow controller (UPFC), using ensemble decision trees. Given the randomness in the ensemble of decision trees stacked inside the random forests model, it provides an effective decision on the fault-zone identification. Half-cycle post-fault current and voltage samples from the fault inception are used as an input vector against target output ‘1’ for the fault after TCSC/UPFC and ‘1’ for the fault before TCSC/UPFC for fault-zone identification. The algorithm is tested on simulated fault data with wide variations in operating parameters of the power system network, including noisy environment providing a reliability measure of 99% with faster response time (3/4th cycle from fault inception). The results of the presented approach using the RF model indicate the reliable identification of the fault zone in FACTS-based transmission lines.

Keywords: distance relaying, fault-zone identification, random forests, RFs, support vector machine, SVM, thyristor-controlled series compensator, TCSC, unified power-flow controller, UPFC

Procedia PDF Downloads 423
25135 Case Study about Women Driving in Saudi Arabia Announced in 2018: Netnographic and Data Mining Study

Authors: Majdah Alnefaie

Abstract:

The ‘netnographic study’ and data mining have been used to monitor the public interaction on Social Media Sites (SMSs) to understand what the motivational factors influence the Saudi intentions regarding allowing women driving in Saudi Arabia in 2018. The netnographic study monitored the publics’ textual and visual communications in Twitter, Snapchat, and YouTube. SMSs users’ communications method is also known as electronic word of mouth (eWOM). Netnography methodology is still in its initial stages as it depends on manual extraction, reading and classification of SMSs users text. On the other hand, data mining is come from the computer and physical sciences background, therefore it is much harder to extract meaning from unstructured qualitative data. In addition, the new development in data mining software does not support the Arabic text, especially local slang in Saudi Arabia. Therefore, collaborations between social and computer scientists such as ‘netnographic study’ and data mining will enhance the efficiency of this study methodology leading to comprehensive research outcome. The eWOM communications between individuals on SMSs can promote a sense that sharing their preferences and experiences regarding politics and social government regulations is a part of their daily life, highlighting the importance of using SMSs as assistance in promoting participation in political and social. Therefore, public interactions on SMSs are important tools to comprehend people’s intentions regarding the new government regulations in the country. This study aims to answer this question, "What factors influence the Saudi Arabians' intentions of Saudi female's car-driving in 2018". The study utilized qualitative method known as netnographic study. The study used R studio to collect and analyses 27000 Saudi users’ comments from 25th May until 25th June 2018. The study has developed data collection model that support importing and analysing the Arabic text in the local slang. The data collection model in this study has been clustered based on different type of social networks, gender and the study main factors. The social network analysis was employed to collect comments from SMSs owned by governments’ originations, celebrities, vloggers, social activist and news SMSs accounts. The comments were collected from both males and females SMSs users. The sentiment analysis shows that the total number of positive comments Saudi females car driving was higher than negative comments. The data have provided the most important factors influenced the Saudi Arabians’ intention of Saudi females car driving including, culture and environment, freedom of choice, equal opportunities, security and safety. The most interesting finding indicted that women driving would play a role in increasing the individual freedom of choice. Saudi female will be able to drive cars to fulfill her daily life and family needs without being stressed due to the lack of transportation. The study outcome will help Saudi government to improve woman quality of life by increasing the ability to find more jobs and studies, increasing income through decreasing the spending on transport means such as taxi and having more freedom of choice in woman daily life needs. The study enhances the importance of using use marketing research to measure the public opinions on the new government regulations in the country. The study has explained the limitations and suggestions for future research.

Keywords: netnographic study, data mining, social media, Saudi Arabia, female driving

Procedia PDF Downloads 153
25134 Industrial Process Mining Based on Data Pattern Modeling and Nonlinear Analysis

Authors: Hyun-Woo Cho

Abstract:

Unexpected events may occur with serious impacts on industrial process. This work utilizes a data representation technique to model and to analyze process data pattern for the purpose of diagnosis. In this work, the use of triangular representation of process data is evaluated using simulation process. Furthermore, the effect of using different pre-treatment techniques based on such as linear or nonlinear reduced spaces was compared. This work extracted the fault pattern in the reduced space, not in the original data space. The results have shown that the non-linear technique based diagnosis method produced more reliable results and outperforms linear method.

Keywords: process monitoring, data analysis, pattern modeling, fault, nonlinear techniques

Procedia PDF Downloads 387
25133 The Analyzer: Clustering Based System for Improving Business Productivity by Analyzing User Profiles to Enhance Human Computer Interaction

Authors: Dona Shaini Abhilasha Nanayakkara, Kurugamage Jude Pravinda Gregory Perera

Abstract:

E-commerce platforms have revolutionized the shopping experience, offering convenient ways for consumers to make purchases. To improve interactions with customers and optimize marketing strategies, it is essential for businesses to understand user behavior, preferences, and needs on these platforms. This paper focuses on recommending businesses to customize interactions with users based on their behavioral patterns, leveraging data-driven analysis and machine learning techniques. Businesses can improve engagement and boost the adoption of e-commerce platforms by aligning behavioral patterns with user goals of usability and satisfaction. We propose TheAnalyzer, a clustering-based system designed to enhance business productivity by analyzing user-profiles and improving human-computer interaction. The Analyzer seamlessly integrates with business applications, collecting relevant data points based on users' natural interactions without additional burdens such as questionnaires or surveys. It defines five key user analytics as features for its dataset, which are easily captured through users' interactions with e-commerce platforms. This research presents a study demonstrating the successful distinction of users into specific groups based on the five key analytics considered by TheAnalyzer. With the assistance of domain experts, customized business rules can be attached to each group, enabling The Analyzer to influence business applications and provide an enhanced personalized user experience. The outcomes are evaluated quantitatively and qualitatively, demonstrating that utilizing TheAnalyzer’s capabilities can optimize business outcomes, enhance customer satisfaction, and drive sustainable growth. The findings of this research contribute to the advancement of personalized interactions in e-commerce platforms. By leveraging user behavioral patterns and analyzing both new and existing users, businesses can effectively tailor their interactions to improve customer satisfaction, loyalty and ultimately drive sales.

Keywords: data clustering, data standardization, dimensionality reduction, human computer interaction, user profiling

Procedia PDF Downloads 73
25132 Safety-critical Alarming Strategy Based on Statistically Defined Slope Deformation Behaviour Model Case Study: Upright-dipping Highwall in a Coal Mining Area

Authors: Lintang Putra Sadewa, Ilham Prasetya Budhi

Abstract:

Slope monitoring program has now become a mandatory campaign for any open pit mines around the world to operate safely. Utilizing various slope monitoring instruments and strategies, miners are now able to deliver precise decisions in mitigating the risk of slope failures which can be catastrophic. Currently, the most sophisticated slope monitoring technology available is the Slope Stability Radar (SSR), whichcan measure wall deformation in submillimeter accuracy. One of its eminent features is that SSRcan provide a timely warning by automatically raise an alarm when a predetermined rate-of-movement threshold is reached. However, establishing proper alarm thresholds is arguably one of the onerous challenges faced in any slope monitoring program. The difficulty mainly lies in the number of considerations that must be taken when generating a threshold becausean alarm must be effectivethat it should limit the occurrences of false alarms while alsobeing able to capture any real wall deformations. In this sense, experience shows that a site-specific alarm thresholdtendsto produce more reliable results because it considers site distinctive variables. This study will attempt to determinealarming thresholds for safety-critical monitoring based on an empirical model of slope deformation behaviour that is defined statistically fromdeformation data captured by the Slope Stability Radar (SSR). The study area comprises of upright-dipping highwall setting in a coal mining area with intense mining activities, andthe deformation data used for the study were recorded by the SSR throughout the year 2022. The model is site-specific in nature thus, valuable information extracted from the model (e.g., time-to-failure, onset-of-acceleration, and velocity) will be applicable in setting up site-specific alarm thresholds and will give a clear understanding of how deformation trends evolve over the area.

Keywords: safety-critical monitoring, alarming strategy, slope deformation behaviour model, coal mining

Procedia PDF Downloads 90
25131 Determining of the Performance of Data Mining Algorithm Determining the Influential Factors and Prediction of Ischemic Stroke: A Comparative Study in the Southeast of Iran

Authors: Y. Mehdipour, S. Ebrahimi, A. Jahanpour, F. Seyedzaei, B. Sabayan, A. Karimi, H. Amirifard

Abstract:

Ischemic stroke is one of the common reasons for disability and mortality. The fourth leading cause of death in the world and the third in some other sources. Only 1/3 of the patients with ischemic stroke fully recover, 1/3 of them end in permanent disability and 1/3 face death. Thus, the use of predictive models to predict stroke has a vital role in reducing the complications and costs related to this disease. Thus, the aim of this study was to specify the effective factors and predict ischemic stroke with the help of DM methods. The present study was a descriptive-analytic study. The population was 213 cases from among patients referring to Ali ibn Abi Talib (AS) Hospital in Zahedan. Data collection tool was a checklist with the validity and reliability confirmed. This study used DM algorithms of decision tree for modeling. Data analysis was performed using SPSS-19 and SPSS Modeler 14.2. The results of the comparison of algorithms showed that CHAID algorithm with 95.7% accuracy has the best performance. Moreover, based on the model created, factors such as anemia, diabetes mellitus, hyperlipidemia, transient ischemic attacks, coronary artery disease, and atherosclerosis are the most effective factors in stroke. Decision tree algorithms, especially CHAID algorithm, have acceptable precision and predictive ability to determine the factors affecting ischemic stroke. Thus, by creating predictive models through this algorithm, will play a significant role in decreasing the mortality and disability caused by ischemic stroke.

Keywords: data mining, ischemic stroke, decision tree, Bayesian network

Procedia PDF Downloads 174
25130 Modelling of Recovery and Application of Low-Grade Thermal Resources in the Mining and Mineral Processing Industry

Authors: S. McLean, J. A. Scott

Abstract:

The research topic is focusing on improving sustainable operation through recovery and reuse of waste heat in process water streams, an area in the mining industry that is often overlooked. There are significant advantages to the application of this topic, including economic and environmental benefits. The smelting process in the mining industry presents an opportunity to recover waste heat and apply it to alternative uses, thereby enhancing the overall process. This applied research has been conducted at the Sudbury Integrated Nickel Operations smelter site, in particular on the water cooling towers. The aim was to determine and optimize methods for appropriate recovery and subsequent upgrading of thermally low-grade heat lost from the water cooling towers in a manner that makes it useful for repurposing in applications, such as within an acid plant. This would be valuable to mining companies as it would be an opportunity to reduce the cost of the process, as well as decrease environmental impact and primary fuel usage. The waste heat from the cooling towers needs to be upgraded before it can be beneficially applied, as lower temperatures result in a decrease of the number of potential applications. Temperature and flow rate data were collected from the water cooling towers at an acid plant over two years. The research includes process control strategies and the development of a model capable of determining if the proposed heat recovery technique is economically viable, as well as assessing any environmental impact with the reduction in net energy consumption by the process. Therefore, comprehensive cost and impact analyses are carried out to determine the best area of application for the recovered waste heat. This method will allow engineers to easily identify the value of thermal resources available to them and determine if a full feasibility study should be carried out. The rapid scoping model developed will be applicable to any site that generates large amounts of waste heat. Results show that heat pumps are an economically viable solution for this application, allowing for reduced cost and CO₂ emissions.

Keywords: environment, heat recovery, mining engineering, sustainability

Procedia PDF Downloads 110
25129 Developing a Cloud Intelligence-Based Energy Management Architecture Facilitated with Embedded Edge Analytics for Energy Conservation in Demand-Side Management

Authors: Yu-Hsiu Lin, Wen-Chun Lin, Yen-Chang Cheng, Chia-Ju Yeh, Yu-Chuan Chen, Tai-You Li

Abstract:

Demand-Side Management (DSM) has the potential to reduce electricity costs and carbon emission, which are associated with electricity used in the modern society. A home Energy Management System (EMS) commonly used by residential consumers in a down-stream sector of a smart grid to monitor, control, and optimize energy efficiency to domestic appliances is a system of computer-aided functionalities as an energy audit for residential DSM. Implementing fault detection and classification to domestic appliances monitored, controlled, and optimized is one of the most important steps to realize preventive maintenance, such as residential air conditioning and heating preventative maintenance in residential/industrial DSM. In this study, a cloud intelligence-based green EMS that comes up with an Internet of Things (IoT) technology stack for residential DSM is developed. In the EMS, Arduino MEGA Ethernet communication-based smart sockets that module a Real Time Clock chip to keep track of current time as timestamps via Network Time Protocol are designed and implemented for readings of load phenomena reflecting on voltage and current signals sensed. Also, a Network-Attached Storage providing data access to a heterogeneous group of IoT clients via Hypertext Transfer Protocol (HTTP) methods is configured to data stores of parsed sensor readings. Lastly, a desktop computer with a WAMP software bundle (the Microsoft® Windows operating system, Apache HTTP Server, MySQL relational database management system, and PHP programming language) serves as a data science analytics engine for dynamic Web APP/REpresentational State Transfer-ful web service of the residential DSM having globally-Advanced Internet of Artificial Intelligence (AI)/Computational Intelligence. Where, an abstract computing machine, Java Virtual Machine, enables the desktop computer to run Java programs, and a mash-up of Java, R language, and Python is well-suited and -configured for AI in this study. Having the ability of sending real-time push notifications to IoT clients, the desktop computer implements Google-maintained Firebase Cloud Messaging to engage IoT clients across Android/iOS devices and provide mobile notification service to residential/industrial DSM. In this study, in order to realize edge intelligence that edge devices avoiding network latency and much-needed connectivity of Internet connections for Internet of Services can support secure access to data stores and provide immediate analytical and real-time actionable insights at the edge of the network, we upgrade the designed and implemented smart sockets to be embedded AI Arduino ones (called embedded AIduino). With the realization of edge analytics by the proposed embedded AIduino for data analytics, an Arduino Ethernet shield WizNet W5100 having a micro SD card connector is conducted and used. The SD library is included for reading parsed data from and writing parsed data to an SD card. And, an Artificial Neural Network library, ArduinoANN, for Arduino MEGA is imported and used for locally-embedded AI implementation. The embedded AIduino in this study can be developed for further applications in manufacturing industry energy management and sustainable energy management, wherein in sustainable energy management rotating machinery diagnostics works to identify energy loss from gross misalignment and unbalance of rotating machines in power plants as an example.

Keywords: demand-side management, edge intelligence, energy management system, fault detection and classification

Procedia PDF Downloads 251
25128 Data Analysis Tool for Predicting Water Scarcity in Industry

Authors: Tassadit Issaadi Hamitouche, Nicolas Gillard, Jean Petit, Valerie Lavaste, Celine Mayousse

Abstract:

Water is a fundamental resource for the industry. It is taken from the environment either from municipal distribution networks or from various natural water sources such as the sea, ocean, rivers, aquifers, etc. Once used, water is discharged into the environment, reprocessed at the plant or treatment plants. These withdrawals and discharges have a direct impact on natural water resources. These impacts can apply to the quantity of water available, the quality of the water used, or to impacts that are more complex to measure and less direct, such as the health of the population downstream from the watercourse, for example. Based on the analysis of data (meteorological, river characteristics, physicochemical substances), we wish to predict water stress episodes and anticipate prefectoral decrees, which can impact the performance of plants and propose improvement solutions, help industrialists in their choice of location for a new plant, visualize possible interactions between companies to optimize exchanges and encourage the pooling of water treatment solutions, and set up circular economies around the issue of water. The development of a system for the collection, processing, and use of data related to water resources requires the functional constraints specific to the latter to be made explicit. Thus the system will have to be able to store a large amount of data from sensors (which is the main type of data in plants and their environment). In addition, manufacturers need to have 'near-real-time' processing of information in order to be able to make the best decisions (to be rapidly notified of an event that would have a significant impact on water resources). Finally, the visualization of data must be adapted to its temporal and geographical dimensions. In this study, we set up an infrastructure centered on the TICK application stack (for Telegraf, InfluxDB, Chronograf, and Kapacitor), which is a set of loosely coupled but tightly integrated open source projects designed to manage huge amounts of time-stamped information. The software architecture is coupled with the cross-industry standard process for data mining (CRISP-DM) data mining methodology. The robust architecture and the methodology used have demonstrated their effectiveness on the study case of learning the level of a river with a 7-day horizon. The management of water and the activities within the plants -which depend on this resource- should be considerably improved thanks, on the one hand, to the learning that allows the anticipation of periods of water stress, and on the other hand, to the information system that is able to warn decision-makers with alerts created from the formalization of prefectoral decrees.

Keywords: data mining, industry, machine Learning, shortage, water resources

Procedia PDF Downloads 121
25127 Structuring and Visualizing Healthcare Claims Data Using Systems Architecture Methodology

Authors: Inas S. Khayal, Weiping Zhou, Jonathan Skinner

Abstract:

Healthcare delivery systems around the world are in crisis. The need to improve health outcomes while decreasing healthcare costs have led to an imminent call to action to transform the healthcare delivery system. While Bioinformatics and Biomedical Engineering have primarily focused on biological level data and biomedical technology, there is clear evidence of the importance of the delivery of care on patient outcomes. Classic singular decomposition approaches from reductionist science are not capable of explaining complex systems. Approaches and methods from systems science and systems engineering are utilized to structure healthcare delivery system data. Specifically, systems architecture is used to develop a multi-scale and multi-dimensional characterization of the healthcare delivery system, defined here as the Healthcare Delivery System Knowledge Base. This paper is the first to contribute a new method of structuring and visualizing a multi-dimensional and multi-scale healthcare delivery system using systems architecture in order to better understand healthcare delivery.

Keywords: health informatics, systems thinking, systems architecture, healthcare delivery system, data analytics

Procedia PDF Downloads 348
25126 Investigation of Topic Modeling-Based Semi-Supervised Interpretable Document Classifier

Authors: Dasom Kim, William Xiu Shun Wong, Yoonjin Hyun, Donghoon Lee, Minji Paek, Sungho Byun, Namgyu Kim

Abstract:

There have been many researches on document classification for classifying voluminous documents automatically. Through document classification, we can assign a specific category to each unlabeled document on the basis of various machine learning algorithms. However, providing labeled documents manually requires considerable time and effort. To overcome the limitations, the semi-supervised learning which uses unlabeled document as well as labeled documents has been invented. However, traditional document classifiers, regardless of supervised or semi-supervised ones, cannot sufficiently explain the reason or the process of the classification. Thus, in this paper, we proposed a methodology to visualize major topics and class components of each document. We believe that our methodology for visualizing topics and classes of each document can enhance the reliability and explanatory power of document classifiers.

Keywords: data mining, document classifier, text mining, topic modeling

Procedia PDF Downloads 402
25125 Searching Linguistic Synonyms through Parts of Speech Tagging

Authors: Faiza Hussain, Usman Qamar

Abstract:

Synonym-based searching is recognized to be a complicated problem as text mining from unstructured data of web is challenging. Finding useful information which matches user need from bulk of web pages is a cumbersome task. In this paper, a novel and practical synonym retrieval technique is proposed for addressing this problem. For replacement of semantics, user intent is taken into consideration to realize the technique. Parts-of-Speech tagging is applied for pattern generation of the query and a thesaurus for this experiment was formed and used. Comparison with Non-Context Based Searching, Context Based searching proved to be a more efficient approach while dealing with linguistic semantics. This approach is very beneficial in doing intent based searching. Finally, results and future dimensions are presented.

Keywords: natural language processing, text mining, information retrieval, parts-of-speech tagging, grammar, semantics

Procedia PDF Downloads 307
25124 The Role of Strategic Alliances, Innovation Capability, Cost Reduction in Enhancing Customer Loyalty and Firm’s Competitive Advantage

Authors: Soebowo Musa

Abstract:

Mining industries are known to be very volatile due to their sensitive nature toward changes in the environment, particularly coal mining. Heavy equipment distributors and coal mining contractors are among heavily affected by such volatility. They are facing more uncertainty on the sustainability of the coal mining industry. Strategic alliances and organizational capabilities such as innovation capability have long been seen as ways to stay competitive with a focus more on the strategic alliances partner-to-partner in serving their customers. In today’s rapid change in the environment, a shift in consumer behaviors, and the human-centric business approach, this study looks at the strategic alliance partner-to-customer relationship in both the industrial organization and resource-based theories. This study was conducted based on 250 respondents from the strategic alliances partner-to-customer between heavy equipment distributors and coal mining contractors in Indonesia. This study finds strategic alliances have the highest association toward cost reduction, a proxy of operational efficiency followed by its association toward innovation capability. Further, strategic alliances and innovation capability have a positive relationship with customer loyalty, while innovation capability and customer loyalty have no significant relationships toward the firm’s competitive advantage. This study also indicates that cost reduction is not a condition to develop customer loyalty in the strategic alliance partner-to-customer relationship. It confirms strategic alliances are a strategy that creates a firm’s operational efficiency, innovation capability that develops customer loyalty, and competitive advantage.

Keywords: strategic alliance, innovation capability, cost reduction, customer loyalty, competitive advantage

Procedia PDF Downloads 119
25123 Planning for Enviromental and Social Sustainability in Coastal Areas: A Case of Alappad

Authors: K. Vrinda

Abstract:

Coastal ecosystems across the world are facing a lot of challenges due to natural phenomena as well as from uncontrolled human interventions. Here, Alappad, a coastal island situated in Kerala, India is undergoing significant damage and is gradually losing its environmental and social sustainability. The area is blessed with very rare and precious black mineral sand deposits. Sand mining for these minerals started in 1911 and is still continuing. But, unfortunately all the problems that Alappad faces now, have its root on mining of this mineral sand. The land area is continuously diminishing due to sea erosion. The mining has also caused displacement of people and environmental degradation. Marine life also is getting affected by mining on beach and pollution. The inhabitants are fishermen who are largely dependent on the eco-system for a living. So loss of environmental sustainability subsequently affects social sustainability too. Now the damage has reached a point beyond which our actions may not be able to make any impact. This was one of the most affected areas of the 2004 tsunami and the environmental degradation has further increased the vulnerability. So this study focuses on understanding the concerns related to the resource utilization, environment and the indigenous community staying there, and on formulating suitable strategies to restore the sustainability of the area. An extensive study was conducted on site, to find out the physical, social, and economical characteristics of the area. A focus group discussion with the inhabitants shed light on different issues they face in their day-to-day life. The analysis of all these data, led to the formation of a new development vision for the area which focuses on environmental restoration and socio-economic development while allowing controlled exploitation of resources. A participatory approach is formulated which enables these three aspects through community based programs.

Keywords: Community development, Disaster resilience, Ecological restoration, Environmental sustainability, Social-environmental planning, Social Sustainability

Procedia PDF Downloads 111
25122 Video Analytics on Pedagogy Using Big Data

Authors: Jamuna Loganath

Abstract:

Education is the key to the development of any individual’s personality. Today’s students will be tomorrow’s citizens of the global society. The education of the student is the edifice on which his/her future will be built. Schools therefore should provide an all-round development of students so as to foster a healthy society. The behaviors and the attitude of the students in school play an essential role for the success of the education process. Frequent reports of misbehaviors such as clowning, harassing classmates, verbal insults are becoming common in schools today. If this issue is left unattended, it may develop a negative attitude and increase the delinquent behavior. So, the need of the hour is to find a solution to this problem. To solve this issue, it is important to monitor the students’ behaviors in school and give necessary feedback and mentor them to develop a positive attitude and help them to become a successful grownup. Nevertheless, measuring students’ behavior and attitude is extremely challenging. None of the present technology has proven to be effective in this measurement process because actions, reactions, interactions, response of the students are rarely used in the course of the data due to complexity. The purpose of this proposal is to recommend an effective supervising system after carrying out a feasibility study by measuring the behavior of the Students. This can be achieved by equipping schools with CCTV cameras. These CCTV cameras installed in various schools of the world capture the facial expressions and interactions of the students inside and outside their classroom. The real time raw videos captured from the CCTV can be uploaded to the cloud with the help of a network. The video feeds get scooped into various nodes in the same rack or on the different racks in the same cluster in Hadoop HDFS. The video feeds are converted into small frames and analyzed using various Pattern recognition algorithms and MapReduce algorithm. Then, the video frames are compared with the bench marking database (good behavior). When misbehavior is detected, an alert message can be sent to the counseling department which helps them in mentoring the students. This will help in improving the effectiveness of the education process. As Video feeds come from multiple geographical areas (schools from different parts of the world), BIG DATA helps in real time analysis as it analyzes computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions. It also analyzes data that can’t be analyzed by traditional software applications such as RDBMS, OODBMS. It has also proven successful in handling human reactions with ease. Therefore, BIG DATA could certainly play a vital role in handling this issue. Thus, effectiveness of the education process can be enhanced with the help of video analytics using the latest BIG DATA technology.

Keywords: big data, cloud, CCTV, education process

Procedia PDF Downloads 240