Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2862

Search results for: S/R machine

1962 Assessing Online Learning Paths in an Learning Management Systems Using a Data Mining and Machine Learning Approach

Authors: Alvaro Figueira, Bruno Cabral

Abstract:

Nowadays, students are used to be assessed through an online platform. Educators have stepped up from a period in which they endured the transition from paper to digital. The use of a diversified set of question types that range from quizzes to open questions is currently common in most university courses. In many courses, today, the evaluation methodology also fosters the students’ online participation in forums, the download, and upload of modified files, or even the participation in group activities. At the same time, new pedagogy theories that promote the active participation of students in the learning process, and the systematic use of problem-based learning, are being adopted using an eLearning system for that purpose. However, although there can be a lot of feedback from these activities to student’s, usually it is restricted to the assessments of online well-defined tasks. In this article, we propose an automatic system that informs students of abnormal deviations of a 'correct' learning path in the course. Our approach is based on the fact that by obtaining this information earlier in the semester, may provide students and educators an opportunity to resolve an eventual problem regarding the student’s current online actions towards the course. Our goal is to prevent situations that have a significant probability to lead to a poor grade and, eventually, to failing. In the major learning management systems (LMS) currently available, the interaction between the students and the system itself is registered in log files in the form of registers that mark beginning of actions performed by the user. Our proposed system uses that logged information to derive new one: the time each student spends on each activity, the time and order of the resources used by the student and, finally, the online resource usage pattern. Then, using the grades assigned to the students in previous years, we built a learning dataset that is used to feed a machine learning meta classifier. The produced classification model is then used to predict the grades a learning path is heading to, in the current year. Not only this approach serves the teacher, but also the student to receive automatic feedback on her current situation, having past years as a perspective. Our system can be applied to online courses that integrate the use of an online platform that stores user actions in a log file, and that has access to other student’s evaluations. The system is based on a data mining process on the log files and on a self-feedback machine learning algorithm that works paired with the Moodle LMS.

Keywords: data mining, e-learning, grade prediction, machine learning, student learning path

Procedia PDF Downloads 123

1961 Using Hyperspectral Sensor and Machine Learning to Predict Water Potentials of Wild Blueberries during Drought Treatment

Authors: Yongjiang Zhang, Kallol Barai, Umesh R. Hodeghatta, Trang Tran, Vikas Dhiman

Abstract:

Detecting water stress on crops early and accurately is crucial to minimize its impact. This study aims to measure water stress in wild blueberry crops non-destructively by analyzing proximal hyperspectral data. The data collection took place in the summer growing season of 2022. A drought experiment was conducted on wild blueberries in the randomized block design in the greenhouse, incorporating various genotypes and irrigation treatments. Hyperspectral data ( spectral range: 400-1000 nm) using a handheld spectroradiometer and leaf water potential data using a pressure chamber were collected from wild blueberry plants. Machine learning techniques, including multiple regression analysis and random forest models, were employed to predict leaf water potential (MPa). We explored the optimal wavelength bands for simple differences (RY1-R Y2), simple ratios (RY1/RY2), and normalized differences (|RY1-R Y2|/ (RY1-R Y2)). NDWI ((R857 - R1241)/(R857 + R1241)), SD (R2188 – R2245), and SR (R1752 / R1756) emerged as top predictors for predicting leaf water potential, significantly contributing to the highest model performance. The base learner models achieved an R-squared value of approximately 0.81, indicating their capacity to explain 81% of the variance. Research is underway to develop a neural vegetation index (NVI) that automates the process of index development by searching for specific wavelengths in the space ratio of linear functions of reflectance. The NVI framework could work across species and predict different physiological parameters.

Keywords: hyperspectral reflectance, water potential, spectral indices, machine learning, wild blueberries, optimal bands

Procedia PDF Downloads 67

1960 Comparative Analysis of Predictive Models for Customer Churn Prediction in the Telecommunication Industry

Authors: Deepika Christopher, Garima Anand

Abstract:

To determine the best model for churn prediction in the telecom industry, this paper compares 11 machine learning algorithms, namely Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, XGBoost, LightGBM, Cat Boost, AdaBoost, Extra Trees, Deep Neural Network, and Hybrid Model (MLPClassifier). It also aims to pinpoint the top three factors that lead to customer churn and conducts customer segmentation to identify vulnerable groups. According to the data, the Logistic Regression model performs the best, with an F1 score of 0.6215, 81.76% accuracy, 68.95% precision, and 56.57% recall. The top three attributes that cause churn are found to be tenure, Internet Service Fiber optic, and Internet Service DSL; conversely, the top three models in this article that perform the best are Logistic Regression, Deep Neural Network, and AdaBoost. The K means algorithm is applied to establish and analyze four different customer clusters. This study has effectively identified customers that are at risk of churn and may be utilized to develop and execute strategies that lower customer attrition.

Keywords: attrition, retention, predictive modeling, customer segmentation, telecommunications

Procedia PDF Downloads 58

1959 Implementation of Correlation-Based Data Analysis as a Preliminary Stage for the Prediction of Geometric Dimensions Using Machine Learning in the Forming of Car Seat Rails

Authors: Housein Deli, Loui Al-Shrouf, Hammoud Al Joumaa, Mohieddine Jelali

Abstract:

When forming metallic materials, fluctuations in material properties, process conditions, and wear lead to deviations in the component geometry. Several hundred features sometimes need to be measured, especially in the case of functional and safety-relevant components. These can only be measured offline due to the large number of features and the accuracy requirements. The risk of producing components outside the tolerances is minimized but not eliminated by the statistical evaluation of process capability and control measurements. The inspection intervals are based on the acceptable risk and are at the expense of productivity but remain reactive and, in some cases, considerably delayed. Due to the considerable progress made in the field of condition monitoring and measurement technology, permanently installed sensor systems in combination with machine learning and artificial intelligence, in particular, offer the potential to independently derive forecasts for component geometry and thus eliminate the risk of defective products - actively and preventively. The reliability of forecasts depends on the quality, completeness, and timeliness of the data. Measuring all geometric characteristics is neither sensible nor technically possible. This paper, therefore, uses the example of car seat rail production to discuss the necessary first step of feature selection and reduction by correlation analysis, as otherwise, it would not be possible to forecast components in real-time and inline. Four different car seat rails with an average of 130 features were selected and measured using a coordinate measuring machine (CMM). The run of such measuring programs alone takes up to 20 minutes. In practice, this results in the risk of faulty production of at least 2000 components that have to be sorted or scrapped if the measurement results are negative. Over a period of 2 months, all measurement data (> 200 measurements/ variant) was collected and evaluated using correlation analysis. As part of this study, the number of characteristics to be measured for all 6 car seat rail variants was reduced by over 80%. Specifically, direct correlations for almost 100 characteristics were proven for an average of 125 characteristics for 4 different products. A further 10 features correlate via indirect relationships so that the number of features required for a prediction could be reduced to less than 20. A correlation factor >0.8 was assumed for all correlations.

Keywords: long-term SHM, condition monitoring, machine learning, correlation analysis, component prediction, wear prediction, regressions analysis

Procedia PDF Downloads 50

1958 Machine Learning and Internet of Thing for Smart-Hydrology of the Mantaro River Basin

Authors: Julio Jesus Salazar, Julio Jesus De Lama

Abstract:

the fundamental objective of hydrological studies applied to the engineering field is to determine the statistically consistent volumes or water flows that, in each case, allow us to size or design a series of elements or structures to effectively manage and develop a river basin. To determine these values, there are several ways of working within the framework of traditional hydrology: (1) Study each of the factors that influence the hydrological cycle, (2) Study the historical behavior of the hydrology of the area, (3) Study the historical behavior of hydrologically similar zones, and (4) Other studies (rain simulators or experimental basins). Of course, this range of studies in a certain basin is very varied and complex and presents the difficulty of collecting the data in real time. In this complex space, the study of variables can only be overcome by collecting and transmitting data to decision centers through the Internet of things and artificial intelligence. Thus, this research work implemented the learning project of the sub-basin of the Shullcas river in the Andean basin of the Mantaro river in Peru. The sensor firmware to collect and communicate hydrological parameter data was programmed and tested in similar basins of the European Union. The Machine Learning applications was programmed to choose the algorithms that direct the best solution to the determination of the rainfall-runoff relationship captured in the different polygons of the sub-basin. Tests were carried out in the mountains of Europe, and in the sub-basins of the Shullcas river (Huancayo) and the Yauli river (Jauja) with heights close to 5000 m.a.s.l., giving the following conclusions: to guarantee a correct communication, the distance between devices should not pass the 15 km. It is advisable to minimize the energy consumption of the devices and avoid collisions between packages, the distances oscillate between 5 and 10 km, in this way the transmission power can be reduced and a higher bitrate can be used. In case the communication elements of the devices of the network (internet of things) installed in the basin do not have good visibility between them, the distance should be reduced to the range of 1-3 km. The energy efficiency of the Atmel microcontrollers present in Arduino is not adequate to meet the requirements of system autonomy. To increase the autonomy of the system, it is recommended to use low consumption systems, such as the Ashton Raggatt McDougall or ARM Cortex L (Ultra Low Power) microcontrollers or even the Cortex M; and high-performance direct current (DC) to direct current (DC) converters. The Machine Learning System has initiated the learning of the Shullcas system to generate the best hydrology of the sub-basin. This will improve as machine learning and the data entered in the big data coincide every second. This will provide services to each of the applications of the complex system to return the best data of determined flows.

Keywords: hydrology, internet of things, machine learning, river basin

Procedia PDF Downloads 160

1957 Shotcrete Performance Optimisation and Audit Using 3D Laser Scanning

Authors: Carlos Gonzalez, Neil Slatcher, Marcus Properzi, Kan Seah

Abstract:

In many underground mining operations, shotcrete is used for permanent rock support. Shotcrete thickness is a critical measure of the success of this process. 3D Laser Mapping, in conjunction with Jetcrete, has developed a 3D laser scanning system specifically for measuring the thickness of shotcrete. The system is mounted on the shotcrete spraying machine and measures the rock faces before and after spraying. The calculated difference between the two 3D surface models is measured as the thickness of the sprayed concrete. Typical work patterns for the shotcrete process required a rapid and automatic system. The scanning takes place immediately before and after the application of the shotcrete so no convergence takes place in the interval between scans. Automatic alignment of scans without targets was implemented which allows for the possibility of movement of the spraying machine between scans. Case studies are presented where accuracy tests are undertaken and automatic audit reports are calculated. The use of 3D imaging data for the calculation of shotcrete thickness is an important tool for geotechnical engineers and contract managers, and this could become the new state-of-the-art methodology for the mining industry.

Keywords: 3D imaging, shotcrete, surface model, tunnel stability

Procedia PDF Downloads 292

1956 Application of Support Vector Machines in Forecasting Non-Residential

Authors: Wiwat Kittinaraporn, Napat Harnpornchai, Sutja Boonyachut

Abstract:

This paper deals with the application of a novel neural network technique, so-called Support Vector Machine (SVM). The objective of this study is to explore the variable and parameter of forecasting factors in the construction industry to build up forecasting model for construction quantity in Thailand. The scope of the research is to study the non-residential construction quantity in Thailand. There are 44 sets of yearly data available, ranging from 1965 to 2009. The correlation between economic indicators and construction demand with the lag of one year was developed by Apichat Buakla. The selected variables are used to develop SVM models to forecast the non-residential construction quantity in Thailand. The parameters are selected by using ten-fold cross-validation method. The results are indicated in term of Mean Absolute Percentage Error (MAPE). The MAPE value for the non-residential construction quantity predicted by Epsilon-SVR in corporation with Radial Basis Function (RBF) of kernel function type is 5.90. Analysis of the experimental results show that the support vector machine modelling technique can be applied to forecast construction quantity time series which is useful for decision planning and management purpose.

Keywords: forecasting, non-residential, construction, support vector machines

Procedia PDF Downloads 435

1955 Profiling Risky Code Using Machine Learning

Authors: Zunaira Zaman, David Bohannon

Abstract:

This study explores the application of machine learning (ML) for detecting security vulnerabilities in source code. The research aims to assist organizations with large application portfolios and limited security testing capabilities in prioritizing security activities. ML-based approaches offer benefits such as increased confidence scores, false positives and negatives tuning, and automated feedback. The initial approach using natural language processing techniques to extract features achieved 86% accuracy during the training phase but suffered from overfitting and performed poorly on unseen datasets during testing. To address these issues, the study proposes using the abstract syntax tree (AST) for Java and C++ codebases to capture code semantics and structure and generate path-context representations for each function. The Code2Vec model architecture is used to learn distributed representations of source code snippets for training a machine-learning classifier for vulnerability prediction. The study evaluates the performance of the proposed methodology using two datasets and compares the results with existing approaches. The Devign dataset yielded 60% accuracy in predicting vulnerable code snippets and helped resist overfitting, while the Juliet Test Suite predicted specific vulnerabilities such as OS-Command Injection, Cryptographic, and Cross-Site Scripting vulnerabilities. The Code2Vec model achieved 75% accuracy and a 98% recall rate in predicting OS-Command Injection vulnerabilities. The study concludes that even partial AST representations of source code can be useful for vulnerability prediction. The approach has the potential for automated intelligent analysis of source code, including vulnerability prediction on unseen source code. State-of-the-art models using natural language processing techniques and CNN models with ensemble modelling techniques did not generalize well on unseen data and faced overfitting issues. However, predicting vulnerabilities in source code using machine learning poses challenges such as high dimensionality and complexity of source code, imbalanced datasets, and identifying specific types of vulnerabilities. Future work will address these challenges and expand the scope of the research.

Keywords: code embeddings, neural networks, natural language processing, OS command injection, software security, code properties

Procedia PDF Downloads 109

1954 Visual Thing Recognition with Binary Scale-Invariant Feature Transform and Support Vector Machine Classifiers Using Color Information

Authors: Wei-Jong Yang, Wei-Hau Du, Pau-Choo Chang, Jar-Ferr Yang, Pi-Hsia Hung

Abstract:

The demands of smart visual thing recognition in various devices have been increased rapidly for daily smart production, living and learning systems in recent years. This paper proposed a visual thing recognition system, which combines binary scale-invariant feature transform (SIFT), bag of words model (BoW), and support vector machine (SVM) by using color information. Since the traditional SIFT features and SVM classifiers only use the gray information, color information is still an important feature for visual thing recognition. With color-based SIFT features and SVM, we can discard unreliable matching pairs and increase the robustness of matching tasks. The experimental results show that the proposed object recognition system with color-assistant SIFT SVM classifier achieves higher recognition rate than that with the traditional gray SIFT and SVM classification in various situations.

Keywords: color moments, visual thing recognition system, SIFT, color SIFT

Procedia PDF Downloads 471

1953 A Neuron Model of Facial Recognition and Detection of an Authorized Entity Using Machine Learning System

Authors: J. K. Adedeji, M. O. Oyekanmi

Abstract:

This paper has critically examined the use of Machine Learning procedures in curbing unauthorized access into valuable areas of an organization. The use of passwords, pin codes, user’s identification in recent times has been partially successful in curbing crimes involving identities, hence the need for the design of a system which incorporates biometric characteristics such as DNA and pattern recognition of variations in facial expressions. The facial model used is the OpenCV library which is based on the use of certain physiological features, the Raspberry Pi 3 module is used to compile the OpenCV library, which extracts and stores the detected faces into the datasets directory through the use of camera. The model is trained with 50 epoch run in the database and recognized by the Local Binary Pattern Histogram (LBPH) recognizer contained in the OpenCV. The training algorithm used by the neural network is back propagation coded using python algorithmic language with 200 epoch runs to identify specific resemblance in the exclusive OR (XOR) output neurons. The research however confirmed that physiological parameters are better effective measures to curb crimes relating to identities.

Keywords: biometric characters, facial recognition, neural network, OpenCV

Procedia PDF Downloads 258

1952 A Study on the Impact of Artificial Intelligence on Human Society and the Necessity for Setting up the Boundaries on AI Intrusion

Authors: Swarna Pundir, Prabuddha Hans

Abstract:

As AI has already stepped into the daily life of human society, one cannot be ignorant about the data it collects and used it to provide a quality of services depending up on the individuals’ choices. It also helps in giving option for making decision Vs choice selection with a calculation based on the history of our search criteria. Over the past decade or so, the way Artificial Intelligence (AI) has impacted society is undoubtedly large.AI has changed the way we shop, the way we entertain and challenge ourselves, the way information is handled, and has automated some sections of our life. We have answered as to what AI is, but not why one may see it as useful. AI is useful because it is capable of learning and predicting outcomes, using Machine Learning (ML) and Deep Learning (DL) with the help of Artificial Neural Networks (ANN). AI can also be a system that can act like humans. One of the major impacts be Joblessness through automation via AI which is seen mostly in manufacturing sectors, especially in the routine manual and blue-collar occupations and those without a college degree. It raises some serious concerns about AI in regards of less employment, ethics in making moral decisions, Individuals privacy, human judgement’s, natural emotions, biased decisions, discrimination. So, the question is if an error occurs who will be responsible, or it will be just waved off as a “Machine Error”, with no one taking the responsibility of any wrongdoing, it is essential to form some rules for using the AI where both machines and humans are involved.

Keywords: AI, ML, DL, ANN

Procedia PDF Downloads 99

1951 Thermal and Solar Performances of Adsorption Solar Refrigerating Machine

Authors: Nadia Allouache

Abstract:

Solar radiation is by far the largest and the most world’s abundant, clean and permanent energy source. The amount of solar radiation intercepted by the Earth is much higher than annual global energy use. The energy available from the sun is greater than about 5200 times the global world’s need in 2006. In recent years, many promising technologies have been developed to harness the sun's energy. These technologies help in environmental protection, economizing energy, and sustainable development, which are the major issues of the world in the 21st century. One of these important technologies is the solar cooling systems that make use of either absorption or adsorption technologies. The solar adsorption cooling systems are good alternative since they operate with environmentally benign refrigerants that are natural, free from CFCs, and therefore they have a zero ozone depleting potential (ODP). A numerical analysis of thermal and solar performances of an adsorption solar refrigerating system using different adsorbent/adsorbate pairs such as activated carbon AC35 and activated carbon BPL/Ammoniac; is undertaken in this study. The modeling of the adsorption cooling machine requires the resolution of the equation describing the energy and mass transfer in the tubular adsorber that is the most important component of the machine. The Wilson and Dubinin- Astakhov models of the solid-adsorbat equilibrium are used to calculate the adsorbed quantity. The porous medium is contained in the annular space and the adsorber is heated by solar energy. Effect of key parameters on the adsorbed quantity and on the thermal and solar performances are analysed and discussed. The performances of the system that depends on the incident global irradiance during a whole day depends on the weather conditions: the condenser temperature and the evaporator temperature. The AC35/methanol pair is the best pair comparing to the BPL/Ammoniac in terms of system performances.

Keywords: activated carbon-methanol pair, activated carbon-ammoniac pair, adsorption, performance coefficients, numerical analysis, solar cooling system

Procedia PDF Downloads 72

1950 Framework for Detecting External Plagiarism from Monolingual Documents: Use of Shallow NLP and N-Gram Frequency Comparison

Authors: Saugata Bose, Ritambhra Korpal

Abstract:

The internet has increased the copy-paste scenarios amongst students as well as amongst researchers leading to different levels of plagiarized documents. For this reason, much of research is focused on for detecting plagiarism automatically. In this paper, an initiative is discussed where Natural Language Processing (NLP) techniques as well as supervised machine learning algorithms have been combined to detect plagiarized texts. Here, the major emphasis is on to construct a framework which detects external plagiarism from monolingual texts successfully. For successfully detecting the plagiarism, n-gram frequency comparison approach has been implemented to construct the model framework. The framework is based on 120 characteristics which have been extracted during pre-processing the documents using NLP approach. Afterwards, filter metrics has been applied to select most relevant characteristics and then supervised classification learning algorithm has been used to classify the documents in four levels of plagiarism. Confusion matrix was built to estimate the false positives and false negatives. Our plagiarism framework achieved a very high the accuracy score.

Keywords: lexical matching, shallow NLP, supervised machine learning algorithm, word n-gram

Procedia PDF Downloads 359

1949 Synthetic Classicism: A Machine Learning Approach to the Recognition and Design of Circular Pavilions

Authors: Federico Garrido, Mostafa El Hayani, Ahmed Shams

Abstract:

The exploration of the potential of artificial intelligence (AI) in architecture is still embryonic, however, its latent capacity to change design disciplines is significant. 'Synthetic Classism' is a research project that questions the underlying aspects of classically organized architecture not just in aesthetic terms but also from a geometrical and morphological point of view, intending to generate new architectural information using historical examples as source material. The main aim of this paper is to explore the uses of artificial intelligence and machine learning algorithms in architectural design while creating a coherent narrative to be contained within a design process. The purpose is twofold: on one hand, to develop and train machine learning algorithms to produce architectural information of small pavilions and on the other, to synthesize new information from previous architectural drawings. These algorithms intend to 'interpret' graphical information from each pavilion and then generate new information from it. The procedure, once these algorithms are trained, is the following: parting from a line profile, a synthetic 'front view' of a pavilion is generated, then using it as a source material, an isometric view is created from it, and finally, a top view is produced. Thanks to GAN algorithms, it is also possible to generate Front and Isometric views without any graphical input as well. The final intention of the research is to produce isometric views out of historical information, such as the pavilions from Sebastiano Serlio, James Gibbs, or John Soane. The idea is to create and interpret new information not just in terms of historical reconstruction but also to explore AI as a novel tool in the narrative of a creative design process. This research also challenges the idea of the role of algorithmic design associated with efficiency or fitness while embracing the possibility of a creative collaboration between artificial intelligence and a human designer. Hence the double feature of this research, both analytical and creative, first by synthesizing images based on a given dataset and then by generating new architectural information from historical references. We find that the possibility of creatively understand and manipulate historic (and synthetic) information will be a key feature in future innovative design processes. Finally, the main question that we propose is whether an AI could be used not just to create an original and innovative group of simple buildings but also to explore the possibility of fostering a novel architectural sensibility grounded on the specificities on the architectural dataset, either historic, human-made or synthetic.

Keywords: architecture, central pavilions, classicism, machine learning

Procedia PDF Downloads 141

1948 Automatic Identification and Classification of Contaminated Biodegradable Plastics using Machine Learning Algorithms and Hyperspectral Imaging Technology

Authors: Nutcha Taneepanichskul, Helen C. Hailes, Mark Miodownik

Abstract:

Plastic waste has emerged as a critical global environmental challenge, primarily driven by the prevalent use of conventional plastics derived from petrochemical refining and manufacturing processes in modern packaging. While these plastics serve vital functions, their persistence in the environment post-disposal poses significant threats to ecosystems. Addressing this issue necessitates approaches, one of which involves the development of biodegradable plastics designed to degrade under controlled conditions, such as industrial composting facilities. It is imperative to note that compostable plastics are engineered for degradation within specific environments and are not suited for uncontrolled settings, including natural landscapes and aquatic ecosystems. The full benefits of compostable packaging are realized when subjected to industrial composting, preventing environmental contamination and waste stream pollution. Therefore, effective sorting technologies are essential to enhance composting rates for these materials and diminish the risk of contaminating recycling streams. In this study, it leverage hyperspectral imaging technology (HSI) coupled with advanced machine learning algorithms to accurately identify various types of plastics, encompassing conventional variants like Polyethylene terephthalate (PET), Polypropylene (PP), Low density polyethylene (LDPE), High density polyethylene (HDPE) and biodegradable alternatives such as Polybutylene adipate terephthalate (PBAT), Polylactic acid (PLA), and Polyhydroxyalkanoates (PHA). The dataset is partitioned into three subsets: a training dataset comprising uncontaminated conventional and biodegradable plastics, a validation dataset encompassing contaminated plastics of both types, and a testing dataset featuring real-world packaging items in both pristine and contaminated states. Five distinct machine learning algorithms, namely Partial Least Squares Discriminant Analysis (PLS-DA), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Logistic Regression, and Decision Tree Algorithm, were developed and evaluated for their classification performance. Remarkably, the Logistic Regression and CNN model exhibited the most promising outcomes, achieving a perfect accuracy rate of 100% for the training and validation datasets. Notably, the testing dataset yielded an accuracy exceeding 80%. The successful implementation of this sorting technology within recycling and composting facilities holds the potential to significantly elevate recycling and composting rates. As a result, the envisioned circular economy for plastics can be established, thereby offering a viable solution to mitigate plastic pollution.

Keywords: biodegradable plastics, sorting technology, hyperspectral imaging technology, machine learning algorithms

Procedia PDF Downloads 82

1947 Automated Heart Sound Classification from Unsegmented Phonocardiogram Signals Using Time Frequency Features

Authors: Nadia Masood Khan, Muhammad Salman Khan, Gul Muhammad Khan

Abstract:

Cardiologists perform cardiac auscultation to detect abnormalities in heart sounds. Since accurate auscultation is a crucial first step in screening patients with heart diseases, there is a need to develop computer-aided detection/diagnosis (CAD) systems to assist cardiologists in interpreting heart sounds and provide second opinions. In this paper different algorithms are implemented for automated heart sound classification using unsegmented phonocardiogram (PCG) signals. Support vector machine (SVM), artificial neural network (ANN) and cartesian genetic programming evolved artificial neural network (CGPANN) without the application of any segmentation algorithm has been explored in this study. The signals are first pre-processed to remove any unwanted frequencies. Both time and frequency domain features are then extracted for training the different models. The different algorithms are tested in multiple scenarios and their strengths and weaknesses are discussed. Results indicate that SVM outperforms the rest with an accuracy of 73.64%.

Keywords: pattern recognition, machine learning, computer aided diagnosis, heart sound classification, and feature extraction

Procedia PDF Downloads 264

1946 Parallel Fuzzy Rough Support Vector Machine for Data Classification in Cloud Environment

Authors: Arindam Chaudhuri

Abstract:

Classification of data has been actively used for most effective and efficient means of conveying knowledge and information to users. The prima face has always been upon techniques for extracting useful knowledge from data such that returns are maximized. With emergence of huge datasets the existing classification techniques often fail to produce desirable results. The challenge lies in analyzing and understanding characteristics of massive data sets by retrieving useful geometric and statistical patterns. We propose a supervised parallel fuzzy rough support vector machine (PFRSVM) for data classification in cloud environment. The classification is performed by PFRSVM using hyperbolic tangent kernel. The fuzzy rough set model takes care of sensitiveness of noisy samples and handles impreciseness in training samples bringing robustness to results. The membership function is function of center and radius of each class in feature space and is represented with kernel. It plays an important role towards sampling the decision surface. The success of PFRSVM is governed by choosing appropriate parameter values. The training samples are either linear or nonlinear separable. The different input points make unique contributions to decision surface. The algorithm is parallelized with a view to reduce training times. The system is built on support vector machine library using Hadoop implementation of MapReduce. The algorithm is tested on large data sets to check its feasibility and convergence. The performance of classifier is also assessed in terms of number of support vectors. The challenges encountered towards implementing big data classification in machine learning frameworks are also discussed. The experiments are done on the cloud environment available at University of Technology and Management, India. The results are illustrated for Gaussian RBF and Bayesian kernels. The effect of variability in prediction and generalization of PFRSVM is examined with respect to values of parameter C. It effectively resolves outliers’ effects, imbalance and overlapping class problems, normalizes to unseen data and relaxes dependency between features and labels. The average classification accuracy for PFRSVM is better than other classifiers for both Gaussian RBF and Bayesian kernels. The experimental results on both synthetic and real data sets clearly demonstrate the superiority of the proposed technique.

Keywords: FRSVM, Hadoop, MapReduce, PFRSVM

Procedia PDF Downloads 491

1945 2016 Taiwan's 'Health and Physical Education Field of 12-Year Basic Education Curriculum Outline (Draft)' Reform and Its Implications

Authors: Hai Zeng, Yisheng Li, Jincheng Huang, Chenghui Huang, Ying Zhang

Abstract:

Children are strong; the country strong, the development of children Basketball is a strategic advantage. Common forms of basketball equipment has been difficult to meet the needs of young children teaching the game of basketball, basketball development for 3-6 years old children in the form of appropriate teaching aids is a breakthrough basketball game teaching children bottlenecks, improve teaching critical path pleasure, but also the development of early childhood basketball a necessary requirement. In this study, literature, questionnaires, focus group interviews, comparative analysis, for domestic and foreign use of 12 kinds of basketball teaching aids (cloud computing MINI basketball, adjustable basketball MINI, MINI basketball court, shooting assist paw print ball, dribble goggles, dribbling machine, machine cartoon shooting, rebounding machine, against the mat, elastic belt, ladder, fitness ball), from fun and improve early childhood shooting technique, dribbling technology, as well as offensive and defensive rebounding against technology conduct research on conversion technology. The results show that by using appropriate forms of teaching children basketball aids, can effectively improve children's fun basketball game, targeted to improve a technology, different types of aids from different perspectives enrich the connotation of children basketball game. Recommended for children of color psychology, cartoon and environmentally friendly material production aids, and increase research efforts basketball aids children, encourage children to sports teachers aids applications.

Keywords: health and physical education field of curriculum outline, health fitness, sports and health curriculum reform, Taiwan, twelve years basic education

Procedia PDF Downloads 393

1944 Polarity Classification of Social Media Comments in Turkish

Authors: Migena Ceyhan, Zeynep Orhan, Dimitrios Karras

Abstract:

People in modern societies are continuously sharing their experiences, emotions, and thoughts in different areas of life. The information reaches almost everyone in real-time and can have an important impact in shaping people’s way of living. This phenomenon is very well recognized and advantageously used by the market representatives, trying to earn the most from this means. Given the abundance of information, people and organizations are looking for efficient tools that filter the countless data into important information, ready to analyze. This paper is a modest contribution in this field, describing the process of automatically classifying social media comments in the Turkish language into positive or negative. Once data is gathered and preprocessed, feature sets of selected single words or groups of words are build according to the characteristics of language used in the texts. These features are used later to train, and test a system according to different machine learning algorithms (Naïve Bayes, Sequential Minimal Optimization, J48, and Bayesian Linear Regression). The resultant high accuracies can be important feedback for decision-makers to improve the business strategies accordingly.

Keywords: feature selection, machine learning, natural language processing, sentiment analysis, social media reviews

Procedia PDF Downloads 147

1943 A Biologically Inspired Approach to Automatic Classification of Textile Fabric Prints Based On Both Texture and Colour Information

Authors: Babar Khan, Wang Zhijie

Abstract:

Machine Vision has been playing a significant role in Industrial Automation, to imitate the wide variety of human functions, providing improved safety, reduced labour cost, the elimination of human error and/or subjective judgments, and the creation of timely statistical product data. Despite the intensive research, there have not been any attempts to classify fabric prints based on printed texture and colour, most of the researches so far encompasses only black and white or grey scale images. We proposed a biologically inspired processing architecture to classify fabrics w.r.t. the fabric print texture and colour. We created a texture descriptor based on the HMAX model for machine vision, and incorporated colour descriptor based on opponent colour channels simulating the single opponent and double opponent neuronal function of the brain. We found that our algorithm not only outperformed the original HMAX algorithm on classification of fabric print texture and colour, but we also achieved a recognition accuracy of 85-100% on different colour and different texture fabric.

Keywords: automatic classification, texture descriptor, colour descriptor, opponent colour channel

Procedia PDF Downloads 486

1942 Analysis of Photic Zone’s Summer Period-Dissolved Oxygen and Temperature as an Early Warning System of Fish Mass Mortality in Sampaloc Lake in San Pablo, Laguna

Authors: Al Romano, Jeryl C. Hije, Mechaela Marie O. Tabiolo

Abstract:

The decline in water quality is a major factor in aquatic disease outbreaks and can lead to significant mortality among aquatic organisms. Understanding the relationship between dissolved oxygen (DO) and water temperature is crucial, as these variables directly impact the health, behavior, and survival of fish populations. This study investigated how DO levels, water temperature, and atmospheric temperature interact in Sampaloc Lake to assess the risk of fish mortality. By employing a combination of linear regression models and machine learning techniques, researchers developed predictive models to forecast DO concentrations at various depths. The results indicate that while DO levels generally decrease with depth, the predicted concentrations are sufficient to support the survival of common fish species in Sampaloc Lake during March, April, and May 2025.

Keywords: aquaculture, dissolved oxygen, water temperature, regression analysis, machine learning, fish mass mortality, early warning system

Procedia PDF Downloads 37

1941 Spectrogram Pre-Processing to Improve Isotopic Identification to Discriminate Gamma and Neutrons Sources

Authors: Mustafa Alhamdi

Abstract:

Industrial application to classify gamma rays and neutron events is investigated in this study using deep machine learning. The identification using a convolutional neural network and recursive neural network showed a significant improvement in predication accuracy in a variety of applications. The ability to identify the isotope type and activity from spectral information depends on feature extraction methods, followed by classification. The features extracted from the spectrum profiles try to find patterns and relationships to present the actual spectrum energy in low dimensional space. Increasing the level of separation between classes in feature space improves the possibility to enhance classification accuracy. The nonlinear nature to extract features by neural network contains a variety of transformation and mathematical optimization, while principal component analysis depends on linear transformations to extract features and subsequently improve the classification accuracy. In this paper, the isotope spectrum information has been preprocessed by finding the frequencies components relative to time and using them as a training dataset. Fourier transform implementation to extract frequencies component has been optimized by a suitable windowing function. Training and validation samples of different isotope profiles interacted with CdTe crystal have been simulated using Geant4. The readout electronic noise has been simulated by optimizing the mean and variance of normal distribution. Ensemble learning by combing voting of many models managed to improve the classification accuracy of neural networks. The ability to discriminate gamma and neutron events in a single predication approach using deep machine learning has shown high accuracy using deep learning. The paper findings show the ability to improve the classification accuracy by applying the spectrogram preprocessing stage to the gamma and neutron spectrums of different isotopes. Tuning deep machine learning models by hyperparameter optimization of neural network models enhanced the separation in the latent space and provided the ability to extend the number of detected isotopes in the training database. Ensemble learning contributed significantly to improve the final prediction.

Keywords: machine learning, nuclear physics, Monte Carlo simulation, noise estimation, feature extraction, classification

Procedia PDF Downloads 151

1940 Using Machine Learning to Classify Different Body Parts and Determine Healthiness

Authors: Zachary Pan

Abstract:

Our general mission is to solve the problem of classifying images into different body part types and deciding if each of them is healthy or not. However, for now, we will determine healthiness for only one-sixth of the body parts, specifically the chest. We will detect pneumonia in X-ray scans of those chest images. With this type of AI, doctors can use it as a second opinion when they are taking CT or X-ray scans of their patients. Another ad-vantage of using this machine learning classifier is that it has no human weaknesses like fatigue. The overall ap-proach to this problem is to split the problem into two parts: first, classify the image, then determine if it is healthy. In order to classify the image into a specific body part class, the body parts dataset must be split into test and training sets. We can then use many models, like neural networks or logistic regression models, and fit them using the training set. Now, using the test set, we can obtain a realistic accuracy the models will have on images in the real world since these testing images have never been seen by the models before. In order to increase this testing accuracy, we can also apply many complex algorithms to the models, like multiplicative weight update. For the second part of the problem, to determine if the body part is healthy, we can have another dataset consisting of healthy and non-healthy images of the specific body part and once again split that into the test and training sets. We then use another neural network to train on those training set images and use the testing set to figure out its accuracy. We will do this process only for the chest images. A major conclusion reached is that convolutional neural networks are the most reliable and accurate at image classification. In classifying the images, the logistic regression model, the neural network, neural networks with multiplicative weight update, neural networks with the black box algorithm, and the convolutional neural network achieved 96.83 percent accuracy, 97.33 percent accuracy, 97.83 percent accuracy, 96.67 percent accuracy, and 98.83 percent accuracy, respectively. On the other hand, the overall accuracy of the model that de-termines if the images are healthy or not is around 78.37 percent accuracy.

Keywords: body part, healthcare, machine learning, neural networks

Procedia PDF Downloads 109

1939 Detecting Hate Speech And Cyberbullying Using Natural Language Processing

Authors: Nádia Pereira, Paula Ferreira, Sofia Francisco, Sofia Oliveira, Sidclay Souza, Paula Paulino, Ana Margarida Veiga Simão

Abstract:

Social media has progressed into a platform for hate speech among its users, and thus, there is an increasing need to develop automatic detection classifiers of offense and conflicts to help decrease the prevalence of such incidents. Online communication can be used to intentionally harm someone, which is why such classifiers could be essential in social networks. A possible application of these classifiers is the automatic detection of cyberbullying. Even though identifying the aggressive language used in online interactions could be important to build cyberbullying datasets, there are other criteria that must be considered. Being able to capture the language, which is indicative of the intent to harm others in a specific context of online interaction is fundamental. Offense and hate speech may be the foundation of online conflicts, which have become commonly used in social media and are an emergent research focus in machine learning and natural language processing. This study presents two Portuguese language offense-related datasets which serve as examples for future research and extend the study of the topic. The first is similar to other offense detection related datasets and is entitled Aggressiveness dataset. The second is a novelty because of the use of the history of the interaction between users and is entitled the Conflicts/Attacks dataset. Both datasets were developed in different phases. Firstly, we performed a content analysis of verbal aggression witnessed by adolescents in situations of cyberbullying. Secondly, we computed frequency analyses from the previous phase to gather lexical and linguistic cues used to identify potentially aggressive conflicts and attacks which were posted on Twitter. Thirdly, thorough annotation of real tweets was performed byindependent postgraduate educational psychologists with experience in cyberbullying research. Lastly, we benchmarked these datasets with other machine learning classifiers.

Keywords: aggression, classifiers, cyberbullying, datasets, hate speech, machine learning

Procedia PDF Downloads 229

1938 The Relationship between Human Pose and Intention to Fire a Handgun

Authors: Joshua van Staden, Dane Brown, Karen Bradshaw

Abstract:

Gun violence is a significant problem in modern-day society. Early detection of carried handguns through closed-circuit television (CCTV) can aid in preventing potential gun violence. However, CCTV operators have a limited attention span. Machine learning approaches to automating the detection of dangerous gun carriers provide a way to aid CCTV operators in identifying these individuals. This study provides insight into the relationship between human key points extracted using human pose estimation (HPE) and their intention to fire a weapon. We examine the feature importance of each keypoint and their correlations. We use principal component analysis (PCA) to reduce the feature space and optimize detection. Finally, we run a set of classifiers to determine what form of classifier performs well on this data. We find that hips, shoulders, and knees tend to be crucial aspects of the human pose when making these predictions. Furthermore, the horizontal position plays a larger role than the vertical position. Of the 66 key points, nine principal components could be used to make nonlinear classifications with 86% accuracy. Furthermore, linear classifications could be done with 85% accuracy, showing that there is a degree of linearity in the data.

Keywords: feature engineering, human pose, machine learning, security

Procedia PDF Downloads 93

1937 A Framework of Dynamic Rule Selection Method for Dynamic Flexible Job Shop Problem by Reinforcement Learning Method

Authors: Rui Wu

Abstract:

In the volatile modern manufacturing environment, new orders randomly occur at any time, while the pre-emptive methods are infeasible. This leads to a real-time scheduling method that can produce a reasonably good schedule quickly. The dynamic Flexible Job Shop problem is an NP-hard scheduling problem that hybrid the dynamic Job Shop problem with the Parallel Machine problem. A Flexible Job Shop contains different work centres. Each work centre contains parallel machines that can process certain operations. Many algorithms, such as genetic algorithms or simulated annealing, have been proposed to solve the static Flexible Job Shop problems. However, the time efficiency of these methods is low, and these methods are not feasible in a dynamic scheduling problem. Therefore, a dynamic rule selection scheduling system based on the reinforcement learning method is proposed in this research, in which the dynamic Flexible Job Shop problem is divided into several parallel machine problems to decrease the complexity of the dynamic Flexible Job Shop problem. Firstly, the features of jobs, machines, work centres, and flexible job shops are selected to describe the status of the dynamic Flexible Job Shop problem at each decision point in each work centre. Secondly, a framework of reinforcement learning algorithm using a double-layer deep Q-learning network is applied to select proper composite dispatching rules based on the status of each work centre. Then, based on the selected composite dispatching rule, an available operation is selected from the waiting buffer and assigned to an available machine in each work centre. Finally, the proposed algorithm will be compared with well-known dispatching rules on objectives of mean tardiness, mean flow time, mean waiting time, or mean percentage of waiting time in the real-time Flexible Job Shop problem. The result of the simulations proved that the proposed framework has reasonable performance and time efficiency.

Keywords: dynamic scheduling problem, flexible job shop, dispatching rules, deep reinforcement learning

Procedia PDF Downloads 108

1936 Enhanced Extra Trees Classifier for Epileptic Seizure Prediction

Authors: Maurice Ntahobari, Levin Kuhlmann, Mario Boley, Zhinoos Razavi Hesabi

Abstract:

For machine learning based epileptic seizure prediction, it is important for the model to be implemented in small implantable or wearable devices that can be used to monitor epilepsy patients; however, current state-of-the-art methods are complex and computationally intensive. We use Shapley Additive Explanation (SHAP) to find relevant intracranial electroencephalogram (iEEG) features and improve the computational efficiency of a state-of-the-art seizure prediction method based on the extra trees classifier while maintaining prediction performance. Results for a small contest dataset and a much larger dataset with continuous recordings of up to 3 years per patient from 15 patients yield better than chance prediction performance (p < 0.004). Moreover, while the performance of the SHAP-based model is comparable to that of the benchmark, the overall training and prediction time of the model has been reduced by a factor of 1.83. It can also be noted that the feature called zero crossing value is the best EEG feature for seizure prediction. These results suggest state-of-the-art seizure prediction performance can be achieved using efficient methods based on optimal feature selection.

Keywords: machine learning, seizure prediction, extra tree classifier, SHAP, epilepsy

Procedia PDF Downloads 113

1935 Radiomics: Approach to Enable Early Diagnosis of Non-Specific Breast Nodules in Contrast-Enhanced Magnetic Resonance Imaging

Authors: N. D'Amico, E. Grossi, B. Colombo, F. Rigiroli, M. Buscema, D. Fazzini, G. Cornalba, S. Papa

Abstract:

Purpose: To characterize, through a radiomic approach, the nature of nodules considered non-specific by expert radiologists, recognized in magnetic resonance mammography (MRm) with T1-weighted (T1w) sequences with paramagnetic contrast. Material and Methods: 47 cases out of 1200 undergoing MRm, in which the MRm assessment gave uncertain classification (non-specific nodules), were admitted to the study. The clinical outcome of the non-specific nodules was later found through follow-up or further exams (biopsy), finding 35 benign and 12 malignant. All MR Images were acquired at 1.5T, a first basal T1w sequence and then four T1w acquisitions after the paramagnetic contrast injection. After a manual segmentation of the lesions, done by a radiologist, and the extraction of 150 radiomic features (30 features per 5 subsequent times) a machine learning (ML) approach was used. An evolutionary algorithm (TWIST system based on KNN algorithm) was used to subdivide the dataset into training and validation test and to select features yielding the maximal amount of information. After this pre-processing, different machine learning systems were applied to develop a predictive model based on a training-testing crossover procedure. 10 cases with a benign nodule (follow-up older than 5 years) and 18 with an evident malignant tumor (clear malignant histological exam) were added to the dataset in order to allow the ML system to better learn from data. Results: NaiveBayes algorithm working on 79 features selected by a TWIST system, resulted to be the best performing ML system with a sensitivity of 96% and a specificity of 78% and a global accuracy of 87% (average values of two training-testing procedures ab-ba). The results showed that in the subset of 47 non-specific nodules, the algorithm predicted the outcome of 45 nodules which an expert radiologist could not identify. Conclusion: In this pilot study we identified a radiomic approach allowing ML systems to perform well in the diagnosis of a non-specific nodule at MR mammography. This algorithm could be a great support for the early diagnosis of malignant breast tumor, in the event the radiologist is not able to identify the kind of lesion and reduces the necessity for long follow-up. Clinical Relevance: This machine learning algorithm could be essential to support the radiologist in early diagnosis of non-specific nodules, in order to avoid strenuous follow-up and painful biopsy for the patient.

Keywords: breast, machine learning, MRI, radiomics

Procedia PDF Downloads 269

1934 Sustainable Development of Adsorption Solar Cooling Machine

Authors: N. Allouache, W. Elgahri, A. Gahfif, M. Belmedani

Abstract:

Solar radiation is by far the largest and the most world’s abundant, clean and permanent energy source. The amount of solar radiation intercepted by the Earth is much higher than annual global energy use. The energy available from the sun is greater than about 5200 times the global world’s need in 2006. In recent years, many promising technologies have been developed to harness the sun's energy. These technologies help in environmental protection, economizing energy, and sustainable development, which are the major issues of the world in the 21st century. One of these important technologies is the solar cooling systems that make use of either absorption or adsorption technologies. The solar adsorption cooling systems are a good alternative since they operate with environmentally benign refrigerants that are natural, free from CFCs, and therefore they have a zero ozone depleting potential (ODP). A numerical analysis of thermal and solar performances of an adsorption solar refrigerating system using different adsorbent/adsorbate pairs, such as activated carbon AC35 and activated carbon BPL/Ammoniac; is undertaken in this study. The modeling of the adsorption cooling machine requires the resolution of the equation describing the energy and mass transfer in the tubular adsorber, that is the most important component of the machine. The Wilson and Dubinin- Astakhov models of the solid-adsorbat equilibrium are used to calculate the adsorbed quantity. The porous medium is contained in the annular space, and the adsorber is heated by solar energy. Effect of key parameters on the adsorbed quantity and on the thermal and solar performances are analysed and discussed. The performances of the system that depends on the incident global irradiance during a whole day depends on the weather conditions: the condenser temperature and the evaporator temperature. The AC35/methanol pair is the best pair comparing to the BPL/Ammoniac in terms of system performances.

Keywords: activated carbon-methanol pair, activated carbon-ammoniac pair, adsorption, performance coefficients, numerical analysis, solar cooling system

Procedia PDF Downloads 78

1933 Integrating Machine Learning and Rule-Based Decision Models for Enhanced B2B Sales Forecasting and Customer Prioritization

Authors: Wenqi Liu, Reginald Bailey

Abstract:

This study proposes a comprehensive and effective approach to business-to-business (B2B) sales forecasting by integrating advanced machine learning models with a rule-based decision-making framework. The methodology addresses the critical challenge of optimizing sales pipeline performance and improving conversion rates through predictive analytics and actionable insights. The first component involves developing a classification model to predict the likelihood of conversion, aiming to outperform traditional methods such as logistic regression in terms of accuracy, precision, recall, and F1 score. Feature importance analysis highlights key predictive factors, such as client revenue size and sales velocity, providing valuable insights into conversion dynamics. The second component focuses on forecasting sales value using a regression model, designed to achieve superior performance compared to linear regression by minimizing mean absolute error (MAE), mean squared error (MSE), and maximizing R-squared metrics. The regression analysis identifies primary drivers of sales value, further informing data-driven strategies. To bridge the gap between predictive modeling and actionable outcomes, a rule-based decision framework is introduced. This model categorizes leads into high, medium, and low priorities based on thresholds for conversion probability and predicted sales value. By combining classification and regression outputs, this framework enables sales teams to allocate resources effectively, focus on high-value opportunities, and streamline lead management processes. The integrated approach significantly enhances lead prioritization, increases conversion rates, and drives revenue generation, offering a robust solution to the declining pipeline conversion rates faced by many B2B organizations. Our findings demonstrate the practical benefits of blending machine learning with decision-making frameworks, providing a scalable, data-driven solution for strategic sales optimization. This study underscores the potential of predictive analytics to transform B2B sales operations, enabling more informed decision-making and improved organizational outcomes in competitive markets.

Keywords: machine learning, XGBoost, regression, decision making framework, system engineering

Procedia PDF Downloads 25