Search results for: amazon forest
463 Implications of Oxidative Stress for Monoterpenoid Oxindole Alkaloid Production in Uncaria tomentosa Cultures
Authors: Ana C. Ramos Valdivia, Ileana Vera-Reyes, Ariana A. Huerta-Heredia
Abstract:
Biotic and abiotic stress in plants can lead to the generation of high amounts of reactive oxygen species (ROS), which, through a signaling cascade and second messengers, trigger different antioxidant defense responses, including the production of secondary metabolites. A limited number of plant species, such as Uncaria tomentosa (cat's claw), typical of the Amazon region, produce monoterpenoid oxindole alkaloids (MOA) such as isopteropodine, mitraphylline, rhynchophylline and their isomers. Moreover, in cultivated roots, the glucoindole alkaloid 3α-dihydrocadambine (DHC) is also accumulated. Several studies have demonstrated that MOA have antioxidant properties and possess important pharmacological activities, such as antitumor and immunostimulant effects, while DHC has hypotensive and hypolipidemic effects. In order to study the regulatory concerns operating in MOA production, the links between oxidative stress and antioxidant alkaloid production in U. tomentosa root cultures were examined. Different amounts of hydrogen peroxide, between 0.2 and 1.0 mM, were added to 12-day-old root cultures, showing that this substance had a differential effect on the production of DHC and MOA, whereas viability remained at 80% after six days. Addition of 0.2 mM hydrogen peroxide increased MOA and DHC production by approximately 65% (0.540 ± 0.018 and 0.618 ± 0.029 mg per g dry weight, respectively) relative to the control. In contrast, after the addition of 0.6 mM and 1 mM hydrogen peroxide, DHC accumulation in the roots gradually decreased to 53% and 93%, respectively, without changes in MOA concentration, which was related to a twofold increase in the intracellular hydrogen peroxide content. On the other hand, DHC at concentrations of 0.1, 0.5 and 1.0 mM in methanol demonstrated free-radical scavenging activity against the 1,1-diphenyl-2-picrylhydrazyl (DPPH) radical.
The calculated IC50 for all tested concentrations was 0.180 mg per ml (0.33 mM), while the calculated TE50 was 276 minutes. Our results suggest that in U. tomentosa root cultures both MOA and DHC have antioxidant capacities and respond to oxidative stress with a stimulation of their production; however, in the presence of a higher concentration of ROS in the roots, DHC could be oxidized.
Keywords: monoterpenoid indole alkaloid, oxidative stress, root cultures, Uncaria tomentosa
Procedia PDF Downloads 182
462 Investigating the Effectiveness of Multilingual NLP Models for Sentiment Analysis
Authors: Othmane Touri, Sanaa El Filali, El Habib Benlahmar
Abstract:
Natural Language Processing (NLP) has gained significant attention lately. It has proved its ability to analyze and extract insights from unstructured text data in various languages. One of the most popular NLP applications is sentiment analysis, which aims to identify the sentiment expressed in a piece of text, such as positive, negative, or neutral, in multiple languages. While there are several multilingual NLP models available for sentiment analysis, there is a need to investigate their effectiveness in different contexts and applications. In this study, we investigate the effectiveness of different multilingual NLP models for sentiment analysis on a dataset of online product reviews in multiple languages. The performance of several NLP models is compared, including Google Cloud Natural Language API, Microsoft Azure Cognitive Services, Amazon Comprehend, Stanford CoreNLP, spaCy, and Hugging Face Transformers. The models are evaluated on several metrics, including accuracy, precision, recall, and F1 score, and their performance is compared across different categories of product reviews. To run the study, the dataset was preprocessed by cleaning and tokenizing the text data in multiple languages. Each model was then trained and tested using a cross-validation approach, in which the dataset was randomly divided into training and testing sets and the process was repeated multiple times. A grid search was applied to optimize the hyperparameters of each model and to select the best-performing model for each category of product reviews and language. The findings of this study provide insights into the effectiveness of different multilingual NLP models for multilingual sentiment analysis and their suitability for different languages and applications.
The strengths and limitations of each model were identified, and recommendations for selecting the most performant model based on the specific requirements of a project were provided. This study contributes to the advancement of research methods in multilingual NLP and provides a practical guide for researchers and practitioners in the field.
Keywords: NLP, multilingual, sentiment analysis, texts
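The four metrics named in this abstract (accuracy, precision, recall, F1 score) can be illustrated with a minimal sketch. The function name `binary_metrics`, the label strings, and the sample predictions are hypothetical and are not taken from the study; precision, recall, and F1 are computed for one chosen class against all others.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Accuracy over all labels, plus precision/recall/F1 for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical gold labels and model predictions for six reviews.
gold = ["positive", "negative", "neutral", "positive", "negative", "positive"]
pred = ["positive", "negative", "positive", "negative", "negative", "positive"]
print(binary_metrics(gold, pred))
```

In a multi-language comparison such as the one described, a function like this would presumably be applied per model, per language, and per review category before averaging.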
Procedia PDF Downloads 103
461 Preliminary Result on the Impact of Anthropogenic Noise on Understory Bird Population in Primary Forest of Gaya Island
Authors: Emily A. Gilbert, Jephte Sompud, Andy R. Mojiol, Cynthia B. Sompud, Alim Biun
Abstract:
Gaya Island of Sabah is known for its wildlife and marine biodiversity. It has established itself as one of the hot destinations for tourists from all around the world. Tourism activities on Gaya Island have contributed to Sabah’s economic revenue through the high number of tourists visiting the island. However, they have also led to increased anthropogenic noise. This may greatly interfere with animals, such as understory birds, that rely on acoustic signals for communication. Many studies in other regions reveal that anthropogenic noise decreases the species richness of avian communities. However, in Malaysia, published research on the impact of anthropogenic noise on understory birds is still lacking. This study was conducted to fill this gap by investigating the impact of anthropogenic noise on the understory bird population. Three sites within the primary forest of Gaya Island were chosen to sample the level of anthropogenic noise in relation to the understory bird population. Noise mapping was used to measure the anthropogenic noise level and to identify zones with high (> 60 dB) and low (< 60 dB) anthropogenic noise levels, based on the standard noise-level threshold. The only methods used in this study were mist netting and ring banding, chosen because they can determine the diversity of the understory bird population on Gaya Island. The preliminary study was conducted from 15th to 26th April and from 5th to 10th May 2015, with 2 mist nets set up in each of the zones within the selected sites. The data were analyzed using descriptive analysis, presence and absence analysis, diversity indices and a diversity t-test. PAST software was used to analyze the data obtained.
A total of 60 individuals from 12 species and 7 families of understory birds were recorded at the three sites on Gaya Island. The Shannon-Wiener index shows that species diversity in the high and low anthropogenic noise zones was 1.573 and 2.009, respectively. However, the statistical analysis shows that there was no significant difference between these zones. Nevertheless, the presence and absence analysis shows that the number of species in the low anthropogenic noise zone was higher than in the high anthropogenic noise zone. Thus, this result suggests that anthropogenic noise has an impact on the population diversity of understory birds. There is still an urgent need to conduct an in-depth study with a larger sample size at the selected sites in order to fully understand the impact of anthropogenic noise on the understory bird population, so that it can be incorporated into wildlife management for a sustainable environment on Gaya Island.
Keywords: anthropogenic noise, biodiversity, Gaya Island, understory bird
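For readers unfamiliar with the Shannon-Wiener index reported in this abstract, the following minimal sketch computes H' = -Σ p_i ln p_i from per-species counts. The counts below are hypothetical and are not the study's data; the function name `shannon_wiener` is likewise an illustration only.

```python
import math


def shannon_wiener(counts):
    """Return H' = -sum(p_i * ln(p_i)) over species with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one individual")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical per-species capture counts for two zones (not from the study).
high_noise_zone = [10, 5, 3, 2]
low_noise_zone = [6, 5, 5, 4, 4, 3, 2, 1]

print(round(shannon_wiener(high_noise_zone), 3))
print(round(shannon_wiener(low_noise_zone), 3))
```

The index rises both with the number of species and with how evenly individuals are spread across them, which is why a zone with more species and more balanced counts scores higher.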
Procedia PDF Downloads 365
460 Undersea Communications Infrastructure: Risks, Opportunities, and Geopolitical Considerations
Authors: Lori W. Gordon, Karen A. Jones
Abstract:
Today’s high-speed data connectivity depends on a vast global network of infrastructure across space, air, land, and sea, with undersea cable infrastructure (UCI) serving as the primary means for intercontinental and ‘long-haul’ communications. The UCI landscape is changing and includes an increasing variety of state actors, such as the growing economies of Brazil, Russia, India, China, and South Africa. Non-state commercial actors, such as hyper-scale content providers including Google, Facebook, Microsoft, and Amazon, are also seeking to control their data and networks through significant investments in submarine cables. Active investments by both state and non-state actors will invariably influence the growth, geopolitics, and security of this sector. Beyond these hyper-scale content providers, there are new commercial satellite communication providers. These players include traditional geosynchronous (GEO) satellites that offer broad coverage, high-throughput GEO satellites offering high capacity with spot beam technology, and low earth orbit (LEO) ‘mega constellations’ offering global broadband services. Potential new entrants include High Altitude Platforms (HAPS), which offer low-latency connectivity, and LEO constellations offering high-speed optical mesh networks, i.e., ‘fiber in the sky.’ This paper focuses on understanding the role of submarine cables within the larger context of the global data commons, spanning space, terrestrial, air, and sea networks, including an analysis of national security policy and geopolitical implications. As network operators and commercial and government stakeholders plan for emerging technologies and architectures, hedging risks for future connectivity will ensure that our data backbone will be secure for years to come.
Keywords: communications, global, infrastructure, technology
Procedia PDF Downloads 87
459 Fire Risk Information Harmonization for Transboundary Fire Events between Portugal and Spain
Authors: Domingos Viegas, Miguel Almeida, Carmen Rocha, Ilda Novo, Yolanda Luna
Abstract:
Forest fires along the more than 1200 km of the Spanish-Portuguese border are increasingly frequent, currently reaching around 2000 fire events per year. Some of these events develop into large international wildfires requiring concerted operations based on shared information between the two countries. The fire event of Valencia de Alcantara (2003), which caused several fatalities and burnt more than 13000 ha, is a reference example of these international events. Currently, Portugal and Spain have a specific cross-border cooperation protocol on wildfire response for a strip of about 30 km (15 km on each side). Public authorities recognize the success of this collaboration; however, it is also assumed that this cooperation should include more functionalities, such as the development of a common risk information system for transboundary fire events. Since Portuguese and Spanish authorities use different approaches to determine the inputs of the fire risk indexes and different methodologies to assess the fire risk, joint firefighting operations are sometimes jeopardized because the information is not harmonized and the civil protection agents of the two countries do not share the same understanding of the situation. Thus, a methodology aiming at the harmonization of fire risk calculation and perception by Portuguese and Spanish civil protection authorities is presented here, along with the final results. The fire risk index used in this work is the Canadian Fire Weather Index (FWI), which is based on meteorological data. The FWI is limited in its application, as it does not take into account other important factors with great effect on fire occurrence and development. The combination of these factors is very complex since, besides meteorology, it addresses several parameters from different fields, namely sociology, topography, vegetation and soil cover.
Therefore, the meaning of FWI values differs from region to region, according to the specific characteristics of each region. In this work, a methodology is proposed for FWI calibration based on the number of fire occurrences and on the burnt area in the transboundary regions of Portugal and Spain, in order to assess the fire risk based on calibrated FWI values. As previously mentioned, cooperative firefighting operations require a common perception of the information shared. Therefore, a common classification of the fire risk for fire events occurring in the transboundary strip is proposed, with the objective of harmonizing this type of information. This work is part of the ECHO project SpitFire - Spanish-Portuguese Meteorological Information System for Transboundary Operations in Forest Fires, which aims at the development of a web platform for sharing information and decision-support tools to be used in international fire events involving Portugal and Spain.
Keywords: data harmonization, FWI, international collaboration, transboundary wildfires
Procedia PDF Downloads 252
458 Combining Shallow and Deep Unsupervised Machine Learning Techniques to Detect Bad Actors in Complex Datasets
Authors: Jun Ming Moey, Zhiyaun Chen, David Nicholson
Abstract:
Bad actors are often hard to detect in data that imprints their behaviour patterns because they are comparatively rare events embedded in non-bad actor data. An unsupervised machine learning framework is applied here to detect bad actors in financial crime datasets that record millions of transactions undertaken by hundreds of actors (<0.01% bad). Specifically, the framework combines ‘shallow’ (PCA, Isolation Forest) and ‘deep’ (Autoencoder) methods to detect outlier patterns. Detection performance analysis for both the individual methods and their combination is reported.
Keywords: detection, machine learning, deep learning, unsupervised, outlier analysis, data science, fraud, financial crime
Procedia PDF Downloads 94
457 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow
Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat
Abstract:
Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches were evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature to the classification of engagement and distraction was shown to be eye gaze. 
It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that is not subject to inter-rater reliability or human observation and is not reliant on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.
Keywords: affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, student engagement
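The leave-one-out cross-validation and per-class accuracies mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions: a 1-nearest-neighbour rule stands in for the study's random forest, the function names are hypothetical, and the feature values and labels are invented.

```python
from collections import defaultdict


def nearest_neighbour_predict(train_x, train_y, query):
    """Stand-in classifier: label of the closest training sample (squared Euclidean)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]


def leave_one_out_per_class_accuracy(xs, ys, predict=nearest_neighbour_predict):
    """Hold out each sample once, train on the rest, and report accuracy per true class."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        totals[ys[i]] += 1
        if predict(train_x, train_y, xs[i]) == ys[i]:
            hits[ys[i]] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical two-feature samples (e.g. a gaze score and a pose score).
features = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [0.2, 0.1], [0.1, 0.3]]
labels = ["engaged", "engaged", "engaged", "disengaged", "disengaged"]
print(leave_one_out_per_class_accuracy(features, labels))
```

Reporting accuracy separately per class, as the abstract does, makes class imbalance visible: a model can score well on the majority class while performing poorly on the minority one.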
Procedia PDF Downloads 94
456 Labile and Humified Carbon Storage in Natural and Anthropogenically Affected Luvisols
Authors: Kristina Amaleviciute, Ieva Jokubauskaite, Alvyra Slepetiene, Jonas Volungevicius, Inga Liaudanskiene
Abstract:
The main task of this research was to investigate the chemical composition of differently used soils in profiles. To identify the differences between the soils, soil organic carbon (SOC) and its fractional composition were investigated: dissolved organic carbon (DOC), mobile humic acids (MHA) and the C to N ratio of natural and anthropogenically affected Luvisols. Research object: natural and anthropogenically affected Luvisol, Akademija, Kedainiai distr., Lithuania. Chemical analyses were carried out at the Chemical Research Laboratory of the Institute of Agriculture, LAMMC. Soil samples for chemical analyses were taken from the genetic soil horizons. SOC was determined by the Tyurin method modified by Nikitin, measuring with a Cary 50 (VARIAN) spectrometer at a wavelength of 590 nm using glucose standards. For the determination of mobile humic acids (MHA), extraction was carried out using 0.1 M NaOH solution. Dissolved organic carbon (DOC) was analyzed using a SKALAR ion chromatograph. pH was measured in 1M H2O. Total N was determined by the Kjeldahl method. Results: Based on the obtained results, it can be stated that the chemical composition is transformed throughout the genetic soil horizons. The morphology of the upper layers of the soil profile, which formed under natural conditions, was changed by anthropomorphic (agrogenic, urbogenic, technogenic and other) structures. Anthropogenic activities and mechanical and biochemical disturbances destroy the natural characteristics of soil formation and complicate the interpretation of soil development. Due to intensive cultivation, the pH curve evens out (the acidification characteristic of the E horizon disappears) compared with natural Luvisol. Luvisols affected by agricultural activities were characterized by a decrease in the absolute amount of humic substances in separate horizons. However, more sustainable and higher carbon sequestration and a thicker humic horizon were observed compared with forest Luvisol.
However, the average content of humic substances in the soil profile was lower. Soil organic carbon content in anthropogenic Luvisols was lower than in the natural forest soil, but it was more evenly spread over the wider thickness of the accumulative horizon. These data suggest that geo-ecological organization declines and agroecological organization increases in Luvisols. Acknowledgement: This work was supported by the National Science Program ‘The effect of long-term, different-intensity management of resources on the soils of different genesis and on other components of the agro-ecosystems’ [grant number SIT-9/2015] funded by the Research Council of Lithuania.
Keywords: agrogenization, dissolved organic carbon, luvisol, mobile humic acids, soil organic carbon
Procedia PDF Downloads 236
455 'CardioCare': A Cutting-Edge Fusion of IoT and Machine Learning to Bridge the Gap in Cardiovascular Risk Management
Authors: Arpit Patil, Atharav Bhagwat, Rajas Bhope, Pramod Bide
Abstract:
This research integrates IoT and ML to predict heart failure risks, utilizing the Framingham dataset. IoT devices gather real-time physiological data, focusing on heart rate dynamics, while ML, specifically Random Forest, predicts heart failure. Rigorous feature selection enhances accuracy, achieving over 90% prediction rate. This amalgamation marks a transformative step in proactive healthcare, highlighting early detection's critical role in cardiovascular risk mitigation. Challenges persist, necessitating continual refinement for improved predictive capabilities.
Keywords: cardiovascular diseases, internet of things, machine learning, cardiac risk assessment, heart failure prediction, early detection, cardio data analysis
Procedia PDF Downloads 11
454 Changes in the Demand of Waterway Passengers During COVID-19 Pandemic: Case Study of Belém-Marajó Island, in Brazil
Authors: Maisa Sales Gama Tobias, Humberto de Paiva Junior, Luciano Silva Brito, Rui António Rodrigues Ramos
Abstract:
Waterway transport was the first means of access and occupation in the Amazon region. Because of its high economic and social importance, it is still one of the main transport modes to several places in the region and, for some places, the only one. With the advent of the pandemic, transport companies that already faced management challenges began to experience unprecedented structural changes and trends in trade and global supply chains. Thus, companies need operational reorganization to maintain the sustainability of the service, under penalty of loss of demand. In addition, demand was observed to change its behavior to adapt to this new moment. However, the lack of information about these changes makes it difficult to find solutions to maintain the quality of service. This work aimed to characterize the changes in the demand of waterway passengers through an empirical study with field research involving interviews with users and crew, on-board journeys, and visits to the waterway service company. The case study is the Belém-Camara route, on Marajó Island, in the state of Pará. This line is traditionally the only means of transport for this route, besides air transport on a much smaller scale. The collected data received the descriptive and analytical statistical treatment presented in this work. As the main result, the COVID-19 pandemic caused significant changes, mainly in trip time and trip motives and in passengers' perception of service quality, with an increase in trip time and a feeling of insecurity. In conclusion, the service operator must review cost management and business survival strategies and tactics. The viability of the service and the social guarantee of transport proved to be threatened, putting at risk the service to the riverside populations.
Keywords: demand of waterway transport passengers, data analysis, COVID-19, amazonia
Procedia PDF Downloads 113
453 Machine Learning for Disease Prediction Using Symptoms and X-Ray Images
Authors: Ravija Gunawardana, Banuka Athuraliya
Abstract:
Machine learning has emerged as a powerful tool for disease diagnosis and prediction. The use of machine learning algorithms has the potential to improve the accuracy of disease prediction, thereby enabling medical professionals to provide more effective and personalized treatments. This study focuses on developing a machine-learning model for disease prediction using symptoms and X-ray images. The importance of this study lies in its potential to assist medical professionals in accurately diagnosing diseases, thereby improving patient outcomes. Respiratory diseases are a significant cause of morbidity and mortality worldwide, and chest X-rays are commonly used in the diagnosis of these diseases. However, accurately interpreting X-ray images requires significant expertise and can be time-consuming, making it difficult to diagnose respiratory diseases in a timely manner. By incorporating machine learning algorithms, we can significantly enhance disease prediction accuracy, ultimately leading to better patient care. The study utilized the Mask R-CNN algorithm, which is a state-of-the-art method for object detection and segmentation in images, to process chest X-ray images. The model was trained and tested on a large dataset of patient information, which included both symptom data and X-ray images. The performance of the model was evaluated using a range of metrics, including accuracy, precision, recall, and F1-score. The results showed that the model achieved an accuracy rate of over 90%, indicating that it was able to accurately detect and segment regions of interest in the X-ray images. In addition to X-ray images, the study also incorporated symptoms as input data for disease prediction. The study used three different classifiers, namely Random Forest, K-Nearest Neighbor and Support Vector Machine, to predict diseases based on symptoms. These classifiers were trained and tested using the same dataset of patient information as the X-ray model. 
The results showed promising accuracy rates for predicting diseases using symptoms, with the ensemble learning techniques significantly improving the accuracy of disease prediction. The study's findings indicate that the use of machine learning algorithms can significantly enhance disease prediction accuracy, ultimately leading to better patient care. The model developed in this study has the potential to assist medical professionals in diagnosing respiratory diseases more accurately and efficiently. However, it is important to note that the accuracy of the model can be affected by several factors, including the quality of the X-ray images, the size of the dataset used for training, and the complexity of the disease being diagnosed. In conclusion, the study demonstrated the potential of machine learning algorithms for disease prediction using symptoms and X-ray images. The use of these algorithms can improve the accuracy of disease diagnosis, ultimately leading to better patient care. Further research is needed to validate the model's accuracy and effectiveness in a clinical setting and to expand its application to other diseases.
Keywords: K-nearest neighbor, mask R-CNN, random forest, support vector machine
Procedia PDF Downloads 154
452 Analysis of Spatial and Temporal Data Using Remote Sensing Technology
Authors: Kapil Pandey, Vishnu Goyal
Abstract:
Spatial and temporal data analysis is well known in the field of satellite image processing. When spatial data are combined with time series analysis, they give significant results in change detection studies. In this paper, GIS and remote sensing techniques have been used to detect change using time series satellite imagery of Uttarakhand state for the years 1990-2010. Natural vegetation, urban area, forest cover, etc. were chosen as the main landuse classes to study. Landuse/landcover classes for several years were prepared using satellite images. The maximum likelihood supervised classification technique was adopted in this work; finally, a landuse change index was generated, and graphical models were used to present the changes.
Keywords: GIS, landuse/landcover, spatial and temporal data, remote sensing
Procedia PDF Downloads 433
451 Diagnosis of Diabetes Using Computer Methods: Soft Computing Methods for Diabetes Detection Using Iris
Authors: Piyush Samant, Ravinder Agarwal
Abstract:
Complementary and Alternative Medicine (CAM) techniques are quite popular and effective for chronic diseases. Iridology is a CAM technique more than 150 years old that analyzes the patterns, tissue weakness, color, shape, structure, etc. of the iris for disease diagnosis. The objective of this paper is to validate the use of iridology for the diagnosis of diabetes. The suggested model was applied to a systemic disease with ocular effects. Data from 200 subjects, 100 diabetic and 100 non-diabetic, were evaluated. The complete procedure was kept very simple and free from the involvement of any iridologist. From the normalized iris, the region of interest was cropped. In all, 63 features were extracted using statistical analysis, texture analysis, and two-dimensional discrete wavelet transformation. A comparison of the accuracies of six different classifiers is presented. The results show 89.66% accuracy for the random forest classifier.
Keywords: complementary and alternative medicine, classification, iridology, iris, feature extraction, disease prediction
Procedia PDF Downloads 407
450 Application of Fuzzy Multiple Criteria Decision Making for Flooded Risk Region Selection in Thailand
Authors: Waraporn Wimuktalop
Abstract:
This research selects regions that are vulnerable to flooding at different levels. Mathematical principles are used systematically and rationally as a tool to solve the region selection problem. The method chosen is Multiple Criteria Decision Making (MCDM), with two analysis techniques: TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) and AHP (Analytic Hierarchy Process). Three criteria are considered in this research. The first criterion is climate, represented by rainfall. The second criterion is geography, represented by the height above mean sea level. The last criterion is land utilization, covering both forest and agricultural use. The study found that the South has the highest risk of flooding, followed by the East, the Centre, the North-East, the West and the North, respectively.
Keywords: multiple criteria decision making, TOPSIS, analytic hierarchy process, flooding
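As a rough illustration of how TOPSIS ranks alternatives, the sketch below scores hypothetical regions on three criteria similar to those named in the abstract. The region names, values, weights, and the direction assumed for each criterion are all invented for illustration; in the study the weights would presumably come from AHP, and the fuzzy extension mentioned in the title is not modelled here.

```python
import math


def topsis(matrix, weights, benefit):
    """Return closeness scores in [0, 1]; higher means closer to the ideal.

    matrix:  rows are alternatives, columns are criteria.
    weights: one weight per criterion.
    benefit: True if a larger value should raise the score, False if it lowers it.
    """
    n = len(weights)
    # Vector-normalise each column (a zero column is left unscaled), then apply weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_plus = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_minus = math.sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        total = d_plus + d_minus
        scores.append(d_minus / total if total else 0.0)
    return scores


# Hypothetical data: [annual rainfall (mm), elevation (m), forest/farm land share].
regions = ["Region A", "Region B", "Region C"]
data = [[2400, 20, 0.35], [1500, 150, 0.50], [1200, 400, 0.60]]
weights = [0.5, 0.3, 0.2]          # assumed weights, e.g. as AHP might supply
raises_risk = [True, False, False]  # assumed directions for a flood-risk score

for name, score in sorted(zip(regions, topsis(data, weights, raises_risk)),
                          key=lambda pair: pair[1], reverse=True):
    print(name, round(score, 3))
```

Here a higher score is read as higher flood risk, because the "ideal" point is defined as the most flood-prone combination of criteria; flipping the `benefit` flags would instead rank regions by safety.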
Procedia PDF Downloads 233
449 DeepNIC a Method to Transform Each Tabular Variable into an Independant Image Analyzable by Basic CNNs
Authors: Nguyen J. M., Lucas G., Ruan S., Digonnet H., Antonioli D.
Abstract:
Introduction: Deep Learning (DL) is a very powerful tool for analyzing image data, but for tabular data it cannot compete with machine learning methods like XGBoost. The research question becomes: can tabular data be transformed into images that can be analyzed by simple CNNs (Convolutional Neural Networks)? Will DL be the absolute tool for data classification? All current solutions consist in repositioning the variables in a 2x2 matrix using their correlation proximity. In doing so, they obtain an image whose pixels are the variables. We implement a technology, DeepNIC, that offers the possibility of obtaining an image for each variable, which can be analyzed by simple CNNs. Material and method: The 'ROP' (Regression OPtimized) model is a binary and atypical decision tree whose nodes are managed by a new artificial neuron, the Neurop. By positioning an artificial neuron in each node of the decision tree, it is possible to make an adjustment on a theoretically infinite number of variables at each node. From this new decision tree whose nodes are artificial neurons, we created the concept of a 'Random Forest of Perfect Trees' (RFPT), which disobeys Breiman's concepts by assembling very large numbers of small trees with no classification errors. From the results of the RFPT, we developed a family of 10 statistical information criteria, the Nguyen Information Criteria (NICs), which evaluate in 3 dimensions the predictive quality of a variable: Performance, Complexity and Multiplicity of solution. A NIC is a probability that can be transformed into a grey level. The value of a NIC depends essentially on 2 super parameters used in Neurops. By varying these 2 super parameters, we obtain a 2x2 matrix of probabilities for each NIC. We can combine these 10 NICs with the functions AND, OR, and XOR. The total number of combinations is greater than 100,000. In total, we obtain for each variable an image of at least 1166x1167 pixels.
The intensity of the pixels is proportional to the probability of the associated NIC. The color depends on the associated NIC. This image contains considerable information about the ability of the variable to predict Y, depending on the presence or absence of other variables. A basic CNN model was trained for supervised classification. Results: The first results are impressive. Using the GSE22513 public data (an omics data set of markers of taxane sensitivity in breast cancer), DeepNIC outperformed other statistical methods, including XGBoost. We still need to generalize the comparison to several databases. Conclusion: The ability to transform any tabular variable into an image offers the possibility of merging image and tabular information in the same format. This opens up great perspectives in the analysis of metadata.
Keywords: tabular data, CNNs, NICs, DeepNICs, random forest of perfect trees, classification
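The statement that a NIC is a probability that can be transformed into a grey level can be illustrated with a minimal sketch. The 8-bit scale (256 levels) and the function names below are assumptions made for illustration only; the abstract does not specify how DeepNIC performs the mapping.

```python
import math


def nic_to_grey(p, levels=256):
    """Map a probability in [0, 1] to an integer grey level in [0, levels - 1].

    The linear 8-bit mapping is an assumption, not the DeepNIC method.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a NIC is a probability and must lie in [0, 1]")
    # Round half up so the result does not depend on banker's rounding.
    return math.floor(p * (levels - 1) + 0.5)


def nic_grid_to_image(grid):
    """Convert a matrix of NIC probabilities into a matrix of grey levels."""
    return [[nic_to_grey(p) for p in row] for row in grid]


# Hypothetical probabilities, chosen only to exercise the mapping.
grid = [[0.0, 0.25], [0.5, 1.0]]
image = nic_grid_to_image(grid)
```

Pixel intensity here is proportional to the probability, in line with the description above; the choice of color per NIC is not modelled.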
Procedia PDF Downloads 125
448 Data Confidentiality in Public Cloud: A Method for Inclusion of ID-PKC Schemes in OpenStack Cloud
Authors: N. Nalini, Bhanu Prakash Gopularam
Abstract:
The term data security refers to the degree of resistance or protection given to information from unintended or unauthorized access. The core principles of information security are confidentiality, integrity and availability, also referred to as the CIA triad. Cloud computing services are classified as SaaS, IaaS and PaaS services. With cloud adoption, confidential enterprise data are moved from organization premises to untrusted public networks, and as a result the attack surface has increased manifold. Several cloud computing platforms, such as OpenStack, Eucalyptus and Amazon EC2, allow users to build and configure public, hybrid and private clouds. While traditional encryption based on PKI infrastructure still works in the cloud scenario, the management of public-private keys and trust certificates is difficult. Identity-based Public Key Cryptography (also referred to as ID-PKC) overcomes this problem by using publicly identifiable information for generating the keys, and it works well with decentralized systems. Users can exchange information securely without having to manage any trust information. Another advantage is that access control (role-based access control policy) information can be embedded into the data, unlike in PKI, where it is handled by a separate component or system. In the OpenStack cloud platform, the keystone service acts as the identity service for authentication and authorization and has support for public key infrastructure for auto services. In this paper, we explain the OpenStack security architecture and evaluate the PKI infrastructure piece for data confidentiality. We provide a method to integrate ID-PKC schemes for securing data in transit and at rest, and explain the key measures for safeguarding data against security attacks.
The proposed approach uses the JPBC crypto library for key-pair generation based on the IEEE P1636.3 standard and for secure communication with other cloud services.
Keywords: data confidentiality, identity based cryptography, secure communication, open stack key stone, token scoping
Procedia PDF Downloads 384
447 Calling the Shots: How Others’ Mistakes May Influence Vaccine Take-up
Authors: Elizabeth Perry, Jylana Sheats
Abstract:
Scholars posit that there is an overlap between the fields of Behavioral Economics (BE) and Behavior Science (BSci), and that consideration of concepts from both may facilitate a greater understanding of health decision-making processes. For example, the ‘intention-action gap’ is one BE concept used to explain suboptimal decision-making. It is described as having knowledge that does not translate into behavior. Complementary best BSci practices may provide insights into behavioral determinants and relevant behavior change techniques (BCT). Within the context of BSci, this exploratory study aimed to apply a BE concept with demonstrated effectiveness in financial decision-making to a health behavior: influenza (flu) vaccine uptake. Adults aged >18 years were recruited on Amazon’s Mechanical Turk, a digital labor market where anonymous users perform simple tasks at low cost. Eligible participants were randomized into 2 groups, reviewed a scenario, and then completed a survey on the likelihood of receiving a flu shot. The ‘usual care’ group’s scenario included standard CDC guidance that supported the behavior. The ‘intervention’ group’s scenario included messaging about people who did not receive the flu shot. The framing was such that participants could learn from the mistakes of others (strangers) and the subsequent health consequences: ‘Last year, other people who didn’t get the vaccine were about twice as likely to get the flu, and a number of them were hospitalized or even died. Don’t risk it.’ Descriptive statistics and chi-square analyses were performed on the sample. There were 648 participants (usual care, n=326; int., n=322). Among racial/ethnic minorities (n=169; 57% aged < 40), the intervention group was 22% more likely to report that they were ‘extremely’ or ‘moderately’ likely to get the flu vaccine (p = 0.11).
While not statistically significant, the findings suggest that framing messages from the perspective of learning from the mistakes of unknown others, coupled with the BCT ‘knowledge about the health consequences’, may help influence flu vaccine uptake among the study population. Given the widely documented disparities in vaccine uptake, exploration of the complementary application of these concepts and strategies may be critical.
Keywords: public health, decision-making, vaccination, behavioral science
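The chi-square analysis mentioned above can be sketched for a 2x2 table (group by reported likelihood). The counts below are hypothetical and are not the study's data, which the abstract does not report. The sketch uses the Pearson statistic without continuity correction, which is an assumption about the test variant.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (1 degree of freedom).

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows = usual care / intervention,
# columns = likely / not likely to get the vaccine.
hypothetical_counts = [[30, 20], [20, 30]]
chi2, p_value = chi_square_2x2(hypothetical_counts)
```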
Procedia PDF Downloads 41
446 XAI Implemented Prognostic Framework: Condition Monitoring and Alert System Based on RUL and Sensory Data
Authors: Faruk Ozdemir, Roy Kalawsky, Peter Hubbard
Abstract:
Accurate estimation of remaining useful life (RUL) provides a basis for effective predictive maintenance, reducing unexpected downtime for industrial equipment. However, while models such as the Random Forest have effective predictive capabilities, they are so-called ‘black box’ models, whose limited interpretability is a barrier to the critical diagnostic decisions made in aviation-related industries. The purpose of this work is to present a prognostic framework that embeds Explainable Artificial Intelligence (XAI) techniques in order to provide essential transparency in the decision-making mechanisms of Machine Learning methods based on sensor data, with the objective of procuring actionable insights for the aviation industry. Sensor readings have been gathered from critical equipment such as a turbofan jet engine and landing gear, and the RUL is predicted by a Random Forest model. The workflow involves steps such as data gathering, feature engineering, model training, and evaluation. Models are trained and evaluated independently on each critical component's dataset. Although the predictions are suitable and the performance metrics reasonably good, such complex models obscure the reasoning behind their predictions and may even undermine the confidence of decision-makers or maintenance teams. In the second phase, global explanations using SHAP and local explanations using LIME are therefore added to bridge the gap in reliability within industrial contexts. These tools analyze model decisions, highlighting feature importance and explaining how each input variable affects the output. This dual approach offers a general comprehension of the overall model behavior and detailed insight into specific predictions. The proposed framework, in its third component, incorporates causal analysis techniques in the form of Granger causality tests in order to move beyond correlation toward causation.
This will allow the framework not only to predict failures but also to present relevant personnel with reasons, drawn from the key sensor features linked to possible failure mechanisms. Establishing causality between sensor behaviors and equipment failures creates considerable value for maintenance teams through better root cause identification and more effective preventive measures. This step makes the system more explainable. In yet another stage, several simple surrogate models, including Decision Trees and Linear Models, can be used to approximate the complex Random Forest model. These simpler models act as backups, replicating important aspects of the original model's behavior. If the feature explanations obtained from the surrogate model are cross-validated against the primary model, the insights derived will be more reliable and provide an intuitive sense of how the input variables affect the predictions. We then create an iterative explainable feedback loop, where the knowledge learned from the explainability methods feeds back into the training of the models. This drives a cycle of continuous improvement in both model accuracy and interpretability over time. By systematically integrating new findings, the model is expected to adapt to changed conditions and further develop its prognostic capability. These components are then presented to decision-makers through a fully transparent condition monitoring and alert system. The system provides a holistic tool for maintenance operations by leveraging RUL predictions, feature importance scores, persistent sensor threshold values, and autonomous alert mechanisms.
Since the system provides explanations for its predictions, along with active alerts, maintenance personnel can make informed decisions about the correct interventions to extend the life of critical machinery.
Keywords: predictive maintenance, explainable artificial intelligence, prognostic, RUL, machine learning, turbofan engines, C-MAPSS dataset
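One way to picture how RUL predictions and persistent sensor threshold values could drive an alert is the minimal sketch below. The rule itself is an assumption: alert when the predicted RUL falls under a limit, or when a sensor stays above its threshold for several consecutive readings. The function names and the parameters `rul_limit` and `min_consecutive` are likewise hypothetical; the abstract does not describe the actual alert logic.

```python
def persistent_breach(readings, threshold, min_consecutive=3):
    """Return True if readings exceed threshold for min_consecutive samples in a row."""
    run = 0
    for value in readings:
        # Extend the current run on a breach, otherwise reset it.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


def should_alert(predicted_rul, rul_limit, readings, threshold, min_consecutive=3):
    """Hypothetical alert rule: low predicted RUL or a persistent sensor breach."""
    low_rul = predicted_rul < rul_limit
    return low_rul or persistent_breach(readings, threshold, min_consecutive)
```

Requiring several consecutive breaches is a simple way to keep a single noisy reading from raising an alert.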
Procedia PDF Downloads 6
445 Environmental Education and Water Resources Management in the City of Belem, Para, Brazil
Authors: Naiara de Almeida Rios
Abstract:
The environmental education, from Tbilisi, is signaled as an important instrument for conservation and environmental management. However, the social, economic, political and environmental aspects of each place require an environmental management that corresponds to the reality to which they are inserted, as well as environmental education practices. The city of Belém, the capital of the State of Pará, is one of the most important cities in the Amazon Region, and its vast water dimension requires that its watersheds take a careful look at their socio-environmental management. The Estrada Nova Hydrographic Basin is considered as one of the most critical river basins in the city due to flooding, lack of basic sanitation and degradation of water bodies. In this context, environmental education is understood as one of the necessary conditions to reduce environmental degradation. Environmental education presents itself as an instrument of social transformation and conservation of natural resources (especially water resources), where thinking about the sustainability of natural resources is moving towards dialogue on the importance of building an environmental awareness. The commitment that environmental education proposes covers all spheres of society, since the main objective of the same is the transformation of thought and attitudes from the understanding of reality. Therefore, to analyze how the government is managing the basin, as well as the environmental education practices developed in it, is fundamental, so that government can be charged with improvements for the population and for the natural environment. Therefore, the objective of this study is to analyze the influence of environmental education actions developed by local public authorities in the management of the Estrada Nova Hydrographic Basin, Belém/PA. For the accomplishment of this study, some methodological procedures will be used, like documentary analysis, bibliographical survey and fieldwork. 
The multivariate statistical method is used to analyze the results obtained in the field. Unfortunately, public policies in the area of environmental education in Belém are still advancing slowly, since government interests have had very little dialogue with the socio-environmental problems that affect the Estrada Nova Hydrographic Basin. Both formal and informal environmental education have been poorly developed, hampering the continuous process proposed by water resources management.
Keywords: environmental education, environmental management, hydrographic basin, water resources
Procedia PDF Downloads 189
444 Transformable Lightweight Structures for Short-term Stay
Authors: Anna Daskalaki, Andreas Ashikalis
Abstract:
This is a conceptual project that suggests an alternative type of summer camp in the forest of Rouvas on the island of Crete. Taking into account some feasts that are organised by the locals or by mountaineering clubs near the church of St. John, we created a network of lightweight timber structures that serve the needs of the visitor. These structures are transformable and satisfy the need for rest, food, and sleep: a seat, a table and a tent are embodied in each structure. The structures blend in with the environment, as they are installed according to the following parameters: (a) the local relief, (b) the clusters of trees, and (c) the existing paths. Each timber structure can be considered a module that is either totally independent or part of a bigger construction. The design showcases the advantages of a timber structure: it adapts readily to the needs of the project, and timber is a sustainable and environmentally friendly material that can be recycled. Finally, it is important to note that the basic goal of this project is the minimum alteration of the natural environment.
Keywords: lightweight structures, timber, transformable, tent
Procedia PDF Downloads 169
443 Clean Technology: Hype or Need to Have
Authors: Dirk V. H. K. Franco
Abstract:
For many of us, a lot of phenomena are considered a risk. Examples are climate change, the decrease of biodiversity, the amount of available clean water and the decreasing variety of living organisms in the oceans. On the other hand, a lot of people perceive the following trends as catastrophic: the sea level, the melting of the polar ice, the numbers of tornadoes, floods and forest fires, national security and the potential of 192 million climate migrants in 2060. The interest in climate, health and the possible solutions is large and common. The 5th IPCC report states that, in recent decades, human activities especially (and, secondarily, natural emissions) have caused large, mainly negative impacts on our ecological environments. Chris Stringer stated that we represent, after evolution, only one version of the possible humanity. At this very moment we are faced with an (over)crowded planet together with global climate changes and a strong demand for energy and material resources. Let us hope that we can counter these difficulties either with better application of existing technologies or by inventing new (applications of) clean technologies together with new business models.
Keywords: clean technologies, catastrophic, climate, possible solutions
Procedia PDF Downloads 499
442 A Highly Accurate Computer-Aided Diagnosis: CAD System for the Diagnosis of Breast Cancer by Using Thermographic Analysis
Authors: Mahdi Bazarganigilani
Abstract:
Computer-aided diagnosis (CAD) systems can play crucial roles in diagnosing serious diseases such as breast cancer at the earliest stage. In this paper, a CAD system for the diagnosis of breast cancer is introduced and evaluated. The CAD system was developed by applying spatio-temporal analysis, based on wavelet transformation, to a set of consecutive thermographic images. Using this analysis, a very accurate random forest machine learning model was obtained. The final results showed a promising score of 91% on the F1 measure across sample data from 200 patients. The CAD system was further extended to obtain a detailed analysis of the effect of smaller sub-areas of each breast on the occurrence of cancer.
Keywords: computer-aided diagnosis systems, thermographic analysis, spatio-temporal analysis, image processing, machine learning
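Since the result above is reported on the F1 measure, a short sketch of how that score follows from true positives, false positives and false negatives may help. The function name is illustrative; it does not reproduce the paper's evaluation code, and no counts from the study are assumed.

```python
def f1_score(tp, fp, fn):
    """F1 measure: the harmonic mean of precision and recall.

    tp, fp, fn are counts of true positives, false positives and false negatives.
    Returns 0.0 when precision and recall are both undefined or zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Unlike plain accuracy, the F1 measure ignores true negatives, which makes it informative when healthy cases far outnumber cancer cases.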
Procedia PDF Downloads 210
441 Predicting OpenStreetMap Coverage by Means of Remote Sensing: The Case of Haiti
Authors: Ran Goldblatt, Nicholas Jones, Jennifer Mannix, Brad Bottoms
Abstract:
Accurate, complete, and up-to-date geospatial information is the foundation of successful disaster management. When the 2010 Haiti Earthquake struck, accurate and timely information on the distribution of critical infrastructure was essential for the disaster response community to conduct effective search and rescue operations. Existing geospatial datasets such as Google Maps did not have comprehensive coverage of these features. In the days following the earthquake, many organizations released high-resolution satellite imagery, catalyzing a worldwide effort to map Haiti and support the recovery operations. Of these organizations, OpenStreetMap (OSM), a collaborative project to create a free editable map of the world, used the imagery to support volunteers in digitizing roads, buildings, and other features, creating the most detailed map of Haiti in existence in just a few weeks. However, large portions of the island are still not fully covered by OSM. There is an increasing need for a tool to automatically identify which areas in Haiti, as well as in other countries vulnerable to disasters, are not fully mapped. The objective of this project is to leverage different types of remote sensing measurements, together with machine learning approaches, in order to identify geographical areas where OSM coverage of building footprints is incomplete. Several remote sensing measures and derived products were assessed as potential predictors of OSM building footprint coverage, including: intensity of light emitted at night (based on VIIRS measurements), spectral indices derived from the Sentinel-2 satellite (normalized difference vegetation index (NDVI), normalized difference built-up index (NDBI), soil-adjusted vegetation index (SAVI), urban index (UI)), surface texture (based on Sentinel-1 SAR measurements), elevation and slope.
Additional remote sensing derived products, such as Hansen Global Forest Change, DLR's Global Urban Footprint (GUF), and World Settlement Footprint (WSF), were also evaluated as predictors, as well as the OSM street and road network (including junctions). A supervised classification with a random forest classifier predicted 89% of the variation of OSM building footprint area in a given cell. These predictions allowed for the identification of cells that are predicted to be covered but are not actually mapped yet. With these results, this methodology could be adapted to any location to assist with preparing for future disastrous events and to ensure that essential geospatial information is available to support the response and recovery efforts during and following major disasters.
Keywords: disaster management, Haiti, machine learning, OpenStreetMap, remote sensing
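Three of the spectral indices named above have standard closed-form definitions and can be sketched from band reflectances. On Sentinel-2 the red, near-infrared and shortwave-infrared bands are commonly taken as B4, B8 and B11. The soil factor of 0.5 is the conventional SAVI default, and the zero-denominator fallback is an assumption of this sketch rather than part of the project's pipeline.

```python
def _ratio(numerator, denominator):
    """Safe division; returns 0.0 for an empty (zero-reflectance) pixel."""
    return numerator / denominator if denominator != 0 else 0.0


def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    return _ratio(nir - red, nir + red)


def ndbi(swir, nir):
    """Normalized difference built-up index: (SWIR - NIR) / (SWIR + NIR)."""
    return _ratio(swir - nir, swir + nir)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index with soil brightness factor L (default 0.5)."""
    return (1 + soil_factor) * _ratio(nir - red, nir + red + soil_factor)
```

Each index would be computed per pixel and then aggregated per grid cell before being passed to the classifier as a predictor; that aggregation step is not shown here.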
Procedia PDF Downloads 125
440 Diversity and Use of Agroforestry Yards of Family Farmers of Ponte Alta – Gama, Federal District, Brazil
Authors: Kever Bruno Paradelo Gomes, Rosana Carvalho Martins
Abstract:
The home gardens areas are production systems, which are located near the homes and are quite common in the tropics. They consist of agricultural and forest species and may also involve the raising of small animals to produce food for subsistence as well as income generation, with a special focus on the conservation of biodiversity. Home gardens are diverse Agroforestry systems with multiple uses, among many, food security, income aid, traditional medicine. The work was carried out on rural properties of the family farmers of the Ponte Alta Rural Nucleus, Gama Administrative Region, in the city of Brasília, Federal District- Brazil. The present research is characterized methodologically as a quantitative, exploratory and descriptive nature. The instruments used in this research were: bibliographic survey and semi-structured questionnaire. The data collection was performed through the application of a semi-structured questionnaire, containing questions that referred to the perception and behavior of the interviewed producer on the subject under analysis. In each question, the respondent explained his knowledge about sustainability, agroecological practices, environmental legislation, conservation methods, forest and medicinal species, ago social and socioeconomic characteristics, use and purpose of agroforestry and technical assistance. The sample represented 55.62% of the universe of the study. We interviewed 99 people aged 18-83 years, with a mean age of 49 years. The low level of education, coupled with the lack of training and guidance for small family farmers in the Ponte Alta Rural Nucleus, is one of the limitations to the development of practices oriented towards sustainable and agroecological agriculture in the nucleus. It is observed that 50.5% of the interviewed people landed with agroforestry yards less than 20 years ago, and only 16.17% of them are older than 35 years. 
In identifying agriculture as the main activity of most of the rural properties studied, attention is drawn to the cultivation of medicinal plants, fruits and crops as the most extracted products. However, the crops in the backyards are grown exclusively for family consumption, which could be complemented by marketing the surplus and by adding value to the cultivated products. Initiatives such as this may contribute to an increase in family income and to the motivation for, and value of, cultivation in agroecological gardens. We conclude that the home gardens of Ponte Alta are highly diverse, thus contributing to local biodiversity conservation, and are managed by women to ensure food security and allow income generation. The traditional knowledge on the use and management of the diversity of resources used in agroforestry yards is of paramount importance for the development of sustainable alternative practices.
Keywords: agriculture, agroforestry system, rural development, sustainability
Procedia PDF Downloads 141
439 A Machine Learning Approach to Detecting Evasive PDF Malware
Authors: Vareesha Masood, Ammara Gul, Nabeeha Areej, Muhammad Asif Masood, Hamna Imran
Abstract:
The universal use of PDF files has prompted hackers to use them for malicious purposes by hiding malicious code in PDF files on their victims' machines. Machine learning has proven to be the most efficient means of identifying benign files and detecting files with PDF malware. This paper proposes an approach using a decision tree classifier with parameters. A modern, inclusive dataset, CIC-Evasive-PDFMal2022, produced by Lockheed Martin’s Cyber Security wing, is used. It is one of the most reliable datasets to use in this field. We designed a PDF malware detection system that achieved 99.2%. Compared with other cutting-edge models in the same field of study, the suggested model performs very well in detecting PDF malware. Accordingly, we provide the fastest, most reliable, and most efficient PDF malware detection approach in this paper.
Keywords: PDF, PDF malware, decision tree classifier, random forest classifier
Procedia PDF Downloads 91
438 Epidemiological, Ecology, and Case Management of Plasmodium Knowlesi Malaria in Phang-Nga Province, Thailand
Authors: Surachart Koyadun
Abstract:
Introduction: Plasmodium knowlesi (P. knowlesi) malaria is a zoonotic disease that is classified as the fifth type of human malaria. Commonly found in macaques (Macaca fascicularis and Macaca nemestrina), P. knowlesi is capable of causing both uncomplicated and severe malaria in humans. The situation of P. knowlesi malaria in Phang-Nga province over the 3 years from 2020 to 2022 showed no reported cases in 2020; however, a total of 14 cases were reported in 2021-2022. This research aimed to 1) study the epidemiology of P. knowlesi, 2) examine the clinical manifestations of P. knowlesi patients, 3) analyze the ecology and entomology of P. knowlesi, and 4) analyze the diagnosis and treatment of P. knowlesi. Method: This research was a retrospective descriptive study/case report. The study was conducted in 14 patients with P. knowlesi malaria between 2021 and 2022 in 4 districts of Phang-Nga Province, Thailand: Thapput, Kapong, Takuapa and Khuraburi. Results: The study subjects with P. knowlesi malaria were all males. Most were of working age, were farmers, and worked in forest or plantation areas. None had a history of blood transfusions. Most of the patients did not use mosquito nets and had a history of camping in the forest prior to the onset of fever. An analysis of all 14 sources of infection showed that each area is home to macaques and that Anopheles mosquitoes, the carrier of the disease, have been detected there. The majority became ill in Thailand's dry season (December-April). The main symptoms on presentation at the hospital were fever, chills, headache and body aches. Laboratory findings on the first day of diagnosis were as follows: the white blood cell count was within the normal range; among white blood cells, the proportion of eosinophils was slightly higher than normal; slight anemia was found on early examination; and the platelet count was below normal in all cases.
A severely low platelet count (2,000 cells/mm3) was found in severe cases with multiple complications. No patient died, but complications occurred in 85.7% of cases, with acute renal failure being the most common. Patients with delayed diagnosis and treatment of malaria (inaccurate diagnosis or late access to the hospital) had greater severity and more complications than those who had seen a doctor within the first 3-4 days of illness or had been screened for symptoms and risk history by the malaria clinic staff at the vector-borne disease control unit. Conclusion and Recommendation: P. knowlesi malaria is an emerging infectious disease transmitted from animals to humans. There are challenges in epidemiology, entomology and ecology for effective surveillance, prevention and control. Early diagnosis and treatment would reduce complications and prevent death.
Keywords: malaria, plasmodium knowlesi, epidemiology, ecology, entomology, diagnosis, treatment
Procedia PDF Downloads 71
437 Channel That Can Be Used on Slope, Slide Prone and Seismic Areas, Swelling and Collapsing Soils
Authors: Sabir Tehrankhan Hasanov, Mir Movsum Anar Dadashev
Abstract:
The article provides a brief overview of irrigation systems and canals applied to slopes, landslide-prone, seismic areas, and swelling and collapsing soils. The contemporary construction of the canal used for irrigation, energy, and water supply purposes is described. In order to ensure the durability, longevity, and reliability of the channel, a damping mat made of cast material is created under its cover, and the top is covered with a waterproof screen. Dowels are placed on the bottom and sides of the channel, and the bottom dowel is riveted to the solid bedrock and connected with piles placed at certain distances. Drainage was placed next to the bottom dowel, an operation road was created on one side of the channel, and a berm road was created on the other side. A bathtub was built on the side of the road, and a forest-bush strip was built on its bank.
Keywords: slope, channel, landslide, collapse, swell, soil, structure
Procedia PDF Downloads 86
436 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients
Authors: Karina Zaccari, Ernesto Cordeiro Marujo
Abstract:
This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including Adaptive Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). The Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy for training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.
Keywords: machine learning, medical diagnosis, meningitis detection, pediatric research
Procedia PDF Downloads 150
435 Reinforcement Learning for Classification of Low-Resolution Satellite Images
Authors: Khadija Bouzaachane, El Mahdi El Guarmah
Abstract:
The classification of low-resolution satellite images has been a worthwhile and fertile field that attracts many researchers due to its importance in monitoring geographical areas. It can be used for several purposes, such as disaster management, military surveillance and agricultural monitoring. The main objective of this work is to classify low-resolution satellite images efficiently and accurately by using novel deep learning and reinforcement learning techniques. The images include roads, residential areas, industrial areas, rivers, sea lakes, and vegetation. To achieve that goal, we carried out experiments on Sentinel-2 images, aiming for both high accuracy and efficient classification. Our proposed model achieved 91% accuracy on the testing dataset, along with good classification of land cover. In terms of per-class precision, we obtained 93% for the river, 92% for residential, 97% for residential, 96% for the forest, 87% for annual crop, 84% for herbaceous vegetation, 85% for pasture, 78% for highway and 100% for Sea Lake.
Keywords: classification, deep learning, reinforcement learning, satellite imagery
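The per-class precision figures listed above are the kind of quantity that can be read off a confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes. The function name and layout are illustrative and are not taken from the paper's code.

```python
def per_class_precision(confusion, labels):
    """Precision per class from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j,
    so precision for class j is the diagonal entry over the column sum.
    """
    result = {}
    for j, label in enumerate(labels):
        predicted_as_label = sum(row[j] for row in confusion)
        correct = confusion[j][j]
        result[label] = correct / predicted_as_label if predicted_as_label else 0.0
    return result
```

For example, with a hypothetical two-class matrix `[[9, 1], [3, 7]]` over the labels `["forest", "river"]`, 12 samples are predicted as forest and 9 of them are correct, giving a forest precision of 0.75.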
Procedia PDF Downloads 213
434 Predictive Analysis of the Stock Price Market Trends with Deep Learning
Authors: Suraj Mehrotra
Abstract:
The stock market is a volatile, bustling marketplace that is a cornerstone of economics. It defines whether companies are successful or in spiral. A thorough understanding of it is important - many companies have whole divisions dedicated to analysis of both their stock and of rivaling companies. Linking the world of finance and artificial intelligence (AI), especially the stock market, has been a relatively recent development. Predicting how stocks will do considering all external factors and previous data has always been a human task. With the help of AI, however, machine learning models can help us make more complete predictions in financial trends. Taking a look at the stock market specifically, predicting the open, closing, high, and low prices for the next day is very hard to do. Machine learning makes this task a lot easier. A model that builds upon itself that takes in external factors as weights can predict trends far into the future. When used effectively, new doors can be opened up in the business and finance world, and companies can make better and more complete decisions. This paper explores the various techniques used in the prediction of stock prices, from traditional statistical methods to deep learning and neural networks based approaches, among other methods. It provides a detailed analysis of the techniques and also explores the challenges in predictive analysis. For the accuracy of the testing set, taking a look at four different models - linear regression, neural network, decision tree, and naïve Bayes - on the different stocks, Apple, Google, Tesla, Amazon, United Healthcare, Exxon Mobil, J.P. Morgan & Chase, and Johnson & Johnson, the naïve Bayes model and linear regression models worked best. For the testing set, the naïve Bayes model had the highest accuracy along with the linear regression model, followed by the neural network model and then the decision tree model. 
The training set had similar results, except that the decision tree model achieved perfect accuracy in its predictions, which is expected. This suggests that the decision tree model likely overfitted the training set, which would explain its weaker performance on the testing set.
Keywords: machine learning, testing set, artificial intelligence, stock analysis
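Perfect training accuracy alongside weaker testing accuracy can be illustrated with a deliberately simple stand-in. The lookup-table model below memorises its training rows, much as an unpruned decision tree can, and falls back to the majority label on unseen rows. It is not a decision tree, and the toy features and labels are hypothetical, not stock data from the paper.

```python
from collections import Counter


class Memorizer:
    """Stand-in for a fully grown tree: memorises every training example."""

    def fit(self, X, y):
        self.table = {tuple(x): label for x, label in zip(X, y)}
        # Fallback for unseen inputs: the most frequent training label.
        self.default = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.table.get(tuple(x), self.default) for x in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data, chosen only to show the train/test gap.
X_train = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
y_train = ["down", "up", "up", "down", "down"]
X_test = [(2, 0), (0, 2), (0, 1)]
y_test = ["up", "up", "up"]

model = Memorizer().fit(X_train, y_train)
train_acc = accuracy(y_train, model.predict(X_train))
test_acc = accuracy(y_test, model.predict(X_test))
```

A large gap between `train_acc` and `test_acc` is the usual signal of overfitting, which is why perfect training accuracy on its own says little about predictive quality.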
Procedia PDF Downloads 95