Search results for: classification methods
15695 Food Insecurity Assessment, Consumption Pattern and Implications of Integrated Food Security Phase Classification: Evidence from Sudan
Authors: Ahmed A. A. Fadol, Guangji Tong, Wlaa Mohamed
Abstract:
This paper provides a comprehensive analysis of food insecurity in Sudan, focusing on consumption patterns and their implications, employing the Integrated Food Security Phase Classification (IPC) assessment framework. Years of conflict and economic instability have driven large segments of the population in Sudan into crisis levels of acute food insecurity according to the IPC. A substantial number of people are estimated to currently face emergency conditions, with an additional sizeable portion categorized under less severe but still extreme hunger levels. In this study, we explore the multifaceted nature of food insecurity in Sudan, considering its historical, political, economic, and social dimensions. An analysis of consumption patterns and trends was conducted, taking into account cultural influences, dietary shifts, and demographic changes. Furthermore, we employ logistic regression and random forest analysis to identify significant independent variables influencing food security status in Sudan. Random forest clearly outperforms logistic regression in terms of area under the curve (AUC), accuracy, precision, and recall. Forward projections of the IPC for Sudan estimate that 15 million individuals are anticipated to face Crisis level (IPC Phase 3) or worse acute food insecurity conditions between October 2023 and February 2024. Of these, 60% are concentrated in Greater Darfur, Greater Kordofan, and Khartoum State, with Greater Darfur alone representing 29% of this total. These findings emphasize the urgent need for both short-term humanitarian aid and long-term strategies to address Sudan's deepening food insecurity crisis.
Keywords: food insecurity, consumption patterns, logistic regression, random forest analysis
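The model comparison described above can be sketched as follows; the synthetic data and features are illustrative stand-ins for the paper's household survey variables, not the actual dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for household survey data: features might correspond
# to income, household size, region, etc. (hypothetical, for illustration).
X, y = make_classification(n_samples=1000, n_features=8, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
auc = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # AUC on the held-out set, one of the metrics used in the paper
    auc[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
```

Accuracy, precision, and recall would be computed the same way from `model.predict(X_te)`.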
Procedia PDF Downloads 72

15694 TessPy – Spatial Tessellation Made Easy
Authors: Jonas Hamann, Siavash Saki, Tobias Hagen
Abstract:
Discretization of urban areas is a crucial aspect of many spatial analyses. The process of discretizing space into subspaces without overlaps and gaps is called tessellation. It helps in understanding space and provides a framework for analyzing geospatial data. Tessellation methods can be divided into two groups: regular tessellations and irregular tessellations. While regular tessellation methods, like square grids or hexagon grids, are suitable for addressing pure geometry problems, they cannot take the unique characteristics of different subareas into account. Irregular tessellation methods, by contrast, allow the borders between subareas to be defined more realistically based on urban features like a road network or Points of Interest (POI). Even though Python is one of the most used programming languages for spatial analysis, there is currently no library that combines different tessellation methods to enable users and researchers to compare different techniques. To close this gap, we propose TessPy, an open-source Python package which combines all the above-mentioned tessellation methods and makes them easily accessible to everyone. The core functions of TessPy represent five different tessellation methods: squares, hexagons, adaptive squares, Voronoi polygons, and city blocks. With the regular methods, users can set the resolution of the tessellation, which defines the fineness of the discretization and the desired number of tiles. Irregular tessellation methods allow users to define which spatial data to consider (e.g., amenity, building, office) and how fine the tessellation should be. The spatial data used are open-source and provided by OpenStreetMap. These data can be easily extracted and used for further analyses. Besides the methodology of the different techniques, the state of the art, including examples and future work, is discussed.
All dependencies can be installed using conda or pip; the former is recommended.
Keywords: geospatial data science, geospatial data analysis, tessellations, urban studies
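As a minimal illustration of the regular-tessellation idea (not TessPy's actual API, whose functions operate on real city geometries), a square grid over a bounding box can be computed like this:

```python
def square_grid(min_x, min_y, max_x, max_y, resolution):
    """Split a bounding box into a regular grid of square tiles.

    Returns a list of (x0, y0, x1, y1) tile bounds. `resolution` is the
    number of tiles along each axis, so the result has resolution**2 tiles
    with no gaps or overlaps -- the defining property of a tessellation.
    """
    step_x = (max_x - min_x) / resolution
    step_y = (max_y - min_y) / resolution
    tiles = []
    for i in range(resolution):
        for j in range(resolution):
            tiles.append((min_x + i * step_x, min_y + j * step_y,
                          min_x + (i + 1) * step_x, min_y + (j + 1) * step_y))
    return tiles

# Unit square discretized at resolution 4 -> 16 tiles
tiles = square_grid(0.0, 0.0, 1.0, 1.0, 4)
```

Irregular methods (Voronoi polygons, city blocks) replace the fixed step with boundaries derived from OpenStreetMap features.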
Procedia PDF Downloads 126

15693 Comparative Study of Pixel and Object-Based Image Classification Techniques for Extraction of Land Use/Land Cover Information
Authors: Mahesh Kumar Jat, Manisha Choudhary
Abstract:
Rapid population and economic growth have resulted in large-scale land use/land cover (LULC) changes. Changes in the biophysical properties of the Earth's surface and their impact on climate are of primary concern nowadays. Different approaches, ranging from location-based relationships to modelling of earth surface-atmosphere interaction through techniques like the surface energy balance (SEB), have been used in the recent past to examine the relationship between changes in Earth surface land cover and climatic characteristics like temperature and precipitation. A remote sensing-based model, the Surface Energy Balance Algorithm for Land (SEBAL), has been used to estimate the surface heat fluxes over the Mahi Bajaj Sagar catchment (India) from 2001 to 2020. Landsat ETM and OLI satellite data are used to model the SEB of the area. Changes in observed precipitation and temperature, obtained from the India Meteorological Department (IMD), have been correlated with changes in surface heat fluxes to understand the relative contribution of LULC change to changes in these climatic variables. Results indicate a noticeable impact of LULC changes on climatic variables, aligned with respective changes in SEB components. Results suggest that precipitation increases at a rate of 20 mm/year. The maximum temperature decreases at 0.007 °C/year, while the minimum temperature increases at 0.02 °C/year. The average temperature increases at 0.009 °C/year. Changes in latent heat flux and sensible heat flux correlate positively with precipitation and temperature, respectively. Variation in surface heat fluxes influences the climate parameters and is thus a relevant factor in climate change. SEB modelling is therefore helpful for understanding LULC change and its impact on climate.
Keywords: remote sensing, GIS, object based, classification
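SEBAL closes the surface energy balance Rn = G + H + LE and obtains the latent heat flux LE as the residual of the other terms; a minimal sketch (the example flux values are hypothetical):

```python
def latent_heat_flux(net_radiation, soil_heat_flux, sensible_heat_flux):
    """Latent heat flux as the residual of the surface energy balance,
    LE = Rn - G - H, with all terms in W/m^2 (the core closure used in
    SEBAL-type models; the per-pixel estimation of Rn, G and H from
    satellite imagery is omitted here)."""
    return net_radiation - soil_heat_flux - sensible_heat_flux

# Hypothetical midday fluxes for one pixel: Rn = 600, G = 100, H = 150 W/m^2
le = latent_heat_flux(600.0, 100.0, 150.0)
```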
Procedia PDF Downloads 128

15692 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data
Authors: K. Sathishkumar, V. Thiagarasu
Abstract:
Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low cost. Many scientists around the world take advantage of this gene profiling to characterize complex biological circumstances and diseases. Microarray techniques used in genome-wide gene expression and genome mutation analysis help scientists and physicians in understanding pathophysiological mechanisms, in making diagnoses and prognoses, and in choosing treatment plans. DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deal with gene expression data effectively. Existing algorithms such as Support Vector Machines (SVM), the K-means algorithm, and evolutionary algorithms are analyzed thoroughly to identify their advantages and limitations. The performance evaluation of the existing algorithms is carried out to determine the best approach.
In order to improve the classification performance of the best approach in terms of accuracy, convergence behavior, and processing time, a hybrid clustering-based optimization approach has been proposed.
Keywords: microarray technology, gene expression data, clustering, gene selection
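A minimal sketch of the clustering step on a synthetic expression matrix (the data below are an illustrative stand-in, not the paper's microarray set):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic expression matrix: 100 genes x 20 samples, with two gene
# groups that have clearly distinct expression levels.
group_a = rng.normal(0.0, 0.5, size=(50, 20))
group_b = rng.normal(3.0, 0.5, size=(50, 20))
expression = np.vstack([group_a, group_b])

# K-means groups genes with similar expression profiles across samples;
# the cluster labels can then drive gene selection.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(expression)
labels = kmeans.labels_
```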
Procedia PDF Downloads 323

15691 Weighted Risk Scores Method Proposal for Occupational Safety Risk Assessment
Authors: Ulas Cinar, Omer Faruk Ugurlu, Selcuk Cebi
Abstract:
Occupational safety risk management is the most important element of a safe working environment. Effective risk management is only possible with accurate analysis and evaluation. Scoring-based risk assessment methods offer considerable ease of application as they convert linguistic expressions into numerical results, and they can be easily adapted to any field. Despite all these advantages, important problems with scoring-based methods are frequently discussed, and effective measurability is one of the most critical. Existing methods allow experts to choose a score equivalent for each parameter; therefore, experts prefer the score of the most likely outcome for a risk, and all other possible consequences are neglected. Assessments made with the existing methods express the most probable level of risk, not the real risk of the enterprise. In this study, we aim to develop a method that presents a more comprehensive evaluation than the existing methods by evaluating the probability and severity scores, all sub-parameters, and all potential outcomes, and a new scoring-based method is proposed.
Keywords: occupational health and safety, risk assessment, scoring-based risk assessment method, underground mining, weighted risk scores
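The difference between scoring only the most likely outcome and weighting every potential outcome can be sketched as follows (the probability and severity values are hypothetical, not the paper's scoring tables):

```python
def most_likely_risk(outcomes):
    """Conventional scoring: only the single most probable outcome of a
    hazard contributes to the risk score (probability * severity)."""
    p, severity = max(outcomes, key=lambda o: o[0])
    return p * severity

def weighted_risk(outcomes):
    """Weighted scoring: every potential outcome contributes, weighted by
    its probability -- the idea behind the proposed method, which avoids
    neglecting rare but severe consequences."""
    return sum(p * severity for p, severity in outcomes)

# (probability, severity score) for each potential consequence of one hazard
outcomes = [(0.6, 2), (0.3, 5), (0.1, 9)]
```

Here the most-likely score understates the hazard because the rare severity-9 outcome is ignored; the weighted score accounts for it.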
Procedia PDF Downloads 133

15690 Recommendations for Teaching Word Formation for Students of Linguistics Using Computer Terminology as an Example
Authors: Svetlana Kostrubina, Anastasia Prokopeva
Abstract:
This research presents a comprehensive study of word formation processes in computer terminology in the English and Russian languages and provides learners with a system of exercises for training these skills. Its originality lies in the comparative approach, which shows both general patterns and specific features of English and Russian computer term formation. The key point is the development of a system of exercises for training computer terminology based on Bloom's taxonomy. The data contain 486 units (228 English terms from the Glossary of Computer Terms and 258 Russian terms from the Terminological Dictionary-Reference Book). The objective is to identify the main affixation models in English and Russian computer term formation and to develop exercises. To achieve this goal, the authors employed Bloom's taxonomy as a methodological framework to create a systematic exercise program aimed at enhancing students' cognitive skills in analyzing, applying, and evaluating computer terms. The exercises are appropriate for various levels of learning, from basic recall of definitions to higher-order thinking skills, such as synthesizing new terms and critically assessing their usage in different contexts. The methodology also includes: a method of scientific and theoretical analysis for systematization of linguistic concepts and clarification of the conceptual and terminological apparatus; a method of nominative and derivative analysis for identifying word-formation types; a method of word-formation analysis for organizing linguistic units; a classification method for determining structural types of abbreviations applicable to the field of computer communication; a quantitative analysis technique for determining the productivity of methods for forming abbreviations of computer vocabulary based on the English and Russian computer terms; a technique of tabular data processing for a visual presentation of the results obtained; and a technique of interlingual comparison for identifying common and different features of abbreviations of computer terms in the Russian and English languages. The research shows that affixation retains its productivity in English and Russian computer term formation. Bloom's taxonomy allows us to plan a training program and predict the effectiveness of the compiled program based on the assessment of the teaching methods used.
Keywords: word formation, affixation, computer terms, Bloom's taxonomy
Procedia PDF Downloads 9

15689 Application of Extraction Chromatography to the Separation of Sc, Zr and Sn Isotopes from Target Materials
Authors: Steffen Happel
Abstract:
Non-standard isotopes such as Sc-44/47, Zr-89, and Sn-117m are finding increasing interest in radiopharmaceutical applications. Methods for the separation of these elements from typical target materials were developed. The methods used in this paper are based on extraction chromatographic resins such as UTEVA, TBP, and DGA resin. Information on the selectivity of the resins (Dw values of selected elements in HCl and HNO3 of varying concentration) will be presented, as well as results of the method development such as elution studies, chemical recoveries, and decontamination factors. The developed methods rely on vacuum-supported separation, allowing for fast and selective processing.
Keywords: elution, extraction chromatography, radiopharmacy, decontamination factors
Procedia PDF Downloads 467

15688 Analysis of the Aquifer Vulnerability of a Mio-Pliocene Arid Area Using DRASTIC and SI Models
Abstract:
Many groundwater vulnerability assessment methods have been developed around the world (methods like PRAST, DRIST, APRON/ARAA, PRASTCHIM, GOD). In this study, our choice fell on two recent complementary methods based on category index mapping with weighted criteria (point count system models, PCSM): the standard DRASTIC method and SI (Susceptibility Index). At present, these two methods are the most used for mapping the intrinsic vulnerability of groundwater. Two classes of groundwater vulnerability in the Biskra sandy aquifer were identified by the DRASTIC method (average and high) and the SI method (very high and high). Integrated analysis revealed that the high class is predominant for the DRASTIC method, whereas for SI the very high class is preponderant. Furthermore, we notice that the SI method better estimates vulnerability to nitrate pollution, with an 85% agreement between groundwater nitrate concentrations and the established vulnerability classes, against 75% for the DRASTIC method. By including the land use parameter, the SI method produced more realistic results.
Keywords: DRASTIC, SI, GIS, Biskra sandy aquifer, Algeria
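Both DRASTIC and SI compute a vulnerability index as a weighted sum of rated parameters; a minimal sketch of the DRASTIC index with its standard weights (the per-cell ratings below are made-up example values):

```python
# Standard DRASTIC weights for the seven parameters: Depth to water,
# net Recharge, Aquifer media, Soil media, Topography, Impact of the
# vadose zone, and hydraulic Conductivity.
DRASTIC_WEIGHTS = {"D": 5, "R": 4, "A": 3, "S": 2, "T": 1, "I": 5, "C": 3}

def drastic_index(ratings):
    """Vulnerability index for one grid cell: sum of weight * rating
    over the seven DRASTIC parameters (ratings typically 1-10)."""
    return sum(DRASTIC_WEIGHTS[k] * ratings[k] for k in DRASTIC_WEIGHTS)

# Hypothetical ratings for one cell of the study grid
ratings = {"D": 9, "R": 6, "A": 8, "S": 6, "T": 10, "I": 8, "C": 4}
index = drastic_index(ratings)
```

The SI method works the same way but with a different parameter set that adds land use, which is why it can track nitrate pollution more closely.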
Procedia PDF Downloads 483

15687 Detecting Indigenous Languages: A System for Maya Text Profiling and Machine Learning Classification Techniques
Authors: Alejandro Molina-Villegas, Silvia Fernández-Sabido, Eduardo Mendoza-Vargas, Fátima Miranda-Pestaña
Abstract:
The automatic detection of indigenous languages in digital texts is essential to promote their inclusion in digital media. Underrepresented languages, such as Maya, are often excluded from language detection tools like Google’s language-detection library, LANGDETECT. This study addresses these limitations by developing a hybrid language detection solution that accurately distinguishes Maya (YUA) from Spanish (ES). Two strategies are employed: the first focuses on creating a profile for the Maya language within the LANGDETECT library, while the second involves training a Naive Bayes classification model with two categories, YUA and ES. The process includes comprehensive data preprocessing steps, such as cleaning, normalization, tokenization, and n-gram counting, applied to text samples collected from various sources, including articles from La Jornada Maya, a major newspaper in Mexico and the only media outlet that includes a Maya section. After the training phase, a portion of the data is used to create the YUA profile within LANGDETECT, which achieves an accuracy rate above 95% in identifying the Maya language during testing. Additionally, the Naive Bayes classifier, trained and tested on the same database, achieves an accuracy close to 98% in distinguishing between Maya and Spanish, with further validation through F1 score, recall, and logarithmic scoring, without signs of overfitting. This strategy, which combines the LANGDETECT profile with a Naive Bayes model, highlights an adaptable framework that can be extended to other underrepresented languages in future research. This fills a gap in Natural Language Processing and supports the preservation and revitalization of these languages.
Keywords: indigenous languages, language detection, Maya language, Naive Bayes classifier, natural language processing, low-resource languages
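A minimal sketch of the Naive Bayes strategy on character n-grams (the toy phrases below are illustrative examples, not the paper's La Jornada Maya corpus):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny toy corpus: a few illustrative phrases per language.
texts = [
    "ba'ax ka wa'alik",                # YUA
    "ma'alob k'iin ti' teech",         # YUA
    "in k'aaba'e Jorge",               # YUA
    "buenos dias a todos",             # ES
    "el periodico publica noticias",   # ES
    "la lengua maya es importante",    # ES
]
labels = ["YUA", "YUA", "YUA", "ES", "ES", "ES"]

# Character n-grams capture each language's orthographic signature
# (e.g., the glottal-stop apostrophe is frequent in written Maya).
clf = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(1, 3)),
    MultinomialNB(),
)
clf.fit(texts, labels)
```

A real system would train on thousands of sentences and report accuracy, F1, recall, and log score on held-out data, as the paper does.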
Procedia PDF Downloads 15

15686 Off-Line Detection of "Pannon Wheat" Milling Fractions by Near-Infrared Spectroscopic Methods
Authors: E. Izsó, M. Bartalné-Berceli, Sz. Gergely, A. Salgó
Abstract:
The aim of this investigation is to elaborate near-infrared methods for testing and recognition of chemical components and quality in "Pannon wheat" allied (i.e. true to variety or variety identified) milling fractions, as well as to develop spectroscopic methods for following the milling processes and to evaluate the stability of the milling technology across different types of milling products and sampling times, respectively. These wheat categories were produced under industrial conditions, with samples collected as a function of sampling time and maximum or minimum yields. The changes in the main chemical components (such as starch, protein, lipid) and physical properties of the fractions (particle size) were analysed by dispersive spectrophotometers using the visible (VIS) and near-infrared (NIR) regions of the electromagnetic spectrum. Close correlations were obtained between the data of the spectroscopic measurement techniques, processed by various chemometric methods (e.g. principal component analysis (PCA), cluster analysis (CA)), and the operating conditions of the milling technology. It is obvious that NIR methods are able to detect deviations of the yield parameters and differences between the sampling times for a wide variety of fractions. NIR technology can be used in the sensitive monitoring of milling technology.
Keywords: near infrared spectroscopy, wheat categories, milling process, monitoring
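The chemometric step can be sketched with PCA on synthetic NIR-like spectra (the spectra below are simulated stand-ins, not the paper's measurements):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Synthetic spectra: 30 milling-fraction samples x 100 wavelengths.
# Two fraction types differ by a broad absorption band (illustrative only).
base = np.sin(np.linspace(0, 3, 100))
band = 0.5 * np.exp(-((np.arange(100) - 60) ** 2) / 50)
fine = base + rng.normal(0, 0.02, size=(15, 100))
coarse = base + band + rng.normal(0, 0.02, size=(15, 100))
spectra = np.vstack([fine, coarse])

# PCA compresses the spectra to a few components; fractions with
# different composition separate in the score space.
pca = PCA(n_components=2)
scores = pca.fit_transform(spectra)
```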
Procedia PDF Downloads 405

15685 Damage Assessment and Repair for Older Brick Buildings
Authors: Tim D. Sass
Abstract:
The experience of engineers and architects practicing today is typically limited to current building code requirements and modern construction methods and materials. However, many cities have a mix of new and old buildings, with many buildings constructed over one hundred years ago when building codes and construction methods were much different. When a brick building sustains damage, a structural engineer is often hired to determine the cause of damage as well as the necessary repairs. Forensic studies of dozens of brick buildings show that an appreciation of historical building methods and materials is needed to correctly identify the cause of damage and design an appropriate repair. Damage on an older brick building can be mistakenly attributed to storms or seismic events when the real source of the damage is deficient original construction. Assessing and remediating damaged brickwork on older brick buildings requires an understanding of the original construction, of older repair methods, and of current building code requirements.
Keywords: brick, damage, deterioration, facade
Procedia PDF Downloads 225

15684 HIV and AIDS in Kosovo: Stigma Persists!
Authors: Luljeta Gashi, Naser Ramadani, Zana Deva, Dafina Gexha-Bunjaku
Abstract:
The official HIV/AIDS data in Kosovo are based on HIV case reporting from health-care services, the blood transfusion system, and Voluntary Counselling and Testing centres. Between 1986 and 2014, 95 HIV and AIDS cases were reported, of which 49 were AIDS and 46 HIV, with 40 deaths. The majority (69%) of cases were men, the most affected age group was 25 to 34 (37%), and the routes of transmission were: heterosexual (90%), MSM (7%), vertical transmission (2%), and IDU (1%). Based on existing data and the UNAIDS classification system, Kosovo is currently still categorised as having a low-level HIV epidemic. Even with a low HIV prevalence, Kosovo faces a number of threatening factors, including an increased number of drug users, a stigmatized and discriminated-against MSM community, and a high percentage of youth in the general population (57% of the population under the age of 25), with changing social norms, especially sexual ones. Methods: Data collection was done using self-administered structured questionnaires amongst 249 high school students. Data were analysed using the Statistical Package for the Social Sciences (SPSS). Results: The findings revealed that 68% of students know that HIV transmission can be reduced by having sex with only one uninfected partner who has no other partners, 94% know that the risk of getting HIV can be reduced by using a condom every time they have sex, 68% know that a person cannot get HIV from mosquito bites, 81% know that they cannot get HIV by sharing food with someone who is infected, and 46% know that a healthy-looking person can have HIV. Conclusions: Seventy-one percent of high school students correctly identify ways of preventing the sexual transmission of HIV and reject the major misconceptions about HIV transmission. The findings of the study indicate a need for more health education and promotion.
Keywords: Kosovo, KPAR, HIV, high school
Procedia PDF Downloads 476

15683 Selection of New Business in Brazilian Companies Incubators through Hierarchical Methodology
Authors: Izabel Cristina Zattar, Gilberto Passos Lima, Guilherme Schünemann de Oliveira
Abstract:
In Brazil, there are several institutions committed to the development of new businesses based on product innovation. Among them are business incubators, universities, and science institutes. Business incubators can be defined as nurseries for new companies, which may be in the technology segment, as discussed in this article. Business incubators provide services related to infrastructure, such as physical space and meeting rooms. Besides these services, incubators also offer assistance in the form of information and communication, access to finance, relationship networks, and business monitoring and mentoring processes. Business incubators do not support all technology companies: one of the business incubators' tasks is to assess the nature and feasibility of new business proposals. To support this goal, this paper proposes a methodology for evaluating new businesses using the Analytic Hierarchy Process (AHP). This paper presents the concepts used in the assessment methodology for new businesses, concepts that have been tested with positive results in practice. This study comprises three main steps. First, a hierarchy was built, based on the new business manuals used by the business incubators; these books and manuals list business selection requirements, such as innovation status and other technological aspects. Then, a questionnaire was generated in order to guide incubator experts in the pairwise comparisons at all hierarchy levels; the weights of each requirement are calculated from the questionnaire responses. Finally, the proposed method was applied to evaluate five new business proposals, which were applying to be part of a company incubator. The main result is the classification of these new businesses, which helped the incubator experts decide which companies were most eligible to work with. This classification may also be helpful to the decision-making process of business incubators in future selection processes.
Keywords: Analytic Hierarchy Process (AHP), Brazilian companies incubators, technology companies, incubator
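The weight-derivation step of AHP can be sketched with the row geometric mean approximation (the comparison matrix and criteria below are hypothetical):

```python
import numpy as np

def ahp_priorities(pairwise):
    """Approximate AHP priority vector via the row geometric mean,
    normalized to sum to 1. `pairwise` is a reciprocal comparison matrix
    where entry (i, j) says how much more important criterion i is than
    criterion j on Saaty's 1-9 scale."""
    matrix = np.asarray(pairwise, dtype=float)
    gm = np.prod(matrix, axis=1) ** (1.0 / matrix.shape[0])
    return gm / gm.sum()

# Hypothetical comparison of three selection criteria for a new business:
# innovation status, market potential, team.
pairwise = [[1,   3,   5],
            [1/3, 1,   3],
            [1/5, 1/3, 1]]
weights = ahp_priorities(pairwise)
```

Each business proposal is then scored against the criteria and ranked by the weighted sum; a full implementation would also check the consistency ratio of the matrix.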
Procedia PDF Downloads 373

15682 Comparative Evaluation of EBT3 Film Dosimetry Using Flatbed Scanner, Densitometer and Spectrophotometer Methods and Its Applications in Radiotherapy
Authors: K. Khaerunnisa, D. Ryangga, S. A. Pawiro
Abstract:
Over the past few decades, film dosimetry has become a tool used in various radiotherapy modalities, either for clinical quality assurance (QA) or dose verification. The response of the film to irradiation is usually expressed in optical density (OD) or net optical density (netOD). Since the film's response to radiation is not linear, the use of film as a dosimeter requires a calibration process. This study aimed to compare the calibration curves obtained with three measurement methods: a flatbed scanner, a point densitometer, and a spectrophotometer. For every response function, a radiochromic film calibration curve is generated from each method by performing accuracy, precision, and sensitivity analysis. netOD is obtained by measuring the change in optical density (OD) of the film before and after irradiation: with the film scanner, ImageJ is used to extract the pixel values of the film on the red channel of the three RGB channels; with the point densitometer, the change in OD before and after irradiation is measured directly; and with the spectrophotometer, the change in absorbance before and after irradiation is measured. The results showed that the three calibration methods gave netOD readings with a dose precision below 3% at the 1σ (one sigma) uncertainty level. The sensitivity of all three methods follows the same trend in responding to film readings against radiation, but with different magnitudes. The accuracy of the three methods is below 3% for doses above 100 cGy and 200 cGy, but for doses below 100 cGy it was found to be above 3% when using the point densitometer and the spectrophotometer. When all three methods are used for clinical implementation, the results show accuracy and precision below 2% for the scanner and the spectrophotometer, and above 3% for the point densitometer.
Keywords: calibration methods, EBT3 film dosimetry, flatbed scanner, densitometer, spectrophotometer
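The netOD computation described above reduces to a log-ratio of film readings before and after irradiation; a minimal sketch for the scanner workflow (the pixel values are made-up examples, and the background term is optional):

```python
import math

def net_optical_density(pv_before, pv_after, pv_background=0.0):
    """netOD from mean red-channel pixel values of the film scanned
    before and after irradiation:
        netOD = log10((PV_before - PV_bkg) / (PV_after - PV_bkg))
    Higher dose darkens the film, lowering PV_after and raising netOD."""
    return math.log10((pv_before - pv_background) / (pv_after - pv_background))

# Hypothetical 16-bit red-channel means extracted with ImageJ
netod = net_optical_density(pv_before=40000.0, pv_after=25000.0)
```

A calibration curve is then fitted between netOD and delivered dose over the dose range of interest.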
Procedia PDF Downloads 132

15681 Evaluation of Gesture-Based Password: User Behavioral Features Using Machine Learning Algorithms
Authors: Lakshmidevi Sreeramareddy, Komalpreet Kaur, Nane Pothier
Abstract:
Graphical-based passwords have existed for decades. Their major advantage is that they are easier to remember than an alphanumeric password. However, their disadvantage (especially for recognition-based passwords) is the smaller password space, making them more vulnerable to brute-force attacks. Graphical passwords are also highly susceptible to shoulder-surfing. The gesture-based password method that we developed is a grid-free, template-free method. In this study, we evaluated gesture-based passwords for usability and vulnerability. The results of the study are significant. We developed a gesture-based password application for data collection. Two modes of data collection were used: creation mode and replication mode. In creation mode (Session 1), users were asked to create six different passwords and re-enter each password five times. In replication mode, users saw a password image created by some other user for a fixed duration of time. Three different duration timers, of 5 seconds (Session 2), 10 seconds (Session 3), and 15 seconds (Session 4), were used to mimic the shoulder-surfing attack. After the timer expired, the password image was removed, and users were asked to replicate the password. There were 74, 57, 50, and 44 users who participated in Sessions 1, 2, 3, and 4, respectively. In this study, machine learning algorithms were applied to determine whether the person is a genuine user or an imposter based on the password entered. Five different machine learning algorithms were deployed to compare their performance in user authentication: Decision Trees, Linear Discriminant Analysis, Naive Bayes Classifier, Support Vector Machines (SVMs) with a Gaussian radial basis kernel function, and K-Nearest Neighbors. Gesture-based password features vary from one entry to the next, so it is difficult to distinguish between a creator and an intruder for authentication.
For each password entered by the user, four features were extracted: password score, password length, password speed, and password size. All four features were normalized before being fed to a classifier. Three different classifiers were trained using data from all four sessions. Classifiers A, B, and C were trained and tested using data from the password creation session and the password replication sessions with timers of 5 seconds, 10 seconds, and 15 seconds, respectively. The classification accuracies for Classifier A using the five ML algorithms are 72.5%, 71.3%, 71.9%, 74.4%, and 72.9%, respectively; for Classifier B, 69.7%, 67.9%, 70.2%, 73.8%, and 71.2%; and for Classifier C, 68.1%, 64.9%, 68.4%, 71.5%, and 69.8%. SVMs with a Gaussian radial basis kernel outperform the other ML algorithms for gesture-based password authentication. The results confirm that the shorter the duration of the shoulder-surfing attack, the higher the authentication accuracy. In conclusion, behavioral features extracted from gesture-based passwords lead to less vulnerable user authentication.
Keywords: authentication, gesture-based passwords, machine learning algorithms, shoulder-surfing attacks, usability
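The best-performing pipeline described above (normalization followed by an RBF-kernel SVM on the four behavioral features) can be sketched as follows; the simulated feature distributions are made up for illustration, not the study's data:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for the four extracted features per password entry:
# [score, length, speed, size]. Genuine users (label 1) are simulated as
# faster and more consistent than imposters (label 0).
genuine = rng.normal([0.8, 12, 2.0, 0.5], [0.05, 1, 0.2, 0.05], size=(200, 4))
imposter = rng.normal([0.6, 12, 3.5, 0.7], [0.15, 3, 0.8, 0.15], size=(200, 4))
X = np.vstack([genuine, imposter])
y = np.array([1] * 200 + [0] * 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
# Features are normalized before classification, as in the study.
clf = make_pipeline(MinMaxScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```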
Procedia PDF Downloads 102

15680 Breast Cancer Early Recognition, New Methods of Screening, and Analysis
Authors: Sahar Heidary
Abstract:
Breast cancer is a major public health problem globally. It is also the second leading cause of cancer death among women. Understanding breast cancer treatment options can help physicians plan care for their patients throughout cancer treatment. This article reviews standard management based on stage, histology, and biomarkers. The development of breast cancer is a multi-stage process involving numerous cell types, and its prevention remains challenging worldwide. Early detection of breast cancer is one of the best ways to combat the disease. All major medical organizations recommend screening mammography for women aged 40 years and older. Breast cancer metastasis accounts for the majority of deaths from breast cancer. Detecting metastasis at an early stage is essential for managing and predicting breast cancer progression. Emerging methods based on the analysis of circulating tumor cells show promising results in predicting and classifying the early stages of breast cancer metastasis in patients. In the general population, mammography remains the key screening tool, while clinical breast examination and self-examination are less effective. Novel screening methods are unlikely to replace mammography in the near future for general population screening.
Keywords: breast cancer, screening, metastasis, methods
Procedia PDF Downloads 166

15679 Stock Prediction and Portfolio Optimization Thesis
Authors: Deniz Peksen
Abstract:
This thesis aims to predict the trend movement of stock closing prices and to maximize portfolio returns by utilizing the predictions. In this context, the study aims to define a stock portfolio strategy from models created using Logistic Regression, Gradient Boosting, and Random Forest. Recently, predicting the trend of stock prices has gained a significant role in making buy and sell decisions and generating returns with investment strategies formed by machine-learning-based decisions. There are plenty of studies in the literature on the prediction of stock prices in capital markets using machine learning methods, but most of them focus on closing prices instead of the direction of the price trend. Our study differs from the literature in terms of target definition: ours is a classification problem focusing on the market trend over the next 20 trading days. To predict the trend direction, fourteen years of data were used for training, the following three years for validation, and the last three years for testing. Training data are between 2002-06-18 and 2016-12-30; validation data are between 2017-01-02 and 2019-12-31; testing data are between 2020-01-02 and 2022-03-17. We determine the Hold Stock Portfolio, the Best Stock Portfolio, and the USD-TRY exchange rate as benchmarks which we should outperform. We compared our machine-learning-based portfolio return on test data with the returns of these benchmarks. We assessed our model performance with the help of the ROC-AUC score and lift charts. We use Logistic Regression, Gradient Boosting, and Random Forest with a grid search approach to fine-tune hyper-parameters. As a result of the empirical study, the existence of an uptrend or downtrend of five stocks could not be predicted by the models. When we use these predictions to define buy and sell decisions in order to generate a model-based portfolio, the model-based portfolio fails on the test dataset.
It was found that model-based buy and sell decisions generated a stock portfolio strategy whose returns could not outperform the non-model portfolio strategies on the test dataset. We found that any effort to predict a trend formulated on stock prices is a challenge. Our results match what the Random Walk Theory claims: that stock prices and price changes are unpredictable. Our model iterations failed on the test dataset; although we built several good models on the validation dataset, they did not carry over. We implemented Random Forest, Gradient Boosting and Logistic Regression, and discovered that the complex models provided no advantage or additional performance over Logistic Regression. More complexity did not lead to better performance, and using a complex model is not the answer to the stock prediction problem. Our approach was to predict the trend instead of the price, which converted the problem into classification. However, this labeling approach neither solves the stock prediction problem nor refutes the Random Walk Theory for stock prices.
Keywords: stock prediction, portfolio optimization, data science, machine learning
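The 20-day trend target described above can be sketched as a labeling step. The synthetic price series and the threshold (strictly higher close 20 days later counts as an uptrend) are illustrative assumptions, not the thesis's exact rules:

```python
# Sketch of the trend-labeling scheme: each day is labeled by whether the
# closing price is higher 20 trading days later. Prices are synthetic; the
# 20-day horizon follows the target definition described in the abstract.

HORIZON = 20  # trading days ahead used to define the trend

def label_trends(closes, horizon=HORIZON):
    """Return 1 (uptrend) or 0 (downtrend/flat) for each day that has a
    price `horizon` days ahead; the last `horizon` days cannot be labeled."""
    return [1 if closes[t + horizon] > closes[t] else 0
            for t in range(len(closes) - horizon)]

# Synthetic closing prices: 30 rising days followed by 30 falling days.
closes = [100 + t for t in range(30)] + [130 - t for t in range(30)]
labels = label_trends(closes)

print(len(labels))   # 40 labeled days (60 - 20)
print(labels[:5])    # early days: price is higher 20 days later -> uptrend
print(labels[-5:])   # late days: price is lower 20 days later -> downtrend
```

The resulting labels, paired with per-day features, are what a classifier such as Logistic Regression or Random Forest would be trained on.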
Procedia PDF Downloads 80
15678 A Review of Travel Data Collection Methods
Authors: Muhammad Awais Shafique, Eiji Hato
Abstract:
Household trip data is of crucial importance for managing present transportation infrastructure as well as for planning and designing future facilities. It also provides the basis for new policies implemented under Transportation Demand Management. The methods used for household trip data collection have changed with the passage of time, starting with conventional face-to-face or paper-and-pencil interviews and reaching the recent approach of employing smartphones. This study summarizes the step-wise evolution of travel data collection methods and provides a comprehensive review of the topic for readers interested in the changing trends in the data collection field.
Keywords: computer, smartphone, telephone, travel survey
Procedia PDF Downloads 310
15677 DCDNet: Lightweight Document Corner Detection Network Based on Attention Mechanism
Authors: Kun Xu, Yuan Xu, Jia Qiao
Abstract:
Document detection plays an important role in optical character recognition and text analysis. Because traditional detection methods have weak generalization ability, and deep neural networks have complex structures with large numbers of parameters that cannot be readily deployed on mobile devices, this paper proposes a lightweight Document Corner Detection Network (DCDNet). DCDNet is a two-stage architecture. The first stage, with an Encoder-Decoder structure, adopts depthwise separable convolution to greatly reduce the network parameters. After introducing the Feature Attention Union (FAU) module, the second stage enhances the feature information along the spatial and channel dimensions and adaptively adjusts the size of the receptive field to strengthen the feature expression ability of the model. To address the large imbalance between corner and non-corner pixels, a Weighted Binary Cross-Entropy Loss (WBCE Loss) is proposed, framing corner detection as a classification problem and making the training process more efficient. To make up for the lack of document corner detection datasets, a dataset containing 6,620 images, named the Document Corner Detection Dataset (DCDD), was created. Experimental results show that the proposed method obtains fast, stable and accurate detection results on DCDD.
Keywords: document detection, corner detection, attention mechanism, lightweight
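A weighted binary cross-entropy of the general kind the WBCE Loss builds on can be sketched as follows. The inverse-class-frequency weighting and the toy pixel probabilities are illustrative assumptions, not the paper's exact formulation:

```python
import math

# Sketch of a weighted binary cross-entropy: corner pixels are far rarer
# than non-corner pixels, so the positive (corner) terms are up-weighted.
# The weight choice (negatives/positives ratio) is an assumption here.

def weighted_bce(y_true, y_pred, pos_weight):
    """Mean BCE with positive-example terms scaled by pos_weight."""
    eps = 1e-7
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip for numerical stability
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# One corner pixel among eight; weight positives by the class ratio 7:1.
y_true = [1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

plain = weighted_bce(y_true, y_pred, pos_weight=1.0)
weighted = weighted_bce(y_true, y_pred, pos_weight=7.0)
print(plain < weighted)  # up-weighting raises the penalty on the missed corner
```

The effect is that a model can no longer minimize the loss by always predicting "non-corner", which is the imbalance problem the abstract describes.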
Procedia PDF Downloads 352
15676 A Comprehensive Survey of Artificial Intelligence and Machine Learning Approaches across Distinct Phases of Wildland Fire Management
Authors: Ursula Das, Manavjit Singh Dhindsa, Kshirasagar Naik, Marzia Zaman, Richard Purcell, Srinivas Sampalli, Abdul Mutakabbir, Chung-Horng Lung, Thambirajah Ravichandran
Abstract:
Wildland fires, also known as forest fires or wildfires, have exhibited an alarming surge in frequency in recent times, deepening what is already a perennial global concern. Forest fires often lead to devastating consequences, ranging from loss of healthy forest foliage and wildlife to substantial economic losses and the tragic loss of human lives. Despite the existence of substantial literature on the detection of active forest fires, numerous potential research avenues in forest fire management, such as preventative measures and the ancillary effects of forest fires, remain largely underexplored. This paper undertakes a systematic review of these underexplored areas in forest fire research, meticulously categorizing them into distinct phases, namely the pre-fire, during-fire, and post-fire stages. The pre-fire phase encompasses the assessment of fire risk, analysis of fuel properties, and other activities aimed at preventing or reducing the risk of forest fires. The during-fire phase includes activities aimed at reducing the impact of active forest fires, such as the detection and localization of active fires, optimization of wildfire suppression methods, and prediction of the behavior of active fires. The post-fire phase involves analyzing the impact of forest fires on various aspects, such as the extent of damage in forest areas, post-fire regeneration of forests, impact on wildlife, economic losses, and health impacts from byproducts produced during burning. A comprehensive understanding of the three stages is imperative for effective forest fire management and mitigation of the impact of forest fires on both ecological systems and human well-being. Artificial intelligence and machine learning (AI/ML) methods have garnered much attention in the cyber-physical systems domain in recent times, leading to their adoption in decision-making in diverse applications, including disaster management.
This paper explores the current state of AI/ML applications for managing the activities in the aforementioned phases of forest fires. While conventional machine learning and deep learning methods have been extensively explored for the prevention, detection, and management of forest fires, a systematic classification of these methods into distinct AI research domains is conspicuously absent. This paper gives a comprehensive overview of the state of forest fire research across the more recent and prominent AI/ML disciplines, including big data, classical machine learning, computer vision, explainable AI, generative AI, natural language processing, optimization algorithms, and time series forecasting. By providing a detailed overview of the potential areas of research and identifying the diverse ways AI/ML can be employed in forest fire research, this paper aims to serve as a roadmap for future investigations in this domain.
Keywords: artificial intelligence, computer vision, deep learning, during-fire activities, forest fire management, machine learning, pre-fire activities, post-fire activities
Procedia PDF Downloads 70
15675 Physical and Chemical Alternative Methods of Fresh Produce Disinfection
Authors: Tuji Jemal Ahmed
Abstract:
Fresh produce is an essential component of a healthy diet. However, it can also be a potential source of pathogenic microorganisms that cause foodborne illnesses. Traditional disinfection methods, such as washing with water and chlorine, have limitations and may not effectively remove or inactivate all microorganisms. This has led to the development of alternative methods of fresh produce disinfection, both physical and chemical. In this paper, we explore these new physical and chemical disinfection methods, their advantages and disadvantages, and their suitability for different types of produce. Physical methods of disinfection, such as ultraviolet (UV) radiation and high-pressure processing (HPP), are crucial in ensuring the microbiological safety of fresh produce. UV radiation uses short-wavelength UV-C light to damage the DNA and RNA of microorganisms, and HPP applies high levels of pressure to fresh produce to reduce the microbial load. These physical methods are highly effective in killing a wide range of microorganisms, including bacteria, viruses, and fungi. However, they may not penetrate deep enough into the produce to kill all microorganisms and can alter its sensory characteristics. Chemical methods of disinfection, such as acidic electrolyzed water (AEW), ozone, and peroxyacetic acid (PAA), are also important in ensuring the microbiological safety of fresh produce. AEW uses a low concentration of hypochlorous acid and a high concentration of hydrogen ions to inactivate microorganisms, ozone uses ozone gas to damage the cell membranes and DNA of microorganisms, and PAA uses a combination of hydrogen peroxide and acetic acid to inactivate microorganisms.
These chemical methods are highly effective in killing a wide range of microorganisms, but they may cause discoloration or changes in the texture and flavor of some products and may require specialized equipment and trained personnel to produce and apply. In conclusion, the selection of the most suitable method of fresh produce disinfection should take into consideration the type of product, the level of microbial contamination, the effectiveness of the method in reducing the microbial load, and any potential negative impacts on the sensory characteristics, nutritional composition, and safety of the produce.
Keywords: fresh produce, pathogenic microorganisms, foodborne illnesses, disinfection methods
Procedia PDF Downloads 72
15674 Nondestructive Prediction and Classification of Gel Strength in Ethanol-Treated Kudzu Starch Gels Using Near-Infrared Spectroscopy
Authors: John-Nelson Ekumah, Selorm Yao-Say Solomon Adade, Mingming Zhong, Yufan Sun, Qiufang Liang, Muhammad Safiullah Virk, Xorlali Nunekpeku, Nana Adwoa Nkuma Johnson, Bridget Ama Kwadzokpui, Xiaofeng Ren
Abstract:
Enhancing starch gel strength and stability is crucial. However, traditional gel property assessment methods are destructive, time-consuming, and resource-intensive. Thus, understanding ethanol treatment effects on kudzu starch gel strength and developing a rapid, nondestructive gel strength assessment method is essential for optimizing the treatment process and ensuring product quality consistency. This study investigated the effects of different ethanol concentrations on the microstructure of kudzu starch gels using a comprehensive microstructural analysis. We also developed a nondestructive method for predicting gel strength and classifying treatment levels using near-infrared (NIR) spectroscopy, and advanced data analytics. Scanning electron microscopy revealed progressive network densification and pore collapse with increasing ethanol concentration, correlating with enhanced mechanical properties. NIR spectroscopy, combined with various variable selection methods (CARS, GA, and UVE) and modeling algorithms (PLS, SVM, and ELM), was employed to develop predictive models for gel strength. The UVE-SVM model demonstrated exceptional performance, with the highest R² values (Rc = 0.9786, Rp = 0.9688) and lowest error rates (RMSEC = 6.1340, RMSEP = 6.0283). Pattern recognition algorithms (PCA, LDA, and KNN) successfully classified gels based on ethanol treatment levels, achieving near-perfect accuracy. This integrated approach provided a multiscale perspective on ethanol-induced starch gel modification, from molecular interactions to macroscopic properties. Our findings demonstrate the potential of NIR spectroscopy, coupled with advanced data analysis, as a powerful tool for rapid, nondestructive quality assessment in starch gel production. 
This study contributes significantly to the understanding of starch modification processes and opens new avenues for research and industrial applications in food science, pharmaceuticals, and biomaterials.
Keywords: kudzu starch gel, near-infrared spectroscopy, gel strength prediction, support vector machine, pattern recognition algorithms, ethanol treatment
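The figures of merit quoted for the calibration models (R² and RMSE) follow their standard definitions. A minimal sketch, with invented gel-strength values rather than the study's measurements:

```python
import math

# Standard R^2 and RMSE, as quoted for the UVE-SVM model above.
# The measured/predicted gel-strength values here are invented.

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

y_true = [100.0, 150.0, 200.0, 250.0, 300.0]  # invented measured values
y_pred = [105.0, 145.0, 210.0, 240.0, 305.0]  # invented model predictions

print(round(r_squared(y_true, y_pred), 4))  # close to 1 = good fit
print(round(rmse(y_true, y_pred), 4))       # in the units of gel strength
```

Rc/RMSEC versus Rp/RMSEP in the abstract are these same quantities computed on the calibration and prediction sets respectively.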
Procedia PDF Downloads 35
15673 Towards End-To-End Disease Prediction from Raw Metagenomic Data
Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker
Abstract:
Analysis of the human microbiome using metagenomic sequencing data has demonstrated a strong ability to discriminate various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA and stored as fastq files. Conventional processing pipelines consist of multiple steps, including quality control, filtering, and alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time-consuming, and rely on a large number of parameters that often introduce variability and affect the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings, which create a meaningful numerical representation of DNA sequences while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps: (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come; and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of each genome on the prediction.
Using two public real-life datasets as well as a simulated one, we demonstrate that this original approach reaches high performance, comparable with state-of-the-art methods applied directly to data processed through mainstream bioinformatics workflows. These results are encouraging for this proof-of-concept work. We believe that with further dedication, DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
Keywords: deep learning, disease prediction, end-to-end machine learning, metagenomics, multiple instance learning, precision medicine
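Step (i) above, generating a k-mer vocabulary from raw reads, can be sketched as follows. The choice k = 4 and the toy reads are illustrative; real inputs would be millions of reads parsed from fastq files:

```python
from collections import Counter

# Sketch of k-mer vocabulary generation: count every overlapping k-mer
# across all reads. k = 4 and the two toy reads are assumptions for
# illustration; metagenome2vec would then learn embeddings per k-mer.

def kmer_vocabulary(reads, k):
    """Count every overlapping k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

reads = ["ACGTACGT", "CGTACGTA"]
vocab = kmer_vocabulary(reads, k=4)

print(vocab["ACGT"])  # twice in the first read, once in the second
print(len(vocab))     # number of distinct k-mers in the vocabulary
```

The counts define the vocabulary over which numerical embeddings are learned, analogous to word embeddings over a text corpus.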
Procedia PDF Downloads 124
15672 Using Hierarchical Methodology to Assist the Selection of New Business in Brazilian Companies Incubators
Authors: Izabel Cristina Zattar, Gilberto Passos Lima, Guilherme Schünemann de Oliveira
Abstract:
In Brazil, there are several institutions committed to the development of new businesses based on product innovation. Among them are business incubators, universities and science institutes. Business incubators can be defined as nurseries for new companies, which may be in the technology segment discussed in this article. Business incubators provide services related to infrastructure, such as physical space and meeting rooms. Besides these services, incubators also offer assistance in the form of information and communication, access to finance, relationship networks, and business monitoring and mentoring processes. Business incubators do not support all technology companies; one of their tasks is to assess the nature and feasibility of new business proposals. To assist in this goal, this paper proposes a methodology for evaluating new businesses using the Analytic Hierarchy Process (AHP), presenting concepts that have been tested with positive results in practice. The study consists of three main steps. First, a hierarchy was built, based on the new-business manuals used by the business incubators; these books and manuals list business selection requirements, such as innovation status and other technological aspects. Then, a questionnaire was generated in order to guide incubator experts through the pairwise comparisons at all hierarchy levels; the weight of each requirement is calculated from the questionnaire responses. Finally, the proposed method was applied to evaluate five new business proposals that were applying to join a business incubator. The main result is the classification of these new businesses, which helped the incubator experts decide which companies were most eligible to work with.
This classification may also be helpful to the decision-making process of business incubators in future selection processes.
Keywords: Analytic Hierarchy Process (AHP), Brazilian companies incubators, technology companies, incubator
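The AHP weight-derivation step described above can be sketched with the common geometric-mean approximation of the principal eigenvector. The 3x3 pairwise comparison matrix and its judgments on Saaty's 1-9 scale are invented for illustration, not taken from the incubator manuals:

```python
from math import prod

# Sketch of AHP priority weights via the geometric-mean (row geometric
# mean) approximation. The pairwise judgments below are assumptions:
# criterion A is judged 3x as important as B and 5x as important as C.

def ahp_weights(matrix):
    """Approximate AHP priority weights from a pairwise comparison matrix."""
    n = len(matrix)
    geo_means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

matrix = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(matrix)
print([round(w, 3) for w in weights])  # weights sum to 1, A ranked highest
```

Each candidate business would then be scored against every criterion the same way, and the criterion weights combine those scores into a final ranking.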
Procedia PDF Downloads 397
15671 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach
Authors: Rajvir Kaur, Jeewani Anupama Ginige
Abstract:
With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is transitioning from being clinician-oriented to technology-oriented. Many people around the world die of cancer because the disease was not diagnosed at an early stage. Nowadays, computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper carries out a comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out using the standard metrics precision (P), recall (R), F1-score and accuracy. The experimental results show that ANN achieved the highest accuracy (99.4%) when tested on the breast cancer dataset. On the other hand, when the classifiers were tested on the cervical cancer dataset, the Ensemble (Bagged Tree) technique gave better accuracy (93.1%) than the other classifiers.
Keywords: artificial neural networks, breast cancer, classifiers, cervical cancer, f-score, machine learning, precision, recall
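The evaluation metrics used in the comparison follow directly from confusion-matrix counts. A minimal sketch with invented counts, not the paper's results:

```python
# Precision, recall, F1 and accuracy from confusion-matrix counts.
# The counts below are invented for illustration only.

def metrics(tp, fp, fn, tn):
    """Return (precision, recall, F1, accuracy) from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

p, r, f1, acc = metrics(tp=90, fp=10, fn=5, tn=95)
print(round(p, 3), round(r, 3), round(f1, 3), round(acc, 3))
```

Computing all four per classifier, as the study does, matters because accuracy alone can look strong on imbalanced medical datasets even when recall on the positive (cancer) class is poor.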
Procedia PDF Downloads 275
15670 Proposed Organizational Development Interventions in Managing Occupational Stressors for Business Schools in Batangas City
Authors: Marlon P. Perez
Abstract:
The study intended to determine the level of occupational stress experienced by faculty members of private and public business schools in Batangas City, with the end in view of proposing organizational development interventions for managing occupational stressors. Stressors such as factors intrinsic to the job, role in the organization, relationships at work, career development, and organizational structure and climate were used as determinants of occupational stress level. The descriptive method of research was used as the research design. The respondents were 64 full-time faculty members from private and public business schools in Batangas City: University of Batangas, Lyceum of the Philippines University-Batangas, Golden Gate Colleges, Batangas State University and Colegio ng Lungsod ng Batangas. A survey questionnaire was used as the data-gathering instrument. It was found that all occupational stressors were assessed as stressful when grouped according to the classification of tertiary schools, although the respondents differed in their assessments of the individual stressors. Age was significantly related to the respondents' assessments of factors intrinsic to the job and career development, but not to role in the organization, relationships at work, or organizational structure and climate. On the other hand, gender, marital status, highest educational attainment, employment status, length of service, area of specialization and classification of tertiary school were not significantly related to any of the occupational stressors. Various organizational development interventions are proposed to manage the occupational stressors experienced by business faculty members in these institutions.
Keywords: occupational stress, business school, organizational development, intervention, stressors, faculty members, assessment, manage
Procedia PDF Downloads 431
15669 A Comparative Study of Medical Image Segmentation Methods for Tumor Detection
Authors: Mayssa Bensalah, Atef Boujelben, Mouna Baklouti, Mohamed Abid
Abstract:
Image segmentation plays a fundamental role in analysis and interpretation for many applications. The automated segmentation of organs and tissues throughout the body using computed imaging has been increasing rapidly; indeed, it represents one of the most important parts of clinical diagnostic tools. In this paper, we present a thorough literature review of recent methods for tumor segmentation from medical images, briefly explaining the contributions of various researchers. We then compare these methods in order to define new directions for developing and improving the performance of tumor-area segmentation from medical images.
Keywords: features extraction, image segmentation, medical images, tumor detection
Procedia PDF Downloads 164
15668 Dimensional Investigation of Food Addiction in Individuals Who Have Undergone Bariatric Surgery
Authors: Ligia Florio, João Mauricio Castaldelli-Maia
Abstract:
Background: Food addiction (FA) emerged in the 1990s as a possible contributor to the increasing prevalence of obesity and overweight, in conjunction with changing food environments and mental health conditions. However, FA is not yet listed as a disorder in the DSM-5 or the ICD-11. Although there are controversies and debates in the literature about the classification and construct of FA, the most common approach to assessing it is a research tool, the Yale Food Addiction Scale (YFAS), which approximates the concept of FA to the diagnostic concept of dependence on psychoactive substances. There is a need to explore the dimensional phenotypes assessed by the YFAS in different population groups, to better understand and scientifically support FA diagnoses. Methods: The primary objective of this project was to investigate the construct validity of the FA concept as measured by the mYFAS 2.0 in individuals who underwent bariatric surgery (n = 100) at the Hospital Estadual Mário Covas since 2011. Statistical analyses were conducted using the STATA software. In this sense, structural or factor validity was the type of construct validity investigated, using exploratory factor analysis (EFA) and item response theory (IRT) techniques. Results: EFA showed that the one-dimensional model was the most parsimonious. IRT showed that all criteria contributed to the latent structure, presenting discrimination values greater than 0.5, with most greater than 2. Conclusion: This study reinforces an FA dimension in patients who underwent bariatric surgery. Within this dimension, we identified the most severe and discriminating criteria for the diagnosis of FA.
Keywords: obesity, food addiction, bariatric surgery, regain
Procedia PDF Downloads 76
15667 Assessment of the Spatio-Temporal Distribution of Pteridium aquilinum (Bracken Fern) Invasion on the Grassland Plateau in Nyika National Park
Authors: Andrew Kanzunguze, Lusayo Mwabumba, Jason K. Gilbertson, Dominic B. Gondwe, George Z. Nxumayo
Abstract:
Knowledge about the spatio-temporal distribution of invasive plants in protected areas provides a basis for forming hypotheses that explain the proliferation of plant invasions, alongside the development of relevant invasive plant monitoring programs. The aim of this study was to investigate the spatio-temporal distribution of bracken fern on the grassland plateau of Nyika National Park over the past 30 years (1986-2016) and to determine the current extent of the invasion. Remote sensing, machine learning, and statistical modelling techniques (object-based image analysis, image classification and linear regression analysis) in geographical information systems were used to determine both the spatial and temporal distribution of bracken fern in the study area. Results revealed that bracken fern has been expanding its coverage on the Nyika plateau at an estimated annual rate of 87.3 hectares since 1986. This translates to an estimated net increase of 2,573.1 hectares, recorded from 1,788.1 hectares (1986) to 4,361.9 hectares (2016). As of 2017, bracken fern covered 20,940.7 hectares, approximately 14.3% of the entire grassland plateau. Additionally, the fern was distributed most densely around Chelinda camp (on the central plateau), as well as in forest verges and roadsides across the plateau. Based on these results, it is recommended that Ecological Niche Modelling approaches be employed to (i) isolate the most important factors influencing bracken fern proliferation and (ii) identify and prioritize areas requiring immediate control interventions, so as to minimize bracken fern proliferation in Nyika National Park.
Keywords: bracken fern, image classification, Landsat-8, Nyika National Park, spatio-temporal distribution
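An annual rate of spread like the one reported can be estimated by fitting a least-squares line to areal coverage over time. In the sketch below, the interior (year, hectares) observations are invented; only the 1986 and 2016 endpoints come from the abstract:

```python
# Sketch of estimating an annual rate of spread via ordinary least squares.
# Endpoint areas (1986, 2016) are from the abstract; the two interior
# observations are invented, so the slope here is illustrative only.

def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

years = [1986, 1996, 2006, 2016]
hectares = [1788.1, 2600.0, 3500.0, 4361.9]
rate = ols_slope(years, hectares)
print(round(rate, 1))  # estimated hectares gained per year
```

With more intermediate Landsat-derived observations, the fitted slope approaches the study's regression-based estimate of the annual expansion rate.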
Procedia PDF Downloads 178
15666 Reduplication in Dhiyan: An Indo-Aryan Language of Assam
Authors: S. Sulochana Singha
Abstract:
Dhiyan, or Dehan, is the name of the community and language spoken by the Koch-Rajbangshi people of the Barak Valley of Assam. Ethnically, they are Mongoloids, and their language belongs to the Indo-Aryan language family. However, Dhiyan is absent from existing classifications of Indo-Aryan languages, so its placement in the Indo-Aryan family rests entirely on the typological features it shares with other Indo-Aryan languages. Typologically, Dhiyan is an agglutinating language, and it shares many features of Indo-Aryan languages, including the presence of aspirated voiced stops, absence of tone, verb-person agreement, adjectives as a distinct word class, prominent tense, and subject-object-verb word order. Reduplication is a productive word-formation process in Dhiyan; it also expresses plurality, intensification, and distribution. Reduplication in Dhiyan can operate at the morphological or lexical level. Morphological reduplication involves expressives, which include onomatopoeia, sound symbolism, ideophones, and imitatives. Lexical reduplication is formed by echo formations and word reduplication. Echo formations are built by partial repetition of the base word, through either consonant or vowel alternation: consonant alternation is found chiefly in onset position, while vowel alternation occurs mainly in open syllables, particularly the final syllable. Word reduplication involves the reduplication of nouns, interrogatives, adjectives, and numerals, and can be class-changing or class-maintaining. The process of reduplication, whether lexical or morphological, can be partial or complete.
The present paper is an attempt to describe some aspects of the formation, function, and usage of reduplication in Dhiyan, which is mainly spoken in ten villages on the eastern side of the Barak River in the Cachar District of Assam.
Keywords: Barak-Valley, Dhiyan, Indo-Aryan, reduplication
Procedia PDF Downloads 214