Search results for: mixed dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3752

3452 Contact Phenomena in Medieval Business Texts

Authors: Carmela Perta

Abstract:

Among the studies that have flourished in historical sociolinguistics, mainly in the strand devoted to the history of English during its medieval and early modern phases, multilingual texts have been analysed using theories and models from contact linguistics, thus applying synchronic models and approaches to the past. This is also true of contact phenomena that transcend the level of writing and involve the language systems implicated in contact processes, to the point that a new variety is perceived. This is the case for medieval administrative-commercial texts in which, according to some scholars, the degree of fusion of Anglo-Norman, Latin and Middle English is so high that a mixed code emerges, with recurrent patterns of mixed forms. Of particular interest is a collection of multilingual business writings by John Balmayn, an Englishman overseeing a large shipment in Tuscany, namely the Cantelowe accounts. These documents display various analogies with multilingual texts written in England in the same period; in fact, the writer seems to make use of the above-mentioned patterns, with Middle English, Latin, Anglo-Norman, and the newly added Italian. Applying an atomistic yet dynamic approach to the study of contact phenomena, we investigate these documents and explore the nature of the switching forms they contain from an intra-writer variation perspective. After analysing the accounts and the type of multilingualism in them, we take stock of their assumed mixed-code nature, comparing the characteristics found in this genre with modern assumptions. The aim is to evaluate whether the switching forms can be considered core elements of a mixed code, used as a professional variety among merchant communities, or whether such texts should be analysed from a switching perspective.

Keywords: historical sociolinguistics, historical code switching, letters, medieval england

Procedia PDF Downloads 49
3451 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining techniques for clustering are a subject of active research and assist in biological pattern recognition and the extraction of new knowledge from raw data. Clustering is the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects in other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters, leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers, and a poor choice may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy for initializing the K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency is significantly improved.
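
The sensitivity to initial centers can be illustrated with a minimal sketch. The abstract does not describe the proposed initialization strategy, so the farthest-point seeding below is a stand-in chosen only for illustration and is not the authors' method. The function names and the four-point toy dataset are assumptions.

```python
import numpy as np


def lloyd_kmeans(X, centers, n_iter=100):
    """Plain K-Means (Lloyd's algorithm) started from the given centers."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    sse = float(dist[np.arange(len(X)), labels].sum())
    return centers, labels, sse


def farthest_point_init(X, k):
    """Illustrative seeding (assumption): start from the first point, then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [X[0]]
    for _ in range(k - 1):
        dist = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[dist.argmax()])
    return np.array(centers)


# Toy data: two tight pairs of points, far apart along the x-axis.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])

# A poor start splits the data top/bottom and gets stuck there.
poor_init = np.array([[2.0, 0.0], [2.0, 1.0]])
_, _, sse_poor = lloyd_kmeans(X, poor_init)

# The seeded start recovers the left/right split.
_, _, sse_seeded = lloyd_kmeans(X, farthest_point_init(X, 2))

print("SSE with poor init:  ", sse_poor)
print("SSE with seeded init:", sse_seeded)
```

On this toy dataset the poor start converges immediately to a local optimum with a sum of squared errors of 16, while the seeded start reaches the global optimum with a sum of squared errors of 1.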

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 165
3450 Mixed Hydrotropic Zaleplon Oral Tablets: Formulation and Neuropharmacological Effect on Plasma GABA Level

Authors: Ghada A. Abdelbary, Maha M. Amin, Mostafa Abdelmoteleb

Abstract:

Zaleplon (ZP) is a poorly soluble non-benzodiazepine hypnotic drug indicated for the short-term treatment of insomnia, with a bioavailability of about 30%. The aim of the present study is to enhance the solubility, and consequently the bioavailability, of ZP using hydrotropic agents (HA). Phase solubility diagrams of ZP in the presence of different molar concentrations of HA (Sodium benzoate, Urea, Ascorbic acid, Resorcinol, Nicotinamide, and Piperazine) were constructed. ZP/Sodium benzoate and Resorcinol microparticles were prepared using melt, solvent evaporation and melt-evaporation techniques, followed by XRD. Directly compressed mixed hydrotropic ZP tablets of Sodium benzoate and Resorcinol in different weight ratios were prepared and evaluated against the commercially available tablets (Sleep aid® 5 mg). The effect of shelf and accelerated stability storage (40°C ± 2°C/75%RH ± 5%RH) on the optimum tablet formula (F5) over six months was studied. The enhancement of ZP solubility follows the order Resorcinol > Sodium benzoate > Ascorbic acid > Piperazine > Urea > Nicotinamide, with about 350- and 2000-fold increases using 1M Sodium benzoate and Resorcinol, respectively. ZP/HA microparticles follow the order solvent evaporation > melt-solvent evaporation > melt > physical mixture, which was further confirmed by the complete conversion of ZP into the amorphous form. The mixed hydrotropic tablet formula (F5), composed of ZP/(Resorcinol:Sodium benzoate 4:1 w/w) microparticles prepared by solvent evaporation, exhibits in-vitro dissolution of 31.7±0.11% after five minutes (Q5min) compared to 10.0±0.10% for Sleep aid® (5 mg). F5 showed a significantly higher plasma GABA concentration of 122.5±5.5 mg/mL compared to 118±1.00 and 27.8±1.5 mg/mL for Sleep aid® (5 mg) and the control taking only saline, respectively, suggesting a higher neuropharmacological effect of ZP following hydrotropic solubilization.

Keywords: zaleplon, hydrotropic solubilization, plasma GABA level, mixed hydrotropy

Procedia PDF Downloads 421
3449 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. A main problem encountered in data mining applications is clustering the categorical data that are so prevalent in datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. This paper tests such an approach, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The results show that the proposed method outperforms the binary method in all cases.
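
One plausible reading of the relative-frequency transformation is sketched below: each categorical value is replaced by the share of rows in which that value occurs for its attribute. The function name and the sample rows are assumptions made for illustration; the abstract does not give the exact formula the authors use.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace every categorical value by its relative frequency
    within its own attribute (column)."""
    n_rows = len(rows)
    n_attrs = len(rows[0])
    # One frequency table per attribute.
    counts = [Counter(row[j] for row in rows) for j in range(n_attrs)]
    return [
        [counts[j][row[j]] / n_rows for j in range(n_attrs)]
        for row in rows
    ]


# Hypothetical categorical dataset: (colour, size).
rows = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("red", "small"),
]

encoded = relative_frequency_encode(rows)
for original, numeric in zip(rows, encoded):
    print(original, "->", numeric)
```

The encoded rows are purely numeric and can be passed to any k-means implementation. One caveat of this encoding is that two different modalities with the same frequency map to the same number, so they become indistinguishable to the clustering step.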

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 237
3448 Neural Network and Support Vector Machine for Prediction of Foot Disorders Based on Foot Analysis

Authors: Monireh Ahmadi Bani, Adel Khorramrouz, Lalenoor Morvarid, Bagheri Mahtab

Abstract:

Background: Foot disorders are common musculoskeletal problems. Plantar pressure distribution measurement is one of the most important parts of foot disorder diagnosis for quantitative analysis. However, the association between plantar pressure and foot disorders is not clear. With the growth of datasets and machine learning methods, the relationship between foot disorders and plantar pressures can be detected. Significance of the study: The purpose of this study was to predict the probability of common foot disorders based on peak plantar pressure distribution and center of pressure during walking. Methodologies: 2323 participants were assessed in a foot therapy clinic between 2015 and 2021. Foot disorders were diagnosed by an experienced physician, and the participants were then asked to walk on a force plate scanner. After data preprocessing, because of differences in walking time and foot size, we normalized the samples based on time and foot size. Some of the force plate variables were selected as input to a deep neural network (DNN), and the probability of each foot disorder was measured. In the next step, we used a support vector machine (SVM) and ran the dataset for each foot disorder (classification of yes or no). We compared DNN and SVM for foot disorder prediction based on plantar pressure distributions and center of pressure. Findings: The results demonstrated that the accuracy of the deep learning architecture is sufficient for most clinical and research applications in the study population. In addition, the SVM approach is more accurate for predictions, enabling applications in foot disorder diagnosis. The detection accuracy was 71% with the deep learning algorithm and 78% with the SVM algorithm. Moreover, working with the peak plantar pressure distribution was more accurate than working with the center of pressure dataset.
Conclusion: Both algorithms, deep learning and SVM, will help therapists and patients to improve the data pool and enhance foot disorder prediction with less expense and error after some restrictions are properly removed.

Keywords: deep neural network, foot disorder, plantar pressure, support vector machine

Procedia PDF Downloads 318
3447 Exploring the Association between Race and Attitudes toward Physician-Assisted Death: An Analysis of the GSS Dataset

Authors: Seini G. Kaufusi

Abstract:

Background. Physician-assisted death (PAD) has been and continues to be a controversial issue in the U.S. Dying with dignity statutes exist in 9 U.S. jurisdictions and permit competent adults diagnosed with a terminal illness and given a prognosis of 6 months or less to live to request medication to hasten death. Robust advocacy for and against PAD influences policy, and opinions vary. Aim. This study aims to explore the association between race and attitudes toward physician-assisted death in the U.S. Methods. Data for this study derive from the General Social Survey (GSS) dataset, a national survey conducted by the National Opinion Research Center (NORC) that focuses on the opinions and values of Americans. A cross-sectional design and a probability sample from the 2018 data set were used to randomly select respondents. Results. The results indicated that race is significantly associated with attitudes towards physician-assisted death. The level of significance suggests a strong positive association, and the direction indicated that Black and Other racial groups have higher rates of positive decisions about PAD. Conclusion. Although attitudes towards PAD varied, Black and Other racial groups had favorable decisions for PAD. Further research is crucial to the continuing debate on PAD and to understanding the influence of predictors for or against PAD.

Keywords: attitudes, euthanasia, physician-assisted death, race

Procedia PDF Downloads 140
3446 Green Materials for Hot Mixed Asphalt Production

Authors: Salisu Dahiru, Jibrin M. Kaura, Abubakar I. Jumare, Sulaiman M. Mahmood

Abstract:

Reclaimed asphalt, used automobile tires, and rice husk are regarded as waste. These materials could be used in the construction of new roads and in road rehabilitation. This paper presents an investigation into the production of a Green Hot Mixed Asphalt (GHMA) pavement for road construction and rehabilitation, using Reclaimed Asphalt Pavement (RAP) as partial replacement for coarse aggregate, Crumb Rubber (CR) from waste automobile tires as modifier for the bitumen binder, and Rice Husk Ash (RHA) as partial replacement for ordinary Portland cement (OPC) filler. 30% reclaimed asphalt of total aggregate, 15% Crumb Rubber of total binder content, 5% Rice Husk Ash of total mix, and 5.2% Crumb Rubber Modified Bitumen (CRMB) content were recommended for optimum performance. Loss of Marshall stability was investigated on a mix with the recommended optimum CRMB. The mix showed good performance, with only about 13% loss of stability after 24 hours of immersion in a hot water bath, as against the loss of about 24% in Marshall stability reported in previous studies for conventional Hot Mixed Asphalt (HMA).

Keywords: rice husk, reclaimed asphalt, filler, crumb rubber, bitumen content, green hot mix asphalt

Procedia PDF Downloads 303
3445 Heuristic Classification of Hydrophone Recordings

Authors: Daniel M. Wolff, Patricia Gray, Rafael de la Parra Venegas

Abstract:

An unsupervised machine listening system is constructed and applied to a dataset of 17,195 30-second marine hydrophone recordings. The system is then heuristically supplemented with anecdotal listening, contextual recording information, and supervised learning techniques to reduce the number of false positives. Features for classification are assembled by extracting the following data from each of the audio files: the spectral centroid, root-mean-squared values for each frequency band of a 10-octave filter bank, and mel-frequency cepstral coefficients in 5-second frames. In this way both time- and frequency-domain information are contained in the features to be passed to a clustering algorithm. Classification is performed using the k-means algorithm and then a k-nearest neighbors search. Different values of k are experimented with, in addition to different combinations of the available feature sets. Hypothesized class labels are 'primarily anthrophony' and 'primarily biophony', where the best class result conforming to the former label has 104 members after heuristic pruning. This demonstrates how a large audio dataset has been made more tractable with machine learning techniques, forming the foundation of a framework designed to acoustically monitor and gauge biological and anthropogenic activity in a marine environment.
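
The feature extraction step can be sketched in simplified form. The example below computes only a spectral centroid and a full-band root-mean-squared value per 5-second frame; the per-band RMS values from the 10-octave filter bank and the mel-frequency cepstral coefficients mentioned in the abstract are omitted. The function names, the 8 kHz sample rate, and the synthetic 1 kHz test tone are assumptions for illustration.

```python
import numpy as np


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = magnitude.sum()
    return float((freqs * magnitude).sum() / total) if total > 0 else 0.0


def frame_features(signal, sample_rate, frame_seconds=5.0):
    """Return one (spectral centroid, RMS) row per non-overlapping frame."""
    frame_len = int(frame_seconds * sample_rate)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        rms = float(np.sqrt(np.mean(frame ** 2)))
        rows.append((spectral_centroid(frame, sample_rate), rms))
    return np.array(rows)


# Synthetic stand-in for a 30-second recording: a pure 1 kHz tone.
sample_rate = 8000
t = np.arange(30 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

features = frame_features(tone, sample_rate)
print(features.shape)  # six 5-second frames, two features each
print(features[0])     # centroid near 1000 Hz, RMS near 0.707
```

Rows of this kind, one per frame or one aggregated row per recording, are the sort of input a k-means clustering step would receive.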

Keywords: anthrophony, hydrophone, k-means, machine learning

Procedia PDF Downloads 136
3444 A Self-Adaptive Stimulus Artifacts Removal Approach for Electrical Stimulation Based Muscle Rehabilitation

Authors: Yinjun Tu, Qiang Fang, Glenn I. Matthews, Shuenn-Yuh Lee

Abstract:

This paper reports an efficient and rigorous self-adaptive stimulus artifact removal approach for a mixed surface EMG (electromyography) and stimulus signal during muscle stimulation. The recording of EMG and the stimulation of muscles were performed simultaneously. It is difficult to extract a muscle fatigue feature from the mixed signal that can be further used in a closed-loop system. A self-adaptive method is proposed in this paper. First, the stimulation frequency was calculated and verified. Then, a mask was created based on this stimulation frequency to remove the undesired stimulus. 20 EMG signal recordings were analyzed, and the ANOVA (analysis of variance) approach showed that the decreasing trend of median power frequencies was successfully obtained from the 'cleaned' EMG signal.

Keywords: EMG, FES, stimulus artefacts, self-adaptive

Procedia PDF Downloads 376
3443 Gait Biometric for Person Re-Identification

Authors: Lavanya Srinivasan

Abstract:

Biometric identification recognizes a person from unique features such as fingerprints, iris, ear, and voice, which require the subject's permission and physical contact. Gait biometrics identify a person's unique gait by extracting moving features. The main advantage of gait biometrics is that a person's gait can be identified at a distance, without any physical contact. In this work, gait biometrics are used for person re-identification. A person walking naturally is compared with the same person walking with a bag, coat, and case, recorded using longwave infrared, shortwave infrared, medium-wave infrared, and visible cameras. The videos are recorded in rural and urban environments. Pre-processing includes human detection using YOLO, background subtraction, silhouette extraction, and synthesis of the Gait Entropy Image by averaging the silhouettes. The moving features are extracted from the Gait Entropy Energy Image. The dimensionality of the extracted features is reduced by principal component analysis, and the features are recognised using different classifiers. The comparative results show that linear discriminant analysis outperforms the other classifiers, with 95.8% for visible in the rural dataset and 94.8% for longwave infrared in the urban dataset.
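
A minimal sketch of the silhouette-averaging step follows. It assumes the per-pixel Shannon entropy formulation commonly associated with the Gait Entropy Image, where the averaged silhouette gives the probability that a pixel is foreground; the abstract does not state the exact formula the author uses, and the function name and tiny silhouette stack are illustrative.

```python
import numpy as np


def gait_entropy_image(silhouettes):
    """Per-pixel Shannon entropy of a stack of binary silhouettes.

    silhouettes: array of shape (frames, height, width) with values 0 or 1.
    Averaging over frames gives p, the fraction of frames in which each
    pixel is foreground; the entropy is -p*log2(p) - (1-p)*log2(1-p).
    """
    p = np.mean(np.asarray(silhouettes, dtype=float), axis=0)

    def p_log_p(x):
        # Treat 0 * log2(0) as 0 so static pixels get zero entropy.
        out = np.zeros_like(x)
        positive = x > 0
        out[positive] = x[positive] * np.log2(x[positive])
        return out

    return -(p_log_p(p) + p_log_p(1.0 - p))


# Four frames of a 1x3 "image": always background, always foreground,
# and a pixel that is foreground in half of the frames.
stack = np.array([
    [[0, 1, 1]],
    [[0, 1, 0]],
    [[0, 1, 1]],
    [[0, 1, 0]],
])

entropy = gait_entropy_image(stack)
print(entropy)
```

Pixels that never change (static body parts or background) receive zero entropy, while pixels that change across the gait cycle receive high entropy, which is why the image emphasizes the moving regions.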

Keywords: biometric, gait, silhouettes, YOLO

Procedia PDF Downloads 152
3442 One-Shot Text Classification with Multilingual-BERT

Authors: Hsin-Yang Wang, K. M. A. Salam, Ying-Jia Lin, Daniel Tan, Tzu-Hsuan Chou, Hung-Yu Kao

Abstract:

Detecting user intent from natural language expressions has a wide variety of use cases in different natural language processing applications. Recently, few-shot training has seen a spike in usage in commercial domains. Due to the lack of significant sample features, downstream task performance has been limited or has led to unstable results across different domains. As a state-of-the-art method, the pre-trained BERT model, which gathers sentence-level information from a large text corpus, shows improvement on several NLP benchmarks. In this research, we propose a method that changes multi-class classification tasks into binary classification tasks and then uses the confidence score to rank the results. As a language model, BERT performs well on sequence data. In our experiment, we change the objective from predicting labels to finding the relations between words in sequence data. Our proposed method achieved 71.0% accuracy on the internal intent detection dataset and 63.9% accuracy on the HuffPost dataset. Acknowledgment: This work was supported by NCKU-B109-K003, which is the collaboration between National Cheng Kung University, Taiwan, and SoftBank Corp., Tokyo.
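
The reformulation from multi-class to binary classification with confidence ranking can be sketched as follows. Each candidate label is paired with the input text, a binary scorer returns a confidence that the pair matches, and the labels are ranked by that confidence. The word-overlap scorer is a toy stand-in for a fine-tuned BERT pair classifier and is not the authors' model; all names and example strings are hypothetical.

```python
from typing import Callable, List, Tuple


def rank_labels(
    text: str,
    labels: List[str],
    score_pair: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every (text, label) pair as a binary match problem and
    return the labels sorted by descending confidence."""
    scored = [(label, score_pair(text, label)) for label in labels]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def toy_overlap_score(text: str, label_description: str) -> float:
    """Toy confidence (assumption): Jaccard overlap of lowercase tokens.
    A real system would use the match probability of a binary classifier."""
    a = set(text.lower().split())
    b = set(label_description.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


labels = ["book flight", "play music", "check weather"]
ranked = rank_labels("book a flight to tokyo", labels, toy_overlap_score)

for label, confidence in ranked:
    print(f"{label}: {confidence:.2f}")
```

Because each label is scored independently, new labels can be added at inference time without retraining a fixed-size output layer, which is one reason this formulation suits one-shot and few-shot settings.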

Keywords: OSML, BERT, text classification, one shot

Procedia PDF Downloads 81
3441 FLIME - Fast Low Light Image Enhancement for Real-Time Video

Authors: Vinay P., Srinivas K. S.

Abstract:

Low Light Image Enhancement is of utmost importance in computer vision based tasks. Applications include vision systems for autonomous driving, night vision devices for defence systems, and low light object detection tasks. Many of the existing deep learning methods are resource intensive during the inference step and take considerable time for processing. The algorithm should take considerably less than 41 milliseconds in order to process a real-time video feed with 24 frames per second, and even less for a video with 30 or 60 frames per second. The paper presents a fast and efficient solution which has two main advantages: it has the potential to be used for a real-time video feed, and it can be used in low compute environments because of its lightweight nature. The proposed solution is a pipeline of three steps: the first is the use of a simple function to map input RGB values to output RGB values, the second is to balance the colors, and the final step is to adjust the contrast of the image. Hence a custom dataset is carefully prepared using images taken in low and bright lighting conditions. The preparation of the dataset, the proposed model, and the processing time are discussed in detail, and the quality of the enhanced images using different methods is shown.
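
The per-frame time budget quoted above follows directly from the frame rate: one second divided by the number of frames per second. A short sketch of the arithmetic, with hypothetical helper names:

```python
def frame_budget_ms(fps: float) -> float:
    """Maximum processing time per frame, in milliseconds, to keep up
    with a video feed running at the given frame rate."""
    return 1000.0 / fps


def meets_real_time(processing_ms: float, fps: float) -> bool:
    """True if a per-frame processing time fits inside the frame budget."""
    return processing_ms < frame_budget_ms(fps)


budgets = {fps: frame_budget_ms(fps) for fps in (24, 30, 60)}
for fps, budget in budgets.items():
    print(f"{fps} fps -> {budget:.2f} ms per frame")
```

At 24 frames per second the budget is about 41.67 ms, at 30 frames per second about 33.33 ms, and at 60 frames per second about 16.67 ms. In practice the enhancement step must finish well inside the budget, since decoding and display also consume time.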

Keywords: low light image enhancement, real-time video, computer vision, machine learning

Procedia PDF Downloads 171
3440 On Enabling Miner Self-Rescue with In-Mine Robots using Real-Time Object Detection with Thermal Images

Authors: Cyrus Addy, Venkata Sriram Siddhardh Nadendla, Kwame Awuah-Offei

Abstract:

Surface robots in modern underground mine rescue operations suffer from several limitations in enabling a prompt self-rescue. Therefore, the possibility of designing and deploying in-mine robots to expedite miner self-rescue can have a transformative impact on miner safety. These in-mine robots for miner self-rescue can be envisioned to carry out diverse tasks such as object detection, autonomous navigation, and payload delivery. Specifically, this paper investigates the challenges in the design of object detection algorithms for in-mine robots using thermal images, especially to detect people in real time. A total of 125 thermal images were collected in the Missouri S&T Experimental Mine with the help of student volunteers using the FLIR TG 297 infrared camera, and were pre-processed into training and validation datasets with 100 and 25 images, respectively. Three state-of-the-art, pre-trained real-time object detection models, namely YOLOv5, YOLO-FIRI, and YOLOv8, were considered and re-trained using transfer learning techniques on the training dataset. On the validation dataset, the re-trained YOLOv8 outperforms the re-trained versions of both YOLOv5 and YOLO-FIRI.

Keywords: miner self-rescue, object detection, underground mine, YOLO

Procedia PDF Downloads 48
3439 Applying Neural Networks for Solving Record Linkage Problem via Fuzzy Description Logics

Authors: Mikheil Kalmakhelidze

Abstract:

The record linkage (RL) problem has become more and more important in recent years due to the growing interest in big data analysis. The problem can be formulated in a very simple way: given two entries a and b of a database, decide whether they represent the same object or not. There are two classical ways of solving the RL problem: deterministic and probabilistic. Using a simple Bayes classifier produces useful results in many cases, but the results are sometimes poor. In recent years, several successful approaches have been made towards solving specific RL problems with neural network algorithms, including the single-layer perceptron, the multilayer back-propagation network, etc. In our work, we model the RL problem for a specific dataset of student applications in fuzzy description logic (FDL), where the linkage of a specific pair (a,b) depends on the truth value of the corresponding formula A(a,b) in a canonical FDL model. As a main result, we build a neural network for deciding the truth value of FDL formulas in a canonical model and thus link the RL problem to machine learning. We apply the approach to a dataset with 10000 entries and also compare it to classical RL solving approaches. The results are more accurate than those of the standard probabilistic approach.

Keywords: description logic, fuzzy logic, neural networks, record linkage

Procedia PDF Downloads 249
3438 Photoluminescence Spectroscopy to Probe Mixed Valence State in Eu-Doped Nanocrystalline Glass-Ceramics

Authors: Ruchika Bagga, Mauro Falconieri, Venu Gopal Achanta, José M. F. Ferreira, Ashutosh Goel, Gopi Sharma

Abstract:

Mixed valence Eu-doped nanocrystalline NaAlSiO4/NaY9Si6O26 glass-ceramics have been prepared by controlled crystallization of melt quenched bulk glasses. XRD and SEM techniques were employed to characterize the crystallization process of the precursor glass and their resultant glass-ceramics. Photoluminescence spectroscopy was used to analyze the formation of divalent europium (Eu2+) from Eu3+ ions during high temperature synthesis under ambient atmosphere and is explained on the basis of optical basicity model. The observed luminescence properties of Eu: NaY9Si6O26 are compared with that of well explored Eu: β-PbF2 nanocrystals and their marked differences are discussed.

Keywords: rare earth, oxyfluoride glasses, nano-crystalline glass-ceramics, photoluminescence spectroscopy

Procedia PDF Downloads 318
3437 Research on Integrating Adult Learning and Practice into Long-Term Care Education

Authors: Liu Yi Hui, Chun-Liang Lai, Jhang Yu Cih, He You Jing, Chiu Fan-Yun, Lin Yu Fang

Abstract:

For universities offering long-term care education, the inclusion of adult learning and practices in professional courses, as appropriate and based on holistic design and evaluation, could improve talent empowerment by leveraging social capital. Moreover, it could make the courses and materials used in long-term care education responsive to real-life needs. A mixed research method was used in the research design. A quantitative study was conducted using a questionnaire survey, and the data were analyzed with SPSS 22.0 (Chinese version). The qualitative data included students’ learning files (learning reflection notes, course reports, and experience records).

Keywords: adult learning, community empowerment, social capital, mixed research

Procedia PDF Downloads 127
3436 Unsaturated Sites Constructed Grafted Polymer Nanoparticles to Promote CO₂ Separation in Mixed-Matrix Membranes

Authors: Boyu Li

Abstract:

Mixed matrix membranes (MMMs), as a separation technology, can improve CO₂ recycling efficiency and reduce the environmental impacts associated with huge emissions. Nevertheless, many challenges must be overcome to design MMMs with excellent selectivity and permeability. This work demonstrates that the design of nano-scale GNPs (Cu-BDC@PEG) with strong compatibility and high fractional free volume (FFV) is an effective way to construct MMMs without interfacial voids and with a desirable combination of selectivity and permeability. Notably, the FFV was boosted thanks to the chain length and shape of the GNPs. With this, the permeability and selectivity of Cu-BDC@PEG/PVDF MMMs were also significantly improved. As such, compatible Cu-BDC@PEG proves very efficient for resolving the challenges of MMMs with poor compatibility arising from interfacial defects. Poly(ethylene glycol) (PEG) with oxygen groups can be finely coordinated with Cu-MOFs to disperse Cu-BDC@PEG homogeneously and form hydrogen bonds with the matrix to achieve a continuous phase. The resultant MMMs exhibited a simultaneous enhancement of gas permeability (853.1 Barrer) and ideal CO₂/N₂ selectivity (41.7), surpassing Robeson's upper bound. Moreover, Cu-BDC@PEG/PVDF has high-temperature resistance and long-term stability. This attractive separation performance of Cu-BDC@PEG/PVDF offers an exciting platform for the development of composite membranes for sustainable CO₂ separations.

Keywords: metal organic framework, CO₂ separation, mixed matrix membrane, polymer

Procedia PDF Downloads 77
3435 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules that can be applied to classification trees. In this research, we used an online social network dataset and all of its attributes (e.g., node features, labels, etc.) to determine what constitutes an above-average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we consider overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them to different groups based on their properties and similarities. In data mining, recursive partitioning is used to probe the structure of a data set, which allows us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions and with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see whether any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the resulting classification methods and utilized decision trees (with k-fold cross-validation to prune the tree). The method applied to this dataset is decision trees.
A decision tree is a non-parametric classification method that uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.
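
The correlation check on the predictors can be sketched as follows. This is a Python illustration rather than the R code the authors used, and the predictor values below are made up for the example; only the predictor names density and transitivity come from the abstract. The 0.8 threshold and the function name are assumptions.

```python
import numpy as np


def highly_correlated_pairs(data, names, threshold=0.8):
    """Return (name_a, name_b, r) for every pair of predictor columns
    whose absolute Pearson correlation is at least the threshold."""
    corr = np.corrcoef(data, rowvar=False)  # columns are variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Made-up node-level predictors, one row per node.
names = ["density", "transitivity", "degree"]
data = np.array([
    [0.1, 0.12, 3.0],
    [0.2, 0.19, 1.0],
    [0.3, 0.33, 4.0],
    [0.4, 0.41, 1.0],
    [0.5, 0.48, 5.0],
])

pairs = highly_correlated_pairs(data, names)
for a, b, r in pairs:
    print(f"{a} ~ {b}: r = {r:.3f}")
```

When two predictors are this strongly correlated, a common choice is to drop one of them or to interpret variable-importance results with care, since a tree can split on either one almost interchangeably.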

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 123
3434 Spectral Mixture Model Applied to Cannabis Parcel Determination

Authors: Levent Basayigit, Sinan Demir, Yusuf Ucar, Burhan Kara

Abstract:

Many research projects require accurate delineation of the different land cover types in agricultural areas. This is especially critical for identifying specific plants such as cannabis. However, the complexity of vegetation stand structure, abundant vegetation species, and the smooth transition between different secondary succession stages make vegetation classification difficult when using traditional approaches such as the maximum likelihood classifier. Most of the time, classification distinguishes only between trees/annual or grain. It has been difficult to accurately identify cannabis mixed with other plants. In this paper, a mixed distribution models approach is applied to classify pure and mixed cannabis parcels using Worldview-2 imagery in the Lakes region of Turkey. Five different land use types (i.e. sunflower, maize, bare soil, and cannabis) were identified in the image. A constrained Gaussian mixture discriminant analysis (GMDA) was used to unmix the image. In the study, 255 reflectance ratios derived from spectral signatures of seven bands (Blue-Green-Yellow-Red-Rededge-NIR1-NIR2) were randomly divided into 80% training and 20% test data. The Gaussian mixed distribution model approach proved to be an effective and convenient way to use very high spatial resolution imagery for distinguishing cannabis vegetation. Based on the overall classification accuracies, the Gaussian mixed distribution model was found to be very successful at image classification tasks. The approach is sensitive enough to capture illegal cannabis planting areas in a large plain. It can also be used for monitoring and identification, based on spectral reflectance, in illegal cannabis planting areas.

Keywords: Gaussian mixture discriminant analysis, spectral mixture model, Worldview-2, land parcels

Procedia PDF Downloads 166
3433 Efficacy of Mixed Actinomycetes against Fusarium Wilt Caused by Fusarium oxysporum f.sp. cubense

Authors: Jesryl B. Paulite, Irene Alcantara-Papa, Teofila O. Zulaybar, Jocelyn T. Zarate, Virgie Ugay

Abstract:

Banana is one of the major fruits in the Philippines in terms of volume of production and export earnings. Philippine exports of fresh Cavendish banana ranked No. 1 with a 22% share. One major threat to the industry is Fusarium wilt caused by Fusarium oxysporum f. sp. cubense. It has been a top concern for the Philippine banana industry in Mindanao from 2002 up to the present. Because of environmental and health issues concerning the use of chemical pesticides in the control of diseases, the utilization of microorganisms has gained significance in recent years as a promising alternative. This study aims to evaluate the potential of actinomycetes to control Fusarium wilt in Cavendish banana. The in-vitro experiments were carried out in a Completely Randomized Design (CRD), while the field experiment was laid out in a Randomized Complete Block Design (RCBD) with three treatments and three replications. Actinomycetes were isolated from mangrove soils in areas of Quezon and Bataan, Philippines. A total of 199 actinomycetes were isolated, and 82 showed activity against the local Fusarium oxysporum (Foc) by agar plug assay. The three best isolates in the antagonism test (AQ6, AQ30, and AQ121) were selected, inhibiting Foc by 21.0 mm, 22.0 mm and 20.5 mm, respectively. The same actinomycetes also inhibited Foc Tropical Race 4 well, showing zones of inhibition of 24.6 mm, 20.2 mm and 19.0 mm, respectively, by agar plug assay. Combinations of the three isolates yielded an inhibition of 13.5 mm by cup cylinder assay. These findings led to the formulation of the mixed actinomycetes as biocontrol agents against Foc. A field experiment was conducted to evaluate the formulated mixed actinomycetes against Foc in a Foc-infested field in Kinamayan, Sto Tomas, Davao Del Norte, Philippines. Preventive application of the mixed actinomycetes against Foc showed promising results.
A 56.66% mortality was observed in the control set-up (no biocontrol agent added) compared to 33.33% mortality with the preventive method. Further validation of the effectiveness of the mixed actinomycetes as a biocontrol agent is presently being conducted in Asuncion, Davao Del Norte, Philippines.

Keywords: actinomycetes, biocontrol agents, cavendish banana, Fusarium oxysporum f. sp. cubense

Procedia PDF Downloads 555
3432 Stock Prediction and Portfolio Optimization Thesis

Authors: Deniz Peksen

Abstract:

This thesis aims to predict the trend movement of stock closing prices and to maximize portfolio returns by utilizing the predictions. In this context, the study aims to define a stock portfolio strategy from models created using Logistic Regression, Gradient Boosting, and Random Forest. Recently, predicting the trend of stock prices has gained a significant role in making buy and sell decisions and in generating returns with investment strategies formed from machine-learning-based decisions. There are plenty of studies in the literature on the prediction of stock prices in capital markets using machine learning methods, but most of them focus on closing prices instead of the direction of the price trend. Our study differs from the literature in terms of target definition: ours is a classification problem focusing on the market trend over the next 20 trading days. To predict trend direction, fourteen years of data were used for training, the following three years for validation, and the last three years for testing. Training data are between 2002-06-18 and 2016-12-30. Validation data are between 2017-01-02 and 2019-12-31. Testing data are between 2020-01-02 and 2022-03-17. We chose the Hold Stock Portfolio, the Best Stock Portfolio, and the USD-TRY exchange rate as the benchmarks to outperform, and compared the return of our machine-learning-based portfolio on the test data with the returns of these benchmarks. We assessed model performance with the ROC-AUC score and lift charts, and used grid search to fine-tune the hyper-parameters of the Logistic Regression, Gradient Boosting, and Random Forest models. As a result of the empirical study, the uptrends and downtrends of five stocks could not be predicted by the models. When these predictions are used to define buy and sell decisions in order to generate a model-based portfolio, the model-based portfolio fails on the test dataset: model-based buy and sell decisions generated a stock portfolio strategy whose returns cannot outperform non-model portfolio strategies. We found that any effort to predict a trend formulated on stock price is a challenge, and our results agree with the Random Walk Theory, which claims that stock prices or price changes are unpredictable. Although we built several good models on the validation dataset, our model iterations failed on the test dataset. We also found that complex models provided no additional performance compared with Logistic Regression; using a more complex model is not an answer to the stock prediction problem. Our approach was to predict the trend instead of the price, which converted the problem into classification. However, this labeling approach did not solve the stock prediction problem, nor did it refute the Random Walk Theory for stock prices.
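The target definition and the date-based split described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis code: the split boundaries come from the abstract, but the labeling rule (close 20 trading days ahead is higher than today's close) and the function names `trend_labels` and `split_of` are assumptions.

```python
from datetime import date

# Split boundaries taken from the abstract.
TRAIN_END = date(2016, 12, 30)
VALID_END = date(2019, 12, 31)

HORIZON = 20  # trading days, as stated in the abstract


def trend_labels(closes, horizon=HORIZON):
    """Label each day 1 if the close `horizon` trading days later is higher, else 0.

    Assumed rule: the abstract does not say how the trend is defined.
    The last `horizon` days have no future price and are labeled None.
    """
    labels = []
    for i, price in enumerate(closes):
        if i + horizon < len(closes):
            labels.append(1 if closes[i + horizon] > price else 0)
        else:
            labels.append(None)
    return labels


def split_of(day):
    """Assign a trading day to a split by date only, never by shuffling."""
    if day <= TRAIN_END:
        return "train"
    if day <= VALID_END:
        return "validation"
    return "test"
```

Splitting by date keeps every validation and test observation strictly later than the training data, which avoids leaking future prices into the model.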

Keywords: stock prediction, portfolio optimization, data science, machine learning

Procedia PDF Downloads 57
3431 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. The phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability, using the Diabetes Health Indicators Dataset from Kaggle as the research data. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
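For reference, the five metrics named in this abstract can be computed for a binary classifier as in the sketch below. The function names and the toy inputs are illustrative assumptions; the abstract does not describe the authors' implementation.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from 0/1 labels and 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC-ROC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Requires at least one positive and one negative.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The first four metrics depend on a decision threshold, whereas AUC-ROC is computed from the raw scores and summarizes ranking quality across all thresholds.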

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 45
3430 Electrocardiogram-Based Heartbeat Classification Using Convolutional Neural Networks

Authors: Jacqueline Rose T. Alipo-on, Francesca Isabelle F. Escobar, Myles Joshua T. Tan, Hezerul Abdul Karim, Nouar Al Dahoul

Abstract:

Electrocardiogram (ECG) signal analysis and processing are crucial in the diagnosis of cardiovascular diseases, which are considered one of the leading causes of mortality worldwide. However, the traditional rule-based analysis of large volumes of ECG data is time-consuming, labor-intensive, and prone to human error. With the advancement of the programming paradigm, algorithms such as machine learning have been increasingly used to analyze ECG signals. In this paper, various deep learning algorithms were adapted to classify five classes of heartbeat types. The dataset used in this work is the synthetic MIT-BIH Arrhythmia dataset produced from generative adversarial networks (GANs). Various deep learning models, such as a ResNet-50 convolutional neural network (CNN), a 1-D CNN, and long short-term memory (LSTM), were evaluated and compared. ResNet-50 was found to outperform the other models in terms of recall and F1 score, with five-fold average scores of 98.88% and 98.87%, respectively. The 1-D CNN, on the other hand, was found to have the highest average precision of 98.93%.

Keywords: heartbeat classification, convolutional neural network, electrocardiogram signals, generative adversarial networks, long short-term memory, ResNet-50

Procedia PDF Downloads 91
3429 Leveraging Natural Language Processing for Legal Artificial Intelligence: A Longformer Approach for Taiwanese Legal Cases

Authors: Hsin Lee, Hsuan Lee

Abstract:

Legal artificial intelligence (LegalAI) has seen increasing application within legal systems, propelled by advancements in natural language processing (NLP). Compared with general documents, legal case documents are typically long text sequences with intrinsic logical structures. Most existing language models have difficulty understanding the long-distance dependencies between different structures. Another unique challenge is that, while the Judiciary of Taiwan has released legal judgments from various levels of courts over the years, the lack of labeled datasets remains a significant obstacle. This deficiency makes it difficult to train models with strong generalization capabilities and to evaluate model performance accurately. To date, models in Taiwan have yet to be specifically trained on judgment data. Given these challenges, this research proposes a Longformer-based pre-trained language model explicitly devised for retrieving similar judgments in Taiwanese legal documents. This model is trained on a self-constructed dataset, which this research has independently labeled to measure judgment similarities, thereby addressing the void left by the lack of an existing labeled dataset for Taiwanese judgments. This research adopts strategies such as early stopping and gradient clipping to prevent overfitting and manage gradient explosion, respectively, thereby enhancing the model's performance. The model is evaluated using both the dataset and the Average Entropy of Offense-charged Clustering (AEOC) metric, which utilizes the notion of similar case scenarios within the same type of legal cases. Our experimental results illustrate our model's significant advancements in handling similarity comparisons within extensive legal judgments. By enabling more efficient retrieval and analysis of legal case documents, our model holds the potential to facilitate legal research, aid legal decision-making, and contribute to the further development of LegalAI in Taiwan.
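The two training safeguards named in this abstract, gradient clipping and early stopping, can be illustrated in a framework-free way. The sketch below is not the authors' training code: `clip_by_global_norm`, `EarlyStopping`, and the `patience` parameter are hypothetical names, and clipping by the global L2 norm is one common variant that the abstract does not confirm.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


class EarlyStopping:
    """Signal a stop once validation loss fails to improve `patience` times in a row."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            # New best validation loss: reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, the gradients would be clipped before each optimizer step, and `should_stop` would be called once per epoch with the validation loss.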

Keywords: legal artificial intelligence, computation and language, language model, Taiwanese legal cases

Procedia PDF Downloads 47
3428 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of miRNA data is proposed. Due to the variety of cancers and the high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features relative to the number of samples is high, and the data are imbalanced. A feature selection method is used to select the features that best distinguish the classes and to eliminate obscure features. Afterward, a Convolutional Neural Network (CNN) classifier is used to classify cancer types, with a Genetic Algorithm employed to optimize the hyper-parameters of the CNN. To make classification by the CNN faster, a Graphics Processing Unit (GPU) is recommended for performing the mathematical calculations in parallel. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 90
3427 Establishing a Microbial Co-Culture for Production of Cellulases Using Banana (Musa Paradisiaca) Pseudostem

Authors: Mulanga Luscious Mulaudzi, Ignatious Ncube

Abstract:

In nature, enzymatic degradation of lignocellulose is more efficient compared to in vivo bioprocessing. Thus, a co-culture should enable production of more efficient enzyme preparations that would mimic the natural decomposition of lignocellulose. The aim of the study was to establish a microbial co-culture for the production of highly active cellulase preparations. The objectives were to determine the use of a variety of culture media to isolate cellulose-degrading microorganisms from decomposing banana pseudostem and to optimize production of cellulase by co-cultures of microorganisms producing high levels of cellulase. Screening of fungal isolates was done on carboxymethylcellulose agar plates, which were stained with Congo red to show the hydrolytic activity of the isolates. Co-cultures and mixed cultures of these microorganisms were cultured using Mendel salts with Avicel™ as the carbon source. Cultures were incubated at 30 °C with shaking at 200 rpm for 240 h. Enzyme activity assays were performed to determine endoglycosidase and β-glucosidase. The mixed culture of fungi and dead bacterial cells proved to be the best co-culture/mixed culture for producing higher levels of cellulase activity in submerged fermentations (SmF) using Avicel™ as a carbon source. The study concludes that the use of microorganism 5A in co-cultures is highly recommended for producing high amounts of β-glucosidases, regardless of the combination used.

Keywords: avicel, co-culture, submerged fermentation, pseudostem

Procedia PDF Downloads 104
3426 Supplier Selection by Bi-Objectives Mixed Integer Program Approach

Authors: K.-H. Yang

Abstract:

Many excellent research studies have been conducted on topics related to supplier selection. Because the factors considered in supplier selection are complicated and difficult to quantify, most researchers deal with supplier selection issues using qualitative approaches. Compared to qualitative approaches, quantitative approaches are less applicable in the real world. This study applies a quantitative approach to a supplier selection problem that considers operation cost and delivery reliability. Based on those factors, this study applies the Normalized Normal Constraint Method to solve the bi-objective mixed integer program of the supplier selection problem.
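To make the bi-objective trade-off concrete, the sketch below enumerates supplier pairs and keeps the non-dominated ones, which form the Pareto front between cost and reliability. It is deliberately a brute-force enumeration on a toy instance, not the Normalized Normal Constraint Method and not a mixed integer programming solver. The supplier data, the rule of selecting exactly two suppliers, and the use of the minimum reliability as the reliability objective are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical suppliers: name -> (operation cost, delivery reliability in [0, 1]).
SUPPLIERS = {
    "A": (10, 0.90),
    "B": (14, 0.97),
    "C": (8, 0.80),
    "D": (12, 0.85),
}


def evaluate(selection):
    """Return (total cost, worst-case reliability) for a tuple of supplier names."""
    cost = sum(SUPPLIERS[s][0] for s in selection)
    reliability = min(SUPPLIERS[s][1] for s in selection)  # assumed aggregation
    return cost, reliability


def dominates(a, b):
    """True if a is no worse than b on both objectives and differs from b."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def pareto_front(k=2):
    """Enumerate every selection of k suppliers and keep the non-dominated ones."""
    candidates = {sel: evaluate(sel)
                  for sel in combinations(sorted(SUPPLIERS), k)}
    return {sel: val for sel, val in candidates.items()
            if not any(dominates(other, val) for other in candidates.values())}
```

On this toy instance, `pareto_front()` keeps the pairs A+B, A+C, and A+D. Enumeration is only feasible for very small instances; methods such as the one named in the abstract exist to generate the front without enumerating every solution.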

Keywords: bi-objectives MIP, normalized normal constraint method, supplier selection, quantitative approach

Procedia PDF Downloads 386
3425 High-Performance Non-aqueous Organic Redox Flow Battery in Ambient Condition

Authors: S. K. Mohapatra, K. Ramanujam, S. Sankararaman

Abstract:

Redox flow batteries (RFBs) are a preferred energy storage option for grid stabilisation and energy arbitrage because they decouple energy and power. In contrast to aqueous RFBs (ARFBs), non-aqueous RFBs (NARFBs) could offer high energy densities due to the wider electrochemical window of the solvents used, which can handle high- and low-voltage organic redox couples without undergoing electrolysis. In this study, an RFB based on benzyl viologen hexafluorophosphate [BV(PF₆)₂] as anolyte and N-hexyl phenothiazine [HPT] as catholyte is demonstrated. A cell operated with a mixed electrolyte (1:1) containing 0.2 M [BV(PF₆)₂] and 0.2 M [HPT] delivered a coulombic efficiency (CE) of 95.3% and an energy efficiency (EE) of 53%, with nearly 68.9% material utilisation at a current density of 40 mA cm⁻².
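The reported efficiencies can be related through the usual definitions: CE is the ratio of discharge to charge capacity, EE is the ratio of discharge to charge energy, and EE = CE × VE, where VE is the voltage efficiency. The sketch below uses that relation to back out the VE implied by the reported figures. The function names are illustrative, and the implied VE is a derived estimate, not a value stated in the abstract.

```python
def coulombic_efficiency(q_discharge, q_charge):
    """CE: charge delivered on discharge divided by charge stored on charge."""
    return q_discharge / q_charge


def energy_efficiency(e_discharge, e_charge):
    """EE: energy delivered on discharge divided by energy supplied on charge."""
    return e_discharge / e_charge


def voltage_efficiency(ce, ee):
    """VE implied by the relation EE = CE * VE."""
    return ee / ce


# Values reported in the abstract, expressed as fractions.
implied_ve = voltage_efficiency(ce=0.953, ee=0.53)
print(f"Implied voltage efficiency: {implied_ve:.1%}")
```

Under these definitions the implied voltage efficiency is roughly 56%, which suggests that voltage losses rather than charge losses dominate the gap between CE and EE for this cell.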

Keywords: non-aqueous redox flow battery, benzyl viologen, N-hexyl phenothiazine, mixed electrolyte

Procedia PDF Downloads 53
3424 Disaggregation the Daily Rainfall Dataset into Sub-Daily Resolution in the Temperate Oceanic Climate Region

Authors: Mohammad Bakhshi, Firas Al Janabi

Abstract:

High-resolution rainfall data are very important as input to hydrological models. Among the methods for generating high-resolution rainfall data, temporal disaggregation was chosen for this study. The paper attempts to generate three different rainfall resolutions (4-hourly, hourly, and 10-minute) from daily data for a record period of around 20 years. The process was carried out with the DiMoN tool, which is based on a random cascade model and the method of fragments. Differences between the observed and simulated rainfall datasets are evaluated with a variety of statistical and empirical methods: the Kolmogorov-Smirnov (K-S) test, usual statistics, and exceedance probability. The tool worked well at preserving the daily rainfall values on wet days; however, the generated data are accumulated in a shorter time period and produce stronger storms. The difference between the generated and observed cumulative distribution function curves of the 4-hourly datasets passed the K-S test criteria, while for the hourly and 10-minute datasets the P-value had to be employed to show that the differences were reasonable. The results are encouraging considering the overestimation of generated high-resolution rainfall data.
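The two-sample Kolmogorov-Smirnov comparison mentioned in this abstract rests on one statistic: the largest vertical gap between the empirical cumulative distribution functions of the observed and generated series. A minimal sketch of that statistic follows. The function name and the sample values in the usage note are illustrative, and the significance test (critical value or P-value) is left out.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    gap = 0.0
    # Both empirical CDFs are step functions that only change at data points,
    # so checking every observed value is enough to find the maximum gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap
```

For example, `ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])` returns 0.5. The statistic is 0 when the two empirical distributions coincide and 1 when the samples do not overlap at all.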

Keywords: DiMoN Tool, disaggregation, exceedance probability, Kolmogorov-Smirnov test, rainfall

Procedia PDF Downloads 183
3423 Numerical and Experimental Investigation of Mixed-Mode Fracture of Cement Paste and Interface Under Three-Point Bending Test

Authors: S. Al Dandachli, F. Perales, Y. Monerie, F. Jamin, M. S. El Youssoufi, C. Pelissou

Abstract:

The goal of this research is to study the fracture process and mechanical behavior of concrete under I–II mixed-mode stress, which is essential for ensuring the safety of concrete structures. For this purpose, two-dimensional simulations of three-point bending tests under variable load and geometry, on notched cement paste samples and composite samples (cement paste/siliceous aggregate), are performed using Cohesive Zone Models (CZMs). Experimental validation of these tests shows that the CZM is able to predict fracture propagation at the local scale.

Keywords: cement paste, interface, cohesive zone model, fracture, three-point flexural test bending

Procedia PDF Downloads 112