Search results for: imbalance dataset
809 Topic-to-Essay Generation with Event Element Constraints
Authors: Yufen Qin
Abstract:
Topic-to-essay generation is a challenging task in natural language processing that aims to generate novel, diverse, and topic-related text based on user input. Previous research has overlooked the generation of articles under the constraints of event elements, resulting in issues such as incomplete event elements and logical inconsistencies in the generated results. To fill this gap, this paper proposes an event-constrained approach for topic-to-essay generation that enforces the completeness of event elements during the generation process. Additionally, a language model is employed to verify the logical consistency of the generated results. Experimental results demonstrate that the proposed model achieves a better BLEU-2 score and performs better than the baseline in terms of subjective evaluation on a real dataset, indicating its capability to generate higher-quality topic-related text.
Keywords: event element, language model, natural language processing, topic-to-essay generation
Procedia PDF Downloads 236
808 The Analysis of Changes in Urban Hierarchy of Isfahan Province in the Fifty-Year Period (1956-2006)
Authors: Hamidreza Joudaki, Yousefali Ziari
Abstract:
The emergence of cities and urbanism is one of the important processes that have affected social communities. Industrialization and urbanism have developed alongside each other throughout history, and they have had a simple relationship for more than six thousand years, that is, since the appearance of the first cities. In the 18th century, with the rise of industrial capitalism, urbanism developed progressively around the world. In Iran, the city of each region made its own decisions; the regional capital (downtown) was the only central place and, as the regional city, controlled its realm without any hierarchy. However, this method of governance has changed over the last three decades because of political, social and economic shifts that have altered the relationship between rural and urban areas. These shifts have also changed the functional variety of cities and the systematic urban network in Iran. Today, the urban system is highly imbalanced in both space and function. In Isfahan, the trend of urbanism is similar to that in other parts of Iran, and the urban hierarchy is neither suitable nor normal. This article is quantitative and analytical. The statistical population consists of the cities of Isfahan Province, and the changes in the urban network and its hierarchy over a fifty-year period (1956-2006) are surveyed. The data are analyzed using the rank-size model and the entropy index. The article also introduces the cities of Iran, the entropy factor of the primate city, and the urban hierarchy of Isfahan Province. The share of urban residents in the province rose from 55% to 83% (2006). The analysis shows mismatch and imbalance between cities: the entropy index was 0.91 in 1956 and decreased to 0.63 in 2006. Isfahan city is the primate city throughout the whole period. Moreover, the second and third cities show a population gap relative to the other cities, and the cities do not follow the rank-size rule.
Keywords: urban network, urban hierarchy, primate city, Isfahan province, urbanism, first cities
Procedia PDF Downloads 258
807 Landcover Mapping Using Lidar Data and Aerial Image and Soil Fertility Degradation Assessment for Rice Production Area in Quezon, Nueva Ecija, Philippines
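The entropy index mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so the version below is an assumption: Shannon entropy of city population shares, normalized by ln(n) so that 1 means perfectly even city sizes and values near 0 mean concentration in one primate city. The function name and the population figures are hypothetical.

```python
import math


def normalized_entropy(populations):
    """Shannon entropy of population shares, divided by ln(n).

    Assumed definition (not stated in the abstract): returns a value in
    [0, 1], where 1 means all cities have equal population.
    """
    total = sum(populations)
    # Zero-population entries contribute nothing to the entropy.
    shares = [p / total for p in populations if p > 0]
    if len(shares) < 2:
        return 0.0
    entropy = -sum(s * math.log(s) for s in shares)
    return entropy / math.log(len(shares))


# Hypothetical city populations, for illustration only.
balanced = [400, 300, 300]
primate_dominated = [1000, 10, 10]
print(normalized_entropy(balanced) > normalized_entropy(primate_dominated))
```

Under this definition, a falling index over time indicates growing concentration of population in the largest city.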
Authors: Eliza. E. Camaso, Guiller. B. Damian, Miguelito. F. Isip, Ronaldo T. Alberto
Abstract:
Land-cover maps are important for many scientific, ecological and land management purposes. Over the last decades, a rapid decrease in soil fertility has been observed as a result of land use practices such as rice cultivation. High-precision land-cover maps, which are important for economic management, are not yet available for the area. Remote sensing is a very suitable tool for accurate land-cover mapping and for automatic land use and land-cover detection. The study not only provides high-precision land-cover maps but also estimates the rice production area that has undergone chemical degradation due to fertility decline. Land cover was delineated and classified into pre-defined classes to achieve proper feature detection. After the land-cover map was generated, a soil fertility degradation assessment was created for the rice production area, where rice cultivation is highly intensive, to assess the impact on soils used in agricultural production. Using simple spatial analysis functions and ArcGIS, the land-cover map of the Municipality of Quezon in Nueva Ecija, Philippines was overlaid on the fertility decline maps from the Land Degradation Assessment Philippines - Bureau of Soils and Water Management (LADA-Philippines-BSWM) to determine the area of rice crops most likely to have nitrogen, phosphorus, zinc and sulfur deficiencies induced by high doses of urea and imbalanced N:P fertilization. The results show that 80.00% of fallow and 99.81% of the rice production area have high soil fertility decline.
Keywords: aerial image, landcover, LiDAR, soil fertility degradation
Procedia PDF Downloads 252
806 A Context-Sensitive Algorithm for Media Similarity Search
Authors: Guang-Ho Cha
Abstract:
This paper presents a context-sensitive media similarity search algorithm. One of the central problems in media search is the semantic gap between the low-level features computed automatically from media data and the human interpretation of them. This is because the notion of similarity is usually based on high-level abstraction, but the low-level features sometimes do not reflect human perception. Many media search algorithms have used the Minkowski metric to measure similarity between image pairs. However, those functions cannot adequately capture the characteristics of the human visual system or the nonlinear relationships in the contextual information given by images in a collection. Our search algorithm tackles this problem by employing a similarity measure and a ranking strategy that reflect the nonlinearity of human perception and the contextual information in a dataset. Similarity search in an image database based on this contextual information shows encouraging experimental results.
Keywords: context-sensitive search, image search, similarity ranking, similarity search
Procedia PDF Downloads 365
805 Automated Prediction of HIV-associated Cervical Cancer Patients Using Data Mining Techniques for Survival Analysis
Authors: O. J. Akinsola, Yinan Zheng, Rose Anorlu, F. T. Ogunsola, Lifang Hou, Robert Leo-Murphy
Abstract:
Cervical cancer (CC) is the second most common cancer among women living in low- and middle-income countries, with no associated symptoms during its formative periods. With advances in innovative medical research, numerous preventive measures are being used, but the incidence of cervical cancer cannot be curtailed by screening tests alone. The mortality associated with invasive cervical cancer can be nipped in the bud through early-stage detection. This study applied an array of top feature selection techniques with the aim of developing a model that could validly identify the risk factors of cervical cancer. A retrospective clinic-based cohort study was conducted on 178 HIV-associated cervical cancer patients at Lagos University Teaching Hospital, Nigeria (U54 data repository) in April 2022. The outcome measure was the automated prediction of HIV-associated cervical cancer cases, while the predictor variables included demographic information, reproductive history, birth control, sexual history, and cervical cancer screening history for invasive cervical cancer. The proposed technique was implemented in R and Python, using classification algorithms for the detection and diagnosis of cervical cancer. Four machine learning classification algorithms were used: Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbor (KNN). The dataset was split into training and testing sets in an 80:20 ratio. The numerical features were standardized, and hyperparameter tuning was carried out when training and testing the models. Suitable features for the detection and diagnosis of cervical cancer were selected from the characteristics in the dataset using several selection methods, for classification into healthy or diseased status. The mean age of patients was 49.7±12.1 years, mean age at pregnancy was 23.3±5.5 years, mean age at first sexual experience was 19.4±3.2 years, and the mean BMI was 27.1±5.6 kg/m2. Most patients were married (62.9%), and most had at least two sexual partners (72.5%). Age of patients (OR=1.065, p<0.001**), marital status (OR=0.375, p=0.011**), number of pregnancy live-births (OR=1.317, p=0.007**), and use of birth control pills (OR=0.291, p=0.015**) were found to be significantly associated with HIV-associated cervical cancer. Using the top 10 features (variables) considered in the analysis, RF gave the best overall model performance, with an accuracy of 72.0%, precision of 84.6%, recall of 84.6% and F1-score of 74.0%, while LR had an accuracy of 74.0%, precision of 70.0%, recall of 70.0% and F1-score of 70.0%. The RF model identified 10 features predictive of developing cervical cancer. The age of patients was the most important risk factor, followed by the number of pregnancy live-births, marital status, and use of birth control pills. The study shows that data mining techniques could be used to identify women living with HIV at high risk of developing cervical cancer in Nigeria and other sub-Saharan African countries.
Keywords: associated cervical cancer, data mining, random forest, logistic regression
Procedia PDF Downloads 83
804 FLEX: A Backdoor Detection and Elimination Method in Federated Scenario
Authors: Shuqi Zhang
Abstract:
Federated learning allows users to participate in collaborative model training without sending data to third-party servers, reducing the risk of user data privacy leakage, and is widely used in smart finance and smart healthcare. However, the distributed architecture of federated learning itself and the existence of secure aggregation protocols make it inherently vulnerable to backdoor attacks. To solve this problem, the federated learning backdoor defense framework FLEX, based on group aggregation, cluster analysis, and neuron pruning, is proposed, and compatibility with secure aggregation protocols is achieved. The performance of FLEX is verified by building a horizontal federated learning framework on the CIFAR-10 dataset for experiments; FLEX achieves a 98% success rate of backdoor detection and reduces the success rate of backdoor tasks to 0% ~ 10%.
Keywords: federated learning, secure aggregation, backdoor attack, cluster analysis, neuron pruning
Procedia PDF Downloads 96
803 Generating Music with More Refined Emotions
Authors: Shao-Di Feng, Von-Wun Soo
Abstract:
Generating symbolic music with specific emotions is a challenging task because symbolic music datasets with emotion labels are scarce and incomplete. This research aims to generate more refined emotions based on training datasets that are labeled only with the four quadrants of Russell's 2D emotion model. We focus on the theory of Music Fadernet, map arousal and valence to low-level attributes, and build a symbolic music generation model by combining a transformer and GM-VAE. We adopt an in-attention mechanism for the model and improve it by allowing modulation by conditional information. We show that the music generation model can control the generation of music according to the emotions specified by users in terms of high-level linguistic expressions and by manipulating the corresponding low-level musical attributes. Finally, we evaluate the model performance using a pre-trained emotion classifier against a pop piano MIDI dataset called EMOPIA, and by subjective listening evaluation we demonstrate that the model can correctly generate music with more refined emotions.
Keywords: music generation, music emotion controlling, deep learning, semi-supervised learning
Procedia PDF Downloads 89
802 Determining Antecedents of Employee Turnover: A Study on Blue Collar vs White Collar Workers on Marco Level
Authors: Evy Rombaut, Marie-Anne Guerry
Abstract:
Predicting voluntary turnover of employees is an important topic of study, both in academia and industry. Researchers try to uncover determinants for a broader understanding and possible prevention of turnover. In the current study, we use a data set based approach to reveal determinants for turnover, differing for blue and white collar workers. Our data set based approach made it possible to study actual turnover for more than 500000 employees in 15692 Belgian corporations. We use logistic regression to calculate individual turnover probabilities and test the goodness of our model with the AUC (area under the ROC-curve) method. The results of the study confirm the relationship of known determinants to employee turnover such as age, seniority, pay and work distance. In addition, the study unravels unknown and verifies known differences between blue and white collar workers. It shows opposite relationships to turnover for gender, marital status, the number of children, nationality, and pay.
Keywords: employee turnover, blue collar, white collar, dataset analysis
Procedia PDF Downloads 291
801 A Mutually Exclusive Task Generation Method Based on Data Augmentation
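The AUC check described in this abstract can be sketched in a few lines. This is an illustrative implementation only, not the authors' code: it computes the area under the ROC curve as the fraction of (leaver, stayer) pairs in which the leaver received the higher predicted probability, counting ties as one half. The function name and the toy labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive case (e.g. the employee left), 0 otherwise.
    scores: predicted probabilities, e.g. from a logistic regression.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy data, for illustration only.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of cases, so for a data set of the size reported here a rank-based formulation would be the practical choice.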
Authors: Haojie Wang, Xun Li, Rui Yin
Abstract:
To solve memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth in computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve memorization overfitting in the MAML meta-learning algorithm.
Keywords: mutex task generation, data augmentation, meta-learning, text classification
Procedia PDF Downloads 143
800 Evolution under Length Constraints for Convolutional Neural Networks Architecture Design
Authors: Ousmane Youme, Jean Marie Dembele, Eugene Ezin, Christophe Cambier
Abstract:
In recent years, convolutional neural network (CNN) architectures designed by evolutionary algorithms have proven to be competitive with handcrafted architectures designed by experts. However, these algorithms need a lot of computational power, which is beyond the capabilities of most researchers and engineers. To overcome this problem, we propose an evolutionary architecture search under length constraints. It consists of two algorithms: a length search strategy to find an optimal space and an architecture search strategy based on a genetic algorithm to find the best individual in the optimal space. Our algorithms drastically reduce resource costs while keeping good performance. On the CIFAR-10 dataset, our framework achieves an error rate of 5.12% and needs only 4.6 GPU days to converge to the optimal individual, 22 GPU days fewer than the lowest-cost automatic evolutionary algorithm in the peer competition.
Keywords: CNN architecture, genetic algorithm, evolution algorithm, length constraints
Procedia PDF Downloads 128
799 Printed Thai Character Recognition Using Particle Swarm Optimization Algorithm
Authors: Phawin Sangsuvan, Chutimet Srinilta
Abstract:
This paper presents an application of the Particle Swarm Optimization (PSO) method to Thai optical character recognition (OCR). OCR consists of pre-processing, character recognition and post-processing. Before entering the recognition process, each character must be prepared by the pre-processing step. PSO is an optimization method that belongs to the swarm intelligence family and is based on imitating the social behavior patterns of animals. The route of each particle is determined by its own data and that of neighboring particles. The interaction of particles with their neighbors is the advantage that lets Particle Swarm determine the best solution. PSO has therefore attracted many researchers working on difficult problems, including character recognition. As in the previous work, this research used a projection histogram to extract the features of printed digits and defined a simple fitness function for PSO. The results reveal that PSO achieves 67.73% on the testing dataset. Future work can explore improving the performance of PSO by improving the fitness function.
Keywords: character recognition, histogram projection, particle swarm optimization, pattern recognition techniques
Procedia PDF Downloads 477
798 A Comparative Assessment Method For Map Alignment Techniques
Authors: Rema Daher, Theodor Chakhachiro, Daniel Asmar
Abstract:
In the era of autonomous robot mapping, assessing the goodness of the generated maps is important and is usually performed by aligning them to ground truth. Map alignment is difficult for two reasons: first, the query maps can be significantly distorted from ground truth, and second, establishing what constitutes ground truth for different settings is challenging. Most map alignment techniques to date have addressed the first problem while paying too little attention to the second. In this paper, we propose a benchmark dataset, which consists of synthetically transformed maps with their corresponding displacement fields. Furthermore, we propose a new system for comparison, where the displacement field of any map alignment technique can be computed and compared to the ground truth using statistical measures. The local information in displacement fields renders the evaluation system applicable to any alignment technique, whether it is linear or not. In our experiments, the proposed method was applied to different alignment methods from the literature, allowing for a comparative assessment between them all.
Keywords: assessment methods, benchmark, image deformation, map alignment, robot mapping, robot motion
Procedia PDF Downloads 117
797 Data-Centric Anomaly Detection with Diffusion Models
Authors: Sheldon Liu, Gordon Wang, Lei Liu, Xuefeng Liu
Abstract:
Anomaly detection, also referred to as one-class classification, plays a crucial role in identifying product images that deviate from the expected distribution. This study introduces Data-centric Anomaly Detection with Diffusion Models (DCADDM), presenting a systematic strategy for data collection and further diversifying the data with image generation via diffusion models. The algorithm addresses data collection challenges in real-world scenarios and points toward data augmentation with the integration of generative AI capabilities. The paper explores the generation of normal images using diffusion models. The experiments demonstrate that with 30% of the original normal image size, modeling in an unsupervised setting with state-of-the-art approaches can achieve equivalent performances. With the addition of generated images via diffusion models (10% equivalence of the original dataset size), the proposed algorithm achieves better or equivalent anomaly localization performance.
Keywords: diffusion models, anomaly detection, data-centric, generative AI
Procedia PDF Downloads 82
796 Diversity and Phylogenetic Placement of Seven Inocybe (Inocybaceae, Fungi) from Benin
Authors: Hyppolite Aignon, Souleymane Yorou, Martin Ryberg, Anneli Svanholm
Abstract:
Climate change and human actions cause the extinction of wild mushrooms. In Benin, the diversity of fungi is large and may still include species new to science, but the inventory effort remains low and focuses mainly on edible species (Russula, Lactarius, Lactifluus, and also Amanita). In addition, inventories started only recently and some groups of fungi are not sufficiently sampled; meanwhile, the degradation of fungal habitat continues to increase and some species are already disappearing (Yorou and De Kesel, 2011), while others may disappear without being known. The overlooked genus Inocybe has a worldwide distribution and includes more than 700 species, with many undiscovered or poorly known species worldwide and particularly in tropical Africa. It is therefore important to orient the inventory toward other important genera or families such as Inocybe (Fungi, Agaricales) in order to highlight their diversity and to establish their phylogenetic positions using a combination of gene regions. This study aims to evaluate the species richness and phylogenetic position of Inocybe species and affiliated taxa in West Africa. In North Benin, we visited the Forest Reserve of Ouémé Supérieur, the Okpara forest and the Alibori Supérieur Forest Reserve. In the center, we targeted the Forest Reserve of Toui-Kilibo. The surveys were carried out during the rainy season in the study area, that is, from June to October. A total of 24 taxa were collected, photographed and described. DNA was extracted, and polymerase chain reaction was carried out using the primers ITS1-F and ITS4-B for the internal transcribed spacer (ITS); LROR, LWRB, LR7 and LR5 for the nuclear ribosomal large subunit (LSU); and RPB2-f5F, RPB2-b6F, RPB2-b6R2 and RPB2-b7R for the RNA polymerase II gene (RPB2). The products were then sequenced.
The ITS sequences of the 24 collections of Inocybaceae were edited in Staden, and all the sequences were aligned and edited with Aliview v1.17. The sequences were examined by eye for sufficient similarity to be considered the same species. Thirteen different species were present in the collections. In addition, sequences similar to the ITS sequences of the thirteen final species were searched for using BLAST. The nLSU and RPB2 markers for these species were inserted into a complete alignment in which species from all major Inocybaceae clades and from all continents except Antarctica are present. Our new sequences for nLSU and RPB2 were manually aligned in this dataset. Phylogenetic analysis was performed using the RAxML v7.2.6 maximum likelihood software. Bootstrap replications were set to 100, and no partitioning of the dataset was performed. The resulting tree was viewed and edited with FigTree v1.4.3. The preliminary maximum likelihood tree shows that the species from Benin are highly diverse and are distributed in four different clades (Inosperma, Inocybe, Mallocybe and Pseudosperma) out of the seven clades of Inocybaceae, but the phylogenetic position of seven is currently known. This study documents the diversity of Inocybe in Benin; the investigations will continue, and a protection plan will be developed in the coming years.
Keywords: Benin, diversity, Inocybe, phylogeny placement
Procedia PDF Downloads 149
795 Investigating the Impacts of Climate Change on Soil Erosion: A Case Study of Kasilian Watershed, Northern Iran
Authors: Mohammad Zare, Mahbubeh Sheikh
Abstract:
Many of the impacts of climate change will materialize through changes in soil erosion, which have rarely been addressed in Iran. This paper presents an investigation of the impacts of climate change on soil erosion for the Kasilian basin. LARS-WG5 was used to downscale the IPCM4 and GFCM21 predictions of the A2 scenario for the projected periods of 1985-2030 and 2080-2099. This analysis was carried out using the dataset of the International Centre for Theoretical Physics (ICTP) of Trieste. Soil loss was modeled using the Revised Universal Soil Loss Equation (RUSLE). Results indicate that soil erosion increases or decreases depending on which climate scenarios are considered. The potential for climate change to increase the soil loss rate in future periods was established, whereas considerable decreases in erosion are projected when land use is increased from baseline periods.
Keywords: Kasilian watershed, climatic change, soil erosion, LARS-WG5 Model, RUSLE
Procedia PDF Downloads 506
794 Clinical Validation of an Automated Natural Language Processing Algorithm for Finding COVID-19 Symptoms and Complications in Patient Notes
Authors: Karolina Wieczorek, Sophie Wiliams
Abstract:
Introduction: Patient data is often collected in Electronic Health Record (EHR) systems for purposes such as providing care and reporting data. This information can be re-used to validate data models in clinical trials or in epidemiological studies. Manual validation of automated tools is vital to pick up errors in processing and to provide confidence in the output. Mentioning a disease in a discharge letter does not necessarily mean that a patient suffers from this disease: many letters discuss a diagnostic process, different tests, or whether a patient has a certain disease. The COVID-19 dataset in this study was produced using natural language processing (NLP), with an automated algorithm that extracts information related to COVID-19 symptoms, complications, and medications prescribed within the hospital. Free-text clinical patient notes are rich sources of information containing patient data not captured in a structured form, hence the use of named entity recognition (NER) to capture additional information. Methods: Patient data (discharge summary letters) were exported and screened by an algorithm to pick up relevant terms related to COVID-19. A list of 124 Systematized Nomenclature of Medicine (SNOMED) Clinical Terms was provided in Excel with corresponding IDs. Two independent medical student researchers were provided with a dictionary of the SNOMED terms to refer to when screening the notes. They worked on two separate datasets called "A" and "B", respectively. Notes were screened to check that the correct term had been picked up by the algorithm and that negated terms had not been picked up. Results: Implementation of the algorithm in the hospital began on March 31, 2020, and the first EHR-derived extract was generated for use in an audit study on June 04, 2020.
The dataset has contributed to large, priority clinical trials (including the International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC), by bulk upload to REDCap research databases) and to local research and audit studies. Successful sharing of EHR-extracted datasets requires communicating their provenance and quality, including the completeness and accuracy of the data. The results of the validation of the algorithm were as follows: precision (0.907), recall (0.416), and F-score (0.570). The percentage enhancement with NLP-extracted terms compared to regular data extraction alone was low (0.3%) for relatively well-documented data such as previous medical history but higher (16.6%, 29.53%, 30.3%, 45.1%) for complications, presenting illness, chronic procedures, and acute procedures, respectively. Conclusions: This automated NLP algorithm is shown to be useful in facilitating patient data analysis and has the potential to be used in larger-scale clinical trials to assess potential study exclusion criteria for participants in the development of vaccines.
Keywords: automated, algorithm, NLP, COVID-19
Procedia PDF Downloads 102
793 Mobile Platform’s Attitude Determination Based on Smoothed GPS Code Data and Carrier-Phase Measurements
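The reported validation figures are internally consistent, which can be checked with the standard F-score formula. The sketch below is illustrative only; the function name and the `beta` parameter are assumptions, and the inputs are the precision and recall quoted in the abstract.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract.
print(f"{f_score(0.907, 0.416):.3f}")  # 0.570
```

The low recall dominates the harmonic mean, which is why the F-score sits much closer to 0.416 than to 0.907.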
Authors: Mohamed Ramdani, Hassen Abdellaoui, Abdenour Boudrassen
Abstract:
Approaches to mobile platform attitude estimation are mainly based on combined positioning techniques and purpose-built algorithms that aim to reach a fast and accurate solution. In this work, we describe the design and implementation of an attitude determination (AD) process using only measurements from GPS sensors. The approach is based on GPS code data smoothed with a Hatch filter and raw carrier-phase measurements, integrated into an attitude algorithm based on vector measurements using the least squares (LSQ) estimation method. A GPS dataset from a static experiment is used to investigate the effectiveness of the presented approach and to check the accuracy of the attitude estimation algorithm. Attitude results from GPS multi-antenna measurements over short baselines are presented and analyzed. The 3D accuracy of estimated attitude parameters using smoothed measurements is over 0.27°.
Keywords: attitude determination, GPS code data smoothing, hatch filter, carrier-phase measurements, least-squares attitude estimation
Procedia PDF Downloads 155
792 3D Building Model Utilizing Airborne LiDAR Dataset and Terrestrial Photographic Images
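The code smoothing step named in this abstract can be sketched with the classic Hatch filter recursion. This is a minimal illustration, not the authors' implementation: it assumes code and carrier-phase observations for one satellite, both in metres, sampled at the same epochs, with no cycle slips. The function name, the `window` parameter and the sample values are hypothetical.

```python
def hatch_filter(code, phase, window=100):
    """Carrier-smoothed code pseudoranges (classic Hatch filter).

    code:   noisy code pseudoranges in metres, one per epoch
    phase:  carrier-phase ranges in metres, same epochs (no cycle slips assumed)
    window: maximum number of epochs in the smoothing weight
    """
    if len(code) != len(phase):
        raise ValueError("code and phase must have the same length")
    smoothed = []
    for k, (p, phi) in enumerate(zip(code, phase)):
        if k == 0:
            smoothed.append(p)  # initialise with the first raw code value
            continue
        n = min(k + 1, window)
        # Propagate the previous smoothed range with the precise phase change.
        predicted = smoothed[-1] + (phi - phase[k - 1])
        smoothed.append(p / n + (n - 1) / n * predicted)
    return smoothed


# Toy data: true range grows 1 m per epoch, second code value has a 2 m error.
print(hatch_filter([100.0, 103.0, 102.0], [0.0, 1.0, 2.0]))
```

The weight on the raw code shrinks as 1/n, so code noise is averaged down while the range dynamics are carried by the carrier-phase differences.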
Authors: J. Jasmee, I. Roslina, A. Mohammed Yaziz & A.H Juazer Rizal
Abstract:
An effective building information collection method is vital to support a diversity of land development activities. At present, airborne LiDAR (Light Detection and Ranging), an advance in remote sensing, is an established technology for collecting building information, namely the location and elevation of the reflecting laser points, towards the construction of 3D building models. In this study, the use of LiDAR datasets and terrestrial photographic images of buildings for the construction of 3D building models is explored. The quantitative accuracy of the constructed 3D building model in the horizontal and vertical components was ± 0.31 m (RMSEx,y) and ± 0.145 m (RMSEz), respectively. The accuracies were computed based on sixty-nine (69) horizontal and twenty (20) vertical surveyed points. As for the qualitative assessment, the appearance of the 3D building model is shown to be adequate to support the requirements of LOD3 presentation based on the OGC (Open Geospatial Consortium) standard CityGML.
Keywords: LiDAR datasets, DSM, DTM, 3D building models
Procedia PDF Downloads 320
791 Facial Pose Classification Using Hilbert Space Filling Curve and Multidimensional Scaling
Authors: Mekamı Hayet, Bounoua Nacer, Benabderrahmane Sidahmed, Taleb Ahmed
Abstract:
Pose estimation is an important task in computer vision. Though the majority of existing solutions provide good accuracy, they are often overly complex and computationally expensive. From this perspective, we propose the use of dimensionality reduction techniques to address the problem of facial pose estimation. First, a face image is converted into a one-dimensional time series using a Hilbert space-filling curve; the approach then converts these time series data to a symbolic representation. Furthermore, a distance matrix is calculated between the symbolic series of an input learning dataset of images to generate classifiers of frontal vs. profile face pose. The proposed method is evaluated with three public datasets. Experimental results show that our approach achieves a correct classification rate exceeding 97% with the K-NN algorithm.
Keywords: machine learning, pattern recognition, facial pose classification, time series
Procedia PDF Downloads 350
790 Satellite Image Classification Using Firefly Algorithm
Authors: Paramjit Kaur, Harish Kundra
Abstract:
In the recent years, swarm intelligence based firefly algorithm has become a great focus for the researchers to solve the real time optimization problems. Here, firefly algorithm is used for the application of satellite image classification. For experimentation, Alwar area is considered to multiple land features like vegetation, barren, hilly, residential and water surface. Alwar dataset is considered with seven band satellite images. Firefly Algorithm is based on the attraction of less bright fireflies towards more brightener one. For the evaluation of proposed concept accuracy assessment parameters are calculated using error matrix. With the help of Error matrix, parameters of Kappa Coefficient, Overall Accuracy and feature wise accuracy parameters of user’s accuracy & producer’s accuracy can be calculated. Overall results are compared with BBO, PSO, Hybrid FPAB/BBO, Hybrid ACO/SOFM and Hybrid ACO/BBO based on the kappa coefficient and overall accuracy parameters.Keywords: image classification, firefly algorithm, satellite image classification, terrain classification
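The accuracy parameters listed in this abstract all follow from the error matrix. The sketch below shows the standard calculations under one stated assumption: rows hold the classified (map) classes and columns hold the reference classes, a convention the abstract does not specify. The function name and the two-class matrix are hypothetical, and every row and column is assumed to be non-empty.

```python
def accuracy_metrics(matrix):
    """Overall accuracy, kappa, user's and producer's accuracy.

    matrix[i][j] = number of samples classified as class i whose reference
    class is j (assumed convention: rows = classified, columns = reference).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = diagonal / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    users = [matrix[i][i] / row_totals[i] for i in range(n)]
    producers = [matrix[j][j] / col_totals[j] for j in range(n)]
    return overall, kappa, users, producers


# Hypothetical two-class error matrix, for illustration only.
overall, kappa, users, producers = accuracy_metrics([[45, 5], [10, 40]])
print(overall, users)
```

Kappa discounts the agreement that the marginal totals would produce by chance, which is why it is usually lower than the overall accuracy.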
Procedia PDF Downloads 400
789 A Meta Regression Analysis to Detect Price Premium Threshold for Eco-Labeled Seafood
Authors: Cristina Giosuè, Federica Biondo, Sergio Vitale
Abstract:
In recent years, consumers' awareness of environmental concerns has been increasing, and seafood eco-labels are considered a possible instrument to improve both seafood markets and sustainable fishing management. In this direction, the aim of this study was to carry out a meta-analysis, by means of a meta-regression, of consumers’ willingness to pay (WTP) for eco-labeled wild seafood. To this end, only papers published in ISI journals were searched on the “Web of Knowledge” and “SciVerse Scopus” platforms, using combinations of the following keywords: seafood, ecolabel, eco-label, willingness, WTP and premium. The dataset was built considering: paper and survey codes, year of publication, first author’s nationality, species’ taxa and family, sample size, survey continent and country, data collection (where and how), gender and age of consumers, brand and ΔWTP. The analysis clearly showed interest in eco-labeled seafood, particularly in developed countries. In general, consumers declared a greater willingness to pay than that actually applied to eco-labeled products, with differences related to taxa and brand.
Keywords: eco label, meta regression, seafood, willingness to pay
Procedia PDF Downloads 122
788 Enhancing Financial Security: Real-Time Anomaly Detection in Financial Transactions Using Machine Learning
Authors: Ali Kazemi
Abstract:
The digital evolution of financial services, while offering unprecedented convenience and accessibility, has also increased vulnerability to fraudulent activities. In this study, we introduce a distinct approach to real-time anomaly detection in financial transactions, aiming to fortify the defenses of banking and financial institutions against such threats. Utilizing unsupervised machine learning algorithms, specifically autoencoders and isolation forests, our research focuses on identifying irregular patterns indicative of fraud within transactional data, thus enabling immediate action to prevent financial loss. The data used in this study included the monetary value of each transaction, a crucial feature because fraudulent transactions may follow a different distribution of amounts than legitimate ones. It also included timestamps indicating when transactions occurred; analyzing transactions' temporal patterns can reveal anomalies (e.g., unusual activity in the middle of the night). A further feature was the sector or category of the merchant where the transaction occurred, such as retail, groceries, or online services, since specific categories may be more prone to fraud. Finally, the data included the type of payment used (e.g., credit, debit, online payment systems), as different payment methods carry different levels of fraud risk. This dataset, anonymized to ensure privacy, reflects a wide array of transactions typical of a global banking institution, ranging from small-scale retail purchases to large wire transfers, embodying the diverse nature of potentially fraudulent activities. By engineering features that capture the essence of transactions, including normalized amounts and encoded categorical variables, we tailor our data to enhance model sensitivity to anomalies. 
The autoencoder model leverages its reconstruction error mechanism to flag transactions that deviate significantly from the learned normal pattern, while the isolation forest identifies anomalies based on their susceptibility to isolation from the dataset's majority. Our experimental results, validated through techniques such as k-fold cross-validation, are evaluated using precision, recall, and the F1 score alongside the area under the receiver operating characteristic (ROC) curve. Our models achieved an F1 score of 0.85 and a ROC AUC of 0.93, indicating high accuracy in detecting fraudulent transactions without excessive false positives. This study contributes to the academic discourse on financial fraud detection and provides a practical framework for banking institutions seeking to implement real-time anomaly detection systems. By demonstrating the effectiveness of unsupervised learning techniques in a real-world context, our research offers a pathway to significantly reduce the incidence of financial fraud, thereby enhancing the security and trustworthiness of digital financial services.
Keywords: anomaly detection, financial fraud, machine learning, autoencoders, isolation forest, transactional data analysis
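For readers unfamiliar with the evaluation metrics mentioned in this abstract, the sketch below shows how precision, recall and the F1 score are computed from binary labels. It is a generic illustration under stated assumptions, not the authors' evaluation code: the function name is hypothetical, label 1 is assumed to mean "fraud", and the example labels are made up.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels where 1 marks fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against empty denominators when nothing is flagged or nothing is fraud.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels for illustration: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Because F1 is the harmonic mean of precision and recall, it stays low unless both are reasonably high, which is why it is commonly reported for rare-event problems such as fraud.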
Procedia PDF Downloads 57
787 Impact of Crime on Women and Their Families in Rural Areas of Haryana State in India
Authors: Rashmi Tyagi, Savita Vermani
Abstract:
Violence against women is the result of a long-standing power imbalance between men and women and thus seriously compromises the well-being, productivity and contribution of one half of the population. The costs incurred by the family, especially children, and by society at large in terms of physical, psychological, social and financial losses are huge. The communities native to the state of Haryana in India are primarily patriarchal, burdened with an age-old regressive mindset under socio-cultural and religious structures that discriminate against women. It was therefore important to bring to light the issues affecting women in this region, and this study focused on the consequences of crime for women victims and their families. Two hundred women were randomly selected, and the one hundred twenty of them who had been affected by some kind of violence were interviewed. Data were collected and statistically analyzed for the physical, psychological, inter-family and societal consequences of violence on these women. Women reported physical injuries, gynecological problems, unwanted pregnancies, frigidity, phobia and sexual dysfunction. 58.9% of women reported decreased work efficiency. The psychological problems encountered were anxiety, isolation, depression and suicidal tendencies: 66.7% of respondents suffered from anxiety and 65.0% faced symptoms of depression. At the family level, 40.0% of respondents felt the atmosphere was unsuitable for children, while 39.2% reported a lack of interaction. The societal consequences reported were a breakdown of interaction with friends and family (44.2%) and the resulting humiliation and demeaning remarks from others (38.3%). The impact of violence on women had an adverse effect on children: 36.7% of children felt responsible for the abuse and powerless to stop it, and 29.2% reported living with fear. 
Concerted efforts are required to curb violence against women in Haryana.
Keywords: impact of violence against women on children, patriarchal society, physical psychological and societal consequences, violence against women
Procedia PDF Downloads 308
786 Faceless Women: The Blurred Image of Women in Film on and Off-Screen
Authors: Ana Sofia Torres Pereira
Abstract:
To this day, women have been underrepresented and stereotyped on both TV and cinema screens all around the world. While women have been gaining a different status and finding their own voice in the workplace and in society, what we see on-screen is still something different, something gender biased, something that does not show the multifaceted identities a woman might have. But why is this so? Why are we stuck on this shallow vision of women on-screen? According to several cinema industry studies, most film screenwriters in Hollywood are men; women represent a very low percentage of screenwriters. So why is this relevant? Could the underrepresentation of women screenwriters in Hollywood be affecting the way women are written and, as a result, depicted in film? Films are about stories, about people, and if these stories are continuously told through a man’s gaze, is that helping to create a gender imbalance against women? On the other hand, one of the reasons given for the low percentage of women screenwriters is that women are said to be better at writing specific genres, like dramas and comedies, and not as good at writing thrillers and action films; as women seem to be limited in the genres they can write, they are undervalued and underrepresented as screenwriters. It seems gender bias and stereotyping are not reserved exclusively for women on-screen, but also affect women off-screen and behind the screen. So film appears to be a man’s world, on and off-screen, and since men seem to write the majority of scripts, it might be no wonder that women have been written and depicted in a specific way on-screen. Also, since films are a mass communication medium, maybe this over-sexualization and stereotyping on-screen is indoctrinating our society into believing this bias is alive and well, and thus targeting women off-screen as well (ergo, screenwriters). What about at the very beginning of film? 
In the silent movie and early talkie era, women dominated the screenwriting industry. They wrote every genre, and the majority of scripts were written by women, not men. So what about then? How were women depicted in films then? Did women screenwriters, in an era that was still very harsh on women, use their stories and their power to break stereotypes and show women in a different light, or did they carry on with the stereotype, continuing and standardizing it? This paper aims to understand how important it is to have more working women screenwriters in order to break stereotypes regarding the image of women on and off-screen. How much can a screenwriter (male or female) influence our gaze on women (on and off-screen)?
Keywords: cinema, gender bias, stereotype, women on-screen, women screenwriters
Procedia PDF Downloads 348
785 Discussion on the Impact and Improvement Strategy of Bike Sharing on Urban Space
Authors: Bingying Liu, Dandong Ge, Xinlan Zhang, Haoyang Liang
Abstract:
Over the past two years, a new generation of dockless ("no-pile") bike sharing, represented by Ofo, Mobike and HelloBike, has sprung up in various cities in China and spread rapidly to countries such as Britain, Japan, the United States and Singapore. As a new green public transportation mode, bike sharing can bring a series of benefits to urban space. First, this paper analyzes the specific impact of bike sharing on urban space in China. Based on market research and data analysis, it is found that bike sharing can improve the quality of urban space in three aspects: expanding the radius of public transportation service, filling service blind spots, alleviating urban traffic congestion, and enhancing the vitality of urban space. On the other hand, due to the immature market and the imperfect system, bike sharing has gradually revealed some difficulties, such as parking chaos, malicious damage, safety problems, and imbalance between supply and demand. The paper then investigates the characteristics, business model and operating mechanism of shared bikes currently on the Chinese market. Finally, in order to make bike sharing serve urban construction better, this paper puts forward specific countermeasures from four aspects. In terms of market operations, it is necessary to establish a public-private partnership model and set up a unified, integrated bike-sharing management platform. At the technical level, the paper proposes developing an intelligent parking system to regulate parking. At the policy level, establishing a bike-sharing assessment mechanism would strengthen supervision. As to urban planning, sharing data and redesigning slow roadways would benefit transportation and spatial planning.
Keywords: bike sharing, impact analysis, improvement strategy, urban space
Procedia PDF Downloads 169
784 Multi-Sensor Target Tracking Using Ensemble Learning
Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana
Abstract:
Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier, and if so, what form of multiple classifier system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifier systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfill the requirements of target tracking owing to improved sensor classification performance, with multiple classifier systems constructed using boosting achieving higher accuracy rates.
Keywords: single classifier, ensemble learning, multi-target tracking, multiple classifiers
Procedia PDF Downloads 268
783 A Comparative Study of Malware Detection Techniques Using Machine Learning Methods
Authors: Cristina Vatamanu, Doina Cosovan, Dragos Gavrilut, Henri Luchian
Abstract:
In the past few years, the amount of malicious software has increased exponentially and, therefore, machine learning algorithms have become instrumental in distinguishing clean files from malware through semi-automated classification. When working with very large datasets, the major challenge is to reach both a very high malware detection rate and a very low false positive rate. Another challenge is to minimize the time needed for the machine learning algorithm to do so. This paper presents a comparative study of different machine learning techniques such as linear classifiers, ensembles, decision trees and various hybrids thereof. The training dataset consists of approximately 2 million clean files and 200,000 infected files, which is a realistic quantitative mixture. The paper investigates the above-mentioned methods with respect to both their performance (detection rate and false positive rate) and their practicability.
Keywords: ensembles, false positives, feature selection, one side class algorithm
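The two performance measures named in this abstract can be made concrete with a short sketch. The class sizes below come from the abstract (about 2 million clean and 200,000 infected files); everything else is an assumption for illustration: the function name is hypothetical, and the counts of detected and falsely flagged files are invented, not results from the paper.

```python
def detection_and_false_positive_rate(tp, fn, fp, tn):
    """Return (detection rate, false positive rate) from confusion counts.

    Detection rate is the share of infected files that are flagged;
    false positive rate is the share of clean files that are flagged.
    """
    detection_rate = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    return detection_rate, false_positive_rate


# Class sizes as stated in the abstract.
n_clean, n_infected = 2_000_000, 200_000

# Hypothetical outcome, invented for illustration only.
tp = 190_000  # infected files flagged
fp = 2_000    # clean files flagged by mistake
dr, fpr = detection_and_false_positive_rate(tp, n_infected - tp, fp, n_clean - fp)
```

With so many clean files, even a false positive rate of one in a thousand corresponds to thousands of wrongly flagged files, which is why the abstract treats a very low false positive rate as a primary goal.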
Procedia PDF Downloads 292
782 A Neural Network Classifier for Identifying Duplicate Image Entries in Real-Estate Databases
Authors: Sergey Ermolin, Olga Ermolin
Abstract:
A deep convolutional neural network with triplet loss is used to identify duplicate images in real-estate advertisements in the presence of image artifacts such as watermarking, cropping, hue/brightness adjustment, and others. The effects of batch normalization, spatial dropout, and various convergence methodologies on the resulting detection accuracy are discussed. For a comparative return-on-investment study (per industry request), end-to-end performance is benchmarked on both Nvidia Titan GPUs and Intel’s Xeon CPUs. A new real-estate dataset from the San Francisco Bay Area is used for this work. Sufficient duplicate detection accuracy is achieved to supplement other database-grounded methods of duplicate removal. The implemented method is used in a proof-of-concept project in the real-estate industry.
Keywords: visual recognition, convolutional neural networks, triplet loss, spatial batch normalization with dropout, duplicate removal, advertisement technologies, performance benchmarking
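The triplet loss mentioned in this abstract can be illustrated on plain embedding vectors. This is a generic sketch, not the authors' network or training code: the function names are hypothetical, the distance is assumed to be non-squared Euclidean (some formulations use squared distances), and the margin of 0.2 is an arbitrary illustrative value.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on three embedding vectors.

    The loss is zero once the anchor is closer to the positive (a duplicate
    image) than to the negative (a different image) by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Well-separated triplet: the duplicate is much closer than the non-duplicate.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Violating triplet: the non-duplicate is closer, so the loss is positive.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Training on this loss pulls embeddings of duplicates together and pushes non-duplicates apart, so that duplicates can later be found by thresholding the embedding distance.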
Procedia PDF Downloads 338
781 A Comparison between Artificial Neural Network Prediction Models for Coronal Hole Related High Speed Streams
Authors: Rehab Abdulmajed, Amr Hamada, Ahmed Elsaid, Hisashi Hayakawa, Ayman Mahrous
Abstract:
Solar emissions have a high impact on the Earth’s magnetic field, and the prediction of solar events is of high interest. Various techniques have been used to predict the solar wind, including mathematical models, MHD models, and neural network (NN) models. This study investigates coronal hole (CH) derived high-speed streams (HSSs) and their correlation with the CH area, and creates a neural network model to predict the HSSs. Two different algorithms were used to compare different models and find the one that best simulates the HSSs. A dataset of CH synoptic maps for Carrington rotations 1601 to 2185 is used, along with OMNI dataset solar wind speed averaged over the Carrington rotations; it covers solar cycles (SC) 21, 22, 23, and most of 24.
Keywords: artificial neural network, coronal hole area, feed-forward neural network models, solar high speed streams
Procedia PDF Downloads 88
780 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy
Authors: Abdullah Al Mamun, Talal Alkharobi
Abstract:
As data sizes grow, people have become more accustomed to storing large amounts of secret information in cloud storage. Companies constantly need to transfer massive business files from one end to another. Privacy is lost if the data is transmitted as is, and if this is done repeatedly without securing the communication mechanism through proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption, it can only encrypt a limited amount of data, which makes it inapplicable to large data encryption. In this paper we propose a probable approach, based on Pretty Good Privacy, for encrypting big data using both symmetric and asymmetric keys. Our goal is to encrypt huge collections of information and transmit them through a secure communication channel while preserving business and personal privacy. To justify our method, an experimental dataset from three different platforms is provided. We aim to show that our approach works efficiently and reliably for massive data of various kinds.
Keywords: big data, cloud computing, cryptography, hadoop, public key
Procedia PDF Downloads 320