Search results for: Text classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3270

2580 Short Answer Grading Using Multi-Context Features

Authors: S. Sharan Sundar, Nithish B. Moudhgalya, Nidhi Bhandari, Vineeth Vijayaraghavan

Abstract:

Automatic Short Answer Grading is one of the prime applications of artificial intelligence in education. Several approaches involving the utilization of selective handcrafted features, graphical matching techniques, concept identification and mapping, complex deep frameworks, sentence embeddings, etc. have been explored over the years. However, keeping in mind the real-world application of the task, these solutions present a slight overhead in computation and resources in achieving high performance. In this work, a simple and effective solution is proposed that uses elemental features based on statistical and linguistic properties and word-based similarity measures, in conjunction with tree-based classifiers and regressors. The results for classification tasks show improvements ranging from 1% to 30%, while the regression task shows a stark improvement of 35%. The authors attribute these improvements to the addition of multiple similarity scores, which provide the models with an ensemble of scoring criteria. The authors also believe the work could reaffirm that classical natural language processing techniques and simple machine learning models can achieve high results for short answer grading.
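As an illustration of the "word-based similarity measures" mentioned in this abstract, the following minimal sketch computes a few overlap scores between a reference answer and a student answer. The abstract does not list the actual features, so the function names, the choice of Jaccard and Dice scores, and the example sentences are all assumptions made for illustration; in a setup like the one described, such scores would be passed as input features to a tree-based model.

```python
PUNCT = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]

def jaccard(a, b):
    """Jaccard overlap of two token lists: |A & B| / |A | B|."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def dice(a, b):
    """Dice coefficient of two token lists: 2|A & B| / (|A| + |B|)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))

def similarity_features(reference, answer):
    """Return several similarity scores as one feature dictionary (hypothetical feature set)."""
    ref, ans = tokenize(reference), tokenize(answer)
    return {
        "jaccard": jaccard(ref, ans),
        "dice": dice(ref, ans),
        "length_ratio": len(ans) / len(ref) if ref else 0.0,
    }

# Made-up example pair, not taken from the paper's data.
features = similarity_features(
    "The heart pumps blood through the body",
    "The heart pumps blood.",
)
```

Stacking several such scores side by side is one simple way to give a classifier or regressor multiple scoring criteria at once.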

Keywords: artificial intelligence, intelligent systems, natural language processing, text mining

Procedia PDF Downloads 124
2579 A Novel Method for Face Detection

Authors: H. Abas Nejad, A. R. Teymoori

Abstract:

Facial expression recognition is one of the open problems in computer vision. Robust neutral face recognition in real time is a major challenge for various supervised learning based facial expression recognition methods. This is due to the fact that supervised methods cannot accommodate all appearance variability across the faces with respect to race, pose, lighting, facial biases, etc. in the limited amount of training data. Moreover, processing each and every frame to classify emotions is not required, as the user stays neutral for the majority of the time in usual applications like video chat or photo album/web browsing. Detecting the neutral state at an early stage, and thereby excluding those frames from emotion classification, would save computational power. In this work, we propose a light-weight neutral vs. emotion classification engine, which acts as a preprocessor to the traditional supervised emotion classification approaches. It dynamically learns neutral appearance at Key Emotion (KE) points using a textural statistical model, constructed from a set of reference neutral frames for each user. The proposed method is made robust to various types of user head motions by accounting for affine distortions based on a textural statistical model. Robustness to dynamic shift of KE points is achieved by evaluating the similarities on a subset of neighborhood patches around each KE point using the prior information regarding the directionality of specific facial action units acting on the respective KE point. The proposed method, as a result, improves emotion recognition (ER) accuracy and simultaneously reduces the computational complexity of the ER system, as validated on multiple databases.

Keywords: neutral vs. emotion classification, Constrained Local Model, procrustes analysis, Local Binary Pattern Histogram, statistical model

Procedia PDF Downloads 329
2578 Multi-Layer Perceptron and Radial Basis Function Neural Network Models for Classification of Diabetic Retinopathy Disease Using Video-Oculography Signals

Authors: Ceren Kaya, Okan Erkaymaz, Orhan Ayar, Mahmut Özer

Abstract:

Diabetes Mellitus (Diabetes) is a disease based on insulin hormone disorders that causes high blood glucose. Clinical findings indicate that diabetes can be diagnosed by electrophysiological signals obtained from the vital organs. 'Diabetic Retinopathy' is one of the most common eye diseases resulting from diabetes, and it is the leading cause of vision loss due to structural alteration of the retinal layer vessels. In this study, features of horizontal and vertical Video-Oculography (VOG) signals have been used to classify non-proliferative and proliferative diabetic retinopathy disease. Twenty-five features are acquired by applying the discrete wavelet transform to VOG signals taken from 21 subjects. Two models, based on multi-layer perceptron and radial basis function, are recommended for the diagnosis of Diabetic Retinopathy. The proposed models can also detect the level of the disease. We show the comparative classification performance of the proposed models. Our results show that the proposed RBF model (100%) yields better classification performance than the MLP model (94%).

Keywords: diabetic retinopathy, discrete wavelet transform, multi-layer perceptron, radial basis function, video-oculography (VOG)

Procedia PDF Downloads 246
2577 Spectrogram Pre-Processing to Improve Isotopic Identification to Discriminate Gamma and Neutrons Sources

Authors: Mustafa Alhamdi

Abstract:

An industrial application to classify gamma-ray and neutron events is investigated in this study using deep machine learning. Identification using a convolutional neural network and recursive neural network showed a significant improvement in prediction accuracy in a variety of applications. The ability to identify the isotope type and activity from spectral information depends on feature extraction methods, followed by classification. The features extracted from the spectrum profiles try to find patterns and relationships to represent the actual spectrum energy in a low-dimensional space. Increasing the level of separation between classes in feature space improves the possibility of enhancing classification accuracy. Feature extraction by a neural network is nonlinear and involves a variety of transformations and mathematical optimization, while principal component analysis depends on linear transformations to extract features and subsequently improve the classification accuracy. In this paper, the isotope spectrum information has been preprocessed by finding the frequency components relative to time and using them as a training dataset. The Fourier transform implementation used to extract the frequency components has been optimized by a suitable windowing function. Training and validation samples of different isotope profiles interacting with a CdTe crystal have been simulated using Geant4. The readout electronic noise has been simulated by optimizing the mean and variance of a normal distribution. Ensemble learning, combining the votes of many models, managed to improve the classification accuracy of neural networks. Discriminating gamma and neutron events in a single prediction approach using deep machine learning has shown high accuracy. The findings of the paper show the ability to improve the classification accuracy by applying the spectrogram preprocessing stage to the gamma and neutron spectra of different isotopes.
Tuning deep machine learning models by hyperparameter optimization of neural network models enhanced the separation in the latent space and provided the ability to extend the number of detected isotopes in the training database. Ensemble learning contributed significantly to improving the final prediction.
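The spectrogram preprocessing described in this abstract (frequency components relative to time, extracted with a windowed Fourier transform) can be sketched in general terms as a short-time Fourier transform. This is a generic textbook sketch, not the paper's implementation: the Hann window, the frame length, the hop size, and the synthetic sine input are all assumptions chosen for illustration.

```python
import cmath
import math

def hann(n):
    """Symmetric Hann window of length n (assumed window choice; requires n > 1)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of a plain DFT."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(signal, frame_len=16, hop=8):
    """Slide a windowed frame over the signal and take the DFT of each frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitudes(frame))
    return frames  # rows are time steps, columns are frequency bins

# Synthetic input: a sine wave with 4 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 4 * t / 16) for t in range(64)]
spec = spectrogram(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The window tapers each frame toward zero at its edges, which reduces spectral leakage between neighbouring frequency bins; the resulting time-frequency grid is the kind of input a convolutional network can be trained on.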

Keywords: machine learning, nuclear physics, Monte Carlo simulation, noise estimation, feature extraction, classification

Procedia PDF Downloads 137
2576 6D Posture Estimation of Road Vehicles from Color Images

Authors: Yoshimoto Kurihara, Tad Gonsalves

Abstract:

Currently, in the field of object posture estimation, there is research on estimating the position and angle of an object by storing a 3D model of the object to be estimated in advance in a computer and matching it with the model. However, in this research, we have succeeded in creating a module that is much simpler, smaller in scale, and faster in operation. Our 6D pose estimation model consists of two different networks – a classification network and a regression network. From a single RGB image, the trained model estimates the class of the object in the image, the coordinates of the object, and its rotation angle in 3D space. In addition, we compared the estimation accuracy of each camera position, i.e., the angle from which the object was captured. The highest accuracy was recorded when the camera position was 75°, the accuracy of the classification was about 87.3%, and that of regression was about 98.9%.

Keywords: 6D posture estimation, image recognition, deep learning, AlexNet

Procedia PDF Downloads 141
2575 Mobile Phone Text Reminders and Voice Call Follow-ups Improve Attendance for Community Retail Pharmacy Refills; Learnings from Lango Sub-region in Northern Uganda

Authors: Jonathan Ogwal, Louis H. Kamulegeya, John M. Bwanika, Davis Musinguzi

Abstract:

Introduction: Community retail pharmacy drug distribution points (CRPDDP) were implemented in the Lango sub-region as part of the Ministry of Health’s response to improving access and adherence to antiretroviral treatment (ART). Clients received their ART refills from nearby local pharmacies, which created the need for continuous engagement through mobile phone appointment reminders and health messages. We share learnings from the implementation of mobile text reminders and voice call follow-ups among ART clients attending the CRPDDP program in northern Uganda. Methods: A retrospective data review of electronic medical records from four pharmacies allocated for CRPDDP in the Lira and Apac districts of the Lango sub-region in Northern Uganda was done from February to August 2022. The process involved collecting phone contacts of eligible clients from the health facility appointment register and uploading them onto a messaging platform customized by Rapid-pro, an open-source software. Client information, including code name, phone number, next appointment date, and the allocated pharmacy for ART refill, was collected and kept confidential. Contacts received appointment reminder messages and other messages on positive living as an ART client. Routine voice call follow-ups were done to confirm that clients had picked up their ART from the refill pharmacy. Findings: In total, 1,354 clients were reached from the four allocated pharmacies located in urban centers. 972 clients received short message service (SMS) appointment reminders, and 382 were followed up through voice calls. The majority (75%) of the clients returned for refills on the appointed date, 20% returned within four days after the appointment date, and the remaining 5% needed follow-up, during which they reported that they were not in the district on the appointment date due to other engagements. Conclusion: The use of mobile text reminders and voice call follow-ups improves the attendance of community retail pharmacy refills.

Keywords: antiretroviral treatment, community retail drug distribution points, mobile text reminders, voice call follow-up

Procedia PDF Downloads 93
2574 Gender Recognition with Deep Belief Networks

Authors: Xiaoqi Jia, Qing Zhu, Hao Zhang, Su Yang

Abstract:

A gender recognition system is able to tell the gender of a given person from a few frontal facial images. An effective gender recognition approach can improve the performance of many other applications, including security monitoring, human-computer interaction, image or video retrieval and so on. In this paper, we present an effective method for the gender classification task on frontal facial images based on deep belief networks (DBNs), which can pre-train the model and slightly improve accuracy. Our experiments show that pre-training with DBNs for the gender classification task is feasible and achieves a small improvement in accuracy on the FERET and CAS-PEAL-R1 facial datasets.

Keywords: gender recognition, deep belief networks, semi-supervised learning, greedy layer-wise RBMs

Procedia PDF Downloads 434
2573 Hyper Parameter Optimization of Deep Convolutional Neural Networks for Pavement Distress Classification

Authors: Oumaima Khlifati, Khadija Baba

Abstract:

Pavement distress is the main factor responsible for the deterioration of road structure durability, damage to vehicles, and reduced driver comfort. Transportation agencies spend a high proportion of their funds on pavement monitoring and maintenance. The auscultation of pavement distress has been based on manual surveys, which are extremely time-consuming, labor-intensive, and require domain expertise. Therefore, automatic distress detection is needed to reduce the cost of manual inspection and avoid more serious damage by implementing the appropriate remediation actions at the right time. Inspired by recent deep learning applications, this paper proposes an algorithm for automatic road distress detection and classification based on the Deep Convolutional Neural Network (DCNN). In this study, the types of pavement distress are classified as transverse or longitudinal cracking, alligator, pothole, and intact pavement. The dataset used in this work is composed of public asphalt pavement images. In order to learn the structure of the different types of distress, the DCNN models are trained and tested as a multi-label classification task. In addition, to get the highest accuracy for our model, we adjust the structural optimization hyperparameters, such as the number of convolution and max pooling layers, filters, filter size, loss functions, activation functions, and optimizer, and the fine-tuning hyperparameters, which include batch size and learning rate. The optimization of the model is executed by checking all feasible combinations and selecting the best-performing one. After the model is optimized, performance metrics are calculated, which describe the training and validation accuracies, precision, recall, and F1 score.
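Checking all feasible hyperparameter combinations and keeping the best one, as this abstract describes, amounts to an exhaustive grid search. The sketch below shows the general pattern only. The search space, the parameter names, and the `toy_evaluate` scoring function are hypothetical stand-ins; in a real setting the evaluation step would train a network and return its validation accuracy.

```python
from itertools import product

def grid_search(space, evaluate):
    """Try every combination in `space` and return the best config and its score."""
    names = list(space)
    best_config, best_score = None, float("-inf")
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_evaluate(config):
    """Hypothetical stand-in for 'train the model and return validation accuracy'."""
    return -abs(config["learning_rate"] - 0.01) - abs(config["batch_size"] - 32) / 1000

# Hypothetical search space; the abstract does not give the actual values.
space = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [16, 32, 64],
}
best_config, best_score = grid_search(space, toy_evaluate)
```

The number of runs is the product of the list lengths (here 3 x 3 = 9), so the cost grows quickly as more hyperparameters are added to the grid.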

Keywords: distress pavement, hyperparameters, automatic classification, deep learning

Procedia PDF Downloads 71
2572 The Asymmetric Proximal Support Vector Machine Based on Multitask Learning for Classification

Authors: Qing Wu, Fei-Yan Li, Heng-Chang Zhang

Abstract:

Multitask learning support vector machines (SVMs) have recently attracted increasing research attention. Given several related tasks, single-task learning methods train each task separately and ignore the inner cross-relationship among tasks. However, multitask learning can capture the correlation information among tasks and achieve better performance by training all tasks simultaneously. In addition, the asymmetric squared loss function can better improve the generalization ability of the models on most asymmetrically distributed data. In this paper, we first make two assumptions on the relatedness among tasks and propose two multitask learning proximal support vector machine algorithms, named MTL-a-PSVM and EMTL-a-PSVM, respectively. MTL-a-PSVM seeks a trade-off between the maximum expectile distance for each task model and the closeness of each task model to the general model. As an extension of the MTL-a-PSVM, EMTL-a-PSVM can select appropriate kernel functions for shared information and private information. Besides, two corresponding special cases named MTL-PSVM and EMTL-PSVM are proposed by analyzing the asymmetric squared loss function, which can be easily implemented by solving linear systems. Experimental analysis of three classification datasets demonstrates the effectiveness and superiority of our proposed multitask learning algorithms.
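For readers unfamiliar with the asymmetric squared loss, the sketch below shows its commonly used expectile form, in which positive and negative residuals are squared but weighted differently. The abstract does not state the exact parameterization used in the paper, so the weight parameter `p` and its convention here are assumptions.

```python
def asymmetric_squared_loss(residual, p=0.5):
    """Expectile-style loss: weight p for non-negative residuals, 1 - p otherwise.

    The parameter name and convention are assumptions, not taken from the paper.
    """
    weight = p if residual >= 0 else 1.0 - p
    return weight * residual ** 2

# With p = 0.5 both sides are weighted equally (a scaled ordinary squared loss).
symmetric = asymmetric_squared_loss(-2.0, p=0.5)
# With p = 0.8 a negative residual is penalized less than a positive one of equal size.
negative_side = asymmetric_squared_loss(-2.0, p=0.8)
positive_side = asymmetric_squared_loss(2.0, p=0.8)
```

Because the loss stays quadratic on each side, minimizing it leads to linear conditions, which is consistent with the abstract's remark that special cases can be solved as linear systems.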

Keywords: multitask learning, asymmetric squared loss, EMTL-a-PSVM, classification

Procedia PDF Downloads 99
2571 The Popular Imagination through the Poem of “Ras B’Nadam”

Authors: Hirreche Baghdad Mohamed

Abstract:

One of the main texts in popular culture in Algeria is a symbolic and imaginary tale, through which the author was able to derive, from the world and from the stock of popular culture and symbolic capital, elements that enabled him to create a synthesis between a number of imaginary and real events. Thanks to the level of spirituality that the author was experiencing, he was able to go deep in order to redraw the boundaries of human life in view of its existence and status (life experiences, its end, and its fate). It is a text that is consistent with religious values and has a philosophical depth. This poem can be shared in official and unofficial meetings, during feasts, and during popular celebrations, such as circumcision ceremonies, marriage, and condolences. It also has the ability to draw the attention of the listener, appeal to him, and let him travel into the imaginary world. It is the text related to the story of "Ras b’nadem", or "the head of a man", or rather, a "human skull", to which only a few academic studies have been devoted. There are two copies of it, one attributed, with some doubt, to Lakhdar Ibn Khalouf, while the other is attributed to Qadour Ibn Ashour Al-Zarhouni.

Keywords: ras B’Nadam, ras al mahna, lakhdar ibn khalouf, qadour ibn ashour, sufism, melhoun poetry, resistance poetry

Procedia PDF Downloads 173
2570 The Arab Spring Rebellion or Revolution: An Analysis of the Text

Authors: Sulaiman Ahmed

Abstract:

This paper will analyse the classical Islamic text in order to determine whether the Arab Spring was a rebellion or a revolution. Commencing in 2010, we saw a series of revolutions, or what some would call rebellions, throughout the Arab peninsula. Many of the religious clergy came out emphatically in support of the people who wanted to overthrow the leaders. This brought forth the important question about the acceptability of rebelling against unjust leaders in Islamic theological texts. The paper will look to analyse the Islamic legal and theological position on the permissibility of rebelling, whether there is scholarly consensus on the issue, and how the texts are analysed in order to come to the current position we have today. The position of the clergy who supported the Arab Spring will also be analysed in order to deduce if their position falls within the religious framework. An inquiry will be made to determine the ideology of those who joined the rebellion after the inception and whether these ideas can be found in classical Islamic texts. The nuances of these positions will be analysed in order to determine whether what we witnessed was a rebellion or a revolution.

Keywords: rebellion, revolution, Arab spring, scholarly consensus

Procedia PDF Downloads 146
2569 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

In cases where the use of real data is limited, such as when it is hard to get access to a large volume of real data, synthetic data generation is needed. It produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data, since MTS data is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that categorizes the data into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 113
2568 Rapid Soil Classification Using Computer Vision with Electrical Resistivity and Soil Strength

Authors: Eugene Y. J. Aw, J. W. Koh, S. H. Chew, K. E. Chua, P. L. Goh, Grace H. B. Foo, M. L. Leong

Abstract:

This paper presents the evaluation of various soil testing methods, such as the four-probe soil electrical resistivity method and cone penetration test (CPT), that can complement a newly developed rapid soil classification scheme using computer vision, to improve the accuracy and productivity of on-site classification of excavated soil. In Singapore, excavated soils from the local construction industry are transported to Staging Grounds (SGs) to be reused as fill material for land reclamation. Excavated soils are mainly categorized into two groups (“Good Earth” and “Soft Clay”) based on particle size distribution (PSD) and water content (w) from soil investigation reports and on-site visual survey, such that proper treatment and usage can be exercised. However, this process is time-consuming and labor-intensive. Thus, a rapid classification method is needed at the SGs. Four-probe soil electrical resistivity and CPT were evaluated for their feasibility as suitable additions to the computer vision system to further develop this innovative non-destructive and instantaneous classification method. The computer vision technique comprises soil image acquisition using an industrial-grade camera; image processing and analysis via calculation of Grey Level Co-occurrence Matrix (GLCM) textural parameters; and decision-making using an Artificial Neural Network (ANN). It was found from the previous study that the ANN model coupled with ρ can classify soils into “Good Earth” and “Soft Clay” in less than a minute, with an accuracy of 85% based on selected representative soil images. To further improve the technique, the following three items were targeted to be added onto the computer vision scheme: the apparent electrical resistivity of soil (ρ) measured using a set of four probes arranged in Wenner’s array, the soil strength measured using a modified mini cone penetrometer, and w measured using a set of time-domain reflectometry (TDR) probes.
Laboratory proof-of-concept was conducted through a series of seven tests with three types of soils – “Good Earth”, “Soft Clay,” and a mix of the two. Validation was performed against the PSD and w of each soil type obtained from conventional laboratory tests. The results show that ρ, w and CPT measurements can be collectively analyzed to classify soils into “Good Earth” or “Soft Clay” and are feasible as complementing methods to the computer vision system.
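The Grey Level Co-occurrence Matrix (GLCM) step named in this abstract can be illustrated with a small sketch that counts how often pairs of grey levels occur at a fixed pixel offset and then derives one textural parameter (contrast). The abstract does not say which GLCM parameters or offsets were used, so the offset, the choice of contrast, and the tiny example image are assumptions; the image is assumed to be already quantized to integer grey levels.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs separated by (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return matrix  # no valid pixel pairs for this offset
    return [[count / total for count in row] for row in matrix]

def contrast(p):
    """GLCM contrast: sum of p[i][j] * (i - j)^2 over all grey-level pairs."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

# Made-up 2 x 3 image with two grey levels (0 and 1).
image = [
    [0, 0, 1],
    [0, 0, 1],
]
p = glcm(image, levels=2)
texture_contrast = contrast(p)
```

Parameters such as contrast summarize the texture of an image in a single number, which makes them convenient inputs for a small neural network classifier.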

Keywords: computer vision technique, cone penetration test, electrical resistivity, rapid and non-destructive, soil classification

Procedia PDF Downloads 223
2567 Effect of Coaching Related Incompetency to Stand Trial on Symptom Validity Test: Robustness, Sensitivity, and Specificity

Authors: Natthawut Arin

Abstract:

In forensic contexts, competency to stand trial assessments are the most common referrals. Defendants may attempt to endorse psychopathology symptoms and feign incompetence. They may also be coached, that is, taught test-taking strategies to avoid detection of feigned psychopathological symptoms. Recently, Symptom Validity Tests (SVTs) were created to detect feigning. Moreover, the literature suggests that SVTs may be more robust to the effects of coaching. The Thai Symptom Validity Test (SVT-Th) was designed as an SVT and has demonstrated adequate psychometric properties and the ability to discriminate between feigners and honest responders. Thus, the current study examines the utility and robustness of the SVT-Th in the detection of feigned psychopathology. The participants were 120 students recruited from undergraduate courses in psychology and randomly assigned to one of three groups. The SVT-Th was administered to the three scenario-based experimental groups: (a) the uncoached group was asked to respond honestly (n = 40), (b) the symptom-coached without warning group was asked to feign psychiatric symptoms to be found incompetent to stand trial (n = 40), and (c) the test-coached with warning group was asked to feign psychiatric symptoms so as to be found incompetent to stand trial while avoiding test detection (n = 40). Group differences were analyzed using one-way ANOVAs. The results revealed that the uncoached group (M = 4.23, SD = 5.20) had significantly lower SVT-Th mean scores than both coached groups (M = 185.00, SD = 72.88 and M = 132.10, SD = 54.06, respectively). Classification rates were calculated to determine the classification accuracy. The results indicated that the SVT-Th had an overall classification accuracy of 96.67%, with acceptable rates of 95% sensitivity and 100% specificity.
Overall, the results of the present study indicate that the SVT-Th yielded high indices of accuracy, and these findings suggest that the SVT-Th is robust against coaching.
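The reported rates can be checked against the group sizes with a short calculation. The sketch assumes that the two coached groups (80 participants in total) form the feigning class and the uncoached group (40 participants) forms the honest class; that grouping is an inference from the abstract, not something it states explicitly.

```python
def overall_accuracy(sensitivity, specificity, n_positive, n_negative):
    """Combine sensitivity and specificity into overall accuracy for given class sizes."""
    true_positives = sensitivity * n_positive
    true_negatives = specificity * n_negative
    return (true_positives + true_negatives) / (n_positive + n_negative)

# Assumed split: 80 instructed feigners (positive class), 40 honest responders.
accuracy = overall_accuracy(0.95, 1.00, n_positive=80, n_negative=40)
print(f"{accuracy:.2%}")
```

Under that assumption, 76 of 80 feigners and all 40 honest responders are classified correctly, giving 116 of 120, which matches the reported 96.67%.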

Keywords: incompetency to stand trial, coaching, robustness, classification accuracy

Procedia PDF Downloads 123
2566 Determining Optimal Number of Trees in Random Forests

Authors: Songul Cinaroglu

Abstract:

Background: Random Forest is an efficient, multi-class machine learning method used for classification, regression and other tasks. The method operates by constructing each tree from a different bootstrap sample of the data. Determining the number of trees in random forests is an open question in the literature on improving the classification performance of random forests. Aim: The aim of this study is to analyze whether there is an optimal number of trees in Random Forests and how the performance of Random Forests differs as the number of trees increases, using sample health data sets in the R programme. Method: In this study, we analyzed the performance of Random Forests as the number of trees grows, doubling the number of trees at every iteration, using the “random forest” package in the R programme. To determine the minimum and optimal number of trees, we used the McNemar test and the Area Under the ROC Curve, respectively. Results: The analysis found that a larger number of trees does not always mean that the forest performs better than forests with fewer trees. In other words, a larger number of trees only increases computational costs without improving performance. Conclusion: Although the general practice in using random forests is to generate a large number of trees to obtain high performance, this study shows that increasing the number of trees doesn’t always improve performance. Future studies can compare different kinds of data sets and different performance measures to test whether Random Forest performance changes as the number of trees increases.
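The two ingredients of the method described above, a doubling schedule for the number of trees and a McNemar test comparing two forests on the same test cases, can be sketched as follows. The study itself was carried out in R; this Python sketch only illustrates the arithmetic. The starting size, the upper limit, and the discordant counts in the example are made-up values, and the continuity-corrected form of the statistic is an assumed choice.

```python
def doubling_schedule(start, limit):
    """Forest sizes obtained by doubling the number of trees up to a limit."""
    sizes = []
    n = start
    while n <= limit:
        sizes.append(n)
        n *= 2
    return sizes

def mcnemar_statistic(b, c):
    """McNemar chi-square statistic with continuity correction.

    b: cases the smaller forest classified correctly and the larger one did not.
    c: cases the larger forest classified correctly and the smaller one did not.
    """
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)

sizes = doubling_schedule(2, 64)
# Made-up discordant counts for two consecutive forest sizes.
statistic = mcnemar_statistic(10, 5)
# 3.841 is the chi-square critical value for 1 degree of freedom at the 5% level.
significant = statistic > 3.841
```

When the statistic stays below the critical value, doubling the forest has not produced a detectable change in predictions, which is the kind of evidence the abstract uses to argue that more trees do not always help.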

Keywords: classification methods, decision trees, number of trees, random forest

Procedia PDF Downloads 386
2565 Spectral Mixture Model Applied to Cannabis Parcel Determination

Authors: Levent Basayigit, Sinan Demir, Yusuf Ucar, Burhan Kara

Abstract:

Many research projects require accurate delineation of the different land cover types of an agricultural area. This is especially critical for the identification of specific plants like cannabis. However, the complexity of vegetation stand structure, abundant vegetation species, and the smooth transition between different secondary succession stages make vegetation classification difficult when using traditional approaches such as the maximum likelihood classifier. Most of the time, classification distinguishes only between trees/annual or grain. It has been difficult to accurately identify cannabis mixed with other plants. In this paper, a mixed distribution model approach is applied to classify pure and mixed cannabis parcels using Worldview-2 imagery in the Lakes region of Turkey. Five different land use types (i.e. sunflower, maize, bare soil, and cannabis) were identified in the image. A constrained Gaussian mixture discriminant analysis (GMDA) was used to unmix the image. In the study, 255 reflectance ratios derived from spectral signatures of seven bands (Blue-Green-Yellow-Red-Rededge-NIR1-NIR2) were randomly split into 80% training and 20% test data. The Gaussian mixed distribution model approach proved to be an effective and convenient way to use very high spatial resolution imagery for distinguishing cannabis vegetation. Based on the overall accuracies of the classification, the Gaussian mixed distribution model was found to be very successful in image classification tasks. This approach is sensitive enough to capture illegal cannabis planting areas in the large plain. It can also be used for monitoring and identifying illegal cannabis planting areas from their spectral reflectance.

Keywords: Gaussian mixture discriminant analysis, spectral mixture model, Worldview-2, land parcels

Procedia PDF Downloads 185
2564 The Spatial Classification of China near Sea for Marine Biodiversity Conservation Based on Bio-Geographical Factors

Authors: Huang Hao, Li Weiwen

Abstract:

Global biodiversity continues to decline as a result of global climate change and various human activities, such as habitat destruction, pollution, introduction of alien species and overfishing. Although marine organisms around the globe are connected to a greater or lesser degree, clear geographical boundaries make it easier to assess and manage different biogeographical zones. Area-based management tools (ABMTs) are therefore considered the most effective means for the conservation and sustainable use of marine biodiversity. On a large scale, the geographical gap (or barrier) is the main factor influencing the connectivity, diffusion, and ecological and evolutionary processes of marine organisms, which results in different distribution patterns. On a small scale, these factors include geographical location, geology and geomorphology, water depth, current, temperature, salinity, etc. Therefore, the analysis of geographic and environmental factors is of great significance in the study of biodiversity characteristics. This paper summarizes the marine spatial classification and ABMTs used in coastal areas, open oceans and the deep sea. It then analyzes the principles and methods of marine spatial classification based on biogeography-related factors, takes the China Near Sea (CNS) area as a case study, selects key biogeography-related factors, and carries out marine spatial classification at the biological region scale, the ecological region scale and the biogeographical scale. The research shows that the CNS is divided into 5 biological regions by climate and geographical differences: the Yellow Sea, the Bohai Sea, the East China Sea, the Taiwan Straits, and the South China Sea.
The bioregions are then divided into 12 ecological regions according to typical ecological and administrative factors. Finally, the eco-regions are divided into 98 biogeographical units according to benthic substrate type, depth, coastal type, water temperature, and salinity; given the integrity of biological and ecological processes, the area of each biogeographical unit is not less than 1,000 km². This research is of great use to local and central government for coastal management and biodiversity conservation, and it provides important scientific support for future spatial planning and management of coastal waters and the sustainable use of marine biodiversity.

Keywords: spatial classification, marine biodiversity, bio-geographical, conservation

Procedia PDF Downloads 144
2563 An Interdisciplinary Approach to Investigating Style: A Case Study of a Chinese Translation of Gilbert’s (2006) Eat Pray Love

Authors: Elaine Y. L. Ng

Abstract:

Elizabeth Gilbert’s (2006) biography Eat, Pray, Love describes her travels to Italy, India, and Indonesia after a painful divorce. The author’s experiences with love, loss, search for happiness, and meaning have resonated with a huge readership. As regards the translation of Gilbert’s (2006) Eat, Pray, Love into Chinese, it was first translated by a Taiwanese translator He Pei-Hua and published in Taiwan in 2007 by Make Boluo Wenhua Chubanshe with the fairly catchy title “Enjoy! Traveling Alone.” The same translation was translocated to China, republished in simplified Chinese characters by Shanxi Shifan Daxue Chubanshe in 2008 and renamed in China, entitled “To Be a Girl for the Whole Life.” Later on, the same translation in simplified Chinese characters was reprinted by Hunan Wenyi Chubanshe in 2013. This study employs Munday’s (2002) systemic model for descriptive translation studies to investigate the translation of Gilbert’s (2006) Eat, Pray, Love into Chinese by the Taiwanese translator Hu Pei-Hua. It employs an interdisciplinary approach, combining systemic functional linguistics and corpus stylistics with sociohistorical research within a descriptive framework to study the translator’s discursive presence in the text. The research consists of three phases. The first phase is to locate the target text within its socio-cultural context. The target-text context concerning the para-texts, readers’ responses, and the publishers’ orientation will be explored. The second phase is to compare the source text and the target text for the categorization of translation shifts by using the methodological tools of systemic functional linguistics and corpus stylistics. The investigation concerns the rendering of mental clauses and speech and thought presentation. The final phase is an explanation of the causes of translation shifts.
The linguistic findings are related to the extra-textual information collected in an effort to ascertain the motivations behind the translator’s choices. There exist sets of possible factors that may have contributed to shaping the textual features of the given translation within a specific socio-cultural context. The study finds that the translator generally reproduces the mental clauses and speech and thought presentation closely according to the original. Nevertheless, the language of the translation has been widely criticized as unidiomatic and stiff, losing the elegance of the original. In addition, the several Chinese translations of the given text produced by one Taiwanese and two Chinese publishers are basically the same. They are repackaged slightly differently, mainly with a change of the book cover and its captions for each version. By relating the textual findings to the extra-textual data of the study, it is argued that the popularity of the Chinese translation of Gilbert’s (2006) Eat, Pray, Love may not be attributable to the quality of the translation. Instead, it may have to do with the way the work is promoted strategically through the social media managed by the four e-bookstores promoting and selling the book online in China.

Keywords: chinese translation of eat pray love, corpus stylistics, motivations for translation shifts, systemic approach to translation studies

Procedia PDF Downloads 162
2562 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers

Authors: Yogendra Sisodia

Abstract:

Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. Once sentence embeddings have been computed, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author investigated two pre-trained language models for this task, MiniLM and RoBERTa, and further fine-tuned them on legal contracts. The author used Multiple Negative Ranking Loss to train the sentence transformers. The fine-tuned language models and sentence transformers showed promising results.
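The ranking loss named in this abstract can be illustrated with a small, dependency-free sketch. It is not the author's training code: it assumes cosine similarity with a scale factor of 20 and uses hand-written toy vectors in place of real sentence embeddings. For each anchor, the paired positive is treated as the correct class and every other positive in the batch acts as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over a batch of (anchor, positive) pairs.

    positives[i] is the correct match for anchors[i]; all other positives
    in the batch serve as in-batch negatives. The scale value is an
    assumption made for this sketch.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy embeddings (hypothetical): each anchor matches the positive at the same index.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_swapped = multiple_negatives_ranking_loss(anchors, swapped)
```

In this toy batch the loss is close to zero when each anchor is most similar to its own positive and large when the pairs are swapped, which is the behaviour that fine-tuning tries to encourage.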

Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity

Procedia PDF Downloads 91
2561 Evolving Convolutional Filter Using Genetic Algorithm for Image Classification

Authors: Rujia Chen, Ajit Narayanan

Abstract:

Convolutional neural networks (CNN), as typically applied in deep learning, use layer-wise backpropagation (BP) to construct filters and kernels for feature extraction. Such filters are 2D or 3D groups of weights for constructing feature maps at subsequent layers of the CNN and are shared across the entire input. BP as a gradient descent algorithm has well-known problems of getting stuck at local optima. The use of genetic algorithms (GAs) for evolving weights between layers of standard artificial neural networks (ANNs) is a well-established area of neuroevolution. In particular, the use of crossover techniques when optimizing weights can help to overcome problems of local optima. However, the application of GAs for evolving the weights of filters and kernels in CNNs is not yet an established area of neuroevolution. In this paper, a GA-based filter development algorithm is proposed. The results of the proof-of-concept experiments described in this paper show the proposed GA algorithm can find filter weights through evolutionary techniques rather than BP learning. For some simple classification tasks like geometric shape recognition, the proposed algorithm can achieve 100% accuracy. The results for MNIST classification, while not as good as possible through standard filter learning through BP, show that filter and kernel evolution warrants further investigation as a new subarea of neuroevolution for deep architectures.
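The general idea of evolving filter weights instead of learning them by backpropagation can be sketched as follows. This is a toy illustration under stated assumptions, not the algorithm from the paper: the two 3x3 patches, the fitness function, the uniform crossover, the Gaussian mutation and the elitist replacement scheme are all choices made for this example.

```python
import random

# Toy 3x3 patches standing in for two image classes (hypothetical data).
VERTICAL = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
HORIZONTAL = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]


def response(kernel, patch):
    """Filter response: element-wise product of kernel and patch, summed."""
    return sum(kernel[i][j] * patch[i][j] for i in range(3) for j in range(3))


def fitness(kernel):
    """Higher when the filter responds to VERTICAL more than to HORIZONTAL."""
    return response(kernel, VERTICAL) - response(kernel, HORIZONTAL)


def random_kernel(rng):
    return [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]


def crossover(a, b, rng):
    """Uniform crossover: each weight is copied from one of the two parents."""
    return [[a[i][j] if rng.random() < 0.5 else b[i][j] for j in range(3)]
            for i in range(3)]


def mutate(kernel, rng, rate=0.2, sigma=0.3):
    """Add Gaussian noise to some weights, clipping them to [-1, 1]."""
    return [[max(-1.0, min(1.0, w + rng.gauss(0, sigma)))
             if rng.random() < rate else w
             for w in row] for row in kernel]


def evolve_filter(generations=40, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [random_kernel(rng) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - 1)
        ]
        # Elitism: the best kernel always survives unchanged.
        population = [population[0]] + children
    return max(population, key=fitness), history


best_kernel, best_history = evolve_filter()
```

Because the best individual is carried over unchanged, the best fitness never decreases from one generation to the next; with weights clipped to [-1, 1], the fitness in this toy setting cannot exceed 4.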

Keywords: neuroevolution, convolutional neural network, genetic algorithm, filters, kernels

Procedia PDF Downloads 175
2560 Spermiogram Values of Fertile Men in Malatya Region

Authors: Aliseydi Bozkurt, Ugur Yılmaz

Abstract:

Objective: The aim was to evaluate the current status of semen parameters in fertile men in the Malatya region who had one or more children and whose wives had been pregnant within the last 1-12 months. Methods: Sperm samples were obtained from 131 fertile volunteers. In each analysis, sperm volume (ml), sperm count (sperm/ml), sperm motility and sperm viscosity were examined with a Makler device. Classification was made according to World Health Organization (WHO) criteria. Results: Mean ejaculate volume ranged from 1.5 ml to 5.5 ml, sperm count ranged from 27 to 180 million/ml and motility ranged from 35 to 90%. On average, sperm motility was 69.9% in category A, 7.6% in B, 8.7% in C, and 13.3% in D. Conclusion: The mean spermiogram values of fertile men in the Malatya region were found to be similar to those determined for fertile men by the WHO. This study has regional classification value in terms of spermiogram values.

Keywords: fertile men, infertility, spermiogram, sperm motility

Procedia PDF Downloads 334
2559 Information Extraction for Short-Answer Question for the University of the Cordilleras

Authors: Thelma Palaoag, Melanie Basa, Jezreel Mark Panilo

Abstract:

Checking short-answer questions and essays, whether on paper or in electronic form, is a tiring and tedious task for teachers. Evaluating a student’s output requires knowledge of a wide array of domains, and scoring the work is often a critical task. Several attempts have been made in the past few years to create automated writing assessment software, but these have received negative responses from teachers and students alike because of unreliable scoring, a lack of feedback, and other shortcomings. The study aims to create an application that checks short-answer questions by incorporating information extraction. Information extraction is a subfield of Natural Language Processing (NLP) in which a chunk of text (technically known as unstructured text) is broken down to gather the necessary bits of data and/or keywords (structured text) to be analyzed further or used by query tools. The proposed system shall be able to extract keywords or phrases from an individual’s answers and match them against a corpus of words (as defined by the instructor), which shall be the basis for evaluating the individual’s answer. The proposed system shall also enable the teacher to provide feedback and re-evaluate the student’s output for writing elements that the computer cannot fully evaluate, such as creativity and logic. Teachers can formulate, design, and check short-answer questions efficiently by defining keywords or phrases as parameters and assigning weights to them for checking answers. With the proposed system, the time teachers spend checking and evaluating students’ output shall be reduced, making teachers more productive and their work easier.
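The weighted keyword matching described in this abstract might look roughly like the sketch below. It is an illustration only: the function name, the rubric and the normalisation rule are hypothetical, and the sketch assumes rubric keywords are lowercase words or phrases separated by single spaces.

```python
import re


def score_answer(answer, weighted_keywords):
    """Score an answer against instructor-defined keywords and weights.

    weighted_keywords maps a keyword or phrase to its weight. Returns a
    tuple (score in [0, 1], list of matched keywords).
    """
    # Lowercase the answer and strip punctuation so phrases match on word boundaries.
    normalized = " ".join(re.findall(r"[a-z0-9]+", answer.lower()))
    padded = f" {normalized} "
    matched = [kw for kw in weighted_keywords if f" {kw.lower()} " in padded]
    total = sum(weighted_keywords.values())
    earned = sum(weighted_keywords[kw] for kw in matched)
    return (earned / total if total else 0.0), matched


# Hypothetical rubric for a question about photosynthesis.
rubric = {"photosynthesis": 2.0, "sunlight": 1.0, "carbon dioxide": 1.0}
score, matched = score_answer("Plants use sunlight and carbon dioxide.", rubric)
```

Here the answer mentions two of the three rubric items, so it earns 2.0 of the 4.0 available weight. A teacher could then review the matched list and adjust the score for elements such as creativity and logic, as the abstract proposes.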

Keywords: information extraction, short-answer question, natural language processing, application

Procedia PDF Downloads 416
2558 Classification Using Worldview-2 Imagery of Giant Panda Habitat in Wolong, Sichuan Province, China

Authors: Yunwei Tang, Linhai Jing, Hui Li, Qingjie Liu, Xiuxia Li, Qi Yan, Haifeng Ding

Abstract:

The giant panda (Ailuropoda melanoleuca) is an endangered species that lives mainly in central China, where bamboo is the main food source of wild giant pandas. Knowledge of the spatial distribution of bamboo is therefore important for identifying giant panda habitat. There have been ongoing studies on mapping bamboo and other tree species using remote sensing. WorldView-2 (WV-2) is the first high-resolution commercial satellite with eight Multi-Spectral (MS) bands. Recent studies have demonstrated that WV-2 imagery has high potential for the classification of tree species. Advanced classification techniques are important for utilising high spatial resolution imagery. It is generally agreed that object-based image analysis is a more desirable method than pixel-based analysis for processing high spatial resolution remotely sensed data. Classifiers that use spatial information combined with spectral information are known as contextual classifiers. It is suggested that contextual classifiers can achieve greater accuracy than non-contextual classifiers. Thus, spatial correlation can be incorporated into classifiers to improve classification results. The study area is located in the Wuyipeng area of Wolong, Sichuan Province. The complex environment makes information extraction difficult, since bamboo is sparsely distributed, mixed with brush, and covered by other trees. Extensive fieldwork in Wuyipeng was carried out twice. The first campaign, on 11th June, 2014, aimed at sampling feature locations for geometric correction and collecting training samples for classification. The second, on 11th September, 2014, was for the purpose of testing the classification results. In this study, spectral separability analysis was first performed to select appropriate MS bands for classification. The reflectance analysis also provided information for expanding the set of sample points when only a few were known.
Then, a spatially weighted object-based k-nearest neighbour (k-NN) classifier was applied to the selected MS bands to identify seven land cover types (bamboo, conifer, broadleaf, mixed forest, brush, bare land, and shadow), accounting for spatial correlation within classes using geostatistical modelling. The spatially weighted k-NN method was compared with three alternatives: the traditional k-NN classifier, the Support Vector Machine (SVM) method and the Classification and Regression Tree (CART). Field validation showed that the classification result obtained using the spatially weighted k-NN method has the highest overall classification accuracy (77.61%) and Kappa coefficient (0.729); the producer’s accuracy and user’s accuracy reach 81.25% and 95.12% for the bamboo class, respectively, also higher than the other methods. Photos of tree crowns were taken at sample locations using a fisheye camera so that the canopy density could be estimated. It was found that bamboo is difficult to identify in areas with a large canopy density (over 0.70), whereas it can be extracted in areas with a medium canopy density (from 0.2 to 0.7) and in sparse forest (canopy density less than 0.2). In summary, this study explores the ability of WV-2 imagery for bamboo extraction in a mountainous region in Sichuan. The study successfully identified the bamboo distribution, providing supporting knowledge for assessing the habitats of giant pandas.
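The accuracy figures quoted in this abstract (overall accuracy, Kappa coefficient, producer’s and user’s accuracy) are all derived from a confusion matrix. The sketch below shows the standard definitions on a made-up two-class matrix; the numbers are hypothetical and are not the study's data.

```python
def accuracy_metrics(confusion):
    """Compute map-accuracy measures from a square confusion matrix.

    confusion[i][j] is the number of validation samples whose reference
    class is i and whose predicted class is j. Assumes every class has at
    least one reference sample and at least one predicted sample.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_totals = [sum(confusion[i]) for i in range(k)]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    overall = sum(confusion[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (overall - expected) / (1 - expected)

    # Producer's accuracy: correct / reference total (omission errors).
    producers = [confusion[i][i] / row_totals[i] for i in range(k)]
    # User's accuracy: correct / predicted total (commission errors).
    users = [confusion[j][j] / col_totals[j] for j in range(k)]
    return overall, kappa, producers, users


# Hypothetical validation counts for two classes, e.g. bamboo vs. other.
example = [[40, 10],
           [5, 45]]
overall, kappa, producers, users = accuracy_metrics(example)
```

For this example the overall accuracy is 0.85 and chance agreement is 0.5, giving a Kappa of 0.7; the producer's accuracy of the first class is 40/50 and its user's accuracy is 40/45.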

Keywords: bamboo mapping, classification, geostatistics, k-NN, worldview-2

Procedia PDF Downloads 300
2557 Automatic Motion Trajectory Analysis for Dual Human Interaction Using Video Sequences

Authors: Yuan-Hsiang Chang, Pin-Chi Lin, Li-Der Jeng

Abstract:

Advances in image and video processing techniques have enabled the development of intelligent video surveillance systems. This study aimed to automatically detect moving human objects and to analyze events of dual human interaction in a surveillance scene. Our system was developed in four major steps: image preprocessing, human object detection, human object tracking, and motion trajectory analysis. Adaptive background subtraction and image processing techniques were used to detect and track moving human objects. To solve the occlusion problem during the interaction, the Kalman filter was used to retain a complete trajectory for each human object. Finally, motion trajectory analysis was developed to distinguish between interaction and non-interaction events based on derivatives of trajectories related to the speed of the moving objects. Using a database of 60 video sequences, our system achieved a classification accuracy of 80% for interaction events and 95% for non-interaction events. In summary, we have explored a system for the automatic classification of interaction and non-interaction events using surveillance cameras. Ultimately, this system could be incorporated in an intelligent surveillance system for the detection and/or classification of abnormal or criminal events (e.g., theft, snatching, fighting).

Keywords: motion detection, motion tracking, trajectory analysis, video surveillance

Procedia PDF Downloads 531
2556 Concentric Circle Detection based on Edge Pre-Classification and Extended RANSAC

Authors: Zhongjie Yu, Hancheng Yu

Abstract:

In this paper, we propose an effective method to detect concentric circles with imperfect edges. First, the gradient of each edge pixel is coded and a 2-D lookup table is built to speed up normal generation. Then we use an accumulator to estimate the rough center and collect plausible edges of concentric circles through gradient and distance. Next, we apply a contour-based method, which takes the intersection of contours and edges, to pre-classify the edges. Finally, we use the extended RANSAC method to find all the candidate circles. The center of the concentric circles is determined by the two circles with the highest concentricity. Experimental results demonstrate that the proposed method achieves both good speed and accuracy in the detection of concentric circles.
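The final RANSAC step can be illustrated with a plain single-circle RANSAC. This is a minimal sketch of the basic technique, not the extended variant proposed in the paper; the tolerance, iteration count and synthetic points are assumptions chosen for the example.

```python
import math
import random


def circle_from_points(p1, p2, p3):
    """Circle through three points as (cx, cy, r), or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def ransac_circle(points, iterations=200, tolerance=0.05, seed=0):
    """Return the circle supported by the most points, with its inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = circle_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        # A point is an inlier if its distance to the circle is within tolerance.
        inliers = [p for p in points
                   if abs(math.hypot(p[0] - cx, p[1] - cy) - r) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Synthetic edge points: 20 on a circle (center (2, -1), radius 3) plus 5 outliers.
on_circle = [(2 + 3 * math.cos(2 * math.pi * k / 20),
              -1 + 3 * math.sin(2 * math.pi * k / 20)) for k in range(20)]
outliers = [(10.0, 10.0), (-8.0, 7.0), (9.0, -9.0), (0.0, 0.5), (-7.0, -6.0)]
model, inliers = ransac_circle(on_circle + outliers)
```

To handle concentric circles, one option would be to rerun the search on the remaining points after removing the inliers of each detected circle and then compare the recovered centers; that loop is left out of this sketch.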

Keywords: concentric circle detection, gradient, contour, edge pre-classification, RANSAC

Procedia PDF Downloads 123
2555 Comparison of Artificial Neural Networks and Statistical Classifiers in Olive Sorting Using Near-Infrared Spectroscopy

Authors: İsmail Kavdır, M. Burak Büyükcan, Ferhat Kurtulmuş

Abstract:

The table olive is a valuable product, especially in Mediterranean countries. It is usually consumed after some fermentation process. Defects that occur naturally, or as a result of an impact while the olives are still fresh, may become more distinct after the processing period. Defective olives are not desired in either the table olive or the olive oil industry, as they affect final product quality and reduce market prices considerably. It is therefore critical to sort table olives according to their quality and surface defects before processing, or even after processing. However, manual sorting has many drawbacks, such as high expense, subjectivity, tediousness and inconsistency. The quality criteria for green olives were taken to be color and freedom from mechanical defects, wrinkling, surface blemishes and rotting. This study aimed to classify fresh table olives using different classifiers and NIR spectroscopy readings, and also to compare the classifiers. For this purpose, green (Ayvalik variety) olives were classified based on their surface features as defect-free, bruised, or with fly defect, using FT-NIR spectroscopy and classification algorithms such as artificial neural networks, ident and cluster. A Bruker multi-purpose analyzer (MPA) FT-NIR spectrometer (Bruker Optik, GmbH, Ettlingen Germany) was used for spectral measurements. The spectrometer was equipped with InGaAs detectors (TE-InGaAs internal for reflectance and RT-InGaAs external for transmittance) and a 20-watt high intensity tungsten–halogen NIR light source. Reflectance measurements were performed with a fiber optic probe (type IN 261) which covered the wavelengths between 780–2500 nm, while transmittance measurements were performed between 800 and 1725 nm. Thirty-two scans were acquired for each reflectance spectrum in about 15.32 s while 128 scans were obtained for transmittance in about 62 s. Resolution was 8 cm⁻¹ for both spectral measurement modes.
Instrument control was done using OPUS software (Bruker Optik, GmbH, Ettlingen Germany). Classification was performed using three classifiers: Backpropagation Neural Networks and the ident and cluster classification algorithms. For these applications, the Neural Network toolbox in Matlab and the ident and cluster modules in OPUS software were used. Classifications were performed under different scenarios: two quality conditions at once (good vs bruised, good vs fly defect) and three quality conditions at once (good, bruised and fly defect). Two spectrometer readings were used in the classification applications: reflectance and transmittance. Using the artificial neural network algorithm, success rates in discriminating good olives from bruised olives, from olives with fly defect, and from the group including both bruised and fly-defected olives ranged between 97 and 99%, between 61 and 94%, and between 58.67 and 92%, respectively. On the other hand, classification results for discriminating good olives from bruised ones and good olives from fly-defected olives using the ident method ranged between 75-97.5% and 32.5-57.5%, respectively; results obtained for the same classification applications using the cluster method ranged between 52.5-97.5% and between 22.5-57.5%.

Keywords: artificial neural networks, statistical classifiers, NIR spectroscopy, reflectance, transmittance

Procedia PDF Downloads 233
2554 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining

Procedia PDF Downloads 338
2553 A Comparative Analysis of Classification Models with Wrapper-Based Feature Selection for Predicting Student Academic Performance

Authors: Abdullah Al Farwan, Ya Zhang

Abstract:

In today’s educational arena, it is critical to understand educational data and be able to evaluate important aspects, particularly data on student achievement. Educational Data Mining (EDM) is a research area that focuses on uncovering patterns and information in data from educational institutions. Teachers who are able to predict their students' class performance can use this information to improve their teaching. Such predictions have become valuable knowledge that can be used for a wide range of objectives; for example, they can inform a strategic plan for delivering high-quality education. Based on previous data, this paper recommends employing data mining techniques to forecast students' final grades. In this study, five data mining methods, Decision Tree, JRip, Naive Bayes, Multi-layer Perceptron, and Random Forest with wrapper feature selection, were used on two datasets relating to Portuguese language and mathematics classes. The results showed the effectiveness of using data mining learning methodologies in predicting student academic success. The classification accuracy achieved with selected algorithms lies in the range of 80-94%. Among all the selected classification algorithms, the lowest accuracy is achieved by the Multi-layer Perceptron algorithm, which is close to 70.45%, and the highest accuracy is achieved by the Random Forest algorithm, which is close to 94.10%. This proposed work can assist educational administrators in identifying poorly performing students at an early stage and perhaps implementing motivational interventions to improve their academic success and prevent educational dropout.

Keywords: classification algorithms, decision tree, feature selection, multi-layer perceptron, Naïve Bayes, random forest, students’ academic performance

Procedia PDF Downloads 153
2552 AI Tutor: A Computer Science Domain Knowledge Graph-Based QA System on JADE platform

Authors: Yingqi Cui, Changran Huang, Raymond Lee

Abstract:

In this paper, we propose an AI Tutor that uses ontology and natural language processing techniques to generate a computer science domain knowledge graph and answer users’ questions based on the knowledge graph. We define eight types of relations to extract relationships between entities from computer science domain text. The AI tutor is separated into two agents, a learning agent and a Question-Answer (QA) agent, and is developed on the JADE (a multi-agent system) platform. The learning agent is responsible for reading text to extract information and generate a corresponding knowledge graph using defined patterns. The QA agent can understand users’ questions and answer them based on the knowledge graph generated by the learning agent.

Keywords: artificial intelligence, natural language processing, knowledge graph, intelligent agents, QA system

Procedia PDF Downloads 168
2551 Curvelet Features with Mouth and Face Edge Ratios for Facial Expression Identification

Authors: S. Kherchaoui, A. Houacine

Abstract:

This paper presents a facial expression recognition system. It identifies and classifies the seven basic expressions: happy, surprise, fear, disgust, sadness, anger, and neutral. It consists of three main parts. The first is the detection of a face and the corresponding facial features to extract the most expressive portion of the face, followed by a normalization of the region of interest. Then curvelet coefficients are computed, with dimensionality reduction through principal component analysis. The resulting coefficients are combined with two ratios, the mouth ratio and the face edge ratio, to form the full feature vector. The third step is the classification of the emotional state using the SVM method in the feature space.

Keywords: facial expression identification, curvelet coefficient, support vector machine (SVM), recognition system

Procedia PDF Downloads 225