Search results for: Kazakh speech dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1852

Search results for: Kazakh speech dataset

922 NFResNet: Multi-Scale and U-Shaped Networks for Deblurring

Authors: Tanish Mittal, Preyansh Agrawal, Esha Pahwa, Aarya Makwana

Abstract:

Multi-Scale and U-shaped Networks are widely used in various image restoration problems, including deblurring. Keeping in mind the wide range of applications, we present a comparison of these architectures and their effects on image deblurring. We also introduce a new block called as NFResblock. It consists of a Fast Fourier Transformation layer and a series of modified Non-Linear Activation Free Blocks. Based on these architectures and additions, we introduce NFResnet and NFResnet+, which are modified multi-scale and U-Net architectures, respectively. We also use three differ-ent loss functions to train these architectures: Charbonnier Loss, Edge Loss, and Frequency Reconstruction Loss. Extensive experiments on the Deep Video Deblurring dataset, along with ablation studies for each component, have been presented in this paper. The proposed architectures achieve a considerable increase in Peak Signal to Noise (PSNR) ratio and Structural Similarity Index (SSIM) value.

Keywords: multi-scale, Unet, deblurring, FFT, resblock, NAF-block, nfresnet, charbonnier, edge, frequency reconstruction

Procedia PDF Downloads 108
921 Exploring the Use of Discourse Markers by American Male and Female Politicians: A Corpus Based Study

Authors: Gohar Rahman, Rabia Saad Ullah

Abstract:

This research aims to examine the use of discourse markers within the dominion of political speeches, differentiating between genders. The analysis centers on twelve speakers, comprising six males and six females. Speeches selected include commencement, victory, state union addresses, campaigns, and presidential speeches. Halliday and Hasan's cohesion framework, specifically discourse markers, is utilized as a theoretical framework. Data is quantitatively analyzed using AntConc to identify marker frequency. The findings are presented through Excel's tables and graphs, suggesting differences in discourse marker preferences between genders. The findings suggest a divergence in the preferences for discourse markers between males and females. However, asserting that females utilize discourse markers more frequently due to the increased use of filler words, face threat mitigation, and polite speech would be an exaggeration. The disparity in frequency is not substantial, suggesting that males and females exhibit varying language inclinations to some degree.

Keywords: discourse markers, political discourse, gender, speeches, language

Procedia PDF Downloads 45
920 Multidirectional Product Support System for Decision Making in Textile Industry Using Collaborative Filtering Methods

Authors: A. Senthil Kumar, V. Murali Bhaskaran

Abstract:

In the information technology ground, people are using various tools and software for their official use and personal reasons. Nowadays, people are worrying to choose data accessing and extraction tools at the time of buying and selling their products. In addition, worry about various quality factors such as price, durability, color, size, and availability of the product. The main purpose of the research study is to find solutions to these unsolved existing problems. The proposed algorithm is a Multidirectional Rank Prediction (MDRP) decision making algorithm in order to take an effective strategic decision at all the levels of data extraction, uses a real time textile dataset and analyzes the results. Finally, the results are obtained and compared with the existing measurement methods such as PCC, SLCF, and VSS. The result accuracy is higher than the existing rank prediction methods.

Keywords: Knowledge Discovery in Database (KDD), Multidirectional Rank Prediction (MDRP), Pearson’s Correlation Coefficient (PCC), VSS (Vector Space Similarity)

Procedia PDF Downloads 269
919 Plant Leaf Recognition Using Deep Learning

Authors: Aadhya Kaul, Gautam Manocha, Preeti Nagrath

Abstract:

Our environment comprises of a wide variety of plants that are similar to each other and sometimes the similarity between the plants makes the identification process tedious thus increasing the workload of the botanist all over the world. Now all the botanists cannot be accessible all the time for such laborious plant identification; therefore, there is an urge for a quick classification model. Also, along with the identification of the plants, it is also necessary to classify the plant as healthy or not as for a good lifestyle, humans require good food and this food comes from healthy plants. A large number of techniques have been applied to classify the plants as healthy or diseased in order to provide the solution. This paper proposes one such method known as anomaly detection using autoencoders using a set of collections of leaves. In this method, an autoencoder model is built using Keras and then the reconstruction of the original images of the leaves is done and the threshold loss is found in order to classify the plant leaves as healthy or diseased. A dataset of plant leaves is considered to judge the reconstructed performance by convolutional autoencoders and the average accuracy obtained is 71.55% for the purpose.

Keywords: convolutional autoencoder, anomaly detection, web application, FLASK

Procedia PDF Downloads 142
918 Named Entity Recognition System for Tigrinya Language

Authors: Sham Kidane, Fitsum Gaim, Ibrahim Abdella, Sirak Asmerom, Yoel Ghebrihiwot, Simon Mulugeta, Natnael Ambassager

Abstract:

The lack of annotated datasets is a bottleneck to the progress of NLP in low-resourced languages. The work presented here consists of large-scale annotated datasets and models for the named entity recognition (NER) system for the Tigrinya language. Our manually constructed corpus comprises over 340K words tagged for NER, with over 118K of the tokens also having parts-of-speech (POS) tags, annotated with 12 distinct classes of entities, represented using several types of tagging schemes. We conducted extensive experiments covering convolutional neural networks and transformer models; the highest performance achieved is 88.8% weighted F1-score. These results are especially noteworthy given the unique challenges posed by Tigrinya’s distinct grammatical structure and complex word morphologies. The system can be an essential building block for the advancement of NLP systems in Tigrinya and other related low-resourced languages and serve as a bridge for cross-referencing against higher-resourced languages.

Keywords: Tigrinya NER corpus, TiBERT, TiRoBERTa, BiLSTM-CRF

Procedia PDF Downloads 95
917 Efficacy of Deep Learning for Below-Canopy Reconstruction of Satellite and Aerial Sensing Point Clouds through Fractal Tree Symmetry

Authors: Dhanuj M. Gandikota

Abstract:

Sensor-derived three-dimensional (3D) point clouds of trees are invaluable in remote sensing analysis for the accurate measurement of key structural metrics, bio-inventory values, spatial planning/visualization, and ecological modeling. Machine learning (ML) holds the potential in addressing the restrictive tradeoffs in cost, spatial coverage, resolution, and information gain that exist in current point cloud sensing methods. Terrestrial laser scanning (TLS) remains the highest fidelity source of both canopy and below-canopy structural features, but usage is limited in both coverage and cost, requiring manual deployment to map out large, forested areas. While aerial laser scanning (ALS) remains a reliable avenue of LIDAR active remote sensing, ALS is also cost-restrictive in deployment methods. Space-borne photogrammetry from high-resolution satellite constellations is an avenue of passive remote sensing with promising viability in research for the accurate construction of vegetation 3-D point clouds. It provides both the lowest comparative cost and the largest spatial coverage across remote sensing methods. However, both space-borne photogrammetry and ALS demonstrate technical limitations in the capture of valuable below-canopy point cloud data. Looking to minimize these tradeoffs, we explored a class of powerful ML algorithms called Deep Learning (DL) that show promise in recent research on 3-D point cloud reconstruction and interpolation. Our research details the efficacy of applying these DL techniques to reconstruct accurate below-canopy point clouds from space-borne and aerial remote sensing through learned patterns of tree species fractal symmetry properties and the supplementation of locally sourced bio-inventory metrics. From our dataset, consisting of tree point clouds obtained from TLS, we deconstructed the point clouds of each tree into those that would be obtained through ALS and satellite photogrammetry of varying resolutions. We fed this ALS/satellite point cloud dataset, along with the simulated local bio-inventory metrics, into the DL point cloud reconstruction architectures to generate the full 3-D tree point clouds (the truth values are denoted by the full TLS tree point clouds containing the below-canopy information). Point cloud reconstruction accuracy was validated both through the measurement of error from the original TLS point clouds as well as the error of extraction of key structural metrics, such as crown base height, diameter above root crown, and leaf/wood volume. The results of this research additionally demonstrate the supplemental performance gain of using minimum locally sourced bio-inventory metric information as an input in ML systems to reach specified accuracy thresholds of tree point cloud reconstruction. This research provides insight into methods for the rapid, cost-effective, and accurate construction of below-canopy tree 3-D point clouds, as well as the supported potential of ML and DL to learn complex, unmodeled patterns of fractal tree growth symmetry.

Keywords: deep learning, machine learning, satellite, photogrammetry, aerial laser scanning, terrestrial laser scanning, point cloud, fractal symmetry

Procedia PDF Downloads 84
916 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There are several new overloaded on the web. Searching some useful documents from the web on a specific topic, which is written in the Amharic language, is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be launch. This study attempts to design an automatic Amharic news classification using a supervised learning mechanism on four un-touch classes. To achieve this research, 4,182 news articles were used. Naive Bayes (NB) and Decision tree (j48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifier. As a result, it shows those algorithms can be applicable in Amharic news categorization. The best average accuracy result is achieved by j48 decision tree and naïve Bayes is 95.2345 %, and 94.6245 % respectively using three categories. This research indicated that a typical decision tree algorithm is more applicable to Amharic news categorization.

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 173
915 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing the accuracy of manual human classification. The high imbalance of camera trap datasets, however, results in poor accuracies for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of Focal Loss, in comparison to the traditional Cross Entropy Loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although Focal Loss slightly decreased the accuracy of the majority species, it was able to increase the F1-score by 0.06 and improve the identification of the bottom two, five and ten (minority) species by 37.5%, 15.7% and 10.8%, respectively, as well as resulting in an improved overall accuracy of 2.96%.

Keywords: convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation

Procedia PDF Downloads 169
914 An Experimental Study on Some Conventional and Hybrid Models of Fuzzy Clustering

Authors: Jeugert Kujtila, Kristi Hoxhalli, Ramazan Dalipi, Erjon Cota, Ardit Murati, Erind Bedalli

Abstract:

Clustering is a versatile instrument in the analysis of collections of data providing insights of the underlying structures of the dataset and enhancing the modeling capabilities. The fuzzy approach to the clustering problem increases the flexibility involving the concept of partial memberships (some value in the continuous interval [0, 1]) of the instances in the clusters. Several fuzzy clustering algorithms have been devised like FCM, Gustafson-Kessel, Gath-Geva, kernel-based FCM, PCM etc. Each of these algorithms has its own advantages and drawbacks, so none of these algorithms would be able to perform superiorly in all datasets. In this paper we will experimentally compare FCM, GK, GG algorithm and a hybrid two-stage fuzzy clustering model combining the FCM and Gath-Geva algorithms. Firstly we will theoretically dis-cuss the advantages and drawbacks for each of these algorithms and we will describe the hybrid clustering model exploiting the advantages and diminishing the drawbacks of each algorithm. Secondly we will experimentally compare the accuracy of the hybrid model by applying it on several benchmark and synthetic datasets.

Keywords: fuzzy clustering, fuzzy c-means algorithm (FCM), Gustafson-Kessel algorithm, hybrid clustering model

Procedia PDF Downloads 495
913 Family Satisfaction with Neuro-Linguistic Care for Patients with Alzheimer’s Disease

Authors: Sara Sahraoui

Abstract:

This research studied the effect of Alzheimer's disease (AD) on language information processing in subjects with Alzheimer’s disease (AD) who were bilingual (French and dialectical Arabic). The results show a disorder of certain semantic aspects of their mother tongue (L1). On the other hand, grammatical levels appeared to be relatively unaffected in oral speech in L1 but were disturbed in the second language (L2). In consequence, we constructed a cognitive-language stimulation protocol for bilingual patients (PSCLAB) to respond to this disorder. The efficacy of this protocol in terms of rehabilitation was assessed in 30 such patients through discourse analysis carried out before and after initiating the protocol. The results show that cognitive/language training using the PSCLAB appears to improve the language behaviour of bilingual patients with AD. However, this survey study aims to verify the satisfaction of patients’ relatives with the results of cognitive language training by PSCLAB. We developed a brief instrument to measure the satisfaction of family members. The results report that the patient's relatives are satisfied with the results of cognitive training by PSCLAB.

Keywords: satisfaction, Alzheimer's disease, rehabilitation, levels language

Procedia PDF Downloads 51
912 Estimating Gait Parameter from Digital RGB Camera Using Real Time AlphaPose Learning Architecture

Authors: Murad Almadani, Khalil Abu-Hantash, Xinyu Wang, Herbert Jelinek, Kinda Khalaf

Abstract:

Gait analysis is used by healthcare professionals as a tool to gain a better understanding of the movement impairment and track progress. In most circumstances, monitoring patients in their real-life environments with low-cost equipment such as cameras and wearable sensors is more important. Inertial sensors, on the other hand, cannot provide enough information on angular dynamics. This research offers a method for tracking 2D joint coordinates using cutting-edge vision algorithms and a single RGB camera. We provide an end-to-end comprehensive deep learning pipeline for marker-less gait parameter estimation, which, to our knowledge, has never been done before. To make our pipeline function in real-time for real-world applications, we leverage the AlphaPose human posture prediction model and a deep learning transformer. We tested our approach on the well-known GPJATK dataset, which produces promising results.

Keywords: gait analysis, human pose estimation, deep learning, real time gait estimation, AlphaPose, transformer

Procedia PDF Downloads 103
911 Analyzing Semantic Feature Using Multiple Information Sources for Reviews Summarization

Authors: Yu Hung Chiang, Hei Chia Wang

Abstract:

Nowadays, tourism has become a part of life. Before reserving hotels, customers need some information, which the most important source is online reviews, about hotels to help them make decisions. Due to the dramatic growing of online reviews, it is impossible for tourists to read all reviews manually. Therefore, designing an automatic review analysis system, which summarizes reviews, is necessary for them. The main purpose of the system is to understand the opinion of reviews, which may be positive or negative. In other words, the system would analyze whether the customers who visited the hotel like it or not. Using sentiment analysis methods will help the system achieve the purpose. In sentiment analysis methods, the targets of opinion (here they are called the feature) should be recognized to clarify the polarity of the opinion because polarity of the opinion may be ambiguous. Hence, the study proposes an unsupervised method using Part-Of-Speech pattern and multi-lexicons sentiment analysis to summarize all reviews. We expect this method can help customers search what they want information as well as make decisions efficiently.

Keywords: text mining, sentiment analysis, product feature extraction, multi-lexicons

Procedia PDF Downloads 317
910 Examining Bulling Rates among Youth with Intellectual Disabilities

Authors: Kaycee L. Bills

Abstract:

Adolescents and youth who are members of a minority group are more likely to experience higher rates of bullying in comparison to other student demographics. Specifically, adolescents with intellectual disabilities are a minority population that is more susceptible to experience unfair treatment in social settings. This study employs the 2015 Wave of the National Crime Victimization Survey – School Crime Supplement (NCVS/SCS) longitudinal dataset to explore bullying rates experienced among adolescents with intellectual disabilities. This study uses chi-square testing and a logistic regression to analyze if having a disability influences the likelihood of being bullied in comparison to other student demographics. Results of the chi-square testing and the logistic regression indicate that adolescent students who were identified as having a disability were approximately four times more likely to experience higher bullying rates in comparison to all other majority and minority student populations. Thus, it means having a disability resulted in higher bullying rates in comparison to all student groups.

Keywords: disability, bullying, social work, school bullying

Procedia PDF Downloads 115
909 Heart Attack Prediction Using Several Machine Learning Methods

Authors: Suzan Anwar, Utkarsh Goyal

Abstract:

Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardio and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines relationship between the individual's various heart health inputs like age, sex, cp, trestbps, thalach, oldpeaketc, and the likelihood of developing heart disease. Machine learning techniques like logistic regression and decision tree, and Python are used. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having a heart disease with variable accuracy. Logistic regression has yielded an accuracy of 80.48% without data handling. With data handling (normalization, standardscaler), the logistic regression resulted in improved accuracy of 87.80%, decision tree 100%, random forest 100%, and SVM 100%.

Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest

Procedia PDF Downloads 126
908 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 137
907 Recognizing Human Actions by Multi-Layer Growing Grid Architecture

Authors: Z. Gharaee

Abstract:

Recognizing actions performed by others is important in our daily lives since it is necessary for communicating with others in a proper way. We perceive an action by observing the kinematics of motions involved in the performance. We use our experience and concepts to make a correct recognition of the actions. Although building the action concepts is a life-long process, which is repeated throughout life, we are very efficient in applying our learned concepts in analyzing motions and recognizing actions. Experiments on the subjects observing the actions performed by an actor show that an action is recognized after only about two hundred milliseconds of observation. In this study, hierarchical action recognition architecture is proposed by using growing grid layers. The first-layer growing grid receives the pre-processed data of consecutive 3D postures of joint positions and applies some heuristics during the growth phase to allocate areas of the map by inserting new neurons. As a result of training the first-layer growing grid, action pattern vectors are generated by connecting the elicited activations of the learned map. The ordered vector representation layer receives action pattern vectors to create time-invariant vectors of key elicited activations. Time-invariant vectors are sent to second-layer growing grid for categorization. This grid creates the clusters representing the actions. Finally, one-layer neural network developed by a delta rule labels the action categories in the last layer. System performance has been evaluated in an experiment with the publicly available MSR-Action3D dataset. There are actions performed by using different parts of human body: Hand Clap, Two Hands Wave, Side Boxing, Bend, Forward Kick, Side Kick, Jogging, Tennis Serve, Golf Swing, Pick Up and Throw. The growing grid architecture was trained by applying several random selections of generalization test data fed to the system during on average 100 epochs for each training of the first-layer growing grid and around 75 epochs for each training of the second-layer growing grid. The average generalization test accuracy is 92.6%. A comparison analysis between the performance of growing grid architecture and self-organizing map (SOM) architecture in terms of accuracy and learning speed show that the growing grid architecture is superior to the SOM architecture in action recognition task. The SOM architecture completes learning the same dataset of actions in around 150 epochs for each training of the first-layer SOM while it takes 1200 epochs for each training of the second-layer SOM and it achieves the average recognition accuracy of 90% for generalization test data. In summary, using the growing grid network preserves the fundamental features of SOMs, such as topographic organization of neurons, lateral interactions, the abilities of unsupervised learning and representing high dimensional input space in the lower dimensional maps. The architecture also benefits from an automatic size setting mechanism resulting in higher flexibility and robustness. Moreover, by utilizing growing grids the system automatically obtains a prior knowledge of input space during the growth phase and applies this information to expand the map by inserting new neurons wherever there is high representational demand.

Keywords: action recognition, growing grid, hierarchical architecture, neural networks, system performance

Procedia PDF Downloads 143
906 Using Artificial Vision Techniques for Dust Detection on Photovoltaic Panels

Authors: Gustavo Funes, Eduardo Peters, Jose Delpiano

Abstract:

It is widely known that photovoltaic technology has been massively distributed over the last decade despite its low-efficiency ratio. Dust deposition reduces this efficiency even more, lowering the energy production and module lifespan. In this work, we developed an artificial vision algorithm based on CIELAB color space to identify dust over panels in an autonomous way. We performed several experiments photographing three different types of panels, 30W, 340W and 410W. Those panels were soiled artificially with uniform and non-uniform distributed dust. The algorithm proposed uses statistical tools to provide a simulation with a 100% soiled panel and then performs a comparison to get the percentage of dirt in the experimental data set. The simulation uses a seed that is obtained by taking a dust sample from the maximum amount of dust from the dataset. The final result is the dirt percentage and the possible distribution of dust over the panel. Dust deposition is a key factor for plant owners to determine cleaning cycles or identify nonuniform depositions that could lead to module failure and hot spots.

Keywords: dust detection, photovoltaic, artificial vision, soiling

Procedia PDF Downloads 35
905 Automated Process Quality Monitoring and Diagnostics for Large-Scale Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Continuous monitoring of industrial plants is one of necessary tasks when it comes to ensuring high-quality final products. In terms of monitoring and diagnosis, it is quite critical and important to detect some incipient abnormal events of manufacturing processes in order to improve safety and reliability of operations involved and to reduce related losses. In this work a new multivariate statistical online diagnostic method is presented using a case study. For building some reference models an empirical discriminant model is constructed based on various past operation runs. When a fault is detected on-line, an on-line diagnostic module is initiated. Finally, the status of the current operating conditions is compared with the reference model to make a diagnostic decision. The performance of the presented framework is evaluated using a dataset from complex industrial processes. It has been shown that the proposed diagnostic method outperforms other techniques especially in terms of incipient detection of any faults occurred.

Keywords: data mining, empirical model, on-line diagnostics, process fault, process monitoring

Procedia PDF Downloads 387
904 Machine Learning Driven Analysis of Kepler Objects of Interest to Identify Exoplanets

Authors: Akshat Kumar, Vidushi

Abstract:

This paper identifies 27 KOIs, 26 of which are currently classified as candidates and one as false positives that have a high probability of being confirmed. For this purpose, 11 machine learning algorithms were implemented on the cumulative kepler dataset sourced from the NASA exoplanet archive; it was observed that the best-performing model was HistGradientBoosting and XGBoost with a test accuracy of 93.5%, and the lowest-performing model was Gaussian NB with a test accuracy of 54%, to test model performance F1, cross-validation score and RUC curve was calculated. Based on the learned models, the significant characteristics for confirm exoplanets were identified, putting emphasis on the object’s transit and stellar properties; these characteristics were namely koi_count, koi_prad, koi_period, koi_dor, koi_ror, and koi_smass, which were later considered to filter out the potential KOIs. The paper also calculates the Earth similarity index based on the planetary radius and equilibrium temperature for each KOI identified to aid in their classification.

Keywords: Kepler objects of interest, exoplanets, space exploration, machine learning, earth similarity index, transit photometry

Procedia PDF Downloads 44
903 A Corpus-Based Study on the Lexical, Syntactic and Sequential Features across Interpreting Types

Authors: Qianxi Lv, Junying Liang

Abstract:

Among the various modes of interpreting, simultaneous interpreting (SI) is regarded as a ‘complex’ and ‘extreme condition’ of cognitive tasks while consecutive interpreters (CI) do not have to share processing capacity between tasks. Given that SI exerts great cognitive demand, it makes sense to posit that the output of SI may be more compromised than that of CI in the linguistic features. The bulk of the research has stressed the varying cognitive demand and processes involved in different modes of interpreting; however, related empirical research is sparse. In keeping with our interest in investigating the quantitative linguistic factors discriminating between SI and CI, the current study seeks to examine the potential lexical simplification, syntactic complexity and sequential organization mechanism with a self-made inter-model corpus of transcribed simultaneous and consecutive interpretation, translated speech and original speech texts with a total running word of 321960. The lexical features are extracted in terms of the lexical density, list head coverage, hapax legomena, and type-token ratio, as well as core vocabulary percentage. Dependency distance, an index for syntactic complexity and reflective of processing demand is employed. Frequency motif is a non-grammatically-bound sequential unit and is also used to visualize the local function distribution of interpreting the output. While SI is generally regarded as multitasking with high cognitive load, our findings evidently show that CI may impose heavier or taxing cognitive resource differently and hence yields more lexically and syntactically simplified output. In addition, the sequential features manifest that SI and CI organize the sequences from the source text in different ways into the output, to minimize the cognitive load respectively. We reasoned the results in the framework that cognitive demand is exerted both on maintaining and coordinating component of Working Memory. On the one hand, the information maintained in CI is inherently larger in volume compared to SI. On the other hand, time constraints directly influence the sentence reformulation process. The temporal pressure from the input in SI makes the interpreters only keep a small chunk of information in the focus of attention. Thus, SI interpreters usually produce the output by largely retaining the source structure so as to relieve the information from the working memory immediately after formulated in the target language. Conversely, CI interpreters receive at least a few sentences before reformulation, when they are more self-paced. CI interpreters may thus tend to retain and generate the information in a way to lessen the demand. In other words, interpreters cope with the high demand in the reformulation phase of CI by generating output with densely distributed function words, more content words of higher frequency values and fewer variations, simpler structures and more frequently used language sequences. We consequently propose a revised effort model based on the result for a better illustration of cognitive demand during both interpreting types.

Keywords: cognitive demand, corpus-based, dependency distance, frequency motif, interpreting types, lexical simplification, sequential units distribution, syntactic complexity

Procedia PDF Downloads 154
902 Comparison of Classical Computer Vision vs. Convolutional Neural Networks Approaches for Weed Mapping in Aerial Images

Authors: Paulo Cesar Pereira Junior, Alexandre Monteiro, Rafael da Luz Ribeiro, Antonio Carlos Sobieranski, Aldo von Wangenheim

Abstract:

In this paper, we present a comparison between convolutional neural networks and classical computer vision approaches, for the specific precision agriculture problem of weed mapping on sugarcane fields aerial images. A systematic literature review was conducted to find which computer vision methods are being used on this specific problem. The most cited methods were implemented, as well as four models of convolutional neural networks. All implemented approaches were tested using the same dataset, and their results were quantitatively and qualitatively analyzed. The obtained results were compared to a human expert made ground truth for validation. The results indicate that the convolutional neural networks present better precision and generalize better than the classical models.

Keywords: convolutional neural networks, deep learning, digital image processing, precision agriculture, semantic segmentation, unmanned aerial vehicles

Procedia PDF Downloads 230
901 Metaphor Institutionalization as Phase Transition: Case Studies of Chinese Metaphors

Authors: Xuri Tang, Ting Pan

Abstract:

Metaphor institutionalization refers to the propagation of a metaphor that leads to its acceptance in speech community as a norm of the language. Such knowledge is important to both theoretical studies of metaphor and practical disciplines such as lexicography and language generation. This paper reports an empirical study of metaphor institutionalization of 14 Chinese metaphors. It first explores the pattern of metaphor institutionalization by fitting the logistic function (or S-shaped curve) to time series data of conventionality of the metaphors that are automatically obtained from a large-scale diachronic Chinese corpus. Then it reports a questionnaire-based survey on the propagation scale of each metaphor, which is measured by the average number of subjects that can easily understand the metaphorical expressions. The study provides two pieces of evidence supporting the hypothesis that metaphor institutionalization is a phrase transition: (1) the pattern of metaphor institutionalization is an S-shaped curve and (2) institutionalized metaphors generally do not propagate to the whole community but remain in equilibrium state. This conclusion helps distinguish metaphor institutionalization from topicalization and other types of semantic change.

Keywords: metaphor institutionalization, phase transition, propagation scale, s-shaped curve

Procedia PDF Downloads 156
900 Idiopathic Gingival Fibromatosis

Authors: Bandana Koirala, Shivalal Sharma

Abstract:

Introduction: Gingival enlargements are quite common and may be either inflammatory, non-inflammatory or a combination of both. Idiopathic gingival enlargement is a rare condition with a proliferative fibrous lesion of the gingival tissue that causes esthetic and functional problems. It is of undetermined etiology. Case Description: This case report addresses the diagnosis and treatment of a case of idiopathic gingival enlargement in a 9-year-old male patient. The patient presented with a generalized diffuse gingival enlargement involving the entire maxillary and the mandibular arch with extension on occlusal, buccal, lingual, and palatal surfaces with just parts of occlusal surfaces of few upper and lower molars visible resulting in open mouth, difficulty in mastication and speech. Biopsy report confirmed the diagnosis of fibromatosis gingivae. Gingivectomy was carried out in all four quadrants by using external bevel incision. Conclusion: Though total esthetics could not be restored due to unusual bony enlargement, the general appearance improved satisfactorily. Treatment after complete excision however, improved the masticatory competence to a great extent.

Keywords: idiopathic gingival fibromatosis, gingival enlargement, gingivectomy, medical and health sciences

Procedia PDF Downloads 316
899 Finding Related Scientific Documents Using Formal Concept Analysis

Authors: Nadeem Akhtar, Hira Javed

Abstract:

An important aspect of research is literature survey. Availability of a large amount of literature across different domains triggers the need for optimized systems which provide relevant literature to researchers. We propose a search system based on keywords for text documents. This experimental approach provides a hierarchical structure to the document corpus. The documents are labelled with keywords using KEA (Keyword Extraction Algorithm) and are automatically organized in a lattice structure using Formal Concept Analysis (FCA). This groups the semantically related documents together. The hierarchical structure, based on keywords gives out only those documents which precisely contain them. This approach open doors for multi-domain research. The documents across multiple domains which are indexed by similar keywords are grouped together. A hierarchical relationship between keywords is obtained. To signify the effectiveness of the approach, we have carried out the experiment and evaluation on Semeval-2010 Dataset. Results depict that the presented method is considerably successful in indexing of scientific papers.

Keywords: formal concept analysis, keyword extraction algorithm, scientific documents, lattice

Procedia PDF Downloads 312
898 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum

Authors: Abdulrahman Sumayli, Saad M. AlShahrani

Abstract:

For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improve the efficiency of the process. We investigated the H2 solubility computation in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were considered as the inputs to the models, while the hydrogen solubility was the sole response. Specifically, we employed three different models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models are optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks, and we compared their performance based on several metrics. Our results show that the WOA-SVR model tuned with WOA achieves the best performance overall, with an RMSE of 1.38 × 10− 2 and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. Besides, the solubility of hydrogen in the four heavy oil fractions is estimated in different ranges of temperatures and pressures of 150 ◦C–350 ◦C and 1.2 MPa–10.8 MPa, respectively

Keywords: temperature, pressure variations, machine learning, oil treatment

Procedia PDF Downloads 54
897 Change Point Detection Using Random Matrix Theory with Application to Frailty in Elderly Individuals

Authors: Malika Kharouf, Aly Chkeir, Khac Tuan Huynh

Abstract:

Detecting change points in time series data is a challenging problem, especially in scenarios where there is limited prior knowledge regarding the data’s distribution and the nature of the transitions. We present a method designed for detecting changes in the covariance structure of high-dimensional time series data, where the number of variables closely matches the data length. Our objective is to achieve unbiased test statistic estimation under the null hypothesis. We delve into the utilization of Random Matrix Theory to analyze the behavior of our test statistic within a high-dimensional context. Specifically, we illustrate that our test statistic converges pointwise to a normal distribution under the null hypothesis. To assess the effectiveness of our proposed approach, we conduct evaluations on a simulated dataset. Furthermore, we employ our method to examine changes aimed at detecting frailty in the elderly.

Keywords: change point detection, hypothesis tests, random matrix theory, frailty in elderly

Procedia PDF Downloads 21
896 Assessing Bus Service Quality in Dhaka City from the Perspective of Female Passengers

Authors: S. K. Subah, R. Tasnim, M. I. Jahan, M. R. Islam

Abstract:

While talking about how comfortable and convenient Dhaka's bus service is, the minimum emphasis is placed on the female commuters of the Dhaka city. Recognizing the contemporary situation, the supreme focus is to develop experimental model based on statistical methods. SEM has been adopted to quantify passenger satisfaction, which is affected by the perceived service quality. The study deals with 16 observed variables and three latent variables, which were correlated to identify their significance on the regulation of perceived SQ (Service Quality). To calibrate the model, a dataset of 250 responses from female users of local buses has been utilized through survey. A questionnaire structured with SQ variables was prepared in consultation with prevailing literature, practitioners, academicians, and users. The result concludes that the attributes of safe and secured environment have the most significant impact on the overall bus service quality according to the insight of female respondents. The study outcome might be a great help for the policymakers, women's organizations, and NGOs to formulate transport policy that will ensure a women-friendly public bus service.

Keywords: bus service quality, female perception, structural equation modelling, safety-security, women friendly bus

Procedia PDF Downloads 136
895 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms

Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang

Abstract:

Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving the bioassay predictions. The first step is instance selection in which dataset is categorized into training, testing, and validation sets. The second step is discretization that partitions the data in consideration of accuracy vs. precision. The third step is normalization where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning algorithms including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combination of preprocessing steps and machine learning algorithms in more consistent and accurate prediction.

Keywords: bioassay, machine learning, preprocessing, virtual screen

Procedia PDF Downloads 259
894 Building Scalable and Accurate Hybrid Kernel Mapping Recommender

Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak

Abstract:

Recommender systems uses artificial intelligence practices for filtering obscure information and can predict if a user likes a specified item. Kernel mapping Recommender systems have been proposed which are accurate and state-of-the-art algorithms and resolve recommender system’s design objectives such as; long tail, cold-start, and sparsity. The aim of research is to propose hybrid framework that can efficiently integrate different versions— namely item-based and user-based KMR— of KMR algorithm. We have proposed various heuristic algorithms that integrate different versions of KMR (into a unified framework) resulting in improved accuracy and elimination of problems associated with conventional recommender system. We have tested our system on publically available movies dataset and benchmark with KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate especially under cold-start and sparse scenarios.

Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail

Procedia PDF Downloads 321
893 Comics Scanlation and Publishing Houses Translation

Authors: Sharifa Alshahrani

Abstract:

Comics is a multimodal text wherein meaning is created by taking in all modes of expression at once. It uses two different semiotic modes, the verbal and the visual modes, together to make meaning and these different semiotic modes can be socially and culturally shaped to give meaning. Therefore, comics translation cannot treat comics as a monomodal text by translating only the verbal mode inside or outside the speech balloons as the cultural differences are encoded in the visual mode as well. Due to the development of the internet and editing software, comics translation is not anymore confined to the publishing houses and official translation as scanlation, or the fan translation took the initiative in translating comics for being emotionally attracted to the culture and genre. Scanlation is carried out by volunteering fans who translate out of passion. However, quality is one of the debatable issues relating to scanlation and fan translation. This study will investigate how the dynamic multimodal relationship in comics is exploited and interpreted in the translation by exploring the translation strategies and procedures adopted by the publishing houses and scanlation in interpreting comics into Arabic using three analytical frameworks; cultural references model, multimodal relation model and translation strategies and procedures models.

Keywords: comics, multimodality, translation, scanlation

Procedia PDF Downloads 195