Search results for: Arabic text classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3603

3303 Multi-Criteria Inventory Classification Process Based on Logical Analysis of Data

Authors: Diana López-Soto, Soumaya Yacout, Francisco Ángel-Bello

Abstract:

Although inventories are considered as stocks of money sitting on shelves, they are needed in order to secure constant and continuous production. Therefore, companies need to control the amount of inventory in order to find the balance between excess and shortage of inventory. The classification of items according to certain criteria, such as the price, the usage rate and the lead time before arrival, allows any company to concentrate its investment in inventory according to a certain ranking or priority of items. This makes the decision-making process for inventory management easier and more justifiable. The purpose of this paper is to present a new approach for the classification of new items based on the already existing criteria. This approach is called the Logical Analysis of Data (LAD). It is used in this paper to assist the process of ABC item classification based on multiple criteria. LAD is a data mining technique based on Boolean theory that is used for pattern recognition. This technique has been tested in medicine, industry, credit risk analysis, and engineering with remarkable results. An application to ABC inventory classification is presented for the first time, and the results are compared with those obtained when using the well-known AHP technique and the ANN technique. The results show that LAD achieved very good classification accuracy.

Keywords: ABC multi-criteria inventory classification, inventory management, multi-class LAD model, multi-criteria classification

Procedia PDF Downloads 849
3302 On the Comprehension of English Compound Nouns by Arabic-Speaking EFL Learners

Authors: Abdel Rahman Altakhaineh, Mohamma Alaghawat, Hiba Alhendi

Abstract:

This paper reports an investigation of the comprehension of English compound nouns by sixty Arabic-speaking English as a Foreign Language (EFL) learners majoring in English at the University of Jordan, Amman. The investigation focused on the problems that these learners may encounter in understanding certain types of compounds and their ability to use their L1 compound noun knowledge to produce the meaning of L2 compound nouns. Participants whose English proficiency level was advanced took a test in which they identified the meaning of an underlined compound without using a dictionary. The responses to the three different types of compounds were analyzed using two-way repeated measures ANOVA, and the results showed that responses to endocentric and exocentric compounds differed within subordinative compounds, with a statistically significant difference between the two in favor of endocentric compounds. We argue that endocentric compounds, especially subordinative endocentric compounds, were more easily understood due to their representative nature, i.e., because the head represents the meaning of the whole compound. The study concludes with pedagogical implications for teaching compound nouns.

Keywords: morphology, compounding, SLA, Arabic-speaking EFL learners

Procedia PDF Downloads 85
3301 Practical Ways to Acquire the Arabic Language through Electronic Means

Authors: Hondozi Jahja

Abstract:

There is an obvious need to learn the Arabic language and teach it to speakers of other languages through new curricula. The idea is to bridge the gap between theory and practice. To that end, we have sought to offer some means of help to master the Arabic language, in addition to our efforts to apply these means, enriching the culture of the student and developing his vocabulary. Taking care of the practical aspect of the grammar was our constant goal, and this particular aspect is what builds the student’s positive values, refines his taste and develops his language. In addressing these issues, we have adopted a school-based approach based primarily on the active and positive participation of the student. The theoretical linguistic issues, in our opinion, are not a primary goal; the goal is for students to use them through speaking and application. Among the objectives of this research is to establish the basic language skills of the students using new means that help the student to acquire these skills and apply them in various subjects of interest in his progress and development. Unfortunately, some of our students consider the grammar ‘difficult’, ‘complex’ and ‘heavy’ in itself. This is one of the obstacles that stand in the way of their desired results. As a consequence, they end up talking (mumbling) about the difficulties they face in applying those rules. Therefore, some of our students finish their university studies unable to express what they feel using the language correctly. For this purpose, we have sought in this research to follow a new integrated approach, which is to study the grammar of the language through modern means that consolidate the principle of functional language, in which the rule serves to control speech and linguistic expression properly.
This research is a result of practical experience as a teacher of the Arabic language for non-native speakers at the ‘Hassan Pristina’ University, located in Pristina, the capital of Kosovo, and at the Qatar Training Center since its establishment in 2012.

Keywords: Arabic, applied methods, acquire, learning

Procedia PDF Downloads 131
3300 On Copular Constructions in Yemeni Arabic and the Cartography of Subjects

Authors: Ameen Alahdal

Abstract:

This paper investigates copular constructions in Raimi Yemeni Arabic (RYA). The aim of the paper is twofold. First, it explores the types of copular constructions in RYA, a variety of Arabic that has not attracted much attention. In this connection, the paper shows that RYA manifests ‘bare’, verbal and pronominal (PRON) copular constructions, just like other varieties of Arabic and indeed other Semitic languages such as Hebrew. The sentences below from RYA represent the three constructions, respectively.
(1) a. nada Hilwah
       Nada pretty.3sf
       ‘Nada is pretty’
    b. kan al-banat hina
       was the-girls here
       ‘The girls were here’
    c. ali hu-l mudiir
       Ali he-the manager
       ‘Ali is the manager’
Interestingly, in addition to these common types of copular constructions, RYA seems to exhibit dual copula sentences, a construction that features both a pronominal copula and a verbal copula. Such a construction is attested neither in Standard Arabic nor in other modern varieties of Arabic such as Lebanese, Moroccan, Egyptian, and Jordanian. Remarkably, dual copular sentences do not appear even in other dialects of Yemeni Arabic such as Sanaani, Adeni and Tehami. (2) is an example.
(2) maha kan-ih mudarrisah
    Maha was-she teacher.3sf
    ‘Maha was a teacher’
Second, the paper considers the cartography of subject positions in copular constructions proposed by Shlonsky and Rizzi (2018). Different copular constructions seem to involve different subject positions (which might eventually correlate with different interpretations, not our concern in this paper). Here, it is argued that in a bare copular sentence, as in (1a), RYA might exploit two criterial subject positions (in Rizzi’s sense), in addition to the canonical Spec,TP position. Under mainstream minimalist assumptions, a copular sentence is analyzed as a PredP. Thus, in addition to the PredP-related thematic subject position, a criterial subject position is posited outside of PredP. (3) below represents the cartography of subject positions in a bare copular construction.
(3) [……..DP subj PredP DP Pred DP/AP/PP ]
In PRON sentences, as exemplified in (1c), another two subject positions are postulated high in the clause, particularly above PolP. (4) illustrates the hierarchy of the subject positions in a PRON copular construction. The subject resides in Spec,SUBJ2P.
(4) …DP SUBJ2 …DP SUBJ1 … Pol … DP subj PredP
Another related phenomenon in RYA which sets it apart from other languages like Hebrew is that of the negative bare copular construction. This construction involves a PRON, which is not found in its affirmative counterpart. PRON, however, is hosted neither by SUBJ2⁰ nor by SUBJ1⁰. Rather, PRON occurs below Neg⁰ (Pol⁰ in the hierarchy). This situation raises interesting issues for the hierarchy of subjects in copular constructions as well as for the syntax of the left periphery in general. With regard to what causes the subject to move, there are different potential triggers. For instance, movement of the subject at the base, i.e., out of PredP, is triggered by a labeling failure. Other movements of the subject can be driven by a formal feature like EPP, or a criterial feature like [subj].

Keywords: Yemeni Arabic, copular constructions, cartography of subjects, labeling, criterial positions

Procedia PDF Downloads 73
3299 A Study of Various Ontology Learning Systems from Text and a Look into Future

Authors: Fatima Al-Aswadi, Chan Yong

Abstract:

With the volume of unstructured data on the web increasing day by day, the motivation to represent the knowledge in this data in machine-processable form has increased. Ontology is one of the major cornerstones of representing information in a more meaningful way on the Semantic Web. The goal of ontology learning from text is to elicit and represent domain knowledge in machine-readable form. This paper aims to give a follow-up review of ontology learning systems from text and some of their defects. Furthermore, it discusses how far the ontology learning process may improve in the future.

Keywords: concept discovery, deep learning, ontology learning, semantic relation, semantic web

Procedia PDF Downloads 485
3298 SNR Classification Using Multiple CNNs

Authors: Thinh Ngo, Paul Rad, Brian Kelley

Abstract:

Noise estimation is essential in today's wireless systems for power control, adaptive modulation, interference suppression and quality of service. Deep learning (DL) has already been applied in the physical layer for modulation and signal classification. Unacceptably low accuracy of less than 50% is found to undermine the traditional application of DL classification for SNR prediction. In this paper, we use a divide-and-conquer algorithm and a classifier fusion method to simplify SNR classification and therefore enhance DL learning and prediction. Specifically, multiple CNNs are used for classification rather than a single CNN. Each CNN performs a binary classification of a single SNR with two labels: less than, or greater than or equal to. Together, multiple CNNs are combined to effectively classify over the range −20 ≤ SNR ≤ 32 dB. We use pre-trained CNNs to predict SNR over a wide range of joint channel parameters including multiple Doppler shifts (0, 60, 120 Hz), power-delay profiles, and signal-modulation types (QPSK, 16-QAM, 64-QAM). The approach achieves individual SNR prediction accuracy of 92%, composite accuracy of 70% and prediction convergence one order of magnitude faster than that of traditional estimation.

Keywords: classification, CNN, deep learning, prediction, SNR
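
The abstract above does not say how the binary outputs are fused, so the following is only a minimal sketch under one assumption: each CNN answers "is the SNR at least my threshold?", and the answers are combined by counting the "yes" votes. The function name, the threshold grid and the vote-counting rule are all hypothetical, and the CNNs themselves are replaced by precomputed boolean decisions.

```python
import math
from typing import Sequence, Tuple


def fuse_binary_decisions(
    thresholds_db: Sequence[float], is_at_least: Sequence[bool]
) -> Tuple[float, float]:
    """Fuse per-threshold binary decisions into an SNR interval in dB.

    thresholds_db must be sorted ascending; is_at_least[i] is the (assumed)
    output of the i-th binary classifier: True means "SNR >= thresholds_db[i]".
    Counting votes tolerates isolated inconsistent decisions.
    """
    if len(thresholds_db) != len(is_at_least):
        raise ValueError("one decision per threshold is required")
    if list(thresholds_db) != sorted(thresholds_db):
        raise ValueError("thresholds must be sorted in ascending order")

    votes = sum(bool(d) for d in is_at_least)
    # The estimate lies between the last threshold voted "at least"
    # and the first threshold voted "less than".
    lower = thresholds_db[votes - 1] if votes > 0 else -math.inf
    upper = thresholds_db[votes] if votes < len(thresholds_db) else math.inf
    return lower, upper


if __name__ == "__main__":
    grid = [-20.0, -10.0, 0.0, 10.0, 20.0]  # hypothetical threshold grid
    print(fuse_binary_decisions(grid, [True, True, True, False, False]))
```

Counting votes rather than locating the first "less than" answer is a design choice made here for robustness; a weighted fusion of classifier confidences would be an equally plausible reading of the abstract.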

Procedia PDF Downloads 110
3297 U-Net Based Multi-Output Network for Lung Disease Segmentation and Classification Using Chest X-Ray Dataset

Authors: Jaiden X. Schraut

Abstract:

Medical Imaging Segmentation of Chest X-rays is used for the purpose of identification and differentiation of lung cancer, pneumonia, COVID-19, and similar respiratory diseases. Widespread application of computer-supported perception methods into the diagnostic pipeline has been demonstrated to increase prognostic accuracy and aid doctors in efficiently treating patients. Modern models attempt the task of segmentation and classification separately and improve diagnostic efficiency; however, to further enhance this process, this paper proposes a multi-output network that follows a U-Net architecture for image segmentation output and features an additional CNN module for auxiliary classification output. The proposed model achieves a final Jaccard Index of .9634 for image segmentation and a final accuracy of .9600 for classification on the COVID-19 radiography database.

Keywords: chest X-ray, deep learning, image segmentation, image classification
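
For readers unfamiliar with the segmentation metric reported above, the Jaccard index of two binary masks is the size of their intersection divided by the size of their union. The sketch below assumes masks are flattened into equal-length sequences of 0/1 values; the function name and the convention for two empty masks are choices made here, not taken from the paper.

```python
from typing import Sequence


def jaccard_index(predicted: Sequence[int], target: Sequence[int]) -> float:
    """Jaccard index (intersection over union) of two flattened binary masks."""
    if len(predicted) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(predicted, target) if p and t)
    union = sum(1 for p, t in zip(predicted, target) if p or t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


if __name__ == "__main__":
    print(jaccard_index([1, 1, 0, 0], [1, 0, 1, 0]))
```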

Procedia PDF Downloads 111
3296 Principle Components Updates via Matrix Perturbations

Authors: Aiman Elragig, Hanan Dreiwi, Dung Ly, Idriss Elmabrook

Abstract:

This paper highlights a new approach to online principal component analysis (OPCA). Given a data matrix X ∈ R^(m×n), we characterise the online updates of its covariance as a matrix perturbation problem. Up to the principal components, it turns out that online updates of the batch PCA can be captured by a symmetric matrix perturbation of the batch covariance matrix. We have shown that as n → n0 ≫ 1, the batch covariance and its update become almost similar. Finally, we utilize our new setup of online updates to find a bound on the angle distance between the principal components of X and its update.

Keywords: online data updates, covariance matrix, online principal component analysis, matrix perturbation
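
The idea of an online covariance update as a symmetric perturbation can be illustrated with the standard rank-one update of the population covariance. With n samples, mean m and covariance C, adding a sample x gives C' = n/(n+1) · C + n/(n+1)² · d dᵀ with d = x − m, so C' − C is symmetric and shrinks as n grows. This is a sketch of that textbook identity, not the paper's derivation; the function names are hypothetical and samples are stored as rows here.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def batch_mean_cov(rows: List[List[float]]) -> Tuple[List[float], Matrix]:
    """Mean and population covariance (divide by n) of samples stored as rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [
        [sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / n for j in range(d)]
        for i in range(d)
    ]
    return mean, cov


def online_update(
    mean: List[float], cov: Matrix, n: int, x: List[float]
) -> Tuple[List[float], Matrix]:
    """Update mean and covariance of n samples with one new sample x."""
    d = len(mean)
    delta = [x[j] - mean[j] for j in range(d)]
    new_mean = [mean[j] + delta[j] / (n + 1) for j in range(d)]
    scale_old = n / (n + 1)          # weight on the previous covariance
    scale_new = n / (n + 1) ** 2     # weight on the rank-one term delta * delta^T
    new_cov = [
        [scale_old * cov[i][j] + scale_new * delta[i] * delta[j] for j in range(d)]
        for i in range(d)
    ]
    return new_mean, new_cov


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
    m, c = batch_mean_cov(data)
    print(online_update(m, c, len(data), [2.0, 0.0]))
```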

Procedia PDF Downloads 173
3295 Case-Based Reasoning: A Hybrid Classification Model Improved with an Expert's Knowledge for High-Dimensional Problems

Authors: Bruno Trstenjak, Dzenana Donko

Abstract:

Data mining and classification of objects is the process of data analysis, using various machine learning techniques, which is used today in various fields of research. This paper presents a concept of hybrid classification model improved with the expert knowledge. The hybrid model in its algorithm has integrated several machine learning techniques (Information Gain, K-means, and Case-Based Reasoning) and the expert’s knowledge into one. The knowledge of experts is used to determine the importance of features. The paper presents the model algorithm and the results of the case study in which the emphasis was put on achieving the maximum classification accuracy without reducing the number of features.

Keywords: case based reasoning, classification, expert's knowledge, hybrid model
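
Information Gain, one of the techniques named in the abstract above, scores a feature by how much it reduces label entropy. The sketch below assumes categorical features and uses hypothetical function names; how the authors combine this score with the expert's feature weights is not stated in the abstract and is not modelled here.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(
    feature_values: Sequence[Hashable], labels: Sequence[Hashable]
) -> float:
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    print(information_gain(["a", "a", "b", "b"], labels))  # informative feature
    print(information_gain(["a", "b", "a", "b"], labels))  # uninformative feature
```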

Procedia PDF Downloads 347
3294 A Comparison of South East Asian Face Emotion Classification based on Optimized Ellipse Data Using Clustering Technique

Authors: M. Karthigayan, M. Rizon, Sazali Yaacob, R. Nagarajan, M. Muthukumaran, Thinaharan Ramachandran, Sargunam Thirugnanam

Abstract:

In this paper, a set of irregular and regular ellipse fitting equations optimized with a genetic algorithm (GA) is applied to the lip and eye features to classify human emotions. Two South East Asian (SEA) faces are considered in this work for the emotion classification. Six emotions and one neutral state are considered as the output. Each subject shows unique characteristics of the lip and eye features for various emotions. GA is adopted to optimize the irregular ellipse characteristics of the lip and eye features in each emotion. That is, the top portion of the lip configuration is part of one ellipse and the bottom portion is part of a different ellipse. Two ellipse-based fitness equations are proposed for the lip configuration, and the relevant parameters that define the emotions are listed. The GA method has achieved reasonably successful classification of emotion. In the classification of some emotions, the optimized data values of one emotion overlap with the ranges of other emotions. To overcome this overlap between the optimized emotion values and at the same time improve the classification, a fuzzy clustering method (FCM) has been implemented. The GA-FCM approach offers a reasonably good classification within the ranges of clusters; this was demonstrated by applying it to the two SEA subjects, for which it improved the classification rate.

Keywords: ellipse fitness function, genetic algorithm, emotion recognition, fuzzy clustering

Procedia PDF Downloads 527
3293 Sparse Coding Based Classification of Electrocardiography Signals Using Data-Driven Complete Dictionary Learning

Authors: Fuad Noman, Sh-Hussain Salleh, Chee-Ming Ting, Hadri Hussain, Syed Rasul

Abstract:

In this paper, a data-driven dictionary approach is proposed for the automatic detection and classification of cardiovascular abnormalities. The electrocardiography (ECG) signal is represented by trained complete dictionaries that contain prototypes or atoms, to avoid the limitations of pre-defined dictionaries. The data-driven trained dictionaries simply take the ECG signal as input rather than extracting features to study the set of parameters that yield the most descriptive dictionary. The approach inherently learns the complicated morphological changes in the ECG waveform, which are then used to improve the classification. The classification performance was evaluated with ECG data under two different preprocessing environments. In the first category, the QT database is corrected for baseline drift and notch filtered to remove the 60 Hz power line noise. In the second category, the data are further filtered using a fast moving average smoother. The experimental results on the QT database show that our proposed algorithm achieves a classification accuracy of 92%.

Keywords: electrocardiogram, dictionary learning, sparse coding, classification

Procedia PDF Downloads 355
3292 Semi-Automatic Method to Assist Expert for Association Rules Validation

Authors: Amdouni Hamida, Gammoudi Mohamed Mohsen

Abstract:

In order to help the expert validate association rules extracted from data, some quality measures have been proposed in the literature. We distinguish two categories: objective and subjective measures. The first depends on a fixed threshold and on the quality of the data from which the rules are extracted. The second consists of providing the expert with tools to explore and visualize rules during the evaluation step. However, the number of extracted rules to validate remains high. Thus, manually mining the rules is a very hard task. To solve this problem, we propose, in this paper, a semi-automatic method to assist the expert during association rule validation. Our method uses rule-based classification as follows: (i) we transform association rules into classification rules (classifiers); (ii) we use the generated classifiers for data classification; (iii) we visualize association rules with their classification quality to give the expert an overview and to assist him during the validation process.

Keywords: association rules, rule-based classification, classification quality, validation
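
Steps (i) and (ii) of the abstract above can be sketched as follows. The selection criterion (keep rules whose consequent is exactly one class label) and the conflict-resolution rule (highest confidence wins) are assumptions made for illustration; the abstract does not specify either, and all names and the toy rules are hypothetical.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Optional, Set, Tuple


class ClassRule(NamedTuple):
    antecedent: FrozenSet[str]
    label: str
    confidence: float


def to_classifier(
    association_rules: Iterable[Tuple[Set[str], Set[str], float]],
    class_labels: Set[str],
) -> List[ClassRule]:
    """Step (i): keep association rules whose consequent is a single class label."""
    rules = []
    for antecedent, consequent, confidence in association_rules:
        if len(consequent) == 1 and consequent <= class_labels:
            (label,) = consequent
            rules.append(ClassRule(frozenset(antecedent), label, confidence))
    # Highest-confidence rules are tried first.
    return sorted(rules, key=lambda r: r.confidence, reverse=True)


def classify(classifier: List[ClassRule], record: Set[str]) -> Optional[str]:
    """Step (ii): label a record with the first (most confident) matching rule."""
    for rule in classifier:
        if rule.antecedent <= record:
            return rule.label
    return None  # no rule covers this record


if __name__ == "__main__":
    mined = [
        ({"bread"}, {"butter"}, 0.90),
        ({"age=young"}, {"class=yes"}, 0.80),
        ({"age=young", "income=low"}, {"class=no"}, 0.95),
    ]
    clf = to_classifier(mined, {"class=yes", "class=no"})
    print(classify(clf, {"age=young", "income=low"}))
```

The fraction of records each rule labels correctly could then serve as the per-rule classification quality shown to the expert in step (iii).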

Procedia PDF Downloads 413
3291 Teaching Pragmatic Coherence in Literary Text: Analysis of Chimamanda Adichie’s Americanah

Authors: Joy Aworo-Okoroh

Abstract:

Literary texts are mirrors of real-life situations. Thus, authors choose the linguistic items that would best encode their intended meanings and messages. However, words mean more than they seem. The meaning of words is not static; rather, it is dynamic, as words constantly enter into relationships within a context. Literary texts can only be meaningful if all pragmatic cues are identified and interpreted. Drawing upon Teun van Dijk's theory of local pragmatic coherence, it is established that words enter into relations in a text and these relations account for sequential speech acts in the texts. Comprehension of the text is dependent on the interpretation of these relations. To show the relevance of pragmatic coherence in literary text analysis, ten conversations were selected from Americanah in order to give a clear idea of the pragmatic relations used. The conversations were analysed, identifying the speech act and epistemic relations inherent in them. A subtle analysis of the structure of the conversations was also carried out. It was discovered that justification is the most commonly used relation and that the meaning of the text is dependent on the interpretation of these instances' pragmatic coherence. The study concludes that to effectively teach literature in English, pragmatic coherence should be incorporated, as words mean more than they say.

Keywords: pragmatic coherence, epistemic coherence, speech act, Americanah

Procedia PDF Downloads 112
3290 Spatial Audio Player Using Musical Genre Classification

Authors: Jun-Yong Lee, Hyoung-Gook Kim

Abstract:

In this paper, we propose a smart music player that combines the musical genre classification and the spatial audio processing. The musical genre is classified based on content analysis of the musical segment detected from the audio stream. In parallel with the classification, the spatial audio quality is achieved by adding an artificial reverberation in a virtual acoustic space to the input mono sound. Thereafter, the spatial sound is boosted with the given frequency gains based on the musical genre when played back. Experiments measured the accuracy of detecting the musical segment from the audio stream and its musical genre classification. A listening test was performed based on the virtual acoustic space based spatial audio processing.

Keywords: automatic equalization, genre classification, music segment detection, spatial audio processing

Procedia PDF Downloads 400
3289 Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models

Authors: Danielle Shackley, Yetunde Folajimi

Abstract:

As more people turn to the internet seeking health-related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores to text, classifying it as positive, neutral, or negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing reliably labeled health article headlines that were supplemented with health information collected about COVID-19 from social media sources. We started with data preprocessing and tested various vectorization methods such as Count and TF-IDF vectorization. We implemented three Naive Bayes classifier models: Bernoulli, Multinomial, and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis, then reproduced those same models with the feature added. We evaluated using the precision and accuracy scores. The initial Bernoulli model performed with 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels performed with 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, we had a 1.9% improvement margin in the precision score with the Complement model. Future expansion of this work could include replicating the experiment and replacing the Naive Bayes classifiers with a deep learning neural network model.

Keywords: sentiment analysis, Naive Bayes model, natural language processing, topic analysis, fake health news classification model
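
To make the classification step concrete, here is a minimal multinomial Naive Bayes over word counts with Laplace smoothing, written with the standard library only. It is a sketch of the general technique, not the authors' pipeline: the whitespace tokenizer, the toy headlines, and the function names are hypothetical, and the sentiment feature studied in the paper is not included.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


def train_multinomial_nb(docs: List[str], labels: List[str], alpha: float = 1.0) -> Dict:
    """Collect per-class document counts and word counts (count vectorization)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}


def predict(model: Dict, text: str) -> str:
    """Return the class with the highest log posterior for the given text."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(model["word_counts"][label].values())
        for token in text.lower().split():
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["word_counts"][label][token]
            # Laplace-smoothed log likelihood of the token given the class.
            score += math.log((count + alpha) / (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    headlines = ["miracle cure now", "cure cancer instantly miracle",
                 "study shows vaccine safe", "clinical study results"]
    tags = ["fake", "fake", "real", "real"]
    nb = train_multinomial_nb(headlines, tags)
    print(predict(nb, "miracle cure"), predict(nb, "clinical study"))
```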

Procedia PDF Downloads 71
3288 Off-Line Text-Independent Arabic Writer Identification Using Optimum Codebooks

Authors: Ahmed Abdullah Ahmed

Abstract:

The task of recognizing the writer of a handwritten text has been an attractive research problem in the document analysis and recognition community, with applications in handwriting forensics, paleography, document examination and handwriting recognition. This research presents an automatic method for writer recognition from digitized images of unconstrained writings. Although previous studies have made a great effort to come up with various methods, their performance, especially in terms of accuracy, has fallen short, and room for improvement is still wide open. The proposed technique employs optimal codebook-based writer characterization, where each writing sample is represented by a set of features computed from two codebooks, beginning and ending. Unlike most of the classical codebook-based approaches, which segment the writing into graphemes, this study is based on fragmenting particular areas of writing, namely the beginning and ending strokes. The proposed method starts with contour detection to extract significant information from the handwriting; curve fragmentation is then employed to divide the Beginning and Ending zones of the handwriting into small fragments. The similar fragments of beginning strokes are grouped together to create the Beginning cluster, and similarly, the ending strokes are grouped to create the Ending cluster. These two clusters lead to the development of two codebooks (beginning and ending) by choosing the center of every group of similar fragments. Writings under study are then represented by computing the probability of occurrence of codebook patterns. The probability distribution is used to characterize each writer. Two writings are then compared by computing distances between their respective probability distributions. The evaluations were carried out on the ICFHR standard dataset of 206 writers using the Beginning and Ending codebooks separately. Finally, the Ending codebook achieved the highest identification rate of 98.23%, which is the best result so far on the ICFHR dataset.

Keywords: off-line text-independent writer identification, feature extraction, codebook, fragments
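
The representation and comparison steps described above can be sketched as follows: each fragment is assigned to its nearest codebook entry, the counts are normalized into an occurrence distribution, and two writings are compared by a distance between distributions. The abstract does not name the fragment features or the distance, so the 2-D feature vectors, the Euclidean assignment and the chi-square distance used here are assumptions, and all names are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_index(fragment: Vector, codebook: List[Vector]) -> int:
    """Index of the codebook entry closest to the fragment (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(fragment, codebook[k])),
    )


def occurrence_distribution(fragments: List[Vector], codebook: List[Vector]) -> List[float]:
    """Probability of occurrence of each codebook pattern in one writing."""
    if not fragments:
        raise ValueError("a writing needs at least one fragment")
    counts = [0] * len(codebook)
    for fragment in fragments:
        counts[nearest_index(fragment, codebook)] += 1
    return [c / len(fragments) for c in counts]


def chi_square_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Chi-square distance between two distributions (assumed metric)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


if __name__ == "__main__":
    ending_codebook = [[0.0, 0.0], [10.0, 10.0]]  # toy cluster centers
    writing_a = [[0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [10.0, 11.0]]
    writing_b = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0], [9.0, 10.0]]
    pa = occurrence_distribution(writing_a, ending_codebook)
    pb = occurrence_distribution(writing_b, ending_codebook)
    print(pa, pb, chi_square_distance(pa, pb))
```

Writer identification would then return the enrolled writer whose distribution has the smallest distance to the query writing.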

Procedia PDF Downloads 488
3287 Preparation of Polyethylene/Cashewnut Flour/ Gum Arabic Polymer Blends Through Melt-blending and Determination of Their Biodegradation by Composting Method for Possible Reduction of Polyethylene-based Wastes from the Environment

Authors: Abubakar Umar Birnin-yauri

Abstract:

Plastic wastes arising from polyethylene (PE)-based materials are increasingly becoming an environmental problem, because these PE waste materials decompose only over hundreds, or even thousands, of years, during which they cause serious environmental problems. In this research, polymer blends prepared from PE, cashewnut flour (CNF) and gum Arabic (GA) were studied in order to assay their biodegradation potential via the composting method. Different sample formulations were made, i.e., X1 = (70% PE, 25% CNF and 5% GA), X2 = (70% PE, 20% CNF and 10% GA), X3 = (70% PE, 15% CNF and 15% GA), X4 = (70% PE, 10% CNF and 20% GA) and X5 = (70% PE, 5% CNF and 25% GA). The results obtained showed that X1 lost 9.89% of its original weight after the first 20 days and 37.45% after 100 days, and X2 lost 12.67% after the first 20 days and 42.56% after 100 days. Sample X5 experienced the greatest weight loss in the two methods adopted, 52.9% and 57.89%. Instrumental analyses such as Fourier transform infrared spectroscopy, thermogravimetric analysis and scanning electron microscopy were performed on the polymer blends before and after biodegradation. The study revealed that the biodegradation of the polymer blends is influenced by the contents of both the CNF and GA added to the blends.

Keywords: polyethylene, cashewnut, gum Arabic, biodegradation, blend, environment

Procedia PDF Downloads 48
3286 Survey on Big Data Stream Classification by Decision Tree

Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi

Abstract:

Nowadays, the development of computer technology and its recent applications provide access to new types of data, which have not been considered by traditional data analysts. Two particularly interesting characteristics of such data sets are their huge size and streaming nature. Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey of the obstacles and requirements in classifying data streams using decision trees. The most important issue is to maintain a balance between accuracy and efficiency: the algorithm should provide good classification performance with a reasonable response time.

Keywords: big data, data streams, classification, decision tree

Procedia PDF Downloads 490
3285 The Pragmatics of the Evil Eye: Compliment Response Strategies in Egyptian Colloquial Arabic

Authors: HebatAllah Mohamed

Abstract:

The present study aims at identifying compliment response strategies used by Egyptian students when responding to a problematic and culture-specific type of compliment: those allegedly provoking the evil eye. Discourse Completion Tasks (DCTs) and interviews were used to collect the data. The participants were 21 female and 16 male Egyptian graduate and undergraduate students at the American University in Cairo. The results revealed a number of both common and different main and sub-categories of responses utilized by participants of both genders. Pedagogical implications are discussed.

Keywords: Arabic pragmatics, compliment responses, evil eye pragmatics, pragmatics in Egypt

Procedia PDF Downloads 462
3284 Biofilm Text Classifiers Developed Using Natural Language Processing and Unsupervised Learning Approach

Authors: Kanika Gupta, Ashok Kumar

Abstract:

Biofilms are dense, highly hydrated cell clusters that are irreversibly attached to a substratum, to an interface or to each other, and are embedded in a self-produced gelatinous matrix composed of extracellular polymeric substances. Research in the biofilm field has become very significant, as biofilm has shown high mechanical resilience and resistance to antibiotic treatment and constitutes a significant problem in both healthcare and other industries related to microorganisms. The volume of information, both stated and hidden, in the biofilm literature is growing exponentially; therefore, it is not possible for researchers and practitioners to extract and relate information from different written resources by hand. So, the current work proposes and discusses the use of text mining techniques for the extraction of information from biofilm literature corpora containing 34306 documents. It is very difficult and expensive to obtain annotated material for biomedical literature, as the literature is unstructured, i.e., free text. Therefore, we considered an unsupervised approach, where no annotated training data is necessary, and using this approach we developed a system that classifies the text on the basis of growth and development, drug effects, radiation effects, classification and physiology of biofilms. For this, a two-step structure was used, where the first step is to extract keywords from the biofilm literature using a metathesaurus and standard natural language processing tools like Rapid Miner_v5.3, and the second step is to discover relations between the genes extracted from the whole set of biofilm literature using pubmed.mineR_v1.0.11. We applied the unsupervised approach, which is the machine learning task of inferring a function to describe hidden structure from 'unlabeled' data, to the above-extracted datasets to develop classifiers using the WinPython-64 bit_v3.5.4.0Qt5 and R studio_v0.99.467 packages, which automatically classify the text using the mentioned sets. The developed classifiers were tested on a large data set of biofilm literature, which showed that the proposed unsupervised approach is promising as well as suited for semi-automatic labeling of the extracted relations. All of the information was stored in a relational database hosted locally on the server. The generated biofilm vocabulary and gene relations will be significant for researchers dealing with biofilm research, making their search easy and efficient, as the keywords and genes can be directly mapped to the documents used for database development.

Keywords: biofilms literature, classifiers development, text mining, unsupervised learning approach, unstructured data, relational database

Procedia PDF Downloads 144
3283 Broadening the Roles of Masjid: Reviving Prophetic Holistic Model in Fostering Islamic Education and Arabic Language in South-Western Nigeria

Authors: Ahmad Tijani Surajudeen, Muhammad Zahiri Awang Mat, Aliy Abdulwahid Adebisi

Abstract:

With the arrival of Islam in South-Western Nigeria in the late fifteenth and early sixteenth centuries, various masājid established in different parts of the area played vital roles in the betterment and unity of the Muslims. However, although the masājid in the South-Western part of Nigeria contributed immensely to the spiritual and educational enhancement of the Muslims, they have not fully captured the holistic educational roles of the unique model used by the Prophet (S.A.W). Therefore, the primary objective of this paper is to investigate and broaden the roles of the masjid towards its compartmentalized and holistic contributions among the Muslims in South-Western Nigeria. The paper identifies five holistic roles of the masjid, namely spiritual, intellectual, physical, social and emotional contributions, which are exemplified in the prophetic model of the masjid. The paper argues that the five factors must be unreservedly unified for the betterment of the Muslims and the enhancement of Islamic education and the Arabic language in South-Western Nigeria. However, the challenges of masjid management in South-Western Nigeria are the main hindrance to achieving the holistic roles of the masjid. It is therefore suggested that the management of the masjid should take the identified prophetic model into account in order to improve the affairs of Muslims and to promote the teaching and learning of Islamic education and the Arabic language among the Muslims in South-Western Nigeria.

Keywords: worship, Islamic education, Arabic language, prophetic holistic model

Procedia PDF Downloads 301
3282 A Custom Convolutional Neural Network with Hue, Saturation, Value Color for Malaria Classification

Authors: Ghazala Hcini, Imen Jdey, Hela Ltifi

Abstract:

Malaria disease should be considered and handled as a potential restorative catastrophe. One of the most challenging tasks in the field of microscopy image processing is due to differences in test design and vulnerability of cell classifications. In this article, we focused on applying deep learning to classify patients by identifying images of infected and uninfected cells. We performed multiple forms, including a classification approach using the Hue, Saturation, Value (HSV) color space. HSV is used because of its superior ability to represent image brightness; finally, for classification, a convolutional neural network (CNN) architecture is created. Clusters of focus were used to deliver the classification. The highlights got to be forbidden, and a few more noise types are included in the data. The suggested method has a precision of 99.79%, a recall value of 99.55%, and provides 99.96% accuracy.

Keywords: deep learning, convolutional neural network, image classification, color transformation, HSV color, malaria diagnosis, malaria cells images

Procedia PDF Downloads 65
3281 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This kind of bias, when present in text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. A system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is important. The goal of this work is to identify subjectively biased language in text at the sentence level. Machine learning can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as an upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as a data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences, as a benchmark dataset, on which we also compare our model to existing approaches. Experimental analysis indicates improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 101
3280 Reinforcement Learning for Classification of Low-Resolution Satellite Images

Authors: Khadija Bouzaachane, El Mahdi El Guarmah

Abstract:

The classification of low-resolution satellite images is a productive research field that attracts many researchers because of its importance in monitoring geographical areas. It can be used for several purposes, such as disaster management, military surveillance, and agricultural monitoring. The main objective of this work is to classify low-resolution satellite images efficiently and accurately by using novel deep learning and reinforcement learning techniques. The images include roads, residential areas, industrial areas, rivers, sea lakes, and vegetation. To achieve that goal, we carried out experiments on Sentinel-2 images, aiming for both high accuracy and efficient classification. Our proposed model achieved 91% accuracy on the testing dataset, along with good land cover classification. In terms of per-class precision, we obtained 93% for river, 92% for residential, 97% for residential, 96% for forest, 87% for annual crop, 84% for herbaceous vegetation, 85% for pasture, 78% for highway and 100% for sea lake.
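
For readers unfamiliar with the metric, per-class precision is the share of predictions for a class that are actually correct, TP / (TP + FP). The following is a minimal sketch of that calculation; the function name and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
def per_class_precision(y_true, y_pred):
    """Return {class: precision}, where precision = TP / (TP + FP)."""
    classes = sorted(set(y_true) | set(y_pred))
    precision = {}
    for c in classes:
        # Number of samples the model labeled as class c (TP + FP).
        predicted_c = sum(1 for p in y_pred if p == c)
        # Number of those predictions that were correct (TP).
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        # Classes that were never predicted get 0.0 by convention here.
        precision[c] = true_pos / predicted_c if predicted_c else 0.0
    return precision


# Toy labels for illustration only (hypothetical data).
y_true = ["river", "river", "forest", "forest", "highway", "highway"]
y_pred = ["river", "forest", "forest", "forest", "highway", "river"]
scores = per_class_precision(y_true, y_pred)
```

Overall accuracy and per-class precision answer different questions, which is why a model can report a single accuracy figure alongside a wide spread of class-level values.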

Keywords: classification, deep learning, reinforcement learning, satellite imagery

Procedia PDF Downloads 179
3279 Using Self Organizing Feature Maps for Classification in RGB Images

Authors: Hassan Masoumi, Ahad Salimi, Nazanin Barhemmat, Babak Gholami

Abstract:

Artificial neural networks have gained a lot of interest as empirical models because of their powerful representational capacity and their multi-input, multi-output mapping characteristics. In fact, most feed-forward networks with nonlinear nodal functions have been proved to be universal approximators. In this paper, we propose a new supervised method for color image classification based on self-organizing feature maps (SOFM). The algorithm is based on competitive learning. The method partitions the input space using self-organizing feature maps to introduce the concept of local neighborhoods. Our image classification system takes RGB images as input. Experiments with simulated data showed that the separability of classes increased with training time. In addition, the results show that the proposed algorithm is effective for color image classification.
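
The competitive-learning step behind a self-organizing map can be sketched as follows. This is the standard unsupervised update on a one-dimensional map, not the supervised variant proposed in the paper; the function name, map size, learning rate and neighborhood width are all illustrative assumptions.

```python
import math


def som_update(weights, x, lr, sigma):
    """One competitive-learning step on a 1-D map; returns the winner's index."""
    # Competition: the winner is the node whose weight vector is closest to x.
    dist2 = [sum((w - v) ** 2 for w, v in zip(node, x)) for node in weights]
    winner = dist2.index(min(dist2))
    # Cooperation: every node moves toward x, scaled by a Gaussian
    # neighborhood centered on the winner's position in the map.
    for i, node in enumerate(weights):
        h = math.exp(-((i - winner) ** 2) / (2.0 * sigma ** 2))
        weights[i] = [w + lr * h * (v - w) for w, v in zip(node, x)]
    return winner


# Hypothetical two-node map and one bright RGB pixel scaled to [0, 1].
weights = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
winner = som_update(weights, [0.9, 0.9, 0.9], lr=0.5, sigma=1.0)
```

In practice the learning rate and neighborhood width are usually decayed over training so that the map first orders itself globally and then fine-tunes locally.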

Keywords: classification, SOFM algorithm, neural network, neighborhood, RGB image

Procedia PDF Downloads 452
3278 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosis of male infertility by laboratory tests is expensive and sometimes intolerable for patients. Filling out a questionnaire and then applying a classification method can be the first step in the decision-making process, so that laboratory tests are used only in cases with a high probability of infertility. In this paper, we evaluated the performance of four classification methods, namely naive Bayesian, neural network, logistic regression and fuzzy c-means clustering used as a classifier, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, ROC curves are the most suitable method for the comparison. We also selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each method; generally, most of the methods performed better after applying the filter. We have shown that fuzzy c-means clustering used as a classifier performs well according to the ROC curves and that its performance is comparable to other classification methods such as logistic regression.
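
To illustrate how fuzzy c-means can act as a classifier, the sketch below computes the standard membership degrees of one point with respect to fixed cluster centers, u_i = 1 / sum_j (d_i / d_j)^(2/(m-1)). The function name, the fuzzifier m = 2 and the toy centers are assumptions for illustration; the paper's own features and settings are not given in the abstract.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of `point` in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to it
    # (shared equally if several centers coincide with the point).
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0
                for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


# Hypothetical 1-D example: centers at 0 and 4, query point at 1.
u = fcm_memberships((1.0,), [(0.0,), (4.0,)])
predicted_cluster = u.index(max(u))
```

Because the memberships are graded rather than hard labels, they can be thresholded at different levels, which is what makes an ROC-curve comparison with probabilistic classifiers possible.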

Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve

Procedia PDF Downloads 309
3277 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyzing and visualizing data streams in real time are important for enabling decision makers to make prompt decisions. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytics and data visualization. This situation has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process streaming data. Today, a range of tools that implement some of these functionalities is available. In this paper, we present an evaluation of the chronological evolution of technologies supporting real-time analytics and visualization of data streams. Based on research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified in chronological order based on the Text Visualization Browser. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges of each of the identified tools.

Keywords: information visualization, visual analytics, text mining, visual text analytics tools, big data visualization

Procedia PDF Downloads 379
3276 Automatic Classification of Periodic Heart Sounds Using Convolutional Neural Network

Authors: Jia Xin Low, Keng Wah Choo

Abstract:

This paper presents an automatic normal and abnormal heart sound classification model based on a deep learning algorithm. The MITHSDB heart sound datasets obtained from the 2016 PhysioNet/Computing in Cardiology Challenge database were used in this research, with the assumption that the electrocardiograms (ECG) were recorded simultaneously with the heart sounds (phonocardiogram, PCG). The PCG time series are segmented per heart beat, and each sub-segment is converted into a square intensity matrix and classified using convolutional neural network (CNN) models. This approach removes the need to provide classification features for the supervised machine learning algorithm. Instead, the features are determined automatically through training, from the time series provided. The results show that the prediction model is able to provide reasonable and comparable classification accuracy despite its simple implementation. This approach can be used for real-time classification of heart sounds in the Internet of Medical Things (IoMT), e.g. remote monitoring applications of the PCG signal.
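
The abstract does not say how a per-beat segment is turned into a square intensity matrix. One plausible reading, shown here purely as an assumption, is to truncate or zero-pad the segment to side × side samples, rescale to [0, 1] and reshape row by row; the function name and the toy segment are hypothetical.

```python
def segment_to_square(segment, side):
    """Reshape a 1-D segment into a side x side matrix of values in [0, 1]."""
    n = side * side
    # Truncate long segments and zero-pad short ones to exactly n samples.
    padded = list(segment[:n]) + [0.0] * max(0, n - len(segment))
    lo, hi = min(padded), max(padded)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat segment
    scaled = [(v - lo) / span for v in padded]
    # Row-major reshape: consecutive samples fill each row in turn.
    return [scaled[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical four-sample segment mapped to a 2 x 2 matrix.
matrix = segment_to_square([0.0, 1.0, 2.0, 3.0], side=2)
```

A matrix built this way can be fed to a CNN as a single-channel image, which is what lets the network learn its own features from the raw time series.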

Keywords: convolutional neural network, discrete wavelet transform, deep learning, heart sound classification

Procedia PDF Downloads 324
3275 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System

Authors: Tadesse Anberbir, Bankole Felix, Tomio Takara

Abstract:

In the development of a text-to-speech synthesizer, automatic derivation of correct pronunciation from the grapheme form of a text is a central problem. Deriving phonological features that are not shown in orthography is particularly challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in orthography. In this paper, we proposed and integrated a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminate and epenthetic vowel positions, and prepared a duration modeling method. The Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source-filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by a sentence listening test, and we achieved an average Mean Opinion Score (MOS) of 3.4 (68%), which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowels, we are able to synthesize good-quality speech. Our system is suitable for customization to other Ethiopian languages with limited resources.

Keywords: Amharic, gemination, speech synthesis, morphology, epenthesis

Procedia PDF Downloads 59
3274 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of the internet, which generates a tremendous volume of data every day, and because of the immense advances in technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determine to which group each data instance belongs within a given dataset. They are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or based on machine learning. Each type of technique has its own limits. Nowadays, data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed most common classification techniques, with an f-measure exceeding 97% on the Iris dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 443