Search results for: short text classification
5973 Assessment of Planet Image for Land Cover Mapping Using Soft and Hard Classifiers
Authors: Lamyaa Gamal El-Deen Taha, Ashraf Sharawi
Abstract:
Planet imagery is a new data source from Planet Labs. This research assesses Planet imagery for land cover mapping. Two pixel-based classifiers and one subpixel-based classifier were compared. First, the Planet image was rectified. Second, minimum distance, maximum likelihood and neural network classifications of the Planet image were compared. Third, the overall accuracy of classification and the kappa coefficient were calculated. Results indicate that neural network classification performs best, followed by the maximum likelihood classifier and then minimum distance classification for land cover mapping.
Keywords: planet image, land cover mapping, rectification, neural network classification, multilayer perceptron, soft classifiers, hard classifiers
Procedia PDF Downloads 188
5972 Satellite Image Classification Using Firefly Algorithm
Authors: Paramjit Kaur, Harish Kundra
Abstract:
In recent years, the swarm-intelligence-based firefly algorithm has become a major focus for researchers solving real-time optimization problems. Here, the firefly algorithm is applied to satellite image classification. For experimentation, the Alwar area is considered, with multiple land features such as vegetation, barren, hilly, residential and water surface. The Alwar dataset consists of seven-band satellite images. The firefly algorithm is based on the attraction of less bright fireflies towards brighter ones. To evaluate the proposed concept, accuracy assessment parameters are calculated using an error matrix. From the error matrix, the kappa coefficient, the overall accuracy and the feature-wise user’s accuracy and producer’s accuracy can be calculated. Overall results are compared with BBO, PSO, Hybrid FPAB/BBO, Hybrid ACO/SOFM and Hybrid ACO/BBO based on the kappa coefficient and overall accuracy.
Keywords: image classification, firefly algorithm, satellite image classification, terrain classification
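The accuracy measures named in this abstract can be sketched from an error (confusion) matrix as follows. This is a minimal illustration, not the authors' code: the function name `accuracy_report`, the row/column convention (rows are reference classes, columns are predicted classes) and the sample counts are assumptions made here, and every class is assumed to have at least one reference and one predicted sample.

```python
def accuracy_report(matrix):
    """Accuracy measures from an error matrix.

    Assumed convention: matrix[i][j] counts samples whose reference
    class is i and whose predicted class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_sums = [sum(matrix[i]) for i in range(n)]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    # Producer's accuracy: correct / reference total of the class.
    producers = [matrix[i][i] / row_sums[i] for i in range(n)]
    # User's accuracy: correct / predicted total of the class.
    users = [matrix[j][j] / col_sums[j] for j in range(n)]
    return overall, kappa, producers, users


# Hypothetical two-class error matrix (made-up counts).
overall, kappa, producers, users = accuracy_report([[45, 5], [10, 40]])
print(overall, kappa, producers, users)
```

Kappa corrects the overall accuracy for chance agreement, which is why both figures are usually reported together.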
Procedia PDF Downloads 401
5971 Sentiment Classification Using Enhanced Contextual Valence Shifters
Authors: Vo Ngoc Phu, Phan Thi Tuoi
Abstract:
We have explored different methods of improving the accuracy of sentiment classification. The sentiment orientation of a document can be positive (+), negative (-), or neutral (0). We combine five dictionaries from [2, 3, 4, 5, 6] into a new one with 21137 entries. The new dictionary has many verbs, adverbs, phrases and idioms that are not in the previous five. The paper shows that our proposed method, based on the combination of the Term-Counting method and the Enhanced Contextual Valence Shifters method, has improved the accuracy of sentiment classification. The combined method has an accuracy of 68.984% on the testing dataset and 69.224% on the training dataset. All of these methods are implemented to classify the reviews based on our new dictionary and the Internet Movie data set.
Keywords: sentiment classification, sentiment orientation, valence shifters, contextual valence shifters, term counting
Procedia PDF Downloads 505
5970 Short Arc Technique for Baselines Determinations
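The general idea of combining term counting with valence shifters can be sketched as below. This is only an illustration under stated assumptions: the tiny `LEXICON`, `NEGATORS` and `INTENSIFIERS` tables are hypothetical stand-ins for the 21137-entry dictionary, and the rule that a shifter applies to the next sentiment word is a simplification chosen here, not the paper's enhanced method.

```python
# Hypothetical miniature dictionary: word -> polarity.
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}
# Hypothetical valence shifters.
NEGATORS = {"not", "never"}
INTENSIFIERS = {"very": 2.0, "slightly": 0.5}


def sentiment(text):
    """Return '+', '-' or '0' by counting shifted term polarities."""
    score = 0.0
    flip, scale = 1, 1.0  # pending shifters for the next sentiment word
    for token in text.lower().split():
        word = token.strip(".,!?")
        if word in NEGATORS:
            flip = -flip
        elif word in INTENSIFIERS:
            scale *= INTENSIFIERS[word]
        elif word in LEXICON:
            score += flip * scale * LEXICON[word]
            flip, scale = 1, 1.0  # shifters are consumed
    if score > 0:
        return "+"
    if score < 0:
        return "-"
    return "0"


print(sentiment("The plot was not good"))
print(sentiment("A very great film"))
```

Plain term counting would label "not good" as positive; the negator flips the polarity before it is added to the score.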
Authors: Gamal F.Attia
Abstract:
The baselines are the distances and lengths of the chords between projections of the positions of the laser stations on the reference ellipsoid. For satellite geodesy, it is very important to determine the optimal length of the orbital arc along which laser measurements are to be carried out. It is clear that for the dynamical methods long arcs (one month or more) are to be used. With long arcs, however, more modeling errors of different physical forces, such as the earth's gravitational field, air drag, solar radiation pressure and others, may influence the accuracy of the estimated satellite positions; at the same time, the measurement errors can be almost completely excluded and high stability in the determination of the relative coordinate system can be achieved. It is possible to diminish the influence of the modeling errors by using short arcs of the satellite orbit (several revolutions or days), but the station coordinates estimated from different arcs can differ from each other by a larger quantity than statistical zero. Under the semidynamical ‘short arc’ method, one or several passes of the satellite in one of simultaneous visibility from both ends of the chord is known and the estimated parameter in this case is the length of the chord. The comparison of the same baselines calculated with the long and short arc methods shows good agreement and even speaks in favor of the latter. In this paper, the Short Arc technique is explained and 3 baselines are determined using the ‘short arc’ method.
Keywords: baselines, short arc, dynamical, gravitational field
Procedia PDF Downloads 465
5969 A Method for Clinical Concept Extraction from Medical Text
Authors: Moshe Wasserblat, Jonathan Mamou, Oren Pereg
Abstract:
Natural Language Processing (NLP) has made a major leap in the last few years, in practical integration into medical solutions; for example, extracting clinical concepts from medical texts such as medical condition, medication, treatment, and symptoms. However, training and deploying those models in real environments still demands a large amount of annotated data and NLP/Machine Learning (ML) expertise, which makes this process costly and time-consuming. We present a practical and efficient method for clinical concept extraction that does not require costly labeled data nor ML expertise. The method includes three steps: Step 1- the user injects a large in-domain text corpus (e.g., PubMed). Then, the system builds a contextual model containing vector representations of concepts in the corpus, in an unsupervised manner (e.g., Phrase2Vec). Step 2- the user provides a seed set of terms representing a specific medical concept (e.g., for the concept of the symptoms, the user may provide: ‘dry mouth,’ ‘itchy skin,’ and ‘blurred vision’). Then, the system matches the seed set against the contextual model and extracts the most semantically similar terms (e.g., additional symptoms). The result is a complete set of terms related to the medical concept. Step 3 –in production, there is a need to extract medical concepts from the unseen medical text. The system extracts key-phrases from the new text, then matches them against the complete set of terms from step 2, and the most semantically similar will be annotated with the same medical concept category. As an example, the seed symptom concepts would result in the following annotation: “The patient complaints on fatigue [symptom], dry skin [symptom], and Weight loss [symptom], which can be an early sign for Diabetes.” Our evaluations show promising results for extracting concepts from medical corpora. 
The method allows medical analysts to easily and efficiently build taxonomies (in step 2) representing their domain-specific concepts, and automatically annotate a large number of texts (in step 3) for classification/summarization of medical reports.
Keywords: clinical concepts, concept expansion, medical records annotation, medical records summarization
Procedia PDF Downloads 135
5968 Improved Processing Speed for Text Watermarking Algorithm in Color Images
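Step 2 of this method (matching a seed set against a vector model and extracting the most similar terms) might look roughly like the sketch below. Everything concrete here is an assumption for illustration: the three-dimensional `TOY_VECTORS` are hand-made rather than learned with Phrase2Vec, and ranking candidates by cosine similarity to the seed centroid is one plausible matching rule, not necessarily the one the authors use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_seed(seed_terms, vectors, top_k=2):
    """Return the top_k non-seed terms closest to the seed centroid."""
    dims = len(next(iter(vectors.values())))
    centroid = [
        sum(vectors[t][d] for t in seed_terms) / len(seed_terms)
        for d in range(dims)
    ]
    scored = [
        (cosine(vec, centroid), term)
        for term, vec in vectors.items()
        if term not in seed_terms
    ]
    scored.sort(reverse=True)
    return [term for _, term in scored[:top_k]]


# Hand-made toy embeddings (hypothetical values, not real Phrase2Vec output).
TOY_VECTORS = {
    "dry mouth": [0.9, 0.1, 0.0],
    "itchy skin": [0.8, 0.2, 0.1],
    "blurred vision": [0.85, 0.15, 0.05],
    "fatigue": [0.8, 0.1, 0.1],
    "weight loss": [0.7, 0.3, 0.1],
    "metformin": [0.1, 0.9, 0.2],
    "insulin": [0.0, 0.95, 0.1],
}

seed = ["dry mouth", "itchy skin", "blurred vision"]
print(expand_seed(seed, TOY_VECTORS))
```

With these toy vectors the symptom-like terms rank above the medication-like ones, which mirrors the expansion behaviour the abstract describes.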
Authors: Hamza A. Al-Sewadi, Akram N. A. Aldakari
Abstract:
Copyright protection and ownership proof of digital multimedia are achieved nowadays by digital watermarking techniques. A text watermarking algorithm for protecting the property rights and ownership judgment of color images is proposed in this paper. Embedding is achieved by inserting text elements randomly into the color image as noise. The YIQ image processing model is found to be faster than other image processing methods, and hence it is adopted for the embedding process. An option to encrypt the text watermark before embedding is also suggested (in case it is required by some applications), where the text can be encrypted using any enciphering technique, adding more difficulty for hackers. Experiments resulted in an embedding speed improvement of more than double the speed of the other considered systems (such as the least significant bit method and separate color code methods), and a fairly acceptable level of peak signal to noise ratio (PSNR) with low mean square error values for watermarking purposes.
Keywords: steganography, watermarking, time complexity measurements, private keys
Procedia PDF Downloads 144
5967 Activation of Google Classroom Features to Engage Introvert Students in Comprehensible Output
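The two quality measures mentioned here, mean square error and PSNR, and the RGB-to-YIQ conversion can be sketched with the Python standard library. The helper names `to_yiq`, `mse` and `psnr`, the 8-bit peak value of 255 and the sample values are assumptions for illustration; the embedding algorithm itself is not reproduced.

```python
import colorsys
import math


def to_yiq(pixel):
    """Convert one 8-bit (R, G, B) pixel to YIQ via colorsys."""
    r, g, b = (channel / 255 for channel in pixel)
    return colorsys.rgb_to_yiq(r, g, b)


def mse(original, marked):
    """Mean square error between two equally sized sample sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)


def psnr(original, marked, peak=255):
    """Peak signal-to-noise ratio in dB (infinite for identical inputs)."""
    err = mse(original, marked)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


# Hypothetical channel samples before and after embedding.
before = [100, 100, 100, 100]
after = [100, 101, 100, 99]
print(to_yiq((255, 128, 0)))
print(mse(before, after), psnr(before, after))
```

A higher PSNR (equivalently, a lower MSE) means the watermark disturbs the cover image less.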
Authors: Raghad Dwaik
Abstract:
It is well known in language acquisition literature that a mere understanding of a reading text is not enough to help students build proficiency in comprehension. Students should rather follow understanding by attempting to express what has been understood by pushing their competence to the limit. Learners' attempt to push their competence was given the term "comprehensible output" by Swain (1985). Teachers in large classes, however, find it sometimes difficult to give all students a chance to communicate their views or to share their ideas during the short class time. In most cases, students who are outgoing dominate class discussion and get more opportunities for practice which leads to ignoring the shy students totally while helping the good ones become better. This paper presents the idea of using Google Classroom features of posting and commenting to allow students who hesitate to participate in class discussions about a reading text to write their views on the wall of a Google Classroom and share them later after they have received feedback and comments from classmates. Such attempts lead to developing their proficiency through additional practice in comprehensible output and to enhancing their confidence in themselves and their views. It was found that virtual classroom interaction would help students maintain vocabulary, use more complex structures and focus on meaning besides form.
Keywords: learning groups, reading TESOL, Google Classroom, comprehensible output
Procedia PDF Downloads 78
5966 Evaluation of Nurse Immunisation Short Course Transitioning to Fully Online
Authors: Joanne Joyce-McCoach
Abstract:
Short courses are an integral part of the higher education sector, providing a pathway into tertiary qualifications. Recently, the Australian government has implemented a range of initiatives to support the development of short courses and micro-credentials designed to upskill the labor market and meet the needs of the healthcare workforce. While short courses have been an ongoing component of Australian nursing continuing professional development, there is an immediate need for more education opportunities as a response to the workforce shortages. However, despite the support for short courses, there are identified challenges for learners undertaking these courses online. As a result of restrictions to face-to-face classes and limited access to health services caused by the pandemic, education providers have had to transition to an online delivery requiring the redesign of skills acquisition. This paper will outline the transition of an immunisation short course to a fully online format, including the redesign of classes, content and assessment. Concurrently the enrolments for the immunisation short course substantially increased in direct response to the demand for nurse immunisers. In addition to providing a description of the curriculum changes implemented, an analysis of learners’ feedback on their experience of the new format will be discussed. Furthermore, it will explore the principles identified in the transition process for improving the short course design and learning activities. Finally, it will propose recommendations to integrate into the delivery of online short courses and to meet the learners' needs.
Keywords: nurse, immunisation, short course, micro-credential, continuing professional development, online design
Procedia PDF Downloads 70
5965 Music Genre Classification Based on Non-Negative Matrix Factorization Features
Authors: Soyon Kim, Edward Kim
Abstract:
In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers is being provided as statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured based on the timbre features such as mel-frequency cepstral coefficient (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) the statistical features such as mean, variance, minimum, and maximum of the timbre features and (2) the modulation spectrum features such as spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. Not only these conventional basic long-term feature vectors, but also NMF based feature vectors are proposed to be used together for genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, an entire full band spectrum was used. However, for NMF-BFV, only low band spectrum was used since high frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification. 
In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as a classifier. The GTZAN multi-genre music database was used for training and testing. It is composed of 10 genres and 100 songs for each genre. To increase the reliability of the experiments, 10-fold cross validation was used. For a given input song, an extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for 10 genres. An NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features such as statistical features and modulation spectrum features, the NMF features provided increased accuracy with a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, but the basic features with NMF-LSM and NMF-BFV provided 85.1% and 84.2% accuracy, respectively. The basic features required a dimensionality of 460, while NMF-LSM and NMF-BFV each required a dimensionality of 10. Combining the basic features, NMF-LSM and NMF-BFV together with the SVM with a radial basis function (RBF) kernel produced a significantly higher classification accuracy of 88.3% with a feature dimensionality of 480.
Keywords: mel-frequency cepstral coefficient (MFCC), music genre classification, non-negative matrix factorization (NMF), support vector machine (SVM)
Procedia PDF Downloads 303
5964 Detection and Classification of Mammogram Images Using Principle Component Analysis and Lazy Classifiers
Authors: Rajkumar Kolangarakandy
Abstract:
Feature extraction and selection is the primary part of any mammogram classification algorithm. The choice of features, attributes or measurements has an important influence on any classification system. Discrete Wavelet Transformation (DWT) coefficients are among the prominent features for representing images in the frequency domain. The features obtained after decomposing the mammogram images using wavelet transformations have a high dimension. Even though the features are high in dimension, they are highly correlated and redundant in nature. Dimensionality reduction techniques play an important role in selecting the optimum number of features from high-dimensional data that are highly correlated. PCA is a mathematical tool that reduces the dimensionality of the data while retaining most of the variation in the dataset. In this paper, a multilevel classification of mammogram images using reduced discrete wavelet transformation coefficients and lazy classifiers is proposed. The classification is accomplished in two different levels. In the first level, mammogram ROIs extracted from the dataset are classified as normal or abnormal. In the second level, all the abnormal mammogram ROIs are classified as benign or malignant. A further classification is also accomplished based on the variation in structure and intensity distribution of the images in the dataset. The lazy classifiers Kstar, IBL and LWL are used for classification. The classification results obtained with the reduced feature set are highly promising, and the results are also compared with the performance obtained without dimension reduction.
Keywords: PCA, wavelet transformation, lazy classifiers, Kstar, IBL, LWL
Procedia PDF Downloads 335
5963 Characterization, Classification and Fertility Capability Classification of Three Rice Zones of Ebonyi State, Southeastern Nigeria
Authors: Sunday Nathaniel Obasi, Chiamak Chinasa Obasi
Abstract:
Soil characterization and classification provide the basic information necessary to create a functional evaluation and soil classification schemes. Fertility capability classification (FCC) on the other hand is a technical system that groups the soils according to kinds of problems they present for management of soil physical and chemical properties. This research was carried out in Ebonyi state, southeastern Nigeria, which is an agrarian state and a leading rice producing part of southeastern Nigeria. In order to maximize the soil and enhance the productivity of rice in Ebonyi soils, soil classification, and fertility classification information need to be supplied. The state was grouped into three locations according to their agricultural zones namely; Ebonyi north, Ebonyi central and Ebonyi south representing Abakaliki, Ikwo and Ivo locations respectively. Major rice growing areas of the soils were located and two profile pits were sunk in each of the studied zones from which soils were characterized, classified and fertility capability classification (FCC) developed. Soil classification was done using United State Department of Agriculture (USDA) Soil Taxonomy and correlated with World Reference Base for soil resources. Results obtained classified Abakaliki 1 and Abakaliki 2 as Typic Fluvaquents (Ochric Fluvisols). Ikwo 1 was classified as Vertic Eutrudepts (Eutric Vertisols) while Ikwo 2 was classified as Typic Eutrudepts (Eutric Cambisols). Ivo 1 and Ivo 2 were both classified as Aquic Eutrudepts (Gleyic Leptosols). Fertility capability classification (FCC) revealed that all studied soils had mostly loamy topsoils and subsoils except Ikwo 1 with clayey topsoil. Limitations encountered in the studied soils include; dryness (d), low ECEC (e), low nutrient capital reserve (k) and water logging/ anaerobic condition (gley). 
Thus, FCC classifications were Ldek for Abakaliki 1 and 2, Ckv for Ikwo 1, LCk for Ikwo 2 while Ivo 1 and 2 were Legk and Lgk respectively.
Keywords: soil classification, soil fertility, limitations, modifiers, Southeastern Nigeria
Procedia PDF Downloads 130
5962 Land Cover Classification Using Sentinel-2 Image Data and Random Forest Algorithm
Authors: Thanh Noi Phan, Martin Kappas, Jan Degener
Abstract:
The recently launched Sentinel-2 (S2) satellite (June 2015) brings great potential and opportunities for land use/cover mapping applications, due to its fine spatial resolution multispectral imagery as well as its high temporal resolution. So far, only a handful of studies have used real S2 data for land cover classification. In northern Vietnam in particular, to the best of our knowledge, there are no studies using S2 data for land cover mapping. The aim of this study is to provide preliminary results of land cover classification using Sentinel-2 data with an increasingly popular state-of-the-art classifier, Random Forest. A case study area with heterogeneous land use/cover in the east of Hanoi Capital, Vietnam, was chosen for this study. All 10 spectral bands of 10 and 20 m pixel size of the S2 images were used; the 10 m bands were resampled to 20 m. Among several classification algorithms, the supervised Random Forest (RF) classifier was applied because it has been reported as one of the most accurate methods of satellite image classification. The results showed that the red-edge and shortwave infrared (SWIR) bands play an important role in the land cover classification results. A very high overall classification accuracy of above 90% was achieved.
Keywords: classify algorithm, classification, land cover, random forest, sentinel 2, Vietnam
Procedia PDF Downloads 389
5961 Understanding Mudrocks and Their Shear Strength Deterioration Associated with Inundation
Authors: Haslinda Nahazanan, Afshin Asadi, Zainuddin Md. Yusoff, Nik Nor Syahariati Nik Daud
Abstract:
Mudrocks are considered a problematic material due to their unexpected behaviour, specifically when they come into contact with water or are exposed to the atmosphere. Many instability problems of cut slopes were found to lie on highly slaking mudrocks. This has become one of the major concerns for geotechnical engineers, as mudrocks make up as much as 50% of sedimentary rocks in the geologic record. Mudrocks display properties between those of soils and rocks, which can be very hard to understand. Therefore, this paper aims to review the definition, mineralogy, geochemistry, classification and engineering properties of mudrocks. As water is one of the major factors that rapidly change the behaviour of mudrocks, a review of the shear strength of mudrocks in Derbyshire has been made using a fully automated hydraulic stress path testing system under three states: dry, short-term inundated and long-term inundated. It can be seen that the strength of the mudrocks deteriorated as their condition changed from dry to short-term inundated and finally to long-term inundated.
Keywords: mudrocks, sedimentary rocks, inundation, shear strength
Procedia PDF Downloads 236
5960 Classification of Cochannel Signals Using Cyclostationary Signal Processing and Deep Learning
Authors: Bryan Crompton, Daniel Giger, Tanay Mehta, Apurva Mody
Abstract:
The task of classifying radio frequency (RF) signals has seen recent success in employing deep neural network models. In this work, we present a combined signal processing and machine learning approach to signal classification for cochannel anomalous signals. The power spectral density and cyclostationary signal processing features of a captured signal are computed and fed into a neural net to produce a classification decision. Our combined signal preprocessing and machine learning approach allows for simpler neural networks with fast training times and small computational resource requirements for inference, at the cost of longer preprocessing time.
Keywords: signal processing, machine learning, cyclostationary signal processing, signal classification
Procedia PDF Downloads 109
5959 Using Data Mining Technique for Scholarship Disbursement
Authors: J. K. Alhassan, S. A. Lawal
Abstract:
This work is on decision tree-based classification for the disbursement of scholarships. A tree-based data mining classification technique is used in order to determine the generic rule to be used to disburse the scholarship. Based on the rules defined from the tree, the system is able to determine the class (status) to which an applicant belongs: Granted or Not Granted. Applicants that fall into the Granted class have successfully acquired the scholarship, while those in the Not Granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from the tree-based classification was also developed. Tree-based classification is adopted because of its efficiency, effectiveness, and easy-to-comprehend features. The system was tested with data from the National Information Technology Development Agency (NITDA) Abuja, a parastatal of the Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found to work according to the specification. It is therefore recommended for all scholarship disbursement organizations.
Keywords: classification, data mining, decision tree, scholarship
Procedia PDF Downloads 378
5958 University Short Courses Web Application Using ASP.Net
Authors: Ahmed Hariri
Abstract:
E-learning has become a necessity in advanced education. It makes communication between students and teachers easier and speeds up the process with less time and effort. Given the enormous development of distance education, institutions must keep up by providing a website that allows students and teachers to take full advantage of advanced education. In this regard, we developed the University Short Courses web application, which is specially designed for the Faculty of Computing and Information Technology, Rabigh, Kingdom of Saudi Arabia. After an elaborate review of the current state-of-the-art methods of teaching and learning, we found that instructors deliver extra short courses and workshops to enhance students' knowledge. Moreover, this process is completely manual. The prevailing methods of teaching and learning consume a lot of time; in this context, the University Short Courses web application will help to make the process easy and user-friendly. The site allows students to view and register online for short courses conducted by instructors, and to see course start dates, finish dates and locations. It also allows the instructor to post material for his courses on the site and see the students enrolled in the study material. Finally, students can print the certificate after finishing the course online. ASP.NET, JavaScript and a SQL Server database will be used to develop the University Short Courses web application.
Keywords: e-learning, short courses, ASP.NET, SQL SERVER
Procedia PDF Downloads 135
5957 Synthetic Aperture Radar Remote Sensing Classification Using the Bag of Visual Words Model to Land Cover Studies
Authors: Reza Mohammadi, Mahmod R. Sahebi, Mehrnoosh Omati, Milad Vahidi
Abstract:
Classification of high resolution polarimetric Synthetic Aperture Radar (PolSAR) images plays an important role in land cover and land use management. Recently, classification algorithms based on the Bag of Visual Words (BOVW) model have attracted significant interest among scholars and researchers in and out of the field of remote sensing. In this paper, the BOVW model with pixel-based low-level features has been implemented to classify a subset of a San Francisco Bay PolSAR image acquired by RADARSAT-2 in C-band. We have used a segment-based decision-making strategy and compared the result with that of a traditional Support Vector Machine (SVM) classifier. The 90.95% overall classification accuracy shows that the proposed algorithm is comparable with the state-of-the-art methods. In addition to increasing the classification accuracy, the proposed method decreased the undesirable speckle effect of SAR images.
Keywords: Bag of Visual Words (BOVW), classification, feature extraction, land cover management, Polarimetric Synthetic Aperture Radar (PolSAR)
Procedia PDF Downloads 213
5956 Evaluating 8D Reports Using Text-Mining
Authors: Benjamin Kuester, Bjoern Eilert, Malte Stonis, Ludger Overmeyer
Abstract:
Increasing quality requirements make reliable and effective quality management indispensable. This includes the complaint handling in which the 8D method is widely used. The 8D report as a written documentation of the 8D method is one of the key quality documents as it internally secures the quality standards and acts as a communication medium to the customer. In practice, however, the 8D report is mostly faulty and of poor quality. There is no quality control of 8D reports today. This paper describes the use of natural language processing for the automated evaluation of 8D reports. Based on semantic analysis and text-mining algorithms the presented system is able to uncover content and formal quality deficiencies and thus increases the quality of the complaint processing in the long term.
Keywords: 8D report, complaint management, evaluation system, text-mining
Procedia PDF Downloads 316
5955 Novel Inference Algorithm for Gaussian Process Classification Model with Multiclass and Its Application to Human Action Classification
Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park
Abstract:
In this paper, we propose a novel inference algorithm for the multi-class Gaussian process classification model that can be used in the field of human behavior recognition. This algorithm can simultaneously derive both a posterior distribution of a latent function and estimators of the hyper-parameters in a multi-class Gaussian process classification model. Our algorithm is based on the Laplace approximation (LA) technique and the variational EM framework. It is performed in two steps, called the expectation and maximization steps. First, in the expectation step, using the Bayesian formula and the LA technique, we approximately derive the posterior distribution of the latent function indicating the possibility that each observation belongs to a certain class in the Gaussian process classification model. Second, in the maximization step, using the derived posterior distribution of the latent function, we compute the maximum likelihood estimator for the hyper-parameters of the covariance matrix necessary to define the prior distribution for the latent function. These two steps are repeated iteratively until a convergence condition is satisfied. Moreover, we apply the proposed algorithm to a human action classification problem using a public database, namely the KTH human action data set. Experimental results reveal that the proposed algorithm shows good performance on this data set.
Keywords: bayesian rule, gaussian process classification model with multiclass, gaussian process prior, human action classification, laplace approximation, variational EM algorithm
Procedia PDF Downloads 337
5954 Training AI to Be Empathetic and Determining the Psychotype of a Person During a Conversation with a Chatbot
Authors: Aliya Grig, Konstantin Sokolov, Igor Shatalin
Abstract:
The report describes the methodology for collecting data and building an ML model for determining the personality psychotype using profiling and personality traits methods based on several short messages of a user communicating on an arbitrary topic with a chitchat bot. In the course of the experiments, the minimum amount of text was revealed to confidently determine aspects of personality. Model accuracy - 85%. Users' language of communication is English. AI for a personalized communication with a user based on his mood, personality, and current emotional state. Features investigated during the research: personalized communication; providing empathy; adaptation to a user; predictive analytics. In the report, we describe the processes that captures both structured and unstructured data pertaining to a user in large quantities and diverse forms. This data is then effectively processed through ML tools to construct a knowledge graph and draw inferences regarding users of text messages in a comprehensive manner. Specifically, the system analyzes users' behavioral patterns and predicts future scenarios based on this analysis. As a result of the experiments, we provide for further research on training AI models to be empathetic, creating personalized communication for a userKeywords: AI, empathetic, chatbot, AI models
Procedia PDF Downloads 94
5953 Towards Learning Query Expansion
Authors: Ahlem Bouziri, Chiraz Latiri, Eric Gaussier
Abstract:
The steady growth in the size of textual document collections is a key progress driver for modern information retrieval techniques, whose effectiveness and efficiency are constantly challenged. Given a user query, the number of retrieved documents can be overwhelmingly large, hampering their efficient exploitation by the user. In addition, retaining only relevant documents in a query answer is of paramount importance for effectively meeting the user's needs. In this situation, the query expansion technique offers an interesting solution for obtaining a complete answer while preserving the quality of the retained documents. It mainly relies on an accurate choice of the terms added to an initial query. Interestingly enough, query expansion takes advantage of large text volumes by extracting statistical information about index term co-occurrences and using it to make user queries better fit the real information needs. In this respect, a promising track consists in applying data mining methods to extract dependencies between terms, namely a generic basis of association rules between terms. The key feature of our approach is a better trade-off between the size of the mining result and the conveyed knowledge. Thus, faced with the huge number of derived association rules, and in order to select the optimal combination of query terms from the generic basis, we propose to model the problem as a classification problem and solve it using a supervised learning algorithm such as SVM or k-means. For this purpose, we first generate a training set using a genetic algorithm based approach that explores the association rules space in order to find an optimal set of expansion terms, improving the MAP of the search results. The experiments were performed on the SDA 95 collection, a data collection for information retrieval. It was found that the results were better in terms of both MAP and NDCG.
The main observation is that hybridizing text mining techniques and query expansion in an intelligent way allows us to incorporate the good features of all of them. As this is a preliminary attempt in this direction, there is a large scope for enhancing the proposed method.
Keywords: supervised learning, classification, query expansion, association rules
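The two metrics reported in this abstract, MAP and NDCG, can be illustrated with a minimal sketch. The ranked relevance lists below are made-up toy data, not results from the SDA 95 experiments, and the function names are hypothetical. The sketch assumes binary relevance and that every relevant document appears somewhere in the ranked list.

```python
import math

def average_precision(relevance):
    """Average precision for one ranked list of binary relevance labels (1 = relevant)."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(runs):
    """MAP: mean of the per-query average precisions."""
    return sum(average_precision(r) for r in runs) / len(runs)

def ndcg(relevance):
    """Normalized discounted cumulative gain for one ranked list."""
    def dcg(rels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))
    ideal = dcg(sorted(relevance, reverse=True))  # best possible ordering
    return dcg(relevance) / ideal if ideal else 0.0

# Hypothetical rankings for two queries, before and after query expansion.
baseline = [[0, 1, 0, 1], [0, 0, 1, 0]]
expanded = [[1, 0, 1, 0], [1, 0, 0, 0]]
map_baseline = mean_average_precision(baseline)
map_expanded = mean_average_precision(expanded)
```

In this toy case the expanded queries rank relevant documents earlier, so `map_expanded` is higher than `map_baseline`.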
Procedia PDF Downloads 325
5952 Polarimetric Synthetic Aperture Radar Data Classification Using Support Vector Machine and Mahalanobis Distance
Authors: Najoua El Hajjaji El Idrissi, Necip Gokhan Kasapoglu
Abstract:
Polarimetric Synthetic Aperture Radar-based imaging is a powerful technique used for earth observation and classification of surfaces. Forest evolution has been one of the vital areas of attention for remote sensing experts. Information about forest areas can be obtained by remote sensing, whether by using active radars or optical instruments. However, due to several weather constraints, such as cloud cover, limited information can be recovered using optical data, and for that reason Polarimetric Synthetic Aperture Radar (PolSAR) is used as a powerful tool for forestry inventory. In this paper, we applied support vector machine (SVM) and Mahalanobis distance classifiers to the fully polarimetric AIRSAR P-, L-, and C-band data from the Nezer forest areas; the classification is based on the separation of different tree ages. The classification results were evaluated, and they show that the SVM performs better than the Mahalanobis distance, achieving approximately 75% accuracy. This result shows that SVM classification can be a useful method for evaluating fully polarimetric SAR data with sufficient accuracy.
Keywords: classification, synthetic aperture radar, SAR polarimetry, support vector machine, mahalanobis distance
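As an illustration of the Mahalanobis distance classifier mentioned in this abstract, the sketch below assigns a sample to the class whose mean is nearest in Mahalanobis distance. It is a generic minimum-distance classifier, not the authors' implementation. The two-feature samples and the "young"/"old" labels are hypothetical stand-ins for polarimetric features and tree-age classes.

```python
def mean_vector(samples):
    """Mean of a list of 2-D feature vectors."""
    n = len(samples)
    return (sum(s[0] for s in samples) / n, sum(s[1] for s in samples) / n)

def covariance_2x2(samples, mu):
    """Unbiased sample covariance matrix of 2-D feature vectors."""
    n = len(samples)
    sxx = sum((s[0] - mu[0]) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - mu[1]) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mu[0]) * (s[1] - mu[1]) for s in samples) / (n - 1)
    return ((sxx, sxy), (sxy, syy))

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular covariance matrix")
    return ((d / det, -b / det), (-c / det, a / det))

def mahalanobis_sq(x, mu, inv_cov):
    """Squared Mahalanobis distance (x - mu)^T S^-1 (x - mu)."""
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    return (dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy)
            + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy))

def fit(training):
    """Estimate the mean and inverse covariance of each class."""
    stats = {}
    for label, samples in training.items():
        mu = mean_vector(samples)
        stats[label] = (mu, inverse_2x2(covariance_2x2(samples, mu)))
    return stats

def classify(x, stats):
    """Pick the class with the smallest squared Mahalanobis distance."""
    return min(stats, key=lambda label: mahalanobis_sq(x, *stats[label]))

# Hypothetical training samples (two features per pixel) for two tree-age classes.
training = {
    "young": [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.8, 2.1)],
    "old": [(4.0, 5.0), (4.5, 5.5), (3.8, 4.6), (4.2, 5.2)],
}
stats = fit(training)
```

Unlike a plain Euclidean minimum-distance rule, this weighting accounts for the spread and correlation of each class, which is why each class keeps its own covariance matrix here.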
Procedia PDF Downloads 133
5951 Enframing the Smart City: Utilizing Heidegger's 'The Question Concerning Technology' as a Framework to Interpret Smart Urbanism
Authors: Will Brown
Abstract:
Martin Heidegger is considered to be one of the leading philosophical lights of the 20th century, with his lecture/essay 'The Question Concerning Technology' proving to be an invaluable text in the study of technology and the understanding of how technology influences the world it is set upon. However, this text has not yet been applied to the rapid rise and proliferation of ‘smart’ cities. This article is premised upon the application of the aforementioned text to the smart city in order to provide a fresh, if not critical, analysis and interpretation of this phenomenon. The first section provides a brief literature review of smart urbanism in order to lay the groundwork necessary to apply Heidegger’s work to the smart city, from which a framework is developed to interpret the infusion of digital sensing technologies into the urban milieu. This framework comprises four concepts put forward in Heidegger’s text: circumscribing, bringing-forth, challenging, and standing-reserve. A concluding chapter is based upon the notion of enframement, arguing that once the rubric of data collection is placed within the urban system, future systems will require the capability to harvest data, resulting in an ever-renewing smart city.
Keywords: air quality sensing, big data, Martin Heidegger, smart city
Procedia PDF Downloads 209
5950 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms
Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao
Abstract:
Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data is then pre-processed, fed into the proposed algorithm, and the obtained result is analyzed. Some of the algorithms used in satellite imagery classification are U-Net, Random Forest, DeepLabv3, CNN, ANN, and ResNet. In this project, we use the DeepLabv3 (atrous convolution) algorithm for land cover classification. The dataset used is the DeepGlobe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.
Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50
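The idea behind atrous (dilated) convolution can be shown with a minimal one-dimensional sketch. This is a toy illustration of the operation itself, not the DeepLabv3 implementation. The function names and the example signal are hypothetical. As in most deep learning libraries, the kernel is applied without flipping (cross-correlation).

```python
def atrous_conv1d(signal, kernel, rate=1):
    """1-D atrous convolution: kernel taps are spaced `rate` samples apart.

    Only positions where the whole dilated kernel fits are computed (no padding).
    """
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    return [
        sum(w * signal[i + k * rate] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]

def receptive_field(kernel_size, rate):
    """Number of input samples covered by one output value."""
    return (kernel_size - 1) * rate + 1

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)    # taps at offsets 0, 1, 2
dilated = atrous_conv1d(signal, kernel, rate=2)  # taps at offsets 0, 2, 4
```

With the same three weights, raising the rate from 1 to 2 widens the receptive field from 3 to 5 samples. This is how several atrous rates, in cascade or in parallel, capture context at several scales without adding parameters.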
Procedia PDF Downloads 140
5949 Classification of Opaque Exterior Walls of Buildings from a Sustainable Point of View
Authors: Michelle Sánchez de León Brajkovich, Nuria Martí Audi
Abstract:
The envelope is one of the most important elements when analyzing the operation of a building in terms of sustainability. With this in mind, this research focuses on setting up a classification system for the envelope's opaque systems, crossing the knowledge and parameters of construction systems with the sustainability requirements they may have, in order to better understand how these systems contribute to the sustainability of the building. To that end, this paper evaluates the importance of envelope design for building sustainability and analyzes the parameters that make construction systems behave differently in terms of sustainability. It also explains the classification process derived from this analysis, which results in a classification that covers all opaque vertical envelope construction systems.
Keywords: sustainable, exterior walls, envelope, facades, construction systems, energy efficiency
Procedia PDF Downloads 571
5948 Multi-Classification Deep Learning Model for Diagnosing Different Chest Diseases
Authors: Bandhan Dey, Muhsina Bintoon Yiasha, Gulam Sulaman Choudhury
Abstract:
Chest disease is one of the most problematic ailments in everyday life. There are many known chest diseases, and diagnosing them correctly plays a vital role in the process of treatment. Many methods have been developed explicitly for different chest diseases, but the most common approach for diagnosing them is through X-ray. In this paper, we propose a multi-classification deep learning model for diagnosing COVID-19, lung cancer, pneumonia, tuberculosis, and atelectasis from chest X-rays. We use transfer learning for better accuracy and a faster training phase. The performance of three architectures is considered: InceptionV3, VGG-16, and VGG-19. We evaluated these deep learning architectures using public digital chest X-ray datasets with six classes (i.e., COVID-19, lung cancer, pneumonia, tuberculosis, atelectasis, and normal). The experiments were conducted on the six-class classification task, and we found that VGG-16 outperforms the other models considered, with an accuracy of 95%.
Keywords: deep learning, image classification, X-ray images, Tensorflow, Keras, chest diseases, convolutional neural networks, multi-classification
Procedia PDF Downloads 93
5947 Role of Natural Language Processing in Information Retrieval; Challenges and Opportunities
Authors: Khaled M. Alhawiti
Abstract:
This paper aims to analyze the role of natural language processing (NLP). The paper discusses this role in the context of automated data retrieval, automated question answering, and text structuring. NLP techniques are gaining wider acceptance in real-life applications and industrial concerns. There are various complexities involved in processing natural language text so that it satisfies the needs of decision makers. This paper begins with a description of the qualities of NLP practices. It then focuses on the challenges in natural language processing and discusses the major techniques of NLP. The last section describes opportunities and challenges for future research.
Keywords: data retrieval, information retrieval, natural language processing, text structuring
Procedia PDF Downloads 341
5946 Spacial Poetic Text throughout Samih al-Qasim's Poetry
Authors: Saleem Abu Jaber, Khaled Igbaria
Abstract:
For readers, space/place is one of the most significant references for revealing deep meanings and indications in modern Arabic poetic texts. Generally, when poets evoke places and/or spaces, they do not mean to refer readers to detailed geographic or physical spaces, but to the symbolic significances and dimensions that those spaces have and through which poets encourage spacial awareness in their readers. As a result, there has recently been a great deal of interest in research addressing spacial poetic texts and dimensions in modern Arabic poetry in general and in Palestinian poetry in particular. Samih al-Qasim is one of the most prominent recent Palestinian revolutionary poets. Al-Qasim has published six series of poems that are well known in the Arab world. Although several researchers have studied al-Qasim's poetry, to our knowledge no one has yet studied the aspect of spacial poetic text in his poetry. Therefore, this paper seeks to fill this gap in the scholarship. This article aims not only to demonstrate the presence of spacial poetic text and dimensions throughout al-Qasim's poetry, but also to investigate the purpose for which the poet uses spacial poetic text. Our theory is that the poet consciously and significantly uses spacial poetic texts to magnify the Palestinian identity of Palestinian readers. Methodologically, we applied a descriptive analytic method, referencing al-Qasim's poetry and addressing spacial poetic texts practically rather than theoretically or statistically.
Keywords: spatial poetic text, Samih al-Qasim, space and identity, Palestinian poetry
Procedia PDF Downloads 314
5945 Performance Evaluation of Contemporary Classifiers for Automatic Detection of Epileptic EEG
Authors: K. E. Ch. Vidyasagar, M. Moghavvemi, T. S. S. T. Prabhat
Abstract:
Epilepsy is a global problem, and because seizures can elude even the most careful diagnosis, automatic detection from electroencephalogram (EEG) recordings would have a huge impact on diagnosis of the disorder. Among the multitude of methods for automatic epilepsy detection, one should identify the best classification method based on accuracy. This paper reasons out and rationalizes the best methods for classification. Accuracy depends on the classifier, and thus this paper discusses classifiers such as quadratic discriminant analysis (QDA), classification and regression tree (CART), support vector machine (SVM), naive Bayes classifier (NBC), linear discriminant analysis (LDA), K-nearest neighbor (KNN) and artificial neural networks (ANN). Results show that ANN is the most accurate of the above classifiers, with 97.7% accuracy, 97.25% specificity and 98.28% sensitivity. It is followed closely by SVM, with about 1% variation in results. These results should help researchers choose the best classifier for detection of epilepsy.
Keywords: classification, seizure, KNN, SVM, LDA, ANN, epilepsy
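The three figures quoted in this abstract (accuracy, specificity, sensitivity) all derive from the four counts of a binary confusion matrix. The sketch below shows the standard definitions. The counts are made-up toy numbers, not the EEG results, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from binary confusion-matrix counts.

    tp/fn: epileptic segments detected / missed
    tn/fp: normal segments correctly passed / falsely flagged
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,     # share of all segments classified correctly
        "sensitivity": tp / (tp + fn),     # true positive rate
        "specificity": tn / (tn + fp),     # true negative rate
    }

# Hypothetical counts for 100 epileptic and 100 normal segments.
metrics = diagnostic_metrics(tp=90, fn=10, tn=80, fp=20)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, since accuracy alone can hide a classifier that misses most seizures.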
Procedia PDF Downloads 524
5944 3D Receiver Operator Characteristic Histogram
Authors: Xiaoli Zhang, Xiongfei Li, Yuncong Feng
Abstract:
ROC curves, a widely used evaluation tool in the machine learning field, show the tradeoff between true positive rate and false positive rate. However, they are criticized for ignoring some vital information in the evaluation process, such as the amount of information about the target that each instance carries and the predicted score given by each classification model to each instance. Hence, in this paper, a new classification performance evaluation method is proposed by extending the Receiver Operator Characteristic (ROC) curves to 3D space, which is denoted as the 3D ROC Histogram. In the histogram, the
Keywords: classification, performance evaluation, receiver operating characteristic histogram, hardness prediction
Procedia PDF Downloads 315
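For context on the conventional two-dimensional ROC curve that the last abstract extends, the sketch below builds the curve by sweeping a threshold over predicted scores and computes the area under it. It does not implement the 3D ROC Histogram, whose construction is not given in the abstract. The scores, labels and function names are hypothetical, and the sketch assumes both classes are present and no two scores are tied.

```python
def roc_points(scores, labels):
    """ROC points (false positive rate, true positive rate), threshold swept high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores and true labels (1 = positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
area = auc(points)
```

Each point depends only on the ranking of the instances, which is the limitation the abstract raises: the curve discards the score values themselves and any per-instance information.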