Search results for: malware classification
873 Deep Learning based Image Classifiers for Detection of CSSVD in Cacao Plants
Authors: Atuhurra Jesse, N'guessan Yves-Roland Douha, Pabitra Lenka
Abstract:
The detection of diseases within plants has attracted a lot of attention from computer vision enthusiasts. Despite the progress made to detect diseases in many plants, there remains a research gap to train image classifiers to detect the cacao swollen shoot virus disease or CSSVD for short, pertinent to cacao plants. This gap has mainly been due to the unavailability of high quality labeled training data. Moreover, institutions have been hesitant to share their data related to CSSVD. To fill these gaps, image classifiers to detect CSSVD-infected cacao plants are presented in this study. The classifiers are based on VGG16, ResNet50 and Vision Transformer (ViT). The image classifiers are evaluated on a recently released and publicly accessible KaraAgroAI Cocoa dataset. The best performing image classifier, based on ResNet50, achieves 95.39\% precision, 93.75\% recall, 94.34\% F1-score and 94\% accuracy on only 20 epochs. There is a +9.75\% improvement in recall when compared to previous works. These results indicate that the image classifiers learn to identify cacao plants infected with CSSVD.Keywords: CSSVD, image classification, ResNet50, vision transformer, KaraAgroAI cocoa dataset
Procedia PDF Downloads 104872 Decision Tree Analysis of Risk Factors for Intravenous Infiltration among Hospitalized Children: A Retrospective Study
Authors: Soon-Mi Park, Ihn Sook Jeong
Abstract:
This retrospective study was aimed to identify risk factors of intravenous (IV) infiltration for hospitalized children. The participants were 1,174 children for test and 424 children for validation, who admitted to a general hospital, received peripheral intravenous injection therapy at least once and had complete records. Data were analyzed with frequency and percentage or mean and standard deviation were calculated, and decision tree analysis was used to screen for the most important risk factors for IV infiltration for hospitalized children. The decision tree analysis showed that the most important traditional risk factors for IV infiltration were the use of ampicillin/sulbactam, IV insertion site (lower extremities), and medical department (internal medicine) both in the test sample and validation sample. The correct classification was 92.2% in the test sample and 90.1% in the validation sample. More careful attention should be made to patients who are administered ampicillin/sulbactam, have IV site in lower extremities and have internal medical problems to prevent or detect infiltration occurrence.Keywords: decision tree analysis, intravenous infiltration, child, validation
Procedia PDF Downloads 177871 Framework for Detecting External Plagiarism from Monolingual Documents: Use of Shallow NLP and N-Gram Frequency Comparison
Authors: Saugata Bose, Ritambhra Korpal
Abstract:
The internet has increased the copy-paste scenarios amongst students as well as amongst researchers leading to different levels of plagiarized documents. For this reason, much of research is focused on for detecting plagiarism automatically. In this paper, an initiative is discussed where Natural Language Processing (NLP) techniques as well as supervised machine learning algorithms have been combined to detect plagiarized texts. Here, the major emphasis is on to construct a framework which detects external plagiarism from monolingual texts successfully. For successfully detecting the plagiarism, n-gram frequency comparison approach has been implemented to construct the model framework. The framework is based on 120 characteristics which have been extracted during pre-processing the documents using NLP approach. Afterwards, filter metrics has been applied to select most relevant characteristics and then supervised classification learning algorithm has been used to classify the documents in four levels of plagiarism. Confusion matrix was built to estimate the false positives and false negatives. Our plagiarism framework achieved a very high the accuracy score.Keywords: lexical matching, shallow NLP, supervised machine learning algorithm, word n-gram
Procedia PDF Downloads 359870 Mineralogy and Classification of Altered Host Rocks in the Zaghia Iron Oxide Deposit, East of Bafq, Central Iran
Authors: Azat Eslamizadeh, Neda Akbarian
Abstract:
The Zaghia Iron ore, in 15 km east of a town named Bafq, is located in Precambrian formation of Central Iran in form of a small local deposit. The Volcano-sedimentary rocks of Precambrian-Cambrian age, belonging to Rizu series have spread through the region. Substantial portion of the deposit is covered by alluvial deposits. The rocks hosting the Zaghia iron ore have a main combination of rhyolitic tuffs along with clastic sediments, carbonate include sandstone, limestone, dolomite, conglomerate and is somewhat metamorphed causing them to have appeared as slate and phyllite. Moreover, carbonate rocks are in existence as skarn compound of marble bearing tremolite with mineralization of magnetite-hematite. The basic igneous rocks have dramatically altered into green rocks consist of actinolite-tremolite and chlorite along with amount of iron (magnetite + Martite). The youngest units of ore-bearing rocks in the area are found as dolerite - diabase dikes. The dikes are cutting the rhyolitic tuffs and carbonate rocks.Keywords: Zaghia, iron ore deposite, mineralogy, petrography Bafq, Iran
Procedia PDF Downloads 526869 An Online Adaptive Thresholding Method to Classify Google Trends Data Anomalies for Investor Sentiment Analysis
Authors: Duygu Dere, Mert Ergeneci, Kaan Gokcesu
Abstract:
Google Trends data has gained increasing popularity in the applications of behavioral finance, decision science and risk management. Because of Google’s wide range of use, the Trends statistics provide significant information about the investor sentiment and intention, which can be used as decisive factors for corporate and risk management fields. However, an anomaly, a significant increase or decrease, in a certain query cannot be detected by the state of the art applications of computation due to the random baseline noise of the Trends data, which is modelled as an Additive white Gaussian noise (AWGN). Since through time, the baseline noise power shows a gradual change an adaptive thresholding method is required to track and learn the baseline noise for a correct classification. To this end, we introduce an online method to classify meaningful deviations in Google Trends data. Through extensive experiments, we demonstrate that our method can successfully classify various anomalies for plenty of different data.Keywords: adaptive data processing, behavioral finance , convex optimization, online learning, soft minimum thresholding
Procedia PDF Downloads 169868 Recovery of Dredged Sediments With Lime or Cement as Platform Materials for Use in a Roadway
Authors: Abriak Yassine, Zri Abdeljalil, Benzerzour Mahfoud., Hadj Sadok Rachid, Abriak Nor-Edine
Abstract:
In this study, firstly, the study of the capacity reuse of dredged sediments and treated sediments with lime or cement were used in an establishment layer and the base layer of the roadway. Also, the analysis of mineral changes caused by the addition of lime or cement on the way as described in the mechanical results of stabilised sediments. After determining the quantity of lime and cement required to stabilise the sediment, the compaction characteristics were studied using the modified Proctor method. Then the evolution of the three parameters, that is, ideal water content and maximum dry density had been determined. Mechanical exhibitions can be assessed across the resistance to compression, flexibility modulus and the resistance under traction. The resistance of the formulation treated with cement addition (ROLAC®645) increase with the quantity of ROLAC®645. Traction resistances and the elastic modulus were utilized to assess the potential of the formulation as road construction materials utilizing classification diagram. The results show the various formulations with ROLAC® 645may be employed in subgrades and foundation layers for roads.Keywords: cement, dredged, sediment, foundation layer, resistance
Procedia PDF Downloads 101867 Decomposition of Funds Transfer Pricing Components in Islamic Bank: The Exposure Effect of Shariah Non-Compliant Event Rectification Process
Authors: Azrul Azlan Iskandar Mirza
Abstract:
The purpose of Funds Transfer Pricing (FTP) for Islamic Bank is to promote prudent liquidity risk-taking behavior of business units. The acquirer of stable deposits will be rewarded whilst a business unit that generates long-term assets will be charged for added liquidity funding risks. In the end, it promotes risk-adjusted pricing by incorporating profit rate risk and liquidity risk component in the product pricing. However, in the event of Shariah non-compliant (SNCE), FTP components will be examined in the rectification plan especially when Islamic banks need to purify the non-compliance income. The finding shows that the determination between actual and provision cost will defer the decision among Shariah committee in Islamic banks. This paper will review each of FTP components to ensure the classification of actual and provision costs reflect the decision on rectification process on SNCE. This will benefit future decision and its consistency of Islamic banks.Keywords: fund transfer pricing, Islamic banking, Islamic finance, shariah non-compliant event
Procedia PDF Downloads 195866 Second-Order Complex Systems: Case Studies of Autonomy and Free Will
Authors: Eric Sanchis
Abstract:
Although there does not exist a definitive consensus on a precise definition of a complex system, it is generally considered that a system is complex by nature. The presented work illustrates a different point of view: a system becomes complex only with regard to the question posed to it, i.e., with regard to the problem which has to be solved. A complex system is a couple (question, object). Because the number of questions posed to a given object can be potentially substantial, complexity does not present a uniform face. Two types of complex systems are clearly identified: first-order complex systems and second-order complex systems. First-order complex systems physically exist. They are well-known because they have been studied by the scientific community for a long time. In second-order complex systems, complexity results from the system composition and its articulation that are partially unknown. For some of these systems, there is no evidence of their existence. Vagueness is the keyword characterizing this kind of systems. Autonomy and free will, two mental productions of the human cognitive system, can be identified as second-order complex systems. A classification based on the properties structure makes it possible to discriminate complex properties from the others and to model this kind of second order complex systems. The final outcome is an implementable synthetic property that distinguishes the solid aspects of the actual property from those that are uncertain.Keywords: autonomy, free will, synthetic property, vaporous complex systems
Procedia PDF Downloads 205865 Facies, Diagenetic Analysis and Sequence Stratigraphy of Habib Rahi Formation Dwelling in the Vicinity of Jacobabad Khairpur High, Southern Indus Basin, Pakistan
Authors: Muhammad Haris, Syed Kamran Ali, Mubeen Islam, Tariq Mehmood, Faisal Shah
Abstract:
Jacobabad Khairpur High, part of a Sukkur rift zone, is the separating boundary between Central and Southern Indus Basin, formed as a result of Post-Jurassic uplift after the deposition of Middle Jurassic Chiltan Formation. Habib Rahi Formation of Middle to Late Eocene outcrops in the vicinity of Jacobabad Khairpur High, a section at Rohri near Sukkur is measured in detail for lithofacies, microfacies, diagenetic analysis and sequence stratigraphy. Habib Rahi Formation is richly fossiliferous and consists of mostly limestone with subordinate clays and marl. The total thickness of the formation in this section is 28.8m. The bottom of the formation is not exposed, while the upper contact with the Sirki Shale of the Middle Eocene age is unconformable in some places. A section is measured using Jacob’s Staff method, and traverses were made perpendicular to the strike. Four different lithofacies were identified based on outcrop geology which includes coarse-grained limestone facies (HR-1 to HR-5), massive bedded limestone facies (HR-6 HR-7), and micritic limestone facies (HR-8 to HR-13) and algal dolomitic limestone facie (HR-14). Total 14 rock samples were collected from outcrop for detailed petrographic studies, and thin sections of respective samples were prepared and analyzed under the microscope. On the basis of Dunham’s (1962) classification systems after studying textures, grain size, and fossil content and using Folk’s (1959) classification system after reviewing Allochems type, four microfacies were identified. These microfacies include HR-MF 1: Benthonic Foraminiferal Wackstone/Biomicrite Microfacies, HR-MF 2: Foramineral Nummulites Wackstone-Packstone/Biomicrite Microfacies HR-MF 3: Benthonic Foraminiferal Packstone/Biomicrite Microfacies, HR-MF 4: Bioclasts Carbonate Mudstone/Micrite Microfacies. The abundance of larger benthic Foraminifera’s (LBF), including Assilina sp., A. spiral abrade, A. granulosa, A. dandotica, A. laminosa, Nummulite sp., N. fabiani, N. stratus, N. globulus, Textularia, Bioclasts, and Red algae indicates shallow marine (Tidal Flat) environment of deposition. Based on variations in rock types, grain size, and marina fauna Habib Rahi Formation shows progradational stacking patterns, which indicates coarsening upward cycles. The second order of sea-level rise is identified (spanning from Y-Persian to Bartonian age) that represents the Transgressive System Tract (TST) and a third-order Regressive System Tract (RST) (spanning from Bartonian to Priabonian age). Diagenetic processes include fossils replacement by mud, dolomitization, pressure dissolution associated stylolites features and filling with dark organic matter. The presence of the microfossils includes Nummulite. striatus, N. fabiani, and Assilina. dandotica, signify Bartonian to Priabonian age of Habib Rahi Formation.Keywords: Jacobabad Khairpur High, Habib Rahi Formation, lithofacies, microfacies, sequence stratigraphy, diagenetic history
Procedia PDF Downloads 473864 Evaluation of Random Forest and Support Vector Machine Classification Performance for the Prediction of Early Multiple Sclerosis from Resting State FMRI Connectivity Data
Authors: V. Saccà, A. Sarica, F. Novellino, S. Barone, T. Tallarico, E. Filippelli, A. Granata, P. Valentino, A. Quattrone
Abstract:
The work aim was to evaluate how well Random Forest (RF) and Support Vector Machine (SVM) algorithms could support the early diagnosis of Multiple Sclerosis (MS) from resting-state functional connectivity data. In particular, we wanted to explore the ability in distinguishing between controls and patients of mean signals extracted from ICA components corresponding to 15 well-known networks. Eighteen patients with early-MS (mean-age 37.42±8.11, 9 females) were recruited according to McDonald and Polman, and matched for demographic variables with 19 healthy controls (mean-age 37.55±14.76, 10 females). MRI was acquired by a 3T scanner with 8-channel head coil: (a)whole-brain T1-weighted; (b)conventional T2-weighted; (c)resting-state functional MRI (rsFMRI), 200 volumes. Estimated total lesion load (ml) and number of lesions were calculated using LST-toolbox from the corrected T1 and FLAIR. All rsFMRIs were pre-processed using tools from the FMRIB's Software Library as follows: (1) discarding of the first 5 volumes to remove T1 equilibrium effects, (2) skull-stripping of images, (3) motion and slice-time correction, (4) denoising with high-pass temporal filter (128s), (5) spatial smoothing with a Gaussian kernel of FWHM 8mm. No statistical significant differences (t-test, p < 0.05) were found between the two groups in the mean Euclidian distance and the mean Euler angle. WM and CSF signal together with 6 motion parameters were regressed out from the time series. We applied an independent component analysis (ICA) with the GIFT-toolbox using the Infomax approach with number of components=21. Fifteen mean components were visually identified by two experts. The resulting z-score maps were thresholded and binarized to extract the mean signal of the 15 networks for each subject. Statistical and machine learning analysis were then conducted on this dataset composed of 37 rows (subjects) and 15 features (mean signal in the network) with R language. The dataset was randomly splitted into training (75%) and test sets and two different classifiers were trained: RF and RBF-SVM. We used the intrinsic feature selection of RF, based on the Gini index, and recursive feature elimination (rfe) for the SVM, to obtain a rank of the most predictive variables. Thus, we built two new classifiers only on the most important features and we evaluated the accuracies (with and without feature selection) on test-set. The classifiers, trained on all the features, showed very poor accuracies on training (RF:58.62%, SVM:65.52%) and test sets (RF:62.5%, SVM:50%). Interestingly, when feature selection by RF and rfe-SVM were performed, the most important variable was the sensori-motor network I in both cases. Indeed, with only this network, RF and SVM classifiers reached an accuracy of 87.5% on test-set. More interestingly, the only misclassified patient resulted to have the lowest value of lesion volume. We showed that, with two different classification algorithms and feature selection approaches, the best discriminant network between controls and early MS, was the sensori-motor I. Similar importance values were obtained for the sensori-motor II, cerebellum and working memory networks. These findings, in according to the early manifestation of motor/sensorial deficits in MS, could represent an encouraging step toward the translation to the clinical diagnosis and prognosis.Keywords: feature selection, machine learning, multiple sclerosis, random forest, support vector machine
Procedia PDF Downloads 241863 Enhancing the Recruitment Process through Machine Learning: An Automated CV Screening System
Authors: Kaoutar Ben Azzou, Hanaa Talei
Abstract:
Human resources is an important department in each organization as it manages the life cycle of employees from recruitment training to retirement or termination of contracts. The recruitment process starts with a job opening, followed by a selection of the best-fit candidates from all applicants. Matching the best profile for a job position requires a manual way of looking at many CVs, which requires hours of work that can sometimes lead to choosing not the best profile. The work presented in this paper aims at reducing the workload of HR personnel by automating the preliminary stages of the candidate screening process, thereby fostering a more streamlined recruitment workflow. This tool introduces an automated system designed to help with the recruitment process by scanning candidates' CVs, extracting pertinent features, and employing machine learning algorithms to decide the most fitting job profile for each candidate. Our work employs natural language processing (NLP) techniques to identify and extract key features from unstructured text extracted from a CV, such as education, work experience, and skills. Subsequently, the system utilizes these features to match candidates with job profiles, leveraging the power of classification algorithms.Keywords: automated recruitment, candidate screening, machine learning, human resources management
Procedia PDF Downloads 57862 Performance Measurement of Logistics Systems for Thailand's Wholesales and Retails Industries by Data Envelopment Analysis
Authors: Pornpimol Chaiwuttisak
Abstract:
The study aims to compare the performance of the logistics for Thailand’s wholesale and retail trade industries (except motor vehicles, motorcycle, and stalls) by using data (data envelopment analysis). Thailand Standard Industrial Classification in 2009 (TSIC - 2009) categories that industries into sub-group no. 45: wholesale and retail trade (except for the repair of motor vehicles and motorcycles), sub-group no. 46: wholesale trade (except motor vehicles and motorcycles), and sub-group no. 47: retail trade (except motor vehicles and motorcycles. Data used in the study is collected by the National Statistical Office, Thailand. The study consisted of four input factors include the number of companies, the number of personnel in logistics, the training cost in logistics, and outsourcing logistics management. Output factor includes the percentage of enterprises having inventory management. The results showed that the average relative efficiency of small-sized enterprises equals to 27.87 percent and 49.68 percent for the medium-sized enterprises.Keywords: DEA, wholesales and retails, logistics, Thailand
Procedia PDF Downloads 417861 Image Segmentation: New Methods
Authors: Flaurence Benjamain, Michel Casperance
Abstract:
We present in this paper, first, a comparative study of three mathematical theories to achieve the fusion of information sources. This study aims to identify the characteristics inherent in theories of possibilities, belief functions (DST) and plausible and paradoxical reasoning to establish a strategy of choice that allows us to adopt the most appropriate theory to solve a problem of fusion in order, taking into account the acquired information and imperfections that accompany them. Using the new theory of plausible and paradoxical reasoning, also called Dezert-Smarandache Theory (DSmT), to fuse information multi-sources needs, at first step, the generation of the composites events witch is, in general, difficult. Thus, we present in this paper a new approach to construct pertinent paradoxical classes based on gray levels histograms, which also allows to reduce the cardinality of the hyper-powerset. Secondly, we developed a new technique for order and coding generalized focal elements. This method is exploited, in particular, to calculate the cardinality of Dezert and Smarandache. Then, we give an experimentation of classification of a remote sensing image that illustrates the given methods and we compared the result obtained by the DSmT with that resulting from the use of the DST and theory of possibilities.Keywords: segmentation, image, approach, vision computing
Procedia PDF Downloads 278860 Design an Development of an Agorithm for Prioritizing the Test Cases Using Neural Network as Classifier
Authors: Amit Verma, Simranjeet Kaur, Sandeep Kaur
Abstract:
Test Case Prioritization (TCP) has gained wide spread acceptance as it often results in good quality software free from defects. Due to the increase in rate of faults in software traditional techniques for prioritization results in increased cost and time. Main challenge in TCP is difficulty in manually validate the priorities of different test cases due to large size of test suites and no more emphasis are made to make the TCP process automate. The objective of this paper is to detect the priorities of different test cases using an artificial neural network which helps to predict the correct priorities with the help of back propagation algorithm. In our proposed work one such method is implemented in which priorities are assigned to different test cases based on their frequency. After assigning the priorities ANN predicts whether correct priority is assigned to every test case or not otherwise it generates the interrupt when wrong priority is assigned. In order to classify the different priority test cases classifiers are used. Proposed algorithm is very effective as it reduces the complexity with robust efficiency and makes the process automated to prioritize the test cases.Keywords: test case prioritization, classification, artificial neural networks, TF-IDF
Procedia PDF Downloads 398859 Polarity Classification of Social Media Comments in Turkish
Authors: Migena Ceyhan, Zeynep Orhan, Dimitrios Karras
Abstract:
People in modern societies are continuously sharing their experiences, emotions, and thoughts in different areas of life. The information reaches almost everyone in real-time and can have an important impact in shaping people’s way of living. This phenomenon is very well recognized and advantageously used by the market representatives, trying to earn the most from this means. Given the abundance of information, people and organizations are looking for efficient tools that filter the countless data into important information, ready to analyze. This paper is a modest contribution in this field, describing the process of automatically classifying social media comments in the Turkish language into positive or negative. Once data is gathered and preprocessed, feature sets of selected single words or groups of words are build according to the characteristics of language used in the texts. These features are used later to train, and test a system according to different machine learning algorithms (Naïve Bayes, Sequential Minimal Optimization, J48, and Bayesian Linear Regression). The resultant high accuracies can be important feedback for decision-makers to improve the business strategies accordingly.Keywords: feature selection, machine learning, natural language processing, sentiment analysis, social media reviews
Procedia PDF Downloads 147858 Hyperspectral Mapping Methods for Differentiating Mangrove Species along Karachi Coast
Authors: Sher Muhammad, Mirza Muhammad Waqar
Abstract:
It is necessary to monitor and identify mangroves types and spatial extent near coastal areas because it plays an important role in coastal ecosystem and environmental protection. This research aims at identifying and mapping mangroves types along Karachi coast ranging from 24.79 to 24.85 degree in latitude and 66.91 to 66.97 degree in longitude using hyperspectral remote sensing data and techniques. Image acquired during February, 2012 through Hyperion sensor have been used for this research. Image preprocessing includes geometric and radiometric correction followed by Minimum Noise Fraction (MNF) and Pixel Purity Index (PPI). The output of MNF and PPI has been analyzed by visualizing it in n-dimensions for end-member extraction. Well-distributed clusters on the n-dimensional scatter plot have been selected with the region of interest (ROI) tool as end members. These end members have been used as an input for classification techniques applied to identify and map mangroves species including Spectral Angle Mapper (SAM), Spectral Feature Fitting (SFF), and Spectral Information Diversion (SID). Only two types of mangroves namely Avicennia Marina (white mangroves) and Avicennia Germinans (black mangroves) have been observed throughout the study area.Keywords: mangrove, hyperspectral, hyperion, SAM, SFF, SID
Procedia PDF Downloads 362857 Land Suitability Analysis for Maize Production in Egbeda Local Government Area of Oyo State Using GIS Techniques
Authors: Abegunde Linda, Adedeji Oluwatayo, Tope-Ajayi Opeyemi
Abstract:
Maize constitutes a major agrarian production for use by the vast population but despite its economic importance, it has not been produced to meet the economic needs of the country. Achieving optimum yield in maize can meaningfully be supported by land suitability analysis in order to guarantee self-sufficiency for future production optimization. This study examines land suitability for maize production through the analysis of the physic-chemical variations in soil properties over space using a Geographic Information System (GIS) framework. Physic-chemical parameters of importance selected include slope, landuse, and physical and chemical properties of the soil. Landsat imagery was used to categorize the landuse, Shuttle Radar Topographic Mapping (SRTM) generated the slope and soil samples were analyzed for its physical and chemical components. Suitability was categorized into highly, moderately and marginally suitable based on Food and Agricultural Organisation (FAO) classification using the Analytical Hierarchy Process (AHP) technique of GIS. This result can be used by small scale farmers for efficient decision making in the allocation of land for maize production.Keywords: AHP, GIS, MCE, suitability, Zea mays
Procedia PDF Downloads 396856 Grammatical and Lexical Cohesion in the Japan’s Prime Minister Shinzo Abe’s Speech Text ‘Nihon wa Modottekimashita’
Authors: Nadya Inda Syartanti
Abstract:
This research aims to identify, classify, and analyze descriptively the aspects of grammatical and lexical cohesion in the speech text of Japan’s Prime Minister Shinzo Abe entitled Nihon wa Modotte kimashita delivered in Washington DC, the United States on February 23, 2013, as a research data source. The method used is qualitative research, which uses descriptions through words that are applied by analyzing aspects of grammatical and lexical cohesion proposed by Halliday and Hasan (1976). The aspects of grammatical cohesion consist of references (personal, demonstrative, interrogative pronouns), substitution, ellipsis, and conjunction. In contrast, lexical cohesion consists of reiteration (repetition, synonym, antonym, hyponym, meronym) and collocation. Data classification is based on the 6 aspects of the cohesion. Through some aspects of cohesion, this research tries to find out the frequency of using grammatical and lexical cohesion in Shinzo Abe's speech text entitled Nihon wa Modotte kimashita. The results of this research are expected to help overcome the difficulty of understanding speech texts in Japanese. Therefore, this research can be a reference for learners, researchers, and anyone who is interested in the field of discourse analysis.Keywords: cohesion, grammatical cohesion, lexical cohesion, speech text, Shinzo Abe
Procedia PDF Downloads 162855 Deep-Learning Coupled with Pragmatic Categorization Method to Classify the Urban Environment of the Developing World
Authors: Qianwei Cheng, A. K. M. Mahbubur Rahman, Anis Sarker, Abu Bakar Siddik Nayem, Ovi Paul, Amin Ahsan Ali, M. Ashraful Amin, Ryosuke Shibasaki, Moinul Zaber
Abstract:
Thomas Friedman, in his famous book, argued that the world in this 21st century is flat and will continue to be flatter. This is attributed to rapid globalization and the interdependence of humanity that engendered tremendous in-flow of human migration towards the urban spaces. In order to keep the urban environment sustainable, policy makers need to plan based on extensive analysis of the urban environment. With the advent of high definition satellite images, high resolution data, computational methods such as deep neural network analysis, and hardware capable of high-speed analysis; urban planning is seeing a paradigm shift. Legacy data on urban environments are now being complemented with high-volume, high-frequency data. However, the first step of understanding urban space lies in useful categorization of the space that is usable for data collection, analysis, and visualization. In this paper, we propose a pragmatic categorization method that is readily usable for machine analysis and show applicability of the methodology on a developing world setting. Categorization to plan sustainable urban spaces should encompass the buildings and their surroundings. However, the state-of-the-art is mostly dominated by classification of building structures, building types, etc. and largely represents the developed world. Hence, these methods and models are not sufficient for developing countries such as Bangladesh, where the surrounding environment is crucial for the categorization. Moreover, these categorizations propose small-scale classifications, which give limited information, have poor scalability and are slow to compute in real time. Our proposed method is divided into two steps-categorization and automation. We categorize the urban area in terms of informal and formal spaces and take the surrounding environment into account. 50 km × 50 km Google Earth image of Dhaka, Bangladesh was visually annotated and categorized by an expert and consequently a map was drawn. The categorization is based broadly on two dimensions-the state of urbanization and the architectural form of urban environment. Consequently, the urban space is divided into four categories: 1) highly informal area; 2) moderately informal area; 3) moderately formal area; and 4) highly formal area. In total, sixteen sub-categories were identified. For semantic segmentation and automatic categorization, Google’s DeeplabV3+ model was used. The model uses Atrous convolution operation to analyze different layers of texture and shape. This allows us to enlarge the field of view of the filters to incorporate larger context. Image encompassing 70% of the urban space was used to train the model, and the remaining 30% was used for testing and validation. The model is able to segment with 75% accuracy and 60% Mean Intersection over Union (mIoU). In this paper, we propose a pragmatic categorization method that is readily applicable for automatic use in both developing and developed world context. The method can be augmented for real-time socio-economic comparative analysis among cities. It can be an essential tool for the policy makers to plan future sustainable urban spaces.Keywords: semantic segmentation, urban environment, deep learning, urban building, classification
Procedia PDF Downloads 192854 Long Short-Term Memory Based Model for Modeling Nicotine Consumption Using an Electronic Cigarette and Internet of Things Devices
Authors: Hamdi Amroun, Yacine Benziani, Mehdi Ammi
Abstract:
In this paper, we want to determine whether the accurate prediction of nicotine concentration can be obtained by using a network of smart objects and an e-cigarette. The approach consists of, first, the recognition of factors influencing smoking cessation such as physical activity recognition and participant’s behaviors (using both smartphone and smartwatch), then the prediction of the configuration of the e-cigarette (in terms of nicotine concentration, power, and resistance of e-cigarette). The study uses a network of commonly connected objects; a smartwatch, a smartphone, and an e-cigarette transported by the participants during an uncontrolled experiment. The data obtained from sensors carried in the three devices were trained by a Long short-term memory algorithm (LSTM). Results show that our LSTM-based model allows predicting the configuration of the e-cigarette in terms of nicotine concentration, power, and resistance with a root mean square error percentage of 12.9%, 9.15%, and 11.84%, respectively. This study can help to better control consumption of nicotine and offer an intelligent configuration of the e-cigarette to users.Keywords: Iot, activity recognition, automatic classification, unconstrained environment
Procedia PDF Downloads 225853 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach
Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik
Abstract:
We propose a system to real environmental noise and channel mismatch for forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated by using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at a signal to noise ratio (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that the MFCC feature warping-ICA achieves a reduction in equal error rate about (48.22%, 44.66%, and 50.07%) over using MFCC feature warping when the test speech signals are corrupted with random sessions of street, car, and home noises at -10 dB SNR.Keywords: noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping
Procedia PDF Downloads 408852 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems
Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan
Abstract:
Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.Keywords: hybrid storage system, data mining, recurrent neural network, support vector machine
Procedia PDF Downloads 309851 A Machine Learning Based Method to Detect System Failure in Resource Constrained Environment
Authors: Payel Datta, Abhishek Das, Abhishek Roychoudhury, Dhiman Chattopadhyay, Tanushyam Chattopadhyay
Abstract:
Machine learning (ML) and deep learning (DL) is most predominantly used in image/video processing, natural language processing (NLP), audio and speech recognition but not that much used in system performance evaluation. In this paper, authors are going to describe the architecture of an abstraction layer constructed using ML/DL to detect the system failure. This proposed system is used to detect the system failure by evaluating the performance metrics of an IoT service deployment under constrained infrastructure environment. This system has been tested on the manually annotated data set containing different metrics of the system, like number of threads, throughput, average response time, CPU usage, memory usage, network input/output captured in different hardware environments like edge (atom based gateway) and cloud (AWS EC2). The main challenge of developing such system is that the accuracy of classification should be 100% as the error in the system has an impact on the degradation of the service performance and thus consequently affect the reliability and high availability which is mandatory for an IoT system. Proposed ML/DL classifiers work with 100% accuracy for the data set of nearly 4,000 samples captured within the organization.Keywords: machine learning, system performance, performance metrics, IoT, edge
Procedia PDF Downloads 195850 Water Detection in Aerial Images Using Fuzzy Sets
Authors: Caio Marcelo Nunes, Anderson da Silva Soares, Gustavo Teodoro Laureano, Clarimar Jose Coelho
Abstract:
This paper presents a methodology to pixel recognition in aerial images using fuzzy $c$-means algorithm. This algorithm is a alternative to recognize areas considering uncertainties and inaccuracies. Traditional clustering technics are used in recognizing of multispectral images of earth's surface. This technics recognize well-defined borders that can be easily discretized. However, in the real world there are many areas with uncertainties and inaccuracies which can be mapped by clustering algorithms that use fuzzy sets. The methodology presents in this work is applied to multispectral images obtained from Landsat-5/TM satellite. The pixels are joined using the $c$-means algorithm. After, a classification process identify the types of surface according the patterns obtained from spectral response of image surface. The classes considered are, exposed soil, moist soil, vegetation, turbid water and clean water. The results obtained shows that the fuzzy clustering identify the real type of the earth's surface.Keywords: aerial images, fuzzy clustering, image processing, pattern recognition
Procedia PDF Downloads 484849 An Empirical Study to Predict Myocardial Infarction Using K-Means and Hierarchical Clustering
Authors: Md. Minhazul Islam, Shah Ashisul Abed Nipun, Majharul Islam, Md. Abdur Rakib Rahat, Jonayet Miah, Salsavil Kayyum, Anwar Shadaab, Faiz Al Faisal
Abstract:
The target of this research is to predict Myocardial Infarction using unsupervised Machine Learning algorithms. Myocardial Infarction Prediction related to heart disease is a challenging factor faced by doctors & hospitals. In this prediction, accuracy of the heart disease plays a vital role. From this concern, the authors have analyzed on a myocardial dataset to predict myocardial infarction using some popular Machine Learning algorithms K-Means and Hierarchical Clustering. This research includes a collection of data and the classification of data using Machine Learning Algorithms. The authors collected 345 instances along with 26 attributes from different hospitals in Bangladesh. This data have been collected from patients suffering from myocardial infarction along with other symptoms. This model would be able to find and mine hidden facts from historical Myocardial Infarction cases. The aim of this study is to analyze the accuracy level to predict Myocardial Infarction by using Machine Learning techniques.Keywords: Machine Learning, K-means, Hierarchical Clustering, Myocardial Infarction, Heart Disease
Procedia PDF Downloads 204848 Flood Monitoring in the Vietnamese Mekong Delta Using Sentinel-1 SAR with Global Flood Mapper
Authors: Ahmed S. Afifi, Ahmed Magdy
Abstract:
Satellite monitoring is an essential tool to study, understand, and map large-scale environmental changes that affect humans, climate, and biodiversity. The Sentinel-1 Synthetic Aperture Radar (SAR) instrument provides a high collection of data in all-weather, short revisit time, and high spatial resolution that can be used effectively in flood management. Floods occur when an overflow of water submerges dry land that requires to be distinguished from flooded areas. In this study, we use global flood mapper (GFM), a new google earth engine application that allows users to quickly map floods using Sentinel-1 SAR. The GFM enables the users to adjust manually the flood map parameters, e.g., the threshold for Z-value for VV and VH bands and the elevation and slope mask threshold. The composite R:G:B image results by coupling the bands of Sentinel-1 (VH:VV:VH) reduces false classification to a large extent compared to using one separate band (e.g., VH polarization band). The flood mapping algorithm in the GFM and the Otsu thresholding are compared with Sentinel-2 optical data. And the results show that the GFM algorithm can overcome the misclassification of a flooded area in An Giang, Vietnam.Keywords: SAR backscattering, Sentinel-1, flood mapping, disaster
Procedia PDF Downloads 107847 Classifying Facial Expressions Based on a Motion Local Appearance Approach
Authors: Fabiola M. Villalobos-Castaldi, Nicolás C. Kemper, Esther Rojas-Krugger, Laura G. Ramírez-Sánchez
Abstract:
This paper presents the classification results about exploring the combination of a motion based approach with a local appearance method to describe the facial motion caused by the muscle contractions and expansions that are presented in facial expressions. The proposed feature extraction method take advantage of the knowledge related to which parts of the face reflects the highest deformations, so we selected 4 specific facial regions at which the appearance descriptor were applied. The most common used approaches for feature extraction are the holistic and the local strategies. In this work we present the results of using a local appearance approach estimating the correlation coefficient to the 4 corresponding landmark-localized facial templates of the expression face related to the neutral face. The results let us to probe how the proposed motion estimation scheme based on the local appearance correlation computation can simply and intuitively measure the motion parameters for some of the most relevant facial regions and how these parameters can be used to recognize facial expressions automatically.Keywords: facial expression recognition system, feature extraction, local-appearance method, motion-based approach
Procedia PDF Downloads 414846 Near Infrared Spectrometry to Determine the Quality of Milk, Experimental Design Setup and Chemometrics: Review
Authors: Meghana Shankara, Priyadarshini Natarajan
Abstract:
Infrared (IR) spectroscopy has revolutionized the way we look at materials around us. Unraveling the pattern in the molecular spectra of materials to analyze the composition and properties of it has been one of the most interesting challenges in modern science. Applications of the IR spectrometry are numerous in the field’s pharmaceuticals, health, food and nutrition, oils, agriculture, construction, polymers, beverage, fabrics and much more limited only by the curiosity of the people. Near Infrared (NIR) spectrometry is applied robustly in analyzing the solids and liquid substances because of its non-destructive analysis method. In this paper, we have reviewed the application of NIR spectrometry in milk quality analysis and have presented the modes of measurement applied in NIRS measurement setup, Design of Experiment (DoE), classification/quantification algorithms used in the case of milk composition prediction like Fat%, Protein%, Lactose%, Solids Not Fat (SNF%) along with different approaches for adulterant identification. We have also discussed the important NIR ranges for the chosen milk parameters. The performance metrics used in the comparison of the various Chemometric approaches include Root Mean Square Error (RMSE), R^2, slope, offset, sensitivity, specificity and accuracyKeywords: chemometrics, design of experiment, milk quality analysis, NIRS measurement modes
Procedia PDF Downloads 271845 Identification of Landslide Features Using Back-Propagation Neural Network on LiDAR Digital Elevation Model
Authors: Chia-Hao Chang, Geng-Gui Wang, Jee-Cheng Wu
Abstract:
The prediction of a landslide is a difficult task because it requires a detailed study of past activities using a complete range of investigative methods to determine the changing condition. In this research, first step, LiDAR 1-meter by 1-meter resolution of digital elevation model (DEM) was used to generate six environmental factors of landslide. Then, back-propagation neural networks (BPNN) was adopted to identify scarp, landslide areas and non-landslide areas. The BPNN uses 6 environmental factors in input layer and 1 output layer. Moreover, 6 landslide areas are used as training areas and 4 landslide areas as test areas in the BPNN. The hidden layer is set to be 1 and 2; the hidden layer neurons are set to be 4, 5, 6, 7 and 8; the learning rates are set to be 0.01, 0.1 and 0.5. When using 1 hidden layer with 7 neurons and the learning rate sets to be 0.5, the result of Network training root mean square error is 0.001388. Finally, evaluation of BPNN classification accuracy by the confusion matrix shows that the overall accuracy can reach 94.4%, and the Kappa value is 0.7464.Keywords: digital elevation model, DEM, environmental factors, back-propagation neural network, BPNN, LiDAR
Procedia PDF Downloads 145844 Subfamilial Relationships within Solanaceae as Inferred from atpB-rbcL Intergenic Spacer
Authors: Syeda Qamarunnisa, Ishrat Jamil, Abid Azhar, Zabta K. Shinwari, Syed Irtifaq Ali
Abstract:
A phylogenetic analysis of family Solanaceae was conducted using sequence data from the chloroplast intergenic atpB-rbcL spacer. Sequence data was generated from 17 species representing 09 out of 14 genera of Solanaceae from Pakistan. Cladogram was constructed using maximum parsimony method and results indicate that Solanaceae is mainly divided into two subfamilies; Solanoideae and Cestroideae. Four major clades within Solanoideae represent tribes; Physaleae, Capsiceae, Datureae and Solaneae are supported by high bootstrap value and the relationships among them are not corroborating with the previous studies. The findings established that subfamily Cestroideae comprised of three genera; Cestrum, Lycium, and Nicotiana with high bootstrap support. Position of Nicotiana inferred with atpB-rbcL sequence is congruent with traditional classification, which placed the taxa in Cestroideae. In the current study Lycium unexpectedly nested with Nicotiana with 100% bootstrap support and identified as a member of tribe Nicotianeae. Expanded sampling of other genera from Pakistan could be valuable towards improving our understanding of intrafamilial relationships within Solanaceae.Keywords: systematics, solanaceae, phylogenetics, intergenic spacer, tribes
Procedia PDF Downloads 469