Search results for: content classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7876

7576 A Comparative Study for Various Techniques Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBCs) are the most common type of blood cell and are the most intensively studied in cell biology. A lack of RBCs, a condition in which the hemoglobin level is lower than normal, is referred to as “anemia”. Abnormalities in RBCs affect the exchange of oxygen. This paper presents a comparative study of various techniques for classifying red blood cells as normal or abnormal (anemic) using WEKA. WEKA is an open-source tool that consists of different machine learning algorithms for data mining applications. The algorithms tested are the Radial Basis Function neural network, the Support Vector Machine, and the K-Nearest Neighbors algorithm. Two sets of combined features were utilized for the classification of blood cell images. The first set, consisting exclusively of geometrical features, was used to identify whether the tested blood cell has a spherical or non-spherical shape, while the second set, consisting mainly of textural features, was used to recognize the types of the spherical cells. We provide an evaluation based on applying these classification methods to our RBC image dataset, which was obtained from Serdang Hospital, Malaysia, and measuring the accuracy of the test results. The best achieved classification rates are 97%, 98%, and 79% for the Support Vector Machine, Radial Basis Function neural network, and K-Nearest Neighbors algorithm, respectively.
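As a rough illustration of one of the three classifiers compared above, the following sketch implements k-nearest neighbors in plain Python rather than in WEKA. The two features (a circularity value and a texture score), the toy samples, and the function name `knn_predict` are hypothetical and are not taken from the paper's dataset.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query.

    train: list of (feature_vector, label) pairs; query: a feature vector.
    """
    # Order training samples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Majority vote among the k nearest neighbours.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical samples: (circularity, texture score) -> label.
toy_train = [
    ((0.95, 0.20), "normal"),
    ((0.92, 0.25), "normal"),
    ((0.90, 0.22), "normal"),
    ((0.60, 0.70), "abnormal"),
    ((0.55, 0.65), "abnormal"),
    ((0.58, 0.75), "abnormal"),
]

print(knn_predict(toy_train, (0.93, 0.21)))
```

The choice of `k` and of the distance measure are assumptions of this sketch; the abstract does not state the settings used in the study.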

Keywords: red blood cells, classification, radial basis function neural networks, support vector machine, k-nearest neighbors algorithm

Procedia PDF Downloads 457
7575 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah

Abstract:

Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, semantic interpretation is a challenging task in HSI analysis. In this paper, we focus on object classification as a form of HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: the spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. In particular, the high number of spectral bands combined with the low number of training samples gives rise to the curse of dimensionality. To resolve this problem, we propose introducing a dimensionality reduction step to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on a spatial hypergraph embedding model that represents higher-order relationships, with different weights for the spatial neighbors corresponding to the centroid of the pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.

Keywords: dimensionality reduction, hyperspectral image, semantic interpretation, spatial hypergraph

Procedia PDF Downloads 288
7574 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity, and that come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver integrated (big) data related services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 139
7573 Brand Content Optimization: A Major Challenge for Sellers on Marketplaces

Authors: Richardson Ciguene, Bertrand Marron, Nicolas Habert

Abstract:

Today, more and more consumers are purchasing their products and services online. At the same time, the penetration rate of very small and medium-sized businesses on marketplaces continues to increase, which has the direct impact of intensifying competition between sellers. Thus, only the best-optimized deals are ranked well by algorithms and are visible to consumers. However, it is almost impossible to know all the Brand Content rules and criteria established by marketplaces, which is essential to optimizing their product sheets, especially since these rules change constantly. In this paper, we propose to detail this question of Brand Content optimization by taking into account the case of Amazon in order to capture the scientific dimension behind such a subject. In a second step, we will present the genesis of our research project, DEEPERFECT, which aims to set up original methods and effective tools in order to help sellers present on marketplaces in the optimization of their branded content.

Keywords: e-commerce, scoring, marketplace, Amazon, brand content, product sheets

Procedia PDF Downloads 103
7572 A Study of Some Water Relations and Soil Salinity Using Geotextile Mat under Sprinkler System

Authors: Al-Molhem, Y.

Abstract:

This work aimed to study the influence of a geotextile material under sprinkler irrigation on the availability of soil moisture content and the salinity of the top 40 cm of the soil profile. A field experiment was carried out to measure soil moisture content, soil salinity and water application efficiency under a sprinkler irrigation system. The results indicated that mats placed at 20 cm depth increase the availability of soil moisture content in the root zone. The results further showed increases in water application efficiency because of the use of the geotextile material. In addition, soil salinity in the root zone decreased because of the increased soil moisture content.

Keywords: geotextile, moisture content, sprinkler irrigation

Procedia PDF Downloads 371
7571 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

An imbalanced data set, a problem often found in real-world applications, can have a seriously negative effect on the classification performance of machine learning algorithms. There have been many attempts at dealing with the classification of imbalanced data sets. In medical diagnosis classification, we often face an imbalance in the number of data samples between classes, in which there are not enough samples in rare classes. In this paper, we propose a learning method based on a cost-sensitive extension of the Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weights, together with some rules of thumb to determine those weights. After the balancing phase, we apply different classifiers (support vector machine (SVM), k-nearest neighbor (KNN) and multilayer neural networks (MNN)) to the balanced data set. We also compare the results obtained before and after applying the balancing method.
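To make the idea of a cost-sensitive LMS update concrete, the sketch below scales each Widrow-Hoff update by a per-class weight. It is only an illustration under stated assumptions: the inverse-class-frequency weighting, the toy one-dimensional data, and the names `train_cost_sensitive_lms` and `predict` are hypothetical and are not the paper's rules of thumb or implementation.

```python
def train_cost_sensitive_lms(samples, labels, class_weight, lr=0.05, epochs=200):
    """Fit a linear model with the LMS rule, scaling each sample's update
    by the cost assigned to its class.

    samples: list of feature lists; labels: +1 or -1; class_weight: {label: cost}.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            prediction = sum(wi * xi for wi, xi in zip(w, x)) + b
            error = y - prediction
            # Errors on costly (rare) classes move the weights further.
            step = lr * class_weight[y] * error
            w = [wi + step * xi for wi, xi in zip(w, x)]
            b += step
    return w, b

def predict(w, b, x):
    """Classify x by the sign of the linear output."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical imbalanced toy data: six majority (-1) samples, two minority (+1).
X = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.5], [0.9], [1.0]]
y = [-1, -1, -1, -1, -1, -1, 1, 1]
# Assumed rule of thumb: weight each class by the inverse of its frequency.
weights = {-1: len(y) / (2 * 6), 1: len(y) / (2 * 2)}
w, b = train_cost_sensitive_lms(X, y, weights)
print(predict(w, b, [1.0]), predict(w, b, [0.0]))
```

With these weights both classes contribute equally to the total squared error, so the fitted boundary is not pulled toward the minority samples simply because they are few.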

Keywords: multilayer neural networks, k-nearest neighbor, support vector machine, imbalanced medical data, least mean square algorithm, diabetes

Procedia PDF Downloads 510
7570 Unsupervised Classification of DNA Barcodes Species Using Multi-Library Wavelet Networks

Authors: Abdesselem Dakhli, Wajdi Bellil, Chokri Ben Amar

Abstract:

A DNA barcode is a short mitochondrial DNA fragment made up of three subunits: a phosphate group, a sugar, and nucleic bases (A, T, C, and G). DNA barcodes provide good sources of the information needed to classify living species. This intuition has been confirmed by many experimental results. Species classification with DNA barcode sequences has been studied by several researchers. The classification problem assigns unknown species to known ones by analyzing their barcodes. This task has to be supported by reliable methods and algorithms. To analyze species regions or entire genomes, it becomes necessary to use sequence similarity methods. A large set of sequences can be compared simultaneously using Multiple Sequence Alignment, which is known to be NP-complete. To make this type of analysis feasible, heuristics such as progressive alignment have been developed. Another tool for similarity search against a database of sequences is BLAST, which outputs shorter regions of high similarity between a query sequence and matched sequences in the database. However, all these methods are still computationally very expensive and require significant computational infrastructure. Our goal is to build predictive models that are highly accurate and interpretable. This method avoids the complex problem of form and structure in different classes of organisms. The models are evaluated on empirical data, and their classification performance is compared with that of other methods. Our system consists of three phases. The first, called transformation, is composed of three steps: Electron-Ion Interaction Pseudopotential (EIIP) codification of DNA barcodes, Fourier transform, and power spectrum signal processing. The second, called approximation, relies on Multi Library Wavelet Neural Networks (MLWNN). The third is the classification of DNA barcodes, which is realized by applying a hierarchical classification algorithm.
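The transformation phase described above (EIIP codification, Fourier transform, power spectrum) can be sketched in a few lines. This is a minimal illustration, not the authors' code: it uses the commonly cited EIIP values for the four nucleotides, a naive discrete Fourier transform, and a hypothetical barcode fragment; the function names are assumptions.

```python
import cmath

# Commonly cited EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

def eiip_encode(sequence):
    """Map a nucleotide string to its numeric EIIP signal."""
    return [EIIP[base] for base in sequence.upper()]

def power_spectrum(signal):
    """Return |X[k]|^2 for the DFT of signal (naive O(n^2) transform)."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(
            value * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, value in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum

barcode = "ATGCATGCATGC"  # hypothetical fragment, not a real barcode
features = power_spectrum(eiip_encode(barcode))
print(len(features))
```

In the pipeline described in the abstract, a spectrum of this kind would then be passed to the wavelet network approximation phase; that step is not sketched here.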

Keywords: DNA barcode, electron-ion interaction pseudopotential, Multi Library Wavelet Neural Networks (MLWNN)

Procedia PDF Downloads 294
7569 The Isolation of Enterobacter Ludwigii Strain T976 from Nicotiana Tabacum L. Yunyan 97 and Its Application Study

Authors: Gao Qin, Hu Liwei, Dong Xiangzhou, Zhu Qifa, Cheng Tingming, Zhao Limei, Yang Mengmeng, Zhai Zhen, Dai Huaxin, Liang Taibo, Zhang Shixiang, Xue Chaoqun

Abstract:

The functional starch-degrading strain T976 was isolated from tobacco leaves of Nicotiana tabacum L. Yunyan 97. The ratio of the starch hydrolysis transparent circle diameter to the colony diameter of the strain was 4.14, and 16S rDNA sequencing identified the strain as Enterobacter ludwigii. Enterobacter ludwigii T976 was then fermented and sprayed on Yunyan 97 plants in the vigorous growing stage. After a single spraying of the fermentation broth of Enterobacter ludwigii T976, the starch content of upper leaves decreased slightly, from 3.77% to 3.1%, the reducing sugar content increased from 4.39% to 5.53%, and the total sugar content increased from 5.82% to 7.39%. The chemical content was also checked after three sprayings. The starch content of middle leaves decreased from 5.63% to 3.74%, while the content of total sugar and reducing sugar decreased slightly. The starch content of upper leaves decreased from 7.62% to 4.78%, with total sugar and reducing sugar decreasing slightly, and the starch content of middle leaves decreased from 6.27% to 3.62%, with total sugar and reducing sugar not changing much; other chemical components were in a suitable range.

Keywords: nicotiana tabacum, yunyan 97, leaf, starch, degradation, enterobacter ludwigii

Procedia PDF Downloads 27
7568 Enhanced Arabic Semantic Information Retrieval System Based on Arabic Text Classification

Authors: A. Elsehemy, M. Abdeen, T. Nazmy

Abstract:

Since the appearance of the Semantic Web, many semantic search techniques and models have been proposed to exploit the information in ontologies to enhance traditional keyword-based search. Many advances have been made in languages such as English, German, French and Spanish. However, other languages such as Arabic are not fully supported yet. In this paper, we present a framework for ontology-based information retrieval for the Arabic language. Our system consists of four main modules, namely a query parser, an indexer, a search module and a ranking module. Our approach includes building a semantic index by linking ontology concepts to documents, including an annotation weight for each link, to be used in ranking the results. We also augmented the framework with an automatic document categorizer, which enhances the overall document ranking. We have built three Arabic domain ontologies, Sports, Economics and Politics, as examples for the Arabic language. We built a knowledge base that consists of 79 classes and more than 1456 instances. The system is evaluated using the precision and recall metrics. We performed many retrieval operations on a sample of 40,316 documents with a size of 320 MB of pure text. The results show that semantic search enhanced with text classification gives better performance than the system without classification.
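The semantic index described above, in which ontology concepts are linked to documents with an annotation weight per link, can be pictured with a small data structure. The sketch below is an assumption-laden illustration: the concept names, document ids, weights, and the additive scoring rule are hypothetical, and the abstract does not say how the weights are combined during ranking.

```python
from collections import defaultdict

def build_index(annotations):
    """Build {concept: {doc_id: weight}} from (concept, doc_id, weight) links."""
    index = defaultdict(dict)
    for concept, doc_id, weight in annotations:
        index[concept][doc_id] = weight
    return index

def rank(index, query_concepts):
    """Rank documents by the sum of annotation weights over the query concepts."""
    scores = defaultdict(float)
    for concept in query_concepts:
        for doc_id, weight in index.get(concept, {}).items():
            scores[doc_id] += weight
    # Highest score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical links between ontology concepts and documents.
links = [
    ("football", "d1", 0.9),
    ("football", "d2", 0.4),
    ("league", "d2", 0.3),
    ("election", "d3", 0.8),
]
index = build_index(links)
print(rank(index, ["football", "league"]))
```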

Keywords: Arabic text classification, ontology based retrieval, Arabic semantic web, information retrieval, Arabic ontology

Procedia PDF Downloads 502
7567 Content Analysis of ‘Junk Food’ Content in Children’s TV Programmes: A Comparison of UK Broadcast TV and Video-On-Demand Services

Authors: Shreesh Sinha, Alexander B. Barker, Megan Parkin, Emma Wilson, Rachael L. Murray

Abstract:

Background and Objectives: Exposure to imagery of foods high in fat, sugar or salt (HFSS) is associated with the consumption of HFSS foods, and subsequently obesity, among young people. We report and compare the results of two content analyses, one of two popular terrestrial children's television channels in the UK and the other of a selection of children's programmes available on video-on-demand (VOD) streaming sites. Methods: Content analysis of three days' worth of programmes (including advertisements) on two popular children's television channels broadcast on UK television (CBeebies and Milkshake), as well as a sample of the 40 highest-rated children's programmes available on the VOD platforms Netflix and Amazon Prime, using 1-minute interval coding. Results: HFSS content was seen in 181 broadcasts (36%) and in 417 intervals (13%) on terrestrial television; 'Milkshake' had a significantly higher proportion of programmes/adverts containing HFSS content than 'CBeebies'. On VOD platforms, HFSS content was seen in 82 episodes (72% of the total number of episodes), across 459 intervals (19% of the total number of intervals), with no significant difference in the proportion of programmes containing HFSS content between Netflix and Amazon Prime. Conclusions: This study demonstrates that HFSS content is common in both popular UK children's television channels and children's programmes on VOD services. Since previous research has shown that HFSS content in the media has an effect on HFSS consumption, children's television programmes broadcast either on TV or VOD services are likely to have an effect on HFSS consumption in children, and legislative opportunities to prevent this exposure are being missed.

Keywords: public health, junk food, children's TV, HFSS

Procedia PDF Downloads 70
7566 Effect of Ecologic Fertilizers on Productivity and Yield Quality of Common and Spelt Wheat

Authors: Danutė Jablonskytė-Raščė, Audronė MankevičIenė, Laura Masilionytė

Abstract:

During the period 2009–2015, at the Joniškėlis Experimental Station of the Lithuanian Research Centre for Agriculture and Forestry, the effect of the ecological fertilizer Ekoplant, the bio-activators Biokal 01 and Terra Sorb Foliar, and their combinations on the formation of the productivity elements, grain yield and quality of winter wheat, spelt (Triticum spelta L.), and common wheat (Triticum aestivum L.) was analysed in an ecological agro-system. The soil, under the FAO classification, is an Endocalcari-Endo-hypogleyic-Cambisol. In a clay loam soil, an ecological fertilizer produced from sunflower hull ash, and this fertilizer in combination with plant extracts and bio-humus, exerted an influence on the grain yield of spelt, common wheat and their mixture (increasing the grain yield by 10.0% compared with the unfertilized crops). Spelt grain yield was on average 16.9% lower than that of common wheat and 11.7% lower than that of the mixture, but the role of spelt in organic production systems is important because, with no mineral fertilization, it produced grains with a higher (by 4%) gluten content and exhibited a greater ability to suppress weeds (on average 61.9% lower weed weight) compared with common wheat and the mixture. Spelt cultivation in a mixture with common wheat significantly improved the quality indicators of the mixture (its grain had 2.0% higher protein content and 4.0% higher gluten content than common wheat grain) and reduced disease incidence (by 2-8%) and weed infestation level (by 34-81%).

Keywords: common and spelt-wheat, ecological fertilizers, bio-activators, productivity elements, yield, quality

Procedia PDF Downloads 273
7565 Social Media Retailing in the Creator Economy

Authors: Julianne Cai, Weili Xue, Yibin Wu

Abstract:

Social media retailing (SMR) platforms have become popular. They are characterized by a creative combination of content creation and product selling, which differs from traditional e-tailing (TE) with product selling alone. Motivated by real-world practices such as the social media platforms “TikTok” and douyin.com, we study whether the SMR model performs better than the TE model in a monopoly setting. By building a stylized economic model, we find that the SMR model does not always outperform the TE model. Specifically, when the SMR platform collects less commission from the seller than the TE platform, the seller, consumers, and social welfare all benefit more from the SMR model. In contrast, the platform benefits more from the SMR model if and only if the creator’s social influence is high enough or the cost of content creation is small enough. For the incentive structure of the content rewards in the SMR model, we find that a strong incentive mechanism (e.g., the quadratic form) is more powerful than a weak one (e.g., the linear form). The former encourages the creator to choose a much higher quality level of content creation while allowing the platform, consumers, and social welfare to become better off. Counterintuitively, providing more generous content rewards is not always helpful for the creator (seller), and it may reduce her profit. Our findings will guide the platform in effectively designing incentive mechanisms to boost content creation and retailing in the SMR model, and will help influencers efficiently create content, engage their followers (fans), and price their products sold on the SMR platform.

Keywords: content creation, creator economy, incentive strategy, platform retailing

Procedia PDF Downloads 78
7564 Estimating Tree Height and Forest Classification from Multi Temporal Risat-1 HH and HV Polarized Satellite Aperture Radar Interferometric Phase Data

Authors: Saurav Kumar Suman, P. Karthigayani

Abstract:

In this paper, the height of trees is estimated and forest types are classified from multi-temporal RISAT-1 Horizontal-Horizontal (HH) and Horizontal-Vertical (HV) polarised Satellite Aperture Radar (SAR) data. The novelty of the proposed project is the combined use of the backscattering coefficients (sigma naught) and the coherence. It uses the Water Cloud Model (WCM). The approach has three main steps: (a) extraction of the different forest parameter data from the Product.xml and BAND-META files and from the Grid-xxx.txt file that come with the HH and HV polarized data from ISRO (Indian Space Research Organisation); these files contain the parameters required during height estimation; (b) calculation of the vegetation and ground backscattering, coherence and other forest parameters; (c) classification of forest types using the ENVI 5.0 tool and ROI (Region of Interest) calculation.

Keywords: RISAT-1, classification, forest, SAR data

Procedia PDF Downloads 380
7563 Monitoring of Quantitative and Qualitative Changes in Combustible Material in the Białowieża Forest

Authors: Damian Czubak

Abstract:

The Białowieża Forest is a very valuable natural area, included in the UNESCO World Natural Heritage list, where Norway spruce (Picea abies) stands have deteriorated due to infestation by the bark beetle (Ips typographus). This catastrophic scenario led to an increase in fire danger, caused by the occurrence of large amounts of dead wood and grass cover as light penetrated to the bottom of the stands. In a dry state, these materials favour the outbreak and rapid spread of fire. One of the objectives of the study was to monitor the quantitative and qualitative changes of combustible material on the permanent decay plots of spruce stands from 2012 to 2022. In addition, the size of the area with highly flammable vegetation was monitored and a classification of the stands of the Białowieża Forest by flammability classes was made. The key factor that determines the potential fire hazard of a forest is combustible material, primarily its type, quantity, moisture content, size and spatial structure. Based on the inventory data on the areas of forest districts in the Białowieża Forest, the average fire load and its changes over the years were calculated. The analysis was carried out taking into account the changes in the health status of the stands and sanitary operations. The quantitative and qualitative assessment of fallen timber and fire load of ground cover used the results of the 2019 and 2021 inventories. Approximately 9,000 circular plots were used for the study. An assessment was made of the amount of potential fuel, understood as ground cover vegetation and dead wood debris. In addition, monitoring of areas with vegetation that poses a high fire risk was conducted using data from 2019 and 2021. All sub-areas were inventoried where vegetation posing a specific fire hazard represented at least 10% of the area with species characteristic of that cover. 
In addition to the size of the area with fire-prone vegetation, a very important element is the size of the fire load on the indicated plots. On representative plots, the biomass of the land cover was measured on an area of 10 m2 and then the amount of biomass of each component was determined. The resulting element of variability of ground covers in stands was their flammability classification. The classification developed made it possible to track changes in the flammability classes of stands over the period covered by the measurements.

Keywords: classification, combustible material, flammable vegetation, Norway spruce

Procedia PDF Downloads 70
7562 Granule Morphology of Zirconia Powder with Solid Content on Two-Fluid Spray Drying

Authors: Hyeongdo Jeong, Jong Kook Lee

Abstract:

Granule morphology and microstructure are affected by slurry viscosity, chemical composition, particle size and the spray drying process. In this study, we investigated the granule morphology of zirconia powder as a function of solid content in two-fluid spray drying. After spray drying, zirconia granules show sphere-like shapes with a diameter of 40-70 μm at low solid contents (30 or 40 wt%) and a specific surface area of 5.1-5.6 m²/g, whereas a donut-like shape with a few cracks was observed on zirconia granules prepared from the slurry of high solid content (50 wt%). Green compacts after cold isostatic pressing under a pressure of 200 MPa have a density of 2.1-2.2 g/cm³ and a homogeneous fracture surface due to complete destruction of the granules. After sintering at 1500 °C for 2 h, all specimens have a relative density of 96.2-98.3%. As the solid content increases from 30 to 50 wt%, grain size increases from 0.3 to 0.6 μm, while relative density decreases from 98.3 to 96.2%.

Keywords: zirconia, solid content, granulation, spray drying

Procedia PDF Downloads 196
7561 Ensemble-Based SVM Classification Approach for miRNA Prediction

Authors: Sondos M. Hammad, Sherin M. ElGokhy, Mahmoud M. Fahmy, Elsayed A. Sallam

Abstract:

In this paper, an ensemble-based Support Vector Machine (SVM) classification approach is proposed for miRNA prediction. Three problems commonly associated with previous approaches are alleviated. These problems arise from imposing assumptions on the secondary structure of pre-miRNA, from the imbalance between the numbers of laboratory-checked miRNAs and pseudo-hairpins, and from using a training data set that does not consider all the varieties of samples in different species. We aggregate the predicted outputs of three well-known SVM classifiers, namely Triplet-SVM, Virgo and Mirident, weighted by their variant features, without any structural assumptions. An additional SVM layer is used to aggregate the final output. The proposed approach is trained and then tested with balanced data sets. The proposed approach outperforms the three base classifiers, achieving an F-score of 88.88%, accuracy of 92.73%, precision of 90.64%, specificity of 96.64%, sensitivity of 87.2%, and an area under the ROC curve of 0.91.
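For readers less familiar with the metrics reported above, the following sketch shows how precision, sensitivity, specificity, accuracy and F-score follow from a binary confusion matrix. The counts in the example are hypothetical and are not the paper's results; the function names are illustrative only.

```python
def f_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)

def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score(precision, sensitivity),
    }

# Hypothetical counts, chosen only to exercise the formulas.
example = binary_metrics(tp=80, fp=10, tn=90, fn=20)
print(example)
```

As a consistency check on the reported figures, the harmonic mean of 90.64% precision and 87.2% sensitivity is about 88.89%, which agrees with the reported F-score of 88.88% up to rounding.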

Keywords: MiRNAs, SVM classification, ensemble algorithm, assumption problem, imbalance data

Procedia PDF Downloads 318
7560 Development of Fuzzy Logic Control Ontology for E-Learning

Authors: Muhammad Sollehhuddin A. Jalil, Mohd Ibrahim Shapiai, Rubiyah Yusof

Abstract:

Nowadays, ontologies are common in many areas such as artificial intelligence, bioinformatics, e-commerce, education and many more. Ontology is one of the focus areas in the field of Information Retrieval. The purpose of an ontology is to describe a conceptual representation of concepts and their relationships within a particular domain. In other words, an ontology provides a common vocabulary for anyone who needs to share information in the domain. There are several ontology domains in various fields, covering both engineering and non-engineering knowledge. However, only a few ontologies are available for engineering knowledge, and fuzzy logic as engineering knowledge is still not available as an ontology domain. In general, fuzzy logic requires step-by-step guidelines and instructions for lab experiments. In this study, we present a domain ontology for Fuzzy Logic Control (FLC) knowledge. We use a Table of Content (ToC) with the middle strategy based on the Uschold and King method to develop the FLC ontology. The proposed framework is developed using Protégé as the ontology tool. Protégé’s ontology reasoner, known as the Pellet reasoner, is then used to validate the presented framework. The presented framework offers better performance based on the consistency and classification parameter index. In general, this ontology can provide a platform for anyone who needs to understand FLC knowledge.

Keywords: engineering knowledge, fuzzy logic control ontology, ontology development, table of content

Procedia PDF Downloads 276
7559 Use of Gaussian-Euclidean Hybrid Function Based Artificial Immune System for Breast Cancer Diagnosis

Authors: Cuneyt Yucelbas, Seral Ozsen, Sule Yucelbas, Gulay Tezel

Abstract:

Because only a small number of complex systems in the artificial immune system (AIS) field can solve nonlinear problems, nonlinear AIS approaches need to be developed alongside the well-known solution techniques. The Gaussian function is commonly used for similarity estimation in classification and pattern recognition problems. In this study, the diagnosis of breast cancer, the second most widespread type of cancer in women, was performed with different distance calculation functions, namely Euclidean, Gaussian and a Gaussian-Euclidean hybrid function, in the clonal selection model of classical AIS on the Wisconsin Breast Cancer Dataset (WBCD), which was taken from the University of California, Irvine Machine Learning Repository. We used 3-fold cross-validation to train and test on the dataset. According to the results, the maximum test classification accuracy was 97.35%, obtained with the Gaussian-Euclidean hybrid function for fold 3. The mean test classification accuracies were 94.78%, 94.45% and 95.31% for the Euclidean, Gaussian and Gaussian-Euclidean functions, respectively. With these results, the Gaussian-Euclidean hybrid function seems to be a promising distance calculation method, and it may be considered an alternative for hard nonlinear classification problems.
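The two base measures and the 3-fold protocol mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the Gaussian form with a `sigma` width parameter is one common definition, the fold splitter uses contiguous blocks without shuffling, and all function names are hypothetical. The abstract does not define the Gaussian-Euclidean hybrid, so it is deliberately not reconstructed here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.dist(a, b)

def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian similarity in (0, 1]; equals 1 when a and b coincide."""
    return math.exp(-euclidean(a, b) ** 2 / (2 * sigma ** 2))

def k_fold_indices(n_samples, k=3):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Distribute any remainder over the first folds.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

for train_idx, test_idx in k_fold_indices(7, k=3):
    print(train_idx, test_idx)
```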

Keywords: artificial immune system, breast cancer diagnosis, Euclidean function, Gaussian function

Procedia PDF Downloads 415
7558 Frequency- and Content-Based Tag Cloud Font Distribution Algorithm

Authors: Ágnes Bogárdi-Mészöly, Takeshi Hashimoto, Shohei Yokoyama, Hiroshi Ishikawa

Abstract:

The spread of Web 2.0 has caused an explosion of user-generated content. Users can tag resources to describe and organize them. Tag clouds give a rough impression of the relative importance of each tag within the overall cloud in order to facilitate browsing among numerous tags and resources. The goal of our paper is to enrich the visualization of tag clouds. A font distribution algorithm is proposed that calculates a novel metric based on frequency and content, and assigns tags to font classes from this metric based on a power law distribution and percentages. The suggested algorithm has been validated and verified on the tag cloud of a real-world thesis portal.
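One simple way to picture percentage-based font classes over power-law-like tag counts is shown below. It is a sketch only: it ranks tags by raw frequency and ignores the content component of the paper's metric, and the cumulative shares, the tag names, and the function name `font_classes` are all hypothetical.

```python
def font_classes(tag_counts, cumulative_shares=(0.05, 0.20, 0.50, 1.0)):
    """Map each tag to a font class (0 = largest font) by its rank percentile.

    Power-law-like data has few very frequent tags, so the top classes are
    deliberately given a small share of the ranked tags.
    """
    ranked = sorted(tag_counts, key=tag_counts.get, reverse=True)
    n = len(ranked)
    classes = {}
    for rank, tag in enumerate(ranked, start=1):
        percentile = rank / n
        # Assign the first class whose cumulative share covers this rank.
        for font_class, limit in enumerate(cumulative_shares):
            if percentile <= limit:
                classes[tag] = font_class
                break
    return classes

# Hypothetical tag frequencies that fall off roughly like 1/rank.
tag_counts = {f"t{i}": 1000 // i for i in range(1, 21)}
classes = font_classes(tag_counts)
print(classes["t1"], classes["t4"], classes["t10"], classes["t20"])
```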

Keywords: tag cloud, font distribution algorithm, frequency-based, content-based, power law

Procedia PDF Downloads 481
7557 An Efficient Motion Recognition System Based on LMA Technique and a Discrete Hidden Markov Model

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Human motion recognition has received increasing attention in recent years due to its importance in a wide range of applications, such as human-computer interaction, intelligent surveillance, augmented reality, and content-based video compression and retrieval. However, it is still regarded as a challenging task, especially in realistic scenarios. It can be seen as a general machine learning problem which requires an effective human motion representation and an efficient learning method. In this work, we introduce a descriptor based on the Laban Movement Analysis (LMA) technique, a formal and universal language for human movement, to capture both quantitative and qualitative aspects of movement. We use a Discrete Hidden Markov Model (DHMM) for training and classifying motions. We improve the classification algorithm by proposing two DHMMs for each motion class to process the motion sequence in two different directions, forward and backward. This modification helps avoid the misclassification that can happen when recognizing similar motions. Two experiments are conducted. In the first one, we evaluate our method on a public dataset, the Microsoft Research Cambridge-12 Kinect gesture data set (MSRC-12), which is widely used for evaluating action/gesture recognition methods. In the second experiment, we build a dataset composed of 10 gestures (introduce yourself, waving, dance, move, turn left, turn right, stop, sit down, increase velocity, decrease velocity) performed by 20 persons. The evaluation of the system includes testing the efficiency of our LMA-based descriptor vector with the basic DHMM method and comparing the recognition results of the modified DHMM with those of the original one. Experimental results demonstrate that our method outperforms most existing methods that used the MSRC-12 dataset and achieves a near-perfect classification rate on our dataset.
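The idea of scoring a sequence with two DHMMs per class, one reading it forward and one reading it backward, can be sketched with the standard forward algorithm. Everything specific here is an assumption: the hand-set two-state models stand in for trained ones, the two class labels are invented, and summing the two log-likelihoods is just one plausible way to combine the directions, which the abstract does not specify.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM, using the scaled forward algorithm."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for symbol in obs[1:]:
        # Rescale at every step to avoid underflow on long sequences.
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return log_prob + math.log(sum(alpha))

def classify(obs, models):
    """models: {label: (forward_hmm, backward_hmm)}, each hmm = (start, trans, emit)."""
    def score(label):
        fwd, bwd = models[label]
        # The backward model is scored on the reversed sequence.
        return forward_log_likelihood(obs, *fwd) + forward_log_likelihood(obs[::-1], *bwd)
    return max(models, key=score)

# Hand-set toy models (hypothetical): symbol 0 then 1 ("rise"), or 1 then 0 ("fall").
start = [0.9, 0.1]
trans = [[0.7, 0.3], [0.1, 0.9]]
rise_hmm = (start, trans, [[0.9, 0.1], [0.1, 0.9]])
fall_hmm = (start, trans, [[0.1, 0.9], [0.9, 0.1]])
models = {"rise": (rise_hmm, fall_hmm), "fall": (fall_hmm, rise_hmm)}
print(classify([0, 0, 1, 1], models))
```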

Keywords: human motion recognition, motion representation, Laban Movement Analysis, Discrete Hidden Markov Model

Procedia PDF Downloads 181
7556 Formulation and Nutrition Analysis of Low-Sugar Snack Bars

Authors: S. Kongtun-Janphuk, S. Niwitpong Jr., J. Saengsai

Abstract:

Low-sugar snack bars were formulated with 3 main formulas depending on the main ingredient, which were peanut-green bean-sesame, apple, and prune. The most acceptable formula of each group was obtained by sensory evaluation using a nine-point hedonic scale. The moisture content, total ash, protein, fat and fiber were analyzed by the standard methods of AOAC. The peanut-mung bean-sesame snack bar showed the highest protein content (88.32%) and total fat (0.48%) with the lowest fiber content (0.01%), while the prune formula showed the lowest protein content (71.91%) and total fat (0.21%) with the highest fiber content (0.03%). This result indicated that the prune formula could be used as a diet food to assist in a weight loss program.

Keywords: low-sugar snack bar, diet food, nutrition analysis, food formulation

Procedia PDF Downloads 372
7555 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of a sequence of characters that describes a text pattern. Usually, in clinical research, regular expressions are manually created by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm, named REX, was designed and implemented as a simplified method to create regexes to classify Spanish text automatically. In order to classify ambiguous cases, such as when multiple labels are assigned to a testing example, REX includes an information gain method. Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statistically better regarding accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.
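One way information gain could resolve an ambiguous match is sketched below. This is not the REX algorithm itself: the abstract does not describe how the regexes are generated or how the gain is applied, so the sketch assumes hand-written rules, treats "regex matches or not" as a binary feature, and picks the label of the matching rule with the highest gain on the training set. The rules, labels, and English example texts are hypothetical.

```python
import math
import re
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(pattern, texts, labels):
    """Gain from splitting the training set on whether the pattern matches."""
    hit = [lab for text, lab in zip(texts, labels) if pattern.search(text)]
    miss = [lab for text, lab in zip(texts, labels) if not pattern.search(text)]
    n = len(labels)
    return entropy(labels) - (len(hit) / n) * entropy(hit) - (len(miss) / n) * entropy(miss)

def classify(text, rules, gains, default=None):
    """Return the label of the matching rule with the highest information gain."""
    matched = [(gains[i], label) for i, (pattern, label) in enumerate(rules)
               if pattern.search(text)]
    if not matched:
        return default
    return max(matched, key=lambda pair: pair[0])[1]

# Hypothetical rules and training data.
rules = [
    (re.compile(r"\bsmok\w*"), "smoker"),
    (re.compile(r"\b(denies|never|no)\b"), "non-smoker"),
]
train_texts = ["patient smokes daily", "smokes one pack a day",
               "denies smoking", "never smoked"]
train_labels = ["smoker", "smoker", "non-smoker", "non-smoker"]
gains = [information_gain(p, train_texts, train_labels) for p, _ in rules]

# Both rules match this text; the negation rule separates the classes better.
print(classify("patient denies smoking", rules, gains))
```

Here the first rule matches every training example and therefore carries no information, so the negation rule wins whenever both fire.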

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 297
7554 Content Based Instruction: An Interdisciplinary Approach in Promoting English Language Competence

Authors: Sanjeeb Kumar Mohanty

Abstract:

Content Based Instruction (CBI) in English Language Teaching (ELT) primarily helps English as a Second Language (ESL) learners. At the same time, it fosters a multidisciplinary style of learning by promoting collaborative learning. It is an approach to teaching ESL that attempts to combine language with interdisciplinary learning to improve language proficiency and facilitate content learning. Hence, the basic premise of CBI is that language should be taught in conjunction with academic subject matter. It helps in establishing the content as well as developing language competency. This study aims at supporting the potential value of an interdisciplinary approach in promoting English Language Learning (ELL) by teaching writing skills to a small group of learners and discussing the findings with teachers from various disciplines in a workshop. The oriented teachers then use the same approach collaboratively in their classes. The inputs from the learners as well as the teachers will hopefully raise positive consciousness with regard to the vast benefits that Content Based Instruction can offer in advancing the language competence of the learners.

Keywords: content based instruction, interdisciplinary approach, writing skills, collaborative approach

Procedia PDF Downloads 249
7553 Sorting Maize Haploids from Hybrids Using Single-Kernel Near-Infrared Spectroscopy

Authors: Paul R Armstrong

Abstract:

Doubled haploids (DHs) have become an important breeding tool for creating maize inbred lines, although several bottlenecks in the DH production process limit wider development, application, and adoption of the technique. DH kernels are typically sorted manually and represent about 10% of the seeds in a much larger pool where the remaining 90% are hybrid siblings. This introduces time constraints on DH production, and manual sorting is often not accurate. Automated sorting based on the chemical composition of the kernel can be effective, but devices, namely NMR, have not achieved the sorting speed to be a cost-effective replacement for manual sorting. This study evaluated a single-kernel near-infrared reflectance spectroscopy (skNIR) platform to accurately identify DH kernels based on oil content. The skNIR platform is a higher-throughput device, approximately 3 seeds/s, that uses spectra to predict the oil content of each kernel from maize crosses intentionally developed to create larger than normal oil differences, 1.5%-2%, between DH and hybrid kernels. Spectra from the skNIR were used to construct a partial least squares regression (PLS) model for oil and for a categorical reference model of 1 (DH kernel) or 2 (hybrid kernel), which were then used to sort several crosses to evaluate performance. Two approaches were used for sorting. The first used a general PLS model developed from all crosses to predict oil content, which was then used for sorting each induction cross; the second was the development of a specific model from a single induction cross where approximately fifty DH and one hundred hybrid kernels were used. This second approach used a categorical reference value of 1 and 2, instead of oil content, for the PLS model, and kernels selected for the calibration set were manually referenced based on traditional commercial methods using coloration of the tip cap and germ areas.
The generalized PLS oil model statistics were R2 = 0.94 and RMSE = 0.93% for kernels spanning an oil content of 2.7% to 19.3%. Sorting by this model resulted in extracting 55% to 85% of haploid kernels from the four induction crosses. Using the second method of generating a model for each cross yielded model statistics ranging from R2 = 0.96 to 0.98 and RMSE from 0.08 to 0.10. Sorting in this case resulted in 100% correct classification but required models that were cross-specific. In summary, the first, generalized oil model method could be used to sort a significant number of kernels from a kernel pool but was not close to the accuracy of a sorting model developed from a single cross. The penalty for the second method is that a PLS model would need to be developed for each individual cross. In conclusion, both methods could find useful application in the sorting of DH from hybrid kernels.

Keywords: NIR, haploids, maize, sorting

Procedia PDF Downloads 282
7552 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes, and stroke characteristic variations. Malayalam is a complex South Indian language spoken by about 35 million people, especially in Kerala and the Lakshadweep islands. In this paper, we consider significant feature extraction for the similar stroke styles of Malayalam. The extracted feature set is suitable for the recognition of other handwritten South Indian languages like Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy in classification and recognition of online Malayalam handwritten characters. SVM classifiers are well suited to real-world applications. The contribution of various features towards the recognition accuracy is analysed. Performance for different SVM kernels is also studied. A graphical user interface has been developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after the preprocessing of input data samples. The highest recognition accuracy of 97% is obtained experimentally with the best feature combination and a polynomial kernel in SVM.

Keywords: SVM, MATLAB, Malayalam, South Indian scripts, online handwritten character recognition

Procedia PDF Downloads 554
7551 A t-SNE and UMAP Based Neural Network Image Classification Algorithm

Authors: Shelby Simpson, William Stanley, Namir Naba, Xiaodi Wang

Abstract:

Both t-SNE and UMAP are state-of-the-art tools that predominantly preserve local structure, that is, they group neighboring data points together, which provides a very informative visualization of heterogeneity in our data. In this research, we develop a t-SNE and UMAP based neural network image classification algorithm that embeds the original dataset into a corresponding low-dimensional dataset as a preprocessing step, then uses this embedded dataset as input to our specially designed neural network classifier for image classification. In our experiments, we use the Fashion MNIST data set, which is a labeled data set of images of clothing objects. t-SNE and UMAP are used for dimensionality reduction of the data set and thus produce low-dimensional embeddings. Furthermore, we feed the embeddings from t-SNE and UMAP into two neural networks. The accuracy of the models from the two neural networks is then compared to a dense neural network that does not use an embedding as input to show which model can classify the images of clothing objects more accurately.

Keywords: t-SNE, UMAP, fashion MNIST, neural networks

Procedia PDF Downloads 170
7550 Effects of Water Content on Dielectric Properties of Mineral Transformer Oil

Authors: Suwarno, M. Helmi Prakoso

Abstract:

Mineral oil is commonly used for high voltage transformer insulation. The insulation quality of mineral oil affects the operation of a high voltage transformer. Many contaminants can decrease the insulation quality of mineral oil. One of them is water. This research examines the effect of water content on the dielectric properties, physical properties, and partial discharge pattern of mineral oil. Samples were prepared with 10 different water content values. All samples were then tested to measure the dielectric properties, physical properties, and partial discharge pattern. The results showed that an increase in water content decreases the insulation quality of mineral oil.

Keywords: dielectric properties, high voltage transformer, mineral oil, water content

Procedia PDF Downloads 377
7549 Analyzing the Effect of Biomass and Cementitious Materials on Air Content in Concrete

Authors: Mohammed Albahttiti, Eliana Aguilar

Abstract:

The push for sustainability in the concrete industry is increasing. Cow manure itself is becoming a problem, and using it in concrete as a cementitious replacement would be an ideal solution. For cow manure ash to become a well-rounded substitute, it would have to meet the right criteria to gain wider acceptance in the concrete industry. This investigation primarily focuses on how replacement with cow manure ash affects the air content and air void distribution in concrete. In order to assess these parameters, the Super Air Meter (SAM) was used to test concrete in this research. In addition, several other tests were performed, including slump, temperature, and compression tests. The strength results of the manure ash in concrete were promising. The manure ash showed compressive strength results similar to those of the other supplementary cementitious materials tested. On the other hand, concrete samples made with cow manure ash showed a 2% air content loss and an increasing SAM number proportional to cow manure content, starting at 0.38 and increasing to 0.8. In conclusion, while the use of cow manure ash results in a loss of air content, it results in compressive strengths similar to other supplementary cementitious materials.

Keywords: air content, biomass ash, cow manure ash, super air meter, supplementary cementitious materials

Procedia PDF Downloads 122
7548 Study on Sintering System of Calcium Barium Sulphoaluminate by XRD Quantitative Analysis

Authors: Xiaopeng Shang, Xin YU, Jun CHANG

Abstract:

Calcium barium sulphoaluminate (CBSA), derived from calcium sulphoaluminate (CSA), has excellent cementitious properties. In this study, the sintering system of CBSA with a theoretical stoichiometry of Ca3BaAl6SO16 was investigated. Rietveld refinement was performed using TOPAS 4.2 software to quantitatively calculate the content of CBSA and the actual ionic site occupancy of Ba2+. The results indicate that the content of Ca4-xBaxAl6SO16 increases with increasing sintering temperature in the 1200℃-1400℃ range. When sintered at 1400℃ for 180 min, the content of CBSA reaches 88.4%. However, CBSA begins to decompose at 1440℃ and its content decreases. The replacement rate of Ba2+ also increased with higher sintering temperature and longer sintering time. Sintering at 1400℃ for 180 min is considered the optimum when both the replacement rate of Ba2+ and the content of CBSA are taken into account. Ca3.2Ba0.8Al6SO16 with a content of 88.4% was synthesized.

Keywords: calcium barium sulphoaluminate, sintering system, Ba2+ replacement rate, Rietveld refinement

Procedia PDF Downloads 314
7547 Autonomous Vehicle Detection and Classification in High Resolution Satellite Imagery

Authors: Ali J. Ghandour, Houssam A. Krayem, Abedelkarim A. Jezzini

Abstract:

High-resolution satellite images and remote sensing can provide global information quickly compared to traditional methods of data collection. Under such high resolution, a road is not a thin line anymore. Objects such as cars and trees are easily identifiable. Automatic vehicle enumeration can be considered one of the most important applications in traffic management. In this paper, an autonomous vehicle detection and classification approach for highway environments is proposed. This approach consists mainly of three stages: (i) first, a set of preprocessing operations is applied, including soil, vegetation, and water suppression. (ii) Then, road network detection and delineation is implemented using a built-up area index, followed by several morphological operations. This step plays an important role in increasing the overall detection accuracy since vehicle candidates are objects contained within the road networks only. (iii) Multi-level Otsu segmentation is implemented in the last stage, resulting in vehicle detection and classification, where detected vehicles are classified into cars and trucks. Accuracy assessment analysis is conducted over different study areas to show the efficiency of the proposed method, especially in highway environments.
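Stage (iii) relies on multi-level Otsu thresholding, which can be sketched as an exhaustive search for the thresholds that maximize between-class variance of an intensity histogram. The sketch below covers only the thresholding step, not the paper's road masking or car/truck decision; the function names and the toy 10-bin histogram are hypothetical, and the brute-force search is only practical for a small number of bins and classes.

```python
from itertools import combinations

def multi_otsu(hist, classes=3):
    """Return threshold bin indices that maximize between-class variance.

    A value v falls in class k when exactly k thresholds are below it
    (v > threshold). Brute force over all threshold combinations.
    """
    levels = len(hist)
    # Prefix sums of counts and of count * intensity for O(1) class statistics.
    cum_w, cum_m = [0], [0]
    for value, count in enumerate(hist):
        cum_w.append(cum_w[-1] + count)
        cum_m.append(cum_m[-1] + value * count)

    best_score, best_cuts = -1.0, None
    for cuts in combinations(range(levels - 1), classes - 1):
        edges = [0] + [c + 1 for c in cuts] + [levels]
        score, valid = 0.0, True
        for lo, hi in zip(edges, edges[1:]):
            weight = cum_w[hi] - cum_w[lo]
            if weight == 0:
                valid = False  # skip partitions that leave a class empty
                break
            moment = cum_m[hi] - cum_m[lo]
            # Maximizing sum(moment^2 / weight) is equivalent to maximizing
            # between-class variance, since the global mean is constant.
            score += moment * moment / weight
        if valid and score > best_score:
            best_score, best_cuts = score, cuts
    return list(best_cuts)

def label_pixel(value, thresholds):
    """Class index of an intensity value given sorted thresholds."""
    return sum(value > t for t in thresholds)

# Hypothetical histogram with three intensity clusters: bins 0-1, 4-5, 8-9.
hist = [10, 10, 0, 0, 8, 8, 0, 0, 6, 6]
thresholds = multi_otsu(hist, classes=3)
print(thresholds, [label_pixel(v, thresholds) for v in (0, 5, 9)])
```

For real imagery with 256 gray levels, a library implementation or a dynamic-programming variant would be preferable to the exhaustive search shown here.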

Keywords: remote sensing, object identification, vehicle and road extraction, vehicle and road features-based classification

Procedia PDF Downloads 209