Search results for: Cover Text
652 TOSOM: A Topic-Oriented Self-Organizing Map for Text Organization
Authors: Hsin-Chang Yang, Chung-Hong Lee, Kuo-Lung Ke
Abstract:
The self-organizing map (SOM) model is a well-known neural network model with wide spread of applications. The main characteristics of SOM are two-fold, namely dimension reduction and topology preservation. Using SOM, a high-dimensional data space will be mapped to some low-dimensional space. Meanwhile, the topological relations among data will be preserved. With such characteristics, the SOM was usually applied on data clustering and visualization tasks. However, the SOM has main disadvantage of the need to know the number and structure of neurons prior to training, which are difficult to be determined. Several schemes have been proposed to tackle such deficiency. Examples are growing/expandable SOM, hierarchical SOM, and growing hierarchical SOM. These schemes could dynamically expand the map, even generate hierarchical maps, during training. Encouraging results were reported. Basically, these schemes adapt the size and structure of the map according to the distribution of training data. That is, they are data-driven or dataoriented SOM schemes. In this work, a topic-oriented SOM scheme which is suitable for document clustering and organization will be developed. The proposed SOM will automatically adapt the number as well as the structure of the map according to identified topics. Unlike other data-oriented SOMs, our approach expands the map and generates the hierarchies both according to the topics and their characteristics of the neurons. The preliminary experiments give promising result and demonstrate the plausibility of the method.
Keywords: Self-organizing map, topic identification, learning algorithm, text clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2025651 Exploring Social Impact of Emerging Technologies from Futuristic Data
Authors: Heeyeul Kwon, Yongtae Park
Abstract:
Despite the highly touted benefits, emerging technologies have unleashed pervasive concerns regarding unintended and unforeseen social impacts. Thus, those wishing to create safe and socially acceptable products need to identify such side effects and mitigate them prior to the market proliferation. Various methodologies in the field of technology assessment (TA), namely Delphi, impact assessment, and scenario planning, have been widely incorporated in such a circumstance. However, literatures face a major limitation in terms of sole reliance on participatory workshop activities. They unfortunately missed out the availability of a massive untapped data source of futuristic information flooding through the Internet. This research thus seeks to gain insights into utilization of futuristic data, future-oriented documents from the Internet, as a supplementary method to generate social impact scenarios whilst capturing perspectives of experts from a wide variety of disciplines. To this end, network analysis is conducted based on the social keywords extracted from the futuristic documents by text mining, which is then used as a guide to produce a comprehensive set of detailed scenarios. Our proposed approach facilitates harmonized depictions of possible hazardous consequences of emerging technologies and thereby makes decision makers more aware of, and responsive to, broad qualitative uncertainties.
Keywords: Emerging technologies, futuristic data, scenario, text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2391650 Maya Semantic Technique: A Mathematical Technique Used to Determine Partial Semantics for Declarative Sentences
Authors: Marcia T. Mitchell
Abstract:
This research uses computational linguistics, an area of study that employs a computer to process natural language, and aims at discerning the patterns that exist in declarative sentences used in technical texts. The approach is mathematical, and the focus is on instructional texts found on web pages. The technique developed by the author and named the MAYA Semantic Technique is used here and organized into four stages. In the first stage, the parts of speech in each sentence are identified. In the second stage, the subject of the sentence is determined. In the third stage, MAYA performs a frequency analysis on the remaining words to determine the verb and its object. In the fourth stage, MAYA does statistical analysis to determine the content of the web page. The advantage of the MAYA Semantic Technique lies in its use of mathematical principles to represent grammatical operations which assist processing and accuracy if performed on unambiguous text. The MAYA Semantic Technique is part of a proposed architecture for an entire web-based intelligent tutoring system. On a sample set of sentences, partial semantics derived using the MAYA Semantic Technique were approximately 80% accurate. The system currently processes technical text in one domain, namely Cµ programming. In this domain all the keywords and programming concepts are known and understood.
Keywords: Natural language understanding, computational linguistics, knowledge representation, linguistic theories.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1671649 A Methodology for Automatic Diversification of Document Categories
Authors: Dasom Kim, Chen Liu, Myungsu Lim, Soo-Hyeon Jeon, Byeoung Kug Jeon, Kee-Young Kwahk, Namgyu Kim
Abstract:
Recently, numerous documents including large volumes of unstructured data and text have been created because of the rapid increase in the use of social media and the Internet. Usually, these documents are categorized for the convenience of users. Because the accuracy of manual categorization is not guaranteed, and such categorization requires a large amount of time and incurs huge costs. Many studies on automatic categorization have been conducted to help mitigate the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorize complex documents with multiple topics because they work on the assumption that individual documents can be categorized into single categories only. Therefore, to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, the learning process employed in these studies involves training using a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets using traditional multi-categorization algorithms are provided. To overcome this limitation, in this study, we review our novel methodology for extending the category of a single-categorized document to multiple categorizes, and then introduce a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.Keywords: Big Data Analysis, Document Classification, Text Mining, Topic Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1745648 Spatial-Temporal Clustering Characteristics of Dengue in the Northern Region of Sri Lanka, 2010-2013
Authors: Sumiko Anno, Keiji Imaoka, Takeo Tadono, Tamotsu Igarashi, Subramaniam Sivaganesh, Selvam Kannathasan, Vaithehi Kumaran, Sinnathamby Noble Surendran
Abstract:
Dengue outbreaks are affected by biological, ecological, socio-economic and demographic factors that vary over time and space. These factors have been examined separately and still require systematic clarification. The present study aimed to investigate the spatial-temporal clustering relationships between these factors and dengue outbreaks in the northern region of Sri Lanka. Remote sensing (RS) data gathered from a plurality of satellites were used to develop an index comprising rainfall, humidity and temperature data. RS data gathered by ALOS/AVNIR-2 were used to detect urbanization, and a digital land cover map was used to extract land cover information. Other data on relevant factors and dengue outbreaks were collected through institutions and extant databases. The analyzed RS data and databases were integrated into geographic information systems, enabling temporal analysis, spatial statistical analysis and space-time clustering analysis. Our present results showed that increases in the number of the combination of ecological factor and socio-economic and demographic factors with above the average or the presence contribute to significantly high rates of space-time dengue clusters.
Keywords: ALOS/AVNIR-2, Dengue, Space-time clustering analysis, Sri Lanka.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2284647 Performance Optimization of Data Mining Application Using Radial Basis Function Classifier
Authors: M. Govindarajan, R. M.Chandrasekaran
Abstract:
Text data mining is a process of exploratory data analysis. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. This paper describes proposed radial basis function Classifier that performs comparative crossvalidation for existing radial basis function Classifier. The feasibility and the benefits of the proposed approach are demonstrated by means of data mining problem: direct Marketing. Direct marketing has become an important application field of data mining. Comparative Cross-validation involves estimation of accuracy by either stratified k-fold cross-validation or equivalent repeated random subsampling. While the proposed method may have high bias; its performance (accuracy estimation in our case) may be poor due to high variance. Thus the accuracy with proposed radial basis function Classifier was less than with the existing radial basis function Classifier. However there is smaller the improvement in runtime and larger improvement in precision and recall. In the proposed method Classification accuracy and prediction accuracy are determined where the prediction accuracy is comparatively high.Keywords: Text Data Mining, Comparative Cross-validation, Radial Basis Function, runtime, accuracy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1554646 Probiotic Properties of Lactic Acid Bacteria Isolated from Fermented Food
Authors: Wilailak Siripornadulsil, Siriyanapat Tasaku, Jutamas Buahorm, Surasak Siripornadulsil
Abstract:
The objectives of this study were to isolate LAB from various sources, dietary supplement, Thai traditional fermented food, and freshwater fish and to characterize their potential as probiotic cultures. Out of 1,558 isolates, 730 were identified as LAB based on isolation on MRS agar supplemented with a bromocresol purple indicator&CaCO3 and Gram-positive, catalase- and oxidase-negative characteristics. Eight isolates showed the potential probiotic properties including tolerance to acid, bile salt & heat, proteolytic, amylolytic & lipolytic activities and oxalate-degrading capability. They all showed the antimicrobial activity against some Gram-negative and Gram-positive pathogenic bacteria. Based on 16S rDNA sequence analysis, they were identified as Enterococcus faecalis BT2 & MG30, Leconostoc mesenteroides SW64 and Pediococcus pentosaceous BD33, CF32, NP6, PS34 & SW5. The health beneficial effects and food safety will be further investigated and developed as a probiotic or protective culture used in Nile tilapia belly flap meat fermentation.
Keywords: Lactic acid bacteria, pathogen, probiotic, protective culture.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3897645 Influence of a High-Resolution Land Cover Classification on Air Quality Modelling
Authors: C. Silveira, A. Ascenso, J. Ferreira, A. I. Miranda, P. Tuccella, G. Curci
Abstract:
Poor air quality is one of the main environmental causes of premature deaths worldwide, and mainly in cities, where the majority of the population lives. It is a consequence of successive land cover (LC) and use changes, as a result of the intensification of human activities. Knowing these landscape modifications in a comprehensive spatiotemporal dimension is, therefore, essential for understanding variations in air pollutant concentrations. In this sense, the use of air quality models is very useful to simulate the physical and chemical processes that affect the dispersion and reaction of chemical species into the atmosphere. However, the modelling performance should always be evaluated since the resolution of the input datasets largely dictates the reliability of the air quality outcomes. Among these data, the updated LC is an important parameter to be considered in atmospheric models, since it takes into account the Earth’s surface changes due to natural and anthropic actions, and regulates the exchanges of fluxes (emissions, heat, moisture, etc.) between the soil and the air. This work aims to evaluate the performance of the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem), when different LC classifications are used as an input. The influence of two LC classifications was tested: i) the 24-classes USGS (United States Geological Survey) LC database included by default in the model, and the ii) CLC (Corine Land Cover) and specific high-resolution LC data for Portugal, reclassified according to the new USGS nomenclature (33-classes). Two distinct WRF-Chem simulations were carried out to assess the influence of the LC on air quality over Europe and Portugal, as a case study, for the year 2015, using the nesting technique over three simulation domains (25 km2, 5 km2 and 1 km2 horizontal resolution). Based on the 33-classes LC approach, particular emphasis was attributed to Portugal, given the detail and higher LC spatial resolution (100 m x 100 m) than the CLC data (5000 m x 5000 m). As regards to the air quality, only the LC impacts on tropospheric ozone concentrations were evaluated, because ozone pollution episodes typically occur in Portugal, in particular during the spring/summer, and there are few research works relating to this pollutant with LC changes. The WRF-Chem results were validated by season and station typology using background measurements from the Portuguese air quality monitoring network. As expected, a better model performance was achieved in rural stations: moderate correlation (0.4 – 0.7), BIAS (10 – 21µg.m-3) and RMSE (20 – 30 µg.m-3), and where higher average ozone concentrations were estimated. Comparing both simulations, small differences grounded on the Leaf Area Index and air temperature values were found, although the high-resolution LC approach shows a slight enhancement in the model evaluation. This highlights the role of the LC on the exchange of atmospheric fluxes, and stresses the need to consider a high-resolution LC characterization combined with other detailed model inputs, such as the emission inventory, to improve air quality assessment.Keywords: Land cover, tropospheric ozone, WRF-Chem, air quality assessment.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 796644 The Organizational Justice-Citizenship Behavior Link in Hotels: Does Customer Orientation Matter?
Authors: Pablo Zoghbi-Manrique-de-Lara, Miguel A. Suárez-Acosta
Abstract:
The goal of the present paper is to model two classic lines of research in which employees starred, organizational justice and citizenship behavior (OCB), but that have never been studied together when targeting customers. The suggestion is made that a hotel’s fair treatment (in terms of distributive, procedural, and interactional justice) toward customers will be appreciated by the employees, who will reciprocate in kind by favoring the hotel with increased customer-oriented behaviors (COBs). Data were collected from 204 employees at eight upscale hotels in the Canary Islands (Spain). Unlike in the case of perceptions of distributive justice, results of structural equation modeling demonstrate that employees substantively react to interactional and procedural justice toward guests by engaging in customer-oriented behaviors (COBs). The findings offer new reasons why employees decide to engage in COBs, and they highlight potentially beneficial effects of fair treatment toward guests bring to hospitality through promoting COBs.
Keywords: Hotel guests’ (mis) treatment, customer-oriented behaviors, employee citizenship, organizational justice, third-party observers, third-party intervention.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2404643 Assessment of Agricultural Land Use Land Cover, Land Surface Temperature and Population Changes Using Remote Sensing and GIS: Southwest Part of Marmara Sea, Turkey
Authors: Melis Inalpulat, Levent Genc
Abstract:
Land Use Land Cover (LULC) changes due to human activities and natural causes have become a major environmental concern. Assessment of temporal remote sensing data provides information about LULC impacts on environment. Land Surface Temperature (LST) is one of the important components for modeling environmental changes in climatological, hydrological, and agricultural studies. In this study, LULC changes (September 7, 1984 and July 8, 2014) especially in agricultural lands together with population changes (1985-2014) and LST status were investigated using remotely sensed and census data in South Marmara Watershed, Turkey. LULC changes were determined using Landsat TM and Landsat OLI data acquired in 1984 and 2014 summers. Six-band TM and OLI images were classified using supervised classification method to prepare LULC map including five classes including Forest (F), Grazing Land (G), Agricultural Land (A), Water Surface (W), Residential Area-Bare Soil (R-B) classes. The LST image was also derived from thermal bands of the same dates. LULC classification results showed that forest areas, agricultural lands, water surfaces and residential area-bare soils were increased as 65751 ha, 20163 ha, 1924 ha and 20462 ha respectively. In comparison, a dramatic decrement occurred in grazing land (107985 ha) within three decades. The population increased 29% between years 1984-2014 in whole study area. Along with the natural causes, migration also caused this increase since the study area has an important employment potential. LULC was transformed among the classes due to the expansion in residential, commercial and industrial areas as well as political decisions. In the study, results showed that agricultural lands around the settlement areas transformed to residential areas in 30 years. The LST images showed that mean temperatures were ranged between 26-32°C in 1984 and 27-33°C in 2014. Minimum temperature of agricultural lands was increased 3°C and reached to 23°C. In contrast, maximum temperature of A class decreased to 41°C from 44°C. Considering temperatures of the 2014 R-B class and 1984 status of same areas, it was seen that mean, min and max temperatures increased by 2°C. As a result, the dynamism of population, LULC and LST resulted in increasing mean and maximum surface temperatures, living spaces/industrial areas and agricultural lands.Keywords: Census data, landsat, land surface temperature (LST), land use land cover (LULC).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2120642 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features
Authors: Kyi Pyar Zaw, Zin Mar Kyu
Abstract:
Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters.
Keywords: Chain code frequency, character recognition, feature extraction, features matching, segmentation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 753641 Optimization of a New Three-Phase High Voltage Power Supply for Industrial Microwaves Generators with N Magnetrons by Phase (Treated Case N=1)
Authors: M. Bassoui, M. Ferfra, M. Chraygane, M. Ould Ahmedou, N. Elghazal, A. Belhaiba
Abstract:
Currently, the High voltage power supply for microwave generators with one magnetron uses a single-phase transformer with magnetic shunt. To contribute in the development of technological innovation in industry of manufacturing of power supplies of magnetrons for microwaves, ovens for domestic or industrial use, this original work treats the optimization of a new three-phase high voltage power supply for industrial microwaves generators with N magnetrons by phase (Treated case N=1), from its modeling with Matlab-Simulink. The design of this power supply uses three π quadruple models equivalents of new three-phase transformer with magnetic shunt of each phase. Every one supplies at its output a voltage doubler cell composed of a capacitor and a diode that in its output supplies only one magnetron. In this work we will define a strategy that aims to reduce the volume of the transformer and the weight and cost of the entire system of the high voltage power supply, while respecting the conditions recommended by the manufacturer, concerning the current flowing in each magnetron: (Imax <1.2 A, IAv ≈ 300 mA).
Keywords: Optimization, Three-phase transformer, Modeling, power supply, magnetrons, Matlab Simulink, High Voltage
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2806640 Object Identification with Color, Texture, and Object-Correlation in CBIR System
Authors: Awais Adnan, Muhammad Nawaz, Sajid Anwar, Tamleek Ali, Muhammad Ali
Abstract:
Needs of an efficient information retrieval in recent years in increased more then ever because of the frequent use of digital information in our life. We see a lot of work in the area of textual information but in multimedia information, we cannot find much progress. In text based information, new technology of data mining and data marts are now in working that were started from the basic concept of database some where in 1960. In image search and especially in image identification, computerized system at very initial stages. Even in the area of image search we cannot see much progress as in the case of text based search techniques. One main reason for this is the wide spread roots of image search where many area like artificial intelligence, statistics, image processing, pattern recognition play their role. Even human psychology and perception and cultural diversity also have their share for the design of a good and efficient image recognition and retrieval system. A new object based search technique is presented in this paper where object in the image are identified on the basis of their geometrical shapes and other features like color and texture where object-co-relation augments this search process. To be more focused on objects identification, simple images are selected for the work to reduce the role of segmentation in overall process however same technique can also be applied for other images.Keywords: Object correlation, Geometrical shape, Color, texture, features, contents.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2028639 Persian/Arabic Document Segmentation Based On Pyramidal Image Structure
Authors: Seyyed Yasser Hashemi, Khalil Monfaredi
Abstract:
Automatic transformation of paper documents into electronic documents requires document segmentation at the first stage. However, some parameters restrictions such as variations in character font sizes, different text line spacing, and also not uniform document layout structures altogether have made it difficult to design a general-purpose document layout analysis algorithm for many years. Thus in most previously reported methods it is inevitable to include these parameters. This problem becomes excessively acute and severe, especially in Persian/Arabic documents. Since the Persian/Arabic scripts differ considerably from the English scripts, most of the proposed methods for the English scripts do not render good results for the Persian scripts. In this paper, we present a novel parameter-free method for segmenting the Persian/Arabic document images which also works well for English scripts. This method segments the document image into maximal homogeneous regions and identifies them as texts and non-texts based on a pyramidal image structure. In other words the proposed method is capable of document segmentation without considering the character font sizes, text line spacing, and document layout structures. This algorithm is examined for 150 Arabic/Persian and English documents and document segmentation process are done successfully for 96 percent of documents.
Keywords: Persian/Arabic document, document segmentation, Pyramidal Image Structure, skew detection and correction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1765638 Investigating the Effectiveness of a 3D Printed Composite Mold
Authors: Peng Hao Wang, Garam Kim, Ronald Sterkenburg
Abstract:
In composite manufacturing, the fabrication of tooling and tooling maintenance contributes to a large portion of the total cost. However, as the applications of composite materials continue to increase, there is also a growing demand for more tooling. The demand for more tooling places heavy emphasis on the industry’s ability to fabricate high quality tools while maintaining the tool’s cost effectiveness. One of the popular techniques of tool fabrication currently being developed utilizes additive manufacturing technology known as 3D printing. The popularity of 3D printing is due to 3D printing’s ability to maintain low material waste, low cost, and quick fabrication time. In this study, a team of Purdue University School of Aviation and Transportation Technology (SATT) faculty and students investigated the effectiveness of a 3D printed composite mold. A steel valve cover from an aircraft reciprocating engine was modeled utilizing 3D scanning and computer-aided design (CAD) to create a 3D printed composite mold. The mold was used to fabricate carbon fiber versions of the aircraft reciprocating engine valve cover. The carbon fiber valve covers were evaluated for dimensional accuracy and quality while the 3D printed composite mold was evaluated for durability and dimensional stability. The data collected from this study provided valuable information in the understanding of 3D printed composite molds, potential improvements for the molds, and considerations for future tooling design.Keywords: Additive manufacturing, carbon fiber, composite tooling, molds.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 908637 Decode and Forward Cooperative Protocol Enhancement Using Interference Cancellation
Authors: Siddeeq Y. Ameen, Mohammed K. Yousif
Abstract:
Cooperative communication systems are considered to be a promising technology to improve the system capacity, reliability and performances over fading wireless channels. Cooperative relaying system with a single antenna will be able to reach the advantages of multiple antenna communication systems. It is ideally suitable for the distributed communication systems; the relays can cooperate and form virtual MIMO systems. Thus the paper will aim to investigate the possible enhancement of cooperated system using decode and forward protocol. On the decode and forward an attempt to cancel or at least reduce the interference instead of increasing the SNR values is achieved. The latter can be achieved via the use group of relays depending on the channel status from source to relay and relay to destination respectively.
In the proposed system, the transmission time has been divided into two phases to be used by the decode and forward protocol. The first phase has been allocated for the source to transmit its data whereas the relays and destination nodes are in receiving mode. On the other hand, the second phase is allocated for the first and second groups of relay nodes to relay the data to the destination node. Simulations results have shown an improvement in performance is achieved compared to the conventional decode and forward in terms of BER and transmission rate.
Keywords: Cooperative systems, decode and forward, interference cancellation, virtual MIMO.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3724636 The Name of Thai Muslim Students: The Reflection of Value and Identity of Thai Muslim
Authors: Apichaya Kaewuthai
Abstract:
To study the meaning of Muslim name in order to analyze the underlining value and identity from first year to forth year Muslim students at Prince of Songkla University, Hatyai Campus. The questionnaires are employed as a main analytical tool to acquire the names from 80 Muslim students in four study years. The meanings of obtained names are subsequently analyzed and summarized base upon related documents to uncover the beneath value. The study reveals that name of male is derived from the name of prophet; Nabi Muhammad, merit, dignity, origins, leadership and the faith in Islam. For female, on the other hand, their names are related to virtue and beauty, cleanliness and peace, hope and flowers which comply with their characteristics. One of the reasons contribute to the principle of naming is the regulation of Ministry of Culture which states that the name should represent one’s nature and characters. The given name reflects value and identity of Muslim which can be classified into three categories including 1) Value related to belief in Islam 2) value related to relationship among families and relatives 3) value about relationship with nature and environment. All the above mentioned reflect Muslim value and identity vividly. The name of Muslim students allows the researcher to perceive the perspective, belief and value in giving the name of Thai Muslim. Besides, it reveals social condition and their culture. It can also be the fundamental of studying the meaning of name in other races.
Keywords: The naming, Thai Muslim.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1297635 Text-independent Speaker Identification Based on MAP Channel Compensation and Pitch-dependent Features
Authors: Jiqing Han, Rongchun Gao
Abstract:
One major source of performance decline in speaker recognition system is channel mismatch between training and testing. This paper focuses on improving channel robustness of speaker recognition system in two aspects of channel compensation technique and channel robust features. The system is text-independent speaker identification system based on two-stage recognition. In the aspect of channel compensation technique, this paper applies MAP (Maximum A Posterior Probability) channel compensation technique, which was used in speech recognition, to speaker recognition system. In the aspect of channel robust features, this paper introduces pitch-dependent features and pitch-dependent speaker model for the second stage recognition. Based on the first stage recognition to testing speech using GMM (Gaussian Mixture Model), the system uses GMM scores to decide if it needs to be recognized again. If it needs to, the system selects a few speakers from all of the speakers who participate in the first stage recognition for the second stage recognition. For each selected speaker, the system obtains 3 pitch-dependent results from his pitch-dependent speaker model, and then uses ANN (Artificial Neural Network) to unite the 3 pitch-dependent results and 1 GMM score for getting a fused result. The system makes the second stage recognition based on these fused results. The experiments show that the correct rate of two-stage recognition system based on MAP channel compensation technique and pitch-dependent features is 41.7% better than the baseline system for closed-set test.Keywords: Channel Compensation, Channel Robustness, MAP, Speaker Identification
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1545634 A BERT-Based Model for Financial Social Media Sentiment Analysis
Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe
Abstract:
The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural Language Processing (NLP) in domains such as financial markets requires knowledge of domain ontology, and pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale un-labeled generic corpora such as Wikipedia. However, sentiment analysis is a strong domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and a large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work is focused on sentiment analysis using social media text on platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.
Keywords: BERT, financial markets, Twitter, sentiment analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 716633 Land-Use Suitability Analysis for Merauke Agriculture Estates
Authors: Sidharta Sahirman, Ardiansyah, Muhammad Rifan, Edy-Melmambessy
Abstract:
Merauke district in Papua, Indonesia has a strategic position and natural potential for the development of agricultural industry. The development of agriculture in this region is being accelerated as part of Indonesian Government’s declaration announcing Merauke as one of future national food barns. Therefore, land-use suitability analysis for Merauke need to be performed. As a result, the mapping for future agriculture-based industries can be done optimally. In this research, a case study is carried out in Semangga sub district. The objective of this study is to determine the suitability of Merauke land for some food crops. A modified agro-ecological zoning is applied to reach the objective. In this research, land cover based on satellite imagery is combined with soil, water and climate survey results to come up with preliminary zoning. Considering the special characteristics of Merauke community, the agricultural zoning maps resulted based on those inputs will be combined with socio-economic information and culture to determine the final zoning map for agricultural industry in Merauke. Examples of culture are customary rights of local residents and the rights of local people and their own local food patterns. This paper presents the results of first year of the two-year research project funded by The Indonesian Government through MP3EI schema. It shares the findings of land cover studies, the distribution of soil physical and chemical parameters, as well as suitability analysis of Semangga sub-district for five different food plants.
Keywords: agriculture, agro-ecological, Merauke, zoning
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1803632 Mining User-Generated Contents to Detect Service Failures with Topic Model
Authors: Kyung Bae Park, Sung Ho Ha
Abstract:
Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and a pressing need to handle the overwhelmingly plethora amount of various UGC is one of the paramount issues for management. However, a current approach (e.g., sentiment analysis) is often ineffective for leveraging textual information to detect the problems or issues that a certain management suffers from. In this paper, we employ text mining of Latent Dirichlet Allocation (LDA) on a popular online review site dedicated to complaint from users. We find that the employed LDA efficiently detects customer complaints, and a further inspection with the visualization technique is effective to categorize the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given the limited amount of resources. The findings provide managerial insights into how analytics on social media can help maintain and improve their reputation management. Our interdisciplinary approach also highlights several insights by applying machine learning techniques in marketing research domain. On a broader technical note, this paper illustrates the details of how to implement LDA in R program from a beginning (data collection in R) to an end (LDA analysis in R) since the instruction is still largely undocumented. In this regard, it will help lower the boundary for interdisciplinary researcher to conduct related research.Keywords: Latent Dirichlet allocation, R program, text mining, topic model, user generated contents, visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1216631 LCA and Multi-Criteria Analysis of Fly Ash Concrete Pavements
Authors: M. Ondova, A. Estokova
Abstract:
Rapid industrialization results in increased use of natural resources bring along serious ecological and environmental imbalance due to the dumping of industrial wastes. Principles of sustainable construction have to be accepted with regard to the consumption of natural resources and the production of harmful emissions. Cement is a great importance raw material in the building industry and today is its large amount used in the construction of concrete pavements. Concerning raw materials cost and producing CO2 emission the replacing of cement in concrete mixtures with more sustainable materials is necessary. To reduce this environmental impact people all over the world are looking for a solution. Over a period of last ten years, the image of fly ash has completely been changed from a polluting waste to resource material and it can solve the major problems of cement use. Fly ash concretes are proposed as a potential approach for achieving substantial reductions in cement. It is known that it improves the workability of concrete, extends the life cycle of concrete roads, and reduces energy use and greenhouse gas as well as amount of coal combustion products that must be disposed in landfills.
Life cycle assessment also proved that a concrete pavement with fly ash cement replacement is considerably more environmentally friendly compared to standard concrete roads. In addition, fly ash is cheap raw material, and the costs saving are guaranteed. The strength properties, resistance to a frost or de-icing salts, which are important characteristics in the construction of concrete pavements, have reached the required standards as well. In terms of human health it can´t be stated that a concrete cover with fly ash could be dangerous compared with a cover without fly ash. Final Multi-criteria analysis also pointed that a concrete with fly ash is a clearly proper solution.
Keywords: Life cycle assessment, fly ash, waste, concrete pavements
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3503630 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review
Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha
Abstract:
Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision making has not been far-fetched. Proper classification of these textual information in a given context has also been very difficult. As a result, a systematic review was conducted from previous literature on sentiment classification and AI-based techniques. The study was done in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that could correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy using the knowledge gain from the evaluation of different artificial intelligence techniques reviewed. The study evaluated over 250 articles from digital sources like ACM digital library, Google Scholar, and IEEE Xplore; and whittled down the number of research to 52 articles. Findings revealed that deep learning approaches such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Bidirectional Encoder Representations from Transformer (BERT), and Long Short-Term Memory (LSTM) outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also required to develop a robust sentiment classifier. Results also revealed that data can be obtained from places like Twitter, movie reviews, Kaggle, Stanford Sentiment Treebank (SST), and SemEval Task4 based on the required domain. The hybrid deep learning techniques like CNN+LSTM, CNN+ Gated Recurrent Unit (GRU), CNN+BERT outperformed single deep learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of development simplicity and AI-based library functionalities. Finally, the study recommended the findings obtained for building robust sentiment classifier in the future.
Keywords: Artificial Intelligence, Natural Language Processing, Sentiment Analysis, Social Network, Text.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 593629 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence
Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park
Abstract:
Scripts are one of the basic text resources to understand broadcasting contents. Topic modeling is the method to get the summary of the broadcasting contents from its scripts. Generally, scripts represent contents descriptively with directions and speeches, and provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. Because scene segments consist of speeches mainly, however, relatively small co-occurrences among words in the scene segments are observed. This causes inevitably the bad quality of topics by statistical learning method. To tackle this problem, we propose a method to improve topic quality with additional word co-occurrence information obtained using scene similarities. The main idea of improving topic quality is that the information that two or more texts are topically related can be useful to learn high quality of topics. In addition, more accurate topical representations lead to get information more accurate whether two texts are related or not. In this paper, we regard two scene segments are related if their topical similarity is high enough. We also consider that words are co-occurred if they are in topically related scene segments together. By iteratively inferring topics and determining semantically neighborhood scene segments, we draw a topic space represents broadcasting contents well. In the experiments, we showed the proposed method generates a higher quality of topics from Korean drama scripts than the baselines.Keywords: Broadcasting contents, generalized P´olya urn model, scripts, text similarity, topic model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1817628 Analysis on Spatiotemporal Pattern of Land Surface Temperature in Kunming City, China
Authors: Jinrui Ren, Li Wu
Abstract:
Anthropogenic activities and changes of underlying surface affect the temporal and spatial distribution of surface temperature in Kunming. Taking Kunming city as the research area, the surface temperature in 2000, 2010 and 2020 as the research object, using ENVI 5.3 and ArcGIS 10.8 as auxiliary tools, and based on the spatial autocorrelation method, this paper devoted to exploring the interactions among the changes of surface temperature, urban heat island effect and land use type, so as to provide theoretical basis and scientific basis for mitigating climate change. The results showed that: (1) The heat island effect was obvious in Kunming City, the high temperature area increased from 604 km2 in 2000 to 1269 km2 in 2020, and the sub-high temperature area reached 1099 km2 in 2020; (2) In terms of space, the spatial distribution of LST was significantly different with the change of underlying surface. The high temperature zone extended in three directions: south, north and east. The overall spatial distribution pattern of LST was high in the east and low in the west. (3) The inter-annual fluctuation of land surface temperature (LST) was large, and the growth rate was faster, from 2000 to 2010. The lowest temperature in 2000 was 13.45 ℃, which raised to 19.71 ℃ in 2010, and the temperature difference in 10 years was 6.26 ℃. (4) The land use/land cover type has a strong effect on the change of LST: the man-made land made a great contribution to the increase of LST, followed by grassland and farmland, while forest and water have a significant cooling effect on LST. To sum up, the variation of surface temperature in Kunming is the result of the interactions of human activities and climate change.
Keywords: Surface temperature, urban heat island effect, land use cover type, spatiotemporal variation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 187627 Localizing and Recognizing Integral Pitches of Cheque Document Images
Authors: Bremananth R., Veerabadran C. S., Andy W. H. Khong
Abstract:
Automatic reading of handwritten cheque is a computationally complex process and it plays an important role in financial risk management. Machine vision and learning provide a viable solution to this problem. Research effort has mostly been focused on recognizing diverse pitches of cheques and demand drafts with an identical outline. However most of these methods employ templatematching to localize the pitches and such schemes could potentially fail when applied to different types of outline maintained by the bank. In this paper, the so-called outline problem is resolved by a cheque information tree (CIT), which generalizes the localizing method to extract active-region-of-entities. In addition, the weight based density plot (WBDP) is performed to isolate text entities and read complete pitches. Recognition is based on texture features using neural classifiers. Legal amount is subsequently recognized by both texture and perceptual features. A post-processing phase is invoked to detect the incorrect readings by Type-2 grammar using the Turing machine. The performance of the proposed system was evaluated using cheque and demand drafts of 22 different banks. The test data consists of a collection of 1540 leafs obtained from 10 different account holders from each bank. Results show that this approach can easily be deployed without significant design amendments.Keywords: Cheque reading, Connectivity checking, Text localization, Texture analysis, Turing machine, Signature verification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657626 PoPCoRN: A Power-Aware Periodic Surveillance Scheme in Convex Region using Wireless Mobile Sensor Networks
Authors: A. K. Prajapati
Abstract:
In this paper, the periodic surveillance scheme has been proposed for any convex region using mobile wireless sensor nodes. A sensor network typically consists of fixed number of sensor nodes which report the measurements of sensed data such as temperature, pressure, humidity, etc., of its immediate proximity (the area within its sensing range). For the purpose of sensing an area of interest, there are adequate number of fixed sensor nodes required to cover the entire region of interest. It implies that the number of fixed sensor nodes required to cover a given area will depend on the sensing range of the sensor as well as deployment strategies employed. It is assumed that the sensors to be mobile within the region of surveillance, can be mounted on moving bodies like robots or vehicle. Therefore, in our scheme, the surveillance time period determines the number of sensor nodes required to be deployed in the region of interest. The proposed scheme comprises of three algorithms namely: Hexagonalization, Clustering, and Scheduling, The first algorithm partitions the coverage area into fixed sized hexagons that approximate the sensing range (cell) of individual sensor node. The clustering algorithm groups the cells into clusters, each of which will be covered by a single sensor node. The later determines a schedule for each sensor to serve its respective cluster. Each sensor node traverses all the cells belonging to the cluster assigned to it by oscillating between the first and the last cell for the duration of its life time. Simulation results show that our scheme provides full coverage within a given period of time using few sensors with minimum movement, less power consumption, and relatively less infrastructure cost.Keywords: Sensor Network, Graph Theory, MSN, Communication.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1464625 The Effects of Weather Anomalies on the Quantitative and Qualitative Parameters of Maize Hybrids of Different Genetic Traits in Hungary
Authors: Zs. J. Becze, Á. Krivián, M. Sárvári
Abstract:
Hybrid selection and the application of hybrid specific production technologies are important in terms of the increase of the yield and crop safety of maize. The main explanation for this is climate change, since weather extremes are going on and seem to accelerate in Hungary too.
The biological bases, the selection of appropriate hybrids will be of greater importance in the future. The issue of the adaptability of hybrids will be considerably appreciated. Its good agronomical traits and stress bearing against climatic factors and agrotechnical elements (e.g. different types of herbicides) will be important. There have been examples of 3-4 consecutive droughty years in the past decades, e.g. 1992-1993-1994 or 2009-2011-2012, which made the results of crop production critical. Irrigation cannot be the solution for the problem since currently only the 2% of the arable land is irrigated. Temperatures exceeding the multi-year average are characteristic mainly to the July and August in Hungary, which significantly increase the soil surface evaporation, thus further enhance water shortage. In terms of the yield and crop safety of maize, the weather of these two months is crucial, since the extreme high temperature in July decreases the viability of the pollen and the pistil of maize, decreases the extent of fertilization and makes grain-filling tardy. Consequently, yield and crop safety decrease.
Keywords: Abiotic factors, drought, nutrition content, yield.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1900624 Linguistic Summarization of Structured Patent Data
Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay
Abstract:
Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies and even in countries competitions. Analyzing of patent data is crucial since patents cover large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.Keywords: Data mining, fuzzy sets, linguistic summarization, patent data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1217623 Continuous FAQ Updating for Service Incident Ticket Resolution
Authors: Kohtaroh Miyamoto
Abstract:
As enterprise computing becomes more and more complex, the costs and technical challenges of IT system maintenance and support are increasing rapidly. One popular approach to managing IT system maintenance is to prepare and use a FAQ (Frequently Asked Questions) system to manage and reuse systems knowledge. Such a FAQ system can help reduce the resolution time for each service incident ticket. However, there is a major problem where over time the knowledge in such FAQs tends to become outdated. Much of the knowledge captured in the FAQ requires periodic updates in response to new insights or new trends in the problems addressed in order to maintain its usefulness for problem resolution. These updates require a systematic approach to define the exact portion of the FAQ and its content. Therefore, we are working on a novel method to hierarchically structure the FAQ and automate the updates of its structure and content. We use structured information and the unstructured text information with the timelines of the information in the service incident tickets. We cluster the tickets by structured category information, by keywords, and by keyword modifiers for the unstructured text information. We also calculate an urgency score based on trends, resolution times, and priorities. We carefully studied the tickets of one of our projects over a 2.5-year time period. After the first 6 months we started to create FAQs and confirmed they improved the resolution times. We continued observing over the next 2 years to assess the ongoing effectiveness of our method for the automatic FAQ updates. We improved the ratio of tickets covered by the FAQ from 32.3% to 68.9% during this time. Also, the average time reduction of ticket resolution was between 31.6% and 43.9%. Subjective analysis showed more than 75% reported that the FAQ system was useful in reducing ticket resolution times.
Keywords: FAQ System, Resolution Time, Service Incident Tickets, IT System Maintenance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2493