Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1466

Search results for: informative theoretic similarity metrics

1286 Decoding Gender Disparities in AI: An Experimental Exploration Within the Realm of AI and Trust Building

Authors: Alexander Scott English, Yilin Ma, Xiaoying Liu

Abstract:

The widespread use of artificial intelligence in everyday life has triggered a fervent discussion covering a wide range of areas. However, to date, research on the influence of gender in various segments and factors from a social science perspective is still limited. This study aims to explore whether there are gender differences in human trust in AI for its application in basic everyday life and correlates with human perceived similarity, perceived emotions (including competence and warmth), and attractiveness. We conducted a study involving 321 participants using a two-subject experimental design with a two-factor (masculinized vs. feminized voice of the AI) multiplied by a two-factor (pitch level of the AI's voice) between-subject experimental design. Four contexts were created for the study and randomly assigned. The results of the study showed significant gender differences in perceived similarity, trust, and perceived emotion of the AIs, with females rating them significantly higher than males. Trust was higher in relation to AIs presenting the same gender (e.g., human female to female AI, human male to male AI). Mediation modeling tests indicated that emotion perception and similarity played a sufficiently mediating role in trust. Notably, although trust in AIs was strongly correlated with human gender, there was no significant effect on the gender of the AI. In addition, the study discusses the effects of subjects' age, job search experience, and job type on the findings.

Keywords: artificial intelligence, gender differences, human-robot trust, mediation modeling

Procedia PDF Downloads 32

1285 Plagiarism Detection for Flowchart and Figures in Texts

Authors: Ahmadu Maidorawa, Idrissa Djibo, Muhammad Tella

Abstract:

This paper presents a method for detecting flow chart and figure plagiarism based on shape of image processing and multimedia retrieval. The method managed to retrieve flowcharts with ranked similarity according to different matching sets. Plagiarism detection is well known phenomenon in the academic arena. Copying other people is considered as serious offense that needs to be checked. There are many plagiarism detection systems such as turn-it-in that has been developed to provide these checks. Most, if not all, discard the figures and charts before checking for plagiarism. Discarding the figures and charts result in look holes that people can take advantage. That means people can plagiarize figures and charts easily without the current plagiarism systems detecting it. There are very few papers which talks about flowcharts plagiarism detection. Therefore, there is a need to develop a system that will detect plagiarism in figures and charts.

Keywords: flowchart, multimedia retrieval, figures similarity, image comparison, figure retrieval

Procedia PDF Downloads 453

1284 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Keywords: bipartite graph, one-mode projection, clustering, web proxy detection

Procedia PDF Downloads 233

1283 Assessing Green Metrics of Cement Supply Chain in Iran: A Fuzzy DEMATEL Approach

Authors: Hadi Badri Ahmadi, Xuping Wang

Abstract:

Due to strict regulations and public awareness, corporations should develop policies to effectively decrease the negative environmental effects of their products and enhance their supply chain environmental sustainability. Assessment of environmental issues in the context of many industries has been studied in the previous literature. However, Iran cement industry has received less attention from researchers. Therefore, in this paper, we apply a Decision-Making Trial and Evaluation Laboratory (DEMATEL) approach to assess the relationships among green metrics of Iran cement industry supply chain under fuzzy environment. The study findings provide considerable insight for cement industry managers and experts in order to enhance the environmental sustainability of their supply chain and move towards sustainable development.

Keywords: green supply chain, DEMATEL, fuzzy set theory, environmental sustainability, sustainable development, cement industry

Procedia PDF Downloads 401

1282 Comparison of Crossover Types to Obtain Optimal Queries Using Adaptive Genetic Algorithm

Authors: Wafa’ Alma'Aitah, Khaled Almakadmeh

Abstract:

this study presents an information retrieval system of using genetic algorithm to increase information retrieval efficiency. Using vector space model, information retrieval is based on the similarity measurement between query and documents. Documents with high similarity to query are judge more relevant to the query and should be retrieved first. Using genetic algorithms, each query is represented by a chromosome; these chromosomes are fed into genetic operator process: selection, crossover, and mutation until an optimized query chromosome is obtained for document retrieval. Results show that information retrieval with adaptive crossover probability and single point type crossover and roulette wheel as selection type give the highest recall. The proposed approach is verified using (242) proceedings abstracts collected from the Saudi Arabian national conference.

Keywords: genetic algorithm, information retrieval, optimal queries, crossover

Procedia PDF Downloads 280

1281 Experimental Study Analyzing the Similarity Theory Formulations for the Effect of Aerodynamic Roughness Length on Turbulence Length Scales in the Atmospheric Surface Layer

Authors: Matthew J. Emes, Azadeh Jafari, Maziar Arjomandi

Abstract:

Velocity fluctuations of shear-generated turbulence are largest in the atmospheric surface layer (ASL) of nominal 100 m depth, which can lead to dynamic effects such as galloping and flutter on small physical structures on the ground when the turbulence length scales and characteristic length of the physical structure are the same order of magnitude. Turbulence length scales are a measure of the average sizes of the energy-containing eddies that are widely estimated using two-point cross-correlation analysis to convert the temporal lag to a separation distance using Taylor’s hypothesis that the convection velocity is equal to the mean velocity at the corresponding height. Profiles of turbulence length scales in the neutrally-stratified ASL, as predicted by Monin-Obukhov similarity theory in Engineering Sciences Data Unit (ESDU) 85020 for single-point data and ESDU 86010 for two-point correlations, are largely dependent on the aerodynamic roughness length. Field measurements have shown that longitudinal turbulence length scales show significant regional variation, whereas length scales of the vertical component show consistent Obukhov scaling from site to site because of the absence of low-frequency components. Hence, the objective of this experimental study is to compare the similarity theory relationships between the turbulence length scales and aerodynamic roughness length with those calculated using the autocorrelations and cross-correlations of field measurement velocity data at two sites: the Surface Layer Turbulence and Environmental Science Test (SLTEST) facility in a desert ASL in Dugway, Utah, USA and the Commonwealth Scientific and Industrial Research Organisation (CSIRO) wind tower in a rural ASL in Jemalong, NSW, Australia. The results indicate that the longitudinal turbulence length scales increase with increasing aerodynamic roughness length, as opposed to the relationships derived by similarity theory correlations in ESDU models. However, the ratio of the turbulence length scales in the lateral and vertical directions to the longitudinal length scales is relatively independent of surface roughness, showing consistent inner-scaling between the two sites and the ESDU correlations. Further, the diurnal variation of wind velocity due to changes in atmospheric stability conditions has a significant effect on the turbulence structure of the energy-containing eddies in the lower ASL.

Keywords: aerodynamic roughness length, atmospheric surface layer, similarity theory, turbulence length scales

Procedia PDF Downloads 118

1280 Generation of Photo-Mosaic Images through Block Matching and Color Adjustment

Authors: Hae-Yeoun Lee

Abstract:

Mosaic refers to a technique that makes image by gathering lots of small materials in various colours. This paper presents an automatic algorithm that makes the photomosaic image using photos. The algorithm is composed of four steps: Partition and feature extraction, block matching, redundancy removal and colour adjustment. The input image is partitioned in the small block to extract feature. Each block is matched to find similar photo in database by comparing similarity with Euclidean difference between blocks. The intensity of the block is adjusted to enhance the similarity of image by replacing the value of light and darkness with that of relevant block. Further, the quality of image is improved by minimizing the redundancy of tiles in the adjacent blocks. Experimental results support that the proposed algorithm is excellent in quantitative analysis and qualitative analysis.

Keywords: photomosaic, Euclidean distance, block matching, intensity adjustment

Procedia PDF Downloads 270

1279 Bioinformatics Approach to Support Genetic Research in Autism in Mali

Authors: M. Kouyate, M. Sangare, S. Samake, S. Keita, H. G. Kim, D. H. Geschwind

Abstract:

Background & Objectives: Human genetic studies can be expensive, even unaffordable, in developing countries, partly due to the sequencing costs. Our aim is to pilot the use of bioinformatics tools to guide scientifically valid, locally relevant, and economically sound autism genetic research in Mali. Methods: The following databases, NCBI, HGMD, and LSDB, were used to identify hot point mutations. Phenotype, transmission pattern, theoretical protein expression in the brain, the impact of the mutation on the 3D structure of the protein) were used to prioritize selected autism genes. We used the protein database, Modeller, and clustal W. Results: We found Mef2c (Gly27Ala/Leu38Gln), Pten (Thr131IIle), Prodh (Leu289Met), Nme1 (Ser120Gly), and Dhcr7 (Pro227Thr/Glu224Lys). These mutations were associated with endonucleases BseRI, NspI, PfrJS2IV, BspGI, BsaBI, and SpoDI, respectively. Gly27Ala/Leu38Gln mutations impacted the 3D structure of the Mef2c protein. Mef2c protein sequences across species showed a high percentage of similarity with a highly conserved MADS domain. Discussion: Mef2c, Pten, Prodh, Nme1, and Dhcr 7 gene mutation frequencies in the Malian population will be very informative. PCR coupled with restriction enzyme digestion can be used to screen the targeted gene mutations. Sanger sequencing will be used for confirmation only. This will cut down considerably the sequencing cost for gene-to-gene mutation screening. The knowledge of the 3D structure and potential impact of the mutations on Mef2c protein informed the protein family and altered function (ex. Leu38Gln). Conclusion & Future Work: Bio-informatics will positively impact autism research in Mali. Our approach can be applied to another neuropsychiatric disorder.

Keywords: bioinformatics, endonucleases, autism, Sanger sequencing, point mutations

Procedia PDF Downloads 67

1278 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 455

1277 Integrating Eye-Tracking Analysis to Enhance Web Usability Evaluation

Authors: Johanna Renny Octavia, Meliana Nurdin, Ignatius Kevin Kurniawan, Ricca Aksara

Abstract:

It is widely believed that usability evaluation is necessary to evaluate a website design for further improvement. Traditional methods of usability evaluation have given sufficient insights to reveal usability problems of websites. Eye-tracking analysis has been considered as a useful method that adds a powerful dimension to web usability evaluation. It allows web designers and usability researchers to understand exactly what users do and do not see on a web page, thus disclose more information on web usability and provide a more complete insights on a website design. This paper elaborates on moving beyond traditional methods of web usability evaluation by integrating eye-tracking analysis to enhance the evaluation of website design, and presents three case studies to support this approach. In these case studies, eye movement metrics such as gaze plots and fixation-derived metrics, and user performance data such as task completion times and number of errors were recorded as objective measurements that can inform the necessity for website design improvements.

Keywords: design, eye-tracking, usability evaluation, website

Procedia PDF Downloads 288

1276 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses to the research in the area of regression testing with emphasis on automated tools as well as prioritization of test cases. The uniqueness of regression testing and its cyclic nature is pointed out. The difference in approach between industry, with business model as basis, and academia, with focus on data mining, is highlighted. Test Metrics are discussed as a prelude to our formula for prioritization; a case study is further discussed to illustrate this methodology. An industrial case study is also described in the paper, where the number of test cases is so large that they have to be grouped as Test Suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these Test Suites in each cycle of regression testing. The comparison is made between a proprietary tool and an open source tool using the above-mentioned metrics. Our approach is clarified through several tables.

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool

Procedia PDF Downloads 416

1275 Effect of Joule Heating on Chemically Reacting Micropolar Fluid Flow over Truncated Cone with Convective Boundary Condition Using Spectral Quasilinearization Method

Authors: Pradeepa Teegala, Ramreddy Chetteti

Abstract:

This work emphasizes the effects of heat generation/absorption and Joule heating on chemically reacting micropolar fluid flow over a truncated cone with convective boundary condition. For this complex fluid flow problem, the similarity solution does not exist and hence using non-similarity transformations, the governing fluid flow equations along with related boundary conditions are transformed into a set of non-dimensional partial differential equations. Several authors have applied the spectral quasi-linearization method to solve the ordinary differential equations, but here the resulting nonlinear partial differential equations are solved for non-similarity solution by using a recently developed method called the spectral quasi-linearization method (SQLM). Comparison with previously published work on special cases of the problem is performed and found to be in excellent agreement. The influence of pertinent parameters namely Biot number, Joule heating, heat generation/absorption, chemical reaction, micropolar and magnetic field on physical quantities of the flow are displayed through graphs and the salient features are explored in detail. Further, the results are analyzed by comparing with two special cases, namely, vertical plate and full cone wherever possible.

Keywords: chemical reaction, convective boundary condition, joule heating, micropolar fluid, spectral quasilinearization method

Procedia PDF Downloads 335

1274 Cosmetic Recommendation Approach Using Machine Learning

Authors: Shakila N. Senarath, Dinesh Asanka, Janaka Wijayanayake

Abstract:

The necessity of cosmetic products is arising to fulfill consumer needs of personality appearance and hygiene. A cosmetic product consists of various chemical ingredients which may help to keep the skin healthy or may lead to damages. Every chemical ingredient in a cosmetic product does not perform on every human. The most appropriate way to select a healthy cosmetic product is to identify the texture of the body first and select the most suitable product with safe ingredients. Therefore, the selection process of cosmetic products is complicated. Consumer surveys have shown most of the time, the selection process of cosmetic products is done in an improper way by consumers. From this study, a content-based system is suggested that recommends cosmetic products for the human factors. To such an extent, the skin type, gender and price range will be considered as human factors. The proposed system will be implemented by using Machine Learning. Consumer skin type, gender and price range will be taken as inputs to the system. The skin type of consumer will be derived by using the Baumann Skin Type Questionnaire, which is a value-based approach that includes several numbers of questions to derive the user’s skin type to one of the 16 skin types according to the Bauman Skin Type indicator (BSTI). Two datasets are collected for further research proceedings. The user data set was collected using a questionnaire given to the public. Those are the user dataset and the cosmetic dataset. Product details are included in the cosmetic dataset, which belongs to 5 different kinds of product categories (Moisturizer, Cleanser, Sun protector, Face Mask, Eye Cream). An alternate approach of TF-IDF (Term Frequency – Inverse Document Frequency) is applied to vectorize cosmetic ingredients in the generic cosmetic products dataset and user-preferred dataset. Using the IF-IPF vectors, each user-preferred products dataset and generic cosmetic products dataset can be represented as sparse vectors. The similarity between each user-preferred product and generic cosmetic product will be calculated using the cosine similarity method. For the recommendation process, a similarity matrix can be used. Higher the similarity, higher the match for consumer. Sorting a user column from similarity matrix in a descending order, the recommended products can be retrieved in ascending order. Even though results return a list of similar products, and since the user information has been gathered, such as gender and the price ranges for product purchasing, further optimization can be done by considering and giving weights for those parameters once after a set of recommended products for a user has been retrieved.

Keywords: content-based filtering, cosmetics, machine learning, recommendation system

Procedia PDF Downloads 123

1273 Saliency Detection Using a Background Probability Model

Authors: Junling Li, Fang Meng, Yichun Zhang

Abstract:

Image saliency detection has been long studied, while several challenging problems are still unsolved, such as detecting saliency inaccurately in complex scenes or suppressing salient objects in the image borders. In this paper, we propose a new saliency detection algorithm in order to solving these problems. We represent the image as a graph with superixels as nodes. By considering appearance similarity between the boundary and the background, the proposed method chooses non-saliency boundary nodes as background priors to construct the background probability model. The probability that each node belongs to the model is computed, which measures its similarity with backgrounds. Thus we can calculate saliency by the transformed probability as a metric. We compare our algorithm with ten-state-of-the-art salient detection methods on the public database. Experimental results show that our simple and effective approach can attack those challenging problems that had been baffling in image saliency detection.

Keywords: visual saliency, background probability, boundary knowledge, background priors

Procedia PDF Downloads 416

1272 Graph-Oriented Summary for Optimized Resource Description Framework Graphs Streams Processing

Authors: Amadou Fall Dia, Maurras Ulbricht Togbe, Aliou Boly, Zakia Kazi Aoul, Elisabeth Metais

Abstract:

Existing RDF (Resource Description Framework) Stream Processing (RSP) systems allow continuous processing of RDF data issued from different application domains such as weather station measuring phenomena, geolocation, IoT applications, drinking water distribution management, and so on. However, processing window phase often expires before finishing the entire session and RSP systems immediately delete data streams after each processed window. Such mechanism does not allow optimized exploitation of the RDF data streams as the most relevant and pertinent information of the data is often not used in a due time and almost impossible to be exploited for further analyzes. It should be better to keep the most informative part of data within streams while minimizing the memory storage space. In this work, we propose an RDF graph summarization system based on an explicit and implicit expressed needs through three main approaches: (1) an approach for user queries (SPARQL) in order to extract their needs and group them into a more global query, (2) an extension of the closeness centrality measure issued from Social Network Analysis (SNA) to determine the most informative parts of the graph and (3) an RDF graph summarization technique combining extracted user query needs and the extended centrality measure. Experiments and evaluations show efficient results in terms of memory space storage and the most expected approximate query results on summarized graphs compared to the source ones.

Keywords: centrality measures, RDF graphs summary, RDF graphs stream, SPARQL query

Procedia PDF Downloads 182

1271 Reading Informational or Fictional Texts to Students: Choices and Perceptions of Preschool and Primary Grade Teachers

Authors: Anne-Marie Dionne

Abstract:

Teacher reading aloud to students is a practice that is well established in preschool and primary classrooms. Many benefits of this pedagogical activity have been highlighted in multiple studies. However, it has also been shown that teachers are not keen on choosing informational texts for their read aloud, as their selections for this venue are mainly fictional stories, mostly written in a unique narrative story-like structure. Considering that students soon have to read complex informational texts by themselves as they go from one grade to another, there is cause for concern because those who do not benefit from an early exposure to informational texts could be lacking knowledge of informational text structures that they will encounter regularly in their reading. Exposing students to informational texts could be done in different ways in classrooms. However, since read aloud appears to be such a common and efficient practice in preschool and primary grades, it is important to examine more deeply the factors taken into account by teachers when they are selecting their readings for this important teaching activity. Moreover, it seems critical to know why teachers are not inclined to choose more often informational texts when they are reading aloud to their pupils. A group of 22 preschool or primary grade teachers participated in this study. The data collection was done by a survey and an individual semi-structured interview. The survey was conducted in order to get quantitative data on the read-aloud practices of teachers. As for the interviews, they were organized around three categories of questions (exploratory, analytical, opinion) regarding the process of selecting the texts for the read-aloud sessions. A statistical analysis was conducted on the data obtained by the survey. As for the interviews, they were subjected to a content analysis aiming to classify the information collected in predetermined categories such as the reasons given to favor fictional texts over informative texts, the reasons given for avoiding informative texts for reading aloud, the perceptions of the challenges that the informative texts could bring when they are read aloud to students, and the perceived advantages that they would present if they were chosen more often for this activity. Results are showing variable factors that are guiding the teachers when they are making their selection of the texts to be read aloud. As for example, some of them are choosing solely fictional texts because of their convictions that these are more interesting for their students. They also perceive that the informational texts are not good choices because they are not suitable for pleasure reading. In that matter, results are pointing to some interesting elements. Many teachers perceive that read aloud of fictional or informational texts have different goals: fictional texts are read for pleasure and informational texts are read mostly for academic purposes. These results bring out the urgency for teachers to become aware of the numerous benefits that the reading aloud of each type of texts could bring to their students, especially the informational texts. The possible consequences of teachers’ perceptions will be discussed further in our presentation.

Keywords: fictional texts, informational texts, preschool or primary grade teachers, reading aloud

Procedia PDF Downloads 133

1270 Analyzing Industry-University Collaboration Using Complex Networks and Game Theory

Authors: Elnaz Kanani-Kuchesfehani, Andrea Schiffauerova

Abstract:

Due to the novelty of the nanotechnology science, its highly knowledge intensive content, and its invaluable application in almost all technological fields, the close interaction between university and industry is essential. A possible gap between academic strengths to generate good nanotechnology ideas and industrial capacity to receive them can thus have far-reaching consequences. In order to be able to enhance the collaboration between the two parties, a better understanding of knowledge transfer within the university-industry relationship is needed. The objective of this research is to investigate the research collaboration between academia and industry in Canadian nanotechnology and to propose the best cooperative strategy to maximize the quality of the produced knowledge. First, a network of all Canadian academic and industrial nanotechnology inventors is constructed using the patent data from the USPTO (United States Patent and Trademark Office), and it is analyzed with social network analysis software. The actual level of university-industry collaboration in Canadian nanotechnology is determined and the significance of each group of actors in the network (academic vs. industrial inventors) is assessed. Second, a novel methodology is proposed, in which the network of nanotechnology inventors is assessed from a game theoretic perspective. It involves studying a cooperative game with n players each having at most n-1 decisions to choose from. The equilibrium leads to a strategy for all the players to choose their co-worker in the next period in order to maximize the correlated payoff of the game. The payoffs of the game represent the quality of the produced knowledge based on the citations of the patents. The best suggestion for the next collaborative relationship is provided for each actor from a game theoretic point of view in order to maximize the quality of the produced knowledge. One of the major contributions of this work is the novel approach which combines game theory and social network analysis for the case of large networks. This approach can serve as a powerful tool in the analysis of the strategic interactions of the network actors within the innovation systems and other large scale networks.

Keywords: cooperative strategy, game theory, industry-university collaboration, knowledge production, social network analysis

Procedia PDF Downloads 246

1269 MCDM Spectrum Handover Models for Cognitive Wireless Networks

Authors: Cesar Hernández, Diego Giral, Fernando Santa

Abstract:

The spectral handoff is important in cognitive wireless networks to ensure an adequate quality of service and performance for secondary user communications. This work proposes a benchmarking of performance of the three spectrum handoff models: VIKOR, SAW and MEW. Four evaluation metrics are used. These metrics are, accumulative average of failed handoffs, accumulative average of handoffs performed, accumulative average of transmission bandwidth and, accumulative average of the transmission delay. As a difference with related work, the performance of the three spectrum handoff models was validated with captured data of spectral occupancy in experiments realized at the GSM frequency band (824 MHz-849 MHz). These data represent the actual behavior of the licensed users for this wireless frequency band. The results of the comparative show that VIKOR Algorithm provides 15.8% performance improvement compared to a SAW Algorithm and, 12.1% better than the MEW Algorithm.

Keywords: cognitive radio, decision making, MEW, SAW, spectrum handoff, VIKOR

Procedia PDF Downloads 420

1268 Social Media Idea Ontology: A Concept for Semantic Search of Product Ideas in Customer Knowledge through User-Centered Metrics and Natural Language Processing

Authors: Martin H¨ausl, Maximilian Auch, Johannes Forster, Peter Mandl, Alexander Schill

Abstract:

In order to survive on the market, companies must constantly develop improved and new products. These products are designed to serve the needs of their customers in the best possible way. The creation of new products is also called innovation and is primarily driven by a company’s internal research and development department. However, a new approach has been taking place for some years now, involving external knowledge in the innovation process. This approach is called open innovation and identifies customer knowledge as the most important source in the innovation process. This paper presents a concept of using social media posts as an external source to support the open innovation approach in its initial phase, the Ideation phase. For this purpose, the social media posts are semantically structured with the help of an ontology and the authors are evaluated using graph-theoretical metrics such as density. For the structuring and evaluation of relevant social media posts, we also use the findings of Natural Language Processing, e. g. Named Entity Recognition, specific dictionaries, Triple Tagger and Part-of-Speech-Tagger. The selection and evaluation of the tools used are discussed in this paper. Using our ontology and metrics to structure social media posts enables users to semantically search these posts for new product ideas and thus gain an improved insight into the external sources such as customer needs.

Keywords: idea ontology, innovation management, semantic search, open information extraction

Procedia PDF Downloads 177

1267 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification

Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike

Abstract:

Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased towards imbalanced datasets because of the skewness of the distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class and prevents the reduction of majority observations in imbalance datasets classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets- cancer dataset, german credit dataset, and banknote dataset. The specificity, sensitivity, and accuracy metrics were used to evaluate the performance of the proposed decision tree algorithm on the datasets. The evaluation results show that for some of the weights of our proposed decision tree, the specificity, sensitivity, and accuracy metrics gave better results compared to that of the ID3 decision tree and decision tree induced with minority entropy for all three datasets.

Keywords: data mining, decision tree, classification, imbalance dataset

Procedia PDF Downloads 110

1266 Empirical Investigation for the Correlation between Object-Oriented Class Lack of Cohesion and Coupling

Authors: Jehad Al Dallal

Abstract:

The design of the internal relationships among object-oriented class members (i.e., attributes and methods) and the external relationships among classes affects the overall quality of the object-oriented software. The degree of relatedness among class members is referred to as class cohesion and the degree to which a class is related to other classes is called class coupling. Well designed classes are expected to exhibit high cohesion and low coupling values. In this paper, using classes of three open-source Java systems, we empirically investigate the relation between class cohesion and coupling. In the empirical study, five lack-of-cohesion metrics and eight coupling metrics are considered. The empirical study results show that class cohesion and coupling internal quality attributes are inversely correlated. The strength of the correlation highly depends on the cohesion and coupling measurement approaches.

Keywords: class cohesion measure, class coupling measure, object-oriented class, software quality

Procedia PDF Downloads 221

1265 A Numerical Solution Based on Operational Matrix of Differentiation of Shifted Second Kind Chebyshev Wavelets for a Stefan Problem

Authors: Rajeev, N. K. Raigar

Abstract:

In this study, one dimensional phase change problem (a Stefan problem) is considered and a numerical solution of this problem is discussed. First, we use similarity transformation to convert the governing equations into ordinary differential equations with its boundary conditions. The solutions of ordinary differential equation with the associated boundary conditions and interface condition (Stefan condition) are obtained by using a numerical approach based on operational matrix of differentiation of shifted second kind Chebyshev wavelets. The obtained results are compared with existing exact solution which is sufficiently accurate.

Keywords: operational matrix of differentiation, similarity transformation, shifted second kind chebyshev wavelets, stefan problem

Procedia PDF Downloads 393

1264 Advancing Phenological Understanding of Plants/Trees Through Phenocam Digital Time-lapse Images

Authors: Siddhartha Khare, Suyash Khare

Abstract:

Phenology, a crucial discipline in ecology, offers insights into the seasonal dynamics of organisms within natural ecosystems and the underlying environmental triggers. Leveraging the potent capabilities of digital repeat photography, PhenoCams capture invaluable data on the phenology of crops, plants, and trees. These cameras yield digital imagery in Red Green Blue (RGB) color channels, and some advanced systems even incorporate Near Infrared (NIR) bands. This study presents compelling case studies employing PhenoCam technology to unravel the phenology of black spruce trees. Through the analysis of RGB color channels, a range of essential color metrics including red chromatic coordinate (RCC), green chromatic coordinate (GCC), blue chromatic coordinate (BCC), vegetation contrast index (VCI), and excess green index (ExGI) are derived. These metrics illuminate variations in canopy color across seasons, shedding light on bud and leaf development. This, in turn, facilitates a deeper understanding of phenological events and aids in delineating the growth periods of trees and plants. The initial phase of this study addresses critical questions surrounding the fidelity of continuous canopy greenness records in representing bud developmental phases. Additionally, it discerns which color-based index most accurately tracks the seasonal variations in tree phenology within evergreen forest ecosystems. The subsequent section of this study delves into the transition dates of black spruce (Picea mariana (Mill.) B.S.P.) phenology. This is achieved through a fortnightly comparative analysis of the MODIS normalized difference vegetation index (NDVI) and the enhanced vegetation index (EVI). By employing PhenoCam technology and leveraging advanced color metrics, this study significantly advances our comprehension of black spruce tree phenology, offering valuable insights for ecological research and management.

Keywords: phenology, remote sensing, phenocam, color metrics, NDVI, GCC

Procedia PDF Downloads 44

1263 A Data-Driven Monitoring Technique Using Combined Anomaly Detectors

Authors: Fouzi Harrou, Ying Sun, Sofiane Khadraoui

Abstract:

Anomaly detection based on Principal Component Analysis (PCA) was studied intensively and largely applied to multivariate processes with highly cross-correlated process variables. Monitoring metrics such as the Hotelling's T2 and the Q statistics are usually used in PCA-based monitoring to elucidate the pattern variations in the principal and residual subspaces, respectively. However, these metrics are ill suited to detect small faults. In this paper, the Exponentially Weighted Moving Average (EWMA) based on the Q and T statistics, T2-EWMA and Q-EWMA, were developed for detecting faults in the process mean. The performance of the proposed methods was compared with that of the conventional PCA-based fault detection method using synthetic data. The results clearly show the benefit and the effectiveness of the proposed methods over the conventional PCA method, especially for detecting small faults in highly correlated multivariate data.

Keywords: data-driven method, process control, anomaly detection, dimensionality reduction

Procedia PDF Downloads 283

1262 Learning Dynamic Representations of Nodes in Temporally Variant Graphs

Authors: Sandra Mitrovic, Gaurav Singh

Abstract:

In many industries, including telecommunications, churn prediction has been a topic of active research. A lot of attention has been drawn on devising the most informative features, and this area of research has gained even more focus with spread of (social) network analytics. The call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information. Instead, ad-hoc and dataset dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic and time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them and finally compressing using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on churn prediction task in telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models like SVM and Logistic Regression. Observed results show the effectiveness of proposed approach as compared to ad-hoc feature selection based approaches and static node2vec.

Keywords: churn prediction, dynamic networks, node2vec, auto-encoders

Procedia PDF Downloads 302

1261 A Graph Theoretic Algorithm for Bandwidth Improvement in Computer Networks

Authors: Mehmet Karaata

Abstract:

Given two distinct vertices (nodes) source s and target t of a graph G = (V, E), the two node-disjoint paths problem is to identify two node-disjoint paths between s ∈ V and t ∈ V . Two paths are node-disjoint if they have no common intermediate vertices. In this paper, we present an algorithm with O(m)-time complexity for finding two node-disjoint paths between s and t in arbitrary graphs where m is the number of edges. The proposed algorithm has a wide range of applications in ensuring reliability and security of sensor, mobile and fixed communication networks.

Keywords: disjoint paths, distributed systems, fault-tolerance, network routing, security

Procedia PDF Downloads 427

1260 Graph Planning Based Composition for Adaptable Semantic Web Services

Authors: Rihab Ben Lamine, Raoudha Ben Jemaa, Ikram Amous Ben Amor

Abstract:

This paper proposes a graph planning technique for semantic adaptable Web Services composition. First, we use an ontology based context model for extending Web Services descriptions with information about the most suitable context for its use. Then, we transform the composition problem into a semantic context aware graph planning problem to build the optimal service composition based on user's context. The construction of the planning graph is based on semantic context aware Web Service discovery that allows for each step to add most suitable Web Services in terms of semantic compatibility between the services parameters and their context similarity with the user's context. In the backward search step, semantic and contextual similarity scores are used to find best composed Web Services list. Finally, in the ranking step, a score is calculated for each best solution and a set of ranked solutions is returned to the user.

Keywords: semantic web service, web service composition, adaptation, context, graph planning

Procedia PDF Downloads 504

1259 A Model Based Metaheuristic for Hybrid Hierarchical Community Structure in Social Networks

Authors: Radhia Toujani, Jalel Akaichi

Abstract:

In recent years, the study of community detection in social networks has received great attention. The hierarchical structure of the network leads to the emergence of the convergence to a locally optimal community structure. In this paper, we aim to avoid this local optimum in the introduced hybrid hierarchical method. To achieve this purpose, we present an objective function where we incorporate the value of structural and semantic similarity based modularity and a metaheuristic namely bees colonies algorithm to optimize our objective function on both hierarchical level divisive and agglomerative. In order to assess the efficiency and the accuracy of the introduced hybrid bee colony model, we perform an extensive experimental evaluation on both synthetic and real networks.

Keywords: social network, community detection, agglomerative hierarchical clustering, divisive hierarchical clustering, similarity, modularity, metaheuristic, bee colony

Procedia PDF Downloads 368

1258 Understanding Mathematics Achievements among U. S. Middle School Students: A Bayesian Multilevel Modeling Analysis with Informative Priors

Authors: Jing Yuan, Hongwei Yang

Abstract:

This paper aims to understand U.S. middle school students’ mathematics achievements by examining relevant student and school-level predictors. Through a variance component analysis, the study first identifies evidence supporting the use of multilevel modeling. Then, a multilevel analysis is performed under Bayesian statistical inference where prior information is incorporated into the modeling process. During the analysis, independent variables are entered sequentially in the order of theoretical importance to create a hierarchy of models. By evaluating each model using Bayesian fit indices, a best-fit and most parsimonious model is selected where Bayesian statistical inference is performed for the purpose of result interpretation and discussion. The primary dataset for Bayesian modeling is derived from the Program for International Student Assessment (PISA) in 2012 with a secondary PISA dataset from 2003 analyzed under the traditional ordinary least squares method to provide the information needed to specify informative priors for a subset of the model parameters. The dependent variable is a composite measure of mathematics literacy, calculated from an exploratory factor analysis of all five PISA 2012 mathematics achievement plausible values for which multiple evidences are found supporting data unidimensionality. The independent variables include demographics variables and content-specific variables: mathematics efficacy, teacher-student ratio, proportion of girls in the school, etc. Finally, the entire analysis is performed using the MCMCpack and MCMCglmm packages in R.

Keywords: Bayesian multilevel modeling, mathematics education, PISA, multilevel

Procedia PDF Downloads 321

1257 The Bayesian Premium Under Entropy Loss

Authors: Farouk Metiri, Halim Zeghdoudi, Mohamed Riad Remita

Abstract:

Credibility theory is an experience rating technique in actuarial science which can be seen as one of quantitative tools that allows the insurers to perform experience rating, that is, to adjust future premiums based on past experiences. It is used usually in automobile insurance, worker's compensation premium, and IBNR (incurred but not reported claims to the insurer) where credibility theory can be used to estimate the claim size amount. In this study, we focused on a popular tool in credibility theory which is the Bayesian premium estimator, considering Lindley distribution as a claim distribution. We derive this estimator under entropy loss which is asymmetric and squared error loss which is a symmetric loss function with informative and non-informative priors. In a purely Bayesian setting, the prior distribution represents the insurer’s prior belief about the insured’s risk level after collection of the insured’s data at the end of the period. However, the explicit form of the Bayesian premium in the case when the prior is not a member of the exponential family could be quite difficult to obtain as it involves a number of integrations which are not analytically solvable. The paper finds a solution to this problem by deriving this estimator using numerical approximation (Lindley approximation) which is one of the suitable approximation methods for solving such problems, it approaches the ratio of the integrals as a whole and produces a single numerical result. Simulation study using Monte Carlo method is then performed to evaluate this estimator and mean squared error technique is made to compare the Bayesian premium estimator under the above loss functions.

Keywords: bayesian estimator, credibility theory, entropy loss, monte carlo simulation

Procedia PDF Downloads 319