Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 13215

Search results for: optimal number of topics

13215 Visualization and Performance Measure to Determine Number of Topics in Twitter Data Clustering Using Hybrid Topic Modeling

Abstract:

Topic models have been widely used to build clusters of documents for more than a decade, yet choosing the optimal number of topics remains a problem. The main difficulty is the lack of a stable metric for the quality of the topics obtained when constructing topic models. From previous work, the authors found that most models used to determine the number of topics are non-parametric, with topic quality assessed by perplexity and coherence measures, and concluded that these are not applicable to this problem. In this paper, we use a parametric method, an extension of the traditional topic model with visual access tendency, to visualize the number of topics (clusters), complement clustering, and choose the optimal number of topics based on the results of cluster validity indices. The developed hybrid topic models are demonstrated on different Twitter datasets covering various topics, both to obtain the optimal number of topics and to measure the quality of clusters. The experimental results show that the Visual Non-negative Matrix Factorization (VNMF) topic model performs well in determining the optimal number of topics with interactive visualization and in measuring the quality of clusters with validity indices.

Keywords: interactive visualization, visual non-negative matrix factorization model, optimal number of topics, cluster validity indices, Twitter data clustering

Procedia PDF Downloads 131

13214 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method for applying Independent Topic Analysis (ITA) to a growing collection of document data. The amount of document data has been increasing since the spread of the Internet. ITA was presented as one method for analyzing document data: it extracts independent topics from the document data by using Independent Component Analysis (ICA), a technique from signal processing. However, it is difficult to apply ITA to a growing collection of document data, because ITA must use all of the document data, so its temporal and spatial costs are very high. Therefore, we present Incremental ITA, which extracts independent topics from a growing collection of document data. Incremental ITA updates the independent topics extracted from the previous data whenever new document data is added. We also show the results of applying Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 304

13213 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the amount of document data has been increasing with the spread of the Internet, and many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract mutually independent topics from large document data such as newspaper data. ITA extracts the independent topics from the document data by using Independent Component Analysis. A topic produced by ITA is represented by a set of words. However, this set of words can be quite different from the topic the user imagines. For example, the top five words with high independence for one topic are as follows: Topic 1 = {"scor", "game", "lead", "quarter", "rebound"}. Topic 1 is considered to represent the topic of "SPORTS", but this topic name has to be attached by the user, because ITA cannot name topics. Therefore, in this research, we propose a method that uses a web search engine to obtain topic names that are easy for people to understand from the sets of words given by ITA. Specifically, we search for a set of topical words and take the title of the homepage in the search result as the topic name. We also apply the proposed method to some data and verify its effectiveness.

Keywords: independent topic analysis, topic extraction, topic naming, web search engine

Procedia PDF Downloads 116

13212 Operations Research Applications in Audit Planning and Scheduling

Authors: Abdel-Aziz M. Mohamed

Abstract:

This paper presents a state-of-the-art survey of the operations research models developed for internal audit planning. Two alternative approaches have been followed in the literature for audit planning: (1) identifying the optimal audit frequency; and (2) determining the optimal audit resource allocation. The first approach identifies the elapsed time between two successive audits, which can be presented as the optimal number of audits in a given planning horizon, or the optimal number of transactions after which an audit should be performed. It also includes the optimal audit schedule. The second approach determines the optimal allocation of audit frequency among all auditable units in the firm. In our review, we discuss both the deterministic and probabilistic models developed for audit planning. In addition, game theory models are reviewed to find the optimal auditing strategy based on the interactions between the auditors and the clients.

Keywords: operations research applications, audit frequency, audit-staff scheduling, audit planning

Procedia PDF Downloads 811

13211 A Polynomial Time Clustering Algorithm for Solving the Assignment Problem in the Vehicle Routing Problem

Authors: Lydia Wahid, Mona F. Ahmed, Nevin Darwish

Abstract:

The vehicle routing problem (VRP) consists of a group of customers that needs to be served. Each customer has a certain demand of goods. A central depot having a fleet of vehicles is responsible for supplying the customers with their demands. The problem is composed of two subproblems: The first subproblem is an assignment problem where the number of vehicles that will be used as well as the customers assigned to each vehicle are determined. The second subproblem is the routing problem in which for each vehicle having a number of customers assigned to it, the order of visits of the customers is determined. Optimal number of vehicles, as well as optimal total distance, should be achieved. In this paper, an approach for solving the first subproblem (the assignment problem) is presented. In the approach, a clustering algorithm is proposed for finding the optimal number of vehicles by grouping the customers into clusters where each cluster is visited by one vehicle. Finding the optimal number of clusters is NP-hard. This work presents a polynomial time clustering algorithm for finding the optimal number of clusters and solving the assignment problem.

Keywords: vehicle routing problems, clustering algorithms, Clarke and Wright Saving Method, agglomerative hierarchical clustering

Procedia PDF Downloads 390

13210 Online Topic Model for Broadcasting Contents Using Semantic Correlation Information

Authors: Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park, Sang-Jo Lee

Abstract:

This paper proposes a method of learning topics for broadcasting contents. There are two kinds of texts related to broadcasting contents. One is the broadcasting script, which is a series of texts including directions and dialogues. The other is blog posts, which contain relatively abstract content, stories and diverse information about broadcasting contents. Although the two kinds of text cover similar broadcasting contents, the words used in blog posts and broadcasting scripts differ. To improve the quality of topics, a method is needed that accounts for this word difference. In this paper, we introduce a semantic vocabulary expansion method to address the word difference. We expand the topics of the broadcasting script by incorporating the words in blog posts. Each word in the blog posts is added to the most semantically correlated topics. We use word2vec to obtain the semantic correlation between words in blog posts and topics of scripts. The vocabularies of the topics are updated, and then posterior inference is performed to rearrange the topics. In experiments, we verified that the proposed method can learn more salient topics for broadcasting contents.

Keywords: broadcasting script analysis, topic expansion, semantic correlation analysis, word2vec

Procedia PDF Downloads 246

13209 Determining Optimal Number of Trees in Random Forests

Authors: Songul Cinaroglu

Abstract:

Background: Random Forest is an efficient, multi-class machine learning method used for classification, regression and other tasks. The method operates by constructing each tree from a different bootstrap sample of the data. Determining the number of trees in random forests is an open question in the literature on improving the classification performance of random forests. Aim: The aim of this study is to analyze whether there is an optimal number of trees in Random Forests and how the performance of Random Forests differs as the number of trees increases, using sample health data sets in the R programme. Method: In this study we analyzed the performance of Random Forests as the number of trees grows, doubling the number of trees at every iteration, using the “random forest” package in the R programme. To determine the minimum and optimal number of trees we used the McNemar test and the Area Under the ROC Curve, respectively. Results: The analysis found that a growing number of trees does not always mean that the forest performs better than forests with fewer trees. In other words, a larger number of trees only increases computational cost without increasing performance. Conclusion: Although the general practice with random forests is to generate a large number of trees to obtain high performance, this study shows that increasing the number of trees doesn’t always improve performance. Future studies can compare different kinds of data sets and different performance measures to test whether Random Forest performance changes as the number of trees increases.
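
The doubling schedule and the McNemar comparison mentioned in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's R code: the function names `doubling_schedule` and `mcnemar_statistic` are hypothetical, the continuity-corrected form of the statistic is an assumption (the abstract does not say which variant was used), and the predictions would come from two forests of different sizes evaluated on the same test cases.

```python
def doubling_schedule(start, limit):
    """Yield tree counts start, 2*start, 4*start, ... that do not exceed limit."""
    n = start
    while n <= limit:
        yield n
        n *= 2


def mcnemar_statistic(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar chi-square statistic for two classifiers.

    b counts the cases only classifier A classifies correctly,
    c counts the cases only classifier B classifies correctly.
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        return 0.0  # the classifiers never disagree on correctness
    return (abs(b - c) - 1) ** 2 / (b + c)
```

The statistic is compared against a chi-square distribution with one degree of freedom (critical value 3.841 at the 5% level). Under this reading, a forest with twice as many trees would only be preferred when the statistic exceeds that threshold.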

Keywords: classification methods, decision trees, number of trees, random forest

Procedia PDF Downloads 391

13208 Optimal Bayesian Chart for Controlling Expected Number of Defects in Production Processes

Authors: V. Makis, L. Jafari

Abstract:

In this paper, we develop an optimal Bayesian chart to control the expected number of defects per inspection unit in production processes with long production runs. We formulate this control problem in the optimal stopping framework. The objective is to determine the optimal stopping rule minimizing the long-run expected average cost per unit time considering partial information obtained from the process sampling at regular epochs. We prove the optimality of the control limit policy, i.e., the process is stopped and the search for assignable causes is initiated when the posterior probability that the process is out of control exceeds a control limit. An algorithm in the semi-Markov decision process framework is developed to calculate the optimal control limit and the corresponding average cost. Numerical examples are presented to illustrate the developed optimal control chart and to compare it with the traditional u-chart.

Keywords: Bayesian u-chart, economic design, optimal stopping, semi-Markov decision process, statistical process control

Procedia PDF Downloads 569

13207 Optimal Sensing Technique for Estimating Stress Distribution of 2-D Steel Frame Structure Using Genetic Algorithm

Authors: Jun Su Park, Byung Kwan Oh, Jin Woo Hwang, Yousok Kim, Hyo Seon Park

Abstract:

In structural safety assessment, the maximum stress calculated from the stress distribution of a structure is widely used. The stress distribution can be estimated from the deformed shape of the structure obtained from measurement. Although the estimation of stress is strongly affected by the location and number of sensing points, most studies have conducted the stress estimation without a reasoned basis for the sensing plan, such as the location and number of sensors. In this paper, an optimal sensing technique for estimating the stress distribution is proposed. This technique proposes the optimal location and number of sensing points for a 2-D frame structure while using a genetic algorithm to minimize the error in stress distribution between the analytical model and the estimate obtained by cubic smoothing splines. To verify the proposed method, the optimal sensor measurement technique is applied to simulation tests on a 2-D steel frame structure. The simulation tests are performed under various loading scenarios. Through those tests, the optimal sensing plan for the structure is suggested and verified.

Keywords: genetic algorithm, optimal sensing, optimizing sensor placements, steel frame structure

Procedia PDF Downloads 529

13206 Replacement Time and Number of Preventive Maintenance Actions for Second-Hand Device

Authors: Wen Liang Chang

Abstract:

In this study, the optimal replacement time and number of preventive maintenance (PM) actions were investigated for a second-hand device. Suppose that a user intends to use a second-hand device for manufacturing products, and that the device is replaced with a new one. Any device failure is rectified through minimal repair, thereby incurring a fixed repair cost to the user. If the new device fails within the FRW period, minimal repair is performed at no cost to the user. After the FRW expires, a failed device is repaired and the cost of repair is incurred by the user. In this study, two profit models were developed, and the optimal replacement time and number of PM actions were determined to maximize profits. Finally, the influence of the optimal replacement time and number of PM actions was elaborated on, using numerical examples.

Keywords: second-hand device, preventive maintenance, replacement time, device failure

Procedia PDF Downloads 463

13205 Determining the Number of Single Models in a Combined Forecast

Authors: Serkan Aras, Emrah Gulay

Abstract:

Combining various forecasting models is an important tool for researchers to attain more accurate forecasts. A great number of papers have shown that selecting single models that are as dissimilar as possible, or methods based on different information, leads to better forecasting performance. However, there is no definite rule regarding the number of single models to be used in any combining method. This study focuses on determining the optimal or near-optimal number of single models with the help of statistical tests. An extensive experiment is carried out using some well-known time series data sets from diverse fields. Furthermore, many rival forecasting methods and some of the commonly used combining methods are employed. The results indicate that some statistically significant performance differences can be found regarding the number of single models in the combining methods under investigation.

Keywords: combined forecast, forecasting, M-competition, time series

Procedia PDF Downloads 352

13204 Portfolio Selection with Constraints on Trading Frequency

Authors: Min Dai, Hong Liu, Shuaijie Qian

Abstract:

We study a portfolio selection problem of an investor who faces constraints on rebalancing frequency, which is common in pension fund investment. We formulate it as a multiple optimal stopping problem and utilize the dynamic programming principle. By numerically solving the corresponding Hamilton-Jacobi-Bellman (HJB) equation, we find a series of free boundaries characterizing optimal strategy, and the constraints significantly impact the optimal strategy. Even in the absence of transaction costs, there is a no-trading region, depending on the number of the remaining trading chances. We also find that the equivalent wealth loss caused by the constraints is large. In conclusion, our model clarifies the impact of the constraints on transaction frequency on the optimal strategy.

Keywords: portfolio selection, rebalancing frequency, optimal strategy, free boundary, optimal stopping

Procedia PDF Downloads 77

13203 Active Vibration Reduction for a Flexible Structure Bonded with Sensor/Actuator Pairs on Efficient Locations Using a Developed Methodology

Authors: Ali H. Daraji, Jack M. Hale, Ye Jianqiao

Abstract:

With the extensive use of high specific strength structures to optimise the loading capacity and material cost in aerospace and most engineering applications, much effort has been expended to develop intelligent structures for active vibration reduction and structural health monitoring. These structures are highly flexible, have inherently low internal damping, and are associated with large vibration and long decay time. The modification of such structures by adding lightweight piezoelectric sensors and actuators at efficient locations, integrated with an optimal control scheme, is considered an effective solution for structural vibration monitoring and control. The size and location of sensors and actuators are important research topics, as they affect the level of vibration detection and reduction and the amount of energy provided by a controller. Several methodologies have been presented to determine the optimal location of a limited number of sensors and actuators for small-scale structures. However, these studies have tackled this problem directly, measuring the fitness function based on eigenvalues and eigenvectors achieved with numerous combinations of sensor/actuator pair locations and converging on an optimal set using heuristic optimisation techniques such as genetic algorithms. This is computationally expensive for small- and large-scale structures when a number of sensor/actuator (s/a) pairs must be optimised to suppress multiple vibration modes. This paper proposes an efficient method to determine optimal locations for a limited number of sensor/actuator pairs for active vibration reduction of a flexible structure based on the finite element method and Hamilton’s principle.
The current work takes the simplified approach of modelling a structure with sensors at all locations, subjecting it to an external force to excite the various modes of interest, and noting the locations of the sensors giving the largest average percentage sensor effectiveness, measured by dividing every sensor output voltage by the maximum for each mode. The methodology was implemented for a cantilever plate under external force excitation to find the optimal distribution of six sensor/actuator pairs to suppress the first six modes of vibration. It is shown that the resulting optimal sensor locations agree well with published optimal locations, but with very much reduced computational effort and higher effectiveness. Furthermore, it is shown that collocated sensor/actuator pairs placed in these locations give very effective active vibration reduction using an optimal linear quadratic control scheme.
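
The sensor effectiveness measure described above can be illustrated with a short Python sketch. It assumes the per-mode sensor output voltages are already available as a nested list `voltages[mode][sensor]` (for example from a finite element model). The function names are hypothetical, and the use of absolute values is an assumption rather than something the abstract states.

```python
def average_effectiveness(voltages):
    """Average percentage effectiveness of each sensor location over all modes.

    voltages[m][s] is the output voltage of the sensor at location s for mode m.
    For each mode, every voltage is divided by that mode's maximum (assumed here
    to be the largest absolute value) and expressed as a percentage.
    """
    n_modes = len(voltages)
    n_sensors = len(voltages[0])
    totals = [0.0] * n_sensors
    for mode in voltages:
        peak = max(abs(v) for v in mode)
        if peak == 0:
            continue  # mode not observed by any sensor; contributes 0%
        for s, v in enumerate(mode):
            totals[s] += 100.0 * abs(v) / peak
    return [t / n_modes for t in totals]


def best_locations(voltages, k):
    """Indices of the k sensor locations with the largest average effectiveness."""
    avg = average_effectiveness(voltages)
    return sorted(range(len(avg)), key=lambda s: avg[s], reverse=True)[:k]
```

Calling `best_locations(voltages, 6)` would then correspond to choosing six locations for the first six modes, with no search over combinations of locations.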

Keywords: optimisation, plate, sensor effectiveness, vibration control

Procedia PDF Downloads 224

13202 A Stable Method for Determination of the Number of Independent Components

Authors: Yuyan Yi, Jingyi Zheng, Nedret Billor

Abstract:

Independent component analysis (ICA) is one of the most commonly used blind source separation (BSS) techniques for signal pre-processing, such as noise reduction and feature extraction. The main parameter in the ICA method is the number of independent components (ICs). Although several methods exist for determining the number of ICs, this important parameter has not been given sufficient attention. In this study, we review the most used methods for determining the number of ICs and present their advantages and disadvantages. Further, we propose an improved version of the column-wise ICAbyBlock method for the determination of the number of ICs. To assess the performance of the proposed method, we compare the column-wise ICAbyBlock with several existing methods across different ICA methods, using simulated and real signal data. Results show that the proposed column-wise ICAbyBlock is an effective and stable method for determining the optimal number of components in ICA. This method is simple, and results can be demonstrated intuitively with good visualizations.

Keywords: independent component analysis, optimal number, column-wise, correlation coefficient, cross-validation, ICAByblock

Procedia PDF Downloads 93

13201 A Comparative Study of Multi-SOM Algorithms for Determining the Optimal Number of Clusters

Authors: Imèn Khanchouch, Malika Charrad, Mohamed Limam

Abstract:

The interpretation of the quality of clusters and the determination of the optimal number of clusters are still crucial problems in clustering. In this paper, we focus on the multi-SOM clustering method, which overcomes the problem of extracting the number of clusters from the SOM map through the use of a clustering validity index. We then test multi-SOM on real and artificial data sets with evaluation criteria not used previously, such as the Davies-Bouldin index, the Dunn index and the silhouette index. The developed multi-SOM algorithm is compared to the k-means and BIRCH methods. Results show that it is more efficient than classical clustering methods.

Keywords: clustering, SOM, multi-SOM, DB index, Dunn index, silhouette index

Procedia PDF Downloads 593

13200 Single Machine Scheduling Problem to Minimize the Number of Tardy Jobs

Authors: Ali Allahverdi, Harun Aydilek, Asiye Aydilek

Abstract:

Minimizing the number of tardy jobs is an important factor to consider while making scheduling decisions. This is because on-time shipments are vital for lowering cost and increasing customers’ satisfaction. This paper addresses the single machine scheduling problem with the objective of minimizing the number of tardy jobs. The only known information is the lower and upper bounds for processing times, and deterministic job due dates. A dominance relation is established, and an algorithm is proposed. Several heuristics are generated from the proposed algorithm. Computational analysis indicates that the performance of one of the heuristics is very close to the optimal solution, i.e., on average, less than 1.5 % from the optimal solution.

Keywords: single machine scheduling, number of tardy jobs, heuristic, lower and upper bounds

Procedia PDF Downloads 552

13199 Optimal Number of Reconfigurable Robots in a Transport System

Authors: Mari Chaikovskaia, Jean-Philippe Gayon, Alain Quilliot

Abstract:

We consider a fleet of elementary robots that can be connected in different ways to transport loads of different types. For instance, a single robot can transport a small load, and the association of two robots can either transport a large load or two small loads. We seek to determine the optimal number of robots to transport a set of loads in a given time interval, with or without reconfiguration. We show that the problem with reconfiguration is strongly NP-hard by a reduction to the bin-packing problem. Then, we study a special case with unit capacities and derive simple formulas for the minimum number of robots, up to 3 types of loads. For this special case, we compare the minimum number of robots with or without reconfiguration and show that the gain is limited in absolute value but may be significant for small fleets.

Keywords: fleet sizing, reconfigurability, robots, transportation

Procedia PDF Downloads 82

13198 An Investigation of the Relationship Between Privacy Crisis, Public Discourse on Privacy, and Key Performance Indicators at Facebook (2004–2021)

Authors: Prajwal Eachempati, Laurent Muzellec, Ashish Kumar Jha

Abstract:

We use Facebook as a case study to investigate the complex relationship between the firm’s public discourse (and actions) surrounding data privacy and the performance of a business model based on monetizing users’ data. We do so by looking at the evolution of public discourse over time (2004–2021) and relating topics to revenue and stock market evolution. Drawing from archival sources like Zuckerberg, we use the LDA topic modelling algorithm to reveal 19 topics regrouped in 6 major themes. We first show how, by using persuasive and convincing language that promises better protection of consumer data usage, but also emphasizes greater user control over their own data, the privacy issue is being reframed as one of greater user control and responsibility. Second, we aim to understand and put a value on the extent to which privacy disclosures have a potential impact on the financial performance of social media firms. We found a significant relationship between the topics pertaining to privacy and social media/technology, sentiment score and stock market prices. Revenue is found to be impacted by topics pertaining to politics and new product and service innovations, while the number of active users is not impacted by the topics unless moderated by external control variables like Return on Assets and Brand Equity.

Keywords: public discourses, data protection, social media, privacy, topic modeling, business models, financial performance

Procedia PDF Downloads 88

13197 Analysis of the Topics of Research of Brazilian Researchers Acting in the Areas of Engineering

Authors: Jether Gomes, Thiago M. R. Dias, Gray F. Moita

Abstract:

The production and publication of scientific works have increased significantly in recent years, with the Internet being the main factor in their access and diffusion. In view of this, researchers from several areas of knowledge have carried out studies on scientific production data in order to analyze phenomena and trends in science. Understanding how research has evolved can, for example, serve as a basis for building scientific policies for further advances in science and for stimulating research groups to become more productive. In this context, the objective of this work is to analyze the main research topics investigated over the careers of Brazilian researchers working in the areas of engineering, in order to map scientific knowledge and identify prominent topics. To this end, with the aid of bibliometric analysis features, we study the frequency and relationship of the keywords of the scientific articles registered in the curricula of each of the selected researchers in the Lattes Platform.

Keywords: research topics, bibliometrics, topics of interest, Lattes Platform

Procedia PDF Downloads 217

13196 Optimal Design of Profiled Steel Sheet for Composite Slab

Authors: Adinew Gebremeskel Tizazu

Abstract:

Nowadays, with ongoing technological development, there is growing pressure on the building construction industry to improve the time, economy, and structural efficiency of structures. Modern profiled steel sheets are mostly designed as formwork and tensile reinforcement. This research is concerned with the optimal design of profiled steel sheets for composite slabs. Apart from satisfying the safety requirement, the design should be economical. For a given condition, there might be a large number of alternatives that satisfy the requirements set by the codes, but the designer must be in a position to choose the design that is optimal against certain measures of optimality. Therefore, designers have to perform some optimization to arrive at such a design. In this research, the optimal cross-sectional dimensions of profiled steel sheets will be determined by considering different spans, loadings, and materials.

Keywords: profiled sheeting, optimal cross-sectional dimensions, cold-formed profiled sheets, composite slab

Procedia PDF Downloads 13

13195 Building an Opinion Dynamics Model from Experimental Data

Authors: Dino Carpentras, Paul J. Maher, Caoimhe O'Reilly, Michael Quayle

Abstract:

Opinion dynamics is a sub-field of agent-based modeling that focuses on people’s opinions and their evolution over time. Despite the rapid increase in the number of publications in this field, it is still not clear how to apply these models to real-world scenarios. Indeed, there is no agreement on how people update their opinion while interacting. Furthermore, it is not clear if different topics will show the same dynamics (e.g., more polarized topics may behave differently). These problems are mostly due to the lack of experimental validation of the models. Some previous studies started bridging this gap in the literature by directly measuring people’s opinions before and after the interaction. However, these experiments force people to express their opinion as a number instead of using natural language (and then, eventually, encoding it as numbers). This is not the way people normally interact, and it may strongly alter the measured dynamics. Another limitation of these studies is that they usually average all the topics together, without checking if different topics may show different dynamics. In our work, we collected data from 200 participants on 5 unpolarized topics. Participants expressed their opinions in natural language (“agree” or “disagree”). We also measured the certainty of their answer, expressed as a number between 1 and 10. However, this value was not shown to other participants to keep the interaction based on natural language. We then showed the opinion (and not the certainty) of another participant and, after a distraction task, we repeated the measurement. To make the data compatible with opinion dynamics models, we multiplied opinion and certainty to obtain a new parameter (here called “continuous opinion”) ranging from -10 to +10 (using agree=1 and disagree=-1). We first checked the 5 topics individually, finding that all of them behaved in a similar way despite having different initial opinion distributions.
This suggested that the same model could be applied to different unpolarized topics. We also observed that people tend to maintain similar levels of certainty, even when they change their opinion. This is a strong violation of what common models suggest, where people starting at, for example, +8, will first move towards 0 instead of directly jumping to -8. We also observed social influence, meaning that people exposed to “agree” were more likely to move to higher levels of continuous opinion, while people exposed to “disagree” were more likely to move to lower levels. However, we also observed that the effect of influence was smaller than the effect of random fluctuations. Also, this configuration is different from standard models, where noise, when present, is usually much smaller than the effect of social influence. Starting from this, we built an opinion dynamics model that explains more than 80% of data variance. This model was also able to show the natural conversion of polarization from unpolarized states. This experimental approach offers a new way to build models grounded on experimental data. Furthermore, the model offers new insight into the fundamental terms of opinion dynamics models.
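
The “continuous opinion” encoding described in the abstract is simple enough to write down directly. The following Python sketch uses the stated convention (agree=1, disagree=-1, certainty between 1 and 10). The function name and the input validation are illustrative assumptions, not part of the study.

```python
def continuous_opinion(opinion, certainty):
    """Combine a natural-language opinion with a 1-10 certainty score.

    Returns opinion * certainty, using agree = +1 and disagree = -1, so the
    result is a nonzero integer in the range -10 .. +10.
    """
    signs = {"agree": 1, "disagree": -1}
    if opinion not in signs:
        raise ValueError("opinion must be 'agree' or 'disagree'")
    if not 1 <= certainty <= 10:
        raise ValueError("certainty must be between 1 and 10")
    return signs[opinion] * certainty
```

With this encoding, a participant who switches from “agree” to “disagree” while keeping certainty 8 moves from +8 to -8 without passing through values near 0, which is the kind of jump the abstract contrasts with common models.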

Keywords: experimental validation, micro-dynamics rule, opinion dynamics, update rule

Procedia PDF Downloads 107

13194 Heat Transfer Enhancement Due to the Optimal Porosity in Plate Heat Exchangers with Sinusoidal Plates

Authors: Hossein Shokouhmand, Seyyed Mostafa Saadat

Abstract:

In this paper, the effect of thermal dispersion on the performance of plate heat exchangers (PHEs) with sinusoidal plates is investigated. In this regard, the PHE is considered as a porous medium. The important property of a porous medium is porosity, which is defined as the total fluid volume divided by the total volume occupied by the solid and fluid. A 2D array of parallel sinusoidal plates is considered, with laminar, periodically developed forced convection, single-phase constant-property flows, and conduction in a homogeneous solid phase in two directions. The flows are arranged in counterflow and their heat capacities are equal. A numerical study of conjugate heat transfer and axial conduction in the solid phase with different plate thicknesses showed that there is an optimal porosity at which the efficiency of heat transfer is up to 4% higher than when the porosity is near one. It is shown that the optimal porosity at zero angle of inclination depends on both the Reynolds number and the aspect ratio. The optimal porosity increased as either the Reynolds number or the waviness of the plates increased.
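
The porosity definition given in the abstract can be written as a formula. The symbols $\varepsilon$, $V_f$ and $V_s$ (porosity, fluid volume, solid volume) are notation assumed here for illustration and are not taken from the paper.

```latex
\varepsilon = \frac{V_f}{V_f + V_s}, \qquad 0 < \varepsilon \le 1
```

Under this definition, a porosity near one corresponds to vanishingly thin plates ($V_s \to 0$), which is the reference case against which the abstract reports the efficiency gain at the optimal porosity.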

Keywords: plate heat exchanger, optimal porosity, efficiency, aspect ratio

Procedia PDF Downloads 399

13193 Clustering Performance Analysis using New Correlation-Based Cluster Validity Indices

Authors: Nathakhun Wiroonsri

Abstract:

There are various cluster validity measures used for evaluating clustering results. One of the main objectives of using these measures is to seek the optimal unknown number of clusters. Some measures work well for clusters with different densities, sizes and shapes. Yet, one of the weaknesses that those validity measures share is that they sometimes provide only one clear optimal number of clusters. That number is actually unknown, and there might be more than one potential sub-optimal option that a user may wish to choose based on different applications. We develop two new cluster validity indices based on the correlation between the actual distance between a pair of data points and the distance between the centroids of the clusters in which the two points are located. Our proposed indices consistently yield several peaks at different numbers of clusters, which overcomes the weakness stated above. Furthermore, the introduced correlation can also be used for evaluating the quality of a selected clustering result. Several experiments in different scenarios, including the well-known iris data set and a real-world marketing application, have been conducted to compare the proposed validity indices with several well-known ones.
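The correlation described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of the two indices, so the choices below are assumptions made only for illustration: Euclidean distance, Pearson correlation, and the hypothetical function name `centroid_distance_correlation`.

```python
import math
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def centroid_distance_correlation(points, labels):
    """Correlate each pair's actual distance with the distance between
    the centroids of the clusters the two points belong to (assumed form)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centroids = {label: centroid(members) for label, members in clusters.items()}

    actual, via_centroids = [], []
    for i, j in combinations(range(len(points)), 2):
        actual.append(math.dist(points[i], points[j]))
        via_centroids.append(math.dist(centroids[labels[i]], centroids[labels[j]]))
    return pearson(actual, via_centroids)


# Toy data (hypothetical): two well-separated groups of two points each.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = centroid_distance_correlation(points, [0, 0, 1, 1])
bad = centroid_distance_correlation(points, [0, 1, 0, 1])
```

Under these assumptions, a partition that matches the geometry of the data gives a correlation close to one, while a partition that cuts across the groups gives a much lower value. Evaluating the score over a range of cluster counts would be one way to look for the several peaks the abstract mentions.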

Keywords: clustering algorithm, cluster validity measure, correlation, data partitions, iris data set, marketing, pattern recognition

Procedia PDF Downloads 102

13192 An Improved Genetic Algorithm for Traveling Salesman Problem with Precedence Constraint

Authors: M. F. F. Ab Rashid, A. N. Mohd Rose, N. M. Z. Nik Mohamed, W. S. Wan Harun, S. A. Che Ghani

Abstract:

The traveling salesman problem with precedence constraint (TSPPC) is one of the most complex problems in combinatorial optimization. The existing algorithms for solving TSPPC require large computational time to find the optimal solution. The purpose of this paper is to present an efficient genetic algorithm that guarantees an optimal solution with fewer generations and less iteration time. Unlike the existing algorithm, which generates a priority factor as the chromosome, the proposed algorithm directly generates the solution sequence as the chromosome. As a result, the proposed algorithm is capable of generating an optimal solution with a smaller number of generations and less iteration time compared to the existing algorithm.
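To make the idea of a sequence-as-chromosome encoding concrete, the sketch below checks a visiting sequence against precedence constraints and draws a random feasible sequence. The abstract does not describe how the authors generate chromosomes, so the random topological ordering used here, and the names `is_feasible` and `random_feasible_sequence`, are assumptions for illustration only.

```python
import random


def is_feasible(sequence, precedence):
    """True if every pair (a, b) in precedence has a visited before b."""
    position = {node: i for i, node in enumerate(sequence)}
    return all(position[a] < position[b] for a, b in precedence)


def random_feasible_sequence(nodes, precedence, rng):
    """Build a chromosome directly as a visiting sequence by repeatedly
    picking, at random, a node whose predecessors are all already placed."""
    predecessors = {node: set() for node in nodes}
    for a, b in precedence:
        predecessors[b].add(a)

    sequence = []
    unplaced = set(nodes)
    while unplaced:
        ready = sorted(n for n in unplaced if not (predecessors[n] & unplaced))
        if not ready:
            raise ValueError("precedence constraints contain a cycle")
        choice = rng.choice(ready)
        sequence.append(choice)
        unplaced.remove(choice)
    return sequence


# Hypothetical instance: node 3 needs 1 and 2 first, node 5 needs 3 first.
nodes = [1, 2, 3, 4, 5]
precedence = [(1, 3), (2, 3), (3, 5)]
chromosome = random_feasible_sequence(nodes, precedence, random.Random(0))
```

Because every chromosome produced this way already satisfies the constraints, no repair or decoding step is needed before evaluating tour length, which is one plausible reason a direct encoding could shorten the search.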

Keywords: traveling salesman problem, sequencing, genetic algorithm, precedence constraint

Procedia PDF Downloads 553

13191 Experimental Measurements of Mean and Turbulence Quantities behind the Circular Cylinder by Attaching Different Number of Tripping Wires

Authors: Amir Bak Khoshnevis, Mahdieh Khodadadi, Aghil Lotfi

Abstract:

For a bluff body, roughness elements can be used to simulate a turbulent boundary layer, leading to delayed flow separation, a smaller wake, and lower form drag. In the present work, flow past a circular cylinder with tripping wires is studied experimentally. The wind tunnel used for modeling the free stream is an open blow circuit (maximum speed = 30 m/s and maximum free-stream turbulence = 0.1%). The Reynolds number was kept constant for all tests (Re = 25000). The circular cylinder selected for this experiment is 20 mm in diameter and 400 mm in length. The aim of this research is to find the optimal operation mode. In this study, tripping wires 1 mm in diameter were installed on the circular cylinder in different numbers, and the wake characteristics of the cylinder were studied. Results for an increasing number of tripping wires attached to the circular cylinder (6, 8, and 10, respectively) showed that the optimal angle for installing the 1 mm tripping wires on the cylinder is 60° (i.e., 6 wires at an angular spacing of 60°). The Strouhal number for the cylinder with 1 mm tripping wires at the 60° angular position showed the maximum value.

Keywords: wake of circular cylinder, trip wire, velocity defect, Strouhal number

Procedia PDF Downloads 394

13190 Optimal Placement of Phasor Measurement Units Using Gravitational Search Method

Authors: Satyendra Pratap Singh, S. P. Singh

Abstract:

This paper presents a methodology using the Gravitational Search Algorithm for optimal placement of Phasor Measurement Units (PMUs) in order to achieve complete observability of the power system. The objective of the proposed algorithm is to minimize the total number of PMUs at the power system buses, which in turn minimizes the installation cost of the PMUs. In this algorithm, the searcher agents are a collection of masses which interact with each other using Newton’s laws of gravity and motion. This new Gravitational Search Algorithm based method has been applied to the IEEE 14-bus, IEEE 30-bus and IEEE 118-bus test systems. Case studies reveal the optimal number of PMUs with better observability by the proposed method.
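The objective stated in this abstract (fewest PMUs subject to complete observability) can be sketched on a toy network. Two assumptions are made here and are not taken from the paper: a bus counts as observable if it hosts a PMU or is directly connected to a bus that does, and the search is a plain exhaustive enumeration rather than the Gravitational Search Algorithm. The 5-bus network is hypothetical, not an IEEE test system.

```python
from itertools import combinations


def is_observable(adjacency, pmu_buses):
    """Assumed rule: a PMU makes its own bus and all neighbouring buses observable."""
    covered = set()
    for bus in pmu_buses:
        covered.add(bus)
        covered.update(adjacency[bus])
    return covered == set(adjacency)


def minimal_placement(adjacency):
    """Exhaustive search for a smallest PMU set giving full observability.
    Only practical for very small networks; a metaheuristic replaces this at scale."""
    buses = sorted(adjacency)
    for count in range(1, len(buses) + 1):
        for candidate in combinations(buses, count):
            if is_observable(adjacency, candidate):
                return candidate
    return tuple(buses)


# Hypothetical 5-bus network given as bus -> set of neighbouring buses.
network = {
    1: {2},
    2: {1, 3, 4},
    3: {2, 4},
    4: {2, 3, 5},
    5: {4},
}
placement = minimal_placement(network)
```

The `is_observable` check is the kind of feasibility test a search agent's candidate placement would have to pass; the exhaustive loop only serves to show what the minimum looks like on a network small enough to enumerate.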

Keywords: gravitational search algorithm (GSA), law of motion, law of gravity, observability, phasor measurement unit

Procedia PDF Downloads 499

13189 Application of Simulation of Discrete Events in Resource Management of Massive Concreting

Authors: Mohammad Amin Hamedirad, Seyed Javad Vaziri Kang Olyaei

Abstract:

Project planning and control is one of the most critical issues in the management of construction projects. Traditional methods of project planning and control, such as the critical path method or the Gantt chart, are not widely used for planning projects with discrete and repetitive activities, and one of the problems facing project managers is planning the implementation process and the optimal allocation of its resources. A massive concreting project is also a project with discrete and repetitive activities. This study uses the concept of discrete-event simulation to manage resources, which includes finding the optimal number of resources under various limitations, such as limitations of machinery, equipment, and human resources, and even technical, time and implementation limitations, using analysis of the resource consumption rate, the project completion time and the critical points of the implementation process. For this purpose, discrete-event simulation has been used to model the different stages of implementation. After reviewing the various scenarios, the optimal number of allocations for each resource is determined so as to reach the maximum utilization rate and also to reduce the project completion time or its cost according to the existing constraints. The results showed that with the optimal allocation of resources, the project completion time could be reduced by 90%, and the resulting costs could be reduced by up to 49%. Thus, allocating the optimal number of project resources using this method will reduce its time and cost.

Keywords: simulation, massive concreting, discrete event simulation, resource management

Procedia PDF Downloads 142

13188 Validating the Micro-Dynamic Rule in Opinion Dynamics Models

Authors: Dino Carpentras, Paul Maher, Caoimhe O'Reilly, Michael Quayle

Abstract:

Opinion dynamics is dedicated to modeling the dynamic evolution of people's opinions. Models in this field are based on a micro-dynamic rule, which determines how people update their opinion when interacting. Despite the high number of new models (many of them based on new rules), little research has been dedicated to experimentally validating the rule. A few studies have started bridging this literature gap by experimentally testing the rule. However, in these studies, participants are forced to express their opinion as a number instead of using natural language. Furthermore, some of these studies average data from experimental questions without testing whether differences exist between them. Indeed, it is possible that different topics could show different dynamics. For example, people may be more prone to accepting someone else's opinion regarding less polarized topics. In this work, we collected data from 200 participants on 5 unpolarized topics. Participants expressed their opinions using natural language ('agree' or 'disagree') and the certainty of their answer, expressed as a number between 1 and 10. To keep the interaction based on natural language, certainty was not shown to other participants. We then showed the participant someone else's opinion on the same topic and, after a distraction task, we repeated the measurement. To produce data compatible with standard opinion dynamics models, we multiplied the opinion (encoded as agree=1 and disagree=-1) by the certainty to obtain a single 'continuous opinion' ranging from -10 to 10. By analyzing the topics independently, we observed that each one shows a different initial distribution. However, the dynamics (i.e., the properties of the opinion change) appear to be similar across all topics. This suggests that the same micro-dynamic rule could be applied to unpolarized topics. Another important result is that participants who change opinion tend to maintain similar levels of certainty. This is in contrast with typical micro-dynamic rules, where agents move to an average point instead of directly jumping to the opposite continuous opinion. As expected, we also observed the effect of social influence in the data. This means that exposing someone to 'agree' or 'disagree' moved participants towards higher or lower values of the continuous opinion, respectively. However, we also observed random variations whose effect was stronger than that of social influence. We even observed cases of people who changed from 'agree' to 'disagree' even though they were exposed to 'agree.' This phenomenon is surprising because, in the standard literature, the strength of the noise is usually smaller than the strength of social influence. Finally, we also built an opinion dynamics model from the data. The model was able to explain more than 80% of the data variance. Furthermore, by iterating the model, we were able to produce polarized states even when starting from an unpolarized population. This experimental approach offers a way to test the micro-dynamic rule. It also allows us to build models that are directly grounded in experimental results.
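The encoding of the 'continuous opinion' described in this abstract can be written out as a short sketch. The function name `continuous_opinion` and the input validation are illustrative assumptions; only the mapping itself (agree = 1, disagree = -1, multiplied by a certainty between 1 and 10) comes from the abstract.

```python
def continuous_opinion(answer, certainty):
    """Combine a natural-language answer with a certainty score into one
    signed value, using agree = +1 and disagree = -1 as in the abstract."""
    if answer not in ("agree", "disagree"):
        raise ValueError("answer must be 'agree' or 'disagree'")
    if not 1 <= certainty <= 10:
        raise ValueError("certainty must be between 1 and 10")
    sign = 1 if answer == "agree" else -1
    return sign * certainty


# A participant who switches sides while keeping the same certainty
# jumps straight across the scale instead of passing through the middle.
before = continuous_opinion("agree", 8)
after = continuous_opinion("disagree", 8)
```

Note that with certainty restricted to 1 through 10, this encoding never produces exactly 0, so any change of answer is a jump across the midpoint of the scale.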

Keywords: experimental validation, micro-dynamic rule, opinion dynamics, update rule

Procedia PDF Downloads 152

13187 Investigating Non-suicidal Self-Injury Discussions on Twitter

Authors: Muhammad Abubakar Alhassan, Diane Pennington

Abstract:

Social networking sites have become a space for people to discuss public health issues such as non-suicidal self-injury (NSSI). There are thousands of tweets containing self-harm and self-injury hashtags on Twitter. It is difficult to distinguish between the different users who participate in self-injury discussions on Twitter and to see how their opinions change over time. It is also challenging to understand the topics surrounding NSSI discussions on Twitter. We retrieved tweets using the #selfharm and #selfinjury hashtags and investigated those from the United Kingdom. We applied inductive coding and grouped tweeters into different categories. This study used the Latent Dirichlet Allocation (LDA) algorithm to infer the optimum number of topics that describes our corpus. Our findings revealed that many of those participating in NSSI discussions are non-professional users as opposed to medical experts and academics. Support organisations, medical teams, and academics were campaigning positively on raising self-injury awareness and recovery. Using the LDAvis visualisation technique, we selected the top 20 most relevant terms from each topic and interpreted the topics as: children and youth well-being, self-harm misjudgement, mental health awareness, school and mental health support, and suicide and mental-health issues. More than 50% of these topics were discussed in England compared to Scotland, Wales, Ireland and Northern Ireland. Our findings highlight the advantages of using the Twitter social network in tackling the problem of self-injury through awareness. There is a need to study the potential risks associated with the use of social networks among self-injurers.

Keywords: self-harm, non-suicidal self-injury, Twitter, social networks

Procedia PDF Downloads 123

13186 Towards Law Data Labelling Using Topic Modelling

Authors: Daniel Pinheiro Da Silva Junior, Aline Paes, Daniel De Oliveira, Christiano Lacerda Ghuerren, Marcio Duran

Abstract:

The Courts of Accounts are institutions responsible for overseeing and pointing out irregularities in Public Administration expenses. They have a high demand for processes to be analyzed, whose decisions must be grounded on severity laws. Despite the large number of existing processes, there are several cases reporting similar subjects. Thus, previous decisions on already analyzed processes can be a precedent for current processes that refer to similar topics. Identifying similar topics is an open yet essential task for identifying similarities between several processes. Since the actual number of topics is considerably large, it is tedious and error-prone to identify topics using a purely manual approach. This paper presents a tool based on Machine Learning and Natural Language Processing to assist in building a labeled dataset. The tool relies on Topic Modelling with Latent Dirichlet Allocation to find the topics underlying a document, followed by the Jensen-Shannon distance metric to generate a probability of similarity between document pairs. Furthermore, in a case study with a corpus of decisions of the Rio de Janeiro State Court of Accounts, it was noted that data pre-processing plays an essential role in modeling relevant topics. Also, the combination of topic modeling and a distance metric calculated over documents represented by the generated topics has proved useful in helping to construct a labeled base of similar and non-similar document pairs.
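The Jensen-Shannon step mentioned in this abstract can be sketched as follows for two documents represented as topic distributions. The use of base-2 logarithms (which bounds the distance between 0 and 1) and the conversion to a similarity score as one minus the distance are assumptions for illustration; the abstract does not state how the tool derives its probability of similarity. The function names and the example distributions are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence between two
    probability distributions of equal length."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding


def similarity(p, q):
    """Assumed similarity score: 1 minus the base-2 Jensen-Shannon distance."""
    return 1.0 - jensen_shannon_distance(p, q)


# Hypothetical topic distributions for three documents over three topics.
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.6, 0.3, 0.1]
doc_c = [0.05, 0.05, 0.9]
score_ab = similarity(doc_a, doc_b)
score_ac = similarity(doc_a, doc_c)
```

Documents dominated by the same topic score close to one, while documents dominated by different topics score much lower, so a threshold on this score could be used to propose similar and non-similar pairs for labelling.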

Keywords: courts of accounts, data labelling, document similarity, topic modeling

Procedia PDF Downloads 171