Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 4030

Search results for: weighted frequent patterns

4000 An Analysis of Sequential Pattern Mining on Databases Using Approximate Sequential Patterns

Abstract:

Sequential pattern mining applies data mining methods to large data repositories to extract usage patterns. Sequential pattern mining methodologies are used to analyze the data and identify patterns. These patterns have been used to implement efficient systems that can make recommendations based on previously observed patterns, help in making predictions, improve the usability of systems, detect events, and, in general, help in making strategic product decisions. In this paper, we examine the performance of approximate sequential pattern mining, defined as identifying patterns approximately shared by many sequences. Approximate sequential patterns can effectively summarize and represent the databases by identifying the underlying trends in the data. We conduct an extensive and systematic performance study over synthetic and real data. The results demonstrate that ApproxMAP is effective and scalable in mining large sequence databases with long patterns.

Keywords: multiple data, performance analysis, sequential pattern, sequence database scalability

Procedia PDF Downloads 304

3999 Frequent Pattern Mining for Digenic Human Traits

Authors: Atsuko Okazaki, Jurg Ott

Abstract:

Some genetic diseases (‘digenic traits’) are due to the interaction between two DNA variants. For example, certain forms of Retinitis Pigmentosa (a genetic form of blindness) occur in the presence of two mutant variants, one in the ROM1 gene and one in the RDS gene, while the occurrence of only one of these mutant variants leads to a completely normal phenotype. Detecting such digenic traits by genetic methods is difficult. A common approach to finding disease-causing variants is to compare hundreds of thousands of variants between individuals with a trait (cases) and those without the trait (controls). Such genome-wide association studies (GWASs) have been very successful but hinge on genetic effects of single variants, that is, there should be a difference in allele or genotype frequencies between cases and controls at a disease-causing variant. Frequent pattern mining (FPM) methods offer an avenue for detecting digenic traits even in the absence of single-variant effects. The idea is to enumerate pairs of genotypes (genotype patterns), with each of the two genotypes originating from a different variant; the two variants may be located at very different genomic positions. What is needed is for genotype patterns to be significantly more common in cases than in controls. Let Y = 2 refer to cases and Y = 1 to controls, with X denoting a specific genotype pattern. We are seeking association rules, ‘X → Y’, with high confidence, P(Y = 2|X), significantly higher than the proportion of cases, P(Y = 2), in the study. Clearly, generally available FPM methods are very suitable for detecting disease-associated genotype patterns. We used fpgrowth as the basic FPM algorithm and built a framework around it to enumerate high-frequency digenic genotype patterns and to evaluate their statistical significance by permutation analysis. Application to a published dataset on opioid dependence furnished results that could not be found with classical GWAS methodology.
There were 143 cases and 153 healthy controls, each genotyped for 82 variants in eight genes of the opioid system. The aim was to find out whether any of these variants were disease-associated. The single-variant analysis did not lead to significant results. Application of our FPM implementation resulted in one significant (p < 0.01) genotype pattern with both genotypes in the pattern being heterozygous and originating from two variants on different chromosomes. This pattern occurred in 14 cases and none of the controls. Thus, the pattern seems quite specific to this form of substance abuse and is also rather predictive of disease. An algorithm called Multifactor Dimension Reduction (MDR) was developed some 20 years ago and has been in use in human genetics ever since. This and our algorithms share some similar properties, but they are also very different in other respects. The main difference seems to be that our algorithm focuses on patterns of genotypes while the main object of inference in MDR is the 3 × 3 table of genotypes at two variants.
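The confidence of a rule ‘X → Y’ described above can be illustrated with the counts reported in this abstract (143 cases, 153 controls, and a pattern seen in 14 cases and no controls). The following is only a minimal sketch: the function names are hypothetical, and the permutation analysis the authors use to assess significance is not reproduced here.

```python
from fractions import Fraction


def rule_confidence(cases_with_pattern, controls_with_pattern):
    """Confidence of the rule X -> (Y = 2), i.e. P(Y = 2 | X)."""
    carriers = cases_with_pattern + controls_with_pattern
    if carriers == 0:
        raise ValueError("pattern does not occur in the sample")
    return Fraction(cases_with_pattern, carriers)


def baseline_case_rate(n_cases, n_controls):
    """Proportion of cases in the whole study, i.e. P(Y = 2)."""
    return Fraction(n_cases, n_cases + n_controls)


# Counts taken from the abstract.
confidence = rule_confidence(cases_with_pattern=14, controls_with_pattern=0)
baseline = baseline_case_rate(n_cases=143, n_controls=153)

# Lift compares the rule's confidence with the baseline case rate.
lift = confidence / baseline

print(f"confidence = {float(confidence):.3f}")
print(f"baseline   = {float(baseline):.3f}")
print(f"lift       = {float(lift):.3f}")
```

A confidence well above the baseline only suggests an association; whether it is statistically significant is a separate question that the abstract addresses by permutation.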

Keywords: digenic traits, DNA variants, epistasis, statistical genetics

Procedia PDF Downloads 97

3998 Phonological Variation in the Speech of Grade 1 Teachers in Select Public Elementary Schools in the Philippines

Authors: M. Leonora D. Guerrero

Abstract:

The study attempted to uncover the most and least frequent phonological variations evident in the speech patterns of grade 1 teachers in select public elementary schools in the Philippines. It also determined the lectal description of the participants based on Tayao’s consonant charts for American and Philippine English. A descriptive method was used. A total of 24 grade 1 teachers participated in the study. The instrument used was a word list. Each column in the word list contained words with the target consonant phonemes: the labiodental fricatives /f/ and /v/ and the lingua-alveolar fricative /z/. These phonemes were in the initial, medial, and final positions, respectively. Findings of the study revealed that the most frequent variation happened when the participants read words with /z/ in the final position, while the least frequent variation happened when the participants read words with /z/ in the initial position. The study likewise showed that the grade 1 teachers exhibited the segmental features of both the mesolect and the basilect. Based on these results, it is suggested that teachers of English in the Philippines should aspire to manifest the features of the mesolect, if not the acrolect, since academicians are expected not to display the phonological features of the basilect, a variety used only by the 'uneducated.' This is especially so for grade 1 teachers, who are often mimicked by their students, who regard the teachers' speech as the 'standard.'

Keywords: consonant phonemes, lectal description, Philippine English, phonological variation

Procedia PDF Downloads 181

3997 Irreducible Sign Patterns of Minimum Rank of 3 and Symmetric Sign Patterns That Allow Diagonalizability

Authors: Sriparna Bandopadhyay

Abstract:

It is known that irreducible sign patterns in general may not allow diagonalizability, in particular irreducible sign patterns with minimum rank greater than or equal to 4. It is also known that every irreducible sign pattern matrix with minimum rank of 2 allows diagonalizability with rank of 2 and with the maximum rank of the sign pattern. In general, sign patterns with minimum rank of 3 may not allow diagonalizability if the condition of irreducibility is dropped, but the problem of whether every irreducible sign pattern with minimum rank of 3 allows diagonalizability remains open. In this paper, it is shown that irreducible sign patterns with minimum rank of 3 allow diagonalizability under certain conditions on the underlying graph. Alternate proofs are given of the results that every sign pattern matrix with minimum rank of 2 and no zero lines allows diagonalizability with rank of 2, and that every full sign pattern allows diagonalizability with all permissible ranks of the sign pattern. Some open problems regarding composite cycles in an irreducible symmetric sign pattern that support a rank-principal certificate are also answered.

Keywords: irreducible sign patterns, minimum rank, symmetric sign patterns, rank-principal certificate, allowing diagonalizability

Procedia PDF Downloads 65

3996 Frequent-Flyer Program: The Connection between Commercial Partners and Spin-off

Authors: Changmin Jiang

Abstract:

In this paper, we build a theoretical model to investigate the relationship between two recent trends in airline frequent-flyer programs (FFPs): the adoption of the “coalition” business model with other commercial partners, and the separation from airlines’ operations. We show that commercial partners benefit from teaming up with an FFP; while increasing the number of commercial partners increases the total profit, it reduces the average profit of the parties involved. Furthermore, we show that the number of commercial partners of an FFP is negatively related to the benefit of keeping the FFP in-house.

Keywords: frequent flyer program, coalition, commercial partners, spin-off

Procedia PDF Downloads 253

3995 A Study and Design Scarf Collection Applied Vietnamese Traditional Patterns by Using Printing Method on Fabric

Authors: Mai Anh Pham Ho

Abstract:

Scarves today are a fashion symbol used to decorate, to make life more beautiful, and to bring new features to our living space. They also show cultural identity through traditional patterns, which make it easy to introduce the image of Vietnam to other nations around the world. Therefore, the purpose of this research is to classify Vietnamese traditional patterns according to era and dynasty. Vietnamese traditional patterns across the dynasties of Vietnamese history are surveyed and classified into five groups, namely geometric patterns, natural patterns, animal patterns, floral patterns, and character patterns, in Prehistoric times, the Bronze and Iron Age, the Chinese domination, the Ngo-Dinh-TienLe-Ly-Tran-Ho dynasty, and the LeSo-Mac-LeTrinh-TaySon-Nguyen dynasty. In addition, there are some special kinds of Vietnamese traditional patterns such as the buffalo, lotus, bronze-drum, and Phuc Loc Tho character. Extensive research was conducted to modernize a scarf collection that applies Vietnamese traditional patterns in line with current fashion trends. The concept, target, image map, lifestyle map, motif, colours, arrangement, and completion of patterns on the scarves were set up. The scarf collection was designed and developed in Adobe Illustrator with three colourways for each scarf. Upon completion of the research, digital printing technology was chosen for the scarf collection. Vietnamese traditional patterns were researched deeply and widely in order to establish a basic background for Vietnamese culture, to identify the Vietnamese national character, and to establish and preserve the cultural heritage.

Keywords: scarf collection, Vietnamese traditional patterns, printing methods, fabric design

Procedia PDF Downloads 315

3994 Solving Weighted Number of Operation Plus Processing Time Due-Date Assignment, Weighted Scheduling and Process Planning Integration Problem Using Genetic and Simulated Annealing Search Methods

Authors: Halil Ibrahim Demir, Caner Erden, Mumtaz Ipek, Ozer Uygun

Abstract:

Traditionally, the three important manufacturing functions, namely process planning, scheduling, and due-date assignment, are performed separately and sequentially. Over the past couple of decades, hundreds of studies have addressed integrated process planning and scheduling, and numerous studies have addressed scheduling with due-date assignment, but the integration of all three functions has not been adequately addressed. Here, the integration of these three functions is studied using genetic, random-genetic hybrid, simulated annealing, random-simulated annealing hybrid, and random search techniques. The importance of integrating these three functions and the power of meta-heuristics and hybrid heuristics are also studied.

Keywords: process planning, weighted scheduling, weighted due-date assignment, genetic search, simulated annealing, hybrid meta-heuristics

Procedia PDF Downloads 449

3993 Weighted Rank Regression with Adaptive Penalty Function

Authors: Kang-Mo Jung

Abstract:

The use of regularization for statistical methods has become popular. The least absolute shrinkage and selection operator (LASSO) framework has become the standard tool for sparse regression. However, it is well known that the LASSO is sensitive to outliers and leverage points. We consider a new robust estimation method composed of a weighted loss function of the pairwise differences of residuals and an adaptive penalty function regulating the tuning parameter for each variable. Rank regression is resistant to regression outliers, but not to leverage points. By adopting a weighted loss function, the proposed method is robust to leverage points of the predictor variables. Furthermore, the adaptive penalty function gives good statistical properties in variable selection, such as the oracle property and consistency. We develop an efficient algorithm to compute the proposed estimator using basic functions in R. We used an optimal tuning parameter based on the Bayesian information criterion (BIC). Numerical simulation shows that the proposed estimator is effective for analyzing real data sets and contaminated data.

Keywords: adaptive penalty function, robust penalized regression, variable selection, weighted rank regression

Procedia PDF Downloads 431

3992 Quality Assurance in Software Design Patterns

Authors: Rabbia Tariq, Hannan Sajjad, Mehreen Sirshar

Abstract:

Design patterns are widely used to ease the development process, as they greatly help developers build software. Many design patterns have been introduced to date, but the behavior of the same design pattern may differ across domains, which can lead to the wrong selection of a design pattern. The paper aims to identify the design patterns best suited to each domain, thereby helping developers choose an effective design pattern. It presents a comprehensive analysis of design patterns based on different methodologies, including simulation, case study, and comparison of various algorithms. Because domains differ, the methodology used in one domain may be inapplicable to another. The paper draws a conclusion based on the strengths and limitations of each design pattern in its respective domain.

Keywords: design patterns, evaluation, quality assurance, software domains

Procedia PDF Downloads 489

3991 Bag of Words Representation Based on Weighting Useful Visual Words

Authors: Fatma Abdedayem

Abstract:

The most effective and efficient methods in image categorization are mostly based on bag-of-words (BOW), which represents an image by a histogram of occurrences of visual words. In this paper, we propose a novel extension to this method. Firstly, we extract features at multiple scales by applying a color local descriptor named opponent-SIFT. Secondly, in order to represent the image, we use Spatial Pyramid Representation (SPR) and an extension to the BOW method which is based on weighting visual words. Typically, the visual words are weighted during histogram assignment by computing the ratio of their occurrences in the image to their occurrences in the background. Finally, according to the classical BOW retrieval framework, only a few words of the vocabulary are useful for image representation. Therefore, we select the useful weighted visual words that satisfy the threshold value. Experimentally, the algorithm is tested on different image classes of PASCAL VOC 2007 and is compared against the classical bag-of-visual-words algorithm.
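The weighting and selection step described above (the ratio of a word's occurrences in the image to its occurrences in the background, followed by a threshold) might be sketched as follows. The add-one smoothing in the denominator, the function name, and the example counts are assumptions made for illustration and are not taken from the paper.

```python
def select_weighted_words(image_counts, background_counts, threshold):
    """Weight each visual word by image/background occurrence ratio
    and keep only the words whose weight reaches the threshold."""
    selected = {}
    for word, count in image_counts.items():
        background = background_counts.get(word, 0)
        # Assumption: add-one smoothing so that words never seen in the
        # background do not cause a division by zero.
        weight = count / (background + 1)
        if weight >= threshold:
            selected[word] = weight
    return selected


# Hypothetical visual-word counts for one image and for the background.
image_counts = {"w1": 6, "w2": 2, "w3": 9}
background_counts = {"w1": 2, "w2": 7, "w3": 0}

useful_words = select_weighted_words(image_counts, background_counts, threshold=1.0)
print(useful_words)
```

With these counts, the word "w2" is dropped because it occurs far more often in the background than in the image.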

Keywords: BOW, useful visual words, weighted visual words, bag of visual words

Procedia PDF Downloads 410

3990 Enhanced Weighted Centroid Localization Algorithm for Indoor Environments

Authors: I. Nižetić Kosović, T. Jagušt

Abstract:

Lately, with the increasing number of location-based applications, demand for highly accurate and reliable indoor localization has become urgent. This is a challenging problem because of measurement variance caused by factors such as obstacles, equipment properties, and environmental changes in complex indoor environments. In this paper, we propose a low-cost, custom-setup infrastructure solution and a localization algorithm based on the Weighted Centroid Localization (WCL) method. Localization accuracy is increased by several enhancements: calibration of RSSI values obtained from wireless nodes, repeated RSSI measurements to exclude deviating values from the position estimate, and consideration of the orientation of the device relative to the wireless nodes. We conducted several experiments to evaluate the proposed algorithm. An accuracy of ~1 m was achieved.
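For readers unfamiliar with WCL, the basic idea is to estimate a position as the weighted average of known node positions, with nearer nodes given larger weights. The sketch below shows only this textbook form. The log-distance path-loss conversion from RSSI to distance, the parameter defaults, and all names are assumptions; the calibration, outlier exclusion, and orientation enhancements described in the abstract are not reproduced.

```python
def rssi_to_distance(rssi_dbm, rssi_at_1m=-40.0, path_loss_exponent=2.0):
    """Assumed log-distance path-loss model: distance in metres from RSSI."""
    return 10 ** ((rssi_at_1m - rssi_dbm) / (10.0 * path_loss_exponent))


def weighted_centroid(anchors, degree=1.0):
    """Estimate (x, y) from a list of ((x, y), rssi_dbm) anchor readings.

    Each anchor is weighted by 1 / distance**degree, so nearer anchors
    pull the estimate towards themselves.
    """
    if not anchors:
        raise ValueError("at least one anchor is required")
    total_weight = 0.0
    x_sum = 0.0
    y_sum = 0.0
    for (x, y), rssi in anchors:
        weight = 1.0 / rssi_to_distance(rssi) ** degree
        total_weight += weight
        x_sum += weight * x
        y_sum += weight * y
    return x_sum / total_weight, y_sum / total_weight


# Hypothetical readings: the node at the origin is heard much more strongly.
estimate = weighted_centroid([((0.0, 0.0), -40.0), ((10.0, 0.0), -60.0)])
print(estimate)
```

The `degree` parameter controls how strongly near nodes dominate; larger values move the estimate closer to the strongest node.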

Keywords: indoor environment, received signal strength indicator, weighted centroid localization, wireless localization

Procedia PDF Downloads 208

3989 Optimization of Monitoring Networks for Air Quality Management in Urban Hotspots

Authors: Vethathirri Ramanujam Srinivasan, S. M. Shiva Nagendra

Abstract:

Air quality management in urban areas is a serious concern in both developed and developing countries. In this regard, more air quality monitoring stations are planned to mitigate air pollution in urban areas. In India, the Central Pollution Control Board has set up 574 air quality monitoring stations across the country and proposes to set up another 500 stations in the next few years. The number of monitoring stations for each city has been decided based on population data. Setting up, operating, and maintaining ambient air quality monitoring stations is highly expensive. Therefore, there is a need to optimize monitoring networks for air quality management. The present paper discusses various methods, such as the Indian Standards (IS) method, the US EPA method, and the European Union (EU) method, for arriving at the minimum number of air quality monitoring stations. In addition, the optimization of rain-gauge method and the Inverse Distance Weighted (IDW) method using a Geographical Information System (GIS) are explored for the design of the air quality network in Chennai city. In summary, 18 additional stations are required for Chennai city, and the potential monitoring locations, with their corresponding land use patterns, are ranked and identified from 1 km x 1 km grids.
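Inverse Distance Weighted interpolation, mentioned above, estimates a value at an unsampled location as an average of nearby measurements weighted by the inverse of their distance raised to a power. The following is a minimal sketch of that general formula only; the power of 2, the function name, and the sample values are assumptions and do not come from the Chennai study.

```python
import math


def idw_estimate(samples, query, power=2.0):
    """Interpolate a value at `query` from ((x, y), value) samples."""
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in samples:
        distance = math.hypot(x - query[0], y - query[1])
        if distance == 0.0:
            # The query coincides with a sample, so return it directly.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    if denominator == 0.0:
        raise ValueError("at least one sample is required")
    return numerator / denominator


# Hypothetical pollutant readings at two stations (coordinates in km).
stations = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
midpoint_value = idw_estimate(stations, (1.0, 0.0))
print(midpoint_value)
```

A grid cell whose interpolated value is poorly constrained by existing stations could then be a candidate for a new monitoring location, although the ranking procedure used in the paper is not described in the abstract.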

Keywords: air quality monitoring network, inverse distance weighted method, population based method, spatial variation

Procedia PDF Downloads 154

3988 An Efficient Data Mining Technique for Online Stores

Authors: Mohammed Al-Shalabi, Alaa Obeidat

Abstract:

In any food store, some items expire or are destroyed because demand for them is infrequent. A system is therefore needed that helps the decision maker make an offer on such items, improving demand by bundling them with other frequently purchased items and lowering the price to avoid losses. The system generates hundreds or thousands of patterns (offers) for each low-demand item and then uses association rule measures (support, confidence) to find the interesting patterns (the best offer, achieving the lowest losses). In this paper, we propose a data mining method for determining the best offer by merging data mining techniques with e-commerce strategy. The task is to build a model that predicts the best offer. The goal is to maximize the store's profits and avoid product losses. The central idea is to use association rules in marketing in combination with e-commerce.
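The two association rule measures named above can be computed directly from a list of transactions. This is a minimal sketch with hypothetical basket data and function names; it shows the standard definitions of support and confidence, not the authors' offer-generation system.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for basket in transactions if itemset <= basket)


def support(transactions, itemset):
    """Fraction of transactions that contain `itemset`."""
    return count_containing(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    antecedent_count = count_containing(transactions, antecedent)
    if antecedent_count == 0:
        raise ValueError("antecedent never occurs")
    return count_containing(transactions, antecedent | consequent) / antecedent_count


# Hypothetical baskets: "jam" plays the role of the low-demand item.
baskets = [
    {"bread", "milk"},
    {"bread", "jam"},
    {"bread", "milk", "jam"},
    {"milk"},
]

bread_support = support(baskets, {"bread"})
bundle_support = support(baskets, {"bread", "jam"})
bread_to_jam = confidence(baskets, {"bread"}, {"jam"})
print(bread_support, bundle_support, bread_to_jam)
```

A candidate offer that pairs a low-demand item with a frequent one could be ranked by these two measures, for example by preferring bundles whose rule confidence is highest among those above a minimum support.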

Keywords: data mining, association rules, confidence, online stores

Procedia PDF Downloads 381

3987 Statistical and Analytical Comparison of GIS Overlay Modelings: An Appraisal on Groundwater Prospecting in Precambrian Metamorphics

Authors: Tapas Acharya, Monalisa Mitra

Abstract:

Overlay modeling is the most widely used conventional analysis for spatial decision support systems. It requires a set of themes with different weightages computed in varied manners, which gives a resultant input for further integrated analysis. Despite its popularity and wide use, it gives inconsistent and erroneous results for similar inputs when processed with different GIS overlay techniques. This study attempts to compare and analyse the differences in the outputs of different overlay methods on a GIS platform, using the same set of themes, for groundwater prospecting in Precambrian metamorphic rocks. The objective of the study is to identify the most suitable overlay method for groundwater prospecting in older Precambrian metamorphics. Seven input thematic layers, namely slope, Digital Elevation Model (DEM), soil thickness, lineament intersection density, average groundwater table fluctuation, stream density, and lithology, have been used in the fuzzy overlay, weighted overlay, and weighted sum overlay models to delineate suitable groundwater prospective zones. Spatial concurrence analysis with high-yielding wells of the study area and statistical comparison of the outputs of the various overlay models using RStudio reveal that the Weighted Overlay model is the most efficient GIS overlay model for delineating groundwater prospecting zones in the Precambrian metamorphic rocks.

Keywords: fuzzy overlay, GIS overlay model, groundwater prospecting, Precambrian metamorphics, weighted overlay, weighted sum overlay

Procedia PDF Downloads 99

3986 Humeral Head and Scapula Detection in Proton Density Weighted Magnetic Resonance Images Using YOLOv8

Authors: Aysun Sezer

Abstract:

Magnetic Resonance Imaging (MRI) is one of the advanced diagnostic tools for evaluating shoulder pathologies. Proton Density (PD)-weighted MRI sequences prove highly effective in detecting edema. However, they are deficient in the anatomical identification of bones due to a trauma-induced decrease in signal-to-noise ratio and blur in the traumatized cortices. Computer-based diagnostic systems require precise segmentation, identification, and localization of anatomical regions in medical imagery. Deep learning-based object detection algorithms exhibit remarkable proficiency in real-time object identification and localization. In this study, the YOLOv8 model was employed to detect humeral head and scapular regions in 665 axial PD-weighted MR images. The YOLOv8 configuration achieved an overall success rate of 99.60% and 89.90% for detecting the humeral head and scapula, respectively, with an intersection over union (IoU) of 0.5. Our findings indicate a significant promise of employing YOLOv8-based detection for the humerus and scapula regions, particularly in the context of PD-weighted images affected by both noise and intensity inhomogeneity.
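The intersection over union (IoU) criterion used above compares a predicted bounding box with a ground-truth box. The sketch below shows the standard computation for axis-aligned boxes given as (x_min, y_min, x_max, y_max); the function names and example boxes are hypothetical, and the 0.5 cut-off simply mirrors the value quoted in the abstract.

```python
def intersection_over_union(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_width = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_height = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_width <= 0 or inter_height <= 0:
        return 0.0  # The boxes do not overlap.
    intersection = inter_width * inter_height
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / (area_a + area_b - intersection)


def is_correct_detection(predicted, ground_truth, iou_threshold=0.5):
    """Count a prediction as correct when its IoU reaches the threshold."""
    return intersection_over_union(predicted, ground_truth) >= iou_threshold


# Hypothetical ground-truth and predicted boxes in pixel coordinates.
truth = (10, 10, 50, 50)
prediction = (20, 10, 60, 50)
print(intersection_over_union(prediction, truth))
print(is_correct_detection(prediction, truth))
```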

Keywords: YOLOv8, object detection, humerus, scapula, MRI

Procedia PDF Downloads 35

3985 Exercise Training for Management Hypertensive Patients: A Systematic Review and Meta-Analysis

Authors: Noor F. Ilias, Mazlifah Omar, Hashbullah Ismail

Abstract:

Exercise training has been shown to improve functional capacity and is recommended as a therapy for the management of blood pressure. Our purpose was to establish whether different exercise capacities produce different effect sizes for cardiorespiratory fitness (CRF) and systolic (SBP) and diastolic (DBP) blood pressure in patients with hypertension. Exercise characteristics must be specified in order to obtain optimal benefit from training, but the optimal exercise capacity has not yet been established. A MEDLINE search (1985 to 2015) was conducted for exercise-based rehabilitation trials in hypertensive patients. Thirty-seven studies met the selection criteria. Of these, 31 (83.7%) involved aerobic exercise and 6 (16.3%) aerobic with additional resistance exercise, providing a total of 1318 exercise subjects and 819 controls, for 2137 subjects overall. We calculated exercise volume and energy expenditure from the description of exercise characteristics. Four studies (18.2%) were 451 kcal – 900 kcal, 12 (54.5%) were 900 kcal – 1350 kcal, and 6 (27.3%) were >1351 kcal per week. Peak oxygen consumption (peak VO2) increased by a mean difference of 1.44 ml/kg/min (95% confidence interval [CI]: 1.08 to 1.79 ml/kg/min; p = 0.00001) with a weighted mean of 21.2% for aerobic exercise, compared with 4.50 ml/kg/min (95% confidence interval [CI]: 3.57 to 5.42 ml/kg/min; p = 0.00001) with a weighted mean of 14.5% for aerobic with additional resistance exercise. SBP was clinically reduced for both aerobic and aerobic with resistance training, by mean differences of -4.66 mmHg (95% confidence interval [CI]: -5.68 to -3.63 mmHg; p = 0.00001), a weighted mean reduction of 6%, and -5.06 mmHg (95% confidence interval [CI]: -7.32 to -2.8 mmHg; p = 0.0001), a weighted mean reduction of 5%, respectively.
DBP was clinically reduced for aerobic exercise by a mean difference of -1.62 mmHg (95% confidence interval [CI]: -2.09 to -1.15 mmHg; p = 0.00001), a weighted mean reduction of 4%, and for aerobic with resistance training by a mean difference of -3.26 mmHg (95% confidence interval [CI]: -4.87 to -1.65 mmHg; p = 0.0001), a weighted mean reduction of 6%. An exercise capacity of 451 kcal – 900 kcal showed greater improvement in peak VO2 and SBP, by 2.76 ml/kg/min (95% confidence interval [CI]: 1.47 to 4.05 ml/kg/min; p = 0.0001) with a weighted mean of 40.6% and -16.66 mmHg (95% confidence interval [CI]: -21.72 to -11.60 mmHg; p = 0.00001) with a weighted mean of 9.8%, respectively. Our data demonstrate that aerobic exercise with a total energy expenditure of 451 kcal – 900 kcal per week may elicit greater changes in cardiorespiratory fitness and blood pressure in hypertensive patients. A higher weekly exercise capacity does not seem to give better results in the management of hypertensive patients.

Keywords: blood pressure, exercise, hypertension, peak VO2

Procedia PDF Downloads 257

3984 The Development of Micro Patterns Using Benchtop Lithography for Marine Antifouling Applications

Authors: Felicia Wong Yen Myan, James Walker

Abstract:

Development of micro topographies usually begins with the fabrication of a master stamp. Fabrication of such small structures can be technically challenging and expensive. These techniques are often used for applications where patterns only cover a small surface area (e.g. semiconductors, microfluidic channels). This research investigated the use of benchtop lithography to fabricate patterns with average widths of 50 and 100 microns on silicon wafer substrates. Further development of this method will attempt to layer patterns to create hierarchical structures. Photomasks consisted of patterns printed onto transparency films with a high resolution printer and a fully patterned 10cm by 10cm area has been successfully developed. UV exposure was carried out with a self-made array of ultraviolet LEDs that was positioned a distance above a glass diffuser. Observations under a light microscope and SEM showed that developed patterns exhibit an adequate degree of fidelity with patterns from the master stamp.

Keywords: lithography, antifouling, marine, microtopography

Procedia PDF Downloads 258

3983 Discussion about Frequent Adjustment of Urban Master Planning in China: A Case Study of Changshou District, Chongqing City

Authors: Sun Ailu, Zhao Wanmin

Abstract:

Since the reform and opening-up, China's urbanization has entered a period of rapid development. In recent years, the authors participated in several urban master planning projects in China and found that rapidly urbanizing areas are undergoing frequent adjustments of their urban master plans. This phenomenon is not a natural part of urbanization. It may be caused by the differing roles of governments at different levels. Through investigation, data comparison, and case study, this paper explores why rapidly urbanizing areas experience frequent adjustment of master planning and proposes some solution strategies. First, taking Changshou district of Chongqing city as an example, the paper introduces the phenomenon of frequent adjustment in China. It then discusses the distinct roles of the national, provincial, and local governments in the process. Finally, it puts forward preliminary solution strategies for such areas in terms of land use, intergovernmental cooperation, and other aspects.

Keywords: urban master planning, frequent adjustment, urbanization development, problems and strategies, China

Procedia PDF Downloads 333

3982 Automatic Seizure Detection Using Weighted Permutation Entropy and Support Vector Machine

Authors: Noha Seddik, Sherine Youssef, Mohamed Kholeif

Abstract:

The research field of automated epileptic seizure detection has emerged in recent years; it involves analyzing Electroencephalogram (EEG) signals automatically instead of relying on the traditional visual inspection performed by expert neurologists. In this study, a Support Vector Machine (SVM) that uses Weighted Permutation Entropy (WPE) as the input feature is proposed for classifying normal and seizure EEG records. WPE is a modified form of the permutation entropy (PE), a statistical parameter that measures the complexity and irregularity of a time series. It incorporates both the mapped ordinal pattern of the time series and the information contained in the amplitude of its sample points. The proposed system utilizes the fact that entropy-based measures for EEG segments during an epileptic seizure are lower than in normal EEG.
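To make the feature concrete, the sketch below computes a weighted permutation entropy for a one-dimensional signal. It assumes the commonly used formulation in which each embedded window contributes to its ordinal pattern with a weight equal to the window's variance; the abstract does not state the exact weighting, embedding order, or delay used, so those choices and all names here are assumptions.

```python
import math
from collections import defaultdict


def weighted_permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Weighted permutation entropy of a sequence of numbers.

    Each window of `order` samples (spaced by `delay`) is mapped to its
    ordinal pattern and weighted by its variance (an assumed weighting).
    """
    n_windows = len(signal) - (order - 1) * delay
    if order < 2 or n_windows <= 0:
        raise ValueError("signal too short for the chosen order and delay")

    pattern_weight = defaultdict(float)
    total_weight = 0.0
    for start in range(n_windows):
        window = [signal[start + k * delay] for k in range(order)]
        mean = sum(window) / order
        weight = sum((value - mean) ** 2 for value in window) / order
        # Ordinal pattern: indices of the window sorted by amplitude.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        pattern_weight[pattern] += weight
        total_weight += weight

    if total_weight == 0.0:
        return 0.0  # A constant signal carries no amplitude information.

    entropy = 0.0
    for weight in pattern_weight.values():
        if weight > 0.0:
            probability = weight / total_weight
            entropy -= probability * math.log(probability)
    if normalize:
        entropy /= math.log(math.factorial(order))
    return entropy


# Hypothetical signals: a regular ramp and an irregular sequence.
ramp_entropy = weighted_permutation_entropy(list(range(20)))
irregular_entropy = weighted_permutation_entropy([4, 7, 9, 10, 6, 11, 3, 5, 8, 2])
print(ramp_entropy, irregular_entropy)
```

In a pipeline like the one described, a value of this kind computed per EEG segment would serve as the input feature to the classifier; the SVM stage itself is not sketched here.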

Keywords: electroencephalogram (EEG), epileptic seizure detection, weighted permutation entropy (WPE), support vector machine (SVM)

Procedia PDF Downloads 341

3981 Efficient Recommendation System for Frequent and High Utility Itemsets over Incremental Datasets

Authors: J. K. Kavitha, D. Manjula, U. Kanimozhi

Abstract:

Mining frequent and high utility item sets has gained much significance in recent years. When data arrive sporadically, incremental and interactive rule mining and utility mining approaches can be adopted to handle users' dynamic environmental needs and avoid redundancies by reusing previous data structures and mining results. Dependence on recommendation systems has risen exponentially since the advent of search engines. This paper proposes a model for building a recommendation system that suggests frequent and high utility item sets over dynamic datasets for a cluster-based location prediction strategy that predicts users' trajectories using the Efficient Incremental Rule Mining (EIRM) algorithm and the Fast Update Utility Pattern Tree (FUUP) algorithm. Comprehensive experimental evaluations show that this scheme delivers excellent performance.

Keywords: data sets, recommendation system, utility item sets, frequent item sets mining

Procedia PDF Downloads 271

3980 Using Closed Frequent Itemsets for Hierarchical Document Clustering

Authors: Cheng-Jhe Lee, Chiun-Chieh Hsu

Abstract:

Due to the rapid development of the Internet and the increased availability of digital documents, the excessive information on the Internet has led to an information overflow problem. To address this problem and support effective information retrieval, document clustering in text mining has become a popular research topic. Clustering is the unsupervised classification of data items into groups without the need for training data. Many conventional document clustering methods perform inefficiently on large document collections because they were originally designed for relational databases. They are therefore impractical for real-world document clustering, which requires special handling for high dimensionality and high volume. We build on the FIHC (Frequent Itemset-based Hierarchical Clustering) method, a hierarchical clustering method developed for document clustering, whose intuition is that there exist some common words for each cluster. FIHC uses such words to cluster documents and builds a hierarchical topic tree. In this paper, we combine the FIHC algorithm with an ontology to solve the semantic problem and mine the meaning behind the words in documents. Furthermore, we use closed frequent itemsets instead of only frequent itemsets, which increases efficiency and scalability. The experimental results show that our method is more accurate than well-known document clustering algorithms.
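A frequent itemset is closed when no proper superset occurs in the same number of transactions, so the closed itemsets form a smaller set that still preserves all support counts. The sketch below illustrates that definition by brute-force enumeration on a few hypothetical documents treated as sets of words. It is meant only for tiny examples and is not the FIHC algorithm or the mining procedure used in the paper.

```python
from itertools import combinations


def frequent_itemsets(documents, min_count):
    """Map each itemset occurring in at least `min_count` documents to its count.

    Brute-force enumeration, suitable only for very small vocabularies.
    """
    vocabulary = sorted(set().union(*documents))
    frequent = {}
    for size in range(1, len(vocabulary) + 1):
        for candidate in combinations(vocabulary, size):
            itemset = frozenset(candidate)
            count = sum(1 for words in documents if itemset <= words)
            if count >= min_count:
                frequent[itemset] = count
    return frequent


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same count."""
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
    }


# Hypothetical documents represented as sets of words.
docs = [
    {"data", "mining", "cluster"},
    {"data", "mining"},
    {"data", "mining", "cluster"},
    {"data", "web"},
]

frequent = frequent_itemsets(docs, min_count=2)
closed = closed_itemsets(frequent)
print(len(frequent), len(closed))
```

In this example, seven frequent itemsets reduce to three closed ones, which illustrates why working with closed itemsets can reduce the number of candidate cluster labels.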

Keywords: FIHC, documents clustering, ontology, closed frequent itemset

Procedia PDF Downloads 367

3979 Distribution Patterns of the Renieramycin-M-Producing Blue Sponge, Xestospongia sp. (De Laubenfels, 1932) (Phylum: Porifera, Class: Demospongiae) in Puerto Galera, Oriental Mindoro, Philippines

Authors: Geminne Manzano, Clairecynth Yu, Lilibeth Salvador-Reyes, Viviene Santiago, Porfirio Aliño

Abstract:

The distribution and abundance patterns of many marine sessile organisms, such as sponges, vary among and within reefs. Determining the factors affecting their distribution is essential, especially for organisms that produce secondary metabolites of pharmaceutical importance. In this study, the small-scale distribution patterns of the Philippine blue sponge, Xestospongia sp., were examined in relation to some ecological factors. The relationship between renieramycin-M production and the sponges' benthic attributes was also determined. Ecological surveys were conducted at two stations of varying depth and exposure located in Oriental Mindoro, Philippines. Three 30 m by 6 m belt transects were used to assess sponge abundance at each station. The substratum of the sponges was also characterized. Fish visual census observations were taken together with photo-transect benthic surveys. Sponge samples were collected for the extraction of renieramycin-M and for further chemical analysis. The varying distribution patterns observed were attributed to a combination of different ecological and environmental factors. The amount of renieramycin-M produced also varied between stations. The common substrata for blue sponges include hard and soft corals as well as dead coral with algal patches. Blue sponges from exposed habitats frequently grow in association with massive and branching corals, Porites sp., while the most frequent substrate in sheltered habitats is the coral Pavona sp. Exploring the influence of ecological and environmental parameters on the abundance and distribution of sponge assemblages provides ecological insights with potential applications to pharmaceutical studies. The results of this study provide further impetus for studies of the patterns and processes of the distribution of the Philippine blue sponge, Xestospongia sp., in relation to the chemical ecology of its secondary metabolites.

Keywords: distribution patterns, Porifera, Renieramycin-M, sponge assemblages, Xestospongia sp.

Procedia PDF Downloads 245

3978 Reliability and Probability Weighted Moment Estimation for Three Parameter Mukherjee-Islam Failure Model

Authors: Ariful Islam, Showkat Ahmad Lone

Abstract:

The Mukherjee-Islam model is commonly used as a simple lifetime distribution to assess system reliability. As shown by various authors, the model fits failure data well and provides more appropriate information about the hazard rate and other reliability measures. A location parameter (i.e., a time before which failure cannot occur) can be introduced, which makes the model a more useful failure distribution than existing ones. Even after the location of the distribution is shifted, it can represent decreasing, constant, and increasing failure rates. It has been shown to represent the appropriate lower tail of the distribution of random variables that have a fixed lower bound. This study presents the reliability computations and probability weighted moment estimation for the three-parameter model. A comparative analysis is carried out between the three-parameter finite-range model and some existing bathtub-shaped curve-fitting models. Because the probability weighted moment method is used, the results obtained can also be applied to small-sample cases. The maximum likelihood estimation method is also applied in this study.
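As background for the estimation method named above, the sketch below computes sample probability weighted moments using the standard unbiased estimator of β_r = E[X F(X)^r]. It is a generic illustration under that assumption and does not reproduce the paper's three-parameter Mukherjee-Islam estimators; the function name and sample values are hypothetical.

```python
from math import comb
from typing import Sequence


def sample_pwm(data: Sequence[float], r: int) -> float:
    """Unbiased sample estimate b_r of the probability weighted moment E[X * F(X)**r].

    b_r = (1/n) * sum_{i=1..n} C(i-1, r) / C(n-1, r) * x_(i),
    where x_(1) <= ... <= x_(n) are the order statistics.
    """
    x = sorted(data)
    n = len(x)
    if not 0 <= r < n:
        raise ValueError("r must satisfy 0 <= r < len(data)")
    denom = comb(n - 1, r)
    return sum(comb(i - 1, r) / denom * x[i - 1] for i in range(1, n + 1)) / n


failure_times = [4.0, 1.0, 3.0, 2.0]  # hypothetical observations
b0 = sample_pwm(failure_times, 0)     # equals the sample mean
b1 = sample_pwm(failure_times, 1)
```

Parameter estimates are then obtained by equating such sample moments to their theoretical counterparts for the chosen distribution, one equation per unknown parameter.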

Keywords: comparative analysis, maximum likelihood estimation, Mukherjee-Islam failure model, probability weighted moment estimation, reliability

Procedia PDF Downloads 244

3977 Trends and Inequalities in Distance to and Use of Nearest Natural Space in the Context of the 20-Minute Neighbourhood: A 4-Wave National Repeat Cross-sectional Study, 2013 to 2019

Authors: Jonathan R. Olsen, Natalie Nicholls, Jenna Panter, Hannah Burnett, Michael Tornow, Richard Mitchell

Abstract:

The 20-minute neighbourhood is a policy priority for governments worldwide, and a key feature of this policy is providing access to natural space within 800 metres of home. The study aims were to (1) examine the association between distance to nearest natural space and frequent use over time and (2) examine whether frequent use and changes in use were patterned by income and housing tenure over time. Bi-annual Scottish Household Survey data were obtained for 2013 to 2019 (n = 42,128, aged 16+). Adults were asked the walking distance to their nearest natural space, the frequency of visits to this space, and their housing tenure, as well as age, sex, and income. We examined the association between distance from home to the nearest natural space, housing tenure, and the likelihood of frequent natural space use (visited once a week or more). Two-way interaction terms were further applied to explore variation in the association between tenure and frequent natural space use over time. We found that 87% of respondents lived within a 10-minute walk of a natural space, meeting the policy specification for a 20-minute neighbourhood. Greater proximity to natural space was associated with increased use; individuals living a 6- to 10-minute walk and over a 10-minute walk away were respectively 53% and 78% less likely to report frequent natural space use than those living within a 5-minute walk. Housing tenure was an important predictor of frequent natural space use; private renters and homeowners were more likely to report frequent natural space use than social renters. Our findings provide evidence that proximity to natural space is a strong predictor of frequent use. Our study also provides important evidence that time-based access measures alone do not account for deep-rooted socioeconomic variation in use of natural space. Policy makers should ensure a nuanced lens is applied to operationalising and monitoring the 20-minute neighbourhood to safeguard against exacerbating existing inequalities.

Keywords: natural space, housing, inequalities, 20-minute neighbourhood, urban design

Procedia PDF Downloads 83

3976 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression

Authors: Wanatchapong Kongkaew

Abstract:

This paper proposes an application of a probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence for the single machine total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is used to predict an optimal sequence for the SMTWT problem, and its solution is improved by an iterated local search based on a simulated annealing scheme, called the GPRISA algorithm. The results show that the proposed GPRISA method achieves very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, when compared on deviation from the best-known solution, the proposed mechanism noticeably outperforms recent existing approaches.
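The objective that any SMTWT method tries to minimise can be sketched in a few lines. The code below only evaluates the total weighted tardiness of a given job sequence and, for a tiny instance, finds the best sequence by exhaustive search; it is not the GPRISA algorithm. The job data and function names are hypothetical.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def total_weighted_tardiness(sequence: Sequence[int],
                             processing: Sequence[int],
                             due: Sequence[int],
                             weight: Sequence[int]) -> int:
    """Sum of weight[j] * max(0, completion_time[j] - due[j]) for jobs run in the given order."""
    clock = 0
    total = 0
    for job in sequence:
        clock += processing[job]                      # completion time of this job
        total += weight[job] * max(0, clock - due[job])
    return total


def brute_force_best(processing: Sequence[int],
                     due: Sequence[int],
                     weight: Sequence[int]) -> Tuple[Tuple[int, ...], int]:
    """Try every permutation; only feasible for very small instances."""
    jobs: List[int] = list(range(len(processing)))
    best = min(permutations(jobs),
               key=lambda seq: total_weighted_tardiness(seq, processing, due, weight))
    return best, total_weighted_tardiness(best, processing, due, weight)


# Hypothetical three-job instance.
processing = [3, 2, 4]
due = [4, 2, 10]
weight = [2, 1, 3]
print(total_weighted_tardiness([0, 1, 2], processing, due, weight))  # 3
print(brute_force_best(processing, due, weight))                     # ((1, 0, 2), 2)
```

Because the number of permutations grows factorially with the number of jobs, exhaustive search stops being practical very quickly, which is what motivates predictive and local-search heuristics for this problem.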

Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness

Procedia PDF Downloads 279

3975 A Weighted Approach to Unconstrained Iris Recognition

Authors: Yao-Hong Tsai

Abstract:

This paper presents a weighted approach to unconstrained iris recognition. Commercial systems are usually characterized by strong acquisition constraints that depend on the subject’s cooperation, which is not always achievable in real daily-life scenarios. Researchers have therefore focused on reducing these constraints while maintaining system performance through new techniques. To cope with large variation in the environment, the proposed iris recognition system introduces two main improvements. First, to handle extremely uneven lighting conditions, statistics-based illumination normalization is applied to the eye region to increase the accuracy of the iris features. Detection of the iris image is based on the AdaBoost algorithm. Second, the weighted approach is designed using Gaussian functions of the distance to the center of the iris. A local binary pattern (LBP) histogram is then applied, with these weights, to texture classification. Experiments showed that the proposed system gives users a more flexible and feasible way to interact with the verification system through iris recognition.
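One possible reading of the Gaussian-weighted LBP step is sketched below. It assumes the basic 8-neighbour LBP operator, a hypothetical clockwise bit order, and a weight exp(-d²/(2σ²)) for a pixel at distance d from the iris center; the abstract does not specify these details, so this illustrates the idea rather than the authors' method.

```python
import math
from typing import List, Sequence, Tuple

# Clockwise neighbour offsets starting at the top-left pixel (an assumed bit order).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Sequence[Sequence[int]], r: int, c: int) -> int:
    """Basic 8-neighbour LBP code: bit k is set when neighbour k >= the center pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def weighted_lbp_histogram(img: Sequence[Sequence[int]],
                           center: Tuple[int, int],
                           sigma: float) -> List[float]:
    """256-bin LBP histogram where each pixel votes with a Gaussian weight
    that decays with its distance from the given center."""
    hist = [0.0] * 256
    cr, cc = center
    for r in range(1, len(img) - 1):          # skip the border: it lacks 8 neighbours
        for c in range(1, len(img[0]) - 1):
            dist_sq = (r - cr) ** 2 + (c - cc) ** 2
            hist[lbp_code(img, r, c)] += math.exp(-dist_sq / (2.0 * sigma ** 2))
    return hist


patch = [[9, 9, 9],
         [0, 5, 0],
         [0, 0, 0]]
print(lbp_code(patch, 1, 1))  # 7
```

Two such histograms can then be compared with any histogram distance; pixels near the center dominate the comparison while distant pixels contribute less.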

Keywords: authentication, iris recognition, adaboost, local binary pattern

Procedia PDF Downloads 192

3974 X̄ and S Control Charts based on Weighted Standard Deviation Method

Authors: Derya Karagöz

Abstract:

A Shewhart chart based on the normality assumption is not appropriate for skewed distributions because its Type-I error rate is inflated. This study presents X̄ and S control charts for monitoring process variability under skewed distributions. We propose Weighted Standard Deviation (WSD) X̄ and S control charts. In the WSD X̄ and S control charts, a standard deviation estimator is used to estimate the process standard deviation because it is simple and easy to compute. Unlike the Shewhart control chart, the proposed charts provide asymmetric upper and lower limits in accordance with the direction and degree of skewness. The performance of the proposed charts is compared with that of other heuristic charts for skewed distributions using a simulation study. The simulation study shows that the proposed control charts have good properties for skewed distributions and large sample sizes.

Keywords: weighted standard deviation, MAD, skewed distributions, S control charts

Procedia PDF Downloads 368

3973 Investigating the Morphological Patterns of Lip Prints and Their Effectiveness in Individualization and Gender Determination in Pakistani Population

Authors: Makhdoom Saad Wasim Ghouri, Muneeba Butt, Mohammad Ashraf Tahir, Rashid Bhatti, Akbar Ali, Abdul Rehman, Abdul Basit, Muzzamel Rehman, Shahbaz Aslam, Farakh Mansoor, Ahmad Fayyaz, Hadia Siddiqui

Abstract:

Lip print analysis (cheiloscopy) is an emerging technique that may prove valuable in establishing personal identity. Cheiloscopy is the study of the elevations and depressions present on the external surface of the lips. In our study, 600 lip print samples were taken (300 males and 300 females). The lip prints of each individual were divided into four quadrants and the upper middle portion. For general classification, the middle part of the lower lip, approximately 10 mm wide, was taken into consideration. Our analysis shows that lip prints are a unique and permanent characteristic of every individual. No two lip prints matched each other, even those of identical twins. Our study reveals an equal distribution of lip print patterns among all four quadrants of the lips and the upper middle portion; these distributions were analyzed with the chi-square test, which gave significant results. In the general classification, five lip print types/patterns were studied: Type 1 (Vertical lines), Type 2 (Branched pattern), Type 3 (Intersected pattern), Type 4 (Reticular pattern), and Type 5 (Undetermined). Type 1 and Type 2 were the most frequent patterns in the female population, while Type 3 and Type 4 were most common in the male population. These results were also analyzed with the chi-square test and were statistically significant, which supports sex determination on the basis of lip print types. Type 5 was the least common pattern in both sexes.

Keywords: cheiloscopy, distribution, quadrants, sex determination

Procedia PDF Downloads 256

3972 Hybrid Fuzzy Weighted K-Nearest Neighbor to Predict Hospital Readmission for Diabetic Patients

Authors: Soha A. Bahanshal, Byung G. Kim

Abstract:

Identification of patients at high risk of hospital readmission is of crucial importance for quality health care and cost reduction. Predicting hospital readmissions among diabetic patients has been of great interest to many researchers and health decision makers. We build a prediction model to predict hospital readmission for diabetic patients within 30 days of discharge. The core of the prediction model is a modified k Nearest Neighbor algorithm called the Hybrid Fuzzy Weighted k Nearest Neighbor algorithm. The prediction is performed on a patient dataset that consists of more than 70,000 patients with 50 attributes. We applied several data preprocessing techniques to handle data imbalance and to fuzzify the data to suit the prediction algorithm. The model has so far achieved a classification accuracy of 80%, compared with other models that use only k Nearest Neighbor.
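To show how weighting changes a k Nearest Neighbor vote, here is a minimal sketch of plain inverse-distance weighting. It is not the Hybrid Fuzzy Weighted algorithm described in the abstract, whose fuzzy membership step is not specified here; the feature vectors, labels, and function name are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], Hashable]


def weighted_knn_predict(train: List[Sample],
                         query: Sequence[float],
                         k: int = 3,
                         eps: float = 1e-9) -> Hashable:
    """Predict the label whose k nearest neighbours carry the largest
    total inverse-distance weight 1 / (distance + eps)."""
    by_distance = sorted(train, key=lambda sample: math.dist(sample[0], query))
    votes: Dict[Hashable, float] = defaultdict(float)
    for features, label in by_distance[:k]:
        votes[label] += 1.0 / (math.dist(features, query) + eps)
    return max(votes, key=votes.get)


# Hypothetical two-feature patients; the labels are illustrative only.
train = [((0.0, 0.0), "no_readmit"),
         ((3.0, 0.0), "readmit"),
         ((0.0, 3.0), "readmit")]
print(weighted_knn_predict(train, (0.5, 0.0), k=3))  # no_readmit
```

With k = 3 an unweighted majority vote would return "readmit" (two votes to one), while the weighted vote favours the single much closer neighbour.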

Keywords: machine learning, prediction, classification, hybrid fuzzy weighted k-nearest neighbor, diabetic hospital readmission

Procedia PDF Downloads 156

3971 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification

Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike

Abstract:

Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased toward the majority class on imbalanced datasets because of the skewed class distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class while avoiding any reduction of majority observations in imbalanced dataset classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets: a cancer dataset, a German credit dataset, and a banknote dataset. Specificity, sensitivity, and accuracy were used to evaluate the performance of the proposed decision tree algorithm on these datasets. The evaluation results show that, for some of the weights of our proposed decision tree, specificity, sensitivity, and accuracy were better than those of the ID3 decision tree and of a decision tree induced with minority entropy for all three datasets.
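One common way to counter majority-class bias without discarding observations is to weight classes when computing node impurity. The sketch below uses a class-weighted entropy as an assumption for illustration; the abstract does not give the authors' ratio-based weighting formula, and the labels, weights, and function name are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Sequence


def weighted_entropy(labels: Sequence[Hashable],
                     class_weight: Dict[Hashable, float]) -> float:
    """Entropy in bits where each class's share is count * weight, renormalised to sum to 1."""
    counts = Counter(labels)
    mass = {cls: class_weight[cls] * n for cls, n in counts.items()}
    total = sum(mass.values())
    return -sum((m / total) * math.log2(m / total) for m in mass.values() if m > 0)


labels = ["majority"] * 9 + ["minority"] * 1

plain = weighted_entropy(labels, {"majority": 1.0, "minority": 1.0})
balanced = weighted_entropy(labels, {"majority": 1.0, "minority": 9.0})
print(round(plain, 3), round(balanced, 3))  # 0.469 1.0
```

With equal weights the node looks almost pure, so a tree has little incentive to split off the minority class; up-weighting the minority class makes the node maximally impure, which pushes the split criterion to separate it.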

Keywords: data mining, decision tree, classification, imbalance dataset

Procedia PDF Downloads 93