Search results for: Cluster Ensemble
51 Mining Genes Relations in Microarray Data Combined with Ontology in Colon Cancer Automated Diagnosis System
Authors: A. Gruzdz, A. Ihnatowicz, J. Siddiqi, B. Akhgar
Abstract:
MATCH project [1] entitle the development of an automatic diagnosis system that aims to support treatment of colon cancer diseases by discovering mutations that occurs to tumour suppressor genes (TSGs) and contributes to the development of cancerous tumours. The constitution of the system is based on a) colon cancer clinical data and b) biological information that will be derived by data mining techniques from genomic and proteomic sources The core mining module will consist of the popular, well tested hybrid feature extraction methods, and new combined algorithms, designed especially for the project. Elements of rough sets, evolutionary computing, cluster analysis, self-organization maps and association rules will be used to discover the annotations between genes, and their influence on tumours [2]-[11]. The methods used to process the data have to address their high complexity, potential inconsistency and problems of dealing with the missing values. They must integrate all the useful information necessary to solve the expert's question. For this purpose, the system has to learn from data, or be able to interactively specify by a domain specialist, the part of the knowledge structure it needs to answer a given query. The program should also take into account the importance/rank of the particular parts of data it analyses, and adjusts the used algorithms accordingly.Keywords: Bioinformatics, gene expression, ontology, selforganizingmaps.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 197450 Analyzing The Effect of Variable Round Time for Clustering Approach in Wireless Sensor Networks
Authors: Vipin Pal, Girdhari Singh, R P Yadav
Abstract:
As wireless sensor networks are energy constraint networks so energy efficiency of sensor nodes is the main design issue. Clustering of nodes is an energy efficient approach. It prolongs the lifetime of wireless sensor networks by avoiding long distance communication. Clustering algorithms operate in rounds. Performance of clustering algorithm depends upon the round time. A large round time consumes more energy of cluster heads while a small round time causes frequent re-clustering. So existing clustering algorithms apply a trade off to round time and calculate it from the initial parameters of networks. But it is not appropriate to use initial parameters based round time value throughout the network lifetime because wireless sensor networks are dynamic in nature (nodes can be added to the network or some nodes go out of energy). In this paper a variable round time approach is proposed that calculates round time depending upon the number of active nodes remaining in the field. The proposed approach makes the clustering algorithm adaptive to network dynamics. For simulation the approach is implemented with LEACH in NS-2 and the results show that there is 6% increase in network lifetime, 7% increase in 50% node death time and 5% improvement over the data units gathered at the base station.Keywords: Wireless Sensor Network, Clustering, Energy Efficiency, Round Time.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 178749 A Codebook-based Redundancy Suppression Mechanism with Lifetime Prediction in Cluster-based WSN
Authors: Huan Chen, Bo-Chao Cheng, Chih-Chuan Cheng, Yi-Geng Chen, Yu Ling Chou
Abstract:
Wireless Sensor Network (WSN) comprises of sensor nodes which are designed to sense the environment, transmit sensed data back to the base station via multi-hop routing to reconstruct physical phenomena. Since physical phenomena exists significant overlaps between temporal redundancy and spatial redundancy, it is necessary to use Redundancy Suppression Algorithms (RSA) for sensor node to lower energy consumption by reducing the transmission of redundancy. A conventional algorithm of RSAs is threshold-based RSA, which sets threshold to suppress redundant data. Although many temporal and spatial RSAs are proposed, temporal-spatial RSA are seldom to be proposed because it is difficult to determine when to utilize temporal or spatial RSAs. In this paper, we proposed a novel temporal-spatial redundancy suppression algorithm, Codebookbase Redundancy Suppression Mechanism (CRSM). CRSM adopts vector quantization to generate a codebook, which is easily used to implement temporal-spatial RSA. CRSM not only achieves power saving and reliability for WSN, but also provides the predictability of network lifetime. Simulation result shows that the network lifetime of CRSM outperforms at least 23% of that of other RSAs.Keywords: Redundancy Suppression Algorithm (RSA), Threshold-based RSA, Temporal RSA, Spatial RSA and Codebookbase Redundancy Suppression Mechanism (CRSM)
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 144048 Socio-Demographic Status and Arrack Drinking Patterns among Muslim, Hindu, Santal and Oraon Communities in Rasulpur Union,Bangladesh: A Cross-Cultural Perspective
Authors: Md. Emaj Uddin
Abstract:
Arrack is one of the forms of alcoholic beverage or liquor which is produced from palm or date juice and commonly consumed by the lower social class of all religious/ethnic communities in the north-western villages of Bangladesh. The purpose of the study was to compare arrack drinking patterns associated with socio-demographic status among the Muslim, Hindu, Santal, and Oraon communities in the Rasulpur union of Bangladesh. A total of 391 respondents (Muslim n-109, Hindu n-103, Santal n-89, Oraon n-90) selected by cluster random sampling were interviewed by ADP (Arrack Drinking Pattern) questionnaire. The results of Pearson Chi-Squire test revealed that arrack drinking patterns were significantly differed among the Muslim, Hindu, Santal, and Oraon communities- drinkers. In addition, the results of Spearman-s bivariate correlation coefficients also revealed that sociodemographic characteristics of the communities- drinkers were the significantly positive and negative associations with the arrack drinking patterns in the Rasulpur union, Bangladesh. The study suggests that further cross-cultural researches should be conducted on the consequences of arrack drinking patterns on the communities- drinkers.Keywords: Arrack Drinking Patterns, Bangladesh, Community, Cross-Cultural Comparison, Socio-Demographic Status.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 189847 Microseismicity of the Tehran Region Based on Three Seismic Networks
Authors: Jamileh Vasheghani Farahani
Abstract:
The main purpose of this research is to show the current active faults and active tectonic of the area by three seismic networks in Tehran region: 1-Tehran Disaster Mitigation and Management Organization (TDMMO), 2-Broadband Iranian National Seismic Network Center (BIN), 3-Iranian Seismological Center (IRSC). In this study, we analyzed microearthquakes happened in Tehran city and its surroundings using the Tehran networks from 1996 to 2015. We found some active faults and trends in the region. There is a 200-year history of historical earthquakes in Tehran. Historical and instrumental seismicity show that the east of Tehran is more active than the west. The Mosha fault in the North of Tehran is one of the active faults of the central Alborz. Moreover, other major faults in the region are Kahrizak, Eyvanakey, Parchin and North Tehran faults. An important seismicity region is an intersection of the Mosha and North Tehran fault systems (Kalan village in Lavasan). This region shows a cluster of microearthquakes. According to the historical and microseismic events analyzed in this research, there is a seismic gap in SE of Tehran. The empirical relationship is used to assess the Mmax based on the rupture length. There is a probability of occurrence of a strong motion of 7.0 to 7.5 magnitudes in the region (based on the assessed capability of the major faults such as Parchin and Eyvanekey faults and historical earthquakes).
Keywords: Iran, major faults, microseismicity, Tehran.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 151846 Modeling of Processes Running in Radical Clusters Formed by Ionizing Radiation with the Help of Continuous Petri Nets and Oxygen Effect
Authors: J. Barilla, M. Lokajíček, H. Pisaková, P. Simr
Abstract:
The final biological effect of ionizing particles may be influenced strongly by some chemical substances present in cells mainly in the case of low-LET radiation. The influence of oxygen may by particularly important because oxygen is always present in living cells. The corresponding processes are then running mainly in the chemical stage of radiobiological mechanism.
The radical clusters formed by densely ionizing ends of primary or secondary charged particles are mainly responsible for final biological effect. The damage effect depends then on radical concentration at a time when the cluster meets a DNA molecule. It may be strongly influenced by oxygen present in a cell as oxygen may act in different directions: at small concentration of it the interaction with hydrogen radicals prevails while at higher concentrations additional efficient oxygen radicals may be formed.
The basic radical concentration in individual clusters diminishes, which is influenced by two parallel processes: chemical reactions and diffusion of corresponding clusters. The given simultaneous evolution may be modeled and analyzed well with the help of Continuous Petri nets. The influence of other substances present in cells during irradiation may be studied, too. Some results concerning the impact of oxygen content will be presented.
Keywords: DSB formation, chemical stage, Petri nets, radiobiological mechanism.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 157445 Discovering Complex Regularities: from Tree to Semi-Lattice Classifications
Authors: A. Faro, D. Giordano, F. Maiorana
Abstract:
Data mining uses a variety of techniques each of which is useful for some particular task. It is important to have a deep understanding of each technique and be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find out a strategy to optimize classification by adding, moving or delete a neuron in order to change the number of classes. The tool is able to automatically suggest a strategy to optimize the number of classes optimization, but also support both tree classifications and semi-lattice organizations of the classes to give to the users the possibility of passing from one class to the ones with which it has some aspects in common. Examples of using tree and semi-lattice classifications are given to illustrate advantages and problems. The tool is applied to classify macroeconomic data that report the most developed countries- import and export. It is possible to classify the countries based on their economic behaviour and use the tool to characterize the commercial behaviour of a country in a selected class from the analysis of positive and negative features that contribute to classes formation. Possible interrelationships between the classes and their meaning are also discussed.Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, Cluster interpretation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 154244 A Bibliometric Assessment on Sustainability and Clustering
Authors: Fernanda M. Assef, Maria Teresinha A. Steiner, David Gabriel F. de Barros
Abstract:
Review researches are useful in terms of analysis of research problems. Between the types of review documents, we commonly find bibliometric studies. This type of application often helps the global visualization of a research problem and helps academics worldwide to understand the context of a research area better. In this document, a bibliometric view surrounding clustering techniques and sustainability problems is presented. The authors aimed at which issues mostly use clustering techniques and even which sustainability issue would be more impactful on today’s moment of research. During the bibliometric analysis, we found 10 different groups of research in clustering applications for sustainability issues: Energy; Environmental; Non-urban Planning; Sustainable Development; Sustainable Supply Chain; Transport; Urban Planning; Water; Waste Disposal; and, Others. Moreover, by analyzing the citations of each group, it was discovered that the Environmental group could be classified as the most impactful research cluster in the area mentioned. After the content analysis of each paper classified in the environmental group, it was found that the k-means technique is preferred for solving sustainability problems with clustering methods since it appeared the most amongst the documents. The authors finally conclude that a bibliometric assessment could help indicate a gap of researches on waste disposal – which was the group with the least amount of publications – and the most impactful research on environmental problems.
Keywords: Bibliometric assessment, clustering, sustainability, territorial partitioning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 39143 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features
Authors: Kyi Pyar Zaw, Zin Mar Kyu
Abstract:
Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters.
Keywords: Chain code frequency, character recognition, feature extraction, features matching, segmentation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 75342 A New Bound on the Average Information Ratio of Perfect Secret-Sharing Schemes for Access Structures Based On Bipartite Graphs of Larger Girth
Authors: Hui-Chuan Lu
Abstract:
In a perfect secret-sharing scheme, a dealer distributes a secret among a set of participants in such a way that only qualified subsets of participants can recover the secret and the joint share of the participants in any unqualified subset is statistically independent of the secret. The access structure of the scheme refers to the collection of all qualified subsets. In a graph-based access structures, each vertex of a graph G represents a participant and each edge of G represents a minimal qualified subset. The average information ratio of a perfect secret-sharing scheme realizing a given access structure is the ratio of the average length of the shares given to the participants to the length of the secret. The infimum of the average information ratio of all possible perfect secret-sharing schemes realizing an access structure is called the optimal average information ratio of that access structure. We study the optimal average information ratio of the access structures based on bipartite graphs. Based on some previous results, we give a bound on the optimal average information ratio for all bipartite graphs of girth at least six. This bound is the best possible for some classes of bipartite graphs using our approach.
Keywords: Secret-sharing scheme, average information ratio, star covering, deduction, core cluster.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 143441 Non-Overlapping Hierarchical Index Structure for Similarity Search
Authors: Mounira Taileb, Sid Lamrous, Sami Touati
Abstract:
In order to accelerate the similarity search in highdimensional database, we propose a new hierarchical indexing method. It is composed of offline and online phases. Our contribution concerns both phases. In the offline phase, after gathering the whole of the data in clusters and constructing a hierarchical index, the main originality of our contribution consists to develop a method to construct bounding forms of clusters to avoid overlapping. For the online phase, our idea improves considerably performances of similarity search. However, for this second phase, we have also developed an adapted search algorithm. Our method baptized NOHIS (Non-Overlapping Hierarchical Index Structure) use the Principal Direction Divisive Partitioning (PDDP) as algorithm of clustering. The principle of the PDDP is to divide data recursively into two sub-clusters; division is done by using the hyper-plane orthogonal to the principal direction derived from the covariance matrix and passing through the centroid of the cluster to divide. Data of each two sub-clusters obtained are including by a minimum bounding rectangle (MBR). The two MBRs are directed according to the principal direction. Consequently, the nonoverlapping between the two forms is assured. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and SRtree in processing k-nearest neighbors.
Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 156340 Family Communication Patterns between Muslim and Santal Communities in Rural Bangladesh: A Cross-Cultural Perspective
Authors: Md. Emaj Uddin
Abstract:
This study compares family communication patterns in association with family socio-cultural status, especially marriage and family pattern, and couples- socio-economic status between Muslim and Santal communities in rural Bangladesh. A total of 288 couples, 145 couples from the Muslim and 143 couples from the Santal were randomly selected through cluster sampling procedure from Kalna village situated in Tanore Upazila of Rajshahi district of Bangladesh, where both the communities dwell as neighbors. In order to collect data from the selected samples, interview method with semistructural questionnaire schedule was applied. The responses given by the respondents were analyzed by Pearson-s chi-squire test and bivariate correlation techniques. The results of Pearson-s chi-squire test revealed that family communication patterns (X2= 25. 90, df= 2, p<0.01, p>0.05) were significantly different between the Muslim and Santal communities. In addition, Spearman-s bivariate correlation coefficients suggested that among the exogenous factors, family type (rs=.135, p<0.05) and occupation of both husband (rs= .197, p<0.01) and wife (rs= .265, p<0.01) were significantly positive associations, and marital arrangement (rs= -.177, p<0.01), education of husband (rs= -.108, p<0.05) and wife (rs= -.142, p<0.01 & p<0.05), and family income (rs= -.164, p<0.01) were significantly negative relations with the family communication patterns followed between the two communities, although age difference between husband and wife, family head and residence patterns were not significant relations with ones.
Keywords: Bangladesh, Cross-Cultural Comparison, Family Communication Patterns, Family Socio-Cultural Status, Muslim, Santal.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 257639 Authenticity of Lipid and Soluble Sugar Profiles of Various Oat Cultivars (Avena sativa)
Authors: Marijana M. Ačanski, Kristian A. Pastor, Djura N. Vujić
Abstract:
The identification of lipid and soluble sugar components in flour samples of different cultivars belonging to common oat species (Avena sativa L.) was performed: spring oat, winter oat and hulless oat. Fatty acids were extracted from flour samples with n-hexane, and derivatized into volatile methyl esters, using TMSH (trimethylsulfonium hydroxide in methanol). Soluble sugars were then extracted from defatted and dried samples of oat flour with 96% ethanol, and further derivatized into corresponding TMS-oximes, using hydroxylamine hydrochloride solution and BSTFA (N,O-bis-(trimethylsilyl)-trifluoroacetamide). The hexane and ethanol extracts of each oat cultivar were analyzed using GC-MS system. Lipid and simple sugar compositions are very similar in all samples of investigated cultivars. Chemometric tool was applied to numeric values of automatically integrated surface areas of detected lipid and simple sugar components in their corresponding derivatized forms. Hierarchical cluster analysis shows a very high similarity between the investigated flour samples of oat cultivars, according to the fatty acid content (0.9955). Moderate similarity was observed according to the content of soluble sugars (0.50). These preliminary results support the idea of establishing methods for oat flour authentication, and provide the means for distinguishing oat flour samples, regardless of the variety, from flour samples made of other cereal species, just by lipid and simple sugar profile analysis.
Keywords: Authentication, chemometrics, GC-MS, lipid and soluble sugar composition, oat cultivars.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 137338 The Development of a Teachers- Self-Efficacy Instrument for High School Physical Education Teacher
Authors: Yi-Hsiang Pan
Abstract:
The purpose of this study was to develop a “teachers’ self-efficacy scale for high school physical education teachers (TSES-HSPET)” in Taiwan. This scale is based on the self-efficacy theory of Bandura [1], [2]. This study used exploratory and confirmatory factor analyses to test the reliability and validity. The participants were high school physical education teachers in Taiwan. Both stratified random sampling and cluster sampling were used to sample participants for the study. 350 teachers were sampled in the first stage and 234 valid scales (male 133, female 101) returned. During the second stage, 350 teachers were sampled and 257 valid scales (male 143, female 110, 4 did not indicate gender) returned. The exploratory factor analysis was used in the first stage, and it got 60.77% of total variance for construct validity. The Cronbach’s alpha coefficient of internal consistency was 0.91 for sumscale, and subscales were 0.84 and 0.90. In the second stage, confirmatory factor analysis was used to test construct validity. The result showed that the fit index could be accepted (χ2 (75) =167.94, p <.05, RMSEA =0.07, SRMR=0.05, GFI=0.92, NNFI=0.97, CFI=0.98, PNFI=0.79). Average variance extracted of latent variables were 0.43 and 0.53, which composite reliability are 0.78 and 0.90. It is concluded that the TSES-HSPET is a well-considered measurement instrument with acceptable validity and reliability. It may be used to estimate teachers’ self-efficacy for high school physical education teachers.Keywords: teaching in physical education, teacher's self-efficacy, teacher's belief
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 318037 Identifying Neighborhoods at Potential Risk of Food Insecurity in Rural British Columbia
Authors: Amirmohsen Behjat, Aleck Ostry, Christina Miewald, Bernie Pauly
Abstract:
Substantial research has indicated that socioeconomic and demographic characteristics’ of neighborhoods are strong determinants of food security. The aim of this study was to develop a Food Insecurity Neighborhood Index (FINI) based on the associated socioeconomic and demographic variables to identify the areas at potential risk of food insecurity in rural British Columbia (BC). Principle Component Analysis (PCA) technique was used to calculate the FINI for each rural Dissemination Area (DA) using the food security determinant variables from Canadian Census data. Using ArcGIS, the neighborhoods with the top quartile FINI values were classified as food insecure. The results of this study indicated that the most food insecure neighborhood with the highest FINI value of 99.1 was in the Bulkley-Nechako (central BC) area whereas the lowest FINI with the value of 2.97 was for a rural neighborhood in the Cowichan Valley area. In total, 98.049 (19%) of the rural population of British Columbians reside in high food insecure areas. Moreover, the distribution of food insecure neighborhoods was found to be strongly dependent on the degree of rurality in BC. In conclusion, the cluster of food insecure neighbourhoods was more pronounced in Central Coast, Mount Wadington, Peace River, Kootenay Boundary, and the Alberni-Clayoqout Regional Districts.
Keywords: Neighbourhood food insecurity index, socioeconomic and demographic determinants, principal component analysis, Canada Census, ArcGIS.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 89936 The Effect of Cooperation Teaching Method on Learning of Students in Primary Schools
Authors: Fereshteh Afkari, Davood Bagheri
Abstract:
The effect of teaching method on learning assistance Dunn Review .The study, to compare the effects of collaboration on teaching mathematics learning courses, including writing, science, experimental girl students by other methods of teaching basic first paid and the amount of learning students methods have been trained to cooperate with other students with other traditional methods have been trained to compare. The survey on 100 students in Tehran that using random sampling ¬ cluster of girl students between the first primary selections was performed. Considering the topic of semi-experimental research methods used to practice the necessary information by questionnaire, examination questions by the researcher, in collaboration with teachers and view authority in this field and related courses that teach these must have been collected. Research samples to test and control groups were divided. Experimental group and control group collaboration using traditional methods of mathematics courses, including writing and experimental sciences were trained. Research results using statistical methods T is obtained in two independent groups show that, through training assistance will lead to positive results and student learning in comparison with traditional methods, will increase also led to collaboration methods increase skills to solve math lesson practice, better understanding and increased skill level of students in practical lessons such as science and has been writing.Keywords: method of teaching, learning, collaboration
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 163835 Visualization and Indexing of Spectral Databases
Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi
Abstract:
On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from spectral database can be used to estimate unmeasured product properties and monitor the operation of the process. These techniques are based on looking for similar spectra by nearest neighborhood algorithms and distance based searching methods. Search for nearest neighbors in the spectral space is an NP-hard problem, the computational complexity increases by the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time some kind of indexing could be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirement of estimation algorithms by providing a two dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of spectral database does not only support application of distance and similarity based techniques but enables the utilization of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means the prediction has not to use the high dimension space but can be based on the mapped space too. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance based learning algorithms.
Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 176934 Inferring User Preference Using Distance Dependent Chinese Restaurant Process and Weighted Distribution for a Content Based Recommender System
Authors: Bagher Rahimpour Cami, Hamid Hassanpour, Hoda Mashayekhi
Abstract:
Nowadays websites provide a vast number of resources for users. Recommender systems have been developed as an essential element of these websites to provide a personalized environment for users. They help users to retrieve interested resources from large sets of available resources. Due to the dynamic feature of user preference, constructing an appropriate model to estimate the user preference is the major task of recommender systems. Profile matching and latent factors are two main approaches to identify user preference. In this paper, we employed the latent factor and profile matching to cluster the user profile and identify user preference, respectively. The method uses the Distance Dependent Chines Restaurant Process as a Bayesian nonparametric framework to extract the latent factors from the user profile. These latent factors are mapped to user interests and a weighted distribution is used to identify user preferences. We evaluate the proposed method using a real-world data-set that contains news tweets of a news agency (BBC). The experimental results and comparisons show the superior recommendation accuracy of the proposed approach related to existing methods, and its ability to effectively evolve over time.Keywords: Content-based recommender systems, dynamic user modeling, extracting user interests, predicting user preference.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 81833 A Neurofuzzy Learning and its Application to Control System
Authors: Seema Chopra, R. Mitra, Vijay Kumar
Abstract:
A neurofuzzy approach for a given set of input-output training data is proposed in two phases. Firstly, the data set is partitioned automatically into a set of clusters. Then a fuzzy if-then rule is extracted from each cluster to form a fuzzy rule base. Secondly, a fuzzy neural network is constructed accordingly and parameters are tuned to increase the precision of the fuzzy rule base. This network is able to learn and optimize the rule base of a Sugeno like Fuzzy inference system using Hybrid learning algorithm, which combines gradient descent, and least mean square algorithm. This proposed neurofuzzy system has the advantage of determining the number of rules automatically and also reduce the number of rules, decrease computational time, learns faster and consumes less memory. The authors also investigate that how neurofuzzy techniques can be applied in the area of control theory to design a fuzzy controller for linear and nonlinear dynamic systems modelling from a set of input/output data. The simulation analysis on a wide range of processes, to identify nonlinear components on-linely in a control system and a benchmark problem involving the prediction of a chaotic time series is carried out. Furthermore, the well-known examples of linear and nonlinear systems are also simulated under the Matlab/Simulink environment. The above combination is also illustrated in modeling the relationship between automobile trips and demographic factors.
Keywords: Fuzzy control, neuro-fuzzy techniques, fuzzy subtractive clustering, extraction of rules, and optimization of membership functions.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 259432 Awareness and Attitudes of Primary Grade Teachers (1-4thGrade) towards Inclusive Education
Authors: P. Maheshwari, M. Shapurkar
Abstract:
The present research aimed at studying the awareness and attitudes of teachers towards inclusive education. The sample consisted of 60 teachers, teaching in the primary section (1st – 4th) of regular schools affiliated to the SSC board in Mumbai. Sample was selected by Multi-stage cluster sampling technique. A semi-structured self-constructed interview schedule and a self-constructed attitude scale was used to study the awareness of teachers about disability and Inclusive education, and their attitudes towards inclusive education respectively. Themes were extracted from the interview data and quantitative data was analyzed using SPSS package. Results revealed that teachers had some amount of awareness but an inadequate amount of information on disabilities and inclusive education. Disability to most (37) teachers meant “an inability to do something”. The difference between disability and handicap was stated by most as former being cognitive while handicap being physical in nature. With regard to Inclusive education, a large number (46) stated that they were unaware of the term and did not know what it meant. Majority (52) of them perceived maximum challenges for themselves in an inclusive set up, and emphasized on the role of teacher training courses in the area of providing knowledge (49) and training in teaching methodology (53). Although, 83.3% of teachers held a moderately positive attitude towards inclusive education, a large percentage (61.6%) of participants felt that being in inclusive set up would be very challenging for both children with special needs and without special needs. Though, most (49) of the teachers stated that children with special needs should be educated in regular classroom but they further clarified that only those should be in a regular classroom who have physical impairments of mild or moderate degree.
Keywords: Attitudes, awareness, inclusive education, teachers.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 339631 Age at First Marriage for Husband and Wife between Muslim and Santal Communities in Rural Bangladesh: A Cross-Cultural Perspective
Authors: Md. Emaj Uddin
Abstract:
Age at first marriage is a basic temporal term that is culturally constructed for marriage relationship between an adult male and an adult female intended to have sex, to reproduce and to adapt to environment from one generation to another around the world. Cross-cultural evidences suggest that age at first marriage for both male and female not only varies across the cultures, but also varies among the subcultures of the same society. The purpose of the study was to compare age at first marriage for husband and wife including age differences between them between Muslim and Santal communities in rural Bangladesh. For this we hypothesized that (1) there were significant differences in age at first marriage and age interval between husband and wife between Muslim and Santal communities in rural Bangladesh. In so doing, 288 couples (145 pairs of couples for Muslim and 143 pairs of couples for Santal) were selected by cluster random sampling from the Kalna village situated in the Tanore Upazila of Rajshahi district, Bangladesh, whose current mean age range was 36.59 years for husband and 28.85 years for wife for the Muslim and 31.74 years for husband and 25.21 years for wife for the Santal respectively. The results of Independent Sample t test showed that mean age at first marriage for the Muslim samples was 23.05 years for husbands and 15.11 years for wives, while mean age at first marriage for the Santal samples was 20.71 years for husbands and 14.34 years for wives respectively that were significantly different at p<0.05 level. Although husbands compared to wives in both the communities were relatively older, there were significant similarities in mean age differences (7.71 for Muslim couples and 7.51 for Santal, p>0.05) among the selected husbands and wives between the two communities. This study recommends that further cross-cultural researches should be done on the causeeffect relationships between socio-cultural factors and age at marriage between the two communities in Bangladesh.Keywords: Age at First Marriage, Age Difference at Marriage, Bangladesh, Cross-Cultural Comparison, Muslim, Santal.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 254130 Identification of Critical Success Factors in Non-Formal Service Sector Using Delphi Technique
Authors: Amol A. Talankar, Prakash Verma, Nitin Seth
Abstract:
The purpose of this study is to identify the critical success factors (CSFs) for the effective implementation of Six Sigma in non-formal service Sectors.
Based on the survey of literature, the critical success factors (CSFs) for Six Sigma have been identified and are assessed for their importance in Non-formal service sector using Delphi Technique. These selected CSFs were put forth to the panel of expert to cluster them and prepare cognitive map to establish their relationship.
All the critical success factors examined and obtained from the review of literature have been assessed for their importance with respect to their contribution to Six Sigma effectiveness in non formal service sector.
The study is limited to the non-formal service sectors involved in the organization of religious festival only. However, the similar exercise can be conducted for broader sample of other non-formal service sectors like temple/ashram management, religious tours management etc.
The research suggests an approach to identify CSFs of Six Sigma for Non-formal service sector. All the CSFs of the formal service sector will not be applicable to Non-formal services, hence opinion of experts was sought to add or delete the CSFs. In the first round of Delphi, the panel of experts has suggested, two new CSFs-“competitive benchmarking (F19) and resident’s involvement (F28)”, which were added for assessment in the next round of Delphi. One of the CSFs-“fulltime six sigma personnel (F15)” has been omitted in proposed clusters of CSFs for non-formal organization, as it is practically impossible to deploy full time trained Six Sigma recruits.
Keywords: Critical success factors (CSFs), Quality assurance, non-formal service sectors, Six Sigma.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 245329 Regional Analysis of Streamflow Drought: A Case Study for Southwestern Iran
Authors: M. Byzedi, B. Saghafian
Abstract:
Droughts are complex, natural hazards that, to a varying degree, affect some parts of the world every year. The range of drought impacts is related to drought occurring in different stages of the hydrological cycle and usually different types of droughts, such as meteorological, agricultural, hydrological, and socioeconomical are distinguished. Streamflow drought was analyzed by the method of truncation level (at 70% level) on daily discharges measured in 54 hydrometric stations in southwestern Iran. Frequency analysis was carried out for annual maximum series (AMS) of drought deficit volume and duration series. Some factors including physiographic, climatic, geologic, and vegetation cover were studied as influential factors in the regional analysis. According to the results of factor analysis, six most effective factors were identified as area, rainfall from December to February, the percent of area with Normalized Difference Vegetation Index (NDVI) <0.1, the percent of convex area, drainage density and the minimum of watershed elevation that explained 90.9% of variance. The homogenous regions were determined by cluster analysis and discriminate function analysis. Suitable multivariate regression models were evaluated for streamflow drought deficit volume with 2 years return period. The significance level of regression models was 0.01. The results showed that the watershed area is the most effective factor with high correlation with deficit volume. Also, drought duration was not a suitable drought index for regional analysis.Keywords: Iran, Streamflow drought, truncation level method, regional analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 174428 Upgraded Rough Clustering and Outlier Detection Method on Yeast Dataset by Entropy Rough K-Means Method
Authors: P. Ashok, G. M. Kadhar Nawaz
Abstract:
Rough set theory is used to handle uncertainty and incomplete information by applying two accurate sets, Lower approximation and Upper approximation. In this paper, the rough clustering algorithms are improved by adopting the Similarity, Dissimilarity–Similarity and Entropy based initial centroids selection method on three different clustering algorithms namely Entropy based Rough K-Means (ERKM), Similarity based Rough K-Means (SRKM) and Dissimilarity-Similarity based Rough K-Means (DSRKM) were developed and executed by yeast dataset. The rough clustering algorithms are validated by cluster validity indexes namely Rand and Adjusted Rand indexes. An experimental result shows that the ERKM clustering algorithm perform effectively and delivers better results than other clustering methods. Outlier detection is an important task in data mining and very much different from the rest of the objects in the clusters. Entropy based Rough Outlier Factor (EROF) method is seemly to detect outlier effectively for yeast dataset. In rough K-Means method, by tuning the epsilon (ᶓ) value from 0.8 to 1.08 can detect outliers on boundary region and the RKM algorithm delivers better results, when choosing the value of epsilon (ᶓ) in the specified range. An experimental result shows that the EROF method on clustering algorithm performed very well and suitable for detecting outlier effectively for all datasets. Further, experimental readings show that the ERKM clustering method outperformed the other methods.
Keywords: Clustering, Entropy, Outlier, Rough K-Means, validity index.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 141327 Total and Partial Factor Productivity Analysis of Irrigated Wheat in Iran by Separate of Exploitation Scales
Authors: Hassan Masoumi, Rashed Alavi
Abstract:
Wheat is one of the strategic crops in Iran, on which the household food basket is highly dependent. Although this crop is cultivated and produced in almost all provinces of the country, its production efficiency is lower than the global and regional averages due to the lack of optimal use of allocated resources. In this research, which was carried out with a documentary and library method, first, the total and partial productivity indices of irrigated wheat production were calculated in large, medium and small exploitation scales in different provinces of the country, and then the provinces were clustered in terms of these indices. The results showed that the total productivity of production factors had a direct correlation with the scale of exploitation, so that with the increase in the size of exploitations, the total productivity index increased. On the scale of small exploitations, North Khorasan, Zanjan, Chaharmahal and Bakhtiari Province, on a medium scale, Chaharmahal and Bakhtiari Province and on the scale of large exploitations, Zanjan, Chaharmahal and Bakhtiari provinces, Kohkiloyeh and Boyer Ahmad and North Khorasan, with better use of production resources compared to other provinces, were placed in the best cluster in terms of total productivity index. The high total productivity index in Zanjan, Chaharmahal and Bakhtiari Province is related to the higher productivity of factors such as mechanization and land in these provinces. Finally, the methods of using these factors in productive provinces, along with technical and specialized regional guidelines, can facilitate the improvement of productivity in less productive provinces.
Keywords: Clustering, Irrigated wheat, Iran, total productivity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17126 Cumulative Learning based on Dynamic Clustering of Hierarchical Production Rules(HPRs)
Authors: Kamal K.Bharadwaj, Rekha Kandwal
Abstract:
An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of details, and to focus our attention on the interesting aspects only. One of such efficient and easy to understand systems is Hierarchical Production rule (HPRs) system. A HPR, a standard production rule augmented with generality and specificity information, is of the following form Decision If < condition> Generality
Keywords: Cumulative learning, clustering, data mining, hierarchical production rules.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 143925 MinRoot and CMesh: Interconnection Architectures for Network-on-Chip Systems
Authors: Mohammad Ali Jabraeil Jamali, Ahmad Khademzadeh
Abstract:
The success of an electronic system in a System-on- Chip is highly dependent on the efficiency of its interconnection network, which is constructed from routers and channels (the routers move data across the channels between nodes). Since neither classical bus based nor point to point architectures can provide scalable solutions and satisfy the tight power and performance requirements of future applications, the Network-on-Chip (NoC) approach has recently been proposed as a promising solution. Indeed, in contrast to the traditional solutions, the NoC approach can provide large bandwidth with moderate area overhead. The selected topology of the components interconnects plays prime rule in the performance of NoC architecture as well as routing and switching techniques that can be used. In this paper, we present two generic NoC architectures that can be customized to the specific communication needs of an application in order to reduce the area with minimal degradation of the latency of the system. An experimental study is performed to compare these structures with basic NoC topologies represented by 2D mesh, Butterfly-Fat Tree (BFT) and SPIN. It is shown that Cluster mesh (CMesh) and MinRoot schemes achieves significant improvements in network latency and energy consumption with only negligible area overhead and complexity over existing architectures. In fact, in the case of basic NoC topologies, CMesh and MinRoot schemes provides substantial savings in area as well, because they requires fewer routers. The simulation results show that CMesh and MinRoot networks outperforms MESH, BFT and SPIN in main performance metrics.
Keywords: MinRoot, CMesh, NoC, Topology, Performance Evaluation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 212724 Continuous FAQ Updating for Service Incident Ticket Resolution
Authors: Kohtaroh Miyamoto
Abstract:
As enterprise computing becomes more and more complex, the costs and technical challenges of IT system maintenance and support are increasing rapidly. One popular approach to managing IT system maintenance is to prepare and use a FAQ (Frequently Asked Questions) system to manage and reuse systems knowledge. Such a FAQ system can help reduce the resolution time for each service incident ticket. However, there is a major problem where over time the knowledge in such FAQs tends to become outdated. Much of the knowledge captured in the FAQ requires periodic updates in response to new insights or new trends in the problems addressed in order to maintain its usefulness for problem resolution. These updates require a systematic approach to define the exact portion of the FAQ and its content. Therefore, we are working on a novel method to hierarchically structure the FAQ and automate the updates of its structure and content. We use structured information and the unstructured text information with the timelines of the information in the service incident tickets. We cluster the tickets by structured category information, by keywords, and by keyword modifiers for the unstructured text information. We also calculate an urgency score based on trends, resolution times, and priorities. We carefully studied the tickets of one of our projects over a 2.5-year time period. After the first 6 months we started to create FAQs and confirmed they improved the resolution times. We continued observing over the next 2 years to assess the ongoing effectiveness of our method for the automatic FAQ updates. We improved the ratio of tickets covered by the FAQ from 32.3% to 68.9% during this time. Also, the average time reduction of ticket resolution was between 31.6% and 43.9%. Subjective analysis showed more than 75% reported that the FAQ system was useful in reducing ticket resolution times.
Keywords: FAQ System, Resolution Time, Service Incident Tickets, IT System Maintenance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 249523 Genetic Diversity Based Population Study of Freshwater Mud Eel (Monopterus cuchia) in Bangladesh
Authors: M. F. Miah, K. M. A. Zinnah, M. J. Raihan, H. Ali, M. N. Naser
Abstract:
As genetic diversity is most important for existing, breeding and production of any fish; this study was undertaken for investigating genetic diversity of freshwater mud eel, Monopterus cuchia at population level where three ecological populations such as flooded area of Sylhet (P1), open water of Moulvibazar (P2) and open water of Sunamganj (P3) districts of Bangladesh were considered. Four arbitrary RAPD primers (OPB-12, C0-4, B-03 and OPB-08) were screened and RAPD banding patterns were analyzed among the populations considering 15 individuals of each population. In total 174, 138 and 149 bands were detected in the populations of P1, P2 and P3 respectively; however, each primer revealed less number of bands in each population. 100% polymorphic loci were recorded in P2 and P3 whereas only one monomorphic locus was observed in P1, recorded 97.5% polymorphism. Different genetic parameters such as inter-individual pairwise similarity, genetic distance, Nei genetic similarity, linkage distances, cluster analysis and allelic information, etc. were considered for measuring genetic diversity. The average inter-individual pairwise similarity was recorded 2.98, 1.47 and 1.35 in P1, P2 and P3 respectively. Considering genetic distance analysis, the highest distance 1 was recorded in P2 and P3 and the lowest genetic distance 0.444 was found in P2. The average Nei genetic similarity was observed 0.19, 0.16 and 0.13 in P1, P2 and P3, respectively; however, the average linkage distance was recorded 24.92, 17.14 and 15.28 in P1, P3 and P2 respectively. Based on linkage distance, genetic clusters were generated in three populations where 6 clades and 7 clusters were found in P1, 3 clades and 5 clusters were observed in P2 and 4 clades and 7 clusters were detected in P3. In addition, allelic information was observed where the frequency of p and q alleles were observed 0.093 and 0.907 in P1, 0.076 and 0.924 in P2, 0.074 and 0.926 in P3 respectively. The average gene diversity was observed highest in P2 (0.132) followed by P3 (0.131) and P1 (0.121) respectively.
Keywords: Genetic diversity, Monopterus cuchia, population, RAPD, Bangladesh.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 183222 Machine Learning Facing Behavioral Noise Problem in an Imbalanced Data Using One Side Behavioral Noise Reduction: Application to a Fraud Detection
Authors: Salma El Hajjami, Jamal Malki, Alain Bouju, Mohammed Berrada
Abstract:
With the expansion of machine learning and data mining in the context of Big Data analytics, the common problem that affects data is class imbalance. It refers to an imbalanced distribution of instances belonging to each class. This problem is present in many real world applications such as fraud detection, network intrusion detection, medical diagnostics, etc. In these cases, data instances labeled negatively are significantly more numerous than the instances labeled positively. When this difference is too large, the learning system may face difficulty when tackling this problem, since it is initially designed to work in relatively balanced class distribution scenarios. Another important problem, which usually accompanies these imbalanced data, is the overlapping instances between the two classes. It is commonly referred to as noise or overlapping data. In this article, we propose an approach called: One Side Behavioral Noise Reduction (OSBNR). This approach presents a way to deal with the problem of class imbalance in the presence of a high noise level. OSBNR is based on two steps. Firstly, a cluster analysis is applied to groups similar instances from the minority class into several behavior clusters. Secondly, we select and eliminate the instances of the majority class, considered as behavioral noise, which overlap with behavior clusters of the minority class. The results of experiments carried out on a representative public dataset confirm that the proposed approach is efficient for the treatment of class imbalances in the presence of noise.Keywords: Machine learning, Imbalanced data, Data mining, Big data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1138