Search results for: constrained clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 928

Search results for: constrained clustering

658 Post Apartheid Language Positionality and Policy: Student Teachers' Narratives from Teaching Practicum

Authors: Thelma Mort

Abstract:

This empirical, qualitative research uses interviews of four intermediate phase English language student teachers at one university in South Africa and is an exploration of student teacher learning on their teaching practicum in their penultimate year of the initial teacher education course. The country’s post-apartheid language in education policy provides a context to this study in that children move from mother tongue language of instruction in foundation phase to English as a language of instruction in Intermediate phase. There is another layer of context informing this study which is the school context; the student teachers’ reflections are from their teaching practicum in resource constrained schools, which make up more than 75% of schools in South Africa. The findings were that in these schools, deep biases existed to local languages, that language was being used as a proxy for social class, and that conditions necessary for language acquisition were absent. The student teachers’ attitudes were in contrast to those found in the schools, namely that they had various pragmatic approaches to overcoming obstacles and that they saw language as enabling interdisciplinary work. This study describes language issues, tensions created by policy in South African schools and also supplies a regional account of learning to teach in resource constrained schools in Cape Town, where such language tensions are more inflated. The central findings in this research illuminate attitudes to language and language education in these teaching practicum schools and the complexity of learning to be a language teacher in these contexts. This study is one of the few local empirical studies regarding language teaching in the classroom and language teacher education; as such it offers some background to the country’s poor performance in both international and national literacy assessments.

Keywords: language teaching, narrative, post apartheid, South Africa, student teacher

Procedia PDF Downloads 124
657 Embedded Hybrid Intuition: A Deep Learning and Fuzzy Logic Approach to Collective Creation and Computational Assisted Narratives

Authors: Roberto Cabezas H

Abstract:

The current work shows the methodology developed to create narrative lighting spaces for the multimedia performance piece 'cluster: the vanished paradise.' This empirical research is focused on exploring unconventional roles for machines in subjective creative processes, by delving into the semantics of data and machine intelligence algorithms in hybrid technological, creative contexts to expand epistemic domains trough human-machine cooperation. The creative process in scenic and performing arts is guided mostly by intuition; from that idea, we developed an approach to embed collective intuition in computational creative systems, by joining the properties of Generative Adversarial Networks (GAN’s) and Fuzzy Clustering based on a semi-supervised data creation and analysis pipeline. The model makes use of GAN’s to learn from phenomenological data (data generated from experience with lighting scenography) and algorithmic design data (augmented data by procedural design methods), fuzzy logic clustering is then applied to artificially created data from GAN’s to define narrative transitions built on membership index; this process allowed for the creation of simple and complex spaces with expressive capabilities based on position and light intensity as the parameters to guide the narrative. Hybridization comes not only from the human-machine symbiosis but also on the integration of different techniques for the implementation of the aided design system. Machine intelligence tools as proposed in this work are well suited to redefine collaborative creation by learning to express and expand a conglomerate of ideas and a wide range of opinions for the creation of sensory experiences. We found in GAN’s and Fuzzy Logic an ideal tool to develop new computational models based on interaction, learning, emotion and imagination to expand the traditional algorithmic model of computation.

Keywords: fuzzy clustering, generative adversarial networks, human-machine cooperation, hybrid collective data, multimedia performance

Procedia PDF Downloads 113
656 Remote Assessment and Change Detection of GreenLAI of Cotton Crop Using Different Vegetation Indices

Authors: Ganesh B. Shinde, Vijaya B. Musande

Abstract:

Cotton crop identification based on the timely information has significant advantage to the different implications of food, economic and environment. Due to the significant advantages, the accurate detection of cotton crop regions using supervised learning procedure is challenging problem in remote sensing. Here, classifiers on the direct image are played a major role but the results are not much satisfactorily. In order to further improve the effectiveness, variety of vegetation indices are proposed in the literature. But, recently, the major challenge is to find the better vegetation indices for the cotton crop identification through the proposed methodology. Accordingly, fuzzy c-means clustering is combined with neural network algorithm, trained by Levenberg-Marquardt for cotton crop classification. To experiment the proposed method, five LISS-III satellite images was taken and the experimentation was done with six vegetation indices such as Simple Ratio, Normalized Difference Vegetation Index, Enhanced Vegetation Index, Green Atmospherically Resistant Vegetation Index, Wide-Dynamic Range Vegetation Index, Green Chlorophyll Index. Along with these indices, Green Leaf Area Index is also considered for investigation. From the research outcome, Green Atmospherically Resistant Vegetation Index outperformed with all other indices by reaching the average accuracy value of 95.21%.

Keywords: Fuzzy C-Means clustering (FCM), neural network, Levenberg-Marquardt (LM) algorithm, vegetation indices

Procedia PDF Downloads 285
655 The Effect of Classroom Atmospherics on Second Language Learning

Authors: Sresha Yadav, Ishwar Kumar

Abstract:

Second language learning is an important area of research in the language and linguistic domains. Literature suggests that several factors impact second language learning, including age, motivation, objectives, teacher, instructional material, classroom interaction, intelligence and previous background, previous linguistic experience, other student characteristics. Previous researchers have also highlighted that classroom atmospherics has a significant impact on learning as well as on the performance of students. However, the impact of classroom atmospherics on second language learning is still not known in the existing literature. Therefore, the purpose of the present study is to explore whether classroom atmospherics has an impact on second language learning or not? And if it does, it would be worthwhile to explore the nature of such relationship. The present study aims to explore the impact of classroom atmospherics on second language learning by dwelling into the existing literature to explore factors which impact second language learning, classroom atmospherics which impact language learning and the metrics through which such learning impacts could be measured. Based on the findings of literature review, the researchers have adopted a clustering approach for categorization and positioning of various measures of second language learning. Based on the clustering approach, the researchers have approach for measuring the impact of classroom atmospherics on second language learning by drawing a student sample consisting of 80 respondents. The results of the study uncover various basic premises of second language learning, especially with regard to classroom atmospherics. The present study is important not only from the point of view of language learning but implications could be drawn with regard to the design of classroom atmospherics, environmental psychology, anthropometrics, etc as well.

Keywords: classroom atmospherics, cluster analysis, linguistics, second language learning

Procedia PDF Downloads 425
654 A Comparative Study on the Effects of Different Clustering Layouts and Geometry of Urban Street Canyons on Urban Heat Island in Residential Neighborhoods of Kolkata

Authors: Shreya Banerjee, Roshmi Sen, Subrata Chattopadhyay

Abstract:

Urbanization during the second half of the last century has created many serious environment related issues leading to global warming and climate change. India is not an exception as the country is also facing the problems of global warming and urban heat islands (UHI) in all the major metropolises. This paper discusses the effect of different housing cluster layouts, site geometry, and geometry of urban street canyons on the urban heat island profile. The study is carried out using the three dimensional microclimatic computational fluid dynamics model ENVI-met version 3.1. Simulation models are done for a typical summer day of 21st June, 2015 in four different residential neighborhoods in the city of Kolkata which predominantly belongs to Warm-Humid Monsoon Climate. The results show the changing pattern of urban heat island profile with respect to different clustering layouts, geometry, and morphology of urban street canyons. The comparison between the four neighborhoods shows that different microclimatic variables are strongly dependant on the neighborhood layout pattern and geometry. The inferences obtained from this study can be indicative towards the formulation of neighborhood design by-laws that will attenuate the urban heat island effect.

Keywords: urban heat island, neighborhood morphology, site microclimate, ENVI-met, numerical analysis

Procedia PDF Downloads 339
653 Genetic Diversity in Capsicum Germplasm Based on Inter Simple Sequence Repeat Markers

Authors: Siwapech Silapaprayoon, Januluk Khanobdee, Sompid Samipak

Abstract:

Chili peppers are the fruits of Capsicum pepper plants well known for their fiery burning sensation on the tongue after consumption. They are members of the Solanaceae or common nightshade family along with potato, tomato and eggplant. Thai cuisine has gained popularity for its distinct flavors due to usages of various spices and its heat from the addition of chili pepper. Though being used in little quantity for each dish, chili pepper holds a special place in Thai cuisine. There are many varieties of chili peppers in Thailand, and thirty accessions were collected at Rajamangala University of Technology Lanna, Lampang, Thailand. To effectively manage any germplasm it is essential to know the diversity and relationships among members. Thirty-six Inter Simple Sequence Repeat (ISSRs) DNA markers were used to analyze the germplasm. Total of 335 polymorphic bands was obtained giving the average of 9.3 alleles per marker. Unweighted pair-group mean arithmetic method (UPGMA) clustering of data using NTSYS-pc software indicated that the accessions showed varied levels of genetic similarity ranging from 0.57-1.00 similarity coefficient index indicating significant levels of variation. At SM coefficient of 0.81, the germplasm was separated into four groups. Phenotypic variation was discussed in context of phylogenetic tree clustering.

Keywords: diversity, germplasm, Chili pepper, ISSR

Procedia PDF Downloads 126
652 Another Justice: Litigation Masters in Chinese Legal Story

Authors: Lung-Lung Hu

Abstract:

Ronald Dworkin offered a legal theory of ‘chain enterprise’ that all the judges in legal history altogether create a ‘law’ aiming a specific purpose. Those judges are like co-writers of a chain-story who not only create freely but also are constrained by the story made by the judges before them. The law created by Chinese traditional judges is another case, they, compared with the judges mentioned by Ronald Dworkin, have relatively narrower space of making a legal sentence according to their own discretions because the statutes in Chinese traditional law at the very beginning have been designed as panel code that leaves small room to judge’s discretion. Furthermore, because law is a representative of the authority of the government, i.e. the emperor, any misjudges and misuses deviated from the law will be considered as a challenge to the supreme power. However, different from judges as the defenders of law, Chinese litigation masters who want to win legal cases have to be offenders challenging the verdict that does not favor his or his client’s interest. Besides, litigation master as an illegal or non-authorized profession does not belong to any legal system, therefore, they are relatively freer to ‘create’ the law. According to Stanley Fish’s articles that question Ronald Dworkin and Owen Fiss’ ideas about law, he construes that, since law is made of language, law is open to interpretations that cannot be constrained by any rules or any particular legal purposes. Stanley Fish’s idea can also be applied on the analysis about the stories of Chinese litigation masters in traditional Chinese literature. These Chinese litigation masters’ legal opinions in the so-called chain enterprise are like an unexpected episode that tries to revise the fixed story told by law. Although they are not welcome to the officials and also to the society, their existence is still a phenomenon representing another version of justice different from the official’s and can be seen as a de-structural power to the government. Hence, in this present paper the language and strategy applied by Chinese litigation masters in Chinese legal stories will be analysed to see how they refute made legal judgments and challenge the official standard of justice.

Keywords: Chinese legal stories, interdisciplinary, litigation master, post-structuralism

Procedia PDF Downloads 358
651 The Analyzer: Clustering Based System for Improving Business Productivity by Analyzing User Profiles to Enhance Human Computer Interaction

Authors: Dona Shaini Abhilasha Nanayakkara, Kurugamage Jude Pravinda Gregory Perera

Abstract:

E-commerce platforms have revolutionized the shopping experience, offering convenient ways for consumers to make purchases. To improve interactions with customers and optimize marketing strategies, it is essential for businesses to understand user behavior, preferences, and needs on these platforms. This paper focuses on recommending businesses to customize interactions with users based on their behavioral patterns, leveraging data-driven analysis and machine learning techniques. Businesses can improve engagement and boost the adoption of e-commerce platforms by aligning behavioral patterns with user goals of usability and satisfaction. We propose TheAnalyzer, a clustering-based system designed to enhance business productivity by analyzing user-profiles and improving human-computer interaction. The Analyzer seamlessly integrates with business applications, collecting relevant data points based on users' natural interactions without additional burdens such as questionnaires or surveys. It defines five key user analytics as features for its dataset, which are easily captured through users' interactions with e-commerce platforms. This research presents a study demonstrating the successful distinction of users into specific groups based on the five key analytics considered by TheAnalyzer. With the assistance of domain experts, customized business rules can be attached to each group, enabling The Analyzer to influence business applications and provide an enhanced personalized user experience. The outcomes are evaluated quantitatively and qualitatively, demonstrating that utilizing TheAnalyzer’s capabilities can optimize business outcomes, enhance customer satisfaction, and drive sustainable growth. The findings of this research contribute to the advancement of personalized interactions in e-commerce platforms. By leveraging user behavioral patterns and analyzing both new and existing users, businesses can effectively tailor their interactions to improve customer satisfaction, loyalty and ultimately drive sales.

Keywords: data clustering, data standardization, dimensionality reduction, human computer interaction, user profiling

Procedia PDF Downloads 35
650 Machine Learning Approaches Based on Recency, Frequency, Monetary (RFM) and K-Means for Predicting Electrical Failures and Voltage Reliability in Smart Cities

Authors: Panaya Sudta, Wanchalerm Patanacharoenwong, Prachya Bumrungkun

Abstract:

As With the evolution of smart grids, ensuring the reliability and efficiency of electrical systems in smart cities has become crucial. This paper proposes a distinct approach that combines advanced machine learning techniques to accurately predict electrical failures and address voltage reliability issues. This approach aims to improve the accuracy and efficiency of reliability evaluations in smart cities. The aim of this research is to develop a comprehensive predictive model that accurately predicts electrical failures and voltage reliability in smart cities. This model integrates RFM analysis, K-means clustering, and LSTM networks to achieve this objective. The research utilizes RFM analysis, traditionally used in customer value assessment, to categorize and analyze electrical components based on their failure recency, frequency, and monetary impact. K-means clustering is employed to segment electrical components into distinct groups with similar characteristics and failure patterns. LSTM networks are used to capture the temporal dependencies and patterns in customer data. This integration of RFM, K-means, and LSTM results in a robust predictive tool for electrical failures and voltage reliability. The proposed model has been tested and validated on diverse electrical utility datasets. The results show a significant improvement in prediction accuracy and reliability compared to traditional methods, achieving an accuracy of 92.78% and an F1-score of 0.83. This research contributes to the proactive maintenance and optimization of electrical infrastructures in smart cities. It also enhances overall energy management and sustainability. The integration of advanced machine learning techniques in the predictive model demonstrates the potential for transforming the landscape of electrical system management within smart cities. The research utilizes diverse electrical utility datasets to develop and validate the predictive model. RFM analysis, K-means clustering, and LSTM networks are applied to these datasets to analyze and predict electrical failures and voltage reliability. The research addresses the question of how accurately electrical failures and voltage reliability can be predicted in smart cities. It also investigates the effectiveness of integrating RFM analysis, K-means clustering, and LSTM networks in achieving this goal. The proposed approach presents a distinct, efficient, and effective solution for predicting and mitigating electrical failures and voltage issues in smart cities. It significantly improves prediction accuracy and reliability compared to traditional methods. This advancement contributes to the proactive maintenance and optimization of electrical infrastructures, overall energy management, and sustainability in smart cities.

Keywords: electrical state prediction, smart grids, data-driven method, long short-term memory, RFM, k-means, machine learning

Procedia PDF Downloads 21
649 The Application of Video Segmentation Methods for the Purpose of Action Detection in Videos

Authors: Nassima Noufail, Sara Bouhali

Abstract:

In this work, we develop a semi-supervised solution for the purpose of action detection in videos and propose an efficient algorithm for video segmentation. The approach is divided into video segmentation, feature extraction, and classification. In the first part, a video is segmented into clips, and we used the K-means algorithm for this segmentation; our goal is to find groups based on similarity in the video. The application of k-means clustering into all the frames is time-consuming; therefore, we started by the identification of transition frames where the scene in the video changes significantly, and then we applied K-means clustering into these transition frames. We used two image filters, the gaussian filter and the Laplacian of Gaussian. Each filter extracts a set of features from the frames. The Gaussian filter blurs the image and omits the higher frequencies, and the Laplacian of gaussian detects regions of rapid intensity changes; we then used this vector of filter responses as an input to our k-means algorithm. The output is a set of cluster centers. Each video frame pixel is then mapped to the nearest cluster center and painted with a corresponding color to form a visual map. The resulting visual map had similar pixels grouped. We then computed a cluster score indicating how clusters are near each other and plotted a signal representing frame number vs. clustering score. Our hypothesis was that the evolution of the signal would not change if semantically related events were happening in the scene. We marked the breakpoints at which the root mean square level of the signal changes significantly, and each breakpoint is an indication of the beginning of a new video segment. In the second part, for each segment from part 1, we randomly selected a 16-frame clip, then we extracted spatiotemporal features using convolutional 3D network C3D for every 16 frames using a pre-trained model. The C3D final output is a 512-feature vector dimension; hence we used principal component analysis (PCA) for dimensionality reduction. The final part is the classification. The C3D feature vectors are used as input to a multi-class linear support vector machine (SVM) for the training model, and we used a multi-classifier to detect the action. We evaluated our experiment on the UCF101 dataset, which consists of 101 human action categories, and we achieved an accuracy that outperforms the state of art by 1.2%.

Keywords: video segmentation, action detection, classification, Kmeans, C3D

Procedia PDF Downloads 49
648 Spatiotemporal Propagation and Pattern of Epileptic Spike Predict Seizure Onset Zone

Authors: Mostafa Mohammadpour, Christoph Kapeller, Christy Li, Josef Scharinger, Christoph Guger

Abstract:

Interictal spikes provide valuable information on electrocorticography (ECoG), which aids in surgical planning for patients who suffer from refractory epilepsy. However, the shape and temporal dynamics of these spikes remain unclear. The purpose of this work was to analyze the shape of interictal spikes and measure their distance to the seizure onset zone (SOZ) to use in epilepsy surgery. Thirteen patients' data from the iEEG portal were retrospectively studied. For analysis, half an hour of ECoG data was used from each patient, with the data being truncated before the onset of a seizure. Spikes were first detected and grouped in a sequence, then clustered into interictal epileptiform discharges (IEDs) and non-IED groups using two-step clustering. The distance of the spikes from IED and non-IED groups to SOZ was quantified and compared using the Wilcoxon rank-sum test. Spikes in the IED group tended to be in SOZ or close to it, while spikes in the non-IED group were in distance of SOZ or non-SOZ area. At the group level, the distribution for sharp wave, positive baseline shift, slow wave, and slow wave to sharp wave ratio was significantly different for IED and non-IED groups. The distance of the IED cluster was 10.00mm and significantly closer to the SOZ than the 17.65mm for non-IEDs. These findings provide insights into the shape and spatiotemporal dynamics of spikes that could influence the network mechanisms underlying refractory epilepsy.

Keywords: spike propagation, spike pattern, clustering, SOZ

Procedia PDF Downloads 30
647 Finding the Longest Common Subsequence in Normal DNA and Disease Affected Human DNA Using Self Organizing Map

Authors: G. Tamilpavai, C. Vishnuppriya

Abstract:

Bioinformatics is an active research area which combines biological matter as well as computer science research. The longest common subsequence (LCSS) is one of the major challenges in various bioinformatics applications. The computation of the LCSS plays a vital role in biomedicine and also it is an essential task in DNA sequence analysis in genetics. It includes wide range of disease diagnosing steps. The objective of this proposed system is to find the longest common subsequence which presents in a normal and various disease affected human DNA sequence using Self Organizing Map (SOM) and LCSS. The human DNA sequence is collected from National Center for Biotechnology Information (NCBI) database. Initially, the human DNA sequence is separated as k-mer using k-mer separation rule. Mean and median values are calculated from each separated k-mer. These calculated values are fed as input to the Self Organizing Map for the purpose of clustering. Then obtained clusters are given to the Longest Common Sub Sequence (LCSS) algorithm for finding common subsequence which presents in every clusters. It returns nx(n-1)/2 subsequence for each cluster where n is number of k-mer in a specific cluster. Experimental outcomes of this proposed system produce the possible number of longest common subsequence of normal and disease affected DNA data. Thus the proposed system will be a good initiative aid for finding disease causing sequence. Finally, performance analysis is carried out for different DNA sequences. The obtained values show that the retrieval of LCSS is done in a shorter time than the existing system.

Keywords: clustering, k-mers, longest common subsequence, SOM

Procedia PDF Downloads 231
646 Wind Velocity Climate Zonation Based on Observation Data in Indonesia Using Cluster and Principal Component Analysis

Authors: I Dewa Gede Arya Putra

Abstract:

Principal Component Analysis (PCA) is a mathematical procedure that uses orthogonal transformation techniques to change a set of data with components that may be related become components that are not related to each other. This can have an impact on clustering wind speed characteristics in Indonesia. This study uses data daily wind speed observations of the Site Meteorological Station network for 30 years. Multicollinearity tests were also performed on all of these data before doing clustering with PCA. The results show that the four main components have a total diversity of above 80% which will be used for clusters. Division of clusters using Ward's method obtained 3 types of clusters. Cluster 1 covers the central part of Sumatra Island, northern Kalimantan, northern Sulawesi, and northern Maluku with the climatological pattern of wind speed that does not have an annual cycle and a weak speed throughout the year with a low-speed ranging from 0 to 1,5 m/s². Cluster 2 covers the northern part of Sumatra Island, South Sulawesi, Bali, northern Papua with the climatological pattern conditions of wind speed that have annual cycle variations with low speeds ranging from 1 to 3 m/s². Cluster 3 covers the eastern part of Java Island, the Southeast Nusa Islands, and the southern Maluku Islands with the climatological pattern of wind speed conditions that have annual cycle variations with high speeds ranging from 1 to 4.5 m/s².

Keywords: PCA, cluster, Ward's method, wind speed

Procedia PDF Downloads 165
645 Designing Floor Planning in 2D and 3D with an Efficient Topological Structure

Authors: V. Nagammai

Abstract:

Very-large-scale integration (VLSI) is the process of creating an integrated circuit (IC) by combining thousands of transistors into a single chip. Development of technology increases the complexity in IC manufacturing which may vary the power consumption, increase the size and latency period. Topology defines a number of connections between network. In this project, NoC topology is generated using atlas tool which will increase performance in turn determination of constraints are effective. The routing is performed by XY routing algorithm and wormhole flow control. In NoC topology generation, the value of power, area and latency are predetermined. In previous work, placement, routing and shortest path evaluation is performed using an algorithm called floor planning with cluster reconstruction and path allocation algorithm (FCRPA) with the account of 4 3x3 switch, 6 4x4 switch, and 2 5x5 switches. The usage of the 4x4 and 5x5 switch will increase the power consumption and area of the block. In order to avoid the problem, this paper has used one 8x8 switch and 4 3x3 switches. This paper uses IPRCA which of 3 steps they are placement, clustering, and shortest path evaluation. The placement is performed using min – cut placement and clustering are performed using an algorithm called cluster generation. The shortest path is evaluated using an algorithm called Dijkstra's algorithm. The power consumption of each block is determined. The experimental result shows that the area, power, and wire length improved simultaneously.

Keywords: application specific noc, b* tree representation, floor planning, t tree representation

Procedia PDF Downloads 368
644 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein

Abstract:

The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.

Keywords: information gain (IG), intrusion detection system (IDS), k-means clustering, Weka

Procedia PDF Downloads 263
643 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Keywords: bipartite graph, one-mode projection, clustering, web proxy detection

Procedia PDF Downloads 220
642 Short Association Bundle Atlas for Lateralization Studies from dMRI Data

Authors: C. Román, M. Guevara, P. Salas, D. Duclap, J. Houenou, C. Poupon, J. F. Mangin, P. Guevara

Abstract:

Diffusion Magnetic Resonance Imaging (dMRI) allows the non-invasive study of human brain white matter. From diffusion data, it is possible to reconstruct fiber trajectories using tractography algorithms. Our previous work consists in an automatic method for the identification of short association bundles of the superficial white matter (SWM), based on a whole brain inter-subject hierarchical clustering applied to a HARDI database. The method finds representative clusters of similar fibers, belonging to a group of subjects, according to a distance measure between fibers, using a non-linear registration (DTI-TK). The algorithm performs an automatic labeling based on the anatomy, defined by a cortex mesh parcelated with FreeSurfer software. The clustering was applied to two independent groups of 37 subjects. The clusters resulting from both groups were compared using a restrictive threshold of mean distance between each pair of bundles from different groups, in order to keep reproducible connections. In the left hemisphere, 48 reproducible bundles were found, while 43 bundles where found in the right hemisphere. An inter-hemispheric bundle correspondence was then applied. The symmetric horizontal reflection of the right bundles was calculated, in order to obtain the position of them in the left hemisphere. Next, the intersection between similar bundles was calculated. The pairs of bundles with a fiber intersection percentage higher than 50% were considered similar. The similar bundles between both hemispheres were fused and symmetrized. We obtained 30 common bundles between hemispheres. An atlas was created with the resulting bundles and used to segment 78 new subjects from another HARDI database, using a distance threshold between 6-8 mm according to the bundle length. Finally, a laterality index was calculated based on the bundle volume. Seven bundles of the atlas presented right laterality (IP_SP_1i, LO_LO_1i, Op_Tr_0i, PoC_PoC_0i, PoC_PreC_2i, PreC_SM_0i, y RoMF_RoMF_0i) and one presented left laterality (IP_SP_2i), there is no tendency of lateralization according to the brain region. Many factors can affect the results, like tractography artifacts, subject registration, and bundle segmentation. Further studies are necessary in order to establish the influence of these factors and evaluate SWM laterality.

Keywords: dMRI, hierarchical clustering, lateralization index, tractography

Procedia PDF Downloads 299
641 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced from various sources with different topics. The availability of this enormous data necessitates us to adopt effective computational tools to explore the data. This leads to an intense growing interest in the research community to develop computational methods focused on processing this text data. A line of study focused on condensing the text so that we are able to get a higher level of understanding in a shorter time. The two important tasks to do this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key important words from a text. This makes us familiar with the general topic of a text. In text summarization, we are interested in producing a short-length text which includes important information about the document. The TextRank algorithm, an unsupervised learning method that is an extension of the PageRank (algorithm which is the base algorithm of Google search engine for searching pages and ranking them), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and declare them as a result. However, this algorithm neglects the semantic similarity between the different parts. In this work, we improved the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used individually or as a part of generating the summary to overcome coverage problems.

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 44
640 The Survey Research and Evaluation of Green Residential Building Based on the Improved Group Analytical Hierarchy Process Method in Yinchuan

Authors: Yun-na Wu, Zhen Wang

Abstract:

Due to the economic downturn and the deterioration of the living environment, the development of residential buildings as high energy consuming building is gradually changing from “extensive” to green building in China. So, the evaluation system of green building is continuously improved, but the current evaluation work has the following problems: (1) There are differences in the cost of the actual investment and the purchasing power of residents, also construction target of green residential building is single and lacks multi-objective performance development. (2) Green building evaluation lacks regional characteristics and cannot reflect the different regional residents demand. (3) In the process of determining the criteria weight, the experts’ judgment matrix is difficult to meet the requirement of consistency. Therefore, to solve those problems, questionnaires which are about the green residential building for Ningxia area are distributed, and the results of questionnaires can feedback the purchasing power of residents and the acceptance of the green building cost. Secondly, combined with the geographical features of Ningxia minority areas, the evaluation criteria system of green residential building is constructed. Finally, using the improved group AHP method and the grey clustering method, the criteria weight is determined, and a real case is evaluated, which is located in Xing Qing district, Ningxia. A conclusion can be obtained that the professional evaluation for this project and good social recognition is basically the same.

Keywords: evaluation, green residential building, grey clustering method, group AHP

Procedia PDF Downloads 367
639 Global Low Carbon Transitions in the Power Sector: A Machine Learning Archetypical Clustering Approach

Authors: Abdullah Alotaiq, David Wallom, Malcolm McCulloch

Abstract:

This study presents an archetype-based approach to designing effective strategies for low-carbon transitions in the power sector. To achieve global energy transition goals, a renewable energy transition is critical, and understanding diverse energy landscapes across different countries is essential to design effective renewable energy policies and strategies. Using a clustering approach, this study identifies 12 energy archetypes based on the electricity mix, socio-economic indicators, and renewable energy contribution potential of 187 UN countries. Each archetype is characterized by distinct challenges and opportunities, ranging from high dependence on fossil fuels to low electricity access, low economic growth, and insufficient contribution potential of renewables. Archetype A, for instance, consists of countries with low electricity access, high poverty rates, and limited power infrastructure, while Archetype J comprises developed countries with high electricity demand and installed renewables. The study findings have significant implications for renewable energy policymaking and investment decisions, with policymakers and investors able to use the archetype approach to identify suitable renewable energy policies and measures and assess renewable energy potential and risks. Overall, the archetype approach provides a comprehensive framework for understanding diverse energy landscapes and accelerating decarbonisation of the power sector.

Keywords: fossil fuels, power plants, energy transition, renewable energy, archetypes

Procedia PDF Downloads 20
638 Recognition and Counting Algorithm for Sub-Regional Objects in a Handwritten Image through Image Sets

Authors: Kothuri Sriraman, Mattupalli Komal Teja

Abstract:

In this paper, a novel algorithm is proposed for the recognition of hulls in a hand written images that might be irregular or digit or character shape. Identification of objects and internal objects is quite difficult to extract, when the structure of the image is having bulk of clusters. The estimation results are easily obtained while going through identifying the sub-regional objects by using the SASK algorithm. Focusing mainly to recognize the number of internal objects exist in a given image, so as it is shadow-free and error-free. The hard clustering and density clustering process of obtained image rough set is used to recognize the differentiated internal objects, if any. In order to find out the internal hull regions it involves three steps pre-processing, Boundary Extraction and finally, apply the Hull Detection system. By detecting the sub-regional hulls it can increase the machine learning capability in detection of characters and it can also be extend in order to get the hull recognition even in irregular shape objects like wise black holes in the space exploration with their intensities. Layered hulls are those having the structured layers inside while it is useful in the Military Services and Traffic to identify the number of vehicles or persons. This proposed SASK algorithm is helpful in making of that kind of identifying the regions and can useful in undergo for the decision process (to clear the traffic, to identify the number of persons in the opponent’s in the war).

Keywords: chain code, Hull regions, Hough transform, Hull recognition, Layered Outline Extraction, SASK algorithm

Procedia PDF Downloads 307
637 Genetic Trait Analysis of RIL Barley Genotypes to Sort-out the Top Ranked Elites for Advanced Yield Breeding Across Multi Environments of Tigray, Ethiopia

Authors: Hailekiros Tadesse Tekle, Yemane Tsehaye, Fetien Abay

Abstract:

Barley (Hordeum vulgare L.) is one of the most important cereal crops in the world, grown for the poor farmers in Tigray with low yield production. The purpose of this research was to estimate the performance of 166 barley genotypes against the quantitative traits with detailed analysis of the variance component, heritability, genetic advance, and genetic usefulness parameters. The finding of ANOVA was highly significant variation (p ≤ 0:01) for all the genotypes. We found significant differences in coefficient of variance (CV of 15%) for 5 traits out of the 12 quantitative traits. The topmost broad sense heritability (H2) was recorded for seeds per spike (98.8%), followed by thousand seed weight (96.5%) with 79.16% and 56.25%, respectively, of GAM. The traits with H2 ≥ 60% and GA/GAM ≥ 20% suggested the least influenced by the environment, governed by the additive genes and direct selection for improvement of such beneficial traits for the studied genotypes. Hence, the 20 outstanding recombinant inbred lines (RIL) barley genotypes performing early maturity, high yield, and 1000 seed weight traits simultaneously were the top ranked group barley genotypes out of the 166 genotypes. These are; G5, G25, G33, G118, G36, G123, G28, G34, G14, G10, G3, G13, G11, G32, G8, G39, G23, G30, G37, and G26. They were early in maturity, high TSW and GYP (TSW ≥ 55 g, GYP ≥ 15.22 g/plant, and DTM below 106 days). In general, the 166 genotypes were classified as high (group 1), medium (group 2), and low yield production (group 3) genotypes in terms of yield and yield component trait analysis by clustering; and genotype parameter analysis such as the heritability, genetic advance, and genetic usefulness traits in this investigation.

Keywords: barley, clustering, genetic advance, heritability, usefulness, variability, yield

Procedia PDF Downloads 45
636 An Analysis of Mongolian Possessive Markers

Authors: Yaxuan Wang

Abstract:

It has long been a mystery that why the Mongolian possessive suffix, which is constrained by Condition A of binding theory, has the ability to probe a potential antecedent outside of its binding domain. This squib argues that binding theory alone is not sufficient to explain the linguistic facts and proposes an analysis adopting the Agree operation. The current analysis correctly predicts all the possible and impossible structures, with an additional hypothesis that Mongolian possessive suffixes serve as an antecedent for PROs in adjunct. The findings thus provide insights into how Agree operates in Mongolian language.

Keywords: syntax, Mongolian, agreement, possessive particles

Procedia PDF Downloads 72
635 A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model

Authors: Jing Wu, Wei Lv, Yibing Li, Yuanfan You

Abstract:

The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications and influences in teleconferencing, hearing aids, speech recognition of machines and so on. The sounds received are usually noisy. The issue of identifying the sounds of interest and obtaining clear sounds in such an environment becomes a problem worth exploring, that is, the problem of blind source separation. This paper focuses on the under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. Firstly, the clustering algorithm is used to estimate the mixing matrix according to the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for the mixing matrix estimation, especially for low signal-to noise ratio (SNR).In response to this problem, this paper considers the idea of an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the inuence of insufficient prior information in traditional clustering algorithm, but also improves the estimation accuracy of mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.

Keywords: DBSCAN, potential function, speech signal, the UBSS model

Procedia PDF Downloads 105
634 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

The quality of a product to be usable has become the basic requirement in consumer’s perspective while failing the requirement ends up the customer from not using the product. Identifying usability issues from analyzing quantitative and qualitative data collected from usability testing and evaluation activities aids in the process of product design, yet the lack of studies and researches regarding analysis methodologies in qualitative text data of usability field inhibits the potential of these data for more useful applications. While the possibility of analyzing qualitative text data found with the rapid development of data analysis studies such as natural language processing field in understanding human language in computer, and machine learning field in providing predictive model and clustering tool. Therefore, this research aims to study the application capability of text processing algorithm in analysis of qualitative text data collected from usability activities. This research utilized datasets collected from LG neckband headset usability experiment in which the datasets consist of headset survey text data, subject’s data and product physical data. In the analysis procedure, which integrated with the text-processing algorithm, the process includes training of comments onto vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of comment vector clustering. The result shows 'volume and music control button' as the usability feature that matches best with the cluster of comment vectors where centroid comments of a cluster emphasized more on button positions, while centroid comments of the other cluster emphasized more on button interface issues. When volume and music control buttons are designed separately, the participant experienced less confusion, and thus, the comments mentioned only about the buttons' positions. While in the situation where the volume and music control buttons are designed as a single button, the participants experienced interface issues regarding the buttons such as operating methods of functions and confusion of functions' buttons. The relevance of the cluster centroid comments with the extracted feature explained the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.

Keywords: usability, qualitative data, text-processing algorithm, natural language processing

Procedia PDF Downloads 250
633 Identification of Watershed Landscape Character Types in Middle Yangtze River within Wuhan Metropolitan Area

Authors: Huijie Wang, Bin Zhang

Abstract:

In China, the middle reaches of the Yangtze River are well-developed, boasting a wealth of different types of watershed landscape. In this regard, landscape character assessment (LCA) can serve as a basis for protection, management and planning of trans-regional watershed landscape types. For this study, we chose the middle reaches of the Yangtze River in Wuhan metropolitan area as our study site, wherein the water system consists of rich variety in landscape types. We analyzed trans-regional data to cluster and identify types of landscape characteristics at two levels. 55 basins were analyzed as variables with topography, land cover and river system features in order to identify the watershed landscape character types. For watershed landscape, drainage density and degree of curvature were specified as special variables to directly reflect the regional differences of river system features. Then, we used the principal component analysis (PCA) method and hierarchical clustering algorithm based on the geographic information system (GIS) and statistical products and services solution (SPSS) to obtain results for clusters of watershed landscape which were divided into 8 characteristic groups. These groups highlighted watershed landscape characteristics of different river systems as well as key landscape characteristics that can serve as a basis for targeted protection of watershed landscape characteristics, thus helping to rationally develop multi-value landscape resources and promote coordinated development of trans-regions.

Keywords: GIS, hierarchical clustering, landscape character, landscape typology, principal component analysis, watershed

Procedia PDF Downloads 178
632 Portfolio Risk Management Using Quantum Annealing

Authors: Thomas Doutre, Emmanuel De Meric De Bellefon

Abstract:

This paper describes the application of local-search metaheuristic quantum annealing to portfolio opti- mization. Heuristic technics are particularly handy when Markowitz’ classical Mean-Variance problem is enriched with additional realistic constraints. Once tailored to the problem, computational experiments on real collected data have shown the superiority of quantum annealing over simulated annealing for this constrained optimization problem, taking advantages of quantum effects such as tunnelling.

Keywords: optimization, portfolio risk management, quantum annealing, metaheuristic

Procedia PDF Downloads 349
631 Ill-Posed Inverse Problems in Molecular Imaging

Authors: Ranadhir Roy

Abstract:

Inverse problems arise in medical (molecular) imaging. These problems are characterized by large in three dimensions, and by the diffusion equation which models the physical phenomena within the media. The inverse problems are posed as a nonlinear optimization where the unknown parameters are found by minimizing the difference between the predicted data and the measured data. To obtain a unique and stable solution to an ill-posed inverse problem, a priori information must be used. Mathematical conditions to obtain stable solutions are established in Tikhonov’s regularization method, where the a priori information is introduced via a stabilizing functional, which may be designed to incorporate some relevant information of an inverse problem. Effective determination of the Tikhonov regularization parameter requires knowledge of the true solution, or in the case of optical imaging, the true image. Yet, in, clinically-based imaging, true image is not known. To alleviate these difficulties we have applied the penalty/modified barrier function (PMBF) method instead of Tikhonov regularization technique to make the inverse problems well-posed. Unlike the Tikhonov regularization method, the constrained optimization technique, which is based on simple bounds of the optical parameter properties of the tissue, can easily be implemented in the PMBF method. Imposing the constraints on the optical properties of the tissue explicitly restricts solution sets and can restore uniqueness. Like the Tikhonov regularization method, the PMBF method limits the size of the condition number of the Hessian matrix of the given objective function. The accuracy and the rapid convergence of the PMBF method require a good initial guess of the Lagrange multipliers. To obtain the initial guess of the multipliers, we use a least square unconstrained minimization problem. Three-dimensional images of fluorescence absorption coefficients and lifetimes were reconstructed from contact and noncontact experimentally measured data.

Keywords: constrained minimization, ill-conditioned inverse problems, Tikhonov regularization method, penalty modified barrier function method

Procedia PDF Downloads 247
630 Comparative Study of Tensile Properties of Cast and Hot Forged Alumina Nanoparticle Reinforced Composites

Authors: S. Ghanaraja, Subrata Ray, S. K. Nath

Abstract:

Particle reinforced Metal Matrix Composite (MMC) succeeds in synergizing the metallic matrix with ceramic particle reinforcements to result in improved strength, particularly at elevated temperatures, but adversely it affects the ductility of the matrix because of agglomeration and porosity. The present study investigates the outcome of tensile properties in a cast and hot forged composite reinforced simultaneously with coarse and fine particles. Nano-sized alumina particles have been generated by milling mixture of aluminum and manganese dioxide powders. Milled particles after drying are added to molten metal and the resulting slurry is cast. The microstructure of the composites shows good distribution of both the size categories of particles without significant clustering. The presence of nanoparticles along with coarser particles in a composite improves both strength and ductility considerably. Delay in debonding of coarser particles to higher stress is due to reduced mismatch in extension caused by increased strain hardening in presence of the nanoparticles. However, higher addition of powder mix beyond a limit results in deterioration of mechanical properties, possibly due to clustering of nanoparticles. The porosity in cast composite generally increases with the increasing addition of powder mix as observed during process and on forging it has got reduced. The base alloy and nanocomposites show improvement in flow stress which could be attributed to lowering of porosity and grain refinement as a consequence of forging.

Keywords: aluminium, alumina, nano-particle reinforced composites, porosity

Procedia PDF Downloads 219
629 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering  

Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi

Abstract:

In the recent decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA. Accordingly, identifying the tissue origin of these DNA fragments from the plasma can result in more accurate and fast disease diagnosis and precise treatment protocols. Open chromatin regions are important epigenetic features of DNA that reflect cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. There have been several studies in the area of cancer liquid biopsy that integrate distinct genomic and epigenomic features for early cancer detection along with tissue of origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which will lead to a huge amount of cost and time. To overcome these limitations, the idea of predicting OCRs from WGS is of particular importance. In this regard, we proposed a computational approach to target the prediction of open chromatin regions as an important epigenetic feature from cell-free DNA whole genome sequence data. To fulfill this objective, local sequencing depth will be fed to our proposed algorithm and the prediction of the most probable open chromatin regions from whole genome sequencing data can be carried out. Our method integrates the signal processing method with sequencing depth data and includes count normalization, Discrete Fourie Transform conversion, graph construction, graph cut optimization by linear programming, and clustering. To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions related to human blood samples of the ATAC-DB database. The percentage of overlap between predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates meaningful prediction. As it is evident, OCRs are mostly located in the transcription start sites (TSS) of the genes. In this regard, we compared the concordance between the predicted OCRs and the human genes TSS regions obtained from refTSS and it showed proper accordance around 52.04% and ~78% with all and the housekeeping genes, respectively. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to the existence of several confounding factors, such as technical and biological variations. Although this approach is in its infancy, there has already been an attempt to apply it, which leads to a tool named OCRDetector with some restrictions like the need for highly depth cfDNA WGS data, prior information about OCRs distribution, and considering multiple features. However, we implemented a graph signal clustering based on a single depth feature in an unsupervised learning manner that resulted in faster performance and decent accuracy. Overall, we tried to investigate the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate genetic and epigenetic aspects of a single whole genome sequencing data for efficient liquid biopsy-related analysis.

Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering

Procedia PDF Downloads 110