Search results for: rare codon clusters
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1430

1370 Visualization and Performance Measure to Determine Number of Topics in Twitter Data Clustering Using Hybrid Topic Modeling

Authors: Moulana Mohammed

Abstract:

Topic models have been widely used to build clusters of documents for more than a decade, yet choosing the optimal number of topics remains problematic. The main problem is the lack of a stable metric for the quality of the topics obtained when constructing topic models. From previous work, the authors found that most of the models used to determine the number of topics are non-parametric, with topic quality assessed by perplexity and coherence measures, and concluded that these are not applicable to this problem. In this paper, we use a parametric method, an extension of the traditional topic model with visual access tendency, to visualize the number of topics (clusters), complement clustering, and choose the optimal number of topics based on the results of cluster validity indices. The hybrid topic models developed are demonstrated on different Twitter datasets covering various topics, both to obtain the optimal number of topics and to measure the quality of the clusters. The experimental results show that the Visual Non-negative Matrix Factorization (VNMF) topic model performs well in determining the optimal number of topics with interactive visualization and in measuring cluster quality with validity indices.

Keywords: interactive visualization, visual non-negative matrix factorization model, optimal number of topics, cluster validity indices, Twitter data clustering

Procedia PDF Downloads 107
1369 The Effective Use of the Network in the Distributed Storage

Authors: Mamouni Mohammed Dhiya Eddine

Abstract:

This work aims at studying the exploitation of high-speed networks of clusters for distributed storage. Parallel applications running on clusters require both high-performance communications between nodes and efficient access to the storage system. Many studies on network technologies led to the design of dedicated architectures for clusters with very fast communications between computing nodes. Efficient distributed storage in clusters has been essentially developed by adding parallelization mechanisms so that the server(s) may sustain an increased workload. In this work, we propose to improve the performance of distributed storage systems in clusters by efficiently using the underlying high-performance network to access distant storage systems. The main question we are addressing is: do high-speed networks of clusters fit the requirements of a transparent, efficient and high-performance access to remote storage? We show that storage requirements are very different from those of parallel computation. High-speed networks of clusters were designed to optimize communications between different nodes of a parallel application. We study their utilization in a very different context, storage in clusters, where client-server models are generally used to access remote storage (for instance NFS, PVFS or LUSTRE). Our experimental study based on the usage of the GM programming interface of MYRINET high-speed networks for distributed storage raised several interesting problems. Firstly, the specific memory utilization in the storage access system layers does not easily fit the traditional memory model of high-speed networks. Secondly, client-server models that are used for distributed storage have specific requirements on message control and event processing, which are not handled by existing interfaces. We propose different solutions to solve communication control problems at the filesystem level. We show that a modification of the network programming interface is required. 
Data transfer issues need an adaptation of the operating system. We detail several propositions for network programming interfaces which make their utilization easier in the context of distributed storage. The integration of a flexible processing of data transfer in the new programming interface MYRINET/MX is finally presented. Performance evaluations show that its usage in the context of both storage and other types of applications is easy and efficient.

Keywords: distributed storage, remote file access, cluster, high-speed network, MYRINET, zero-copy, memory registration, communication control, event notification, application programming interface

Procedia PDF Downloads 194
1368 Synthesis of Rare-Earth Pyrazolate Compounds

Authors: Nazli Eslamirad, Peter C. Junk, Jun Wang, Glen B. Deacon

Abstract:

Because the coordination behavior of pyrazoles and pyrazolate ions is highly versatile towards a wide range of metals, including d-block, f-block, and main-group elements, they attract interest as ligands. A variety of rare-earth pyrazolate complexes have previously been synthesized by redox transmetalation/protolysis (RTP). In this work, RTP reactions with two pyrazoles, 3,5-dimethylpyrazole (Me₂pzH) and 3,5-di-tert-butylpyrazole (t-Bu₂pzH), were used to synthesize rare-earth pyrazolate complexes whose structures span the whole La-Lu series as well as Sc and Y. Further developments are reported: the structure of [Tb(Me₂pz)₃(thf)]₂ was determined and is isomorphous with those of the previously reported [Dy(Me₂pz)₃(thf)]₂ and [Lu(Me₂pz)₃(thf)]₂ analogues, which have two µ-1(N):2(Nʹ)-Me₂pz ligands (the most common pyrazolate ligation for non-rare-earth complexes). Most previously reported compounds prepared from t-Bu₂pzH were monomeric; however, the lanthanum derivative [La(Me₂pz)₃thf₂], which had been reported previously without a crystal structure, has now been structurally characterized, along with its cerium and lutetium analogues. A polymeric structure with samarium, whose neodymium analogue was reported previously, has also been synthesized, and comparing these polymeric structures supports the idea that the geometry of Sm(tBu₂pz)₃ affects the coordination of the solvent. In addition, using 1,2-dimethoxyethane (DME) instead of tetrahydrofuran (THF) gave the new complex [Er(tBu₂pz)₃(dme)₂].

Keywords: lanthanoid complexes, pyrazolate, redox transmetalation/protolysis, x-ray crystal structures

Procedia PDF Downloads 187
1367 Cluster Based Ant Colony Routing Algorithm for Mobile Ad-Hoc Networks

Authors: Alaa Eddien Abdallah, Bajes Yousef Alskarnah

Abstract:

Ant colony based routing algorithms are known to guarantee packet delivery, but they suffer from the huge overhead of the control messages needed to discover the route. In this paper, we use the positions of the network nodes to group the nodes into connected clusters. Only the cluster heads forward the route discovery control messages. Our simulations show that the new algorithm decreases the overhead dramatically without affecting the delivery rate.
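
The position-based grouping described in this abstract can be illustrated with a minimal sketch. The abstract does not say how clusters are formed or how heads are chosen, so the square grid, the `cell_size` parameter, the "closest to the cell centre" election rule, and the function names below are all assumptions made for illustration, not the authors' algorithm.

```python
import math
from collections import defaultdict

def grid_clusters(positions, cell_size):
    """Group node ids by the square grid cell that contains their (x, y) position."""
    clusters = defaultdict(list)
    for node_id, (x, y) in positions.items():
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        clusters[cell].append(node_id)
    return dict(clusters)

def elect_heads(positions, clusters, cell_size):
    """Hypothetical election rule: the node closest to the cell centre becomes head."""
    heads = {}
    for (cx, cy), members in clusters.items():
        centre = ((cx + 0.5) * cell_size, (cy + 0.5) * cell_size)
        heads[(cx, cy)] = min(members, key=lambda n: math.dist(positions[n], centre))
    return heads

# Made-up node positions in metres.
positions = {"a": (1.0, 1.0), "b": (4.0, 4.5), "c": (9.0, 2.0),
             "d": (12.0, 3.0), "e": (6.0, 5.0)}
clusters = grid_clusters(positions, cell_size=10.0)
heads = elect_heads(positions, clusters, cell_size=10.0)

# Only the heads would forward route discovery messages.
print(f"{len(heads)} of {len(positions)} nodes forward control messages: {sorted(heads.values())}")
```

In this sketch, choosing `cell_size` no larger than the radio range divided by √2 keeps every pair of nodes in the same cell within range of each other, because the cell diagonal is `cell_size × √2`. That is one simple way to obtain connected clusters.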

Keywords: ad-hoc network, MANET, ant colony routing, position based routing

Procedia PDF Downloads 393
1366 Conventional Synthesis and Characterization of Zirconium Molybdate, Nd2Zr3(MoO4)9

Authors: G. Çelik Gül, F. Kurtuluş

Abstract:

Complex metal oxides containing rare earths have drawn much attention owing to physical, chemical and optical properties that make them suitable for many applications, such as non-linear optical materials and ion exchangers. We carried out a systematic study to obtain a rare-earth-containing zirconium molybdate compound, characterize it, investigate its crystal system, and calculate its unit cell parameters. After a successful synthesis of Nd2Zr3(MoO4)9, a member of the family of complex oxides containing rare earth metals, X-ray diffraction (XRD), High Score Plus/Rietveld refinement analysis, and Fourier transform infrared spectroscopy (FTIR) were used to determine the crystal structure. Morphological properties and elemental composition were determined by scanning electron microscopy (SEM) and energy dispersive X-ray (EDX) analysis. Thermal properties were observed via thermogravimetric-differential thermal analysis (TG/DTA).

Keywords: Nd₂Zr₃(MoO₄)₉, powder x-ray diffraction, solid state synthesis, zirconium molybdates

Procedia PDF Downloads 366
1365 Novel Spoke-Type BLDC Motor Design for Cost Effective and High Power Density

Authors: Suyong Kim

Abstract:

Recently, because of the rise in the price of rare earth magnets, interest in non-rare-earth or reduced-rare-earth motors has been growing. In particular, spoke-type BLDC (brushless DC) motors with ferrite permanent magnets are in the spotlight as a way to achieve high power density. However, the spoke-type ferrite BLDC motor has considerable magnetic flux leakage in the direction of the rotor shaft. There are two conventional ways to solve this problem, but they either increase the production cost or decrease the power density. Therefore, this paper proposes a new spoke-type BLDC rotor shape that has the advantages of both conventional methods. The new shape consists of a one-piece core in which the inside and the outside of the rotor are open alternately, so it can achieve both reduced production cost and high power density.

Keywords: motor, BLDC, spoke, ferrite

Procedia PDF Downloads 539
1364 Rare Case of Pyoderma Gangrenosum of the Upper Limb

Authors: Karissa A. Graham

Abstract:

Pyoderma gangrenosum (PG) is a rare, prototypic autoinflammatory neutrophilic dermatosis. It presents a diagnostic challenge owing to its variable presentation, its clinical overlap with other conditions, its frequent association with other systemic conditions, and the lack of a definitive histological or laboratory characteristic. The Delphi consensus for PG includes the presence of at least one ulcer on the anterior lower limb. Systemic corticosteroids and immunosuppressive therapies are the mainstay of treatment for PG. We report a case of delayed diagnosis of ulcerative pyoderma gangrenosum on the forearm of a 44-year-old male. The patient presented with an infected ulcer on his right forearm that had been present for over three years. He had type 2 diabetes and no personal or family history of inflammatory bowel disease or other autoimmune diseases. The patient was initially investigated for malignancy, but biopsies showed chronic inflammatory tissue with neutrophilic infiltrate and no malignancy. He was commenced on systemic prednisone for the treatment of pyoderma gangrenosum. The diagnosis of ulcerative PG poses a challenge given the vast differential diagnosis for a cutaneous ulcer (e.g., malignant, vascular, autoimmune, traumatic, or infective causes). Diagnostic accuracy is important because treating PG with steroids is not without risk and may be contraindicated for other potential causes of the ulcer. Indeed, more common and more sinister causes of ulcers should be investigated first, as death from PG is quite rare.

Keywords: dermatological diagnosis, dermatosis, pyoderma gangrenosum, rare presentation

Procedia PDF Downloads 59
1363 Specific Frequency of Globular Clusters in Different Galaxy Types

Authors: Ahmed H. Abdullah, Pavel Kroupa

Abstract:

Globular clusters (GCs) are important objects for tracing the early evolution of a galaxy. We study the correlation between the cluster population and the global properties of the host galaxy. We find that the correlation between the cluster population (NGC) and the baryonic mass (Mb) of the host galaxy is best described as 10^(−5.6038) Mb. In order to understand the origin of the U-shaped relation between the GC specific frequency (SN) and Mb (caused by the high values of SN for dwarf galaxies and giant ellipticals and a minimum SN for intermediate-mass galaxies, ≈ 10^10 M☉), we derive a theoretical model for the specific frequency (SNth). The theoretical model for SNth is based on the slope of the power-law embedded cluster mass function (β) and the different time scales (Δt) of the forming galaxy. Our results show good agreement between the observations and the model at certain values of β and Δt. The model seems able to reproduce the higher values of SNth for β = 1.5 at intermediate formation time scales.

Keywords: galaxies: dwarf, globular cluster: specific frequency, number of globular clusters, formation time scale

Procedia PDF Downloads 292
1362 Design and Optimisation of 2-Oxoglutarate Dioxygenase Expression in Escherichia coli Strains for Production of Bioethylene from Crude Glycerol

Authors: Idan Chiyanzu, Maruping Mangena

Abstract:

Crude glycerol, a major by-product of the transesterification of triacylglycerides with alcohol to produce biodiesel, is known to have a broad range of applications. For example, its bioconversion can afford a wide range of chemicals, including alcohols, organic acids, hydrogen, solvents and intermediate compounds. In bacteria, 2-oxoglutarate dioxygenase (2-OGD) enzymes are widely found among Pseudomonas syringae species and have been recognized as having emerging importance in ethylene formation. However, the use of optimized enzyme function in recombinant systems for the conversion of crude glycerol to ethylene has not yet been reported. The present study investigated the production of ethylene from crude glycerol using engineered E. coli MG1655 and JM109 strains. Ethylene production was studied with an optimized expression system for 2-OGD in E. coli using a codon-optimized construct of the ethylene-forming gene. Codon optimization resulted in a 20-fold increase in protein production and thus enhanced production of ethylene gas. To ensure reliable bioreactor performance, the effects of temperature, fermentation time, pH, substrate concentration, methanol concentration, potassium hydroxide concentration and media supplements on ethylene yield were investigated. The results demonstrate that the recombinant enzyme can be used in future studies to exploit the conversion of low-priced crude glycerol into higher-value products such as light olefins, and that tools including recombineering techniques for DNA, molecular biology, and bioengineering can be used to enable the production of ethylene directly from the fermentation of crude glycerol. It can be concluded that recombinant E. coli production systems represent a secure, renewable and environmentally safe alternative to the thermochemical approach to ethylene production.

Keywords: crude glycerol, bioethylene, recombinant E. coli, optimization

Procedia PDF Downloads 259
1361 Recovery of Rare Earths and Scandium from in situ Leaching Solutions

Authors: Maxim S. Botalov, Svetlana М. Titova, Denis V. Smyshlyaev, Grigory M. Bunkov, Evgeny V. Kirillov, Sergey V. Kirillov, Maxim A. Mashkovtsev, Vladimir N. Rychkov

Abstract:

In uranium production, in-situ leaching (ISL), with its relatively low cost, has become an important technology. As the uranium-bearing orebody most often contains considerable amounts of other metals, particularly rare earth metals (REM), it has become feasible to recover the REM from the barren ISL solutions, from which most of the uranium has been removed. Ural Federal University (UrFU, Ekaterinburg, Russia) has performed joint research on the development of industrial technologies for the extraction of REM and scandium compounds from uranium ISL solutions. Leaching experiments at UrFU have been supported by a multicomponent solution model. The experimental work combines solvent extraction with advanced ion exchange methodology in a pilot facility capable of treating 500 kg/hr of solids. The pilot allows for the recovery of a 99% concentrate of scandium oxide and a collective concentrate with over 50% REM content, with further recovery of heavy and light REM concentrates (99%).

Keywords: extraction, ion exchange, rare earth elements, scandium

Procedia PDF Downloads 203
1360 Harnessing Sunlight for Clean Water: Scalable Approach for Silver-Loaded Titanium Dioxide Nanoparticles

Authors: Satam Alotibi, Muhammad J. Al-Zahrani, Fahd K. Al-Naqidan, Turki S. Hussein, Moteb Alotaibi, Mohammed Alyami, Mahdy M. Elmahdy, Abdellah Kaiba, Fatehia S. Alhakami, Talal F. Qahtan

Abstract:

Water pollution is a critical global challenge that demands scalable and effective solutions for water decontamination. In this captivating research, we unveil a groundbreaking strategy for harnessing solar energy to synthesize silver (Ag) clusters on stable titanium dioxide (TiO₂) nanoparticles dispersed in water, without the need for traditional stabilization agents. These Ag-loaded TiO₂ nanoparticles exhibit exceptional photocatalytic activity, surpassing that of pristine TiO₂ nanoparticles, offering a promising solution for highly efficient water decontamination under sunlight irradiation. To the best of our knowledge, we have developed a unique method to stabilize TiO₂ P25 nanoparticles in water without the use of stabilization agents. This breakthrough allows us to create an ideal platform for the solar-driven synthesis of Ag clusters. Under sunlight irradiation, the stable dispersion of TiO₂ P25 nanoparticles acts as a highly efficient photocatalyst, generating electron-hole pairs. The photogenerated electrons effectively reduce silver ions derived from a silver precursor, resulting in the formation of Ag clusters. The Ag clusters loaded on TiO₂ P25 nanoparticles exhibit remarkable photocatalytic activity for water decontamination under sunlight irradiation. Acting as active sites, these Ag clusters facilitate the generation of reactive oxygen species (ROS) upon exposure to sunlight. These ROS play a pivotal role in rapidly degrading organic pollutants, enabling efficient water decontamination. To confirm the success of our approach, we characterized the synthesized Ag-loaded TiO₂ P25 nanoparticles using cutting-edge analytical techniques, such as transmission electron microscopy (TEM), scanning electron microscopy (SEM), X-ray diffraction (XRD), and spectroscopic methods. These characterizations unequivocally confirm the successful synthesis of Ag clusters on stable TiO₂ P25 nanoparticles without traditional stabilization agents.
Comparative studies were conducted to evaluate the superior photocatalytic performance of Ag-loaded TiO₂ P25 nanoparticles compared to pristine TiO₂ P25 nanoparticles. The Ag clusters loaded on TiO₂ P25 nanoparticles exhibit significantly enhanced photocatalytic activity, benefiting from the synergistic effect between the Ag clusters and TiO₂ nanoparticles, which promotes ROS generation for efficient water decontamination. Our scalable strategy for synthesizing Ag clusters on stable TiO₂ P25 nanoparticles without stabilization agents presents a game-changing solution for highly efficient water decontamination under sunlight irradiation. The use of commercially available TiO₂ P25 nanoparticles streamlines the synthesis process and enables practical scalability. The outstanding photocatalytic performance of Ag-loaded TiO₂ P25 nanoparticles opens up new avenues for their application in large-scale water treatment and remediation processes, addressing the urgent need for sustainable water decontamination solutions.

Keywords: water pollution, solar energy, silver clusters, TiO₂ nanoparticles, photocatalytic activity

Procedia PDF Downloads 41
1359 Biosafety Study of Genetically Modified CEMB Sugarcane on Animals for Glyphosate Tolerance

Authors: Aminah Salim, Idrees Ahmed Nasir, Abdul Qayyum Rao, Muhammad Ali, Muhammad Sohail Anjum, Ayesha Hameed, Bushra Tabassum, Anwar Khan, Arfan Ali, Mariyam Zameer, Tayyab Husnain

Abstract:

This study assessed the risk of transgenic herbicide-tolerant sugarcane carrying the CEMB codon-optimized cp4EPSPS gene. Fifteen-day-old chicks obtained from K&Ns Company were randomly assigned to four groups of eight chicks each: a control group fed a commercial diet, a non-transgenic group fed non-experimental sugarcane, and two transgenic groups fed transgenic sugarcane at the minimum and maximum levels. Body weight, biochemical analyses (urea, alkaline phosphatase, alanine aminotransferase, aspartate aminotransferase, creatinine and bilirubin), and histological examinations of the chicks fed the four types of feed were recorded at fifteen-day intervals, and no significant difference was observed in the body weight, biochemical, or histological results among the four groups. Protein isolated from the serum samples was analyzed by dipstick and SDS-PAGE, which showed the absence of the transgene protein in the serum of the control and experimental groups. Moreover, DNA isolated from the chicks' blood and from the commercial diet was amplified with cp4EPSPS gene-specific primers to determine the presence and mobility of any nucleotide fragment of the transgene in or from the feed; no amplification was obtained from the feed or from the blood-extracted DNA of any group. In addition, no mRNA expression of the cp4EPSPS gene was detected in any tissue of the four groups of chicks. These results indicate no deleterious or harmful effect of the CEMB codon-optimized transgenic cp4EPSPS sugarcane on the chicks' health.

Keywords: chicks, cp4EPSPS, glyphosate, sugarcane

Procedia PDF Downloads 345
1358 Photoluminescence Properties of Lu1.98Er0.02Ti2O7 Pyrochlore (A2B2O7) Phosphor

Authors: Esra Öztürk, Erkul Karacaoglu

Abstract:

Pyrochlores, compounds with the general formula A2B2O7 (where A and B are metals/rare earths), are an important class of materials with technological applications in luminescence, ionic conductivity, nuclear waste immobilization, etc. Pyrochlore compounds containing rare earths also have potential photoluminescence characteristics. In this context, Er3+-activated Lu2Ti2O7 pyrochlore was chosen and synthesized in this study through a high-temperature solid-state reaction route, with sintering in an open atmosphere. Thermal analysis (DTA/TG) was carried out to determine the optimal reaction conditions for obtaining the expected single-phase system. X-ray powder diffraction (XRD) was used to determine the phase properties of the sample. Photoluminescence (PL) measurements were performed with a PL spectrometer at room temperature to obtain the excitation, emission and decay time properties. According to the PL results, there are excitation bands at 352 nm, 388 nm, 423 nm and 453 nm that are due to the 4I15/2 → 2G7/2, 4I15/2 → 4G11/2 and 4I15/2 → 4F5/2 transitions of Er3+ ions, respectively. The emission bands are located at 582 nm, 677 nm and 762 nm and are associated with the 2H11/2, 4S3/2 → 4I15/2, 4F9/2 → 4I15/2, and 4I9/2 → 4I15/2 transitions of Er3+ ions, respectively.

Keywords: Er3+, Lu2Ti2O7, photoluminescence, pyrochlore, rare-earths

Procedia PDF Downloads 245
1357 Changing New York Financial Clusters in the 2000s: Modeling the Impact and Policy Implication of the Global Financial Crisis

Authors: Silvia Lorenzo, Hongmian Gong

Abstract:

With the influx of research assessing the economic impact of the global financial crisis of 2007-8, a spatial analysis based on empirical data is needed to better understand the spatial significance of the financial crisis in New York, a key international financial center that is also considered the origin of the crisis. Using spatial statistics, financial clusters specializing in credit and securities are identified throughout the New York metropolitan area for 2000 and 2010, the periods before and after the height of the global financial crisis. Geographically Weighted Regressions are then used to examine the processes underlying the formation and movement of financial geographies across the states, counties and ZIP codes of the New York metropolitan area throughout the 2000s, with specific attention to tax regimes, employment, household income, technology, and transportation hubs. This analysis provides useful inputs for financial risk management and public policy initiatives aimed at addressing regional economic sustainability across state boundaries, while also laying the groundwork for further research on a spatial analysis of the global financial crisis.

Keywords: financial clusters, New York, global financial crisis, geographically weighted regression

Procedia PDF Downloads 276
1356 Approaching a Tat-Rev Independent HIV-1 Clone towards a Model for Research

Authors: Walter Vera-Ortega, Idoia Busnadiego, Sam J. Wilson

Abstract:

Introduction: Human Immunodeficiency Virus type 1 (HIV-1) is responsible for the acquired immunodeficiency syndrome (AIDS), a leading cause of death worldwide, infecting millions of people each year. Despite intensive research in vaccine development, therapies against HIV-1 infection are not curative, and the huge genetic variability of HIV-1 challenges drug development. Current animal models for HIV-1 research present important limitations, impairing the progress of in vivo approaches. Macaques require CD8+ depletion to progress to AIDS, and the maintenance cost is high. Mice are a cheaper alternative but need to be 'humanized,' and breeding is not possible. The development of an HIV-1 clone able to replicate in mice is a challenging proposal. The lack of human co-factors in mice impedes the function of the HIV-1 accessory proteins, Tat and Rev, hampering HIV-1 replication. However, Tat and Rev function can be replaced by constitutive/chimeric promoters, codon-optimized proteins and the constitutive transport element (CTE), generating a novel HIV-1 clone able to replicate in mice without disrupting the amino acid sequence of the virus. By minimally manipulating the genomic 'identity' of the virus, we propose the generation of an HIV-1 clone able to replicate in mice to assist in antiviral drug development. Methods: i) Plasmid construction: The chimeric promoters and CTE copies were cloned by PCR using lentiviral vectors as templates (pCGSW and pSIV-MPCG). Tat mutants were generated from replication-competent HIV-1 plasmids (NHG and NL4-3). ii) Infectivity assays: Retroviral vectors were generated by transfection of human 293T cells and murine NIH 3T3 cells. Virus titre was determined by flow cytometry measuring GFP expression. Human B-cells (AA-2) and HeLa cells (TZM-bl) were used for infectivity assays. iii) Protein analysis: Tat protein expression was determined by TZM-bl assay and HIV-1 capsid by western blot.
Results: We have determined that NIH 3T3 cells are able to generate HIV-1 particles. However, they are not infectious, and further analysis needs to be performed. Codon-optimized HIV-1 constructs are efficiently made in 293T cells in a Tat- and Rev-independent manner and are capable of packaging a competent genome in trans. CSGW is capable of generating infectious particles in the absence of Tat and Rev in human cells when 4 copies of the CTE are placed preceding the 3’LTR. HIV-1 Tat mutant clones encoding different promoters are functional during the first cycle of replication when Tat is added in trans. Conclusion: Our findings suggest that the development of a Tat-Rev independent HIV-1 clone is a challenging but achievable aim. However, further investigations are needed prior to presenting our HIV-1 clone as a candidate model for research.

Keywords: codon-optimized, constitutive transport element, HIV-1, long terminal repeats, research model

Procedia PDF Downloads 278
1355 Atomic Hydrogen Storage in Hexagonal GdNi5 and GdNi4Cu Rare Earth Compounds: A Comparative Density Functional Theory Study

Authors: A. Kellou, L. Rouaiguia, L. Rabahi

Abstract:

In the present work, the atomic hydrogen absorption trend in the GdNi5 and GdNi4Cu rare earth compounds with the hexagonal CaCu5-type crystal structure (space group P6/mmm) is investigated. Density functional theory (DFT) combined with the generalized gradient approximation (GGA) is used to study the site preference of atomic hydrogen at 0 K. The octahedral and tetrahedral interstitial sites are considered. The formation energies and structural properties are determined in order to evaluate the effects of hydrogen on the stability of the studied compounds. The energetic diagram of hydrogen storage is established and compared for GdNi5 and GdNi4Cu. The magnetic properties of the selected compounds are determined using spin-polarized calculations. The results obtained with and without hydrogen addition are discussed in light of available theoretical and experimental results.

Keywords: density functional theory, hydrogen storage, rare earth compounds, structural and magnetic properties

Procedia PDF Downloads 85
1354 Unseen Classes: The Paradigm Shift in Machine Learning

Authors: Vani Singhal, Jitendra Parmar, Satyendra Singh Chouhan

Abstract:

Unseen class discovery has become an important part of machine-learning algorithms that must judge new classes. Unseen classes are classes on which the machine learning model has not been trained. With advances in technology and AI increasingly replacing humans, the amount of data has grown enormously, so when a model is applied to real-world examples, it comes across new, unseen classes. Our aim is to find the number of unseen classes using a hierarchical active learning algorithm. The algorithm is based on hierarchical clustering as well as active sampling. The number of clusters obtained at the end gives the number of unseen classes. The total clusters will also contain some clusters that have unseen classes. Instead of first discovering unseen classes and then finding their number, we calculate the number directly by applying the algorithm. The dataset used is for intent classification; the target is the intent of the corresponding query. We conclude that when the machine learning model encounters real-world data, it can automatically find the number of unseen classes. In future work, we plan to label these unseen classes correctly.
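
The idea of reading the number of unseen classes off a hierarchical clustering can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: it uses plain single-linkage agglomerative clustering with a hypothetical distance `threshold`, and it replaces active sampling with a fixed set of points whose labels are already known. The 2-D points stand in for query embeddings and are made up.

```python
import math

def single_linkage_clusters(points, threshold):
    """Agglomerative single-linkage clustering: repeatedly merge the two closest
    clusters until the closest pair is farther apart than `threshold`."""
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def count_unseen(clusters, known_points):
    """Count clusters that contain no point with a known (trained-on) label."""
    known = set(known_points)
    return sum(1 for c in clusters if not any(p in known for p in c))

points = [(0.0, 0.0), (0.5, 0.2),      # near a known class
          (5.0, 5.0), (5.2, 4.9),      # near another known class
          (10.0, 0.0), (10.3, 0.1),    # no known label nearby
          (0.0, 10.0)]                 # no known label nearby
known_points = [(0.0, 0.0), (5.0, 5.0)]

clusters = single_linkage_clusters(points, threshold=1.0)
unseen = count_unseen(clusters, known_points)
print(len(clusters), "clusters,", unseen, "without any known label")
```

In an active learning setting, `known_points` would instead be built up by querying an oracle for the label of selected samples from each cluster.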

Keywords: active sampling, hierarchical clustering, open world learning, unseen class discovery

Procedia PDF Downloads 136
1353 An Approach for Association Rules Ranking

Authors: Rihab Idoudi, Karim Saheb Ettabaa, Basel Solaiman, Kamel Hamrouni

Abstract:

Medical association rule induction is used to discover useful correlations between pertinent concepts in large medical databases. Nevertheless, association rule (AR) algorithms produce a huge number of rules and do not guarantee the usefulness and interestingness of the generated knowledge. To overcome this drawback, we propose an ontology-based interestingness measure for ranking ARs. According to domain experts, the goal of using ARs is to discover implicit relationships between items of different categories, such as ‘clinical features and disorders’, ‘clinical features and radiological observations’, etc. That is to say, itemsets composed of ‘similar’ items are uninteresting. Therefore, the dissimilarity between a rule’s items can be used to judge the interestingness of association rules: the more different the items are, the more interesting the rule is. In this paper, we design a distinct approach for ranking semantically interesting association rules that uses ontology knowledge mining. The basic idea is to organize the ontology’s concepts into a hierarchical structure of conceptual clusters of targeted subjects, where each cluster encapsulates ‘similar’ concepts suggesting a specific category of the domain knowledge. The interestingness of an association rule is then defined as the dissimilarity between the corresponding clusters. That is to say, the further apart the clusters of the items in the AR are, the more interesting the rule is. We apply the method in our domain of interest, the mammographic domain, using an existing mammographic ontology called Mammo, with the goal of deriving interesting rules from past experiences and discovering implicit relationships between the concepts modeling the domain.
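
The ranking principle in this abstract (rules whose items fall in more distant conceptual clusters are more interesting) can be sketched in a few lines. The abstract does not give the actual dissimilarity measure, so this sketch assumes the number of edges between two clusters in the hierarchy, averaged over all item pairs of a rule. The cluster hierarchy, the item names, and the function names are hypothetical and are not taken from the Mammo ontology.

```python
from itertools import combinations

# Hypothetical cluster hierarchy: child cluster -> parent cluster.
PARENT = {
    "clinical_features": "root",
    "disorders": "root",
    "radiological_observations": "root",
    "mass_findings": "radiological_observations",
    "calcification_findings": "radiological_observations",
}

# Hypothetical assignment of items to conceptual clusters.
ITEM_CLUSTER = {
    "palpable_lump": "clinical_features",
    "nipple_discharge": "clinical_features",
    "carcinoma": "disorders",
    "spiculated_mass": "mass_findings",
    "round_mass": "mass_findings",
    "microcalcification": "calcification_findings",
}

def path_to_root(cluster):
    """List of clusters from `cluster` up to the root of the hierarchy."""
    path = [cluster]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def cluster_distance(a, b):
    """Number of edges between two clusters, via their lowest common ancestor."""
    pa, pb = path_to_root(a), path_to_root(b)
    depth_in_pb = {c: i for i, c in enumerate(pb)}
    for i, c in enumerate(pa):
        if c in depth_in_pb:
            return i + depth_in_pb[c]
    raise ValueError("clusters are not in the same hierarchy")

def interestingness(rule_items):
    """Mean pairwise cluster distance over the items of a rule (needs >= 2 items)."""
    pairs = list(combinations(rule_items, 2))
    total = sum(cluster_distance(ITEM_CLUSTER[x], ITEM_CLUSTER[y]) for x, y in pairs)
    return total / len(pairs)

rules = [
    ("palpable_lump", "nipple_discharge"),
    ("spiculated_mass", "round_mass"),
    ("spiculated_mass", "microcalcification"),
    ("palpable_lump", "spiculated_mass"),
]
ranked = sorted(rules, key=interestingness, reverse=True)
for rule in ranked:
    print(interestingness(rule), rule)
```

Here a rule linking a clinical feature to a radiological finding ranks above a rule linking two radiological findings, and rules whose items share a cluster score zero.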

Keywords: association rule, conceptual clusters, interestingness measures, ontology knowledge mining, ranking

Procedia PDF Downloads 299
1352 A Polynomial Time Clustering Algorithm for Solving the Assignment Problem in the Vehicle Routing Problem

Authors: Lydia Wahid, Mona F. Ahmed, Nevin Darwish

Abstract:

The vehicle routing problem (VRP) consists of a group of customers that need to be served. Each customer has a certain demand for goods. A central depot with a fleet of vehicles is responsible for supplying the customers with their demands. The problem is composed of two subproblems. The first is an assignment problem, in which the number of vehicles to be used and the customers assigned to each vehicle are determined. The second is the routing problem, in which the order of customer visits is determined for each vehicle and its assigned customers. Both the number of vehicles and the total distance should be optimal. In this paper, an approach for solving the first subproblem (the assignment problem) is presented. In this approach, a clustering algorithm is proposed for finding the optimal number of vehicles by grouping the customers into clusters, where each cluster is visited by one vehicle. Finding the optimal number of clusters is NP-hard. This work presents a polynomial time clustering algorithm for finding the optimal number of clusters and solving the assignment problem.

Keywords: vehicle routing problems, clustering algorithms, Clarke and Wright Saving Method, agglomerative hierarchical clustering

Procedia PDF Downloads 357
1351 Multi-Cluster Overlapping K-Means Extension Algorithm (MCOKE)

Authors: Said Baadel, Fadi Thabtah, Joan Lu

Abstract:

Clustering involves the partitioning of n objects into k clusters. Many clustering algorithms use hard-partitioning techniques where each object is assigned to one cluster. In this paper, we propose an overlapping algorithm, MCOKE, which allows objects to belong to one or more clusters. The algorithm differs from fuzzy clustering techniques because objects that overlap are assigned a membership value of 1 (one) as opposed to a fuzzy membership degree. The algorithm also differs from other overlapping algorithms that require a similarity threshold to be defined a priori, which can be difficult for novice users to determine.
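The overlap idea can be illustrated with a minimal sketch. It assumes centroids from an already completed k-means run and, as an assumption not stated in the abstract, uses the largest distance between any object and its own centroid (`max_dist`) as the overlap threshold, so that no user-supplied similarity threshold is needed. The function name `overlapping_assign` and the toy data are hypothetical.

```python
import math

def overlapping_assign(points, centroids):
    """Return a binary membership matrix (rows: points, columns: clusters)."""
    # Distance from every point to every centroid.
    d = [[math.dist(p, c) for c in centroids] for p in points]
    # Hard step: index of the nearest centroid for each point.
    nearest = [min(range(len(centroids)), key=row.__getitem__) for row in d]
    # Assumed threshold: the largest point-to-own-centroid distance.
    max_dist = max(row[j] for row, j in zip(d, nearest))
    # Overlap step: membership is exactly 1 for every cluster within max_dist.
    return [[1 if dist <= max_dist else 0 for dist in row] for row in d]

points = [(0, 0), (1, 0), (4, 0), (5, 0), (2.5, 0)]
centroids = [(0.5, 0), (4.5, 0)]
memberships = overlapping_assign(points, centroids)
```

In this toy data the middle point is equally far from both centroids, so it receives a membership of 1 in both clusters, while the other points stay in a single cluster.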

Keywords: data mining, k-means, MCOKE, overlapping

Procedia PDF Downloads 536
1350 An Adaptive Oversampling Technique for Imbalanced Datasets

Authors: Shaukat Ali Shahee, Usha Ananthakumar

Abstract:

A data set exhibits the class imbalance problem when one class has very few examples compared to the other class; this is also referred to as between-class imbalance. Traditional classifiers fail to classify minority class examples correctly because of their bias towards the majority class. Apart from between-class imbalance, imbalance within classes, where classes are composed of different numbers of sub-clusters and these sub-clusters contain different numbers of examples, also degrades classifier performance. Many methods have previously been proposed for handling the imbalanced dataset problem. These methods can be classified into four categories: data preprocessing, algorithm-based methods, cost-based methods, and ensembles of classifiers. Data preprocessing techniques have shown great potential, as they attempt to improve the data distribution rather than the classifier. Data preprocessing handles class imbalance either by increasing the number of minority class examples or by decreasing the number of majority class examples. Decreasing the majority class examples leads to a loss of information, and when the minority class is rare in absolute terms, removing majority class examples is generally not recommended. Existing methods for handling class imbalance do not address between-class imbalance and within-class imbalance simultaneously. In this paper, we propose a method that handles between-class and within-class imbalance simultaneously for binary classification problems. Removing both kinds of imbalance simultaneously eliminates the bias of the classifier towards bigger sub-clusters by minimizing the domination of the total error by bigger sub-clusters. The proposed method uses model-based clustering to find sub-clusters or sub-concepts in the dataset. The number of examples oversampled in each sub-cluster is determined by the complexity of the sub-clusters. The method also takes into consideration the scatter of the data in the feature space and adaptively copes with unseen test data using the Lowner-John ellipsoid to increase the accuracy of the classifier. In this study, a neural network is used because it is a classifier that minimizes the total error, and removing between-class and within-class imbalance simultaneously helps the classifier give equal weight to all sub-clusters irrespective of class. The proposed method is validated on 9 publicly available data sets and compared with three existing oversampling techniques that rely on the spatial location of minority class examples in the Euclidean feature space. The experimental results show the proposed method to be statistically significantly superior to the other methods on various accuracy measures. Thus, the proposed method can serve as a good alternative for handling problem domains such as credit scoring, customer churn prediction, and financial distress that typically involve imbalanced data sets.
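For orientation, the simplest form of "increasing the minority class examples" can be sketched as plain random oversampling. This is a baseline illustration only, not the adaptive, sub-cluster-aware method proposed in the abstract; the function name `random_oversample` and the toy data are hypothetical.

```python
import random

def random_oversample(X, y, minority_label, seed=0):
    """Duplicate randomly chosen minority examples until both classes match in size."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    if not minority:
        raise ValueError("no examples carry the minority label")
    n_majority = len(y) - len(minority)
    # Number of extra copies needed; range() is empty if already balanced.
    extra = [rng.choice(minority) for _ in range(n_majority - len(minority))]
    return X + extra, y + [minority_label] * len(extra)

X = [[0.0], [1.0], [2.0], [3.0], [9.0]]
y = [0, 0, 0, 0, 1]
X_balanced, y_balanced = random_oversample(X, y, minority_label=1)
```

Because this baseline only copies existing points, it addresses between-class imbalance and ignores the within-class (sub-cluster) imbalance that the abstract targets.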

Keywords: classification, imbalanced dataset, Lowner-John ellipsoid, model based clustering, oversampling

Procedia PDF Downloads 389
1349 Direct Organogenesis of Begonia Rex cv. DS-EYWA, An Unique Rare Cultivar, via Thin Cell Layering (TCL) Technique

Authors: Mahboubeh Davoudi Pahnekolayi

Abstract:

Begonia rex cv. DS-EYWA is a rare, unique cultivar of Begonia rex with curly, colorful leaves. The main purpose of this experiment was to optimize an efficient in vitro regeneration protocol based on transverse Thin Cell Layer (tTCL) petiole explants for large-scale production of this cultivar. Various concentrations of Plant Growth Regulators (PGRs), including 6-Benzylaminopurine (BAP), Thidiazuron (TDZ), and α-Naphthaleneacetic Acid (NAA), were tested in a Completely Randomized Design (CRD) to establish and optimize the direct organogenesis efficiency of this cultivar. Cultivation of 1 mm tTCL petiole explants in these treatments showed that 1.5 mg l⁻¹ BAP + 0.5 mg l⁻¹ NAA induced the highest number of directly regenerated shoots, and that a lower concentration of BAP (0.5 mg l⁻¹) can be suggested for shoot elongation before the rooting stage. Elongated shoots were successfully rooted in PGR-free MS basal medium and acclimatized in a sterilized 1:1 peat moss:perlite pot mixture.

Keywords: begonia rare cultivar, direct organogenesis, explant type, regeneration, thin cell layering (TCL)

Procedia PDF Downloads 39
1348 Assessing Functional Structure in European Marine Ecosystems Using a Vector-Autoregressive Spatio-Temporal Model

Authors: Katyana A. Vert-Pre, James T. Thorson, Thomas Trancart, Eric Feunteun

Abstract:

In marine ecosystems, spatial and temporal species structure is an important component of an ecosystem’s response to anthropogenic and environmental factors. Although spatial distribution patterns and time series of fish abundance have been studied in the past, little research has addressed the joint dynamic spatio-temporal functional patterns in marine ecosystems and their use in multispecies management and conservation. Each species represents a function in the ecosystem, and the distribution of these species might not be random. A heterogeneous functional distribution makes an ecosystem more resilient to external factors. Applying a Vector-Autoregressive Spatio-Temporal (VAST) model for count data, we estimate the spatio-temporal distribution, shift in time, and abundance of 140 species of the Eastern English Channel, Bay of Biscay, and Mediterranean Sea. From the model outputs, we determined spatio-temporal clusters, calculating p-values for hierarchical clustering via multiscale bootstrap resampling. We then designed a functional map based on the defined clusters. We found that the species distribution within the ecosystem was not random: species evolved in space and time in clusters. Moreover, these clusters remained similar over time, because species of the same cluster often shifted in sync, keeping the overall structure of the ecosystem stable. Knowing the co-existing species within these clusters could help predict the distribution and abundance of data-poor species. Further analysis is being performed to assess the ecological functions represented in each cluster.

Keywords: cluster distribution shift, European marine ecosystems, functional distribution, spatio-temporal model

Procedia PDF Downloads 166
1347 Investigation of Clusters of MRSA Cases in a Hospital in Western Kenya

Authors: Lillian Musila, Valerie Oundo, Daniel Erwin, Willie Sang

Abstract:

Staphylococcus aureus infections are a major cause of nosocomial infections in Kenya. Methicillin-resistant S. aureus (MRSA) infections are a significant burden to public health and are associated with considerable morbidity and mortality. At a hospital in Western Kenya, two clusters of MRSA cases emerged within short periods of time. In this study, we explored whether these clusters represented a nosocomial outbreak by characterizing the isolates using phenotypic and molecular assays and examining epidemiological data to identify possible transmission patterns. Specimens from the subjects' sites of infection were collected and cultured, and S. aureus isolates were identified phenotypically and confirmed by APIStaph™. MRSA were identified by cefoxitin disk screening per CLSI guidelines and were further characterized based on their antibiotic susceptibility patterns and spa gene typing. Characteristics of cases with MRSA isolates were compared with those of cases with MSSA isolated around the same time period. Two cases of MRSA infection were identified in the two-week period between 21 April and 4 May 2015. A further 2 MRSA isolates were identified on a single day, 7 September 2015. The antibiotic resistance patterns of the two MRSA isolates in the first cluster were different, suggesting that these were distinct isolates. One isolate had spa type t2029 and the other had a novel spa type. The 2 isolates were obtained from urine and an open skin wound. In the second cluster, the antibiotic susceptibility patterns were similar, but the isolates had different spa types: one was t037 and the other a novel spa type different from the novel MRSA spa type in the first cluster. Both cases in the second cluster were admitted to the hospital, but one infection was community-acquired and the other hospital-acquired. Only one of the four MRSA cases was classified as an HAI, from an infection acquired post-operatively. When compared to other S. aureus strains isolated within the same time period from the same hospital, only one spa type, t2029, was found in both MRSA and non-MRSA strains. None of the cases infected with MRSA in the two clusters shared any common epidemiological characteristic such as age, sex, or known risk factors for MRSA such as prolonged hospitalization or institutionalization. These data suggest that the observed MRSA clusters were multi-strain clusters and not an outbreak of a single strain. There was no clear relationship between the isolates by spa type, suggesting that no transmission was occurring within the hospital between these cluster cases but rather that the majority of the MRSA strains were circulating in the community. There was high diversity of spa types among the MRSA strains, with none of the isolates sharing spa types. Identification of disease clusters in space and time is critical for immediate infection control action and patient management. Spa gene typing is a rapid way of confirming or ruling out MRSA outbreaks so that costly interventions are applied only when necessary.

Keywords: cluster, Kenya, MRSA, spa typing

Procedia PDF Downloads 289
1346 Genetic Diversity Based Population Study of Freshwater Mud Eel (Monopterus cuchia) in Bangladesh

Authors: M. F. Miah, K. M. A. Zinnah, M. J. Raihan, H. Ali, M. N. Naser

Abstract:

Because genetic diversity is important for the survival, breeding, and production of any fish, this study was undertaken to investigate the genetic diversity of the freshwater mud eel, Monopterus cuchia, at the population level. Three ecological populations were considered: the flooded area of Sylhet (P1), the open water of Moulvibazar (P2), and the open water of Sunamganj (P3) districts of Bangladesh. Four arbitrary RAPD primers (OPB-12, C0-4, B-03 and OPB-08) were screened, and RAPD banding patterns were analyzed among the populations using 15 individuals from each population. In total, 174, 138 and 149 bands were detected in P1, P2 and P3, respectively; however, each primer revealed a smaller number of bands in each population. All loci (100%) were polymorphic in P2 and P3, whereas one monomorphic locus was observed in P1, giving 97.5% polymorphism. Different genetic parameters, such as inter-individual pairwise similarity, genetic distance, Nei genetic similarity, linkage distance, cluster analysis, and allelic information, were considered for measuring genetic diversity. The average inter-individual pairwise similarity was 2.98, 1.47 and 1.35 in P1, P2 and P3, respectively. In the genetic distance analysis, the highest distance (1) was recorded in P2 and P3, and the lowest genetic distance (0.444) was found in P2. The average Nei genetic similarity was 0.19, 0.16 and 0.13 in P1, P2 and P3, respectively, while the average linkage distance was 24.92, 17.14 and 15.28 in P1, P3 and P2, respectively. Based on linkage distance, genetic clusters were generated for the three populations: 6 clades and 7 clusters were found in P1, 3 clades and 5 clusters in P2, and 4 clades and 7 clusters in P3. In addition, the frequencies of the p and q alleles were 0.093 and 0.907 in P1, 0.076 and 0.924 in P2, and 0.074 and 0.926 in P3, respectively. The average gene diversity was highest in P2 (0.132), followed by P3 (0.131) and P1 (0.121).
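Two of the reported quantities can be reproduced with simple arithmetic, sketched below under stated assumptions. The helper names are hypothetical, the total of 40 loci in P1 is inferred from the 97.5% figure rather than stated in the abstract, and the gene diversity formula assumed here is the standard single-locus, two-allele form h = 1 - (p² + q²).

```python
def percent_polymorphic(total_loci, monomorphic_loci):
    """Share of loci that show variation, in percent."""
    return 100.0 * (total_loci - monomorphic_loci) / total_loci

def gene_diversity(p, q):
    """Gene diversity of one biallelic locus: h = 1 - (p^2 + q^2)."""
    return 1.0 - (p * p + q * q)

# One monomorphic locus out of an inferred 40 gives the reported 97.5%.
p1_polymorphism = percent_polymorphic(40, 1)

# Diversity evaluated at the average allele frequencies reported for P1.
p1_diversity_at_mean = gene_diversity(0.093, 0.907)
```

The value computed at the average allele frequencies need not equal the reported average gene diversity, because averaging per-locus diversities is not the same as evaluating the formula at averaged frequencies.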

Keywords: genetic diversity, Monopterus cuchia, population, RAPD, Bangladesh

Procedia PDF Downloads 468
1345 The Relationship between Proximity to Sources of Industrial-Related Outdoor Air Pollution and Children Emergency Department Visits for Asthma in the Census Metropolitan Area of Edmonton, Canada, 2004/2005 to 2009/2010

Authors: Laura A. Rodriguez-Villamizar, Alvaro Osornio-Vargas, Brian H. Rowe, Rhonda J. Rosychuk

Abstract:

Introduction/Objectives: The Census Metropolitan Area of Edmonton (CMAE) receives substantial industrial air emissions from the Industrial Heartland Alberta (IHA) in the northeast and the coal-fired power plants (CFPP) in the west. The objective of the study was to explore the presence of clusters of children's asthma ED visits in the areas around the IHA and the CFPP. Methods: Retrospective data on children's asthma ED visits were collected at the dissemination area (DA) level for children between 2 and 14 years of age living in the CMAE between April 1, 2004, and March 31, 2010. We conducted a spatial analysis of disease clusters around putative sources with count (ecological) data using descriptive, hypothesis testing, and multivariable modeling analysis. Results: The mean crude rate of asthma ED visits was 9.3 per 1,000 children per year during the study period. A circular spatial scan test for cases and events identified a cluster of children's asthma ED visits in the DA where the CFPP are located, in the Wabamun area. No clusters were identified around the IHA area. The multivariable models suggest that there is a significant decline in risk for children's asthma ED visits as distance from the CFPP area increases; this effect is modified in the SE direction (mean angle 125.58 degrees), where the risk increases with distance. In contrast, the regression models for the IHA suggest that there is a significant increase in risk for children's asthma ED visits as distance from the IHA area increases; this effect is modified in the SW direction (mean angle 216.52 degrees), where the risk increases at shorter distances. Conclusions: Different methods for detecting clusters of disease consistently suggested the existence of a cluster of children's asthma ED visits around the CFPP but not around the IHA within the CMAE. These results are probably explained by the direction of air pollutant dispersion caused by the predominant and subdominant wind directions at each point. Using different approaches to detect clusters of disease is valuable for better understanding the presence, shape, direction, and size of disease clusters around pollution sources.

Keywords: air pollution, asthma, disease cluster, industry

Procedia PDF Downloads 254
1344 Proposing an Algorithm to Cluster Ad Hoc Networks, Modulating Two Levels of Learning Automaton and Nodes Additive Weighting

Authors: Mohammad Rostami, Mohammad Reza Forghani, Elahe Neshat, Fatemeh Yaghoobi

Abstract:

An ad hoc network consists of wireless mobile devices that connect to each other without any infrastructure, using their own connection equipment. The best way to form a hierarchical structure in such a network is clustering. Clustering methods that take node mobility into account can form more stable clusters. In this research, we propose an algorithm that first assigns a weight to each node based on two factors, namely link stability and power reduction rate. In the second phase, the cellular learning automaton uses the weights allocated in the first phase to pick the nodes that are candidates for cluster head. In the third phase, the learning automaton selects the cluster head nodes and member nodes and forms the clusters. The automaton thus learns from its environment and can form clusters that are optimized in terms of power consumption and link stability. To simulate the proposed algorithm, we used OMNeT++ 4.2.2. Simulation results indicate that the newly formed clusters have a longer lifetime than those of previous algorithms and strongly decrease network overload by reducing the update rate.

Keywords: mobile Ad Hoc networks, clustering, learning automaton, cellular automaton, battery power

Procedia PDF Downloads 376
1343 Cultural Landscape Planning – A Case of Chettinad Village Clusters

Authors: Adhithy Menon E., Biju C. A.

Abstract:

In the 1960s, the concept of preserving heritage monuments was first introduced. During the 1990s, the concept of cultural landscapes gained importance, highlighting the significance of culture and heritage. This paper examines the second category of cultural landscape, the organically evolving landscape, which represents a web of tangible, intangible, and ecological heritage, and the ways in which such landscapes can be rejuvenated. Cultural landscapes in various regions, such as the Chettinad village clusters, are in serious decline, as identified through the Heritage Passport program of this area (2007). It is therefore necessary to analyze in detail the factors that contribute to this degradation in order to ensure the protection of these landscapes in the future. This paper presents an analysis of the cultural landscape of the Chettinad village clusters and its impact on the community. The first objective is to understand cultural landscapes and their different criteria and categories. It is followed by a study of various methods for protecting cultural landscapes. To identify a core area of intervention based on the parameters of Cultural Landscapes and Community Based Tourism, a study and analysis of the regional context of the Chettinad village clusters, considering tourism development, must first be conducted. Lastly, planning interventions are proposed for integrating community-based tourism in the Chettinad villages in order to rejuvenate the cultural landscapes of the villages as well as their communities. The major findings include the importance of the local community in protecting cultural landscapes. The parameters identified as having an impact on the Chettinad village clusters are community (community well-being, local maintenance and enhancement, demand, alternative income for the community, public participation, awareness), tourism (location and physical access, journey time, tourist attractions), integrity (natural factors, natural disasters, demolition of structures, deterioration of materials), authenticity (sense of place, living elements, building techniques, artistic expression, religious context), disaster management (natural disasters), and environmental impact (pollution). This area can be restored to its former glory and preserved as part of the cultural landscape for future generations by focusing on and addressing these parameters within the identified core area of the Chettinad village clusters (Kanadukathan TP, Kothamangalam, Kottaiyur, Athangudi, Karaikudi, and Palathur).

Keywords: Chettinad village clusters, community, cultural landscapes, organically evolved

Procedia PDF Downloads 48
1342 A Learning-Based EM Mixture Regression Algorithm

Authors: Yi-Cheng Tian, Miin-Shen Yang

Abstract:

The mixture likelihood approach to clustering is a popular clustering method in which the expectation-maximization (EM) algorithm is the most widely used technique. In the literature, the EM algorithm has been used for mixture regression models. However, these EM mixture regression algorithms are sensitive to initial values and require the number of clusters to be given a priori. In this paper, to resolve these drawbacks, we construct a learning-based schema for the EM mixture regression algorithm such that it is free of initializations and can automatically obtain an approximately optimal number of clusters. Some numerical examples and comparisons demonstrate the superiority and usefulness of the proposed learning-based EM mixture regression algorithm.
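To make the baseline concrete, the following is a minimal sketch of the conventional EM algorithm for a mixture of simple linear regressions, that is, the variant that needs a fixed number of components and starting values. It is not the learning-based schema proposed in the abstract. The function name `em_mixture_regression`, the Gaussian noise model, the variance floor, and the toy data are all assumptions made for illustration.

```python
import math

def em_mixture_regression(xs, ys, params, n_iter=100):
    """EM for a mixture of regressions y = a + b*x + Gaussian noise.

    params: one (weight, intercept, slope, sigma) tuple per component.
    The number of components and the starting values come from the caller.
    """
    n = len(xs)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point,
        # computed from log-densities for numerical stability.
        resp = []
        for x, y in zip(xs, ys):
            logd = [math.log(w) - math.log(s) - 0.5 * ((y - a - b * x) / s) ** 2
                    for w, a, b, s in params]
            top = max(logd)
            unnorm = [math.exp(v - top) for v in logd]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: weighted least squares and weighted residual variance.
        new_params = []
        for k in range(len(params)):
            r = [row[k] for row in resp]
            n_k = sum(r)
            mean_x = sum(ri * x for ri, x in zip(r, xs)) / n_k
            mean_y = sum(ri * y for ri, y in zip(r, ys)) / n_k
            sxx = sum(ri * (x - mean_x) ** 2 for ri, x in zip(r, xs))
            sxy = sum(ri * (x - mean_x) * (y - mean_y)
                      for ri, x, y in zip(r, xs, ys))
            slope = sxy / sxx
            intercept = mean_y - slope * mean_x
            var = sum(ri * (y - intercept - slope * x) ** 2
                      for ri, x, y in zip(r, xs, ys)) / n_k
            sigma = max(math.sqrt(var), 1e-3)  # floor avoids a zero variance
            new_params.append((n_k / n, intercept, slope, sigma))
        params = new_params
    return params

# Toy data: two noisy lines, y = 1 + 2x and y = -5 - x.
xs = [float(i) for i in range(10)] * 2
noise = [0.1 * (-1) ** i for i in range(10)]
ys = ([1 + 2 * x + e for x, e in zip(xs[:10], noise)]
      + [-5 - x + e for x, e in zip(xs[10:], noise)])
start = [(0.5, 0.0, 1.0, 3.0), (0.5, -4.0, -1.0, 3.0)]
fitted = em_mixture_regression(xs, ys, start)
```

The `start` values here were chosen close to the true lines; with poorer starting values or a wrong component count this plain EM can settle on a different solution, which is the sensitivity the abstract sets out to remove.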

Keywords: clustering, EM algorithm, Gaussian mixture model, mixture regression model

Procedia PDF Downloads 478
1341 A Relative Entropy Regularization Approach for Fuzzy C-Means Clustering Problem

Authors: Ouafa Amira, Jiangshe Zhang

Abstract:

Clustering is an unsupervised machine learning technique whose aim is to extract the structure of the data by grouping similar data objects in the same cluster and dissimilar objects in different clusters. Clustering methods are widely used in fields such as image processing, computer vision, and pattern recognition. Fuzzy c-means clustering (FCM) is one of the best-known fuzzy clustering methods. It is based on solving an optimization problem in which a given cost function is minimized. This minimization aims to decrease the dissimilarity inside clusters, where dissimilarity is measured by the distances between data objects and cluster centers. The degree to which a data point belongs to a cluster is measured by a membership function that takes values in the interval [0, 1]. In FCM clustering, the membership degree is constrained by the condition that the sum of a data object’s memberships in all clusters must equal one. This constraint can cause several problems, especially when the data objects lie in a noisy space. Regularization approaches have been applied to the fuzzy c-means clustering technique; regularization introduces additional information in order to solve an ill-posed optimization problem. In this study, we focus on regularization by relative entropy, where the optimization problem again aims to minimize the dissimilarity inside clusters. Our objective is to find an appropriate membership degree for each data object, because an appropriate membership degree leads to an accurate clustering result. Our clustering results on synthetic data sets, Gaussian-based data sets, and real-world data sets show that the proposed model achieves good accuracy.
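The membership constraint described above can be seen in a minimal sketch of the standard (unregularized) fuzzy c-means updates. This is the textbook FCM iteration, not the relative-entropy-regularized model of the abstract; the function names, the fuzzifier value `m=2.0`, and the toy data are assumptions made for illustration.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Membership of each point in each cluster; every row sums to one."""
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        if 0.0 in d:
            # The point sits exactly on one or more centers.
            hits = d.count(0.0)
            u.append([1.0 / hits if di == 0.0 else 0.0 for di in d])
            continue
        inv = [di ** -power for di in d]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u

def fcm_centers(points, u, m=2.0):
    """Cluster centers as means of the points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append(tuple(
            sum(wj * p[a] for wj, p in zip(w, points)) / total
            for a in range(dim)))
    return centers

def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(n_iter):
        u = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, u, m)
    return centers, fcm_memberships(points, centers, m)

points = [(0.0,), (1.0,), (9.0,), (10.0,)]
centers, u = fuzzy_c_means(points, [(2.0,), (8.0,)])
```

Because every row of `u` is forced to sum to one, a distant noisy point still receives full total membership spread across the clusters, which is the kind of issue the abstract attributes to this constraint.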

Keywords: clustering, fuzzy c-means, regularization, relative entropy

Procedia PDF Downloads 240