Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4

Search results for: Atsuko Okazaki

4 Frequent Pattern Mining for Digenic Human Traits

Authors: Atsuko Okazaki, Jurg Ott

Abstract:

Some genetic diseases (‘digenic traits’) are due to the interaction between two DNA variants. For example, certain forms of Retinitis Pigmentosa (a genetic form of blindness) occur in the presence of two mutant variants, one in the ROM1 gene and one in the RDS gene, while the occurrence of only one of these mutant variants leads to a completely normal phenotype. Detecting such digenic traits by genetic methods is difficult. A common approach to finding disease-causing variants is to compare hundreds of thousands of variants between individuals with a trait (cases) and those without the trait (controls). Such genome-wide association studies (GWASs) have been very successful but hinge on genetic effects of single variants; that is, there should be a difference in allele or genotype frequencies between cases and controls at a disease-causing variant. Frequent pattern mining (FPM) methods offer an avenue for detecting digenic traits even in the absence of single-variant effects. The idea is to enumerate pairs of genotypes (genotype patterns), with the two genotypes in each pair originating from different variants that may be located at very different genomic positions. What is needed is for genotype patterns to be significantly more common in cases than in controls. Let Y = 2 refer to cases and Y = 1 to controls, with X denoting a specific genotype pattern. We seek association rules, ‘X → Y’, whose confidence, P(Y = 2|X), is significantly higher than the proportion of cases, P(Y = 2), in the study. Generally available FPM methods are therefore well suited to detecting disease-associated genotype patterns. We used fpgrowth as the basic FPM algorithm and built a framework around it to enumerate high-frequency digenic genotype patterns and to evaluate their statistical significance by permutation analysis. Application to a published dataset on opioid dependence furnished results that could not be found with classical GWAS methodology.
There were 143 cases and 153 healthy controls, each genotyped for 82 variants in eight genes of the opioid system. The aim was to find out whether any of these variants were disease-associated. The single-variant analysis did not lead to significant results. Application of our FPM implementation resulted in one significant (p < 0.01) genotype pattern, with both genotypes in the pattern being heterozygous and originating from two variants on different chromosomes. This pattern occurred in 14 cases and none of the controls. Thus, the pattern seems quite specific to this form of substance abuse and is also rather predictive of disease. An algorithm called Multifactor Dimension Reduction (MDR) was developed some 20 years ago and has been in use in human genetics ever since. MDR and our algorithm share some properties but differ in other respects. The main difference seems to be that our algorithm focuses on patterns of genotypes, while the main object of inference in MDR is the 3 × 3 table of genotypes at two variants.
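
The rule confidence P(Y = 2|X) and the permutation analysis described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it replaces fpgrowth with brute-force enumeration of all variant pairs, and the function names, the genotype coding (0, 1, 2), the `min_support` threshold, and the choice of the maximum confidence as the test statistic are all assumptions made for illustration.

```python
import random
from itertools import combinations


def pattern_counts(genotypes, labels):
    """Count every digenic genotype pattern.

    genotypes: list of rows, one per individual, each a list of genotype
               codes (assumed 0, 1, 2) with one entry per variant.
    labels:    2 for a case, 1 for a control (as in the abstract).
    Returns {pattern: (carriers, carriers_that_are_cases)} where a pattern
    is ((variant_i, genotype_i), (variant_j, genotype_j)) with i < j.
    """
    counts = {}
    for row, y in zip(genotypes, labels):
        for i, j in combinations(range(len(row)), 2):
            key = ((i, row[i]), (j, row[j]))
            total, cases = counts.get(key, (0, 0))
            counts[key] = (total + 1, cases + (1 if y == 2 else 0))
    return counts


def best_confidence(genotypes, labels, min_support):
    """Highest confidence P(Y = 2 | X) among patterns with enough carriers."""
    best = 0.0
    for total, cases in pattern_counts(genotypes, labels).values():
        if total >= min_support:
            best = max(best, cases / total)
    return best


def permutation_p_value(genotypes, labels, min_support, n_perm=200, seed=0):
    """Permute case/control labels to assess the best observed confidence.

    Using the maximum over all patterns as the statistic accounts for the
    fact that many patterns are tested at once (an assumed design choice).
    """
    rng = random.Random(seed)
    observed = best_confidence(genotypes, labels, min_support)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_confidence(genotypes, shuffled, min_support) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_p_value(genotypes, labels, min_support=10)` on a genotype matrix returns the best observed confidence and its permutation p-value. Brute-force enumeration is only practical for small panels such as the 82 variants mentioned above; a dedicated FPM algorithm would be needed at genome-wide scale.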

Keywords: digenic traits, DNA variants, epistasis, statistical genetics

Procedia PDF Downloads 53
3 The Microwave and Far Infrared Spectra of Acetaldehyde-d1 in vt=2

Authors: A. Larrousi, M. Elkeurti, K. Amara, M. Zemouli, L. H. Coudert, I. R. Medvedev, F. C. De Lucia, Atsuko Maeda, R. W. C. McKellar, D. Appadoo

Abstract:

Experimental and theoretical investigations of the microwave and far infrared spectra of CH3COD are reported. Two hundred twelve lines were identified in the far infrared spectrum recorded using the Canadian synchrotron radiation light source. Two thousand one hundred and sixty-eight lines in vt=0,1 and 216 in vt=2 have been measured in the microwave spectrum obtained using the fast scan submillimeter spectroscopic technique. A global analysis of the new data and of already available microwave lines has been carried out and yielded values for rotation–torsion parameters. The unitless weighted standard deviation of the fit is 1.6. 46 parameters and 216 lines were identified.

Keywords: CH3COD, torsion, microwave spectra, far infrared spectra, high resolution

Procedia PDF Downloads 268
2 Evaluation of Radioprotective Effect of Solanun melongena L. in the Survival of Lasioderma serricorne (Coleoptera, Anobiidae) Irradiated with Gamma Rays of Cobalt-60

Authors: Adilson C. Barros, Kayo Okazaki, Valter Arthur

Abstract:

Radioprotective substances protect an organism from ionizing radiation when ingested beforehand. Synthetic radioprotectors produce unpleasant side effects and are expensive. This article reports a search for natural radioprotective agents in foods, which are widely available, cost less, and have no side effects. In this work, we studied the eggplant, a food widely used in Brazil, comparing the radiosensitivity of insects reared on an eggplant diet with that of insects reared without it. Eggplant changes the LD50 of the insect population, but the response curve needs to be better characterized before anything can be concluded about radioprotection. What we can see is that eggplant seems to contain some radiomodifying substance.

Keywords: radioprotector, radiobiology, Solanun melongena L., Lasioderma serricorne

Procedia PDF Downloads 366
1 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

A central challenge for educational institutions is to maximize student performance and minimize the failure rate of poor-performing students. An effective way to address this challenge is to identify student learning patterns and their most influential factors, and to predict student learning outcomes early enough to set up policies for improvement. Educational data mining (EDM) is an emerging field that draws on data mining, statistics, and machine learning to extract useful knowledge and information for improving and developing the education environment. The aim of this work is to propose EDM techniques and integrate them into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, the best-performing models are further developed to improve their performance. The hybrid random forest (Hybrid RF) produces the most successful classification. To support intervention and improve learning outcomes, a feature selection method, MICHI, which combines the mutual information (MI) and chi-square (CHI) algorithms based on ranked feature scores, is introduced to select a dominant feature set that improves prediction performance; the resulting dominant set also serves as information for intervention. Using the proposed EDM techniques, an academic performance prediction system (APPS) is then developed so that educational stakeholders can obtain early predictions of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals intervene and improve student performance.
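
The abstract does not spell out how MICHI combines the MI and CHI rankings, so the following is only a sketch of one plausible reading: each feature is ranked separately by mutual information and by the Pearson chi-square statistic against the target, and the two ranks are summed. The rank-sum rule, the restriction to discrete features, the function names, and the example feature names are all assumptions, not the authors' method.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def chi_square(x, y):
    """Pearson chi-square statistic of the x-by-y contingency table."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    stat = 0.0
    for a in px:
        for b in py:
            expected = px[a] * py[b] / n
            stat += (pxy[(a, b)] - expected) ** 2 / expected
    return stat


def ranks(scores):
    """Rank 1 goes to the highest score; ties keep their input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def michi_like_selection(features, target, k):
    """Select k features by the sum of their MI rank and CHI rank.

    features: dict mapping a feature name to a list of discrete values.
    target:   list of class labels, same length as each feature list.
    The rank-sum combination is an assumption made for this sketch.
    """
    names = list(features)
    mi = [mutual_information(features[name], target) for name in names]
    chi = [chi_square(features[name], target) for name in names]
    combined = [a + b for a, b in zip(ranks(mi), ranks(chi))]
    order = sorted(range(len(names)), key=lambda i: combined[i])
    return [names[i] for i in order[:k]]
```

For instance, with a binary pass/fail target, a feature that matches the target exactly ranks first under both scores, while a feature that is statistically independent of the target scores zero on both and is dropped.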

Keywords: academic performance prediction system, educational data mining, dominant factors, feature selection method, prediction model, student performance

Procedia PDF Downloads 39