Search results for: λ-levelwise statistical cluster points
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6777


6447 Keypoint Detection Method Based on Multi-Scale Feature Fusion of Attention Mechanism

Authors: Xiaoxiao Li, Shuangcheng Jia, Qian Li

Abstract:

Keypoint detection has always been a challenge in the field of image recognition. This paper proposes a novel keypoint detection method called Multi-Scale Feature Fusion Convolutional Network with Attention (MFFCNA). We verify that multi-scale features combined with an attention mechanism module have better feature expression capability. Feature fusion between different scales enriches the information the network model can express and makes the network easier to converge. On our self-made street sign corner dataset, the MFFCNA model achieves an accuracy of 97.8% and a recall of 81%, which are 5 and 8 percentage points higher than the HRNet network, respectively. On the COCO dataset, the AP is 71.9% and the AR is 75.3%, which are 3 and 2 points higher than HRNet, respectively. Extensive experiments show that our method brings a remarkable improvement on keypoint recognition tasks, and its recognition performance is better than that of existing methods. Moreover, our method can be applied not only to keypoint detection but also to image classification and semantic segmentation, showing good generality.

Keywords: keypoint detection, feature fusion, attention, semantic segmentation

Procedia PDF Downloads 92
6446 Physical Activity and Nutrition Intervention for Singaporean Women Aged 50 Years and Above: A Study Protocol for a Community Based Randomised Controlled Trial

Authors: Elaine Yee Sing Wong, Jonine Jancey, Andy H. Lee, Anthony P. James

Abstract:

Singapore has a rapidly aging population, and the majority of older women aged 50 years and above are physically inactive and have unhealthy dietary habits, placing them at high risk of non-communicable diseases. Given the multiplicity of less than optimal dietary habits and high levels of physical inactivity among Singaporean women, it is imperative to develop appropriate lifestyle interventions at recreational centres to enhance both their physical and nutritional knowledge, as well as provide them with the opportunity to develop skills to support behaviour change. To the best of our knowledge, this proposed study is the first physical activity and nutrition cluster randomised controlled trial conducted in Singapore for older women. Findings from this study may provide insights and recommendations for policy makers and key stakeholders to create new healthy-living recreational centres with supportive environments. This 6-month community-based cluster randomised controlled trial will involve the implementation and evaluation of a physical activity and nutrition program for community-dwelling Singaporean women who currently attend recreational centres that promote social leisure activities in their local neighbourhood. The intervention will include dietary education and counselling sessions, physical activity classes, and telephone contact by certified fitness instructors and qualified nutritionists. Social Cognitive Theory with Motivational Interviewing will inform the development of strategies to support health behaviour change. Sixty recreational centres located in Singapore will be randomly selected from five major geographical districts and randomly allocated to the intervention (n=30) or control (n=30) cluster. A sample of 600 (intervention n=300; control n=300) women aged 50 years and above will then be recruited from these recreational centres. The control clusters will only undergo pre- and post-intervention data collection and will not receive the intervention. It is hypothesised that by the end of the intervention, the intervention group participants (n=300), compared to the control group (n=300), will show significant improvements in the following variables: lipid profile, body mass index, physical activity and dietary behaviour, anthropometry, and mental and physical health. Data will be examined and compared using the Statistical Package for the Social Sciences (SPSS) version 23. Descriptive and summary statistics will be used to quantify participants' characteristics and outcome variables. Multi-variable mixed regression analyses will be used to confirm the effects of the proposed health intervention, taking into account the repeated measures and the clustering of the observations. The research protocol was approved by the Curtin University Human Research Ethics Committee (approval number: HRE2016-0366). The study has been registered with the Australian and New Zealand Clinical Trial Registry (12617001022358).

Keywords: community based, healthy aging, intervention, nutrition, older women, physical activity

Procedia PDF Downloads 146
6445 Increasing the Apparent Time Resolution of Tc-99m Diethylenetriamine Pentaacetic Acid Galactosyl Human Serum Albumin Dynamic SPECT by Use of an 180-Degree Interpolation Method

Authors: Yasuyuki Takahashi, Maya Yamashita, Kyoko Saito

Abstract:

In general, dynamic SPECT data acquisition needs a few minutes for one rotation, so the time-activity curve (TAC) derived from dynamic SPECT is relatively coarse. In order to effectively shorten the interval between data points, we adopted a 180-degree interpolation method, which is already used for the reconstruction of X-ray CT data. In this study, we applied this 180-degree interpolation method to SPECT and investigated its effectiveness. To briefly describe the method: the 180-degree data in the second half of one rotation are combined with the 180-degree data in the first half of the next rotation to generate a 360-degree data set appropriate for the time halfway between the two rotations. In both a phantom and a patient study, the data points from the interpolated images were in good agreement with the data points tracking the accumulation of 99mTc activity over time for the appropriate regions of interest. We conclude that data derived from interpolated images improve the apparent time resolution of dynamic SPECT.
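As a rough sketch of the half-rotation combination described above (function name and data layout are illustrative, not code from the paper), the idea is simply to splice adjacent half-rotations into intermediate 360-degree sets:

```python
def interpolate_rotations(rotations):
    """Interleave full-rotation data sets with intermediate sets built
    from the second half of one rotation and the first half of the next,
    roughly doubling the temporal sampling of the time-activity curve.
    Each rotation is a list of projection samples covering 360 degrees."""
    result = [rotations[0]]
    for prev, nxt in zip(rotations, rotations[1:]):
        half = len(prev) // 2
        # 180 degrees from the end of one rotation plus 180 degrees from
        # the start of the next: a 360-degree set for the halfway time
        intermediate = prev[half:] + nxt[:half]
        result.append(intermediate)
        result.append(nxt)
    return result
```

With two 4-projection rotations, one intermediate set appears between them, so n rotations yield 2n-1 time points.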

Keywords: dynamic SPECT, time resolution, 180-degree interpolation method, 99mTc-GSA

Procedia PDF Downloads 473
6444 Studying the Effects of Job Training on Employees' Efficiency: A Case Study of University Employees, Qom, Iran

Authors: Seyfollah Fazlollahi, Ahmad Bayan Memar

Abstract:

Background: A review of manpower planning includes a training analysis based on job descriptions and job specifications, which looks carefully at training from the points of view of the company, its various departments and its personnel. This may reveal weaknesses in some departments, and as a result, training is needed for the staff. Purpose: The aim of this research is to investigate the effects of training on employees' efficiency in different aspects of work. Methodology: This is a descriptive survey study. The statistical population comprised 85 official employees of the University of Qom, Iran; 70 of these individuals were selected by random statistical sampling using the Morgan and Gorki table. The instrument used in this study was a questionnaire of 22 questions. Result: Data analysis indicates that the majority of respondents had a positive attitude towards training programs, whether on the job or off the job. They believed that training programs promoted and enhanced their behavior positively, leading to high efficiency in their jobs. In fact, the data support the main hypothesis that training has positive effects on job performance and efficiency. Conclusion: It is concluded from this study and other related research that training (on the job and off the job) has a positive and effective role in human development and in employees' efficiency. Employees become acquainted with the different tasks of a job; group cooperation, creativity and innovation are reinforced. Training builds job skills and increases knowledge and information about a job. It also increases technical and conceptual human skills, which are important in an organization. We can also mention workers' increased positive motivation toward their jobs, reinforced morale, good human relations and good contact with clients.

Keywords: training, work efficiency, employee, human relation, job satisfaction

Procedia PDF Downloads 174
6443 Background Check System for Turkish IT Companies

Authors: Arzu Baloglu, Ugur Kaplancali

Abstract:

This paper focuses on background check systems and pre-employment screening. In our study, we attempted to build an online background checking site that helps employers when hiring employees. Our site has two types of users: free and powered users. Free users are the employees, and powered users are the employers who will hire employees. The database of the site contains all the information about the employees and employers registered in the system, so employers can search by their own criteria to find a suitable employee for the job. The site also has a comments and points system: a current employer can leave comments on his or her employees and can also give them points. The comments are shown on the employee's profile, so when an employer searches for an employee, he or she can check the employee's points and comments to see whether the candidate is capable of the job. Employers can also follow selected employees if they desire. The system has been designed and implemented using ASP.NET, C# and JavaScript, and it provides a user-friendly interface. The interface is also intended to provide useful information for Turkish technology companies.

Keywords: background, checking, verification, human resources, online

Procedia PDF Downloads 166
6442 A Semantic and Concise Structure to Represent Human Actions

Authors: Tobias Strübing, Fatemeh Ziaeetabar

Abstract:

Humans usually manipulate objects with their hands. To represent these actions in a simple and understandable way, we need a semantic framework. For this purpose, the Semantic Event Chain (SEC) method has already been presented, which works by considering touching and non-touching relations between manipulated objects in a scene. This method was improved by a computational model, the so-called enriched Semantic Event Chain (eSEC), which incorporates information on static (e.g. top, bottom) and dynamic spatial relations (e.g. moving apart, getting closer) between objects in an action scene. This leads to better action prediction as well as the ability to distinguish between more actions. Each eSEC manipulation descriptor is a huge matrix with thirty rows and a massive set of spatial relations between each pair of manipulated objects. The current eSEC framework has so far only been used in the category of manipulation actions, which ultimately involve two hands. Here, we would like to extend this approach to a whole-body action descriptor and build a conjoint activity representation structure. For this purpose, we need a statistical analysis to modify the current eSEC by summarizing it while preserving its features, introducing a new version called Enhanced eSEC (e2SEC). This summarization can be done from two points of view: 1) reducing the number of rows in an eSEC matrix, and 2) shrinking the set of possible semantic spatial relations. To achieve these, we computed the importance of each matrix row statistically, to see whether a particular row can be removed while all manipulations remain distinguishable from each other. On the other hand, we examined which semantic spatial relations can be merged without compromising the unity of the predefined manipulation actions. By performing the above analyses, we built the new e2SEC framework, which has 20% fewer rows, 16.7% fewer static spatial relations and 11.1% fewer dynamic spatial relations. This simplification, while preserving the salient features of a semantic structure for representing actions, has a tremendous impact on the recognition and prediction of complex actions, as well as on interactions between humans and robots. It also creates a comprehensive platform for integration with body limb descriptors and dramatically increases system performance, especially in complex real-time applications such as human-robot interaction prediction.

Keywords: enriched semantic event chain, semantic action representation, spatial relations, statistical analysis

Procedia PDF Downloads 83
6441 Hierarchical Cluster Analysis of Raw Milk Samples Obtained from Organic and Conventional Dairy Farming in Autonomous Province of Vojvodina, Serbia

Authors: Lidija Jevrić, Denis Kučević, Sanja Podunavac-Kuzmanović, Strahinja Kovačević, Milica Karadžić

Abstract:

In the present study, Hierarchical Cluster Analysis (HCA) was applied in order to determine the differences between milk samples originating from a conventional dairy farm (CF) and an organic dairy farm (OF) in AP Vojvodina, Republic of Serbia. The clustering was based on the average values of saturated fatty acid (SFA) and unsaturated fatty acid (UFA) content obtained for every season; the HCA therefore included the annual SFA and UFA content values. The clustering procedure was carried out on the basis of Euclidean distances and the single linkage algorithm. The obtained dendrograms indicated that the clustering of UFA in OF was much more uniform compared to the clustering of UFA in CF. In OF, spring stands out from the other months of the year; similarly in CF, winter is separated from the other months. These results could be expected, because the fatty acid composition is greatly influenced by the season and by the nutrition of dairy cows during the year.
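The clustering procedure described above (Euclidean distances, single linkage) can be sketched in plain Python. This naive agglomerative implementation is illustrative only, not the authors' software:

```python
import math

def single_linkage(points, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    whose closest members are nearest in Euclidean distance (single
    linkage), until n_clusters clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters
```

Feeding in, say, seasonal (SFA, UFA) averages per farm would group the seasons whose fatty acid profiles are most alike, which is essentially what the dendrograms above visualize.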

Keywords: chemometrics, clustering, food engineering, milk quality

Procedia PDF Downloads 252
6440 Examining Statistical Monitoring Approach against Traditional Monitoring Techniques in Detecting Data Anomalies during Conduct of Clinical Trials

Authors: Sheikh Omar Sillah

Abstract:

Introduction: Monitoring is an important means of ensuring the smooth implementation and quality of clinical trials. For many years, traditional site monitoring approaches have been critical in detecting data errors, but they are not optimal for identifying fabricated and implanted data or non-random data distributions that may significantly invalidate study results. The objective of this paper was to provide recommendations, based on best statistical monitoring practices, for detecting data-integrity issues suggestive of fabrication and implantation early in study conduct, to allow the implementation of meaningful corrective and preventive actions. Methodology: Electronic bibliographic databases (Medline, Embase, PubMed, Scopus, and Web of Science) were used for the literature search, and both qualitative and quantitative studies were sought. Search results were uploaded into the Eppi-Reviewer software, and only publications written in English from 2012 onward were included in the review. Gray literature not considered to present reproducible methods was excluded. Results: A total of 18 peer-reviewed publications were included in the review. The publications demonstrated that traditional site monitoring techniques are not efficient at detecting data anomalies. By specifying project-specific parameters such as laboratory reference range values, visit schedules, etc., with appropriate interactive data monitoring, statistical monitoring can offer study teams early signals of data anomalies. The review further revealed that statistical monitoring is useful for identifying unusual data patterns that might reveal issues affecting data integrity or potentially impacting study participants' safety. However, subjective measures may not be good candidates for statistical monitoring.
Conclusion: The statistical monitoring approach requires a combination of education, training, and experience sufficient to implement its principles in detecting data anomalies for the statistical aspects of a clinical trial.

Keywords: statistical monitoring, data anomalies, clinical trials, traditional monitoring

Procedia PDF Downloads 45
6439 The Impact of Covid-19 on Anxiety Levels in the General Population of the United States: An Exploratory Survey

Authors: Amro Matyori, Fatimah Sherbeny, Askal Ali, Olayiwola Popoola

Abstract:

Objectives: The study evaluated the impact of COVID-19 on anxiety levels in the general population of the United States. Methods: The study used an online questionnaire that adopted the Generalized Anxiety Disorder Assessment (GAD-7) instrument, a self-administered seven-item scale used as a screening tool and severity measure for generalized anxiety disorder. Participants rated the frequency of anxiety symptoms over the last two weeks on a Likert scale ranging from 0 to 3, and the item points were summed to give the total score. Results: Thirty-two participants completed the questionnaire, among them 24 (83%) females and 5 (17%) males. The 18-24-year-old age range represented the most respondents. Only one of the participants tested positive for COVID-19, while for 39% of them a family member, friend, or colleague tested positive for the coronavirus. Moreover, 10% had lost a family member, a close friend, or a colleague to COVID-19. Ten respondents scored approximately five points on the GAD-7 scale, indicating mild anxiety; eight participants scored 10 to 14 points, placing them in the moderate anxiety category; and one individual, who scored 15 points, was categorized under severe anxiety. Conclusions: Most respondents scored in the mild anxiety category during the COVID-19 pandemic. Severe anxiety was the least common among the participants, and people who tested positive and/or whose family members, close friends, or colleagues tested positive were more likely to experience anxiety. Participants who had lost friends or family members were also at high risk of anxiety. Evidently, the outcomes of COVID-19 and excessive rumination about the pandemic put people under stress, which led to anxiety. Therefore, continuous assessment and monitoring of psychological outcomes during pandemics will help to establish early, well-informed interventions.
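For reference, GAD-7 scoring as described above (seven 0-3 Likert items summed, with the conventional severity cut-offs of 5, 10 and 15) can be expressed as a small helper. The function name and error handling are our own, not part of the study:

```python
def gad7_score(responses):
    """Sum seven 0-3 Likert responses and map the total (0-21) to the
    usual GAD-7 severity bands: minimal (<5), mild (5-9),
    moderate (10-14), severe (>=15)."""
    if len(responses) != 7 or not all(0 <= r <= 3 for r in responses):
        raise ValueError("GAD-7 expects seven responses scored 0-3")
    total = sum(responses)
    if total >= 15:
        band = "severe"
    elif total >= 10:
        band = "moderate"
    elif total >= 5:
        band = "mild"
    else:
        band = "minimal"
    return total, band
```

This matches the categories reported in the abstract: a participant scoring 15 points lands in the severe band, while totals of 10 to 14 are moderate.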

Keywords: anxiety and covid-19, covid-19 and mental health outcomes, influence of covid-19 on anxiety, population and covid-19 impact on mental health

Procedia PDF Downloads 182
6438 Turning Points in the Development of Translator Training in the West from the 1980s to the Present

Authors: B. Sayaheen

Abstract:

The translator’s competence is one of the topics that has received a great deal of research attention in the field of translation studies, because such competencies are still debatable and not yet agreed upon, and scholars tackle the topic from different points of view. Approaches to teaching these competencies have gone through several developments. This paper aims to investigate those developments, exploring the major turning points and shifts in teaching methods in translator training, and to discuss the significance of these turning points and their external or internal causes. Based on the past and present status of teaching approaches in translator training, this paper also tries to predict their future. The paper is mainly concerned with developments of teaching approaches in the West from the 1980s to the present. The reason behind choosing this specific period is not that translator training started in the 1980s, but that most criticism of the teacher-centered approach started at that time. The implications of this research stem from the fact that it identifies the turning points and the causes that led teachers to adopt student-centered rather than teacher-centered approaches, and then to incorporate technology and the Internet in translator training; these causes are classified as external or internal. Translation programs in the West and in other cultures can benefit from this study. Programs in the West can note that teaching translation is geared toward incorporating more technologies; if these programs already use technology and the Internet to teach translation, they might benefit from the assumed future direction of the field. On the other hand, some non-Western countries, and to be specific some professors, still apply the teacher-centered approach. Moreover, these programs should include technology and the Internet in their teaching approaches to meet the drastic changes in the translation process, which seems to rely increasingly on software and technologies to accomplish the translator’s tasks. Finally, translator training has borrowed many of its approaches from other disciplines, mainly language teaching. Teaching approaches in translator training have developed from teacher-centered to student-centered and then toward the integration of technologies and the Internet, with both internal and external causes playing a crucial role. These borrowed approaches should be comprehensively evaluated in order to see whether they achieve the goals of translator training; such evaluation may lead to new teaching approaches developed specifically for translator training. While considering these methods and designing new approaches, we need to keep an eye on the future needs of the market.

Keywords: turning points, developments, translator training, market, the West

Procedia PDF Downloads 86
6437 Light-Weight Network for Real-Time Pose Estimation

Authors: Jianghao Hu, Hongyu Wang

Abstract:

An effective and efficient human pose estimation algorithm is an important requirement for real-time pose estimation on mobile devices. This paper proposes a light-weight human keypoint detection algorithm, Light-Weight Network for Real-Time Pose Estimation (LWPE). LWPE uses a light-weight backbone network and depthwise separable convolutions to reduce parameters and lower latency. It uses a feature pyramid network (FPN) to fuse the high-resolution, semantically weak features with the low-resolution, semantically strong features. Meanwhile, with multi-scale prediction, the result predicted from a low-resolution feature map is stacked onto the adjacent higher-resolution feature map to provide intermediate supervision of the network and continuously refine the results. In the last step, the keypoint coordinates predicted at the highest resolution are used as the final output of the network. For keypoints that are difficult to predict, LWPE adopts an online hard keypoint mining strategy to focus on them. The proposed algorithm achieves excellent performance on the single-person dataset selected from the AI (artificial intelligence) challenge dataset. The algorithm maintains high-precision performance even though the model contains only 3.9M parameters, and it can run at 225 frames per second (FPS) on a generic graphics processing unit (GPU).
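The parameter saving from depthwise separable convolutions, which motivates the light-weight design above, is easy to quantify. The helper names below are illustrative and biases are ignored:

```python
def conv_params(cin, cout, k):
    """Weight count of a standard k x k convolution mapping cin input
    channels to cout output channels (bias terms ignored)."""
    return cin * cout * k * k

def depthwise_separable_params(cin, cout, k):
    """Weight count of a depthwise k x k convolution (one k x k filter
    per input channel) followed by a 1 x 1 pointwise convolution."""
    return cin * k * k + cin * cout
```

For a 3x3 convolution from 32 to 64 channels, the standard layer needs 18432 weights while the depthwise separable version needs 2336, roughly an 8x reduction, which is how such backbones shrink to a few million parameters.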

Keywords: depthwise separable convolutions, feature pyramid network, human pose estimation, light-weight backbone

Procedia PDF Downloads 122
6436 Diagnosing Depression during Pregnancy-Identifying Risk Factors of Prenatal Depression in Polish Women

Authors: Olga Plaza, Katarzyna Kosinska-Kaczynska, Stepan Feduniw, Dominika Pazdzior, Kinga Zebrowska, Katarzyna Kwiatkowska

Abstract:

Introduction: The main causes of depression among pregnant women remain unclear; it is clear, however, that pregnancy carries a higher risk of depression occurrence. Left untreated, prenatal depression can cause serious maternal and neonatal complications. Aim of the study: The aim of the study was to define potential risk factors of prenatal depression and to assess the frequency of its occurrence among pregnant women. Material and Methods: A prospective cross-sectional study was performed among 346 women. A self-composed questionnaire of 46 questions was distributed via the Internet between November 2017 and March 2018. The questionnaire contained the Edinburgh Postnatal Depression Scale (EPDS), in which a score of 13 or more points (out of 30) suggested possible prenatal depression. Statistical analysis was performed with Pearson's chi-squared test; a p value < 0.05 was considered significant. Results: 37.57% (n=130) of women scored 13 or more points. Women with depressive symptoms (DS) more often reported lack of support from the partner (46.9% vs. 16.2%; p < 0.001) as well as from other family members (40.8% vs. 14.4%; p < 0.001), an unplanned current pregnancy (21.5% vs. 12.5%; p=0.014) and low socio-economic status (10% vs. 0.9%; p < 0.001). Both early and advanced maternal age seemed to play a role in the occurrence of DS: 40.8% of women aged 17-24 declared symptoms (vs. 28.7%; p < 0.01), as did 6.2% of mothers aged ≥37 (vs. 0.5%; p < 0.001). Smoking during pregnancy was also more frequent among patients with DS (31.5% vs. 18.1%; p=0.004). A previous diagnosis of depression or of other mood disorders significantly increased the chance of DS occurrence (17.7% vs. 4.6%; p < 0.001 and 49.2% vs. 25%; p < 0.001, respectively). Parental diagnosis of mood disorders and of other mental disorders was also more frequent in this group of patients (24.6% vs. 15.7%; p=0.026 and 26.4% vs. 9.7%; p < 0.001, respectively). Only 23.8% of women with DS sought help from healthcare professionals, with 21.5% receiving pharmacological treatment. Conclusions: Pregnant women often report DS. Evaluation of risk factors of DS and of possible prenatal depression is essential for proper depression screening among pregnant women.
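The group comparisons above rest on Pearson's chi-squared test. For a 2x2 contingency table (e.g. depressive symptoms yes/no versus a risk factor present/absent) the statistic has a closed form; this is the generic textbook formula without continuity correction, not the authors' analysis code:

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 contingency table
    [[a, b], [c, d]], computed via the shortcut formula
    n * (ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)), no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator
```

The resulting statistic is compared against the chi-squared distribution with one degree of freedom; values above about 3.84 correspond to p < 0.05.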

Keywords: obstetrics, polish women, prenatal care, prenatal depression, risk factors

Procedia PDF Downloads 187
6435 Routing and Energy Efficiency through Data Coupled Clustering in Large Scale Wireless Sensor Networks (WSNs)

Authors: Jainendra Singh, Zaheeruddin

Abstract:

A typical wireless sensor network (WSN) consists of several tiny, low-power sensors which use radio frequency to perform distributed sensing tasks. The longevity of WSNs is a major issue that impacts the application of such networks. While routing protocols strive to save energy by acting on sensor nodes, recent studies show that network lifetime can be enhanced by further involving sink mobility. A common approach to energy efficiency is partitioning the network into clusters with correlated data, where representative nodes simply transmit or average the measurements inside the cluster. In this paper, we propose an energy-efficient homogeneous clustering (EHC) technique, in which each sensor's decision is based on its residual energy and an estimate of how many of its neighboring cluster heads (CHs) would benefit from it being a CH. We also explore the routing algorithm in clustered WSNs. We show that the proposed schemes significantly outperform current approaches in terms of packet delay, hop count and energy consumption of WSNs.

Keywords: wireless sensor network, energy efficiency, clustering, routing

Procedia PDF Downloads 236
6434 Prediction of Sepsis Illness from Patients Vital Signs Using Long Short-Term Memory Network and Dynamic Analysis

Authors: Marcio Freire Cruz, Naoaki Ono, Shigehiko Kanaya, Carlos Arthur Mattos Teixeira Cavalcante

Abstract:

The systems that record patient care information, known as Electronic Medical Records (EMR), and those that monitor patients' vital signs, such as heart rate, body temperature, and blood pressure, have been extremely valuable for effective patient treatment. Much research has used data from EMRs and patients' vital signs to predict illnesses. Among it, we highlight work that intends to predict, classify, or at least identify patterns of sepsis in patients under vital signs monitoring. Sepsis is an organ dysfunction caused by a dysregulated host response to an infection, and it affects millions of people worldwide. Early detection of sepsis is expected to provide a significant improvement in its treatment. Preceding works usually combined medical, statistical, mathematical and computational models to develop early-prediction methods, aiming for higher accuracy with the smallest number of variables. Among other techniques, research using survival analysis, expert systems, machine learning and deep learning has achieved great results. In our research, patients are modeled as points moving each hour in an n-dimensional space, where n is the number of vital signs (variables). These points can reach a sepsis target point after some time; for now, the sepsis target point is calculated as the median of all patients' variables at sepsis onset. From these points, we calculate for each hour the position vector, the first derivative (velocity vector) and the second derivative (acceleration vector) of the variables to evaluate their behavior, and we construct a prediction model based on a Long Short-Term Memory (LSTM) network that includes these derivatives as explanatory variables. The accuracy of prediction 6 hours before the time of sepsis, considering only the vital signs, reached 83.24%; by including the position, velocity, and acceleration vectors, we obtained 94.96%. The data are collected from the Medical Information Mart for Intensive Care (MIMIC) database, a public database that contains vital signs, laboratory test results, observations, notes, and so on, from more than 60,000 patients.
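The position/velocity/acceleration features described above can be sketched with simple finite differences over the hourly vital-sign vectors. Names and data layout are illustrative, and the LSTM itself is omitted:

```python
def trajectory_features(series):
    """For a list of hourly vital-sign vectors (one list of n values per
    hour), return per-hour (position, velocity, acceleration) triples,
    where velocity is the first finite difference between consecutive
    hours and acceleration is the difference of consecutive velocities.
    Features start at the third hour, since two differences are needed."""
    feats = []
    for t in range(2, len(series)):
        pos = series[t]
        vel = [x - y for x, y in zip(series[t], series[t - 1])]
        prev_vel = [x - y for x, y in zip(series[t - 1], series[t - 2])]
        acc = [v - pv for v, pv in zip(vel, prev_vel)]
        feats.append((pos, vel, acc))
    return feats
```

Concatenating each triple gives a 3n-dimensional input per time step, which is the kind of enriched feature vector the abstract feeds to the LSTM.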

Keywords: dynamic analysis, long short-term memory, prediction, sepsis

Procedia PDF Downloads 97
6433 Extraction of Compound Words in Malay Sentences Using Linguistic and Statistical Approaches

Authors: Zamri Abu Bakar, Normaly Kamal Ismail, Mohd Izani Mohamed Rawi

Abstract:

Malay noun compounds are phrases that consist of two or more nouns. The key characteristic of noun compounds lies in their frequent occurrence within text. Extracting these noun compounds is therefore essential for several research domains such as Information Retrieval, Sentiment Analysis and Question Answering. Many research efforts have addressed the extraction of Malay noun compounds using linguistic and statistical approaches. Most existing methods have concentrated on the extraction of bi-gram noun+noun compounds; however, extracting noun+verb, noun+adjective and noun+preposition compounds is challenging due to the difficulty of selecting an appropriate method with effective results. Thus, there is still room for improving the effectiveness of compound word extraction. This study therefore proposes a combination of a linguistic approach and statistical measures in order to enhance the extraction of compound words. Several preprocessing steps are involved, including normalization, tokenization, and stemming. The linguistic approach used in this study is Part-of-Speech (POS) tagging. In addition, a new linguistic pattern for named entities, built from a list of Malay named entities, is utilized to enhance the linguistic approach in terms of noun compound recognition. The proposed statistical measures consist of the NC-value, NTC-value and NLC-value.
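A minimal stand-in for the bi-gram noun+noun extraction step, combining the POS-pattern filter with a simple frequency threshold, might look like the following. The tags, names and threshold are illustrative; the paper's NC-, NTC- and NLC-value measures are more involved than raw counts:

```python
from collections import Counter

def extract_noun_compounds(tagged_tokens, min_freq=2):
    """Collect adjacent noun+noun bi-grams from a list of (word, tag)
    pairs and keep those whose corpus frequency meets a threshold,
    exploiting the frequent-occurrence property of noun compounds."""
    counts = Counter()
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        if t1 == "NOUN" and t2 == "NOUN":
            counts[(w1, w2)] += 1
    return {bigram: c for bigram, c in counts.items() if c >= min_freq}
```

Swapping the `t1 == "NOUN" and t2 == "NOUN"` predicate for noun+verb or noun+adjective patterns extends the same skeleton to the harder cases the abstract mentions.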

Keywords: compound word, noun compound, linguistic approach, statistical approach

Procedia PDF Downloads 316
6432 Bandwidth Efficient Cluster Based Collision Avoidance Multicasting Protocol in VANETs

Authors: Navneet Kaur, Amarpreet Singh

Abstract:

In vehicular ad hoc networks, data dissemination is a challenging task. A number of techniques, types, and protocols are available for disseminating data, but the need to preserve limited bandwidth while disseminating as much data as possible over the network makes the task more challenging. There are broadcasting-, multicasting-, and geocasting-based protocols, of which multicasting-based protocols are found to be best for conserving bandwidth. One such protocol, BEAM, improves the performance of vehicular ad hoc networks by reducing the number of in-network message transactions and thereby efficiently utilizing the bandwidth during an emergency situation. However, this protocol may result in a multi-car chain collision, as it provides no V2V communication. This paper therefore proposes a new protocol, the Enhanced Bandwidth Efficient Cluster Based Multicasting Protocol (EBECM), that overcomes the limitations of the existing BEAM protocol. Simulation results show the improved performance of EBECM in terms of routing overhead, throughput, and packet delivery ratio (PDR) when compared with the BEAM protocol.

Keywords: BEAM, data dissemination, emergency situation, vehicular adhoc network

Procedia PDF Downloads 319
6431 Application of Combined Cluster and Discriminant Analysis to Make the Operation of Monitoring Networks More Economical

Authors: Norbert Magyar, Jozsef Kovacs, Peter Tanos, Balazs Trasy, Tamas Garamhegyi, Istvan Gabor Hatvani

Abstract:

Water is one of the most important common resources, and as a result of urbanization, agriculture, and industry it is becoming more and more exposed to potential pollutants. Preventing the deterioration of water quality is a crucial task for environmental scientists. To achieve this aim, the operation of monitoring networks is necessary. In general, these networks have to meet many important requirements, such as representativeness and cost efficiency. However, existing monitoring networks often include sampling sites which are unnecessary. With the elimination of these sites the monitoring network can be optimized, and it can operate more economically. The aim of this study is to illustrate the applicability of CCDA (Combined Cluster and Discriminant Analysis) to the field of water quality monitoring and to optimize the monitoring networks of a river (the Danube), a wetland-lake system (Kis-Balaton & Lake Balaton), and two surface-subsurface water systems on the watershed of Lake Neusiedl/Lake Fertő and in the Szigetköz area over a period of approximately two decades. CCDA combines two multivariate data analysis methods: hierarchical cluster analysis and linear discriminant analysis. Its goal is to determine homogeneous groups of observations, in our case sampling sites, by comparing the goodness of preconceived classifications obtained from hierarchical cluster analysis with that of random classifications. The main idea behind CCDA is that if the ratio of correctly classified cases for a grouping is higher than at least 95% of the ratios for the random classifications, then at the given level of significance (α = 0.05) the sampling sites do not form a homogeneous group. Since the sampling on Lake Neusiedl/Lake Fertő was conducted at the same time at all sampling sites, it was possible to visualize the differences between sampling sites belonging to the same or different groups on scatterplots.
Based on the results, the monitoring network of the Danube yields redundant information over certain sections, so that of its 12 sampling sites, 3 could be eliminated without loss of information. In the case of the wetland (Kis-Balaton), one pair of sampling sites out of 12, and in the case of Lake Balaton, 5 out of 10 could be discarded. For the groundwater system of the catchment area of Lake Neusiedl/Lake Fertő, all 50 monitoring wells are necessary; there is no redundant information in the system. The number of sampling sites on Lake Neusiedl/Lake Fertő itself can be decreased to approximately half of the original number. Furthermore, neighbouring sampling sites were compared pairwise using CCDA, and the results were plotted on diagrams or isoline maps showing the locations of the greatest differences. These results can help researchers decide where to place new sampling sites. The application of CCDA proved to be a useful tool in the optimization of monitoring networks for different types of water bodies. Based on the results obtained, the monitoring networks can be operated more economically.
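The permutation test at the core of CCDA can be sketched compactly. The following is a minimal, hypothetical illustration, not the authors' implementation: one-dimensional data, a nearest-group-mean classifier standing in for linear discriminant analysis, and invented sample values.

```python
import random
import statistics

def correct_ratio(values, labels):
    """Classify each observation to the nearest group mean and
    return the fraction assigned back to its own group."""
    means = {g: statistics.mean(v for v, l in zip(values, labels) if l == g)
             for g in set(labels)}
    hits = sum(1 for v, l in zip(values, labels)
               if min(means, key=lambda g: abs(v - means[g])) == l)
    return hits / len(values)

def ccda_distinct(values, labels, n_random=199, seed=0):
    """CCDA-style decision: the preconceived grouping is treated as
    distinct (i.e., not homogeneous) if its correctly-classified ratio
    beats at least 95% of random relabelings."""
    rng = random.Random(seed)
    observed = correct_ratio(values, labels)
    shuffled = list(labels)
    beaten = 0
    for _ in range(n_random):
        rng.shuffle(shuffled)
        if observed > correct_ratio(values, shuffled):
            beaten += 1
    return observed, beaten / n_random >= 0.95

# two invented sampling-site groups with clearly different chemistry
values = [1.0 + 0.01 * i for i in range(10)] + [5.0 + 0.01 * i for i in range(10)]
labels = ['A'] * 10 + ['B'] * 10
observed, distinct = ccda_distinct(values, labels)
```

Here `distinct` comes out true because the two groups are well separated; for homogeneous sites the observed ratio would sit inside the random distribution, signalling that the sites carry redundant information.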

Keywords: combined cluster and discriminant analysis, cost efficiency, monitoring network optimization, water quality

Procedia PDF Downloads 322
6430 Clustering-Based Computational Workload Minimization in Ontology Matching

Authors: Mansir Abubakar, Hazlina Hamdan, Norwati Mustapha, Teh Noranis Mohd Aris

Abstract:

In order to build a matching pattern for each class correspondence between ontologies, it is required to specify a set of attribute correspondences across the two corresponding classes by clustering. Clustering reduces the number of potential attribute correspondences considered in the matching activity, which significantly reduces the computational workload; otherwise, all attributes of a class would have to be compared with all attributes of the corresponding class. Most existing ontology matching approaches lack scalable attribute discovery methods, such as cluster-based attribute searching. This problem makes the ontology matching activity computationally expensive. It is therefore vital in ontology matching to design a scalable element or attribute correspondence discovery method that reduces the number of potential element correspondences during mapping, thereby reducing the computational workload of the matching process as a whole. The objectives of this work are (1) to design a clustering method for discovering similar attribute correspondences and relationships between ontologies, and (2) to discover element correspondences by classifying the elements of each class based on their value features using the K-medoids clustering technique. Discovering attribute correspondences is essential for comparing instances when matching two ontologies. During the matching process, any two instances across two different data sets should be compared with respect to their attribute values, so that they can be regarded as the same or not. Intuitively, any two instances that come from classes across which there is a class correspondence are likely to be identical to each other. Moreover, any two instances that hold more similar attribute values are more likely to be matched than ones with less similar attribute values. Most of the time, similar attribute values exist in two instances across which there is an attribute correspondence.
This work presents how to classify the attributes of each class with K-medoids clustering and then map the clustered groups by their statistical value features. We also show how to map the attributes of a clustered group to the attributes of the corresponding clustered group, generating a set of potential attribute correspondences that is then applied to generate a matching pattern. The K-medoids clustering phase largely reduces the number of non-corresponding attribute pairs to be compared, as only attribute pairs whose coverage probability reaches 100% and attributes above the specified threshold are considered as potential attributes for a matching. Using clustering reduces the number of potential element correspondences to be considered during the mapping activity, which in turn reduces the computational workload significantly; otherwise, every element of a class in the source ontology would have to be compared with every element of the corresponding class in the target ontology. K-medoids can ably cluster the attributes of each class, so that a proportion of non-corresponding attribute pairs is not considered when constructing the matching pattern.
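K-medoids itself is straightforward to sketch. Below is a minimal PAM-style swap search in plain Python with invented attribute features; the feature tuples, distance function, and cluster count are illustrative assumptions, not the authors' setup.

```python
import random

def k_medoids(points, k, dist, max_iter=100, seed=0):
    """Plain PAM-style k-medoids: greedily swap a medoid with a
    non-medoid whenever the total within-cluster distance drops."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)

    def cost(meds):
        return sum(min(dist(points[i], points[m]) for m in meds)
                   for i in range(len(points)))

    best = cost(medoids)
    for _ in range(max_iter):
        improved = False
        for mi in range(k):
            for i in range(len(points)):
                if i in medoids:
                    continue
                trial = medoids[:mi] + [i] + medoids[mi + 1:]
                c = cost(trial)
                if c < best:
                    medoids, best, improved = trial, c, True
        if not improved:
            break

    # assign every point (by index) to its nearest medoid
    clusters = {m: [] for m in medoids}
    for i in range(len(points)):
        nearest = min(medoids, key=lambda m: dist(points[i], points[m]))
        clusters[nearest].append(i)
    return clusters

# hypothetical per-attribute statistical features: (mean, variance)
feats = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (5.0, 1.0), (5.2, 0.9)]
euclid = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
clusters = k_medoids(feats, 2, euclid)
```

Attributes in one cluster would then be compared only against the attributes of the matched cluster in the other ontology, rather than against every attribute of the corresponding class.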

Keywords: attribute correspondence, clustering, computational workload, k-medoids clustering, ontology matching

Procedia PDF Downloads 221
6429 Geospatial and Statistical Evidences of Non-Engineered Landfill Leachate Effects on Groundwater Quality in a Highly Urbanised Area of Nigeria

Authors: David A. Olasehinde, Peter I. Olasehinde, Segun M. A. Adelana, Dapo O. Olasehinde

Abstract:

An investigation was carried out on the underground water system dynamics within the Ilorin metropolis to monitor the subsurface flow and its corresponding pollution. Africa's population growth rate is the highest among the regions of the world, especially in urban areas. A corresponding increase in waste generation and a change in waste composition from predominantly organic to non-organic waste have also been observed. Percolation of leachate from non-engineered landfills, the chief means of waste disposal in many of its cities, constitutes a threat to underground water bodies. Ilorin, a transboundary town in southwestern Nigeria, is a ready microcosm of Africa's unique challenge. Although groundwater is naturally protected from common contaminants such as bacteria, since the subsurface provides a natural attenuation process, groundwater samples have nonetheless been noted to possess relatively high levels of dissolved chemical contaminants such as bicarbonate, sodium, and chloride, which pose a great threat to environmental receptors and human consumption. A Geographic Information System (GIS) was used as a tool to illustrate the subsurface dynamics and the corresponding pollutant indicators. Forty-four sampling points were selected around known groundwater pollutants, the major old dumpsites without landfill liners. The results of the groundwater flow directions and the corresponding contaminant transport were presented using expert geospatial software. The experimental results were subjected to four descriptive statistical analyses, namely: principal component analysis, Pearson correlation analysis, scree plot analysis, and Ward cluster analysis.
A regression model was also developed, aimed at finding functional relationships that adequately describe the behaviour of the water quality parameters and the hypothetical landfill-related factors that may influence them, namely: the distance of the water body from the dumpsites, the static water level of the groundwater, the subsurface permeability (inferred from the hydraulic gradient), and the soil infiltration. The regression equations developed were validated using a graphical approach. Underground water appears to flow from the northern portion of the Ilorin metropolis down southwards, transporting contaminants. Pollution in the study area generally assumed a bimodal pattern, with the major concentration of chemical pollutants in the underground watershed and the recharge area. The correlation between contaminant concentrations and the spread of pollution indicates that areas of lower subsurface permeability display a higher concentration of dissolved chemical content. The principal component analysis showed that conductivity, suspended solids, calcium hardness, total dissolved solids, total coliforms, and coliforms were the chief contaminant indicators in the underground water system in the study area. Pearson correlation revealed a high correlation of electrical conductivity with many of the parameters analyzed. In the same vein, the regression models suggest that the heavier the molecular weight of a chemical contaminant from a point source, the greater the pollution of the underground water system at a short distance. The study concludes that the associative properties of the landfills have a significant effect on groundwater quality in the study area.
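A multiple linear regression of the kind described can be sketched with ordinary least squares via the normal equations. The observations below are invented purely for illustration (the predictors mirror two of the factors named above); this is not the authors' model or data.

```python
def lstsq(X, y):
    """Ordinary least squares via the normal equations
    (X'X) b = X'y, solved by Gaussian elimination."""
    n, p = len(X), len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         for i in range(p)]
    b = [sum(X[r][i] * y[r] for r in range(n)) for i in range(p)]
    for i in range(p):                      # forward elimination
        piv = A[i][i]
        for j in range(i + 1, p):
            f = A[j][i] / piv
            A[j] = [a - f * c for a, c in zip(A[j], A[i])]
            b[j] -= f * b[i]
    beta = [0.0] * p                        # back substitution
    for i in reversed(range(p)):
        beta[i] = (b[i] - sum(A[i][j] * beta[j]
                              for j in range(i + 1, p))) / A[i][i]
    return beta

# hypothetical rows: intercept, distance to dumpsite (m),
# static water level (m) -> contaminant concentration (mg/L)
X = [[1, 50, 3], [1, 100, 4], [1, 200, 6], [1, 400, 5], [1, 800, 8]]
y = [9.0, 7.5, 5.0, 3.0, 1.0]
beta = lstsq(X, y)
```

In this toy fit the distance coefficient `beta[1]` comes out negative, matching the intuition that concentrations fall with distance from the dumpsite.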

Keywords: dumpsite, leachate, groundwater pollution, linear regression, principal component

Procedia PDF Downloads 84
6428 Alcoxysilanes Production from Silica and Dimethylcarbonate Promoted by Alkali Bases: A DFT Investigation of the Reaction Mechanism

Authors: Valeria Butera, Norihisa Fukaya, Jun-Chu Choi, Kazuhiko Sato, Yoong-Kee Choe

Abstract:

Several silicon dioxide sources can react with dimethyl carbonate (DMC) in the presence of alkali base catalysts to ultimately produce tetramethoxysilane (TMOS). Experimental findings suggested that the reaction proceeds through several steps in which the first molecule of DMC is converted to dimethylsilyloxide (DMOS) and CO₂. Following the same mechanistic steps, a second molecule of DMC reacts with the DMOS to afford the final product, TMOS. Using a cluster model approach, a quantum-mechanical investigation of the first part of the reaction, leading to DMOS formation, is reported with a twofold purpose: (1) to verify the viability of the reaction mechanism proposed on the basis of experimental evidence, and (2) to compare the behaviour of three different alkali hydroxides MOH, where M = Li, K, and Cs, to determine whether their diverse ionic radii and charge densities can be considered responsible for the observed differences in reactivity. Our findings confirm the observed experimental trend and furnish important information about the effective role of the alkali hydroxides, giving an explanation of the different catalytic activity of the three metal cations.

Keywords: alcoxysilanes production, cluster model approach, DFT, DMC conversion

Procedia PDF Downloads 245
6427 Predictive Analytics for Theory Building

Authors: Ho-Won Jung, Donghun Lee, Hyung-Jin Kim

Abstract:

Predictive analytics (data analysis) uses a subset of measurements (the features, predictors, or independent variables) to predict another measurement (the outcome, target, or dependent variable) for a single person or unit. It applies empirical methods from statistics, operations research, and machine learning to predict future or otherwise unknown events or outcomes for a single person or unit, based on patterns in data. Most analyses of metabolic syndrome are not predictive analytics but statistical explanatory studies that build a proposed model (theory building) and then validate the hypothesized metabolic syndrome predictors (theory testing). A proposed theoretical model is formed from causal hypotheses that specify how and why certain empirical phenomena occur. Predictive analytics and explanatory modeling have their own territories in analysis. However, predictive analytics can perform vital roles in explanatory studies, i.e., in scientific activities such as theory building, theory testing, and relevance assessment. In this context, this study demonstrates how to use predictive analytics to support theory building (i.e., hypothesis generation). For this purpose, the study utilized a big data predictive analytics platform™ based on a co-occurrence graph. The co-occurrence graph is depicted with nodes (e.g., items in a basket) and arcs (direct connections between two nodes), where items in a basket are fully connected. A cluster is a collection of fully connected items, where the specific group of items has co-occurred in several rows of a data set. Clusters can be ranked using importance metrics such as node size (number of items), frequency, and surprise (observed frequency vs. expected), among others. The size of a graph can be represented by the numbers of nodes and arcs. Since the size of a co-occurrence graph does not depend directly on the number of observations (transactions), huge amounts of transactions can be represented and processed efficiently.
For a demonstration, a total of 13,254 metabolic syndrome training observations are plugged into the analytics platform to generate rules (potential hypotheses). Each observation includes 31 predictors associated with, for example, sociodemographic characteristics, habits, and activities. Some are intentionally included to gain predictive analytics insights into variable selection, such as cancer examination, house type, and vaccination. The platform automatically generates plausible hypotheses (rules) without statistical modeling. The rules are then validated with an external testing dataset of 4,090 observations. The results, as a kind of inductive reasoning, show potential hypotheses extracted as a set of association rules. Most statistical models generate just one estimated equation. On the other hand, a set of rules (many estimated equations from a statistical perspective), as in this study, may imply heterogeneity in a population (i.e., different subpopulations with unique features are aggregated). The next step of theory development, i.e., theory testing, statistically tests whether the proposed theoretical model is a plausible explanation of the phenomenon of interest. If the hypotheses generated are tested statistically with several thousand observations, most of the variables will become significant as the p-values approach zero. Thus, theory validation needs statistical methods that utilize only a part of the observations, such as bootstrap resampling with an appropriate sample size.
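The co-occurrence graph and the "surprise" ranking described above can be sketched in a few lines. The basket items below are invented stand-ins for the study's predictors; this is a conceptual illustration, not the platform's actual algorithm.

```python
from itertools import combinations
from collections import Counter

def cooccurrence(baskets):
    """Build a co-occurrence graph: nodes are items, arc weights
    count how often two items appear in the same basket."""
    items, arcs = Counter(), Counter()
    for basket in baskets:
        uniq = sorted(set(basket))
        items.update(uniq)
        arcs.update(combinations(uniq, 2))
    return items, arcs

def surprise(items, arcs, n_baskets):
    """Rank arcs by observed vs. expected co-occurrence, where the
    expectation assumes the two items occur independently."""
    return {(a, b): obs / (items[a] * items[b] / n_baskets)
            for (a, b), obs in arcs.items()}

# hypothetical observations (rows) with co-occurring features
baskets = [
    ['smoker', 'high_bp', 'high_glucose'],
    ['smoker', 'high_bp'],
    ['exercise', 'normal_bp'],
    ['smoker', 'high_glucose'],
]
items, arcs = cooccurrence(baskets)
scores = surprise(items, arcs, len(baskets))
```

Arcs with a surprise score well above 1 co-occur more often than chance predicts and are the kind of candidate rules that could feed hypothesis generation.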

Keywords: explanatory modeling, metabolic syndrome, predictive analytics, theory building

Procedia PDF Downloads 244
6426 Pedagogical Content Knowledge for Nature of Science: In Search for a Meaning for the Construct

Authors: Elaosi Vhurumuku

Abstract:

During the past twenty years, there has been increased interest among science educators in researching and developing teachers' pedagogical content knowledge for teaching the nature of science (PCKNOS). While there has been this surge in interest in the idea of PCKNOS, there has not been a common understanding among NOS researchers as to how exactly the PCKNOS concept should be construed. In this paper, we analyse and evaluate published accredited journal articles on PCKNOS research. We also draw from our teaching experiences. The major points of focus are the researchers' presentations of SMKNOS and their centres of attention regarding the elements of PCKNOS. Our content analysis, cluster analysis, and evaluation of the studies on PCKNOS reveal that most researchers have presented SMKNOS in the form of a heuristic or a set of heuristics (targeted NOS ideas) to be mastered by teachers or learners. Furthermore, we found that most of the researchers' attention has been on developing and recommending teacher pedagogical practices for teaching NOS. From this, we synthesize and propose a subject knowledge content structure and a pedagogical approach that we believe are relevant and appropriate for secondary school and science teacher education if the goal of science education for scientific literacy is to be achieved. The justification of our arguments is rooted in tracing and unpacking the origins and meaning of pedagogical content knowledge (PCK). From our analysis, synthesis, and evaluation, as well as our teaching experiences, we distil and construct a meaning for the PCKNOS construct.

Keywords: pedagogical content knowledge, teaching, nature of science, construct, subject matter knowledge

Procedia PDF Downloads 49
6425 Determining Abnormal Behaviors in UAV Robots for Trajectory Control in Teleoperation

Authors: Kiwon Yeom

Abstract:

Change points are abrupt variations in a data sequence. Detection of change points is useful in modeling, analyzing, and predicting time series in application areas such as robotics and teleoperation. In this paper, a change point is defined to be a discontinuity in one of the derivatives of the data. This paper presents a reliable method for detecting discontinuities within three-dimensional trajectory data. The problem of determining one or more discontinuities is considered for regular and irregular trajectory data from teleoperation. We examine the geometric detection algorithm and illustrate the use of the method on real data examples.
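The idea of a change point as a discontinuity in a derivative can be illustrated with finite differences. The following is a minimal sketch with an invented trajectory and threshold, not the geometric algorithm examined in the paper.

```python
def derivative_change_points(traj, threshold):
    """Flag sample indices where the first derivative (finite
    difference) of a 3-D trajectory jumps by more than `threshold`
    between consecutive samples."""
    def diff(p, q):
        return tuple(b - a for a, b in zip(p, q))

    # first derivative approximated by consecutive differences
    vel = [diff(traj[i], traj[i + 1]) for i in range(len(traj) - 1)]
    points = []
    for i in range(1, len(vel)):
        jump = sum((b - a) ** 2 for a, b in zip(vel[i - 1], vel[i])) ** 0.5
        if jump > threshold:
            points.append(i)
    return points

# straight segment, then an abrupt turn at sample index 3
traj = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (3, 1, 0), (3, 2, 0)]
cps = derivative_change_points(traj, 0.5)
```

For irregular (unevenly sampled) data the differences would be divided by the time step, but the detection principle is the same.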

Keywords: change point, discontinuity, teleoperation, abrupt variation

Procedia PDF Downloads 137
6424 Comparison of Statistical Methods for Estimating Missing Precipitation Data in the River Subbasin Lenguazaque, Colombia

Authors: Miguel Cañon, Darwin Mena, Ivan Cabeza

Abstract:

In this work, the applicability of statistical methods for estimating missing precipitation data in the subbasin of the Lenguazaque River, located in the departments of Cundinamarca and Boyacá, Colombia, was compared and evaluated. The methods used were simple linear regression, the distance-rate method, local averages, mean rates, correlation with nearby stations, and the multiple regression method. The effectiveness of the methods was determined using three statistical tools: the correlation coefficient (r²), the standard error of estimation, and the Bland-Altman test of agreement. The analysis was performed by randomly removing real rainfall values in each of the seasons and then estimating them using the methodologies mentioned above to complete the missing data values. It was thus determined that, under the conditions considered, the methods with the highest performance and accuracy in the estimation of data were the multiple regression method with three nearby stations and a random application scheme supported by the precipitation behaviour of related data sets.
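The simplest of the compared methods, linear regression against a nearby station, can be sketched directly. The gauge values below are invented for illustration; they are not the study's data.

```python
def simple_lr(x, y):
    """Fit y = a + b*x by least squares, e.g. to estimate a gap at
    the target gauge from a nearby reference gauge."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

# overlapping records at reference (x) and target (y) gauges, in mm
ref = [10.0, 25.0, 40.0, 60.0, 80.0]
target = [12.0, 27.0, 44.0, 63.0, 85.0]
a, b = simple_lr(ref, target)
estimate = a + b * 50.0   # fill the target gap when the reference read 50 mm
```

The multiple regression variant favoured by the study extends this to three nearby stations, i.e. three slope coefficients instead of one.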

Keywords: statistical comparison, precipitation data, river subbasin, Bland and Altman

Procedia PDF Downloads 444
6423 GPS Refinement in Cities Using Statistical Approach

Authors: Ashwani Kumar

Abstract:

GPS plays an important role in everyday life for safe and convenient transportation. While pedestrians use handheld devices to know their position in a city, vehicles in intelligent transport systems use relatively sophisticated GPS receivers to estimate their current position. However, in urban areas where the GPS satellites are occluded by tall buildings and trees, and where GPS signals are reflected by nearby vehicles, GPS position estimation becomes poor. In this work, exhaustive GPS data were collected at a single point in an urban area at different times of day and under dynamic environmental conditions. The data were analyzed, and statistical refinement methods were used to obtain an optimal position estimate among all the measured positions. The results obtained were compared with publicly available datasets, and the refined position estimates are promising.
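One common statistical refinement of repeated fixes at a single point is to reject outliers around the median and then take the least-squares (mean) estimate of what remains. The sketch below uses invented coordinates and a simple median-based spread; it is one plausible refinement, not necessarily the method used in the paper.

```python
import statistics

def refine_fixes(fixes, k=2.0):
    """Discard fixes farther than k times the median spread from the
    median point, then average the rest (the least-squares estimate)."""
    lat = statistics.median(p[0] for p in fixes)
    lon = statistics.median(p[1] for p in fixes)

    def d(p):
        return ((p[0] - lat) ** 2 + (p[1] - lon) ** 2) ** 0.5

    spread = statistics.median(d(p) for p in fixes) or 1e-9  # guard zero spread
    kept = [p for p in fixes if d(p) <= k * spread]
    return (sum(p[0] for p in kept) / len(kept),
            sum(p[1] for p in kept) / len(kept))

# repeated fixes at one point; the last one is a multipath outlier
fixes = [(52.5200, 13.4050), (52.5201, 13.4051),
         (52.5199, 13.4049), (52.5600, 13.4500)]
est = refine_fixes(fixes)
```

The multipath-corrupted fix is dropped, so the estimate collapses back onto the tight cluster of genuine fixes.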

Keywords: global positioning system, statistical approach, intelligent transport systems, least squares estimation

Procedia PDF Downloads 260
6422 Approximating Fixed Points by a Two-Step Iterative Algorithm

Authors: Safeer Hussain Khan

Abstract:

In this paper, we introduce a two-step iterative algorithm to prove a strong convergence result for approximating common fixed points of three contractive-like operators. Our algorithm generalizes an existing algorithm. It also contains two famous iterative algorithms: the Mann iterative algorithm and the Ishikawa iterative algorithm. Thus our result generalizes the corresponding results proved for the above three iterative algorithms to a class of more general operators. At the end, we remark that nothing prevents us from extending our result to the case of an iterative algorithm with error terms.
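The two special cases named in the abstract are easy to state concretely. The sketch below runs the classical Mann and Ishikawa iterations on a toy contraction with constant step sequences; the operator and parameters are invented for illustration and do not come from the paper.

```python
def mann(T, x0, alpha, n_steps):
    """Mann iteration: x_{n+1} = (1 - a_n) x_n + a_n T(x_n)."""
    x = x0
    for n in range(n_steps):
        a = alpha(n)
        x = (1 - a) * x + a * T(x)
    return x

def ishikawa(T, x0, alpha, beta, n_steps):
    """Ishikawa (two-step) iteration:
       y_n     = (1 - b_n) x_n + b_n T(x_n)
       x_{n+1} = (1 - a_n) x_n + a_n T(y_n)."""
    x = x0
    for n in range(n_steps):
        a, b = alpha(n), beta(n)
        y = (1 - b) * x + b * T(x)
        x = (1 - a) * x + a * T(y)
    return x

T = lambda x: 0.5 * x + 1          # a contraction with fixed point x* = 2
fp_mann = mann(T, 10.0, lambda n: 0.5, 200)
fp_ishi = ishikawa(T, 10.0, lambda n: 0.5, lambda n: 0.5, 200)
```

Setting every b_n = 0 in the Ishikawa scheme recovers the Mann scheme, which is the sense in which a two-step algorithm "contains" both.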

Keywords: contractive-like operator, iterative algorithm, fixed point, strong convergence

Procedia PDF Downloads 515
6421 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

Usability has become a basic requirement from the consumer's perspective, and a product that fails this requirement ends up not being used. Identifying usability issues by analyzing the quantitative and qualitative data collected from usability testing and evaluation activities aids the process of product design, yet the lack of studies on analysis methodologies for qualitative text data in the usability field limits the potential of these data for more useful applications. Meanwhile, the possibility of analyzing qualitative text data has grown with the rapid development of data analysis fields such as natural language processing, which enables computers to understand human language, and machine learning, which provides predictive models and clustering tools. Therefore, this research aims to study the applicability of text processing algorithms to the analysis of qualitative text data collected from usability activities. This research utilized datasets collected from an LG neckband headset usability experiment, in which the datasets consist of headset survey text data, subject data, and product physical data. The analysis procedure, integrated with the text-processing algorithm, includes training the comments into a vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of the comment vector clustering. The result shows 'volume and music control button' as the usability feature that matches best with the clusters of comment vectors, where the centroid comments of one cluster emphasized button positions, while the centroid comments of the other cluster emphasized button interface issues. When the volume and music control buttons were designed separately, the participants experienced less confusion, and thus the comments mentioned only the buttons' positions.
In contrast, when the volume and music control buttons were designed as a single button, the participants experienced interface issues regarding the buttons, such as unclear operating methods and confusion between the functions' buttons. The relevance of the cluster centroid comments to the extracted feature demonstrates the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.
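The comment-vector clustering described above can be sketched with bag-of-words vectors and cosine similarity. The comments and seed phrases below are invented stand-ins for the survey data; this is a crude illustration of the pipeline, not the study's actual vector-space model.

```python
from collections import Counter
from math import sqrt

def vec(text):
    """Bag-of-words vector for a comment."""
    return Counter(text.lower().split())

def cos(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(comments, seeds):
    """Assign each comment to the nearest seed phrase by cosine
    similarity over bag-of-words vectors."""
    centroids = [vec(s) for s in seeds]
    groups = [[] for _ in seeds]
    for c in comments:
        best = max(range(len(seeds)),
                   key=lambda i: cos(vec(c), centroids[i]))
        groups[best].append(c)
    return groups

comments = [
    "button position is easy to reach",
    "volume button position feels natural",
    "confusing button interface for music function",
    "interface confusion between volume and music functions",
]
groups = cluster(comments, ["button position", "confusing interface"])
```

The two resulting groups mirror the study's finding: one cluster of comments about button positions, another about button interface issues.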

Keywords: usability, qualitative data, text-processing algorithm, natural language processing

Procedia PDF Downloads 250
6420 Closed Urban Block versus Open Housing Estates Structures: Sustainability Surveys in Brno, Czech Republic

Authors: M. Wittmann, G. Kopacik, A. Leitmannova

Abstract:

A prominent place in the spatial arrangement of Czech as well as other post-socialist Central European cities belongs to the 19th-century closed urban blocks and the open concrete-panel housing estates erected during the socialist era in the second half of the 20th century. The characteristics of these two fundamentally diverse types of residential structures have, as we suppose, a different impact on the sustainable development of the urban area. The characteristics of these residential structures may influence the ecological stability of the area, its hygienic qualities, the intensity and manner of its use by various social groups, and also, e.g., real estate prices. These and many other phenomena indicate the environmental, social, and economic sustainability of the urban area. The proposed research methodology assesses specific indicators of sustainability on a scale from 0 to 10 points: 5 points correspond to the general standard in the area, 0 points indicate degradation, and 10 points indicate the highest contribution to sustainable development. The survey results are reflected in an overall sustainability index and in a residents' satisfaction index. The paper analyses the residential structures in the Central European city of Brno, Czech Republic. Case studies of the urban blocks near the city centre and of the Brno-Vinohrady housing estate are compared. The results imply that a considerable positive impact on the sustainable development of the area should be ascribed to the closed urban blocks near the city centre.

Keywords: City of Brno, closed urban block, open housing estate, urban structure

Procedia PDF Downloads 148
6419 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment, and precision medicine. However, this rapid growth in sequence data poses a great challenge, which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from the whole genome sequence data of a given bacterial isolate, and (iv) demonstrate the computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes, and the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists among accuracy, computing resources, and explainability of classification results. Moreover, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which is important especially in explaining complex biological mechanisms.
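The k-mer representation at the heart of this approach is simple to illustrate: a sequence is sliced into overlapping substrings of length k and summarized as a count profile. The toy sequence below is invented; feature vectors like this (over all genomes' k-mers) are what a downstream classifier would consume.

```python
from collections import Counter

def kmer_profile(seq, k):
    """Represent a DNA sequence as a profile of overlapping
    k-mer counts (the sequence yields len(seq) - k + 1 k-mers)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

seq = "ATGCGATGCA"
profile = kmer_profile(seq, 3)
```

Larger k gives sparser, more specific profiles, which is consistent with the study's observation that discrimination sharpens as the k-mer size grows while computing cost rises.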

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 133
6418 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment, and precision medicine. However, this rapid growth in sequence data poses a great challenge, which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from the whole genome sequence data of a given bacterial isolate, and (iv) demonstrate the computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes, and the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists among accuracy, computing resources, and explainability of classification results. Moreover, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which is important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 123