Search results for: utility mining.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 768

Search results for: utility mining.

48 A Growing Natural Gas Approach for Evaluating Quality of Software Modules

Authors: Parvinder S. Sandhu, Sandeep Khimta, Kiranpreet Kaur

Abstract:

The prediction of Software quality during development life cycle of software project helps the development organization to make efficient use of available resource to produce the product of highest quality. “Whether a module is faulty or not" approach can be used to predict quality of a software module. There are numbers of software quality prediction models described in the literature based upon genetic algorithms, artificial neural network and other data mining algorithms. One of the promising aspects for quality prediction is based on clustering techniques. Most quality prediction models that are based on clustering techniques make use of K-means, Mixture-of-Guassians, Self-Organizing Map, Neural Gas and fuzzy K-means algorithm for prediction. In all these techniques a predefined structure is required that is number of neurons or clusters should be known before we start clustering process. But in case of Growing Neural Gas there is no need of predetermining the quantity of neurons and the topology of the structure to be used and it starts with a minimal neurons structure that is incremented during training until it reaches a maximum number user defined limits for clusters. Hence, in this work we have used Growing Neural Gas as underlying cluster algorithm that produces the initial set of labeled cluster from training data set and thereafter this set of clusters is used to predict the quality of test data set of software modules. The best testing results shows 80% accuracy in evaluating the quality of software modules. Hence, the proposed technique can be used by programmers in evaluating the quality of modules during software development.

Keywords: Growing Neural Gas, data clustering, fault prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1854
47 Aircraft Selection Using Multiple Criteria Decision Making Analysis Method with Different Data Normalization Techniques

Authors: C. Ardil

Abstract:

This paper presents an original application of multiple criteria decision making analysis theory to the evaluation of aircraft selection problem. The selection of an optimal, efficient and reliable fleet, network and operations planning policy is one of the most important factors in aircraft selection problem. Given that decision making in aircraft selection involves the consideration of a number of opposite criteria and possible solutions, such a selection can be considered as a multiple criteria decision making analysis problem. This study presents a new integrated approach to decision making by considering the multiple criteria utility theory and the maximal regret minimization theory methods as well as aircraft technical, economical, and environmental aspects. Multiple criteria decision making analysis method uses different normalization techniques to allow criteria to be aggregated with qualitative and quantitative data of the decision problem. Therefore, selecting a suitable normalization technique for the model is also a challenge to provide data aggregation for the aircraft selection problem. To compare the impact of different normalization techniques on the decision problem, the vector, linear (sum), linear (max), and linear (max-min) data normalization techniques were identified to evaluate aircraft selection problem. As a logical implication of the proposed approach, it enhances the decision making process through enabling the decision maker to: (i) use higher level knowledge regarding the selection of criteria weights and the proposed technique, (ii) estimate the ranking of an alternative, under different data normalization techniques and integrated criteria weights after a posteriori analysis of the final rankings of alternatives. A set of commercial passenger aircraft were considered in order to illustrate the proposed approach. The obtained results of the proposed approach were compared using Spearman's rho tests. An analysis of the final rank stability with respect to the changes in criteria weights was also performed so as to assess the sensitivity of the alternative rankings obtained by the application of different data normalization techniques and the proposed approach.

Keywords: Normalization Techniques, Aircraft Selection, Multiple Criteria Decision Making, Multiple Criteria Decision Making Analysis, MCDMA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 560
46 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores

Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay

Abstract:

Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising of a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies  the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function following the strategy of online-hard-negative-mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.

Keywords: Retail stores, Faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 563
45 SUPAR: System for User-Centric Profiling of Association Rules in Streaming Data

Authors: Sarabjeet Kaur Kochhar

Abstract:

With a surge of stream processing applications novel techniques are required for generation and analysis of association rules in streams. The traditional rule mining solutions cannot handle streams because they generally require multiple passes over the data and do not guarantee the results in a predictable, small time. Though researchers have been proposing algorithms for generation of rules from streams, there has not been much focus on their analysis. We propose Association rule profiling, a user centric process for analyzing association rules and attaching suitable profiles to them depending on their changing frequency behavior over a previous snapshot of time in a data stream. Association rule profiles provide insights into the changing nature of associations and can be used to characterize the associations. We discuss importance of characteristics such as predictability of linkages present in the data and propose metric to quantify it. We also show how association rule profiles can aid in generation of user specific, more understandable and actionable rules. The framework is implemented as SUPAR: System for Usercentric Profiling of Association Rules in streaming data. The proposed system offers following capabilities: i) Continuous monitoring of frequency of streaming item-sets and detection of significant changes therein for association rule profiling. ii) Computation of metrics for quantifying predictability of associations present in the data. iii) User-centric control of the characterization process: user can control the framework through a) constraint specification and b) non-interesting rule elimination.

Keywords: Data Streams, User subjectivity, Change detection, Association rule profiles, Predictability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1449
44 Upgraded Rough Clustering and Outlier Detection Method on Yeast Dataset by Entropy Rough K-Means Method

Authors: P. Ashok, G. M. Kadhar Nawaz

Abstract:

Rough set theory is used to handle uncertainty and incomplete information by applying two accurate sets, Lower approximation and Upper approximation. In this paper, the rough clustering algorithms are improved by adopting the Similarity, Dissimilarity–Similarity and Entropy based initial centroids selection method on three different clustering algorithms namely Entropy based Rough K-Means (ERKM), Similarity based Rough K-Means (SRKM) and Dissimilarity-Similarity based Rough K-Means (DSRKM) were developed and executed by yeast dataset. The rough clustering algorithms are validated by cluster validity indexes namely Rand and Adjusted Rand indexes. An experimental result shows that the ERKM clustering algorithm perform effectively and delivers better results than other clustering methods. Outlier detection is an important task in data mining and very much different from the rest of the objects in the clusters. Entropy based Rough Outlier Factor (EROF) method is seemly to detect outlier effectively for yeast dataset. In rough K-Means method, by tuning the epsilon (ᶓ) value from 0.8 to 1.08 can detect outliers on boundary region and the RKM algorithm delivers better results, when choosing the value of epsilon (ᶓ) in the specified range. An experimental result shows that the EROF method on clustering algorithm performed very well and suitable for detecting outlier effectively for all datasets. Further, experimental readings show that the ERKM clustering method outperformed the other methods.

Keywords: Clustering, Entropy, Outlier, Rough K-Means, validity index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1401
43 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: Artificial neural networks, breast cancer, cancer dataset, classifiers, cervical cancer, F-score, logistic regression, machine learning, precision, recall, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1539
42 Interactive Garments: Flexible Technologies for Textile Integration

Authors: Anupam Bhatia

Abstract:

Upon reviewing the literature and the pragmatic work done in the field of E- textiles, it is observed that the applications of wearable technologies have found a steady growth in the field of military, medical, industrial, sports; whereas fashion is at a loss to know how to treat this technology and bring it to market. The purpose of this paper is to understand the practical issues of integration of electronics in garments; cutting patterns for mass production, maintaining the basic properties of textiles and daily maintenance of garments that hinder the wide adoption of interactive fabric technology within Fashion and leisure wear. To understand the practical hindrances an experimental and laboratory approach is taken. “Techno Meets Fashion” has been an interactive fashion project where sensor technologies have been embedded with textiles that result in set of ensembles that are light emitting garments, sound sensing garments, proximity garments, shape memory garments etc. Smart textiles, especially in the form of textile interfaces, are drastically underused in fashion and other lifestyle product design. Clothing and some other textile products must be washable, which subjects to the interactive elements to water and chemical immersion, physical stress, and extreme temperature. The current state of the art tends to be too fragile for this treatment. The process for mass producing traditional textiles becomes difficult in interactive textiles. As cutting patterns from larger rolls of cloth and sewing them together to make garments breaks and reforms electronic connections in an uncontrolled manner. Because of this, interactive fabric elements are integrated by hand into textiles produced by standard methods. The Arduino has surely made embedding electronics into textiles much easier than before; even then electronics are not integral to the daily wear garments. Soft and flexible interfaces of MEMS (micro sensors and Micro actuators) can be an option to make this possible by blending electronics within E-textiles in a way that’s seamless and still retains functions of the circuits as well as the garment. Smart clothes, which offer simultaneously a challenging design and utility value, can be only mass produced if the demands of the body are taken care of i.e. protection, anthropometry, ergonomics of human movement, thermo- physiological regulation.

Keywords: Ambient Intelligence, Proximity Sensors, Shape Memory Materials, Sound sensing garments, Wearable Technology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3261
41 Using Data Mining in Automotive Safety

Authors: Carine Cridelich, Pablo Juesas Cano, Emmanuel Ramasso, Noureddine Zerhouni, Bernd Weiler

Abstract:

Safety is one of the most important considerations when buying a new car. While active safety aims at avoiding accidents, passive safety systems such as airbags and seat belts protect the occupant in case of an accident. In addition to legal regulations, organizations like Euro NCAP provide consumers with an independent assessment of the safety performance of cars and drive the development of safety systems in automobile industry. Those ratings are mainly based on injury assessment reference values derived from physical parameters measured in dummies during a car crash test. The components and sub-systems of a safety system are designed to achieve the required restraint performance. Sled tests and other types of tests are then carried out by car makers and their suppliers to confirm the protection level of the safety system. A Knowledge Discovery in Databases (KDD) process is proposed in order to minimize the number of tests. The KDD process is based on the data emerging from sled tests according to Euro NCAP specifications. About 30 parameters of the passive safety systems from different data sources (crash data, dummy protocol) are first analysed together with experts opinions. A procedure is proposed to manage missing data and validated on real data sets. Finally, a procedure is developed to estimate a set of rough initial parameters of the passive system before testing aiming at reducing the number of tests.

Keywords: KDD process, passive safety systems, sled test, dummy injury assessment reference values, frontal impact

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2838
40 Analyzing Factors Impacting COVID-19 Vaccination Rates

Authors: Dongseok Cho, Mitchell Driedger, Sera Han, Noman Khan, Mohammed Elmorsy, Mohamad El-Hajj

Abstract:

Since the approval of the COVID-19 vaccine in late 2020, vaccination rates have varied around the globe. Access to a vaccine supply, mandated vaccination policy, and vaccine hesitancy contribute to these rates. This study used COVID-19 vaccination data from Our World in Data and the Multilateral Leaders Task Force on COVID-19 to create two COVID-19 vaccination indices. The first index is the Vaccine Utilization Index (VUI), which measures how effectively each country has utilized its vaccine supply to doubly vaccinate its population. The second index is the Vaccination Acceleration Index (VAI), which evaluates how efficiently each country vaccinated their populations within their first 150 days. Pearson correlations were created between these indices and country indicators obtained from the World Bank. Results of these correlations identify countries with stronger Health indicators such as lower mortality rates, lower age-dependency ratios, and higher rates of immunization to other diseases display higher VUI and VAI scores than countries with lesser values. VAI scores are also positively correlated to Governance and Economic indicators, such as regulatory quality, control of corruption, and GDP per capita. As represented by the VUI, proper utilization of the COVID-19 vaccine supply by country is observed in countries that display excellence in health practices. A country’s motivation to accelerate its vaccination rates within the first 150 days of vaccinating, as represented by the VAI, was largely a product of the governing body’s effectiveness and economic status, as well as overall excellence in health practises.

Keywords: Data mining, Pearson Correlation, COVID-19, vaccination rates, hesitancy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 324
39 Comparative Study Using Weka for Red Blood Cells Classification

Authors: Jameela Ali Alkrimi, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBC) are the most common types of blood cells and are the most intensively studied in cell biology. The lack of RBCs is a condition in which the amount of hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs will affect the exchange of oxygen. This paper presents a comparative study for various techniques for classifying the RBCs as normal or abnormal (anemic) using WEKA. WEKA is an open source consists of different machine learning algorithms for data mining applications. The algorithms tested are Radial Basis Function neural network, Support vector machine, and K-Nearest Neighbors algorithm. Two sets of combined features were utilized for classification of blood cells images. The first set, exclusively consist of geometrical features, was used to identify whether the tested blood cell has a spherical shape or non-spherical cells. While the second set, consist mainly of textural features was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBCs image dataset which were obtained from Serdang Hospital - Malaysia, and measuring the accuracy of test results. The best achieved classification rates are 97%, 98%, and 79% for Support vector machines, Radial Basis Function neural network, and K-Nearest Neighbors algorithm respectively.

Keywords: K-Nearest Neighbors, Neural Network, Radial Basis Function, Red blood cells, Support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2981
38 An Architectural Study on the Railway Station Buildings in Malaysia during British Era, 1885-1957

Authors: Nor Hafizah Anuar, M. Gul Akdeniz

Abstract:

This paper attempted on emphasize on the station buildings façade elements. Station buildings were essential part of the transportation that reflected the technology. Comparative analysis on architectural styles will also be made between the railway station buildings of Malaysia and any railway station buildings which have similarities. The Malay Peninsula which is strategically situated between the Straits of Malacca and the South China Sea makes it an ideal location for trade. Malacca became an important trading port whereby merchants from around the world stopover to exchange various products. The Portuguese ruled Malacca for 130 years (1511–1641) and for the next century and a half (1641–1824), the Dutch endeavoured to maintain an economic monopoly along the coasts of Malaya. Malacca came permanently under British rule under the Anglo-Dutch Treaty, 1824. Up to Malaysian independence in 1957, Malaya saw a great influx of Chinese and Indian migrants as workers to support its growing industrial needs facilitated by the British. The growing tin ore mining and rubber industry resulted as the reason of the development of the railways as urgency to transport it from one place to another. The existence of railway transportation becomes more significant when the city started to bloom and the British started to build grandeur buildings that have different functions; administrative buildings, town and city halls, railway stations, public works department, courts, and post offices.

Keywords: Malaysia, railway station, architectural design, façade elements.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1641
37 Biomechanical Modeling, Simulation, and Comparison of Human Arm Motion to Mitigate Astronaut Task during Extra Vehicular Activity

Authors: B. Vadiraj, S. N. Omkar, B. Kapil Bharadwaj, Yash Vardhan Gupta

Abstract:

During manned exploration of space, missions will require astronaut crewmembers to perform Extra Vehicular Activities (EVAs) for a variety of tasks. These EVAs take place after long periods of operations in space, and in and around unique vehicles, space structures and systems. Considering the remoteness and time spans in which these vehicles will operate, EVA system operations should utilize common worksites, tools and procedures as much as possible to increase the efficiency of training and proficiency in operations. All of the preparations need to be carried out based on studies of astronaut motions. Until now, development and training activities associated with the planned EVAs in Russian and U.S. space programs have relied almost exclusively on physical simulators. These experimental tests are expensive and time consuming. During the past few years a strong increase has been observed in the use of computer simulations due to the fast developments in computer hardware and simulation software. Based on this idea, an effort to develop a computational simulation system to model human dynamic motion for EVA is initiated. This study focuses on the simulation of an astronaut moving the orbital replaceable units into the worksites or removing them from the worksites. Our physics-based methodology helps fill the gap in quantitative analysis of astronaut EVA by providing a multisegment human arm model. Simulation work described in the study improves on the realism of previous efforts, incorporating joint stops to account for the physiological limits of range of motion. To demonstrate the utility of this approach human arm model is simulated virtually using ADAMS/LifeMOD® software. Kinematic mechanism for the astronaut’s task is studied from joint angles and torques. Simulation results obtained is validated with numerical simulation based on the principles of Newton-Euler method. Torques determined using mathematical model are compared among the subjects to know the grace and consistency of the task performed. We conclude that due to uncertain nature of exploration-class EVA, a virtual model developed using multibody dynamics approach offers significant advantages over traditional human modeling approaches.

Keywords: Extra vehicular activity, biomechanics, inverse kinematics, human body modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2822
36 Co-Disposal of Coal Ash with Mine Tailings in Surface Paste Disposal Practices: A Gold Mining Case Study

Authors: M. L. Dinis, M. C. Vila, A. Fiúza, A. Futuro, C. Nunes

Abstract:

The present paper describes the study of paste tailings prepared in laboratory using gold tailings, produced in a Finnish gold mine with the incorporation of coal ash. Natural leaching tests were conducted with the original materials (tailings, fly and bottom ashes) and also with paste mixtures that were prepared with different percentages of tailings and ashes. After leaching, the solid wastes were physically and chemically characterized and the results were compared to those selected as blank – the unleached samples. The tailings and the coal ash, as well as the prepared mixtures, were characterized, in addition to the textural parameters, by the following measurements: grain size distribution, chemical composition and pH. Mixtures were also tested in order to characterize their mechanical behavior by measuring the flexural strength, the compressive strength and the consistency. The original tailing samples presented an alkaline pH because during their processing they were previously submitted to pressure oxidation with destruction of the sulfides. Therefore, it was not possible to ascertain the effect of the coal ashes in the acid mine drainage. However, it was possible to verify that the paste reactivity was affected mostly by the bottom ash and that the tailings blended with bottom ash present lower mechanical strength than when blended with a combination of fly and bottom ash. Surface paste disposal offer an attractive alternative to traditional methods in addition to the environmental benefits of incorporating large-volume wastes (e.g. bottom ash). However, a comprehensive characterization of the paste mixtures is crucial to optimize paste design in order to enhance engineer and environmental properties.

Keywords: Coal ash, gold tailings, paste, surface disposal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1435
35 An Index for the Differential Diagnosis of Morbid Obese Children with and without Metabolic Syndrome

Authors: Mustafa M. Donma, Orkide Donma

Abstract:

Metabolic syndrome (MetS) is a severe health problem caused by morbid obesity, the severest form of obesity. The components of MetS are rather stable in adults. However, the diagnosis of MetS in morbid obese (MO) children still constitutes a matter of discussion. The aim of this study was to develop a formula, which facilitated the diagnosis of MetS in MO children and was capable of discriminating MO children with and without MetS findings. The study population comprised MO children. Age and sex-dependent body mass index (BMI) percentiles of the children were above 99. Increased blood pressure, elevated fasting blood glucose (FBG), elevated triglycerides (TRG) and/or decreased high density lipoprotein cholesterol (HDL-C) in addition to central obesity were listed as MetS components for each child. Two groups were constituted. In the first group, there were 42 MO children without MetS components. Second group was composed of 44 MO children with at least two MetS components. Anthropometric measurements including weight, height, waist and hip circumferences were performed during physical examination. BMI and homeostatic model assessment of insulin resistance (HOMA-IR) values were calculated. Informed consent forms were obtained from the parents of the children. Institutional Non-Interventional Clinical Studies Ethics Committee approved the study design. Routine biochemical analyses including FBG, insulin (INS), TRG, HDL-C were performed. The performance and the clinical utility of Diagnostic Obesity Notation Model Assessment Metabolic Syndrome Index (DONMA MetS index) [(INS/FBG)/(HDL-C/TRG)*100] was tested. Appropriate statistical tests were applied to the study data. p value smaller than 0.05 was defined as significant. MetS index values were 41.6 ± 5.1 in MO group and 104.4 ± 12.8 in MetS group. Corresponding values for HDL-C values were 54.5 ± 13.2 mg/dl and 44.2 ± 11.5 mg/dl. There was a statistically significant difference between the groups (p < 0.001). Upon evaluation of the correlations between MetS index and HDL-C values, a much stronger negative correlation was found in MetS group (r = -0.515; p = 0.001) in comparison with the correlation detected in MO group (r = -0.371; p = 0.016). From these findings, it was concluded that the statistical significance degree of the difference between MO and MetS groups was highly acceptable for this recently introduced MetS index. This was due to the involvement of all of the biochemically defined MetS components into the index. This is particularly important because each of these four parameters used in the formula is a cardiac risk factor. Aside from discriminating MO children with and without MetS findings, MetS index introduced in this study is important from the cardiovascular risk point of view in MetS group of children.

Keywords: Fasting blood glucose, high density lipoprotein cholesterol, insulin, metabolic syndrome, morbid obesity, triglycerides.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 229
34 Discovery of Quantified Hierarchical Production Rules from Large Set of Discovered Rules

Authors: Tamanna Siddiqui, M. Afshar Alam

Abstract:

Automated discovery of Rule is, due to its applicability, one of the most fundamental and important method in KDD. It has been an active research area in the recent past. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of details, and to focus our attention on the interesting aspects only. One of such efficient and easy to understand systems is Hierarchical Production rule (HPRs) system. A HPR, a standard production rule augmented with generality and specificity information, is of the following form: Decision If < condition> Generality Specificity . HPRs systems are capable of handling taxonomical structures inherent in the knowledge about the real world. This paper focuses on the issue of mining Quantified rules with crisp hierarchical structure using Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses Quantified production rules as initial individuals of GP and discovers hierarchical structure. In proposed approach rules are quantified by using Dempster Shafer theory. Suitable genetic operators are proposed for the suggested encoding. Based on the Subsumption Matrix(SM), an appropriate fitness function is suggested. Finally, Quantified Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy, using Dempster Shafer theory. Experimental results are presented to demonstrate the performance of the proposed algorithm.

Keywords: Knowledge discovery in database, quantification, dempster shafer theory, genetic programming, hierarchy, subsumption matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1516
33 Rare Earth Elements in Soils of Jharia Coal Field

Authors: R. E. Masto, L. C. Ram, S. K. Verma, V. A. Selvi, J. George, R. C. Tripathi, N. K. Srivastava, D. Mohanty, S. K.Jha, A. K. Sinha, A. Sinha

Abstract:

There are many sources trough which the soil get enriched and contaminated with REEs. The determination of REEs in environmental samples has been limited because of the lack of sensitive analytical techniques. Soil samples were collected from four sites including open cast coal mine, natural coal burning, coal washery and control in the coal field located in Dhanbad, India. Total concentrations of rare earth elements (REEs) were determined using the inductively coupled plasma atomic absorption spectrometry in order to assess enrichment status in the coal field. Results showed that the mean concentrations of La, Pr, Eu, Tb, Ho, and Tm in open cast mine and natural coal burning sites were elevated compared to the reference concentrations, while Ce, Nd, Sm, and Gd were elevated in coal washery site. When compared to reference soil, heavy REEs (HREEs) were enriched in open cast mines and natural coal burning affected soils, however, the HREEs were depleted in the coal washery sites. But, the Chondrite-normalization diagram showed significant enrichment for light REEs (LREEs) in all the soils. High concentration of Pr, Eu, Tb, Ho, Tm, and Lu in coal mining and coal burning sites may pose human health risks. Factor analysis showed that distribution and relative abundance of REEs of the coal washery site is comparable with the control. Eventually washing or cleaning of coal could significantly decrease the emission of REEs from coal into the environment.

Keywords: Rare earth elements, coal, soil, factor analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2817
32 Exploring Influence Range of Tainan City Using Electronic Toll Collection Big Data

Authors: Chen Chou, Feng-Tyan Lin

Abstract:

Big Data has been attracted a lot of attentions in many fields for analyzing research issues based on a large number of maternal data. Electronic Toll Collection (ETC) is one of Intelligent Transportation System (ITS) applications in Taiwan, used to record starting point, end point, distance and travel time of vehicle on the national freeway. This study, taking advantage of ETC big data, combined with urban planning theory, attempts to explore various phenomena of inter-city transportation activities. ETC, one of government's open data, is numerous, complete and quick-update. One may recall that living area has been delimited with location, population, area and subjective consciousness. However, these factors cannot appropriately reflect what people’s movement path is in daily life. In this study, the concept of "Living Area" is replaced by "Influence Range" to show dynamic and variation with time and purposes of activities. This study uses data mining with Python and Excel, and visualizes the number of trips with GIS to explore influence range of Tainan city and the purpose of trips, and discuss living area delimited in current. It dialogues between the concepts of "Central Place Theory" and "Living Area", presents the new point of view, integrates the application of big data, urban planning and transportation. The finding will be valuable for resource allocation and land apportionment of spatial planning.

Keywords: Big Data, ITS, influence range, living area, central place theory, visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 970
31 Privacy Concerns and Law Enforcement Data Collection to Tackle Domestic and Sexual Violence

Authors: Francesca Radice

Abstract:

It has been observed that violent or coercive behaviour has been apparent from initial conversations on dating apps like Tinder. Child pornography, stalking, and coercive control are some criminal offences from dating apps, including women murdered after finding partners through Tinder. Police databases and predictive policing are novel approaches taken to prevent crime before harm is done. This research will investigate how police databases can be used in a privacy-preserving way to characterise users in terms of their potential for violent crime. Using the COPS database of NSW Police, we will explore how the past criminal record can be interpreted to yield a category of potential danger for each dating app user. It is up to the judgement of each subscriber on what degree of the potential danger they are prepared to enter into. Sentiment analysis is an area where research into natural language processing has made great progress over the last decade. This research will investigate how sentiment analysis can be used to interpret interchanges between dating app users to detect manipulative or coercive sentiments. These can be used to alert law enforcement if continued for a defined number of communications. One of the potential problems of this approach is the potential prejudice a categorisation can cause. Another drawback is the possibility of misinterpreting communications and involving law enforcement without reason. The approach will be thoroughly tested with cross-checks by human readers who verify both the level of danger predicted by the interpretation of the criminal record and the sentiment detected from personal messages. Even if only a few violent crimes can be prevented, the approach will have a tangible value for real people.

Keywords: Sentiment Analysis, data mining, predictive policing, virtual manipulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 203
30 Evaluation of the Weight-Based and Fat-Based Indices in Relation to Basal Metabolic Rate-to-Weight Ratio

Authors: Orkide Donma, Mustafa M. Donma

Abstract:

Basal metabolic rate is questioned as a risk factor for weight gain. The relations between basal metabolic rate and body composition have not been cleared yet. The impact of fat mass on basal metabolic rate is also uncertain. Within this context, indices based upon total body mass as well as total body fat mass are available. In this study, the aim is to investigate the potential clinical utility of these indices in the adult population. 287 individuals, aged from 18 to 79 years, were included into the scope of the study. Based upon body mass index values, 10 underweight, 88 normal, 88 overweight, 81 obese, and 20 morbid obese individuals participated. Anthropometric measurements including height (m), and weight (kg) were performed. Body mass index, diagnostic obesity notation model assessment index I, diagnostic obesity notation model assessment index II, basal metabolic rate-to-weight ratio were calculated. Total body fat mass (kg), fat percent (%), basal metabolic rate, metabolic age, visceral adiposity, fat mass of upper as well as lower extremities and trunk, obesity degree were measured by TANITA body composition monitor using bioelectrical impedance analysis technology. Statistical evaluations were performed by statistical package (SPSS) for Windows Version 16.0. Scatterplots of individual measurements for the parameters concerning correlations were drawn. Linear regression lines were displayed. The statistical significance degree was accepted as p < 0.05. The strong correlations between body mass index and diagnostic obesity notation model assessment index I as well as diagnostic obesity notation model assessment index II were obtained (p < 0.001). A much stronger correlation was detected between basal metabolic rate and diagnostic obesity notation model assessment index I in comparison with that calculated for basal metabolic rate and body mass index (p < 0.001). Upon consideration of the associations between basal metabolic rate-to-weight ratio and these three indices, the best association was observed between basal metabolic rate-to-weight and diagnostic obesity notation model assessment index II. In a similar manner, this index was highly correlated with fat percent (p < 0.001). Being independent of the indices, a strong correlation was found between fat percent and basal metabolic rate-to-weight ratio (p < 0.001). Visceral adiposity was much strongly correlated with metabolic age when compared to that with chronological age (p < 0.001). In conclusion, all three indices were associated with metabolic age, but not with chronological age. Diagnostic obesity notation model assessment index II values were highly correlated with body mass index values throughout all ranges starting with underweight going towards morbid obesity. This index is the best in terms of its association with basal metabolic rate-to-weight ratio, which can be interpreted as basal metabolic rate unit.

Keywords: Basal metabolic rate, body mass index, children, diagnostic obesity notation model assessment index, obesity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1038
29 Evaluation of Ensemble Classifiers for Intrusion Detection

Authors: M. Govindarajan

Abstract:

One of the major developments in machine learning in the past decade is the ensemble method, which finds highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed with homogeneous ensemble classifier using bagging and heterogeneous ensemble classifier using arcing and their performances are analyzed in terms of accuracy. A Classifier ensemble is designed using Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The feasibility and the benefits of the proposed approaches are demonstrated by the means of standard datasets of intrusion detection. The main originality of the proposed approach is based on three main parts: preprocessing phase, classification phase, and combining phase. A wide range of comparative experiments is conducted for standard datasets of intrusion detection. The performance of the proposed homogeneous and heterogeneous ensemble classifiers are compared to the performance of other standard homogeneous and heterogeneous ensemble methods. The standard homogeneous ensemble methods include Error correcting output codes, Dagging and heterogeneous ensemble methods include majority voting, stacking. The proposed ensemble methods provide significant improvement of accuracy compared to individual classifiers and the proposed bagged RBF and SVM performs significantly better than ECOC and Dagging and the proposed hybrid RBF-SVM performs significantly better than voting and stacking. Also heterogeneous models exhibit better results than homogeneous models for standard datasets of intrusion detection. 

Keywords: Data mining, ensemble, radial basis function, support vector machine, accuracy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1692
28 The Relationship between Anthropometric Obesity Indices and Insulin in Children with Metabolic Syndrome

Authors: Mustafa M. Donma, Orkide Donma

Abstract:

The number of indices developed for the evaluation of obesity and metabolic syndrome (MetS) both in adults and pediatric population is ever increasing. These indices can be weight-dependent or weight–independent. Some are extremely sophisticated equations and their clinical utility is questionable in routine clinical practice. The aim of this study was to compare presently available obesity indices and find the most practical one. Their associations with MetS components were also investigated to determine their capacities in differential diagnosis of morbid obesity with and without MetS. Children with normal body mass index (N-BMI) and morbid obesity were recruited for this study. Three groups were constituted. Age- and sex-dependent BMI percentiles for morbid obese (MO) children were above 99 according to World Health Organization tables. Of them, those with MetS findings were evaluated as MetS group. Children, whose values were between 85 and 15, were included in N-BMI group. The study protocol was approved by the Ethics Committee of Tekirdag Namik Kemal University, Faculty of Medicine. Parents filled out informed consent forms to participate in the study. Anthropometric measurements and blood pressure values were recorded. BMI, hip index (HI), conicity index (CI), triponderal mass index (TPMI), body adiposity index (BAI), body shape index (BSI), body roundness index (BRI), abdominal volume index (AVI), waist-to-hip ratio (WHR) and [waist circumference (WC) + hip circumference (HC)]/2 were the formulas examined in this study. Routine biochemical tests including fasting blood glucose (FBG), insulin (INS), blood lipids were performed. Statistical program SPSS was used for the evaluation of study data; p < 0.05 was accepted as the statistical significance degree. HI did not differ among the groups. A statistically significant difference was noted between N-BMI and MetS groups in terms of ABSI. All the other indices were capable of making discrimination between N-BMI-MO, N-BMI- MetS and MO-MetS groups. No correlation was found between FBG and any obesity indices in any groups. The same was true for INS in N-BMI group. Insulin was correlated with BAI, TPMI, CI, BRI, AVI and (WC+HC)/2 in MO group without MetS findings. In the MetS group, the only index, which was correlated with INS, was (WC+HC)/2. These findings have pointed out that complicated formulas may not be required for the evaluation of the alterations among N-BMI and various obesity groups including MetS. The simple easily computable weight-independent index, (WC+HC)/2, was unique, because it was the only index, which exhibits a valuable association with INS in MetS group. It did not exhibit any correlation with other obesity indices showing associations with INS in MO group. It was concluded that (WC+HC)/2 was pretty valuable practicable index for the discrimination of MO children with and without MetS findings.

Keywords: Fasting blood glucose, insulin, metabolic syndrome, obesity indices.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 267
27 Feasibility Study of Mine Tailing’s Treatment by Acidithiobacillus thiooxidans DSM 26636

Authors: M. Gómez-Ramírez, A. Rivas-Castillo, I. Rodríguez-Pozos, R. A. Avalos-Zuñiga, N. G. Rojas-Avelizapa

Abstract:

Among the diverse types of pollutants produced by anthropogenic activities, metals represent a serious threat, due to their accumulation in ecosystems and their elevated toxicity. The mine tailings of abandoned mines contain high levels of metals such as arsenic (As), zinc (Zn), copper (Cu), and lead (Pb), which do not suffer any degradation process, they are accumulated in environment. Abandoned mine tailings potentially could contaminate rivers and aquifers representing a risk for human health due to their high metal content. In an attempt to remove the metals and thereby mitigate the environmental pollution, an environmentally friendly and economical method of bioremediation has been introduced. Bioleaching has been actively studied over the last several years, and it is one of the bioremediation solutions used to treat heavy metals contained in sewage sludge, sediment and contaminated soil. Acidithiobacillus thiooxidans, an extremely acidophilic, chemolithoautotrophic, gram-negative, rod shaped microorganism, which is typically related to Cu mining operations (bioleaching), has been well studied for industrial applications. The sulfuric acid produced plays a major role in bioleaching. Specifically, Acidithiobacillus thiooxidans strain DSM 26636 has been able to leach Al, Ni, V, Fe, Mg, Si, and Ni contained in slags from coal combustion wastes. The present study reports the ability of A. thiooxidans DSM 26636 for the bioleaching of metals contained in two different mine tailing samples (MT1 and MT2). It was observed that Al, Fe, and Mn were removed in 36.3±1.7, 191.2±1.6, and 4.5±0.2 mg/kg for MT1, and in 74.5±0.3, 208.3±0.5, and 20.9±0.1 for MT2. Besides, < 1.5 mg/kg of Au and Ru were also bioleached from MT1; in MT2, bioleaching of Zn was observed at 55.7±1.3 mg/kg, besides removal of < 1.5 mg/kg was observed for As, Ir, Li, and 0.6 for Os in this residue. These results show the potential of strain DSM 26636 for the bioleaching of metals that came from different mine tailings.

Keywords: A. thiooxidans, bioleaching, metals, mine tailings.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 973
26 Autonomous Robots- Visual Perception in Underground Terrains Using Statistical Region Merging

Authors: Omowunmi E. Isafiade, Isaac O. Osunmakinde, Antoine B. Bagula

Abstract:

Robots- visual perception is a field that is gaining increasing attention from researchers. This is partly due to emerging trends in the commercial availability of 3D scanning systems or devices that produce a high information accuracy level for a variety of applications. In the history of mining, the mortality rate of mine workers has been alarming and robots exhibit a great deal of potentials to tackle safety issues in mines. However, an effective vision system is crucial to safe autonomous navigation in underground terrains. This work investigates robots- perception in underground terrains (mines and tunnels) using statistical region merging (SRM) model. SRM reconstructs the main structural components of an imagery by a simple but effective statistical analysis. An investigation is conducted on different regions of the mine, such as the shaft, stope and gallery, using publicly available mine frames, with a stream of locally captured mine images. An investigation is also conducted on a stream of underground tunnel image frames, using the XBOX Kinect 3D sensors. The Kinect sensors produce streams of red, green and blue (RGB) and depth images of 640 x 480 resolution at 30 frames per second. Integrating the depth information to drivability gives a strong cue to the analysis, which detects 3D results augmenting drivable and non-drivable regions in 2D. The results of the 2D and 3D experiment with different terrains, mines and tunnels, together with the qualitative and quantitative evaluation, reveal that a good drivable region can be detected in dynamic underground terrains.

Keywords: Drivable Region Detection, Kinect Sensor, Robots' Perception, SRM, Underground Terrains.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1830
25 Petrology Investigation of Apatite Minerals in the Esfordi Mine, Yazd, Iran

Authors: Haleh Rezaei Zanjirabadi, Fatemeh Saberi, Bahman Rahimzadeh, Fariborz Masoudi, Mohammad Rahgosha

Abstract:

In this study, apatite minerals from the iron-phosphate deposit of Yazd have been investigated within the microcontinent zone of Iran in the Zagros structural zone. The geological units in the Esfordi area belong to the pre-Cambrian to lower-Cambrian age, consisting of a succession of carbonate rocks (dolomite), shale, tuff, sandstone, and volcanic rocks. In addition to the mentioned sedimentary and volcanic rocks, the granitoid mass of Bahabad, which is the largest intrusive mass in the region, has intruded into the eastern part of this series and has caused its metamorphism and alteration. After collecting the available data, various samples of Esfordi’s apatite were prepared, and their mineralogy and crystallography were investigated using laboratory methods such as petrographic microscopy, Raman spectroscopy, EDS (Energy Dispersive Spectroscopy), and Scanning Electron Microscopy (SEM). In non-destructive Raman spectroscopy, the molecular structure of apatite minerals was revealed in four distinct spectral ranges. Initially, the spectra of phosphate and aluminum bonds with O2HO, OH, were observed, followed by the identification of Cl, OH, Al, Na, Ca and hydroxyl units depending on the type of apatite mineral family. In SEM analysis, based on various shapes and different phases of apatites, their constituent major elements were identified through EDS, indicating that the samples from the Esfordi mining area exhibit a dense and coherent texture with smooth surfaces. Based on the elemental analysis results by EDS, the apatites in the Esfordi area are classified into the calcic apatite group.

Keywords: Petrology, apatite, Esfordi, EDS, SEM, Scanning Electron Microscopy, Raman spectroscopy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 128
24 Performance Analysis of Search Medical Imaging Service on Cloud Storage Using Decision Trees

Authors: González A. Julio, Ramírez L. Leonardo, Puerta A. Gabriel

Abstract:

Telemedicine services use a large amount of data, most of which are diagnostic images in Digital Imaging and Communications in Medicine (DICOM) and Health Level Seven (HL7) formats. Metadata is generated from each related image to support their identification. This study presents the use of decision trees for the optimization of information search processes for diagnostic images, hosted on the cloud server. To analyze the performance in the server, the following quality of service (QoS) metrics are evaluated: delay, bandwidth, jitter, latency and throughput in five test scenarios for a total of 26 experiments during the loading and downloading of DICOM images, hosted by the telemedicine group server of the Universidad Militar Nueva Granada, Bogotá, Colombia. By applying decision trees as a data mining technique and comparing it with the sequential search, it was possible to evaluate the search times of diagnostic images in the server. The results show that by using the metadata in decision trees, the search times are substantially improved, the computational resources are optimized and the request management of the telemedicine image service is improved. Based on the experiments carried out, search efficiency increased by 45% in relation to the sequential search, given that, when downloading a diagnostic image, false positives are avoided in management and acquisition processes of said information. It is concluded that, for the diagnostic images services in telemedicine, the technique of decision trees guarantees the accessibility and robustness in the acquisition and manipulation of medical images, in improvement of the diagnoses and medical procedures in patients.

Keywords: Cloud storage, decision trees, diagnostic image, search, telemedicine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 934
23 An Empirical Investigation on the Dynamics of Knowledge and IT Industries in Korea

Authors: Sang Ho Lee, Tae Heon Moon, Youn Taik Leem, Kwang Woo Nam

Abstract:

Knowledge and IT inputs to other industrial production have become more important as a key factor for the competitiveness of national and regional economies, such as knowledge economies in smart cities. Knowledge and IT industries lead the industrial innovation and technical (r)evolution through low cost, high efficiency in production, and by creating a new value chain and new production path chains, which is referred as knowledge and IT dynamics. This study aims to investigate the knowledge and IT dynamics in Korea, which are analyzed through the input-output model and structural path analysis. Twenty-eight industries were reclassified into seven categories; Agriculture and Mining, IT manufacture, Non-IT manufacture, Construction, IT-service, Knowledge service, Non-knowledge service to take close look at the knowledge and IT dynamics. Knowledge and IT dynamics were analyzed through the change of input output coefficient and multiplier indices in terms of technical innovation, as well as the changes of the structural paths of the knowledge and IT to other industries in terms of new production value creation from 1985 and 2010. The structural paths of knowledge and IT explain not only that IT foster the generation, circulation and use of knowledge through IT industries and IT-based service, but also that knowledge encourages IT use through creating, sharing and managing knowledge. As a result, this paper found the empirical investigation on the knowledge and IT dynamics of the Korean economy. Knowledge and IT has played an important role regarding the inter-industrial transactional input for production, as well as new industrial creation. The birth of the input-output production path has mostly originated from the knowledge and IT industries, while the death of the input-output production path took place in the traditional industries from 1985 and 2010. The Korean economy has been in transition to a knowledge economy in the Smart City.

Keywords: Knowledge and IT industries, input-output model, structural path analysis, dynamics of knowledge and IT, knowledge economy, knowledge city, smart city.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1162
22 Physicians’ Knowledge and Perception of Gene Profiling in Malaysia

Authors: Farahnaz Amini, Woo Yun Kin, Lazwani Kolandaiveloo

Abstract:

Availability of different genetic tests after completion of Human Genome Project increases the physicians’ responsibility to keep themselves update on the potential implementation of these genetic tests in their daily practice. However, due to numbers of barriers, still many of physicians are not either aware of these tests or are not willing to offer or refer their patients for genetic tests. This study was conducted an anonymous, cross-sectional, mailed-based survey to develop a primary data of Malaysian physicians’ level of knowledge and perception of gene profiling. Questionnaire had 29 questions. Total scores on selected questions were used to assess the level of knowledge. The highest possible score was 11. Descriptive statistics, one way ANOVA and chi-squared test was used for statistical analysis. Sixty three completed questionnaires were returned by 27 general practitioners (GPs) and 36 medical specialists. Responders’ age ranges from 24 to 55 years old (mean 30.2 ± 6.4). About 40% of the participants rated themselves as having poor level of knowledge in genetics in general whilst 60% believed that they have fair level of knowledge; however, almost half (46%) of the respondents felt that they were not knowledgeable about available genetic tests. A majority (94%) of the responders were not aware of any lab or company which is offering gene profiling services in Malaysia. Only 4% of participants were aware of using gene profiling for detection of dosage of some drugs. Respondents perceived greater utility of gene profiling for breast cancer (38%) compared to the colorectal familial cancer (3%). The score of knowledge ranged from 2 to 8 (mean 4.38 ± 1.67). Non- significant differences between score of knowledge of GPs and specialists were observed, with score of 4.19 and 4.58 respectively. There was no significant association between any demographic factors and level of knowledge. However, those who graduated between years 2001 to 2005 had higher level of knowledge. Overall, 83% of participants showed relatively high level of perception on value of gene profiling to detect patient’s risk of disease. However, low perception was observed for both statements of using gene profiling for general population in order to alter their lifestyle (25%) as well as having the full sequence of a patient genome for the purpose of determining a patient’s best match for treatment (18%). The lack of clinical guidelines, limited provider knowledge and awareness, lack of time and resources to educate patients, lack of evidence-based clinical information and cost of tests were the most barriers of ordering gene profiling mentioned by physicians. In conclusion Malaysian physicians who participate in this study had mediocre level of knowledge and awareness in gene profiling. The low exposure to the genetic questions and problems might be a key predictor of lack of awareness and knowledge on available genetic tests. Educational and training workshop might be useful in helping Malaysian physicians incorporate genetic profiling into practice for eligible patients.

Keywords: Gene Profiling, Knowledge, Malaysia, Physician.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1949
21 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra. I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and often ignored by learners. Especially in the booming realm of MOOC Massive Online Learning platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics, by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structures. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic - and syntactic level sentence processing based on linguistics will render a richer representation. We thus evaluate, the traditional LSTM versus other bleeding edge models, which take into account syntactic structure, such as tree-structured LSTM, Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on Our MOOC dataset, which is the most performant one comparing with a public dataset on sentiment analysis that is further used as a cross-examining for the models' results.

Keywords: Deep learning, data mining, gender predication, MOOCs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1350
20 Full-genomic Network Inference for Non-model organisms: A Case Study for the Fungal Pathogen Candida albicans

Authors: Jörg Linde, Ekaterina Buyko, Robert Altwasser, Udo Hahn, Reinhard Guthke

Abstract:

Reverse engineering of full-genomic interaction networks based on compendia of expression data has been successfully applied for a number of model organisms. This study adapts these approaches for an important non-model organism: The major human fungal pathogen Candida albicans. During the infection process, the pathogen can adapt to a wide range of environmental niches and reversibly changes its growth form. Given the importance of these processes, it is important to know how they are regulated. This study presents a reverse engineering strategy able to infer fullgenomic interaction networks for C. albicans based on a linear regression, utilizing the sparseness criterion (LASSO). To overcome the limited amount of expression data and small number of known interactions, we utilize different prior-knowledge sources guiding the network inference to a knowledge driven solution. Since, no database of known interactions for C. albicans exists, we use a textmining system which utilizes full-text research papers to identify known regulatory interactions. By comparing with these known regulatory interactions, we find an optimal value for global modelling parameters weighting the influence of the sparseness criterion and the prior-knowledge. Furthermore, we show that soft integration of prior-knowledge additionally improves the performance. Finally, we compare the performance of our approach to state of the art network inference approaches.

Keywords: Pathogen, network inference, text-mining, Candida albicans, LASSO, mutual information, reverse engineering, linear regression, modelling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1661
19 Radon-222 Concentration and Potential Risk to Workers of Al-Jalamid Phosphate Mines, North Province, Saudi Arabia

Authors: El-Said. I. Shabana, Mohammad S. Tayeb, Maher M. T. Qutub, Abdulraheem A. Kinsara

Abstract:

Usually, phosphate deposits contain 238U and 232Th in addition to their decay products. Due to their different pathways in the environment, the 238U/232Th activity concentration ratio usually found to be greater than unity in phosphate sediments. The presence of these radionuclides creates a potential need to control exposure of workers in the mining and processing activities of the phosphate minerals in accordance with IAEA safety standards. The greatest dose to workers comes from exposure to radon, especially 222Rn from the uranium series, and has to be controlled. In this regard, radon (222Rn) was measured in the atmosphere (indoor and outdoor) of Al-Jalamid phosphate-mines working area using a portable radon-measurement instrument RAD7, in a purpose of radiation protection. Radon was measured in 61 sites inside the open phosphate mines, the phosphate upgrading facility (offices and rooms of the workers, and in some open-air sites) and in the dwellings of the workers residence-village that lies at about 3 km from the mines working area. The obtained results indicated that the average indoor radon concentration was about 48.4 Bq/m3. Inside the upgrading facility, the average outdoor concentrations were 10.8 and 9.7 Bq/m3 in the concentrate piles and crushing areas, respectively. It was 12.3 Bq/m3 in the atmosphere of the open mines. These values are comparable with the global average values. Based on the average values, the annual effective dose due to radon inhalation was calculated and risk estimates have been done. The average annual effective dose to workers due to the radon inhalation was estimated by 1.32 mSv. The potential excess risk of lung cancer mortality that could be attributed to radon, when considering the lifetime exposure, was estimated by 53.0x10-4. The results have been discussed in detail.

Keywords: Dosimetry, environmental monitoring, phosphate deposits, radiation protection, radon-22.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1380