Search results for: heterogeneous massive data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25279

Search results for: heterogeneous massive data

24649 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA), and the CORE Group Polio Partners (CGPP) Secretariat have been working with Global Alliance for Vac-cines and Immunization (GAVI) to improve the immunization data quality in Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after the above interventions in health facilities in the pastoralist communities in Ethiopia. To this end, a comparative-cross-sectional study was conducted on 51 health facilities. The baseline data was collected in May 2019, while the end line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvment was seen in the accuracy of the pentavalent vaccine (PT)1 (p = 0.012) data at the health posts (HP), while PT3 (p = 0.010), and Measles (p = 0.020) at the health centers (HC). Besides, a highly sig-nificant improvment was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting was found to be < 8%, at the HP, and < 10% at the HC for PT3. The data completeness was also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities timely reported their respective immunization data, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for the policies and pro-grams targetting on improving immunization data qaulity in the pastoralist communities.

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 84
24648 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra. I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and often ignored by learners. Especially in the booming realm of MOOC Massive Online Learning platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics, by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structures. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic - and syntactic level sentence processing based on linguistics will render a richer representation. We thus evaluate, the traditional LSTM versus other bleeding edge models, which take into account syntactic structure, such as tree-structured LSTM, Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on Our MOOC dataset, which is the most performant one comparing with a public dataset on sentiment analysis that is further used as a cross-examining for the models' results.

Keywords: deep learning, data mining, gender predication, MOOCs

Procedia PDF Downloads 123
24647 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 202
24646 Numerical Simulation of Lightning Strike Direct Effects on Aircraft Skin Composite Laminate

Authors: Muhammad Khalil, Nader Abuelfoutouh, Gasser Abdelal, Adrian Murphy

Abstract:

Nowadays, the direct effects of lightning to aircrafts are of great importance because of the massive use of composite materials. In comparison with metallic materials, composites present several weaknesses for lightning strike direct effects. Especially, their low electrical and thermal conductivities lead to severe lightning strike damage. The lightning strike direct effects are burning, heating, magnetic force, sparking and arcing. As the problem is complex, we investigated it gradually. A magnetohydrodynamics (MHD) model is developed to simulate the lightning strikes in order to estimate the damages on the composite materials. Then, a coupled thermal-electrical finite element analysis is used to study the interaction between the lightning arc and the composite laminate and to investigate the material degradation.

Keywords: composite structures, lightning multiphysics, magnetohydrodynamic (MHD), coupled thermal-electrical analysis, thermal plasmas.

Procedia PDF Downloads 354
24645 Assessment of Low Income Housing Delivery, Accessibility and Affordability Problem in Nigeria

Authors: Asimiyu Mohammed Jinadu

Abstract:

Housing is a basic necessity of life. Housing plays a central role in the life of living organisms as it provides the basic platform for the life support systems in human settlements. It is considered a social service and a basic right. Despite the importance of housing, Nigeria as a nation is faced with the problem of quantitative and qualitative shortfall in the number of housing units required to accommodate the citizens. This study examined the accessibility and affordability problems of low-income housing in Nigeria. It relied on secondary data obtained for the records of government ministries and agencies. Descriptive statistics were used in the analysis, and the information was presented in simple tables and charts. The findings show that over the years the government has provided serviced plots of land, owner occupier houses and mortgage loans for the people. As at 2016, the Federal Housing Authority (FHA) has completed a total of 23,038 housing units while another 14, 488 units were on-going under the Public Private Partnership scheme across the country. The study revealed that a total of 910, 671 housing units were proposed by the Government under the various low-income housing programmes between 1960 and 2017, but only 156, 336 units were delivered within the period, representing 17.17% success rate. Amongst others, the low-income group faced the problems of low access to and unaffordability of the few low-income housing delivered in Nigeria. The study recommended that all abandoned housing projects should be reviewed, rationalized, completed and made available to the targeted low-income people. Investment in micro housing finance, design and implementation of pro-poor housing programme and massive investment in innovative slum upgrading programmes by both the government and private sector are also recommended to ameliorate the housing problems of the low-income group in Nigeria.

Keywords: housing, low income group, problem, programme

Procedia PDF Downloads 235
24644 Searching k-Nearest Neighbors to be Appropriate under Gaming Environments

Authors: Jae Moon Lee

Abstract:

In general, algorithms to find continuous k-nearest neighbors have been researched on the location based services, monitoring periodically the moving objects such as vehicles and mobile phone. Those researches assume the environment that the number of query points is much less than that of moving objects and the query points are not moved but fixed. In gaming environments, this problem is when computing the next movement considering the neighbors such as flocking, crowd and robot simulations. In this case, every moving object becomes a query point so that the number of query point is same to that of moving objects and the query points are also moving. In this paper, we analyze the performance of the existing algorithms focused on location based services how they operate under gaming environments.

Keywords: flocking behavior, heterogeneous agents, similarity, simulation

Procedia PDF Downloads 281
24643 FSO Performance under High Solar Irradiation: Case Study Qatar

Authors: Syed Jawad Hussain, Abir Touati, Farid Touati

Abstract:

Free-Space Optics (FSO) is a wireless technology that enables the optical transmission of data though the air. FSO is emerging as a promising alternative or complementary technology to fiber optic and wireless radio-frequency (RF) links due to its high-bandwidth, robustness to EMI, and operation in unregulated spectrum. These systems are envisioned to be an essential part of future generation heterogeneous communication networks. Despite the vibrant advantages of FSO technology and the variety of its applications, its widespread adoption has been hampered by rather disappointing link reliability for long-range links due to atmospheric turbulence-induced fading and sensitivity to detrimental climate conditions. Qatar, with modest cloud coverage, high concentrations of airborne dust and high relative humidity particularly lies in virtually rainless sunny belt with a typical daily average solar radiation exceeding 6 kWh/m2 and 80-90% clear skies throughout the year. The specific objective of this work is to study for the first time in Qatar the effect of solar irradiation on the deliverability of the FSO Link. In order to analyze the transport media, we have ported Embedded Linux kernel on Field Programmable Gate Array (FPGA) and designed a network sniffer application that can run into FPGA. We installed new FSO terminals and configure and align them successively. In the reporting period, we carry out measurement and relate them to weather conditions.

Keywords: free space optics, solar irradiation, field programmable gate array, FSO outage

Procedia PDF Downloads 347
24642 Load Balancing Algorithms for SIP Server Clusters in Cloud Computing

Authors: Tanmay Raj, Vedika Gupta

Abstract:

For its groundbreaking and substantial power, cloud computing is today’s most popular breakthrough. It is a sort of Internet-based computing that allows users to request and receive numerous services in a cost-effective manner. Virtualization, grid computing, and utility computing are the most widely employed emerging technologies in cloud computing, making it the most powerful. However, cloud computing still has a number of key challenges, such as security, load balancing, and non-critical failure adaption, to name a few. The massive growth of cloud computing will put an undue strain on servers. As a result, network performance will deteriorate. A good load balancing adjustment can make cloud computing more productive and in- crease client fulfillment execution. Load balancing is an important part of cloud computing because it prevents certain nodes from being overwhelmed while others are idle or have little work to perform. Response time, cost, throughput, performance, and resource usage are all parameters that may be improved using load balancing.

Keywords: cloud computing, load balancing, computing, SIP server clusters

Procedia PDF Downloads 103
24641 Modelling Railway Noise Over Large Areas, Assisted by GIS

Authors: Conrad Weber

Abstract:

The modelling of railway noise over large projects areas can be very time consuming in terms of preparing the noise models and calculation time. An open-source GIS program has been utilised to assist with the modelling of operational noise levels for 675km of railway corridor. A range of GIS algorithms were utilised to break up the noise model area into manageable calculation sizes. GIS was utilised to prepare and filter a range of noise modelling inputs, including building files, land uses and ground terrain. A spreadsheet was utilised to manage the accuracy of key input parameters, including train speeds, train types, curve corrections, bridge corrections and engine notch settings. GIS was utilised to present the final noise modelling results. This paper explains the noise modelling process and how the spreadsheet and GIS were utilised to accurately model this massive project efficiently.

Keywords: noise, modeling, GIS, rail

Procedia PDF Downloads 105
24640 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms decision tree, knowledge extraction

Procedia PDF Downloads 536
24639 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyze data which are used to predict helpful information. It is the field of research which solve various type of problem. In data mining, classification is an important technique to classify different kind of data. Diabetes is most common disease. This paper implements different classification technique using Waikato Environment for Knowledge Analysis (WEKA) on diabetes dataset and find which algorithm is suitable for working. The best classification algorithm based on diabetic data is Naïve Bayes. The accuracy of Naïve Bayes is 76.31% and take 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 132
24638 Soil Salinity from Wastewater Irrigation in Urban Greenery

Authors: H. Nouri, S. Chavoshi Borujeni, S. Anderson, S. Beecham, P. Sutton

Abstract:

The potential risk of salt leaching through wastewater irrigation is of concern for most local governments and city councils. Despite the necessity of salinity monitoring and management in urban greenery, most attention has been on agricultural fields. This study was defined to investigate the capability and feasibility of monitoring and predicting soil salinity using near sensing and remote sensing approaches using EM38 surveys, and high-resolution multispectral image of WorldView3. Veale Gardens within the Adelaide Parklands was selected as the experimental site. The results of the near sensing investigation were validated by testing soil salinity samples in the laboratory. Over 30 band combinations forming salinity indices were tested using image processing techniques. The outcomes of the remote sensing and near sensing approaches were compared to examine whether remotely sensed salinity indicators could map and predict the spatial variation of soil salinity through a potential statistical model. Statistical analysis was undertaken using the Stata 13 statistical package on over 52,000 points. Several regression models were fitted to the data, and the mixed effect modelling was selected the most appropriate one as it takes to account the systematic observation-specific unobserved heterogeneity. Results showed that SAVI (Soil Adjusted Vegetation Index) was the only salinity index that could be considered as a predictor for soil salinity but further investigation is needed. However, near sensing was found as a rapid, practical and realistically accurate approach for salinity mapping of heterogeneous urban vegetation.

Keywords: WorldView3, remote sensing, EM38, near sensing, urban green spaces, green smart cities

Procedia PDF Downloads 146
24637 Estimation of Ribb Dam Catchment Sediment Yield and Reservoir Effective Life Using Soil and Water Assessment Tool Model and Empirical Methods

Authors: Getalem E. Haylia

Abstract:

The Ribb dam is one of the irrigation projects in the Upper Blue Nile basin, Ethiopia, to irrigate the Fogera plain. Reservoir sedimentation is a major problem because it reduces the useful reservoir capacity by the accumulation of sediments coming from the watersheds. Estimates of sediment yield are needed for studies of reservoir sedimentation and planning of soil and water conservation measures. The objective of this study was to simulate the Ribb dam catchment sediment yield using SWAT model and to estimate Ribb reservoir effective life according to trap efficiency methods. The Ribb dam catchment is found in North Western part of Ethiopia highlands, and it belongs to the upper Blue Nile and Lake Tana basins. Soil and Water Assessment Tool (SWAT) was selected to simulate flow and sediment yield in the Ribb dam catchment. The model sensitivity, calibration, and validation analysis at Ambo Bahir site were performed with Sequential Uncertainty Fitting (SUFI-2). The flow data at this site was obtained by transforming the Lower Ribb gauge station (2002-2013) flow data using Area Ratio Method. The sediment load was derived based on the sediment concentration yield curve of Ambo site. Stream flow results showed that the Nash-Sutcliffe efficiency coefficient (NSE) was 0.81 and the coefficient of determination (R²) was 0.86 in calibration period (2004-2010) and, 0.74 and 0.77 in validation period (2011-2013), respectively. Using the same periods, the NS and R² for the sediment load calibration were 0.85 and 0.79 and, for the validation, it became 0.83 and 0.78, respectively. The simulated average daily flow rate and sediment yield generated from Ribb dam watershed were 3.38 m³/s and 1772.96 tons/km²/yr, respectively. The effective life of Ribb reservoir was estimated using the developed empirical methods of the Brune (1953), Churchill (1948) and Brown (1958) methods and found to be 30, 38 and 29 years respectively. To conclude, massive sediment comes from the steep slope agricultural areas, and approximately 98-100% of this incoming annual sediment loads have been trapped by the Ribb reservoir. In Ribb catchment, as well as reservoir systematic and thorough consideration of technical, social, environmental, and catchment managements and practices should be made to lengthen the useful life of Ribb reservoir.

Keywords: catchment, reservoir effective life, reservoir sedimentation, Ribb, sediment yield, SWAT model

Procedia PDF Downloads 168
24636 Impact of Herbicides on Soil Biology in Rapeseed

Authors: M. Eickermann, M. K. Class, J. Junk

Abstract:

Winter oilseed rape, Brassica napus L., is characterized by a high number of herbicide applications. Therefore, its cultivation can lead to massive contamination of ground water and soil by herbicide and their metabolites. A multi-side long-term field experiment (EFFO, Efficient crop rotation) was set-up in Luxembourg to quantify these effects. Based on soil sampling and laboratory analysis, preliminary results showed reduced dehydrogenase activities of several soil organisms due to herbicide treatments. This effect is highly depending on the soil type. Relation between the dehydrogenase activity and the amount of microbial carbon showed higher variability on the test side with loamy Brown Earth, based on Bunter than on those with sandy-loamy Brown Earth, based on calciferous Sandstone.

Keywords: cropping system, dehydrogenase activity, herbicides, mechanical weed control, oilseed rape

Procedia PDF Downloads 231
24635 Sampling and Characterization of Fines Created during the Shredding of Non Hazardous Waste

Authors: Soukaina Oujana, Peggy Zwolinski

Abstract:

Fines are heterogeneous residues created during the shredding of non-hazardous waste. They are one of the most challenging issues faced by recyclers, because they are at the present time considered as non-sortable and non-reusable mixtures destined to landfill. However, fines contain a large amount of recoverable materials that could be recycled or reused for the production of solid recovered fuel. This research is conducted in relation to a project named ValoRABES. The aim is to characterize fines and establish a suitable sorting process in order to extract the materials contained in the mixture and define their suitable recovery paths. This paper will highlight the importance of a good sampling and will propose a sampling methodology for fines characterization. First results about the characterization will be also presented.

Keywords: fines, non-hazardous waste, recovery, shredding residues, waste characterization, waste sampling

Procedia PDF Downloads 177
24634 A Lexicographic Approach to Obstacles Identified in the Ontological Representation of the Tree of Life

Authors: Sandra Young

Abstract:

The biodiversity literature is vast and heterogeneous. In today’s data age, numbers of data integration and standardisation initiatives aim to facilitate simultaneous access to all the literature across biodiversity domains for research and forecasting purposes. Ontologies are being used increasingly to organise this information, but the rationalisation intrinsic to ontologies can hit obstacles when faced with the intrinsic fluidity and inconsistency found in the domains comprising biodiversity. Essentially the problem is a conceptual one: biological taxonomies are formed on the basis of specific, physical specimens yet nomenclatural rules are used to provide labels to describe these physical objects. These labels are ambiguous representations of the physical specimen. An example of this is with the genus Melpomene, the scientific nomenclatural representation of a genus of ferns, but also for a genus of spiders. The physical specimens for each of these are vastly different, but they have been assigned the same nomenclatural reference. While there is much research into the conceptual stability of the taxonomic concept versus the nomenclature used, to the best of our knowledge as yet no research has looked empirically at the literature to see the conceptual plurality or singularity of the use of these species’ names, the linguistic representation of a physical entity. Language itself uses words as symbols to represent real world concepts, whether physical entities or otherwise, and as such lexicography has a well-founded history in the conceptual mapping of words in context for dictionary making. This makes it an ideal candidate to explore this problem. The lexicographic approach uses corpus-based analysis to look at word use in context, with a specific focus on collocated word frequencies (the frequencies of words used in specific grammatical and collocational contexts). It allows for inconsistencies and contradictions in the source data and in fact includes these in the word characterisation so that 100% of the available evidence is counted. Corpus analysis is indeed suggested as one of the ways to identify concepts for ontology building, because of its ability to look empirically at data and show patterns in language usage, which can indicate conceptual ideas which go beyond words themselves. In this sense it could potentially be used to identify if the hierarchical structures present within the empirical body of literature match those which have been identified in ontologies created to represent them. The first stages of this research have revealed a hierarchical structure that becomes apparent in the biodiversity literature when annotating scientific species’ names, common names and more general names as classes, which will be the focus of this paper. The next step in the research is focusing on a larger corpus in which specific words can be analysed and then compared with existing ontological structures looking at the same material, to evaluate the methods by means of an alternative perspective. This research aims to provide evidence as to the validity of the current methods in knowledge representation for biological entities, and also shed light on the way that scientific nomenclature is used within the literature.

Keywords: ontology, biodiversity, lexicography, knowledge representation, corpus linguistics

Procedia PDF Downloads 123
24633 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is all about innovations and developments in data science and gives an idea about how efficiently to use the data provided. This study will help to understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing in which the main principle was to create an artificial system that can run independently of human-given programs and can function with the help of analyzing data to understand the requirements of the users. Data science comprises business understanding, analyzing data, ethical concerns, understanding programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we have covered a part of data science, i.e., machine learning. Machine learning uses data science for its work. Machines learn through their experience, which helps them to do any work more efficiently. This article includes a comparative study image between human understanding and machine understanding, advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have found its benefits and how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, services provided by them, forecasting, the ability to attend sustainable developments, etc. This study also focuses on a better understanding of data science which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 63
24632 Seismic Vulnerability Assessment of Masonry Buildings in Seismic Prone Regions: The Case of Annaba City, Algeria

Authors: Allaeddine Athmani, Abdelhacine Gouasmia, Tiago Ferreira, Romeu Vicente

Abstract:

Seismic vulnerability assessment of masonry buildings is a fundamental issue even for moderate to low seismic hazard regions. This fact is even more important when dealing with old structures such as those located in Annaba city (Algeria), which the majority of dates back to the French colonial era from 1830. This category of buildings is in high risk due to their highly degradation state, heterogeneous materials and intrusive modifications to structural and non-structural elements. Furthermore, they are usually shelter a dense population, which is exposed to such risk. In order to undertake a suitable seismic risk mitigation strategies and reinforcement process for such structures, it is essential to estimate their seismic resistance capacity at a large scale. In this sense, two seismic vulnerability index methods and damage estimation have been adapted and applied to a pilot-scale building area located in the moderate seismic hazard region of Annaba city: The first one based on the EMS-98 building typologies, and the second one derived from the Italian GNDT approach. To perform this task, the authors took the advantage of an existing data survey previously performed for other purposes. The results obtained from the application of the two methods were integrated and compared using a geographic information system tool (GIS), with the ultimate goal of supporting the city council of Annaba for the implementation of risk mitigation and emergency planning strategies.

Keywords: Annaba city, EMS98 concept, GNDT method, old city center, seismic vulnerability index, unreinforced masonry buildings

Procedia PDF Downloads 604
24631 A Description Logics Based Approach for Building Multi-Viewpoints Ontologies

Authors: M. Hemam, M. Djezzar, T. Djouad

Abstract:

We are interested in the problem of building an ontology in a heterogeneous organization, by taking into account different viewpoints and different terminologies of communities in the organization. Such ontology, that we call multi-viewpoint ontology, confers to the same universe of discourse, several partial descriptions, where each one is relative to a particular viewpoint. In addition, these partial descriptions share at global level, ontological elements constituent a consensus between the various viewpoints. In order to provide response elements to this problem we define a multi-viewpoints knowledge model based on viewpoint and ontology notions. The multi-viewpoints knowledge model is used to formalize the multi-viewpoints ontology in description logics language.

Keywords: description logic, knowledge engineering, ontology, viewpoint

Procedia PDF Downloads 292
24630 Thresholding Approach for Automatic Detection of Pseudomonas aeruginosa Biofilms from Fluorescence in situ Hybridization Images

Authors: Zonglin Yang, Tatsuya Akiyama, Kerry S. Williamson, Michael J. Franklin, Thiruvarangan Ramaraj

Abstract:

Pseudomonas aeruginosa is an opportunistic pathogen that forms surface-associated microbial communities (biofilms) on artificial implant devices and on human tissue. Biofilm infections are difficult to treat with antibiotics, in part, because the bacteria in biofilms are physiologically heterogeneous. One measure of biological heterogeneity in a population of cells is to quantify the cellular concentrations of ribosomes, which can be probed with fluorescently labeled nucleic acids. The fluorescent signal intensity following fluorescence in situ hybridization (FISH) analysis correlates to the cellular level of ribosomes. The goals here are to provide computationally and statistically robust approaches to automatically quantify cellular heterogeneity in biofilms from a large library of epifluorescent microscopy FISH images. In this work, the initial steps were developed toward these goals by developing an automated biofilm detection approach for use with FISH images. The approach allows rapid identification of biofilm regions from FISH images that are counterstained with fluorescent dyes. This methodology provides advances over other computational methods, allowing subtraction of spurious signals and non-biological fluorescent substrata. This method will be a robust and user-friendly approach which will enable users to semi-automatically detect biofilm boundaries and extract intensity values from fluorescent images for quantitative analysis of biofilm heterogeneity.

Keywords: image informatics, Pseudomonas aeruginosa, biofilm, FISH, computer vision, data visualization

Procedia PDF Downloads 119
24629 Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks

Authors: B. R. Campomanes-Alvarez, P. Quiros, B. Fernandez

Abstract:

Automatic Speech Recognition (ASR) is a machine-based process of decoding and transcribing oral speech. A typical ASR system receives acoustic input from a speaker or an audio file, analyzes it using algorithms, and produces an output in the form of a text. Some speech recognition systems use Hidden Markov Models (HMMs) to deal with the temporal variability of speech and Gaussian Mixture Models (GMMs) to determine how well each state of each HMM fits a short window of frames of coefficients that represents the acoustic input. Another way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition systems. Acoustic models for state-of-the-art ASR systems are usually training on massive amounts of data. However, audio files with their corresponding transcriptions can be difficult to obtain, especially in the Spanish language. Hence, in the case of these low-resource scenarios, building an ASR model is considered as a complex task due to the lack of labeled data, resulting in an under-trained system. Semi-supervised learning approaches arise as necessary tasks given the high cost of transcribing audio data. The main goal of this proposal is to develop a procedure based on acoustic semi-supervised learning for Spanish ASR systems by using DNNs. This semi-supervised learning approach consists of: (a) Training a seed ASR model with a DNN using a set of audios and their respective transcriptions. A DNN with a one-hidden-layer network was initialized; increasing the number of hidden layers in training, to a five. A refinement, which consisted of the weight matrix plus bias term and a Stochastic Gradient Descent (SGD) training were also performed. The objective function was the cross-entropy criterion. (b) Decoding/testing a set of unlabeled data with the obtained seed model. (c) Selecting a suitable subset of the validated data to retrain the seed model, thereby improving its performance on the target test set. To choose the most precise transcriptions, three confidence scores or metrics, regarding the lattice concept (based on the graph cost, the acoustic cost and a combination of both), was performed as selection technique. The performance of the ASR system will be calculated by means of the Word Error Rate (WER). The test dataset was renewed in order to extract the new transcriptions added to the training dataset. Some experiments were carried out in order to select the best ASR results. A comparison between a GMM-based model without retraining and the DNN proposed system was also made under the same conditions. Results showed that the semi-supervised ASR-model based on DNNs outperformed the GMM-model, in terms of WER, in all tested cases. The best result obtained an improvement of 6% relative WER. Hence, these promising results suggest that the proposed technique could be suitable for building ASR models in low-resource environments.

Keywords: automatic speech recognition, deep neural networks, machine learning, semi-supervised learning

Procedia PDF Downloads 325
24628 Application of Artificial Neural Network Technique for Diagnosing Asthma

Authors: Azadeh Bashiri

Abstract:

Introduction: Lack of proper diagnosis and inadequate treatment of asthma leads to physical and financial complications. This study aimed to use data mining techniques and creating a neural network intelligent system for diagnosis of asthma. Methods: The study population is the patients who had visited one of the Lung Clinics in Tehran. Data were analyzed using the SPSS statistical tool and the chi-square Pearson's coefficient was the basis of decision making for data ranking. The considered neural network is trained using back propagation learning technique. Results: According to the analysis performed by means of SPSS to select the top factors, 13 effective factors were selected, in different performances, data was mixed in various forms, so the different models were made for training the data and testing networks and in all different modes, the network was able to predict correctly 100% of all cases. Conclusion: Using data mining methods before the design structure of system, aimed to reduce the data dimension and the optimum choice of the data, will lead to a more accurate system. Therefore, considering the data mining approaches due to the nature of medical data is necessary.

Keywords: asthma, data mining, Artificial Neural Network, intelligent system

Procedia PDF Downloads 258
24627 Synthesis and Evaluation of Heterogeneous Nano-Catalyst: Cr Loaded in to MCM-41

Authors: A. Salemi Golezania, A. Sharifi Fateha

Abstract:

In this study a nano-composite catalyst was synthesized by incorporation of chromium into the framework of MCM-41 as a base catalyst. Mesoporous silica molecular sieves MCM-41 were synthesized under Hydrothermal Continues pH Adjusting Path Way. Then, MCM-41 was impregnated by chromium nitrate aqueous solution for several times under water aspiration. Raw powder was cured by heat treatment in vacuum furnace at 500°C. Phase formation, morphology and gas absorption properties of resulted materials were characterized by XRD, TEM and BET analysis, respectively. The results showed that high quality hexagonal meso structure as a matrix and Cr as a second phase has been formed with a narrow size pore diameter distribution and high surface area in Cr/MCM-41 nano-composite structure. The specific surface and total volume of porosity of the synthesized nanocomposite are obtained 931m^2/gr and 1.12 cm^3/gr, respectively.

Keywords: nano-catalyst, MCM-41, Cr/MCM-41, Marine Science and Engineering

Procedia PDF Downloads 371
24626 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 29
24625 Kirchhoff’s Depth Migration over Heterogeneous Velocity Models with Ray Tracing Modeling Approach

Authors: Alok Kumar Routa, Priya Ranjan Mohanty

Abstract:

Complex seismic signatures are generated due to the complexity of the subsurface which is difficult to interpret. In the present study, an attempt has been made to model the complex subsurface using the Ray tracing modeling technique. Add to this, for the imaging of these geological features, Kirchhoff’s prestack depth migration is applied over the synthetic common shot gather dataset. It is found that the Kirchhoff’s migration technique in addition with the Ray tracing modeling concept has the flexibility towards the imaging of various complex geology which gives satisfactory results with proper delineation of the reflectors at their respective true depth position. The entire work has been carried out under the MATLAB environment.

Keywords: Kirchhoff's migration, Prestack depth migration, Ray tracing modelling, velocity model

Procedia PDF Downloads 343
24624 Analyzing the Connection between Productive Structure and Communicable Diseases: An Econometric Panel Study

Authors: Julio Silva, Lia Hasenclever, Gilson G. Silva Jr.

Abstract:

The aim of this paper is to check possible convergence in health measures (aged-standard rate of morbidity and mortality) for communicable diseases between developed and developing countries, conditional to productive structures features. Understanding the interrelations between health patterns and economic development is particularly important in the context of low- and middle-income countries, where economic development comes along with deep social inequality. Developing countries with less diversified productive structures (measured through complexity index) but high heterogeneous inter-sectorial labor productivity (using as a proxy inter-sectorial coefficient of variation of labor productivity) has on average low health levels in communicable diseases compared to developed countries with high diversified productive structures and low labor market heterogeneity. Structural heterogeneity and productive diversification may have influence on health levels even considering per capita income. We set up a panel data for 139 countries from 1995 to 2015, joining several data about the countries, as economic development, health, and health system coverage, environmental and socioeconomic aspects. This information was obtained from World Bank, International Labour Organization, Atlas of Economic Complexity, United Nation (Development Report) and Institute for Health Metrics and Evaluation Database. Econometric panel models evidence shows that the level of communicable diseases has a positive relationship with structural heterogeneity, even considering other factors as per capita income. On the other hand, the recent process of convergence in terms of communicable diseases have been motivated for other reasons not directly related to productive structure, as health system coverage and environmental aspects. These evidences suggest a joint dynamics between the unequal distribution of communicable diseases and countries' productive structure aspects. These set of evidence are quite important to public policy as meet the health aims in Millennium Development Goals. It also highlights the importance of the process of structural change as fundamental to shift the levels of health in terms of communicable diseases and can contribute to the debate between the relation of economic development and health patterns changes.

Keywords: economic development, inequality, population health, structural change

Procedia PDF Downloads 120
24623 Locus of Control and Sense of Happiness: A Mediating Role of Self-Esteem

Authors: Ivanna Shubina

Abstract:

Background/Objectives and Goals: Recent interest in positive psychology is reflected in a plenty of studies conducted on its basic constructs (e.g. self-esteem and happiness) in interrelation with personality features, social rules, business and technology development. The purpose of this study is to investigate the mediating role of self-esteem, exploring the relationships between self-esteem and happiness, self-esteem and locus of control (LOC). It hypothesizes that self-esteem may be interpreted as a predictor of happiness and mediator in the locus of control establishment. A plenty of various empirical studies results have been analyzed in order to collect data for this theoretical study, and some of the analysed results can be considered as arguable or incoherent. However, the majority of results indicate a strong relationship between three considered concepts: self-esteem, happiness, the locus of control. Methods: In particular, this study addresses the following broad research questions: i) Is self-esteem just an index of global happiness? ii) May happiness be possible or realizable without a healthy self-confidence and self-acceptance? iii) To what extent does self-esteem influence on the level of happiness? iv) Is high self-esteem a sufficient condition for happiness? v) Is self-esteem is a strong predictor of internal locus of control maintenance? vi) Is high self-esteem related to internal LOC, while low self-esteem to external LOC? In order to find the answers for listed questions, 60 reliable sources have been analyzed, results of what are discussed more detailed below. Expected Results/Conclusion/Contribution:It is recognized that the relationship between self-esteem, happiness, locus of control is complex: internal LOC is contributing to happiness, but it is not directly related to it; self-esteem is a powerful and important psychological factor in mental health and well-being; the feelings of being worthy and empowered are associated with significant achievements and high self-esteem; strong and appropriate self-esteem (when the discrepancy between “ideal” and “real” self is balanced) is correlated with more internal LOC (when the individual tends to believe that personal achievements depend on possessed features, vigor, and persistence). Despite the special attention paid to happiness, the locus of control and self-esteem, independently, theoretical and empirical equivocations within each literature foreclose many obvious predictions about the nature of their empirical distinction. In terms of theoretical framework, no model has achieved consensus as an ultimate theoretical background for any of the mentioned constructs. To be able to clarify the relationship between self-esteem, happiness, and locus of control more interdisciplinary studies have to take place in order to get data on heterogeneous samples, provided from various countries, cultures, and social groups.

Keywords: happiness, locus of control, self-esteem, mediation

Procedia PDF Downloads 228
24622 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that ceteris paribus countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 51
24621 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays the multimedia data are used to store some secure information. All previous methods allocate a space in image for data embedding purpose after encryption. In this paper, we propose a novel method by reserving space in image with a boundary surrounded before encryption with a traditional RDH algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency by ten times compared to other processes as discussed.

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 396
24620 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 235