Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 395

Search results for: Attendance in classes

65 A New Bound on the Average Information Ratio of Perfect Secret-Sharing Schemes for Access Structures Based On Bipartite Graphs of Larger Girth

Abstract:

In a perfect secret-sharing scheme, a dealer distributes a secret among a set of participants in such a way that only qualified subsets of participants can recover the secret and the joint share of the participants in any unqualified subset is statistically independent of the secret. The access structure of the scheme refers to the collection of all qualified subsets. In a graph-based access structure, each vertex of a graph G represents a participant and each edge of G represents a minimal qualified subset. The average information ratio of a perfect secret-sharing scheme realizing a given access structure is the ratio of the average length of the shares given to the participants to the length of the secret. The infimum of the average information ratio over all possible perfect secret-sharing schemes realizing an access structure is called the optimal average information ratio of that access structure. We study the optimal average information ratio of the access structures based on bipartite graphs. Based on some previous results, we give a bound on the optimal average information ratio for all bipartite graphs of girth at least six. This bound is the best possible for some classes of bipartite graphs using our approach.
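The definition of the average information ratio given in the abstract can be illustrated with a small sketch. The function name and the share lengths below are hypothetical and chosen only for illustration; they are not taken from the paper and do not describe any particular scheme.

```python
from fractions import Fraction


def average_information_ratio(share_lengths, secret_length):
    """Average share length divided by the secret length (both in bits)."""
    if secret_length <= 0 or not share_lengths:
        raise ValueError("need a positive secret length and at least one share")
    mean_share = Fraction(sum(share_lengths), len(share_lengths))
    return mean_share / secret_length


# Hypothetical scheme: four participants holding 2, 3, 3 and 2 bits
# for a 2-bit secret (illustrative numbers only).
ratio = average_information_ratio([2, 3, 3, 2], 2)
print(ratio)  # 5/4
```

The optimal average information ratio is the infimum of this quantity over all perfect schemes realizing the access structure, which this sketch does not attempt to compute.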

Keywords: Secret-sharing scheme, average information ratio, star covering, deduction, core cluster.

Downloads: 1434

64 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) is used for classification of a diabetes disease dataset. The aims of this study are to reduce the variance within the attributes of the diabetes dataset and to improve the classification accuracy of the classifier algorithm by transforming non-linearly separable datasets into linearly separable ones. The Pima Indians diabetes dataset has two classes: normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of the K-means clustering method and is one of the most used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering was used to find the centers of the attributes in the Pima Indians diabetes dataset, and the dataset was then weighted according to the ratios of the attribute means to their centers. Secondly, after the weighting process, classifier algorithms including support vector machine (SVM) and k-nearest neighbor (k-NN) classifiers were used to classify the weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) obtained very promising results in the classification of the Pima Indians diabetes dataset.

Keywords: Fuzzy C-means clustering, Fuzzy C-means clustering based attribute weighting, Pima Indians diabetes dataset, SVM.

Downloads: 1763

63 Designing Social Care Policies in the Long Term: A Study Using Regression, Clustering and Backpropagation Neural Nets

Authors: Sotirios Raptis

Abstract:

Linking social needs to social classes using different criteria may lead to misuse of social services. The paper discusses using ML and Neural Networks (NNs) to link public services in Scotland in the long term and argues that this can reduce service costs by connecting the resources needed in groups of similar services. The paper combines typical regression models with clustering and cross-correlation as complementary constituents to predict the demand. Insurance companies and public policymakers can pack linked services, such as those offered to the elderly or to low-income people, in the longer term. The work is based on public data from 22 services offered by Public Health Services (PHS) Scotland and by the Scottish Government (SG) from 1981 to 2019, broken into 110 year series called factors, and uses Linear Regression (LR), Autoregressive Moving Average (ARMA) models and 3 types of back-propagation (BP) Neural Networks (BPNN) to link them under specific conditions. Relationships were found between smoking-related healthcare provision, mental health-related health services, and epidemiological weight in Primary 1 (Education) Body Mass Index (BMI) in children. Principal component analysis (PCA) found 11 significant factors, while C-Means (CM) clustering gave 5 major factor clusters.

Keywords: Probability, cohorts, data frames, services, prediction.

Downloads: 462

62 Face Recognition Using Principal Component Analysis, K-Means Clustering, and Convolutional Neural Network

Authors: Zukisa Nante, Wang Zenghui

Abstract:

Face recognition is the problem of identifying or recognizing individuals in an image. This paper investigates a possible solution to this problem. The proposed method combines Principal Component Analysis (PCA), K-Means clustering, and a Convolutional Neural Network (CNN) in a face recognition system. It is trained and evaluated using the ORL dataset, which consists of 400 face images in 40 classes of 10 images per class. Firstly, PCA enables the use of a smaller network, which reduces the training time of the CNN: redundancy is removed while the variance is preserved with a smaller number of coefficients. Secondly, the K-Means clustering model is trained on the compressed data obtained from PCA, which selects K-Means cluster centers with better characteristics. Lastly, the K-Means features serve as initial values of the CNN and act as its input data. The accuracy and performance of the proposed method were compared with other Face Recognition (FR) techniques, namely PCA, Support Vector Machine (SVM), and K-Nearest Neighbour (kNN). In the experiments, the proposed method achieved the highest performance after 90 epochs: 99% accuracy and F1-score, 99% precision, and 99% recall in 463.934 seconds. It outperformed PCA, which obtained 97%, and kNN, which obtained 84%. Therefore, this method proved to be efficient in identifying faces in images.

Keywords: Face recognition, Principal Component Analysis, PCA, Convolutional Neural Network, CNN, Rectified Linear Unit, ReLU, feature extraction.

Downloads: 506

61 Contraception in Guatemala, Panajachel and the Surrounding Areas: Barriers Affecting Women’s Contraceptive Usage

Authors: Natasha Bhate

Abstract:

Contraception is important in helping to reduce maternal and infant mortality rates by allowing women to control the number and spacing of their children. It also reduces the need for unsafe abortions. Women worldwide use contraception; however, the contraceptive prevalence rate is still relatively low in Central American countries like Guatemala. There is also an unmet need for contraception in Guatemala, which is more significant among rural, indigenous women due to barriers preventing contraceptive use. The study objective was to investigate and analyse the current barriers women face in using contraception in Panajachel, Guatemala, and the surrounding areas, with a view to identifying ways to overcome these barriers. This included exploring the contraceptive barriers women believe exist and the influence of males in contraceptive decision making. The study took place at a charity in Panajachel, Guatemala, and had a cross-sectional, qualitative design to allow an in-depth understanding of the information gathered. This particular study design was also chosen to help inform the charity with qualitative research analysis, in view of their intent to create a local reproductive health programme. A semi-structured interview design, including photo facilitation to improve cross-cultural communication, with interpreter assistance, was utilized. A pilot interview was initially conducted, with small improvements required. Participants were recruited through purposive and convenience sampling. The study host at the charity acted as a gatekeeper; participants were identified through attendance of the charity’s women’s-initiative programme workshops. Twenty participants were selected and agreed to take part in the study; two did not attend, so a total of 18 participants were interviewed in June 2017. Interviews were audio-recorded and data were stored on encrypted memory sticks. Framework analysis was used to analyse the data using NVivo11 software.
The University of Leeds granted ethical approval for the research. Religion, language, the community, and fear of sickness were examples of existing contraceptive barrier themes recognized by many participants. The influence of men was also an important barrier identified, with themes of machismo and abuse preventing contraceptive use in some women. Women from more rural areas were believed to still face barriers which some participants did not encounter anymore, such as distance and affordability of contraceptives. Participants believed that informative workshops in various settings were an ideal method of overcoming existing contraceptive barriers and allowing women to be more empowered. The involvement of men in such workshops was also deemed important by participants to help reduce their negative influence in contraceptive usage. Overall, four recommendations following this study were made, including contraceptive educational courses, a gender equality campaign, couple-focused contraceptive workshops, and further qualitative research to gain a better insight into men’s opinions regarding women using contraception.

Keywords: Barrier, contraception, machismo, religion.

Downloads: 622

60 Selection of Best Band Combination for Soil Salinity Studies using ETM+ Satellite Images (A Case study: Nyshaboor Region, Iran)

Authors: S. H. Sanaeinejad, A. Astaraei, P. Mirhoseini.Mousavi, M. Ghaemi

Abstract:

One of the main environmental problems affecting extensive areas of the world is soil salinity. Traditional data collection methods are neither sufficient for addressing this important environmental problem nor accurate enough for soil studies. Remote sensing data could overcome most of these problems. Although satellite images are commonly used for these studies, there is still a need to find the best calibration between the data and the real situation in each specified area. The Neyshaboor area, in the north east of Iran, was selected as the field study area for this research. Landsat satellite images of this area were used to prepare suitable learning samples for processing and classifying the images. 300 locations were selected randomly in the area to collect soil samples, and finally 273 locations were reselected for further laboratory work and image processing analysis. The electrical conductivity of all samples was measured. Six reflective bands of ETM+ satellite images taken of the study area in 2002 were used for soil salinity classification. The classification was carried out using common algorithms based on the best band composition. The results showed that the reflective bands 7, 3, 4 and 1 are the best band composition for preparing the color composite images. We also found that hybrid classification is a suitable method for identifying and delineating different salinity classes in the area.

Keywords: Soil salinity, Remote sensing, Image processing, ETM+, Nyshaboor

Downloads: 2021

59 Transmission Model for Plasmodium Vivax Malaria: Conditions for Bifurcation

Authors: P. Pongsumpun, I.M. Tang

Abstract:

Plasmodium vivax malaria differs from P. falciparum malaria in that a person suffering from P. vivax infection can suffer relapses of the disease. This is due to the parasite being able to remain dormant in the liver of the patient, where it is able to re-infect the patient after a passage of time. During this stage, the patient is classified as being in the dormant class. The model describing the transmission of P. vivax malaria consists of a human population divided into four classes: the susceptible, the infected, the dormant and the recovered. The effect of a time delay on the transmission of this disease is studied. The time delay is the period in which the P. vivax parasite develops inside the mosquito (vector) before the vector becomes infectious (i.e., able to pass on the infection). We analyze our model using standard dynamic modeling methods. Two stable equilibrium states, a disease-free state E0 and an endemic state E1, are found to be possible. It is found that the E0 state is stable when a newly defined basic reproduction number G is less than one. If G is greater than one, the endemic state E1 is stable. The conditions for the endemic equilibrium state E1 to be a stable spiral node are established. For realistic values of the parameters in the model, it is found that solutions in phase space are trajectories spiraling into the endemic state. It is shown that limit cycle and chaotic behaviors can only be achieved with unrealistic parameter values.

Keywords: Equilibrium states, Hopf bifurcation, limit cycle behavior, local stability, Plasmodium Vivax, time delay.

Downloads: 2243

58 Reasons for Doing Job outside Household and Difficulties Faced by the Working Women of Bangladesh

Authors: Md. Sayeed Akhter, Md. Akhtar Hossain Mazumder, Syeda Afreena Mamun

Abstract:

Bangladesh is a patriarchal and male-dominated country. Traditional, cultural, social, and religious values and practices have reinforced the lower status accorded to women in society and have limited their opportunities for education, technical and vocational training, and involvement in earning activities outside their households. Since independence, a number of women have been working outside their households. This study attempts to find out the reasons for engaging in earning activities outside the household and the difficulties faced by upper and lower class working women in Bangladesh. Descriptive techniques were used to explore the objectives and research questions of the study. A survey was conducted among women working in Rajshahi city, Bangladesh, and face-to-face interviews were conducted to collect data. The findings of the study illustrate that most of the upper class working women took up jobs because they wanted to utilize their education and to bring solvency to the family, and they spend their income on meeting the needs of all the members of the family. On the other hand, most of the lower class working women became involved in earning activities outside their households because they want to bring solvency to their families, and they spend their income on household expenditure. Women of both classes worried about their children because they had to stay at their workplace for a long time. Therefore, day care centers should be established beside their workplaces for their children.

Keywords: Working Women, Reasons for Doing Jobs, Working Environment, Difficulties Faced.

Downloads: 1791

57 Standard Deviation of Mean and Variance of Rows and Columns of Images for CBIR

Authors: H. B. Kekre, Kavita Patil

Abstract:

This paper describes a novel and effective approach to content-based image retrieval (CBIR) that represents each image in the database by a vector of feature values called “Standard deviation of mean vectors of color distribution of rows and columns of images for CBIR”. In many areas of commerce, government, academia, and hospitals, large collections of digital images are being created. This paper describes an approach that uses image content as the feature vector for retrieval of similar images. There are several classes of features that are used to specify queries: colour, texture, shape, and spatial layout. Colour features are often easily obtained directly from the pixel intensities. In this paper, feature extraction is done for the texture descriptors 'variance' and 'Variance of Variances'. First, the standard deviation of the row means and of the column means is calculated for each of the R, G, and B planes. The six values obtained for one image act as a feature vector. Second, we calculate the variance of each row and column of the R, G, and B planes of an image. Then six standard deviations of these variance sequences are calculated to form a feature vector of dimension six. We applied our approach to a database of 300 BMP images. We determined the capability of automatic indexing by analyzing image content, using color and texture as features and applying Euclidean distance as the similarity measure.

Keywords: Standard deviation, image retrieval, color distribution, Variance, Variance of Variance, Euclidean distance.
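A minimal sketch of the first feature vector described in the abstract (standard deviation of the row means and of the column means for each of the R, G, and B planes), compared with Euclidean distance. The function names, the toy 2x2 planes, and the use of the population standard deviation are assumptions made for illustration; the abstract does not specify them.

```python
from math import dist
from statistics import mean, pstdev


def plane_features(plane):
    """Return (std of row means, std of column means) for one colour plane."""
    row_means = [mean(row) for row in plane]
    col_means = [mean(col) for col in zip(*plane)]
    # Population standard deviation is an assumption; the abstract
    # does not say which estimator is used.
    return pstdev(row_means), pstdev(col_means)


def feature_vector(r, g, b):
    """Six-value feature vector: two values for each of the R, G, B planes."""
    features = []
    for plane in (r, g, b):
        features.extend(plane_features(plane))
    return features


# Toy 2x2 colour planes (hypothetical pixel values).
r = [[0, 0], [4, 4]]
g = [[1, 3], [1, 3]]
b = [[2, 2], [2, 2]]

query = feature_vector(r, g, b)
candidate = feature_vector(g, r, b)  # same image with R and G swapped
print(query)
print(dist(query, candidate))  # Euclidean distance used as the similarity measure
```

The second feature vector in the abstract would follow the same pattern, with row and column variances in place of the means.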

Downloads: 3746

56 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Authors: Rodolfo Lorbieski, Silvia Modesto Nassar

Abstract:

Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the level 2 meta-classifier. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time, for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, a three-way ANOVA test was run for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only for time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.

Keywords: Stacking, multi-layers, ensemble, multi-class.

Downloads: 1094

55 Quantifying the Second-Level Digital Divide on Sub-National Level

Authors: Vladimir Korovkin, Albert Park, Evgeny Kaganer

Abstract:

The digital divide, the gap in access to the world of digital technologies and the socio-economic opportunities that they create, is an important phenomenon of the XXI century. This gap may exist between countries, regions within a country or socio-demographic groups, creating the classes of “digital haves and have-nots”. While the 1st-level divide (the difference in opportunities to access digital networks) has been shown to diminish with time, the issues of the 2nd-level divide (the difference in skills and usage of digital systems) and the 3rd-level divide (the difference in effects obtained from digital technology) may grow. The paper offers a systematic review of the literature on the measurement of the digital divide, noting a certain conceptual stagnation due to the lack of effective instruments that would capture the complex nature of the phenomenon. As a result, many important concepts do not receive the empirical exploration they deserve. As a solution, the paper suggests a composite Digital Life Index that studies digital supply and demand separately across seven independent dimensions, providing 14 subindices. The Index is based on Internet-borne data, in contrast to traditional research approaches that rely on official statistics or surveys. The application of the model to the study of the digital divide between Russian regions and between cities in China has brought promising results. The paper advances the existing methodological literature on the 2nd-level digital divide and can also inform practical decision-making regarding the strategies of national and regional digital development.

Keywords: Digital transformation, second-level digital divide, composite index, digital policy.

Downloads: 463

54 Interpreting Chopin’s Music Today: Mythologization of Art: Kitsch

Authors: Ilona Bala

Abstract:

The subject of this abstract is related to the notion of 'popular music', a notion that should be treated with extreme care, particularly when applied to Frederic Chopin, one of the greatest composers of Romanticism. By ‘popular music’, we mean a category of everyday music, set against the more intellectual kind, referred to as ‘classical’. We only need to look back to the culture of the nineteenth century to realize that this ‘popular music’ refers to the ‘music of the low’. It can be studied from a sociological viewpoint, or as sociological aesthetics. However, we cannot ignore the fact that, very quickly, this music spread to the wealthiest strata of the European society of the nineteenth century, while likewise the lowest classes often listen to the intellectual classical music, so pleasant to listen to. Further, we can observe that a sort of ‘sacralisation of kitsch’ occurs at the intersection between the classical and popular music. This process is the topic of this contribution. We will start by investigating the notion of kitsch through the study of Chopin’s popular compositions. However, before considering the popularisation of this music in today’s culture, we will have to focus on the use of the word kitsch in Chopin’s times, through his own musical aesthetics. Finally, the objective here will be to negate the theory that art is simply the intellectual definition of aesthetics. A kitsch can, obviously, only work on the emotivity of the masses, as it represents one of the features of culture-language (the words which the masses identify with). All art is transformed, becoming something outdated or even outmoded. Here, we are truly within a process of mythologization of art, through the study of the aesthetic reception of the musical work.

Keywords: F. Chopin, musical work, popular music, romantic music, mythologization of art, kitsch.

Downloads: 1272

53 Satellite Data Classification Accuracy Assessment Based from Reference Dataset

Authors: Mohd Hasmadi Ismail, Kamaruzaman Jusoff

Abstract:

In order to develop forest management strategies for tropical forest in Malaysia, surveying the forest resources and monitoring the forest area affected by logging activities is essential. Tremendous effort has gone into the classification of land cover related to forest resource management in this country, as it is a priority in all aspects of forest mapping using remote sensing and related technology such as GIS. In fact, classification is a compulsory step in any remote sensing research. Therefore, the main objective of this paper is to assess the classification accuracy of a classified forest map derived from Landsat TM data using different numbers of reference data (200 and 388 reference data). This comparison was made through observation (200 reference data), and through interpretation and observation approaches (388 reference data). Five land cover classes, namely primary forest, logged-over forest, water bodies, bare land and agricultural crop/mixed horticulture, can be identified by differences in spectral wavelength. The results showed that the overall accuracy from 200 reference data was 83.5% (kappa value 0.7502459; kappa variance 0.002871), which was considered acceptable or good for optical data. However, when the 200 reference data were increased to 388 in the confusion matrix, the accuracy improved slightly from 83.5% to 89.17%, with the Kappa statistic increasing from 0.7502459 to 0.8026135. The accuracy of this classification suggests that the strategy for selecting training areas, the interpretation approach and the number of reference data used are important for obtaining a better classification result.
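The overall accuracy and Kappa statistic reported in the abstract are both derived from a confusion matrix. The sketch below uses the standard Cohen's kappa formula; the function name and the 2x2 matrix are hypothetical and are not the paper's data, which covers five land cover classes.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are reference classes and columns are classified classes.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example with 50 reference samples.
accuracy, kappa = overall_accuracy_and_kappa([[20, 5], [10, 15]])
print(round(accuracy, 3), round(kappa, 3))
```

In this toy case the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is about 0.4: kappa is lower than the raw accuracy because it discounts agreement expected by chance.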

Keywords: Image Classification, Reference Data, Accuracy Assessment, Kappa Statistic, Forest Land Cover

Downloads: 3141

52 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in the spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors contain important geographical information that can be used for remote sensing applications such as region planning and disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered a classification problem, so this task can be performed using machine-learning techniques. Among the many machine-learning algorithms available, the classification here is done using supervised classifiers such as Support Vector Machines (SVM), as the area of interest is known. We propose a classification method that considers neighboring pixels in a region for feature extraction and evaluates classifications precisely according to neighboring classes for semantic interpretation of the region of interest (ROI). A dataset was created for training and testing purposes; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrate the benefits of using knowledge discovery and data-mining techniques, which can be applied to image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: Remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction.

Downloads: 2055

51 Feature Reduction of Nearest Neighbor Classifiers using Genetic Algorithm

Authors: M. Analoui, M. Fadavi Amiri

Abstract:

The design of a pattern classifier includes an attempt to select, among a set of possible features, a minimum subset of weakly correlated features that better discriminate the pattern classes. This is usually a difficult task in practice, normally requiring the application of heuristic knowledge about the specific problem domain. The selection and quality of the features representing each pattern have a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and increasing classification efficiency, it does not necessarily reduce the number of features that must be measured since each new feature may be a linear combination of all of the features in the original pattern vector. In this paper a new approach is presented to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. In this approach each feature value is first normalized by a linear equation, then scaled by the associated weight prior to training, testing, and classification. A knn classifier is used to evaluate each set of feature weights. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. By this approach, the number of features used in classifying can be finely reduced.
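The evaluation step described in the abstract (normalize each feature linearly, scale it by its weight, then score the weight vector with a nearest-neighbor classifier) might be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: min-max scaling, 1-NN with leave-one-out scoring, the function names, and the toy data are all hypothetical, and the genetic algorithm that would search over the weight vectors is not shown.

```python
from math import dist


def normalize(patterns):
    """Min-max scale each feature to [0, 1] (one possible linear normalization)."""
    columns = list(zip(*patterns))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [[(v - lo) / s for v, lo, s in zip(p, lows, spans)] for p in patterns]


def fitness(weights, patterns, labels):
    """Leave-one-out 1-NN accuracy after scaling the features by `weights`."""
    scaled = [[w * v for w, v in zip(weights, p)] for p in normalize(patterns)]
    correct = 0
    for i, p in enumerate(scaled):
        # Nearest neighbor among all other patterns.
        nearest = min(
            (j for j in range(len(scaled)) if j != i),
            key=lambda j: dist(p, scaled[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(scaled)


# Toy data: feature 0 separates the classes, feature 1 is misleading.
patterns = [[0.0, 5.0], [0.1, 0.0], [1.0, 5.1], [0.9, 0.1]]
labels = [0, 0, 1, 1]

print(fitness([1.0, 1.0], patterns, labels))  # both features used
print(fitness([1.0, 0.0], patterns, labels))  # feature 1 switched off
```

A weight of zero removes a feature from the distance computation entirely, which is how optimizing the weights can also reduce the number of features that must be measured.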

Keywords: Feature reduction, genetic algorithm, pattern classification, nearest neighbor rule classifiers (k-NNR).

Downloads: 1768

50 Designing Social Care Plans Considering Cause-Effect Relationships: A Study in Scotland

Authors: Sotirios N. Raptis

Abstract:

The paper links social needs to social classes by the creation of cohorts of public services matched as causes to other ones as effects using cause-effect (CE) models. It then compares these associations using CE and typical regression methods (LR, ARMA). The paper discusses such public service groupings offered in Scotland in the long term to estimate the risk of multiple causes or effects that can ultimately reduce the healthcare cost by linking the next services to the likely causes of them. The same generic goal can be achieved using LR or ARMA and differences are discussed. The work uses Health and Social Care (H&Sc) public services data from 11 service packs offered by Public Health Services (PHS) Scotland that boil down to 110 single-attribute year series, called ’factors’. The study took place at Macmillan Cancer Support, UK and Abertay University, Dundee, from 2020 to 2023. The paper discusses CE relationships as a main method and compares sample findings with Linear Regression (LR), ARMA, to see how the services are linked. Relationships found were between smoking-related healthcare provision, mental-health-related services, and epidemiological weight in Primary-1-Education Body-Mass-Index (BMI) in children as CE models. Insurance companies and public policymakers can pack CE-linked services in plans such as those for the elderly, low-income people, in the long term. The linkage of services was confirmed allowing more accurate resource planning.

Keywords: Probability, regression, cause-effect cohorts, data frames, services, prediction.

Downloads: 58

49 Probabilistic Crash Prediction and Prevention of Vehicle Crash

Authors: Lavanya Annadi, Fahimeh Jafari

Abstract:

Transportation brings immense benefits to society, but it also has its costs. Costs include the cost of infrastructure, personnel, and equipment, but also the loss of life and property in traffic accidents on the road, delays in travel due to traffic congestion, and various indirect costs in terms of air transport. This research aims to predict the probability of vehicle crashes in the United States using Machine Learning, considering natural and structural causes and excluding spontaneous causes such as overspeeding. These factors range from meteorological elements such as weather conditions, precipitation, visibility, wind speed, wind direction, temperature, pressure, and humidity, to human-made structures, such as the road structure components Bumps, Roundabouts, No Exit, Turning Loops, Give Way, etc. The probabilities are categorized into ten distinct classes. All the predictions are based on multiclass classification techniques, which are a form of supervised learning. This study considers all crashes in all states collected by the US government. The probability of a crash was determined by employing the Multinomial Expected Value, and a classification label was assigned accordingly. We applied three classification models: multiclass Logistic Regression, Random Forest and XGBoost. The numerical results show that XGBoost achieved a 75.2% accuracy rate, which indicates the part played by natural and structural causes in crashes. The paper also provides in-depth insights through exploratory data analysis.

Keywords: Road safety, crash prediction, exploratory analysis, machine learning.

Downloads 83

48 Bayes Net Classifiers for Prediction of Renal Graft Status and Survival Period

Authors: Jiakai Li, Gursel Serpen, Steven Selman, Matt Franchetti, Mike Riesen, Cynthia Schneider

Abstract:

This paper presents the development of a Bayesian belief network classifier for prediction of graft status and survival period in renal transplantation using the patient profile information available prior to the transplantation. The objective was to explore the feasibility of developing a decision-making tool for identifying the most suitable recipient among the candidate pool members. The dataset was compiled from the University of Toledo Medical Center Hospital patients as reported to the United Network for Organ Sharing (UNOS), and had 1228 patient records for the period covering 1987 through 2009. The Bayes net classifiers were developed using the Weka machine learning software workbench. Two separate classifiers were induced from the data set, one to predict the status of the graft as either failed or living, and a second to predict the graft survival period. The classifier for graft status prediction performed very well, with a prediction accuracy of 97.8% and true positive rates of 0.967 and 0.988 for the living and failed classes, respectively. The second classifier, which predicts the graft survival period, yielded a prediction accuracy of 68.2% and a true positive rate of 0.85 for the class representing those instances with kidneys failing during the first year following transplantation. Simulation results indicated that it is feasible to develop a successful Bayesian belief network classifier for prediction of graft status, but not the graft survival period, using the information in the UNOS database.
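
The two figures reported for each classifier, overall accuracy and per-class true positive rate, can both be read off a confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' Weka workflow; the function names and the counts are hypothetical and are not taken from the UNOS data.

```python
def true_positive_rates(confusion):
    """Per-class true positive rate: correct predictions for a class
    divided by all instances that actually belong to that class.
    `confusion[actual][predicted]` holds instance counts."""
    return {actual: row[actual] / sum(row.values())
            for actual, row in confusion.items()}


def accuracy(confusion):
    """Overall accuracy: correctly classified instances over all instances."""
    correct = sum(row[label] for label, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total


# Hypothetical counts for illustration only (NOT the paper's results).
confusion = {
    "living": {"living": 90, "failed": 10},
    "failed": {"living": 5, "failed": 95},
}

rates = true_positive_rates(confusion)
overall = accuracy(confusion)
print(rates, overall)
```

A high overall accuracy can coexist with a low true positive rate for a rare class, which is why the abstract reports both.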

Keywords: Bayesian network classifier, renal transplantation, graft survival period, United Network for Organ Sharing

Downloads 2109

47 Evaluation of the Impact of Dataset Characteristics for Classification Problems in Biological Applications

Authors: Kanthida Kusonmano, Michael Netzer, Bernhard Pfeifer, Christian Baumgartner, Klaus R. Liedl, Armin Graber

Abstract:

The availability of high dimensional biological datasets, such as those from gene expression, proteomic, and metabolic experiments, can be leveraged for the diagnosis and prognosis of diseases. Many classification methods in this area have been studied to predict disease states and separate predefined classes, such as patients with a particular disease versus healthy controls. However, most of the existing research focuses only on a specific dataset. There is a lack of generic comparison between classifiers, which might provide a guideline for biologists or bioinformaticians to select the proper algorithm for new datasets. In this study, we compare the performance of popular classifiers, namely Support Vector Machine (SVM), Logistic Regression, k-Nearest Neighbor (k-NN), Naive Bayes, Decision Tree, and Random Forest, based on mock datasets. We mimic common biological scenarios by simulating various proportions of real discriminating biomarkers and different effect sizes thereof. The results show that SVM performs quite stably and reaches a higher AUC than the other methods. This may be explained by the ability of SVM to minimize the probability of error. Moreover, Decision Tree, with its good applicability for diagnosis and prognosis, shows good performance in our experimental setup. Logistic Regression and Random Forest, however, depend strongly on the ratio of discriminators and perform better with a higher number of discriminators.

Keywords: Classification, High dimensional data, Machine learning

Downloads 2384

46 An Online Space for Practitioners in the Water, Sanitation and Hygiene Sector

Authors: Olivier Mills, Bernard McDonell, Laura A. S. MacDonald

Abstract:

The increasing availability and quality of internet access throughout the developing world provides an opportunity to utilize online spaces to disseminate water, sanitation and hygiene (WASH) knowledge to practitioners. Since 2001, CAWST has provided in-person education, training and consulting services to thousands of WASH practitioners all over the world, supporting them to start, troubleshoot, improve and expand their WASH projects. As CAWST continues to grow, the organization faces challenges in meeting demand from clients and in providing consistent, timely technical support. In 2012, CAWST began utilizing online spaces to expand its reach by developing a series of resource websites and webinars. CAWST has developed a WASH Education and Training resources website, a Biosand Filter (BSF) Knowledge Base, a Household Water Treatment and Safe Storage Knowledge Base, a mobile app for offline users, a live chat support tool, a WASH e-library, and a series of webinar-style online training sessions to complement its in-person capacity development services. In order to determine the preliminary outcomes of providing these online services, CAWST has monitored and analyzed registration to the online spaces, downloads of the educational materials, and webinar attendance, and has conducted user surveys. The purpose of this analysis was to find out who was using the online spaces, where users came from, and how the resources were being used. CAWST’s WASH Resources website has served over 5,800 registered users from 3,000 organizations in 183 countries. Additionally, the BSF Knowledge Base has served over 1,000 registered users from 68 countries, and over 540 people from 73 countries have attended CAWST’s online training sessions. This indicates that the online spaces are effectively reaching a large number of users from a range of countries.
A 2016 survey of the Biosand Filter Knowledge Base showed that approximately 61% of users are practitioners, and 39% are either researchers or students. Of the respondents, 46% reported using the BSF Knowledge Base to initiate a BSF project and 43% reported using the information to train BSF technicians. Finally, 61% indicated they would like even greater support from CAWST’s Technical Advisors going forward. The analysis has provided an encouraging indication that CAWST’s online spaces are contributing to its objective of engaging and supporting WASH practitioners to start, improve and expand their initiatives. CAWST has learned several lessons during the development of these online spaces, in particular about the resources needed to create and maintain the spaces and to respond to the demand created. CAWST plans to continue expanding its online spaces, improving user experience of the sites, and involving new contributors and content types. Through the use of online spaces, CAWST has been able to increase its global reach and impact without significantly increasing its human resources by connecting WASH practitioners with the information they most need, in a practical and accessible manner. This paper presents CAWST’s use of online spaces through the CAWST-developed platforms discussed above and the analysis of the use of these platforms.

Keywords: Education and training, knowledge sharing, online resources, water and sanitation.

Downloads 1683

45 Primary Level Teachers’ Response to Gender Representation in Textbook Contents

Authors: Pragya Paneru

Abstract:

This paper explores the views of 10 primary teachers on gender representation in primary-level textbooks. Data were collected from teachers who taught in private schools in the Kailali and Kathmandu districts. This research uses a semi-structured interview method to obtain information regarding teachers’ attitudes toward gender representations in textbook contents. The interview data were analysed by using critical skills of qualitative research. The findings revealed that most of the teachers were unaware of gender issues and regarded them as insignificant to discuss in primary-level classes. Most of them responded to the questions personally and claimed that there were no gender issues in their classrooms. Some of the teachers connected gender issues with contexts other than textbook representations, such as school discrimination in the distribution of salary among male and female teachers, school practices of awarding girls rather than boys as the most disciplined students, following a girls-first rule in the assembly marching, encouraging only girls in the stage shows, and involving students in gender-specific activities such as decorating works for girls and physical tasks for boys. The interviews also revealed teachers’ covert gendered attitudes in their remarks. Nevertheless, most of the teachers accepted that gender-biased contents have an impact on learners and that this problem can be solved with more gender-centred research in the education field, discussions, and training to increase awareness regarding gender issues. Agreeing with the teachers’ suggestions, this paper recommends proper training and awareness regarding how to confront gender issues in textbooks.

Keywords: Content analysis, gender equality, school education, critical awareness.

Downloads 243

44 Image Ranking to Assist Object Labeling for Training Detection Models

Authors: Tonislav Ivanov, Oleksii Nedashkivskyi, Denis Babeshko, Vadim Pinskiy, Matthew Putman

Abstract:

Training a machine learning model for object detection that generalizes well is known to benefit from a training dataset with diverse examples. However, training datasets usually contain many repeats of common examples of a class and lack rarely seen examples. This is due to the process commonly used during human annotation where a person would proceed sequentially through a list of images labeling a sufficiently high total number of examples. Instead, the method presented involves an active process where, after the initial labeling of several images is completed, the next subset of images for labeling is selected by an algorithm. This process of algorithmic image selection and manual labeling continues in an iterative fashion. The algorithm used for the image selection is a deep learning algorithm, based on the U-shaped architecture, which quantifies the presence of unseen data in each image in order to find images that contain the most novel examples. Moreover, the location of the unseen data in each image is highlighted, aiding the labeler in spotting these examples. Experiments performed using semiconductor wafer data show that labeling a subset of the data, curated by this algorithm, resulted in a model with a better performance than a model produced from sequentially labeling the same amount of data. Also, similar performance is achieved compared to a model trained on exhaustive labeling of the whole dataset. Overall, the proposed approach results in a dataset that has a diverse set of examples per class as well as more balanced classes, which proves beneficial when training a deep learning model.

Keywords: Computer vision, deep learning, object detection, semiconductor.

Downloads 829

43 Cumulative Learning based on Dynamic Clustering of Hierarchical Production Rules (HPRs)

Authors: Kamal K. Bharadwaj, Rekha Kandwal

Abstract:

An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of detail, and to focus our attention on the interesting aspects only. One such efficient and easy-to-understand system is the Hierarchical Production Rules (HPRs) system. An HPR, a standard production rule augmented with generality and specificity information, is of the following form: Decision If <condition> Generality Specificity. HPR systems are capable of handling taxonomical structures inherent in knowledge about the real world. In this paper, a set of related HPRs is called a cluster and is represented by an HPR-tree. This paper discusses an algorithm based on a cumulative learning scenario for dynamic structuring of clusters. The proposed scheme incrementally incorporates new knowledge into the set of clusters from the previous episodes and also maintains a summary of clusters as a Synopsis to be used in future episodes. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested incremental structuring of clusters would be useful in mining data streams.
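
To make the rule format concrete, an HPR and its cluster (HPR-tree) might be represented as follows. This is an illustrative sketch under stated assumptions, not the authors' algorithm: the class `HPR`, the helpers `add_rule` and `descendants`, and the example rules are all hypothetical, and the sketch assumes a general rule is added before its more specific rules.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HPR:
    """One hierarchical production rule (hypothetical representation)."""
    decision: str
    condition: List[str]
    generality: Optional[str] = None                      # decision of the more general rule
    specificity: List[str] = field(default_factory=list)  # decisions of more specific rules


def add_rule(cluster: Dict[str, HPR], rule: HPR) -> None:
    """Incrementally add a rule and link it under its general rule, if present."""
    cluster[rule.decision] = rule
    parent = cluster.get(rule.generality)
    if parent is not None and rule.decision not in parent.specificity:
        parent.specificity.append(rule.decision)


def descendants(cluster: Dict[str, HPR], decision: str) -> List[str]:
    """Depth-first list of all rules more specific than `decision`."""
    result = []
    for child in cluster[decision].specificity:
        result.append(child)
        result.extend(descendants(cluster, child))
    return result


# Hypothetical example rules, added one at a time as new knowledge arrives.
cluster: Dict[str, HPR] = {}
add_rule(cluster, HPR("vehicle", ["has_wheels"]))
add_rule(cluster, HPR("car", ["has_four_wheels"], generality="vehicle"))
add_rule(cluster, HPR("truck", ["carries_cargo"], generality="vehicle"))
add_rule(cluster, HPR("sports_car", ["is_fast"], generality="car"))

print(descendants(cluster, "vehicle"))
```

Storing the generality and specificity links by decision name keeps each rule self-describing, so a cluster can be viewed at any level of the hierarchy by starting the traversal from a different rule.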

Keywords: Cumulative learning, clustering, data mining, hierarchical production rules.

Downloads 1439

42 Evaluation of Gingival Hyperplasia Caused by Medications

Authors: Ilma Robo, Saimir Heta, Greta Plaka, Vera Ostreni

Abstract:

Purpose: Drug gingival hyperplasia is an uncommon pathology encountered during routine work in dental units. The purpose of this paper is to present the clinical appearance of gingival hyperplasia caused by medications. Three classes of medications are already known to cause hyperplasia, and the clinical cases encountered and included in this study have been compared with data from the literature. Materials and Methods: The study was conducted in a total of 311 patients, of whom 182 met the inclusion criteria and were included in our study. After each patient's history was recorded and it was established that the patients were aware of their chronic illness and were undergoing treatment with drugs associated with gingival hypertrophy, a clinical examination of the oral cavity was performed, with vertical and horizontal evaluation according to the periodontal indexes. Results: From the data collected during the study, it was observed that 97% of patients with gingival hyperplasia are treated with nifedipine. 84% of the patients treated with the selected medicines who showed gingival hyperplasia in the oral cavity had been exposed for more than 1 year and 1 month. According to the GOI, about 21% of patients are in the first rank of this index, 52% in the second rank, 24% in the third rank and 3% in the fourth rank. According to the horizontal growth index of gingival hyperplasia, grade 1 included about 61% of patients and grade 2 included about 39% of patients with gingival hyperplasia. The bacterial index divides patients by grade: grade 0 - 8.2%, grade 1 - 32.4%, grade 2 - 14% and grade 3 - 45.1%. Conclusions: The highest percentage of drug-induced gingival hyperplasia is associated with nifedipine taken systemically for more than 1 year.

Keywords: Drug gingival hyperplasia, horizontal growth index, vertical growth index.

Downloads 476

41 Impact of Vehicle Travel Characteristics on Level of Service: A Comparative Analysis of Rural and Urban Freeways

Authors: Anwaar Ahmed, Muhammad Bilal Khurshid, Samuel Labi

Abstract:

The effect of trucks on the level of service is determined by considering passenger car equivalents (PCE) of trucks. The current version of the Highway Capacity Manual (HCM) uses a single PCE value for all trucks combined. However, the composition of truck traffic varies from location to location; therefore, a single PCE value for all trucks may not correctly represent the impact of truck traffic at specific locations. Consequently, the present study developed separate PCE values for single-unit and combination trucks on different freeways to replace the single value provided in the HCM. Site-specific PCE values were developed using the concept of spatial lagging headways (that is, the distance between the rear bumpers of two vehicles in a traffic stream) measured from field traffic data. The study used data from four locations on a single urban freeway and three different rural freeways in Indiana. Three-stage least squares (3SLS) regression techniques were used to generate models that predicted lagging headways for passenger cars, single-unit trucks (SUT), and combination trucks (CT). The estimated PCE values for single-unit and combination trucks on basic urban freeways (level terrain) were 1.35 and 1.60, respectively. For rural freeways, the estimated PCE values for single-unit and combination trucks were 1.30 and 1.45, respectively. As expected, traffic variables such as vehicle flow rates and speed have significant impacts on vehicle headways. Study results revealed that the use of separate PCE values for different truck classes can have a significant influence on the LOS estimation.
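
A small worked example may help show how class-specific PCE values enter a capacity calculation. The sketch below assumes the common headway-ratio definition of PCE (truck lagging headway divided by passenger-car lagging headway), which the abstract does not spell out; the function names, headways, and traffic volumes are hypothetical, and only the urban PCE values 1.35 and 1.60 come from the abstract.

```python
def pce_from_headways(truck_headway_m, car_headway_m):
    """PCE as the ratio of a truck class's mean lagging headway to the
    passenger-car mean lagging headway (assumed definition)."""
    return truck_headway_m / car_headway_m


def equivalent_flow(cars, single_unit_trucks, combination_trucks,
                    pce_sut, pce_ct):
    """Convert a mixed vehicle count to passenger-car equivalents."""
    return cars + pce_sut * single_unit_trucks + pce_ct * combination_trucks


# Hypothetical headways in metres, chosen only to illustrate the ratio.
example_pce = pce_from_headways(truck_headway_m=54.0, car_headway_m=40.0)

# Urban freeway PCE values reported in the abstract: SUT 1.35, CT 1.60.
# The vehicle counts below are hypothetical.
flow = equivalent_flow(cars=1000, single_unit_trucks=100,
                       combination_trucks=200, pce_sut=1.35, pce_ct=1.60)
print(example_pce, flow)
```

Because combination trucks carry a larger PCE than single-unit trucks, two sites with the same total truck count but different truck mixes yield different equivalent flows, which is the effect a single combined PCE value cannot capture.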

Keywords: Level of Service, Capacity Analysis, Lagging Headway.

Downloads 2096

40 Prediction Modeling of Alzheimer’s Disease and Its Prodromal Stages from Multimodal Data with Missing Values

Authors: M. Aghili, S. Tabarestani, C. Freytes, M. Shojaie, M. Cabrerizo, A. Barreto, N. Rishe, R. E. Curiel, D. Loewenstein, R. Duara, M. Adjouadi

Abstract:

A major challenge in medical studies, especially those that are longitudinal, is the problem of missing measurements, which hinders the effective application of many machine learning algorithms. Furthermore, recent Alzheimer's Disease (AD) studies have focused on the delineation of Early Mild Cognitive Impairment (EMCI) and Late Mild Cognitive Impairment (LMCI) from cognitively normal controls (CN), which is essential for developing effective and early treatment methods. To address these challenges, this paper explores the potential of using the eXtreme Gradient Boosting (XGBoost) algorithm in handling missing values in multiclass classification. We seek a generalized classification scheme where all prodromal stages of the disease are considered simultaneously in the classification and decision-making processes. Given the large number of subjects (1631) included in this study and the presence of almost 28% missing values, we investigated the performance of XGBoost on the classification of the four classes AD, CN, EMCI, and LMCI. Using a 10-fold cross-validation technique, XGBoost is shown to outperform other state-of-the-art classification algorithms by 3% in terms of accuracy and F-score. Our model achieved an accuracy of 80.52%, a precision of 80.62% and a recall of 80.51%, supporting the more natural and promising multiclass classification.

Keywords: eXtreme Gradient Boosting, missing data, Alzheimer disease, early mild cognitive impairment, late mild cognitive impairment, multiclass classification, ADNI, support vector machine, random forest.

Downloads 958

39 Machine Learning Facing Behavioral Noise Problem in an Imbalanced Data Using One Side Behavioral Noise Reduction: Application to a Fraud Detection

Authors: Salma El Hajjami, Jamal Malki, Alain Bouju, Mohammed Berrada

Abstract:

With the expansion of machine learning and data mining in the context of Big Data analytics, a common problem that affects data is class imbalance. It refers to an imbalanced distribution of instances belonging to each class. This problem is present in many real-world applications such as fraud detection, network intrusion detection, medical diagnostics, etc. In these cases, data instances labeled negatively are significantly more numerous than the instances labeled positively. When this difference is too large, the learning system may face difficulty when tackling this problem, since it is initially designed to work in relatively balanced class distribution scenarios. Another important problem, which usually accompanies these imbalanced data, is the overlapping of instances between the two classes. It is commonly referred to as noise or overlapping data. In this article, we propose an approach called One Side Behavioral Noise Reduction (OSBNR). This approach presents a way to deal with the problem of class imbalance in the presence of a high noise level. OSBNR is based on two steps. Firstly, a cluster analysis is applied to group similar instances from the minority class into several behavior clusters. Secondly, we select and eliminate the instances of the majority class, considered as behavioral noise, which overlap with behavior clusters of the minority class. The results of experiments carried out on a representative public dataset confirm that the proposed approach is efficient for the treatment of class imbalances in the presence of noise.
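
The two steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' OSBNR implementation: the abstract names neither the clustering algorithm nor the overlap criterion, so a plain k-means with a deterministic start and a "within cluster radius" test are swapped in here, and all function names and data points are hypothetical.

```python
import math


def assign(points, centroids):
    """Group each point with its nearest centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        groups[nearest].append(p)
    return groups


def kmeans(points, k, iterations=20):
    """Plain k-means, seeded with the first k points (assumed choice)."""
    centroids = list(points[:k])
    for _ in range(iterations):
        groups = assign(points, centroids)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids, assign(points, centroids)


def reduce_majority(minority, majority, k):
    """Step 1: cluster the minority class into k behavior clusters.
    Step 2: drop majority instances that fall inside any cluster's radius
    (assumed overlap criterion: distance to centroid <= farthest member)."""
    centroids, groups = kmeans(minority, k)
    radii = [max((math.dist(p, c) for p in g), default=0.0)
             for c, g in zip(centroids, groups)]
    return [m for m in majority
            if not any(math.dist(m, c) <= r for c, r in zip(centroids, radii))]


# Hypothetical 2-D data: two minority behavior groups and five majority points.
minority = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
majority = [(0.5, 0.5), (5, 5), (10.5, 10.5), (3, 4), (20, 20)]

kept = reduce_majority(minority, majority, k=2)
print(kept)
```

Only the majority class is thinned, which is what makes the reduction one-sided: minority instances are never removed, so the scarce positive examples are preserved while the overlap region is cleared.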

Keywords: Machine learning, Imbalanced data, Data mining, Big data.

Downloads 1137

38 Optimization Modeling of the Hybrid Antenna Array for the DoA Estimation

Authors: Somayeh Komeylian

Abstract:

Direction of arrival (DoA) estimation is a crucial aspect of radar technologies for detecting and separating several signal sources. In this scenario, the antenna array output modeling involves numerous parameters including noise samples, signal waveform, signal directions, signal number, and signal-to-noise ratio (SNR), and thereby the methods of DoA estimation rely heavily on the generalization characteristic for establishing a large number of training data sets. Hence, we have analogously represented two different optimization models of DoA estimation: (1) the implementation of the decision directed acyclic graph (DDAG) for the multiclass least-squares support vector machine (LS-SVM), and (2) the optimization method of the deep neural network (DNN) radial basis function (RBF). We have rigorously verified that the LS-SVM DDAG algorithm is capable of accurately classifying DoAs for the three classes. However, the accuracy and robustness of DoA estimation are still highly sensitive to technological imperfections of the antenna arrays, such as non-ideal array design and manufacture, array implementation, mutual coupling effect, and background radiation, and thereby the method may fail to deliver high precision for DoA estimation. Therefore, this work makes a further contribution by developing the DNN-RBF model for DoA estimation to overcome the limitations of the non-parametric and data-driven methods in terms of array imperfection and generalization. The numerical results of implementing the DNN-RBF model have confirmed better DoA estimation performance compared with the LS-SVM algorithm. Consequently, we have analogously evaluated the performance of the two aforementioned optimization methods for DoA estimation using the mean squared error (MSE).

Keywords: DoA estimation, adaptive antenna array, Deep Neural Network, LS-SVM optimization model, radial basis function, MSE.

Downloads 539

37 Determination of Soil Loss by Erosion in Different Land Covers Categories and Slope Classes in Bovilla Watershed, Tirana, Albania

Authors: Valmir Baloshi, Fran Gjoka, Nehat Çollaku, Elvin Toromani

Abstract:

As a sediment production mechanism, soil erosion is the main environmental threat to the Bovilla watershed, including the decline of water quality of the Bovilla reservoir that provides drinking water to Tirana city (the capital of Albania). Therefore, an experiment with 25 erosion plots for soil erosion monitoring was set up in June 2017. The aim was to determine the soil loss on plot and watershed scale in the Bovilla watershed (Tirana region) for implementation of soil and water protection measures or payments for ecosystem services (PES) programs. The results of erosion monitoring for the period June 2017 - May 2018 showed that the highest surface runoff was recorded on bare land (38829.91 liters on a slope of 74%) and the lowest on forest land (12840.6 liters on a slope of 64%), while the highest soil loss was found on bare land (595.15 t/ha on a slope of 62%) and the lowest on forest land (18.99 t/ha on a slope of 64%). These values are much higher than the average rate of soil loss in the European Union (2.46 t/ha/year). Within the same slope class, the soil loss decreased from orchard or bare land to forest land, and within the same category of land use, the soil loss increased with increasing land slope. It is necessary to conduct chemical analyses of sediments to determine the amount of chemical elements that are leached out of the soil and end up in the Bovilla reservoir. It is concluded that PES programs should be implemented for rehabilitation of the sub-watersheds Ranxe, Vilez and Zall-Bastar of the Bovilla watershed with valuable conservation practices.

Keywords: ANOVA, Bovilla, land cover, slope, soil loss, watershed management.

Downloads 886

36 Evolutionary Origin of the αC Helix in Integrins

Authors: B. Chouhan, A. Denesyuk, J. Heino, M. S. Johnson, K. Denessiouk

Abstract:

Integrins are a large family of multidomain α/β cell signaling receptors. Some integrins contain an additional inserted I domain, whose earliest expression appears to be with the chordates, since I domains are observed in the urochordates Ciona intestinalis (vase tunicate) and Halocynthia roretzi (sea pineapple), but not in integrins of earlier diverging species. The domain's presence is viewed as a hallmark of integrins of higher metazoans; however, in vertebrates, there are clearly three structurally different classes: integrins without I domains, and two groups of integrins with I domains but separable by the presence or absence of an additional αC helix. For example, the αI domains in collagen-binding integrins from Osteichthyes (bony fish) and all higher vertebrates contain the specific αC helix, whereas the αI domains in non-collagen binding integrins from vertebrates and the αI domains from earlier diverging urochordate integrins, i.e. tunicates, do not. Unfortunately, within the early chordates, there is an evolutionary gap due to extinctions between the tunicates and cartilaginous fish. This, coupled with a knowledge gap due to the lack of complete genomic data from surviving species, means that the origin of collagen-binding αC-containing αI domains remains unknown. Here, we analyzed two available genomes from Callorhinchus milii (ghost shark/elephant shark; Chondrichthyes – cartilaginous fish) and Petromyzon marinus (sea lamprey; Agnathostomata), and several available Expression Sequence Tags from two Chondrichthyes species, Raja erinacea (little skate) and Squalus acanthias (dogfish shark), and from Eptatretus burgeri (inshore hagfish; Agnathostomata), which evolutionarily reside between the urochordates and osteichthyes. In P. marinus, we observed several fragments coding for the αC-containing αI domain, allowing us to shed more light on the evolution of the collagen-binding integrins.

Keywords: Integrin αI domain, integrin evolution, collagen binding, structure, αC helix

Downloads 3672