Search results for: supervised clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 565

Search results for: supervised clustering

25 Total and Partial Factor Productivity Analysis of Irrigated Wheat in Iran by Separate of Exploitation Scales

Authors: Hassan Masoumi, Rashed Alavi

Abstract:

Wheat is one of the strategic crops in Iran, on which the household food basket is highly dependent. Although this crop is cultivated and produced in almost all provinces of the country, its production efficiency is lower than the global and regional averages due to the lack of optimal use of allocated resources. In this research, which was carried out with a documentary and library method, first, the total and partial productivity indices of irrigated wheat production were calculated in large, medium and small exploitation scales in different provinces of the country, and then the provinces were clustered in terms of these indices. The results showed that the total productivity of production factors had a direct correlation with the scale of exploitation, so that with the increase in the size of exploitations, the total productivity index increased. On the scale of small exploitations, North Khorasan, Zanjan, Chaharmahal and Bakhtiari Province, on a medium scale, Chaharmahal and Bakhtiari Province and on the scale of large exploitations, Zanjan, Chaharmahal and Bakhtiari provinces, Kohkiloyeh and Boyer Ahmad and North Khorasan, with better use of production resources compared to other provinces, were placed in the best cluster in terms of total productivity index. The high total productivity index in Zanjan, Chaharmahal and Bakhtiari Province is related to the higher productivity of factors such as mechanization and land in these provinces. Finally, the methods of using these factors in productive provinces, along with technical and specialized regional guidelines, can facilitate the improvement of productivity in less productive provinces.

Keywords: Clustering, Irrigated wheat, Iran, total productivity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 168
24 A New Approach In Protein Folding Studies Revealed The Potential Site For Nucleation Center

Authors: Nurul Bahiyah Ahmad Khairudin, Habibah A Wahab

Abstract:

A new approach to predict the 3D structures of proteins by combining the knowledge-based method and Molecular Dynamics Simulation is presented on the chicken villin headpiece subdomain (HP-36). Comparative modeling is employed as the knowledge-based method to predict the core region (Ala9-Asn28) of the protein while the remaining residues are built as extended regions (Met1-Lys8; Leu29-Phe36) which then further refined using Molecular Dynamics Simulation for 120 ns. Since the core region is built based on a high sequence identity to the template (65%) resulting in RMSD of 1.39 Å from the native, it is believed that this well-developed core region can act as a 'nucleation center' for subsequent rapid downhill folding. Results also demonstrate that the formation of the non-native contact which tends to hamper folding rate can be avoided. The best 3D model that exhibits most of the native characteristics is identified using clustering method which then further ranked based on the conformational free energies. It is found that the backbone RMSD of the best model compared to the NMR-MDavg is 1.01 Å and 3.53 Å, for the core region and the complete protein, respectively. In addition to this, the conformational free energy of the best model is lower by 5.85 kcal/mol as compared to the NMR-MDavg. This structure prediction protocol is shown to be effective in predicting the 3D structure of small globular protein with a considerable accuracy in much shorter time compared to the conventional Molecular Dynamics simulation alone.

Keywords: 3D model, Chicken villin headpiece subdomain, Molecular dynamic simulation NMR-MDavg, RMSD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1549
23 Determination of Potential Agricultural Lands Using Landsat 8 OLI Images and GIS: Case Study of Gokceada (Imroz) Turkey

Authors: Rahmi Kafadar, Levent Genc

Abstract:

In present study, it was aimed to determine potential agricultural lands (PALs) in Gokceada (Imroz) Island of Canakkale province, Turkey. Seven-band Landsat 8 OLI images acquired on July 12 and August 13, 2013, and their 14-band combination image were used to identify current Land Use Land Cover (LULC) status. Principal Component Analysis (PCA) was applied to three Landsat datasets in order to reduce the correlation between the bands. A total of six Original and PCA images were classified using supervised classification method to obtain the LULC maps including 6 main classes (“Forest”, “Agriculture”, “Water Surface”, “Residential Area- Bare Soil”, “Reforestation” and “Other”). Accuracy assessment was performed by checking the accuracy of 120 randomized points for each LULC maps. The best overall accuracy and Kappa statistic values (90.83%, 0.8791% respectively) were found for PCA images which were generated from 14-bands combined images called 3- B/JA. Digital Elevation Model (DEM) with 15 m spatial resolution (ASTER) was used to consider topographical characteristics. Soil properties were obtained by digitizing 1:25000 scaled soil maps of Rural Services Directorate General. Potential Agricultural Lands (PALs) were determined using Geographic information Systems (GIS). Procedure was applied considering that “Other” class of LULC map may be used for agricultural purposes in the future properties. Overlaying analysis was conducted using Slope (S), Land Use Capability Class (LUCC), Other Soil Properties (OSP) and Land Use Capability Sub-Class (SUBC) properties. A total of 901.62 ha areas within “Other” class (15798.2 ha) of LULC map were determined as PALs. These lands were ranked as “Very Suitable”, “Suitable”, “Moderate Suitable” and “Low Suitable”. It was determined that the 8.03 ha were classified as “Very Suitable” while 18.59 ha as suitable and 11.44 ha as “Moderate Suitable” for PALs. In addition, 756.56 ha were found to be “Low Suitable”. The results obtained from this preliminary study can serve as basis for further studies.

Keywords: Digital Elevation Model (DEM), Geographic Information Systems (GIS), LANDSAT 8 OLI-TIRS, Land Use Land Cover (LULC).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2646
22 Screen of MicroRNA Targets in Zebrafish Using Heterogeneous Data Sources: A Case Study for Dre-miR-10 and Dre-miR-196

Authors: Yanju Zhang, Joost M. Woltering, Fons J. Verbeek

Abstract:

It has been established that microRNAs (miRNAs) play an important role in gene expression by post-transcriptional regulation of messengerRNAs (mRNAs). However, the precise relationships between microRNAs and their target genes in sense of numbers, types and biological relevance remain largely unclear. Dissecting the miRNA-target relationships will render more insights for miRNA targets identification and validation therefore promote the understanding of miRNA function. In miRBase, miRanda is the key algorithm used for target prediction for Zebrafish. This algorithm is high-throughput but brings lots of false positives (noise). Since validation of a large scale of targets through laboratory experiments is very time consuming, several computational methods for miRNA targets validation should be developed. In this paper, we present an integrative method to investigate several aspects of the relationships between miRNAs and their targets with the final purpose of extracting high confident targets from miRanda predicted targets pool. This is achieved by using the techniques ranging from statistical tests to clustering and association rules. Our research focuses on Zebrafish. It was found that validated targets do not necessarily associate with the highest sequence matching. Besides, for some miRNA families, the frequency of their predicted targets is significantly higher in the genomic region nearby their own physical location. Finally, in a case study of dre-miR-10 and dre-miR-196, it was found that the predicted target genes hoxd13a, hoxd11a, hoxd10a and hoxc4a of dre-miR- 10 while hoxa9a, hoxc8a and hoxa13a of dre-miR-196 have similar characteristics as validated target genes and therefore represent high confidence target candidates.

Keywords: MicroRNA targets validation, microRNA-target relationships, dre-miR-10, dre-miR-196.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1990
21 A Robust Method for Finding Nearest-Neighbor using Hexagon Cells

Authors: Ahmad Attiq Al-Ogaibi, Ahmad Sharieh, Moh’d Belal Al-Zoubi, R. Bremananth

Abstract:

In pattern clustering, nearest neighborhood point computation is a challenging issue for many applications in the area of research such as Remote Sensing, Computer Vision, Pattern Recognition and Statistical Imaging. Nearest neighborhood computation is an essential computation for providing sufficient classification among the volume of pixels (voxels) in order to localize the active-region-of-interests (AROI). Furthermore, it is needed to compute spatial metric relationships of diverse area of imaging based on the applications of pattern recognition. In this paper, we propose a new methodology for finding the nearest neighbor point, depending on making a virtually grid of a hexagon cells, then locate every point beneath them. An algorithm is suggested for minimizing the computation and increasing the turnaround time of the process. The nearest neighbor query points Φ are fetched by seeking fashion of hexagon holistic. Seeking will be repeated until an AROI Φ is to be expected. If any point Υ is located then searching starts in the nearest hexagons in a circular way. The First hexagon is considered be level 0 (L0) and the surrounded hexagons is level 1 (L1). If Υ is located in L1, then search starts in the next level (L2) to ensure that Υ is the nearest neighbor for Φ. Based on the result and experimental results, we found that the proposed method has an advantage over the traditional methods in terms of minimizing the time complexity required for searching the neighbors, in turn, efficiency of classification will be improved sufficiently.

Keywords: Hexagon cells, k-nearest neighbors, Nearest Neighbor, Pattern recognition, Query pattern, Virtually grid

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2801
20 Rice Area Determination Using Landsat-Based Indices and Land Surface Temperature Values

Authors: Burçin Saltık, Levent Genç

Abstract:

In this study, it was aimed to determine a route for identification of rice cultivation areas within Thrace and Marmara regions of Turkey using remote sensing and GIS. Landsat 8 (OLI-TIRS) imageries acquired in production season of 2013 with 181/32 Path/Row number were used. Four different seasonal images were generated utilizing original bands and different transformation techniques. All images were classified individually using supervised classification techniques and Land Use Land Cover Maps (LULC) were generated with 8 classes. Areas (ha, %) of each classes were calculated. In addition, district-based rice distribution maps were developed and results of these maps were compared with Turkish Statistical Institute (TurkSTAT; TSI)’s actual rice cultivation area records. Accuracy assessments were conducted, and most accurate map was selected depending on accuracy assessment and coherency with TSI results. Additionally, rice areas on over 4° slope values were considered as mis-classified pixels and they eliminated using slope map and GIS tools. Finally, randomized rice zones were selected to obtain maximum-minimum value ranges of each date (May, June, July, August, September images separately) NDVI, LSWI, and LST images to test whether they may be used for rice area determination via raster calculator tool of ArcGIS. The most accurate classification for rice determination was obtained from seasonal LSWI LULC map, and considering TSI data and accuracy assessment results and mis-classified pixels were eliminated from this map. According to results, 83151.5 ha of rice areas exist within study area. However, this result is higher than TSI records with an area of 12702.3 ha. Use of maximum-minimum range of rice area NDVI, LSWI, and LST was tested in Meric district. It was seen that using the value ranges obtained from July imagery, gave the closest results to TSI records, and the difference was only 206.4 ha. This difference is normal due to relatively low resolution of images. Thus, employment of images with higher spectral, spatial, temporal and radiometric resolutions may provide more reliable results.

Keywords: Landsat 8 (OLI-TIRS), LULC, spectral indices, rice.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1298
19 Computational Method for Annotation of Protein Sequence According to Gene Ontology Terms

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias

Abstract:

Annotation of a protein sequence is pivotal for the understanding of its function. Accuracy of manual annotation provided by curators is still questionable by having lesser evidence strength and yet a hard task and time consuming. A number of computational methods including tools have been developed to tackle this challenging task. However, they require high-cost hardware, are difficult to be setup by the bioscientists, or depend on time intensive and blind sequence similarity search like Basic Local Alignment Search Tool. This paper introduces a new method of assigning highly correlated Gene Ontology terms of annotated protein sequences to partially annotated or newly discovered protein sequences. This method is fully based on Gene Ontology data and annotations. Two problems had been identified to achieve this method. The first problem relates to splitting the single monolithic Gene Ontology RDF/XML file into a set of smaller files that can be easy to assess and process. Thus, these files can be enriched with protein sequences and Inferred from Electronic Annotation evidence associations. The second problem involves searching for a set of semantically similar Gene Ontology terms to a given query. The details of macro and micro problems involved and their solutions including objective of this study are described. This paper also describes the protein sequence annotation and the Gene Ontology. The methodology of this study and Gene Ontology based protein sequence annotation tool namely extended UTMGO is presented. Furthermore, its basic version which is a Gene Ontology browser that is based on semantic similarity search is also introduced.

Keywords: automatic clustering, bioinformatics tool, gene ontology, protein sequence annotation, semantic similarity search

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3127
18 A New Fuzzy DSS/ES for Stock Portfolio Selection using Technical and Fundamental Approaches in Parallel

Authors: H. Zarei, M. H. Fazel Zarandi, M. Karbasian

Abstract:

A Decision Support System/Expert System for stock portfolio selection presented where at first step, both technical and fundamental data used to estimate technical and fundamental return and risk (1st phase); Then, the estimated values are aggregated with the investor preferences (2nd phase) to produce convenient stock portfolio. In the 1st phase, there are two expert systems, each of which is responsible for technical or fundamental estimation. In the technical expert system, for each stock, twenty seven candidates are identified and with using rough sets-based clustering method (RC) the effective variables have been selected. Next, for each stock two fuzzy rulebases are developed with fuzzy C-Mean method and Takai-Sugeno- Kang (TSK) approach; one for return estimation and the other for risk. Thereafter, the parameters of the rule-bases are tuned with backpropagation method. In parallel, for fundamental expert systems, fuzzy rule-bases have been identified in the form of “IF-THEN" rules through brainstorming with the stock market experts and the input data have been derived from financial statements; as a result two fuzzy rule-bases have been generated for all the stocks, one for return and the other for risk. In the 2nd phase, user preferences represented by four criteria and are obtained by questionnaire. Using an expert system, four estimated values of return and risk have been aggregated with the respective values of user preference. At last, a fuzzy rule base having four rules, treats these values and produce a ranking score for each stock which will lead to a satisfactory portfolio for the user. The stocks of six manufacturing companies and the period of 2003-2006 selected for data gathering.

Keywords: Stock Portfolio Selection, Fuzzy Rule-Base ExpertSystems, Financial Decision Support Systems, Technical Analysis, Fundamental Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840
17 Low Overhead Dynamic Channel Selection with Cluster-Based Spatial-Temporal Station Reporting in Wireless Networks

Authors: Zeyad Abdelmageid, Xianbin Wang

Abstract:

Choosing the operational channel for a WLAN access point (AP) in WLAN networks has been a static channel assignment process initiated by the user during the deployment process of the AP, which fails to cope with the dynamic conditions of the assigned channel at the station side afterwards. However, the dramatically growing number of Wi-Fi APs and stations operating in the unlicensed band has led to dynamic, distributed and often severe interference. This highlights the urgent need for the AP to dynamically select the best overall channel of operation for the basic service set (BSS) by considering the distributed and changing channel conditions at all stations. Consequently, dynamic channel selection algorithms which consider feedback from the station side have been developed. Despite the significant performance improvement, existing channel selection algorithms suffer from very high feedback overhead. Feedback latency from the STAs, due the high overhead, can cause the eventually selected channel to no longer be optimal for operation due to the dynamic sharing nature of the unlicensed band. This has inspired us to develop our own dynamic channel selection algorithm with reduced overhead through the proposed low-overhead, cluster-based station reporting mechanism. The main idea behind the cluster-based station reporting is the observation that STAs which are very close to each other tend to have very similar channel conditions. Instead of requesting each STA to report on every candidate channel while causing high overhead, the AP divides STAs into clusters then assigns each STA in each cluster one channel to report feedback on. With proper design of the cluster based reporting, the AP does not lose any information about the channel conditions at the station side while reducing feedback overhead. The simulation results show equal performance and at times better performance with a fraction of the overhead. We believe that this algorithm has great potential in designing future dynamic channel selection algorithms with low overhead.

Keywords: Channel assignment, Wi-Fi networks, clustering, DBSCAN, overhead.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 383
16 Dimensionality Reduction in Modal Analysis for Structural Health Monitoring

Authors: Elia Favarelli, Enrico Testi, Andrea Giorgetti

Abstract:

Autonomous structural health monitoring (SHM) of many structures and bridges became a topic of paramount importance for maintenance purposes and safety reasons. This paper proposes a set of machine learning (ML) tools to perform automatic feature selection and detection of anomalies in a bridge from vibrational data and compare different feature extraction schemes to increase the accuracy and reduce the amount of data collected. As a case study, the Z-24 bridge is considered because of the extensive database of accelerometric data in both standard and damaged conditions. The proposed framework starts from the first four fundamental frequencies extracted through operational modal analysis (OMA) and clustering, followed by time-domain filtering (tracking). The fundamental frequencies extracted are then fed to a dimensionality reduction block implemented through two different approaches: feature selection (intelligent multiplexer) that tries to estimate the most reliable frequencies based on the evaluation of some statistical features (i.e., entropy, variance, kurtosis), and feature extraction (auto-associative neural network (ANN)) that combine the fundamental frequencies to extract new damage sensitive features in a low dimensional feature space. Finally, one-class classification (OCC) algorithms perform anomaly detection, trained with standard condition points, and tested with normal and anomaly ones. In particular, principal component analysis (PCA), kernel principal component analysis (KPCA), and autoassociative neural network (ANN) are presented and their performance are compared. It is also shown that, by evaluating the correct features, the anomaly can be detected with accuracy and an F1 score greater than 95%.

Keywords: Anomaly detection, dimensionality reduction, frequencies selection, modal analysis, neural network, structural health monitoring, vibration measurement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 707
15 A Hybrid Multi-Criteria Hotel Recommender System Using Explicit and Implicit Feedbacks

Authors: Ashkan Ebadi, Adam Krzyzak

Abstract:

Recommender systems, also known as recommender engines, have become an important research area and are now being applied in various fields. In addition, the techniques behind the recommender systems have been improved over the time. In general, such systems help users to find their required products or services (e.g. books, music) through analyzing and aggregating other users’ activities and behavior, mainly in form of reviews, and making the best recommendations. The recommendations can facilitate user’s decision making process. Despite the wide literature on the topic, using multiple data sources of different types as the input has not been widely studied. Recommender systems can benefit from the high availability of digital data to collect the input data of different types which implicitly or explicitly help the system to improve its accuracy. Moreover, most of the existing research in this area is based on single rating measures in which a single rating is used to link users to items. This paper proposes a highly accurate hotel recommender system, implemented in various layers. Using multi-aspect rating system and benefitting from large-scale data of different types, the recommender system suggests hotels that are personalized and tailored for the given user. The system employs natural language processing and topic modelling techniques to assess the sentiment of the users’ reviews and extract implicit features. The entire recommender engine contains multiple sub-systems, namely users clustering, matrix factorization module, and hybrid recommender system. Each sub-system contributes to the final composite set of recommendations through covering a specific aspect of the problem. The accuracy of the proposed recommender system has been tested intensively where the results confirm the high performance of the system.

Keywords: Tourism, hotel recommender system, hybrid, implicit features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1899
14 Evaluating the Understanding of the University Students (Basic Sciences and Engineering) about the Numerical Representation of the Average Rate of Change

Authors: Saeid Haghjoo, Ebrahim Reyhani, Fahimeh Kolahdouz

Abstract:

The present study aimed to evaluate the understanding of the students in Tehran universities (Iran) about the numerical representation of the average rate of change based on the Structure of Observed Learning Outcomes (SOLO). In the present descriptive-survey research, the statistical population included undergraduate students (basic sciences and engineering) in the universities of Tehran. The samples were 604 students selected by random multi-stage clustering. The measurement tool was a task whose face and content validity was confirmed by math and mathematics education professors. Using Cronbach's Alpha criterion, the reliability coefficient of the task was obtained 0.95, which verified its reliability. The collected data were analyzed by descriptive statistics and inferential statistics (chi-squared and independent t-tests) under SPSS-24 software. According to the SOLO model in the prestructural, unistructural, and multistructural levels, basic science students had a higher percentage of understanding than that of engineering students, although the outcome was inverse at the relational level. However, there was no significant difference in the average understanding of both groups. The results indicated that students failed to have a proper understanding of the numerical representation of the average rate of change, in addition to missconceptions when using physics formulas in solving the problem. In addition, multiple solutions were derived along with their dominant methods during the qualitative analysis. The current research proposed to focus on the context problems with approximate calculations and numerical representation, using software and connection common relations between math and physics in the teaching process of teachers and professors.

Keywords: Average rate of change, context problems, derivative, numerical representation, SOLO taxonomy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 759
13 Urban Heat Island Intensity Assessment through Comparative Study on Land Surface Temperature and Normalized Difference Vegetation Index: A Case Study of Chittagong, Bangladesh

Authors: Tausif A. Ishtiaque, Zarrin T. Tasin, Kazi S. Akter

Abstract:

Current trend of urban expansion, especially in the developing countries has caused significant changes in land cover, which is generating great concern due to its widespread environmental degradation. Energy consumption of the cities is also increasing with the aggravated heat island effect. Distribution of land surface temperature (LST) is one of the most significant climatic parameters affected by urban land cover change. Recent increasing trend of LST is causing elevated temperature profile of the built up area with less vegetative cover. Gradual change in land cover, especially decrease in vegetative cover is enhancing the Urban Heat Island (UHI) effect in the developing cities around the world. Increase in the amount of urban vegetation cover can be a useful solution for the reduction of UHI intensity. LST and Normalized Difference Vegetation Index (NDVI) have widely been accepted as reliable indicators of UHI and vegetation abundance respectively. Chittagong, the second largest city of Bangladesh, has been a growth center due to rapid urbanization over the last several decades. This study assesses the intensity of UHI in Chittagong city by analyzing the relationship between LST and NDVI based on the type of land use/land cover (LULC) in the study area applying an integrated approach of Geographic Information System (GIS), remote sensing (RS), and regression analysis. Land cover map is prepared through an interactive supervised classification using remotely sensed data from Landsat ETM+ image along with NDVI differencing using ArcGIS. LST and NDVI values are extracted from the same image. The regression analysis between LST and NDVI indicates that within the study area, UHI is directly correlated with LST while negatively correlated with NDVI. It interprets that surface temperature reduces with increase in vegetation cover along with reduction in UHI intensity. Moreover, there are noticeable differences in the relationship between LST and NDVI based on the type of LULC. In other words, depending on the type of land usage, increase in vegetation cover has a varying impact on the UHI intensity. This analysis will contribute to the formulation of sustainable urban land use planning decisions as well as suggesting suitable actions for mitigation of UHI intensity within the study area.

Keywords: Land cover change, land surface temperature, normalized difference vegetation index, urban heat island.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1457
12 A Literature Review on the Effect of Industrial Clusters and the Absorptive Capacity on Innovation

Authors: Enrique Claver Cortés, Bartolomé Marco Lajara, Eduardo Sánchez García, Pedro Seva Larrosa, Encarnación Manresa Marhuenda, Lorena Ruiz Fernández, Esther Poveda Pareja

Abstract:

In recent decades, the analysis of the effects of clustering as an essential factor for the development of innovations and the competitiveness of enterprises has raised great interest in different areas. Nowadays, companies have access to almost all tangible and intangible resources located and/or developed in any country in the world. However, despite the obvious advantages that this situation entails for companies, their geographical location has shown itself, increasingly clearly, to be a fundamental factor that positively influences their innovative performance and competitiveness. Industrial clusters could represent a unique level of analysis, positioned between the individual company and the industry, which makes them an ideal unit of analysis to determine the effects derived from company membership of a cluster. Also, the absorptive capacity (hereinafter 'AC') can mediate the process of innovation development by companies located in a cluster. The transformation and exploitation of knowledge could have a mediating effect between knowledge acquisition and innovative performance. The main objective of this work is to determine the key factors that affect the degree of generation and use of knowledge from the environment by companies and, consequently, their innovative performance and competitiveness. The elements analyzed are the companies' membership of a cluster and the AC. To this end, 30 most relevant papers published on this subject in the "Web of Science" database have been reviewed. Our findings show that, within a cluster, the knowledge coming from the companies' environment can significantly influence their innovative performance and competitiveness, although in this relationship, the degree of access and exploitation of the companies to this knowledge plays a fundamental role, which depends on a series of elements both internal and external to the company.

Keywords: Absorptive capacity, clusters, innovation, knowledge.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 894
11 Assessment of Agricultural Land Use Land Cover, Land Surface Temperature and Population Changes Using Remote Sensing and GIS: Southwest Part of Marmara Sea, Turkey

Authors: Melis Inalpulat, Levent Genc

Abstract:

Land Use Land Cover (LULC) changes due to human activities and natural causes have become a major environmental concern. Assessment of temporal remote sensing data provides information about LULC impacts on environment. Land Surface Temperature (LST) is one of the important components for modeling environmental changes in climatological, hydrological, and agricultural studies. In this study, LULC changes (September 7, 1984 and July 8, 2014) especially in agricultural lands together with population changes (1985-2014) and LST status were investigated using remotely sensed and census data in South Marmara Watershed, Turkey. LULC changes were determined using Landsat TM and Landsat OLI data acquired in 1984 and 2014 summers. Six-band TM and OLI images were classified using supervised classification method to prepare LULC map including five classes including Forest (F), Grazing Land (G), Agricultural Land (A), Water Surface (W), Residential Area-Bare Soil (R-B) classes. The LST image was also derived from thermal bands of the same dates. LULC classification results showed that forest areas, agricultural lands, water surfaces and residential area-bare soils were increased as 65751 ha, 20163 ha, 1924 ha and 20462 ha respectively. In comparison, a dramatic decrement occurred in grazing land (107985 ha) within three decades. The population increased 29% between years 1984-2014 in whole study area. Along with the natural causes, migration also caused this increase since the study area has an important employment potential. LULC was transformed among the classes due to the expansion in residential, commercial and industrial areas as well as political decisions. In the study, results showed that agricultural lands around the settlement areas transformed to residential areas in 30 years. The LST images showed that mean temperatures were ranged between 26-32°C in 1984 and 27-33°C in 2014. Minimum temperature of agricultural lands was increased 3°C and reached to 23°C. In contrast, maximum temperature of A class decreased to 41°C from 44°C. Considering temperatures of the 2014 R-B class and 1984 status of same areas, it was seen that mean, min and max temperatures increased by 2°C. As a result, the dynamism of population, LULC and LST resulted in increasing mean and maximum surface temperatures, living spaces/industrial areas and agricultural lands.

Keywords: Census data, landsat, land surface temperature (LST), land use land cover (LULC).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2119
10 Questions Categorization in E-Learning Environment Using Data Mining Technique

Authors: Vilas P. Mahatme, K. K. Bhoyar

Abstract:

Nowadays, education cannot be imagined without digital technologies. It broadens the horizons of teaching learning processes. Several universities are offering online courses. For evaluation purpose, e-examination systems are being widely adopted in academic environments. Multiple-choice tests are extremely popular. Moving away from traditional examinations to e-examination, Moodle as Learning Management Systems (LMS) is being used. Moodle logs every click that students make for attempting and navigational purposes in e-examination. Data mining has been applied in various domains including retail sales, bioinformatics. In recent years, there has been increasing interest in the use of data mining in e-learning environment. It has been applied to discover, extract, and evaluate parameters related to student’s learning performance. The combination of data mining and e-learning is still in its babyhood. Log data generated by the students during online examination can be used to discover knowledge with the help of data mining techniques. In web based applications, number of right and wrong answers of the test result is not sufficient to assess and evaluate the student’s performance. So, assessment techniques must be intelligent enough. If student cannot answer the question asked by the instructor then some easier question can be asked. Otherwise, more difficult question can be post on similar topic. To do so, it is necessary to identify difficulty level of the questions. Proposed work concentrate on the same issue. Data mining techniques in specific clustering is used in this work. This method decide difficulty levels of the question and categories them as tough, easy or moderate and later this will be served to the desire students based on their performance. Proposed experiment categories the question set and also group the students based on their performance in examination. This will help the instructor to guide the students more specifically. In short mined knowledge helps to support, guide, facilitate and enhance learning as a whole.

Keywords: Data mining, e-examination, e-learning, moodle.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2073
9 Learning Classifier Systems Approach for Automated Discovery of Censored Production Rules

Authors: Suraiya Jabin, Kamal K. Bharadwaj

Abstract:

In the recent past Learning Classifier Systems have been successfully used for data mining. Learning Classifier System (LCS) is basically a machine learning technique which combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. All LCSs models more or less, comprise four main components; a finite population of condition–action rules, called classifiers; the performance component, which governs the interaction with the environment; the credit assignment component, which distributes the reward received from the environment to the classifiers accountable for the rewards obtained; the discovery component, which is responsible for discovering better rules and improving existing ones through a genetic algorithm. The concatenate of the production rules in the LCS form the genotype, and therefore the GA should operate on a population of classifier systems. This approach is known as the 'Pittsburgh' Classifier Systems. Other LCS that perform their GA at the rule level within a population are known as 'Mitchigan' Classifier Systems. The most predominant representation of the discovered knowledge is the standard production rules (PRs) in the form of IF P THEN D. The PRs, however, are unable to handle exceptions and do not exhibit variable precision. The Censored Production Rules (CPRs), an extension of PRs, were proposed by Michalski and Winston that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: IF P THEN D UNLESS C, where Censor C is an exception to the rule. Such rules are employed in situations, in which conditional statement IF P THEN D holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence are tight or there is simply no information available as to whether it holds or not. Thus, the IF P THEN D part of CPR expresses important information, while the UNLESS C part acts only as a switch and changes the polarity of D to ~D. In this paper Pittsburgh style LCSs approach is used for automated discovery of CPRs. An appropriate encoding scheme is suggested to represent a chromosome consisting of fixed size set of CPRs. Suitable genetic operators are designed for the set of CPRs and individual CPRs and also appropriate fitness function is proposed that incorporates basic constraints on CPR. Experimental results are presented to demonstrate the performance of the proposed learning classifier system.

Keywords: Censored Production Rule, Data Mining, GeneticAlgorithm, Learning Classifier System, Machine Learning, PittsburgApproach, , Reinforcement learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1528
8 A Distributed Mobile Agent Based on Intrusion Detection System for MANET

Authors: Maad Kamal Al-Anni

Abstract:

This study is about an algorithmic dependence of Artificial Neural Network on Multilayer Perceptron (MPL) pertaining to the classification and clustering presentations for Mobile Adhoc Network vulnerabilities. Moreover, mobile ad hoc network (MANET) is ubiquitous intelligent internetworking devices in which it has the ability to detect their environment using an autonomous system of mobile nodes that are connected via wireless links. Security affairs are the most important subject in MANET due to the easy penetrative scenarios occurred in such an auto configuration network. One of the powerful techniques used for inspecting the network packets is Intrusion Detection System (IDS); in this article, we are going to show the effectiveness of artificial neural networks used as a machine learning along with stochastic approach (information gain) to classify the malicious behaviors in simulated network with respect to different IDS techniques. The monitoring agent is responsible for detection inference engine, the audit data is collected from collecting agent by simulating the node attack and contrasted outputs with normal behaviors of the framework, whenever. In the event that there is any deviation from the ordinary behaviors then the monitoring agent is considered this event as an attack , in this article we are going to demonstrate the  signature-based IDS approach in a MANET by implementing the back propagation algorithm over ensemble-based Traffic Table (TT), thus the signature of malicious behaviors or undesirable activities are often significantly prognosticated and efficiently figured out, by increasing the parametric set-up of Back propagation algorithm during the experimental results which empirically shown its effectiveness  for the ratio of detection index up to 98.6 percentage. Consequently it is proved in empirical results in this article, the performance matrices are also being included in this article with Xgraph screen show by different through puts like Packet Delivery Ratio (PDR), Through Put(TP), and Average Delay(AD).

Keywords: Mobile ad hoc network, MANET, intrusion detection system, back propagation algorithm, neural networks, traffic table, multilayer perceptron, feed-forward back-propagation, network simulator 2.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 927
7 A Risk Assessment Tool for the Contamination of Aflatoxins on Dried Figs based on Machine Learning Algorithms

Authors: Kottaridi Klimentia, Demopoulos Vasilis, Sidiropoulos Anastasios, Ihara Diego, Nikolaidis Vasileios, Antonopoulos Dimitrios

Abstract:

Aflatoxins are highly poisonous and carcinogenic compounds produced by species of the genus Aspergillus spp. that can infect a variety of agricultural foods, including dried figs. Biological and environmental factors, such as population, pathogenicity and aflatoxinogenic capacity of the strains, topography, soil and climate parameters of the fig orchards are believed to have a strong effect on aflatoxin levels. Existing methods for aflatoxin detection and measurement, such as high-performance liquid chromatography (HPLC), and enzyme-linked immunosorbent assay (ELISA), can provide accurate results, but the procedures are usually time-consuming, sample-destructive and expensive. Predicting aflatoxin levels prior to crop harvest is useful for minimizing the health and financial impact of a contaminated crop. Consequently, there is interest in developing a tool that predicts aflatoxin levels based on topography and soil analysis data of fig orchards. This paper describes the development of a risk assessment tool for the contamination of aflatoxin on dried figs, based on the location and altitude of the fig orchards, the population of the fungus Aspergillus spp. in the soil, and soil parameters such as pH, saturation percentage (SP), electrical conductivity (EC), organic matter, particle size analysis (sand, silt, clay), concentration of the exchangeable cations (Ca, Mg, K, Na), extractable P and trace of elements (B, Fe, Mn, Zn and Cu), by employing machine learning methods. In particular, our proposed method integrates three machine learning techniques i.e., dimensionality reduction on the original dataset (Principal Component Analysis), metric learning (Mahalanobis Metric for Clustering) and K-nearest Neighbors learning algorithm (KNN), into an enhanced model, with mean performance equal to 85% by terms of the Pearson Correlation Coefficient (PCC) between observed and predicted values.

Keywords: aflatoxins, Aspergillus spp., dried figs, k-nearest neighbors, machine learning, prediction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 647
6 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Cheima Ben Soltane, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still need of improvement. In this paper, a Closed-Set Tex-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficient (MFCC) as feature extraction and suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with Expectation Maximization algorithm (EM) for speaker modeling. The use of Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and Statistical Modeling of Background Noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. Also investigation of Linde-Buzo-Gray (LBG) clustering algorithm for initialization of GMM, for estimating the underlying parameters, in the EM step improved the convergence rate and systems performance. It also uses relative index as confidence measures in case of contradiction in identification process by GMM and VQ as well. Simulation results carried out on voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.

Keywords: Feature Extraction, Speaker Modeling, Feature Matching, Mel Frequency Cepstrum Coefficient (MFCC), Gaussian mixture model (GMM), Vector Quantization (VQ), Linde-Buzo-Gray (LBG), Expectation Maximization (EM), pre-processing, Voice Activity Detection (VAD), Short Time Energy (STE), Background Noise Statistical Modeling, Closed-Set Tex-Independent Speaker Identification System (CISI).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1880
5 Towards Real-Time Classification of Finger Movement Direction Using Encephalography Independent Components

Authors: Mohamed Mounir Tellache, Hiroyuki Kambara, Yasuharu Koike, Makoto Miyakoshi, Natsue Yoshimura

Abstract:

This study explores the practicality of using electroencephalographic (EEG) independent components to predict eight-direction finger movements in pseudo-real-time. Six healthy participants with individual-head MRI images performed finger movements in eight directions with two different arm configurations. The analysis was performed in two stages. The first stage consisted of using independent component analysis (ICA) to separate the signals representing brain activity from non-brain activity signals and to obtain the unmixing matrix. The resulting independent components (ICs) were checked, and those reflecting brain-activity were selected. Finally, the time series of the selected ICs were used to predict eight finger-movement directions using Sparse Logistic Regression (SLR). The second stage consisted of using the previously obtained unmixing matrix, the selected ICs, and the model obtained by applying SLR to classify a different EEG dataset. This method was applied to two different settings, namely the single-participant level and the group-level. For the single-participant level, the EEG dataset used in the first stage and the EEG dataset used in the second stage originated from the same participant. For the group-level, the EEG datasets used in the first stage were constructed by temporally concatenating each combination without repetition of the EEG datasets of five participants out of six, whereas the EEG dataset used in the second stage originated from the remaining participants. The average test classification results across datasets (mean ± S.D.) were 38.62 ± 8.36% for the single-participant, which was significantly higher than the chance level (12.50 ± 0.01%), and 27.26 ± 4.39% for the group-level which was also significantly higher than the chance level (12.49% ± 0.01%). The classification accuracy within [–45°, 45°] of the true direction is 70.03 ± 8.14% for single-participant and 62.63 ± 6.07% for group-level which may be promising for some real-life applications. Clustering and contribution analyses further revealed the brain regions involved in finger movement and the temporal aspect of their contribution to the classification. These results showed the possibility of using the ICA-based method in combination with other methods to build a real-time system to control prostheses.

Keywords: Brain-computer interface, BCI, electroencephalography, EEG, finger motion decoding, independent component analysis, pseudo-real-time motion decoding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 598
4 A Construction Management Tool: Determining a Project Schedule Typical Behaviors Using Cluster Analysis

Authors: Natalia Rudeli, Elisabeth Viles, Adrian Santilli

Abstract:

Delays in the construction industry are a global phenomenon. Many construction projects experience extensive delays exceeding the initially estimated completion time. The main purpose of this study is to identify construction projects typical behaviors in order to develop a prognosis and management tool. Being able to know a construction projects schedule tendency will enable evidence-based decision-making to allow resolutions to be made before delays occur. This study presents an innovative approach that uses Cluster Analysis Method to support predictions during Earned Value Analyses. A clustering analysis was used to predict future scheduling, Earned Value Management (EVM), and Earned Schedule (ES) principal Indexes behaviors in construction projects. The analysis was made using a database with 90 different construction projects. It was validated with additional data extracted from literature and with another 15 contrasting projects. For all projects, planned and executed schedules were collected and the EVM and ES principal indexes were calculated. A complete linkage classification method was used. In this way, the cluster analysis made considers that the distance (or similarity) between two clusters must be measured by its most disparate elements, i.e. that the distance is given by the maximum span among its components. Finally, through the use of EVM and ES Indexes and Tukey and Fisher Pairwise Comparisons, the statistical dissimilarity was verified and four clusters were obtained. It can be said that construction projects show an average delay of 35% of its planned completion time. Furthermore, four typical behaviors were found and for each of the obtained clusters, the interim milestones and the necessary rhythms of construction were identified. In general, detected typical behaviors are: (1) Projects that perform a 5% of work advance in the first two tenths and maintain a constant rhythm until completion (greater than 10% for each remaining tenth), being able to finish on the initially estimated time. (2) Projects that start with an adequate construction rate but suffer minor delays culminating with a total delay of almost 27% of the planned time. (3) Projects which start with a performance below the planned rate and end up with an average delay of 64%, and (4) projects that begin with a poor performance, suffer great delays and end up with an average delay of a 120% of the planned completion time. The obtained clusters compose a tool to identify the behavior of new construction projects by comparing their current work performance to the validated database, thus allowing the correction of initial estimations towards more accurate completion schedules.

Keywords: Cluster analysis, construction management, earned value, schedule.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1199
3 Corporate Social Responsibility and Corporate Reputation: A Bibliometric Analysis

Authors: Songdi Li, Louise Spry, Tony Woodall

Abstract:

Nowadays, Corporate Social responsibility (CSR) is becoming a buzz word, and more and more academics are putting efforts on CSR studies. It is believed that CSR could influence Corporate Reputation (CR), and they hold a favourable view that CSR leads to a positive CR. To be specific, the CSR related activities in the reputational context have been regarded as ways that associate to excellent financial performance, value creation, etc. Also, it is argued that CSR and CR are two sides of one coin; hence, to some extent, doing CSR is equal to establishing a good reputation. Still, there is no consensus of the CSR-CR relationship in the literature; thus, a systematic literature review is highly in need. This research conducts a systematic literature review with both bibliometric and content analysis. Data are selected from English language sources, and academic journal articles only, then, keyword combinations are applied to identify relevant sources. Data from Scopus and WoS are gathered for bibliometric analysis. Scopus search results were saved in RIS and CSV formats, and Web of Science (WoS) data were saved in TXT format and CSV formats in order to process data in the Bibexcel software for further analysis which later will be visualised by the software VOSviewer. Also, content analysis was applied to analyse the data clusters and the key articles. In terms of the topic of CSR-CR, this literature review with bibliometric analysis has made four achievements. First, this paper has developed a systematic study which quantitatively depicts the knowledge structure of CSR and CR by identifying terms closely related to CSR-CR (such as ‘corporate governance’) and clustering subtopics emerged in co-citation analysis. Second, content analysis is performed to acquire insight on the findings of bibliometric analysis in the discussion section. And it highlights some insightful implications for the future research agenda, for example, a psychological link between CSR-CR is identified from the result; also, emerging economies and qualitative research methods are new elements emerged in the CSR-CR big picture. Third, a multidisciplinary perspective presents through the whole bibliometric analysis mapping and co-word and co-citation analysis; hence, this work builds a structure of interdisciplinary perspective which potentially leads to an integrated conceptual framework in the future. Finally, Scopus and WoS are compared and contrasted in this paper; as a result, Scopus which has more depth and comprehensive data is suggested as a tool for future bibliometric analysis studies. Overall, this paper has fulfilled its initial purposes and contributed to the literature. To the author’s best knowledge, this paper conducted the first literature review of CSR-CR researches that applied both bibliometric analysis and content analysis; therefore, this paper achieves its methodological originality. And this dual approach brings advantages of carrying out a comprehensive and semantic exploration in the area of CSR-CR in a scientific and realistic method. Admittedly, its work might exist subjective bias in terms of search terms selection and paper selection; hence triangulation could reduce the subjective bias to some degree.

Keywords: Corporate social responsibility, corporate reputation, bibliometric analysis, software data analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 936
2 Bioinformatic Analysis of Retroelement-Associated Sequences in Human and Mouse Promoters

Authors: Nadezhda M. Usmanova, Nikolai V. Tomilin

Abstract:

Mammalian genomes contain large number of retroelements (SINEs, LINEs and LTRs) which could affect expression of protein coding genes through associated transcription factor binding sites (TFBS). Activity of the retroelement-associated TFBS in many genes is confirmed experimentally but their global functional impact remains unclear. Human SINEs (Alu repeats) and mouse SINEs (B1 and B2 repeats) are known to be clustered in GCrich gene rich genome segments consistent with the view that they can contribute to regulation of gene expression. We have shown earlier that Alu are involved in formation of cis-regulatory modules (clusters of TFBS) in human promoters, and other authors reported that Alu located near promoter CpG islands have an increased frequency of CpG dinucleotides suggesting that these Alu are undermethylated. Human Alu and mouse B1/B2 elements have an internal bipartite promoter for RNA polymerase III containing conserved sequence motif called B-box which can bind basal transcription complex TFIIIC. It has been recently shown that TFIIIC binding to B-box leads to formation of a boundary which limits spread of repressive chromatin modifications in S. pombe. SINEassociated B-boxes may have similar function but conservation of TFIIIC binding sites in SINEs located near mammalian promoters has not been studied earlier. Here we analysed abundance and distribution of retroelements (SINEs, LINEs and LTRs) in annotated sequences of the Database of mammalian transcription start sites (DBTSS). Fractions of SINEs in human and mouse promoters are slightly lower than in all genome but >40% of human and mouse promoters contain Alu or B1/B2 elements within -1000 to +200 bp interval relative to transcription start site (TSS). Most of these SINEs is associated with distal segments of promoters (-1000 to -200 bp relative to TSS) indicating that their insertion at distances >200 bp upstream of TSS is tolerated during evolution. Distribution of SINEs in promoters correlates negatively with the distribution of CpG sequences. Using analysis of abundance of 12-mer motifs from the B1 and Alu consensus sequences in genome and DBTSS it has been confirmed that some subsegments of Alu and B1 elements are poorly conserved which depends in part on the presence of CpG dinucleotides. One of these CpG-containing subsegments in B1 elements overlaps with SINE-associated B-box and it shows better conservation in DBTSS compared to genomic sequences. It has been also studied conservation in DBTSS and genome of the B-box containing segments of old (AluJ, AluS) and young (AluY) Alu repeats and found that CpG sequence of the B-box of old Alu is better conserved in DBTSS than in genome. This indicates that Bbox- associated CpGs in promoters are better protected from methylation and mutation than B-box-associated CpGs in genomic SINEs. These results are consistent with the view that potential TFIIIC binding motifs in SINEs associated with human and mouse promoters may be functionally important. These motifs may protect promoters from repressive histone modifications which spread from adjacent sequences. This can potentially explain well known clustering of SINEs in GC-rich gene rich genome compartments and existence of unmethylated CpG islands.

Keywords: Retroelement, promoter, CpG island, DNAmethylation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1572
1 A Study on the Relation among Primary Care Professionals Serving the Disadvantaged Community, Socioeconomic Status, and Adverse Health Outcome

Authors: Chau-Kuang Chen, Juanita Buford, Colette Davis, Raisha Allen, John Hughes, Jr., James Tyus, Dexter Samuels

Abstract:

During the post-Civil War era, the city of Nashville, Tennessee, had the highest mortality rate in the United States. The elevated death and disease rates among former slaves were attributable to lack of quality healthcare. To address the paucity of healthcare services, Meharry Medical College, an institution with the mission of educating minority professionals and serving the underserved population, was established in 1876. Purpose: The social ecological framework and partial least squares (PLS) path modeling were used to quantify the impact of socioeconomic status and adverse health outcome on primary care professionals serving the disadvantaged community. Thus, the study results could demonstrate the accomplishment of the College’s mission of training primary care professionals to serve in underserved areas. Methods: Various statistical methods were used to analyze alumni data from 1975 – 2013. K-means cluster analysis was utilized to identify individual medical and dental graduates in the cluster groups of the practice communities (Disadvantaged or Non-disadvantaged Communities). Discriminant analysis was implemented to verify the classification accuracy of cluster analysis. The independent t-test was performed to detect the significant mean differences of respective clustering and criterion variables. Chi-square test was used to test if the proportions of primary care and non-primary care specialists are consistent with those of medical and dental graduates practicing in the designated community clusters. Finally, the PLS path model was constructed to explore the construct validity of analytic model by providing the magnitude effects of socioeconomic status and adverse health outcome on primary care professionals serving the disadvantaged community. Results: Approximately 83% (3,192/3,864) of Meharry Medical College’s medical and dental graduates from 1975 to 2013 were practicing in disadvantaged communities. Independent t-test confirmed the content validity of the cluster analysis model. Also, the PLS path modeling demonstrated that alumni served as primary care professionals in communities with significantly lower socioeconomic status and higher adverse health outcome (p < .001). The PLS path modeling exhibited the meaningful interrelation between primary care professionals practicing communities and surrounding environments (socioeconomic statues and adverse health outcome), which yielded model reliability, validity, and applicability. Conclusion: This study applied social ecological theory and analytic modeling approaches to assess the attainment of Meharry Medical College’s mission of training primary care professionals to serve in underserved areas, particularly in communities with low socioeconomic status and high rates of adverse health outcomes. In summary, the majority of medical and dental graduates from Meharry Medical College provided primary care services to disadvantaged communities with low socioeconomic status and high adverse health outcome, which demonstrated that Meharry Medical College has fulfilled its mission. The high reliability, validity, and applicability of this model imply that it could be replicated for comparable universities and colleges elsewhere.

Keywords: Disadvantaged Community, K-means Cluster Analysis, PLS Path Modeling, Primary care.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2033