Search results for: relational databases

119 Deep iCrawl: An Intelligent Vision-Based Deep Web Crawler

Authors: R.Anita, V.Ganga Bharani, N.Nityanandam, Pradeep Kumar Sahoo

Abstract:

The explosive growth of the World Wide Web has posed a challenging problem in extracting relevant data. Traditional web crawlers focus only on the surface web while the deep web keeps expanding behind the scenes. Deep web pages are created dynamically as a result of queries posed to specific web databases. The structure of the deep web pages makes it impossible for traditional web crawlers to access deep web contents. This paper, Deep iCrawl, gives a novel, vision-based approach for extracting data from the deep web. Deep iCrawl splits the process into two phases. The first phase includes query analysis and query translation, and the second covers vision-based extraction of data from the dynamically created deep web pages. There are several established approaches for the extraction of deep web pages, but the proposed method aims at overcoming their inherent limitations. This paper also aims at comparing the data items and presenting them in the required order.

Keywords: Crawler, Deep web, Web Database

Downloads 2096

118 Local Mesh Co-Occurrence Pattern for Content Based Image Retrieval

Authors: C. Yesubai Rubavathi, R. Ravi

Abstract:

This paper presents local mesh co-occurrence patterns (LMCoP) using the HSV color space for an image retrieval system. The HSV color space is used in this method to utilize the color, intensity and brightness of images. Local mesh patterns are applied to define the local information of an image, and gray level co-occurrence is used to obtain the co-occurrence of LMeP pixels. The local mesh co-occurrence pattern extracts the local directional information from the local mesh pattern and converts it into a well-mannered feature vector using the gray level co-occurrence matrix. The proposed method is tested on three different databases: MIT VisTex, Corel, and STex. The algorithm is also compared with existing methods, and results in terms of precision and recall are shown in this paper.
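The co-occurrence counting step mentioned in this abstract can be sketched as follows. This is only a minimal illustration of a gray level co-occurrence matrix on plain quantized gray levels, not the LMCoP method itself; the function name `glcm`, the offset parameters `dx` and `dy`, and the sample image are assumptions made for the example.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy).

    image  : list of rows, each a list of integers in range(levels)
    levels : number of distinct gray levels
    dx, dy : column and row offset of the neighbour (hypothetical defaults)
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows = len(image)
    for r in range(rows):
        for c in range(len(image[r])):
            r2, c2 = r + dy, c + dx
            # Skip neighbours that fall outside the image.
            if 0 <= r2 < rows and 0 <= c2 < len(image[r2]):
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


# Illustrative 3x3 image with three gray levels (made-up data).
sample = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
counts = glcm(sample, levels=3)
```

In a descriptor of the kind the abstract describes, the counts would be taken over pattern codes rather than raw gray levels, and the matrix would then be flattened into a feature vector.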

Keywords: Content-based image retrieval system, HSV color space, gray level co-occurrence matrix, local mesh pattern.

Downloads 2170

117 Improving University Operations with Data Mining: Predicting Student Performance

Authors: Mladen Dragičević, Mirjana Pejić Bach, Vanja Šimičević

Abstract:

The purpose of this paper is to develop models that would enable predicting student success. These models could improve the allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among final-year undergraduate students using a random sampling method. Decision trees were created, of which the two most successful in predicting student success were chosen, based on two criteria: Grade Point Average (GPA) and the time that a student needs to finish the undergraduate program (time-to-degree). Decision trees have been shown to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. These types of methods have great potential for use in decision support systems.

Keywords: Data mining, knowledge discovery in databases, prediction models, student success.

Downloads 2452

116 Automatic Clustering of Gene Ontology by Genetic Algorithm

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Zalmiyah Zakaria, Saberi M. Mohamad

Abstract:

Nowadays, Gene Ontology has been used widely by many researchers for biological data mining and information retrieval, integration of biological databases, finding genes, and incorporating knowledge in the Gene Ontology for gene clustering. However, the increase in size of the Gene Ontology has caused problems in maintaining and processing them. One way to obtain their accessibility is by clustering them into fragmented groups. Clustering the Gene Ontology is a difficult combinatorial problem and can be modeled as a graph partitioning problem. Additionally, deciding the number k of clusters to use is not easily perceived and is a hard algorithmic problem. Therefore, an approach for solving the automatic clustering of the Gene Ontology is proposed by incorporating cohesion-and-coupling metric into a hybrid algorithm consisting of a genetic algorithm and a split-and-merge algorithm. Experimental results and an example of modularized Gene Ontology in RDF/XML format are given to illustrate the effectiveness of the algorithm.

Keywords: Automatic clustering, cohesion-and-coupling metric, gene ontology, genetic algorithm, split-and-merge algorithm.

Downloads 1914

115 A Self Adaptive Genetic Based Algorithm for the Identification and Elimination of Bad Data

Authors: A. A. Hossam-Eldin, E. N. Abdallah, M. S. El-Nozahy

Abstract:

The identification and elimination of bad measurements is one of the basic functions of a robust state estimator, as bad data corrupt the results of state estimation under the popular weighted least squares method. However, this is a difficult problem to handle, especially when dealing with multiple errors of the interactive conforming type. In this paper, a self adaptive genetic based algorithm is proposed. The algorithm utilizes the results of the classical linearized normal residuals approach to tune the genetic operators. Thus, instead of making a randomized search throughout the whole search space, it is more likely to make a directed search, and the optimum solution is obtained at very early stages (a maximum of 5 generations). The algorithm utilizes the accumulating databases of already computed cases to reduce the computational burden to a minimum. Tests are conducted with reference to the standard IEEE test systems. Test results are very promising.

Keywords: Bad Data, Genetic Algorithms, Linearized Normal residuals, Observability, Power System State Estimation.

Downloads 1306

114 A Tree Based Association Rule Approach for XML Data with Semantic Integration

Authors: D. Sasikala, K. Premalatha

Abstract:

The use of eXtensible Markup Language (XML) in web, business and scientific databases has led to the development of methods, techniques and systems to manage and analyze XML data. Semi-structured documents suffer due to their heterogeneity and dimensionality. XML structure and content mining represent a convergence of research in semi-structured data and text mining. As the information available on the internet grows drastically, extracting knowledge from XML documents becomes a harder task. Indeed, documents are often so large that the data set returned as the answer to a query may be too big to convey the required information. To improve query answering, a Semantic Tree Based Association Rule (STAR) mining method is proposed. This method provides intentional information by considering the structure, content and the semantics of the content. The method is applied to the Reuters dataset and the results show that the proposed method performs well.

Keywords: Semi-structured Document, Tree based Association Rule (TAR), Semantic Association Rule Mining.

Downloads 2292

113 Face Recognition Based On Vector Quantization Using Fuzzy Neuro Clustering

Authors: Elizabeth B. Varghese, M. Wilscy

Abstract:

A face recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame. Many algorithms have been proposed for face recognition. Vector Quantization (VQ) based face recognition is a novel approach for face recognition. Here a new codebook generation method for VQ based face recognition using Integrated Adaptive Fuzzy Clustering (IAFC) is proposed. IAFC is a fuzzy neural network which incorporates a fuzzy learning rule into a competitive neural network. The performance of the proposed algorithm is demonstrated using the publicly available AT&T database, Yale database, Indian Face database and a small face database, the DCSKU database, created in our lab. On all the databases the proposed approach achieved a higher recognition rate than most of the existing methods. In terms of Equal Error Rate (EER), the proposed codebook is also better than the existing methods.

Keywords: Face Recognition, Vector Quantization, Integrated Adaptive Fuzzy Clustering, Self Organization Map.

Downloads 2193

112 Face Texture Reconstruction for Illumination Variant Face Recognition

Authors: Pengfei Xiong, Lei Huang, Changping Liu

Abstract:

In illumination variant face recognition, existing methods extracting face albedo as light normalized image may lead to loss of extensive facial details, with light template discarded. To improve that, a novel approach for realistic facial texture reconstruction by combining original image and albedo image is proposed. First, light subspaces of different identities are established from the given reference face images; then by projecting the original and albedo image into each light subspace respectively, texture reference images with corresponding lighting are reconstructed and two texture subspaces are formed. According to the projections in texture subspaces, facial texture with normal light can be synthesized. Due to the combination of original image, facial details can be preserved with face albedo. In addition, image partition is applied to improve the synthesization performance. Experiments on Yale B and CMUPIE databases demonstrate that this algorithm outperforms the others both in image representation and in face recognition.

Keywords: texture reconstruction, illumination, face recognition, subspaces

Downloads 1438

111 Proximate and Mineral Composition of Chicken Giblets from Vojvodina (Northern Serbia)

Authors: M. R. Jokanović, V. M. Tomović, M. T. Jović, S. B. Škaljac, B. V. Šojić, P. M. Ikonić, T. A. Tasić

Abstract:

Proximate (moisture, protein, total fat, total ash) and mineral (K, P, Na, Mg, Ca, Zn, Fe, Cu and Mn) composition of chicken giblets (heart, liver and gizzard) were investigated. Phosphorus content, as well as proximate composition, was determined according to recommended ISO methods. The content of all elements except phosphorus in the giblet tissues was determined using inductively coupled plasma-optical emission spectrometry (ICP-OES), after dry ashing mineralization. Regarding proximate composition, heart was the highest in total fat content and the lowest in protein content. Liver was the highest in protein and total ash content, while gizzard was the highest in moisture and the lowest in total fat content. Regarding mineral composition, liver was the highest for K, P, Ca, Mg, Fe, Zn, Cu, and Mn, while heart was the highest for Na content. The contents of almost all investigated minerals in the analysed giblet tissues of chickens from Vojvodina were similar to values reported in the literature, i.e. in national food composition databases of other countries.

Keywords: Chicken giblets, proximate composition, mineral composition.

Downloads 2659

110 Bottom Up Text Mining through Hierarchical Document Representation

Authors: Y. Djouadi, F. Souam

Abstract:

Most of the existing text mining approaches are proposed with the transaction database model in mind. Thus, the mined dataset is structured using just one concept, the “transaction”, whereas the whole dataset is modeled using the “set” abstract type. In such cases, the structure of the whole dataset and the relationships among the transactions themselves are not modeled and, consequently, not considered in the mining process. We believe that taking into account the structural properties of hierarchically structured information (e.g. textual documents) in the mining process can lead to better results. For this purpose, a hierarchical association rule mining approach for textual documents is proposed in this paper, and the classical set-oriented mining approach is reconsidered in favor of a Directed Acyclic Graph (DAG) oriented approach. Natural language processing techniques are used to obtain the DAG structure. Based on this graph model, a hierarchical bottom-up algorithm is proposed. The main idea is that each node is mined with its parent node.

Keywords: Graph based association rules mining, Hierarchical document structure, Text mining.

Downloads 2009

109 Localizing Acoustic Touch Impacts using Zip-stuffing in Complex k-space Domain

Authors: R. Bremananth, Andy W. H. Khong, A. Chitra

Abstract:

Visualizing sound and noise often helps us to determine an appropriate control over the source localization. Near-field acoustic holography (NAH) is a powerful tool for this ill-posed problem. However, in practice, due to the small finite aperture size, discrete Fourier transform (FFT) based NAH cannot predict the active-region-of-interest (AROI) over the edges of the plane. In theory, a few approaches have been proposed for solving the finite aperture problem. However, most of these methods are not well suited to practical implementation, especially near the edge of the source. In this paper, a zip-stuffing extrapolation approach is suggested with a 2D Kaiser window. It operates on the complex wavenumber space to localize the predicted sources. We numerically form a practice environment with touch impact databases to test the localization of the sound source. It is observed that zip-stuffing aperture extrapolation and a 2D window with evanescent components provide more accuracy, especially in the small aperture and its derivatives.

Keywords: Acoustic source localization, Near-field acoustic holography (NAH), FFT, Extrapolation, k-space wavenumber errors.

Downloads 1611

108 Image Retrieval Based on Multi-Feature Fusion for Heterogeneous Image Databases

Authors: N. W. U. D. Chathurani, Shlomo Geva, Vinod Chandran, Proboda Rajapaksha

Abstract:

Selecting an appropriate image representation is the most important factor in implementing an effective Content-Based Image Retrieval (CBIR) system. This paper presents a multi-feature fusion approach for efficient CBIR, based on the distance distribution of features and relative feature weights at the time of query processing. It is a simple yet effective approach, which is free from the effect of features' dimensions, ranges, internal feature normalization and the distance measure. This approach can easily be adopted in any feature combination to improve retrieval quality. The proposed approach is empirically evaluated using two benchmark datasets for image classification (a subset of the Corel dataset and Oliva and Torralba) and compared with existing approaches. The performance of the proposed approach is confirmed with the significantly improved performance in comparison with the independently evaluated baseline of the previously proposed feature fusion approaches.

Keywords: Feature fusion, image retrieval, membership function, normalization.

Downloads 1301

107 PIELG: A Protein Interaction Extraction System using a Link Grammar Parser from Biomedical Abstracts

Authors: Rania A. Abul Seoud, Nahed H. Solouma, Abou-Baker M. Youssef, Yasser M. Kadah

Abstract:

Due to the ever growing amount of publications about protein-protein interactions, information extraction from text is increasingly recognized as one of the crucial technologies in bioinformatics. This paper presents a Protein Interaction Extraction System using a Link Grammar Parser from biomedical abstracts (PIELG). PIELG uses the linkage given by the Link Grammar Parser to start a case based analysis of the contents of various syntactic roles as well as their linguistically significant and meaningful combinations. The system uses phrasal-prepositional verb patterns to overcome preposition combination problems. The recall and precision are 74.4% and 62.65%, respectively. Experimental evaluations with two other state-of-the-art extraction systems indicate that the PIELG system achieves better performance. For further evaluation, the system is augmented with a graphical package (Cytoscape) for extracting protein interaction information from sequence databases. The result shows that the performance is remarkably promising.
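The abstract reports recall and precision separately. As a rough aid for comparing them with other systems, the two figures can be combined into an F-measure, as in the sketch below. The F-measure is derived here from the two reported numbers and is not stated in the abstract; the function name `f_measure` and the `beta` parameter are assumptions for the example.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta score)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Both inputs are zero, so the score is defined as zero.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Figures taken from the abstract: precision 62.65%, recall 74.4%.
f1 = f_measure(precision=0.6265, recall=0.744)
print(f"{f1:.3f}")  # prints 0.680
```

With `beta=1.0` the two quantities are weighted equally; a larger `beta` would weight recall more heavily.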

Keywords: Link Grammar Parser, Interaction extraction, protein-protein interaction, Natural language processing.

Downloads 2197

106 Artificial Neural Network Development by means of Genetic Programming with Graph Codification

Authors: Daniel Rivero, Julián Dorado, Juan R. Rabuñal, Alejandro Pazos, Javier Pereira

Abstract:

The development of Artificial Neural Networks (ANNs) is usually a slow process in which the human expert has to test several architectures until he finds the one that achieves best results to solve a certain problem. This work presents a new technique that uses Genetic Programming (GP) for automatically generating ANNs. To do this, the GP algorithm had to be changed in order to work with graph structures, so ANNs can be developed. This technique also allows the obtaining of simplified networks that solve the problem with a small group of neurons. In order to measure the performance of the system and to compare the results with other ANN development methods by means of Evolutionary Computation (EC) techniques, several tests were performed with problems based on some of the most used test databases. The results of those comparisons show that the system achieves good results comparable with the already existing techniques and, in most of the cases, they worked better than those techniques.

Keywords: Artificial Neural Networks, Evolutionary Computation, Genetic Programming.

Downloads 1416

105 The Resource Description Framework (RDF) as a Modern Structure for Medical Data

Authors: Gabriela Lindemann, Danilo Schmidt, Thomas Schrader, Dietmar Keune

Abstract:

The amount and heterogeneity of data in biomedical research, notably in interdisciplinary fields, requires new methods for the collection, presentation and analysis of information. Important data from laboratory experiments as well as patient trials are available but come out of distributed resources. The Charité - University Hospital Berlin has established together with the German Research Foundation (DFG) a new information service centre for kidney diseases and transplantation (Open European Nephrology Science Centre - OpEN.SC). Beside a collaborative aspect to create new research groups every single partner or institution of this science information centre making his own data available is allowed to search the whole data pool of the various involved centres. A core task is the implementation of a non-restricting open data structure for the various different data sources. We decided to use a modern RDF model and in a first phase transformed original data coming from the web-based Electronic Patient Record database TBase©.

Keywords: Medical databases, Resource Description Framework (RDF), metadata repository.

Downloads 1982

104 A Query Optimization Strategy for Autonomous Distributed Database Systems

Authors: Dina K. Badawy, Dina M. Ibrahim, Alsayed A. Sallam

Abstract:

A distributed database is a collection of logically related databases that cooperate in a transparent manner. Query processing uses a communication network for transmitting data between sites, and it is one of the challenges in the database world. The development of sophisticated query optimization technology is the reason for the commercial success of database systems, whose complexity and cost increase with the number of relations in the query. Mariposa, query trading, and query trading with processing task-trading strategies were developed for autonomous distributed database systems, but they incur a high optimization cost because all nodes are involved in generating an optimal plan. In this paper, we propose a modification to the autonomous strategy K-QTPT that gives the seller nodes with the lowest cost gradually higher priorities in order to reduce the optimization time. We implement our proposed strategy and present the results and an analysis based on those results.

Keywords: Autonomous strategies, distributed database systems, high priority, query optimization.

Downloads 1003

103 The Use of Mobile Phones by Refugees to Create Social Connectedness: A Literature Review

Authors: Sarah Vuningoma, Maria Rosa Lorini, Wallace Chigona

Abstract:

Mobile phones are one of the main tools for promoting the wellbeing of people and supporting the integration of communities on the margins such as refugees. Information and Communication Technology has the potential to contribute towards reducing isolation, loneliness, and to assist in improving interpersonal relations and fostering acculturation processes. Therefore, the use of mobile phones by refugees might contribute to their social connectedness. This paper aims to demonstrate how existing literature has shown how the use of mobile phones by refugees should engender social connectedness amongst the refugees. Data for the study are drawn from existing literature; we searched a number of electronic databases for papers published between 2010 and 2019. The main findings of the study relate to the use of mobile phones by refugees to (i) create a sense of belonging, (ii) maintain relationships, and (iii) advance the acculturation process. The analysis highlighted a gap in the research over refugees and social connectedness. In particular, further studies should consider evaluating the differences between those who have a refugee permit, those who are waiting for the refugee permit, and those whose request was denied.

Keywords: Belonging, mobile phones, refugees, social connectedness.

Downloads 768

102 Bioinformatics and Molecular Biological Characterization of a Hypothetical Protein SAV1226 as a Potential Drug Target for Methicillin/Vancomycin-Resistant Staphylococcus aureus Infections

Authors: Nichole Haag, Kimberly Velk, Tyler McCune, Chun Wu

Abstract:

Methicillin/multiple-resistant Staphylococcus aureus (MRSA) are infectious bacteria that are resistant to common antibiotics. A previous in silico study in our group has identified a hypothetical protein SAV1226 as one of the potential drug targets. In this study, we reported the bioinformatics characterization, as well as cloning, expression, purification and kinetic assays of hypothetical protein SAV1226 from methicillin/vancomycin-resistant Staphylococcus aureus Mu50 strain. MALDI-TOF/MS analysis revealed a low degree of structural similarity with known proteins. Kinetic assays demonstrated that hypothetical protein SAV1226 is neither a domain of an ATP dependent dihydroxyacetone kinase nor of a phosphotransferase system (PTS) dihydroxyacetone kinase, suggesting that the function of hypothetical protein SAV1226 might be misannotated on public databases such as UniProt and InterProScan 5.

Keywords: Dihydroxyacetone kinase, essential genes, Methicillin-resistant Staphylococcus aureus, drug target.

Downloads 1721

101 Automatic Method for Exudates and Hemorrhages Detection from Fundus Retinal Images

Authors: A. Biran, P. Sobhe Bidari, K. Raahemifar

Abstract:

Diabetic Retinopathy (DR) is an eye disease that leads to blindness. The earliest signs of DR are the appearance of red and yellow lesions on the retina called hemorrhages and exudates. Early diagnosis of DR prevents blindness; hence, many automated algorithms have been proposed to extract hemorrhages and exudates. In this paper, an automated algorithm is presented to extract hemorrhages and exudates separately from retinal fundus images using different image processing techniques including Circular Hough Transform (CHT), Contrast Limited Adaptive Histogram Equalization (CLAHE), Gabor filter and thresholding. Since the Optic Disc is the same color as the exudates, it is first localized and detected. The presented method has been tested on fundus images from the Structured Analysis of the Retina (STARE) and Digital Retinal Images for Vessel Extraction (DRIVE) databases using MATLAB code. The results show that this method is perfectly capable of detecting hard exudates and the highly probable soft exudates. It is also capable of detecting the hemorrhages and distinguishing them from blood vessels.

Keywords: Diabetic retinopathy, fundus, CHT, exudates, hemorrhages.

Downloads 2587

100 Methods of Geodesic Distance in Two-Dimensional Face Recognition

Authors: Rachid Ahdid, Said Safi, Bouzid Manaut

Abstract:

In this paper, we present a comparative study of three methods for 2D face recognition: Iso-Geodesic Curves (IGC), Geodesic Distance (GD) and Geodesic-Intensity Histogram (GIH). These approaches are based on computing the geodesic distance between points of the facial surface and between facial curves. In this study we represent the gray-level image as a 2D surface in a 3D space, with the third coordinate proportional to the intensity values of pixels. In the classification step, we use Neural Networks (NN), K-Nearest Neighbor (KNN) and Support Vector Machines (SVM). The images used in our experiments are from two well-known databases of face images, ORL and YaleB. The ORL database was used to evaluate the performance of the methods under conditions where the pose and sample size are varied, and the YaleB database was used to examine the performance of the systems when the facial expressions and lighting are varied.

Keywords: 2D face recognition, Geodesic distance, Iso-Geodesic Curves, Geodesic-Intensity Histogram, facial surface, Neural Networks, K-Nearest Neighbor, Support Vector Machines.

Downloads 1773

99 Ways of Life of Undergraduate Students Based On Sufficiency Economy Philosophy in Suan Sunandha Rajabhat University

Authors: Phusit Phukamchanoad

Abstract:

This study aimed to analyse the application of sufficiency economy in students’ ways of life on campus at Suan Sunandha Rajabhat University. Data was gathered through 394 questionnaires. The study results found that the majority of students were confident that “where there’s a will, there’s a way.” Overall, the students applied the sufficiency economy at a great level, along with being persons who do not exploit others, were satisfied with living their lives moderately, according to the sufficiency economy. Importance was also given to kindness and generosity. Importantly, students were happy with living according to their individual circumstances and status at the present. They saw the importance of joint life planning, self-development, and self-dependence, always learning to be satisfied with “adequate”. As for their practices and ways of life, socially relational activities rated highly, especially initiation activities for underclassmen at the university and the seniority system, which are suitable for activities on campus. Furthermore, the students knew how to build a career and find supplemental income, knew how to earnestly work according to convention to finish work, and preferred to study elective subjects which directly benefit career-wise. The students’ application of sufficiency economy philosophy principles depended on their lives in their hometowns. The students from the provinces regularly applied sufficiency economy philosophy to their lives, for example, by being frugal, steadfast, determined, avoiding negligence, and making economical spending plans; more so than the students from the capital.

Keywords: Application of Sufficiency Economy Philosophy, Way of Living, Undergraduate Students.

Downloads 2732

98 Small Businesses' Decision to have a Website Saudi Arabia Case Study

Authors: M. Al-hawari, H. AL–Yamani, B. Izwawa

Abstract:

Recognizing the increasing importance of using the Internet to conduct business, this paper looks at some matters associated with small businesses deciding whether or not to have a Website and go online. Small businesses in Saudi Arabia struggle to make this decision. For organizations to fully go online, conduct business and provide online information services, they need to connect their database to the Web. Some issues related to doing that might be beyond the capabilities of most small businesses in Saudi Arabia, such as Website management, technical issues and security concerns. Here we focus on a small business firm in Saudi Arabia (case study), discussing the issues related to the decision to go online and the firm's options of what to do and how to do it. The paper suggests some valuable solutions for connecting databases to the Web. It also discusses some of the important issues related to online information services and e-commerce, mainly Web hosting options and security issues.

Keywords: E-Commerce, Saudi Arabia, Small business, Web-database connection, Web hosting, World Wide Web (Web).

Downloads 1919

97 Improvement in Power Transformer Intelligent Dissolved Gas Analysis Method

Authors: S. Qaedi, S. Seyedtabaii

Abstract:

Non-destructive evaluation of in-service power transformer condition is necessary for avoiding catastrophic failures. Dissolved Gas Analysis (DGA) is one of the important methods. Traditional, statistical and intelligent DGA approaches have been adopted for accurate classification of incipient fault sources. Unfortunately, there are often not enough faulty patterns for sufficient training of intelligent systems. Bootstrapping is expected to alleviate this shortcoming and to yield algorithms with better classification success rates. In this paper, the performance of artificial neural network (ANN), K-Nearest Neighbour and support vector machine methods using bootstrapped data is detailed, and it is shown that while the success rate of the ANN algorithms improves remarkably, the outcomes of the others do not benefit as much from the enlarged data space. For assessment, two databases are employed: IEC TC10 and a dataset collected from data reported in papers. A high average test success rate demonstrates the remarkable outcome.
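The abstract does not spell out its bootstrapping procedure. A common reading is resampling the scarce fault records with replacement to enlarge the training set, which can be sketched as below. The function name `bootstrap_sample`, the seed handling, and the example records are assumptions for illustration only, not the authors' implementation.

```python
import random


def bootstrap_sample(samples, n_draws, seed=None):
    """Draw n_draws items from samples uniformly at random, with replacement."""
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)  # local generator so results are reproducible
    return [rng.choice(samples) for _ in range(n_draws)]


# Hypothetical fault records: (gas concentrations in ppm, fault label).
faulty_records = [
    ((120, 35, 8), "thermal"),
    ((300, 90, 60), "arcing"),
    ((80, 20, 2), "partial discharge"),
]

# Enlarge the small set of fault patterns to 12 training records.
enlarged = bootstrap_sample(faulty_records, n_draws=12, seed=42)
```

Because draws are made with replacement, the enlarged set repeats existing records rather than creating new measurements, so a held-out test set should be kept separate from the resampled data.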

Keywords: Dissolved gas analysis, Transformer incipient fault, Artificial Neural Network, Support Vector Machine (SVM), K-Nearest Neighbor (KNN)

Downloads 2688

96 Values as a Predictor of Cyber-bullying Among Secondary School Students

Authors: Bülent Dilmaç, Didem Aydoğan

Abstract:

The use of new technologies such as the internet (e-mail, chat rooms) and cell phones has steeply increased in recent years. Especially among children and young people, the use of technological tools and equipment is widespread. Although many teachers and administrators now recognize the problem of school bullying, few are aware that students are being harassed through electronic communication. Referred to as electronic bullying, cyber bullying, or online social cruelty, this phenomenon includes bullying through email, instant messaging, in a chat room, on a website, or through digital messages or images sent to a cell phone. Cyber bullying is defined as causing deliberate/intentional harm to others using the internet or other digital technologies. This study has a quantitative research design and uses a relational survey as its method. The participants consisted of 300 secondary school students in the city of Konya, Turkey. 195 (64.8%) participants were female and 105 (35.2%) were male. 39 (13%) students were at grade 1, 187 (62.1%) were at grade 2 and 74 (24.6%) were at grade 3. The “Cyber Bullying Question List” developed by Arıcak (2009) was given to students. Following questions about demographics, a functional definition of cyber bullying was provided. In order to specify students' human values, the “Human Values Scale (HVS)” developed by Dilmaç (2007) for secondary school students was administered. The scale consists of 42 items in six dimensions. Data analysis was conducted by the primary investigator of the study using SPSS 14.00 statistical analysis software. Descriptive statistics were calculated for the analysis of students' cyber bullying behaviour, and simple regression analysis was conducted in order to test whether each value in the scale could explain cyber bullying behaviour.

Keywords: Cyber bullying, Values, Secondary School Students

Downloads: 3774

95 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms

Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang

Abstract:

Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving bioassay predictions. The first step is instance selection, in which the dataset is divided into training, testing, and validation sets. The second step is discretization, which partitions the data in consideration of accuracy vs. precision. The third step is normalization, where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection, where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning tools including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combinations of preprocessing steps and machine learning algorithms in producing more consistent and accurate predictions.
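Two of the preprocessing operations named in this abstract, normalization to the range 0 to 1 and discretization, can be sketched as follows. This is only an illustration under stated assumptions: min-max scaling and equal-width binning are common choices but the abstract does not say which variants the authors use, the binning here is applied to already normalized values for convenience, and the function names and sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value in [0, 1] to one of n_bins equal-width bins."""
    return [min(int(v * n_bins), n_bins - 1) for v in values]


# Made-up descriptor values for three compounds.
molecular_weights = [150.0, 300.0, 450.0]

normalized = min_max_normalize(molecular_weights)
binned = equal_width_bins(normalized, n_bins=4)
print(normalized)  # [0.0, 0.5, 1.0]
print(binned)      # [0, 2, 3]
```

Fewer bins give coarser but more stable categories, which is one way to read the accuracy versus precision trade-off the abstract refers to.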

Keywords: Bioassay, machine learning, preprocessing, virtual screen.

Downloads: 935

94 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in data-driven analysis in major areas of medicine, such as clinical decision support systems, survival analysis, patient similarity analysis, and image analytics. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. At the opposite end of the spectrum, however, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature. Such data is found in discharge summaries, clinical notes, and procedural notes, which are written in human narrative form and have neither a relational model nor a standard grammatical structure. An important step in using these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map, a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique based on a string matching algorithm indexed on curated knowledge sources, which is both fast and configurable. The authors also briefly compare its performance with MetaMap, one of the most reputed tools for medical concept retrieval, and present the advantages the former displays over the latter.
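The general idea of string matching against an indexed knowledge source can be sketched with a simple dictionary lookup that prefers the longest matching phrase. This is not Q-Map's algorithm, whose details the abstract does not give; the `CONCEPTS` table, its placeholder identifiers (which are not real UMLS codes), and the function `find_concepts` are assumptions made for illustration.

```python
import re

# Hypothetical mini knowledge source; IDs are placeholders, not real UMLS codes.
CONCEPTS = {
    "chest pain": "CONCEPT_1",
    "pain": "CONCEPT_2",
    "type 2 diabetes": "CONCEPT_3",
}


def find_concepts(text, concepts, max_len=3):
    """Greedy longest-match lookup of token n-grams in a concept dictionary."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in concepts:
                found.append((candidate, concepts[candidate]))
                i += n
                break
        else:
            i += 1  # no phrase starts at this token
    return found


note = "Patient reports chest pain; history of type 2 diabetes."
print(find_concepts(note, CONCEPTS))
# [('chest pain', 'CONCEPT_1'), ('type 2 diabetes', 'CONCEPT_3')]
```

Preferring the longest match keeps "chest pain" from being reported as the more generic "pain".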

Keywords: Information retrieval (IR), unified medical language system (UMLS), Syntax Based Analysis, natural language processing (NLP), medical informatics.

Downloads: 723

93 A Watermarking System Using the Wavelet Technique for Satellite Images

Authors: I. R. Farah, I. B. Ismail, M. B. Ahmed

Abstract:

The rapid development of new technologies and the emergence of increasingly sophisticated open communication systems create a new challenge: protecting digital content from piracy. Digital watermarking is a recent research direction and a technique suggested as a solution to this problem. The technique consists of inserting identification information (a watermark) into digital data (audio, video, images, databases, etc.) in an invisible and indelible manner, without degrading the quality of the original medium. Moreover, it must be possible to extract the watermark correctly despite deterioration of the watermarked medium (i.e., attacks). In this paper we propose a system for watermarking satellite images. We chose to embed the watermark in the frequency domain, specifically using the discrete wavelet transform (DWT). We applied our algorithm to satellite images of the Tunisian center. The experiments show satisfactory results. In addition, our algorithm showed considerable resistance to different attacks, notably compression (JPEG, JPEG2000), filtering, histogram manipulation, and geometric distortions such as rotation, cropping, and scaling.
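To make the idea of embedding in the wavelet domain concrete, here is a minimal one-dimensional sketch using a one-level Haar transform in its averaging and differencing form. It is not the authors' scheme: the choice of the Haar wavelet, additive embedding in the detail coefficients, the strength `alpha`, and extraction by comparison with the original row are all assumptions, and the pixel values are made up. The row length must be even, with one watermark bit per pixel pair.

```python
def haar_forward(row):
    """One-level Haar transform using pairwise averages and differences."""
    approx = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
    detail = [(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the row from averages and differences."""
    row = []
    for a, d in zip(approx, detail):
        row.extend([a + d, a - d])
    return row


def embed(row, bits, alpha=0.5):
    """Add alpha * bit (bit is +1 or -1) to each detail coefficient."""
    approx, detail = haar_forward(row)
    marked = [d + alpha * b for d, b in zip(detail, bits)]
    return haar_inverse(approx, marked)


def extract(marked_row, original_row):
    """Recover bits from the sign of the detail-coefficient difference."""
    _, d_marked = haar_forward(marked_row)
    _, d_orig = haar_forward(original_row)
    return [1 if m - o > 0 else -1 for m, o in zip(d_marked, d_orig)]


pixel_row = [10, 12, 20, 16, 8, 8, 30, 26]  # made-up image row
watermark = [1, -1, 1, -1]

marked_row = embed(pixel_row, watermark)
print(marked_row)                      # [10.5, 11.5, 19.5, 16.5, 8.5, 7.5, 29.5, 26.5]
print(extract(marked_row, pixel_row))  # [1, -1, 1, -1]
```

A small `alpha` keeps the change to each pixel small, at the cost of a watermark that is easier to destroy; a real system would tune this trade-off against the attacks listed in the abstract.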

Keywords: Digital data watermarking, Spatial Database, Satellite images, Discrete Wavelet Transform (DWT).

Downloads: 1629

92 Examining the Value of Attribute Scores for Author-Supplied Keyphrases in Automatic Keyphrase Extraction

Authors: Vicky Min-How Lim, Siew Fan Wong, Tong Ming Lim

Abstract:

Automatic keyphrase extraction is useful for efficiently locating specific documents in online databases. While several techniques have been introduced over the years, improvement in accuracy has been minimal. This research examines attribute scores for author-supplied keyphrases to better understand how the scores affect the accuracy of automatic keyphrase extraction. Five attributes are chosen for examination: Term Frequency, First Occurrence, Last Occurrence, Phrase Position in Sentences, and Term Cohesion Degree. The results show that First Occurrence is the most reliable attribute. Term Frequency, Last Occurrence, and Term Cohesion Degree display a wide range of variation but are still usable with suggested tweaks. Only Phrase Position in Sentences shows a totally unpredictable pattern. The results imply that the commonly used ranking approach, which directly takes the top-ranked phrases from the candidate keyphrase list as the keyphrases, may not be reliable.
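Three of the five attributes named in this abstract can be computed with a few lines of code. The sketch below is illustrative only: the abstract does not define the attributes, so the definitions used here (a raw occurrence count, and first and last positions expressed as a fraction of the document length in words) are assumptions, as are the function name and the sample text. Tokenization is a plain whitespace split, so punctuation is not handled.

```python
def phrase_attributes(text, phrase):
    """Return term frequency and relative first/last occurrence of a phrase."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    # Word index of every position where the full phrase starts.
    positions = [i for i in range(len(words) - n + 1) if words[i:i + n] == target]
    if not positions:
        return None
    return {
        "term_frequency": len(positions),
        "first_occurrence": positions[0] / len(words),
        "last_occurrence": positions[-1] / len(words),
    }


document = "keyphrase extraction helps search and keyphrase extraction helps indexing"
attrs = phrase_attributes(document, "keyphrase extraction")
print(attrs["term_frequency"])    # 2
print(attrs["first_occurrence"])  # 0.0
```

Under these definitions a low First Occurrence value means the phrase appears near the start of the document.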

Keywords: Accuracy, Attribute Score, Author-supplied keyphrases, Automatic keyphrase extraction.

Downloads: 1294

91 Categorizing Search Result Records Using Word Sense Disambiguation

Authors: R. Babisaraswathi, N. Shanthi, S. S. Kiruthika

Abstract:

Web search engines are designed to retrieve and extract information from web databases and to return dynamic web pages. The Semantic Web is an extension of the current web that includes semantic content in web pages. The main goal of the semantic web is to improve the quality of the current web by changing its contents into machine-understandable form. Therefore, the milestone of the semantic web is to have semantic-level information in the web. Nowadays, people use different keyword-based search engines to find the relevant information they need from the web, but many words are polysemous. When these words are used to query a search engine, it displays Search Result Records (SRRs) with different meanings. The SRRs with similar meanings are grouped together based on Word Sense Disambiguation (WSD). In addition, semantic annotation is performed to improve the efficiency of search result records. Semantic annotation is the process of adding semantic metadata to web resources. Thus the grouped SRRs are annotated, and a summary is generated which describes the information in the SRRs. Automatic semantic annotation, however, is a significant challenge in the semantic web. Here, ontology and knowledge-based representation are used to annotate the web pages.

Keywords: Ontology, Semantic Web, WordNet, Word Sense Disambiguation.

Downloads: 1713

90 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics

Authors: Farhad Asadi, Mohammad Javad Mollakazemi

Abstract:

In this paper, Bayesian online inference in models of data series is constructed with a change-point algorithm, which separates the observed time series into independent series and studies the change and variation of the regime of the data along with the related statistical characteristics. Variation in the statistical characteristics of time series data often represents separate phenomena in a dynamical system, such as a change in the state of brain dynamics reflected in EEG signal measurements or a change in an important regime of the data in many dynamical systems. In this paper, a prediction algorithm for studying change-point locations in time series data is simulated. It is verified that the pattern of the proposed data distribution is an important factor for simpler and smoother fluctuation of the hazard rate parameter and for better identification of change-point locations. Finally, the conditions under which the time series distribution affects the factors in this approach are explained and validated with different time series databases for some dynamical systems.

Keywords: Time series, fluctuation in statistical characteristics, optimal learning.

Downloads: 1758