Search results for: specific databases

8043 Assessment of Image Databases Used for Human Skin Detection Methods

Abstract:

Human skin detection is a vital step in many applications. Some of these applications are critical, especially those related to security. This raises the importance of a high-performance detection algorithm. To validate the accuracy of the algorithm, image databases are usually used. However, the suitability of these image databases is still questionable. It is suggested that the suitability can be measured mainly by the span of the color space that the database covers. This research investigates the validity of three famous image databases.

Keywords: image databases, image processing, pattern recognition, neural networks

Procedia PDF Downloads 217

8042 Database Playlists: Croatia's Popular Music in the Mirror of Collective Memory

Authors: Diana Grguric, Robert Svetlacic, Vladimir Simovic

Abstract:

This research analytically explores database playlists by studying memory culture through Croatian popular radio music. The research is based on the analysis of databases built from the playlists of ten Croatian radio stations. The most recent Croatian songs on Statehood Day 2008-2013 are analyzed in order to gain insight into their (memory) potential in terms of storing, interpreting and presenting a national identity. The research starts with the general assumption that popular music is an efficient identifier, transmitter, and promoter of national identity. The aim of the research was to reveal specific titles of Croatian popular songs that participate in marking memories and to analyze their symbolic capital, in order to gain insight into the popular music experience of the past and to develop a new method of scientifically based analysis of specific databases.

Keywords: specific databases, popular radio music, collective memory, national identity

Procedia PDF Downloads 326

8041 Developing an Information Model of Manufacturing Process for Sustainability

Authors: Jae Hyun Lee

Abstract:

Manufacturing companies use life-cycle inventory databases to analyze the sustainability of their manufacturing processes. Life-cycle inventory data provides reference data which may not be accurate for a specific company. Collecting accurate data on manufacturing processes for a specific company requires enormous time and effort. An information model of typical manufacturing processes can reduce the time and effort needed to get appropriate reference data for a specific company. This paper shows an attempt to build an abstract information model which can be used to develop information models for specific manufacturing processes.

Keywords: process information model, sustainability, OWL, manufacturing

Procedia PDF Downloads 391

8040 Enhance Security in XML Databases: XLog File for Severity-Aware Trust-Based Access Control

Authors: A. Asmawi, L. S. Affendey, N. I. Udzir, R. Mahmod

Abstract:

The topic of enhancing security in XML databases is important as it includes protecting sensitive data and providing a secure environment to users. In order to improve security and provide dynamic access control for XML databases, we present the XLog file, which is used to calculate user trust values by recording users’ bad transactions, errors and query severities. Severity-aware trust-based access control for XML databases manages the access policy depending on users' trust values and prevents unauthorized processes, malicious transactions and insider threats. Privileges are automatically modified and adjusted over time depending on user behaviour and query severity. Logging in a database is an important process and is used for recovery and security purposes. In this paper, the XLog file is presented as a dynamic and temporary log file for XML databases to enhance the level of security.

Keywords: XML database, trust-based access control, severity-aware, trust values, log file

Procedia PDF Downloads 266

8039 Analysis of Cyber Activities of Potential Business Customers Using Neo4j Graph Databases

Authors: Suglo Tohari Luri

Abstract:

Data analysis is an important aspect of business performance. With the application of artificial intelligence within databases, selecting a suitable database engine for an application design is also crucial for business data analysis. The application of business intelligence (BI) software to graph databases such as Neo4j has proved highly effective for customer data analysis. Yet it remains a concern that not all business organizations have Neo4j business intelligence software to use for customer data analysis. Further, those with the BI software lack personnel with the requisite expertise to use it effectively with the Neo4j database. The purpose of this research is to demonstrate how Neo4j program code alone can be applied to the analysis of e-commerce website customer visits. Because the Neo4j database engine is optimized for handling and managing data relationships, and can support high-performance, scalable systems that handle connected data nodes, business owners who advertise their products on websites that use Neo4j as a database can determine the number of visitors and so learn which products are visited at routine intervals, for the necessary decision making. It will also help in identifying the best customer segments for specific goods, so that more emphasis can be placed on advertising those goods on the said websites.
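The kind of visit analysis described above can be illustrated with a small sketch. This is not Neo4j code and not the author's implementation: it is a plain-Python stand-in in which each (visitor, product) pair plays the role of a relationship between two nodes. The visitor names, product names, and function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (visitor, product) pairs; each pair stands in for one
# "visited" relationship between a customer node and a product node.
visits = [
    ("alice", "laptop"),
    ("bob", "laptop"),
    ("alice", "phone"),
    ("carol", "laptop"),
    ("alice", "laptop"),  # repeat visit by the same visitor
]


def visits_per_product(edges):
    """Count every visit relationship that points at each product."""
    return Counter(product for _, product in edges)


def unique_visitors_per_product(edges):
    """Count distinct visitors per product, ignoring repeat visits."""
    visitors = defaultdict(set)
    for visitor, product in edges:
        visitors[product].add(visitor)
    return {product: len(people) for product, people in visitors.items()}


print(visits_per_product(visits))
print(unique_visitors_per_product(visits))
```

In a graph database the same two questions would be answered by traversing relationships rather than scanning a list; the sketch only shows what is being counted.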

Keywords: data, engine, intelligence, customer, neo4j, database

Procedia PDF Downloads 162

8038 Perceptions of Academic Staff on the Influences of Librarians and Working Colleagues Towards the Awareness and Use of Electronic Databases in Umaru Musa Yar’adua University, Katsina

Authors: Lawal Kado

Abstract:

This paper investigates the perceptions of academic staff at Umaru Musa Yar’adua University regarding the influences of librarians and working colleagues on the awareness and use of electronic databases. The study aims to provide insights into the effectiveness of these influences and suggest strategies to improve the usage of electronic databases. Research aim: The aim of this study is to determine the perceptions of academic staff on the influence of librarians and working colleagues towards the awareness and use of electronic databases in Umaru Musa Yar’adua University, Katsina. Methodology: The study adopts a quantitative method and survey research design. The survey questionnaire is distributed to 110 respondents selected through simple random sampling from a population of 523 academic staff. The collected data is analyzed using the Statistical Package for Social Sciences (SPSS) version 23. Findings: The study reveals a high level of general awareness of electronic databases in the university, largely influenced by librarians and colleagues. Librarians have played a crucial role in making academic staff aware of the available databases. The sources of information for awareness include colleagues, social media, e-mails from the library, and internet searching. Theoretical importance: This study contributes to the literature by examining the perceptions of academic staff, which can inform policymakers and stakeholders in developing strategies to maximize the use of electronic databases. Data collection and analysis procedures: The data is collected through a survey questionnaire that utilizes the Likert scaling technique. The closed-ended questions are analyzed using SPSS 23. Question addressed: The paper addresses the question of how librarians and working colleagues influence the awareness and use of electronic databases among academic staff. 
Conclusion: The study concludes that the influence of librarians and working colleagues significantly contributes to the awareness and use of electronic databases among academic staff. The paper recommends the establishment of dedicated departments or units for marketing library resources to further promote the usage of electronic databases.

Keywords: awareness, electronic databases, academic staff, unified theory of acceptance and use of technology, social influence

Procedia PDF Downloads 34

8037 In silico Subtractive Genomics Approach for Identification of Strain-Specific Putative Drug Targets among Hypothetical Proteins of Drug-Resistant Klebsiella pneumoniae Strain 825795-1

Authors: Umairah Natasya Binti Mohd Omeershffudin, Suresh Kumar

Abstract:

Klebsiella pneumoniae is a Gram-negative enteric bacterium that causes nosocomial and urinary tract infections. Of particular concern is the global emergence of multidrug-resistant (MDR) strains of Klebsiella pneumoniae. Characterization of antibiotic resistance determinants at the genomic level plays a critical role in understanding, and potentially controlling, the spread of MDR pathogens. In this study, drug-resistant Klebsiella pneumoniae strain 825795-1 was investigated with extensive computational approaches aimed at identifying novel drug targets among hypothetical proteins. We analyzed 1099 hypothetical proteins available in the genome. We used an in silico genome subtraction methodology to identify potential pathogen-specific drug targets against Klebsiella pneumoniae. We employed bioinformatics tools to subtract the strain-specific paralogous and host-specific homologous sequences from the bacterial proteome. The sorted 645 proteins were further refined to identify the essential genes in the pathogenic bacterium using the Database of Essential Genes (DEG). We found 135 unique essential proteins in the target proteome that could be utilized as novel targets to design newer drugs. Further, we identified 49 cytoplasmic proteins as potential drug targets through sub-cellular localization prediction. We then investigated these proteins in the DrugBank database, and 11 of the unique essential proteins showed druggability according to the FDA-approved DrugBank database, with diverse broad-spectrum properties. The results of this study will facilitate the discovery of new drugs against Klebsiella pneumoniae.

Keywords: pneumonia, drug target, hypothetical protein, subtractive genomics

Procedia PDF Downloads 148

8036 Comparison between RILM, JSTOR, and WorldCat Used to Search for Secondary Literature

Authors: Stacy Jarvis

Abstract:

Databases such as JSTOR, RILM and WorldCat have been the main source and storage of literature in the music orb. The Reference Index to Music Literature is a bibliographic database of over 2.6 million citations to writings about music from over 70 countries. The Research Institute produces RILM for the Study of Music at the University of Buffalo. JSTOR is an e-library of academic journals, books, and primary sources. Database JSTOR helps scholars find, utilise, and build upon a vast range of literature through a powerful teaching and research platform. Another database, WorldCat, is the world's biggest library catalogue, assisting scholars in finding library materials online. An evaluation of these databases in the music sphere is conducted by looking into the description and intended use and finding similarities and differences among them. Through comparison, it is found that these aim to serve different purposes, though they have the same goal of providing and storing literature. Also, since each database has different parts of literature that it majors on, the intended use of the three databases is evaluated. This can be found in the description, scope, and intended uses section. These areas are crucial to the research as it addresses the functional or literature differences among the three databases. It is also found that these databases have different quantitative potentials. This is determined by addressing the year each database began collecting literature and the number of articles, periodicals, albums, conference proceedings, music, dissertations, digital media, essays collections, journal articles, monographs, online resources, reviews, and reference materials that can be found in each one of them. This can be found in the sections- description, scope and intended uses and the importance of the database in identifying literature on different topics. 
To compare the delivery of services to the users, the importance of databases in identifying literature on different topics is also addressed in the section -the importance of databases in identifying literature on different topics. Even though these databases are used in research, they all have disadvantages and advantages. This is addressed in the sections on advantages and disadvantages. This will be significant in determining which of the three is the best. Also, it will help address how the shortcomings of one database can be addressed by utilising two databases together while conducting research. It is addressed in the section- a combination of RILM and JSTOR. All this information revolves around the idea that a huge amount of quantitative and qualitative data can be found in the presented databases on music and digital content; however, each of the given databases has a different construction and material features contributing to the musical scholarship in its way.

Keywords: RILM, JSTOR, WorldCat, database, literature, research

Procedia PDF Downloads 58

8035 A Method for Reduction of Association Rules in Data Mining

Authors: Diego De Castro Rodrigues, Marcelo Lisboa Rocha, Daniela M. De Q. Trevisan, Marcos Dias Da Conceicao, Gabriel Rosa, Rommel M. Barbosa

Abstract:

The use of association rules algorithms within data mining is recognized as being of great value in the knowledge discovery in databases. Very often, the number of rules generated is high, sometimes even in databases with small volume, so the success in the analysis of results can be hampered by this quantity. The purpose of this research is to present a method for reducing the quantity of rules generated with association algorithms. Therefore, a computational algorithm was developed with the use of a Weka Application Programming Interface, which allows the execution of the method on different types of databases. After the development, tests were carried out on three types of databases: synthetic, model, and real. Efficient results were obtained in reducing the number of rules, where the worst case presented a gain of more than 50%, considering the concepts of support, confidence, and lift as measures. This study concluded that the proposed model is feasible and quite interesting, contributing to the analysis of the results of association rules generated from the use of algorithms.
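The three measures named in the abstract (support, confidence, and lift) have standard definitions, which can be sketched as follows. This is a minimal illustration and not the authors' reduction method or the Weka API; the toy transactions and function names are assumptions made for the example.

```python
# Hypothetical toy transactions; each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's support.

    Values above 1 suggest a positive association, values below 1 a
    negative one.
    """
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
print(lift(transactions, {"bread"}, {"milk"}))
```

A rule-reduction method could, for instance, discard rules whose measures fall below chosen thresholds, but which criteria the paper actually applies is not stated in the abstract.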

Keywords: data mining, association rules, rules reduction, artificial intelligence

Procedia PDF Downloads 126

8034 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes

Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele

Abstract:

Objective: The objective of the study was to integrate two large databases, from the Swiss national nutrition survey (menuCH) and the Swiss national health survey 2012, for data mining purposes. Each database contains demographic base data. An integrated Swiss database is built to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied to food consumption. Design: The Swiss national nutrition survey (menuCH), with approx. 2000 respondents from two different surveys, one by phone and the other by questionnaire, and the Swiss national health survey 2012, with 21500 respondents, were pre-processed, cleaned and finally integrated into a single relational database. Results: The result of this study is an integrated relational database built from the Swiss nutritional and health databases.

Keywords: health informatics, data mining, nutritional and health databases, nutritional and chronical databases

Procedia PDF Downloads 80

8033 Computational Screening of Secretory Proteins with Brain-Specific Expression in Glioblastoma Multiforme

Authors: Sumera, Sanila Amber, Fatima Javed Mirza, Amjad Ali, Saadia Zahid

Abstract:

Glioblastoma multiforme (GBM) is a widespread and fatal primary brain tumor with an increased risk of relapse in spite of aggressive treatment. The current procedures for GBM diagnosis include invasive procedures, i.e. resection or biopsy, to acquire tumor mass. The implementation of minimally invasive tests as a potential diagnostic technique, and biofluid-based monitoring of GBM, depends on discovering biomarkers in CSF and blood. Therefore, we performed a comprehensive in silico analysis to identify potential circulating biomarkers for GBM. Initially, six gene and protein databases were utilized to mine brain-specific proteins. The resulting proteins were filtered using a pipeline of five tools to predict the secretory proteins. Subsequently, the expression profile of the secreted proteins was verified in the brain and blood using two databases. Additional verification of the resulting proteins was done using the Plasma Proteome Database (PPD) to confirm their presence in blood. The final set of proteins was searched in the literature for their relationship with GBM, with special emphasis on the secretome proteome. Initially, 2145 proteins were mined as brain-specific, of which 69 were identified as secretory in nature. Verification of the expression profile in brain and blood eliminated 58 of the 69 proteins, leaving a list of 11 proteins. Further verification of these 11 proteins eliminated 2 more, giving a final set of nine secretory proteins, i.e. OPCML, NPTX1, LGI1, CNTN2, LY6H, SLIT1, CREG2, GDF1 and SERPINI1. Of these 9 proteins, 7 were found to be linked to GBM, whereas 2 have not been investigated in GBM so far. We propose that these secretory proteins can serve as potential circulating biomarker signatures of GBM and will facilitate the development of minimally invasive diagnostic methods and novel therapeutic interventions for GBM.

Keywords: glioblastoma multiforme, secretory proteins, brain secretome, biomarkers

Procedia PDF Downloads 119

8032 Utilization of CD-ROM Database as a Storage and Retrieval System by Students of Nasarawa State University Keffi

Authors: Suleiman Musa

Abstract:

The utilization of CD-ROM as a storage and retrieval system by the Nasarawa State University Keffi (NSUK) Library is crucial in preserving and disseminating information to students and staff. This study investigated the utilization of the CD-ROM database storage and retrieval system by students of NSUK. Data were generated using a structured questionnaire. One thousand and fifty-two (1052) respondents were randomly selected among post-graduate and under-graduate students. Eight hundred and ten (810) questionnaires were returned, but only five hundred and ninety-three (593) were well completed and usable. The study found that post-graduate students use CD-ROM databases more often than under-graduate students at NSUK. The results revealed that 33.22% of respondents learned about the CD-ROM database through library staff, and 29.69% use CD-ROM once a month. A large share of users (45.70%) use CD-ROM databases for study and research. Lack of user orientation accounts for 58.35% of the problems faced, while lack of trained staff (31.20%) makes utilization of the CD-ROM database more difficult. A major share of users (38.28%) are neither satisfied nor dissatisfied, while a good number (27.99%) are satisfied. A further 1.52% are highly dissatisfied but could not give reasons why. To ensure effective utilization of the CD-ROM database storage and retrieval system by students of NSUK, the following recommendations are made: efforts should be made to encourage under-graduates to use the CD-ROM database; the institution should conduct an orientation/induction course for students on CD-ROM databases in the library; and NSUK should produce in-house databases on CD-ROM for easy access by users.

Keywords: utilization, CD-ROM databases, storage, retrieval, students

Procedia PDF Downloads 409

8031 Life Cycle Datasets for the Ornamental Stone Sector

Authors: Isabella Bianco, Gian Andrea Blengini

Abstract:

The environmental impact related to ornamental stones (such as marbles and granites) is largely debated. Starting from the industrial revolution, continuous improvements of machineries led to a higher exploitation of this natural resource and to a more international interaction between markets. As a consequence, the environmental impact of the extraction and processing of stones has increased. Nevertheless, if compared with other building materials, ornamental stones are generally more durable, natural, and recyclable. From the scientific point of view, studies on stone life cycle sustainability have been carried out, but these are often partial or not very significant because of the high percentage of approximations and assumptions in calculations. This is due to the lack, in life cycle databases (e.g. Ecoinvent, Thinkstep, and ELCD), of datasets about the specific technologies employed in the stone production chain. For example, databases do not contain information about diamond wires, chains or explosives, materials commonly used in quarries and transformation plants. The project presented in this paper aims to populate the life cycle databases with specific data of specific stone processes. To this goal, the methodology follows the standardized approach of Life Cycle Assessment (LCA), according to the requirements of UNI 14040-14044 and to the International Reference Life Cycle Data System (ILCD) Handbook guidelines of the European Commission. The study analyses the processes of the entire production chain (from-cradle-to-gate system boundaries), including the extraction of benches, the cutting of blocks into slabs/tiles and the surface finishing. Primary data have been collected in Italian quarries and transformation plants which use technologies representative of the current state-of-the-art. Since the technologies vary according to the hardness of the stone, the case studies comprehend both soft stones (marbles) and hard stones (gneiss). 
In particular, data about energy, materials and emissions were collected in marble basins of Carrara and in Beola and Serizzo basins located in the province of Verbano Cusio Ossola. Data were then elaborated through an appropriate software to build a life cycle model. The model was realized setting free parameters that allow an easy adaptation to specific productions. Through this model, the study aims to boost the direct participation of stone companies and encourage the use of LCA tool to assess and improve the stone sector environmental sustainability. At the same time, the realization of accurate Life Cycle Inventory data aims at making available, to researchers and stone experts, ILCD compliant datasets of the most significant processes and technologies related to the ornamental stone sector.

Keywords: life cycle assessment, LCA datasets, ornamental stone, stone environmental impact

Procedia PDF Downloads 198

8030 Cleaning of Scientific References in Large Patent Databases Using Rule-Based Scoring and Clustering

Authors: Emiel Caron

Abstract:

Patent databases contain patent-related data, organized in a relational data model, and are used to produce various patent statistics. These databases store raw data about scientific references cited by patents. For example, Patstat holds references to tens of millions of scientific journal publications and conference proceedings. These references might be used to connect patent databases with bibliographic databases, e.g. to study the relation between science, technology, and innovation in various domains. A problem in such studies is the low data quality of the references, i.e. they are often ambiguous, unstructured, and incomplete. Moreover, a complete bibliographic reference is stored in only one attribute. Therefore, a computerized cleaning and disambiguation method for large patent databases is developed in this work. The method uses rule-based scoring and clustering. The rules are based on bibliographic metadata, retrieved from the raw data by regular expressions, and are transparent and adaptable. The rules, in combination with string similarity measures, are used to detect pairs of records that are potential duplicates. Because of the scoring, different rules can be combined to join scientific references, i.e. the rules reinforce each other. The scores are based on expert knowledge and initial method evaluation. After the scoring, pairs of scientific references that score above a certain threshold are clustered by means of a single-linkage clustering algorithm to form connected components. The method is designed to disambiguate all the scientific references in the Patstat database. The performance evaluation of the clustering method, on a large golden set with highly cited papers, shows on average 99% precision and 95% recall. The method is therefore accurate but careful, i.e. it weighs precision over recall. Consequently, separate clusters of high precision are sometimes formed when there is not enough evidence for connecting scientific references, e.g. in the case of missing year and journal information for a reference. The clusters produced by the method can be used to directly link the Patstat database with bibliographic databases such as the Web of Science or Scopus.
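The general idea of scoring record pairs with rules and then taking connected components can be sketched as below. This is only an illustration under stated assumptions, not the author's rule set: the records, the three rules, their weights, the 0.9 similarity cutoff, and the threshold of 3.0 are all hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever string similarity measures the paper uses.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical cleaned references; None marks missing metadata.
records = [
    {"id": 0, "title": "Deep learning", "year": "2015", "journal": "Nature"},
    {"id": 1, "title": "Deep learning.", "year": "2015", "journal": "Nature"},
    {"id": 2, "title": "Deep learning", "year": None, "journal": None},
    {"id": 3, "title": "Graph databases", "year": "2013", "journal": None},
]


def pair_score(a, b):
    """Sum the weights of all rules that fire for a pair (weights are made up)."""
    score = 0.0
    ratio = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
    if ratio > 0.9:  # rule 1: near-identical titles
        score += 2.0
    if a["year"] and a["year"] == b["year"]:  # rule 2: same known year
        score += 1.0
    if a["journal"] and a["journal"] == b["journal"]:  # rule 3: same known journal
        score += 1.0
    return score


def cluster(records, threshold=3.0):
    """Link pairs scoring at or above threshold; return connected components."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if pair_score(a, b) >= threshold:
            parent[find(a["id"])] = find(b["id"])

    groups = {}
    for r in records:
        groups.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(groups.values())


print(cluster(records))
```

In this toy data, record 2 matches on title but lacks year and journal, so its score stays below the threshold and it remains in its own cluster, which mirrors the precision-over-recall behaviour described in the abstract.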

Keywords: clustering, data cleaning, data disambiguation, data mining, patent analysis, scientometrics

Procedia PDF Downloads 167

8029 Recommender System Based on Mining Graph Databases for Data-Intensive Applications

Authors: Mostafa Gamal, Hoda K. Mohamed, Islam El-Maddah, Ali Hamdi

Abstract:

In recent years, many digital documents on the web have been created due to the rapid growth of 'social applications' communities or 'data-intensive applications'. The evolution of online multimedia data poses new challenges in storing and querying large amounts of data for online recommender systems. Graph data models have been shown to be more efficient than relational data models for processing complex data. This paper explains the key differences between graph and relational databases, their strengths and weaknesses, and why graph databases are the best technology for building a real-time recommendation system. The paper also discusses several similarity metric algorithms that can be used to compute a similarity score for pairs of nodes based on their neighbourhoods or their properties. Finally, the paper explores how NLP strategies can improve the accuracy and coverage of real-time recommendations by extracting information from stored unstructured knowledge, which makes up the bulk of the world's data, and using it to enrich the graph database. As the size and number of data items are increasing rapidly, the proposed system should meet current and future needs.
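One common neighbourhood-based similarity metric is the Jaccard index, sketched below. The abstract does not say which metrics the paper covers, so this is an assumed example; the toy graph, the node names, and the function name are hypothetical.

```python
# Hypothetical bipartite graph: each user maps to the set of items
# they are connected to (their neighbourhood).
graph = {
    "alice": {"item_a", "item_b", "item_c"},
    "bob": {"item_b", "item_c", "item_d"},
    "carol": {"item_e"},
}


def jaccard(graph, u, v):
    """Size of the shared neighbourhood divided by the size of the union."""
    nu = graph.get(u, set())
    nv = graph.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


print(jaccard(graph, "alice", "bob"))
print(jaccard(graph, "alice", "carol"))
```

A recommender could rank the items of the most similar users that the target user has not yet seen; that step is omitted here.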

Keywords: graph databases, NLP, recommendation systems, similarity metrics

Procedia PDF Downloads 68

8028 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 363

8027 A Novel Framework for User-Friendly Ontology-Mediated Access to Relational Databases

Authors: Efthymios Chondrogiannis, Vassiliki Andronikou, Efstathios Karanastasis, Theodora Varvarigou

Abstract:

A large amount of data is typically stored in relational databases (DB). The latter can efficiently handle user queries which intend to elicit the appropriate information from data sources. However, direct access to and use of this data requires the end users to have an adequate technical background, and they must also cope with the internal data structure and values presented. Consequently, information retrieval is quite a difficult process even for IT or DB experts, taking into account the limited contributions of relational databases from the conceptual point of view. Ontologies enable users to formally describe a domain of knowledge in terms of concepts and relations among them, and hence they can be used for unambiguously specifying the information captured by the relational database. However, accessing information residing in a database using ontologies is feasible only provided that the users are keen on using semantic web technologies. To enable users from different disciplines to retrieve the appropriate data, the design of a Graphical User Interface is necessary. In this work, we present an interactive, ontology-based, semantically enabled web tool that can be used for information retrieval purposes. The tool is based entirely on the ontological representation of the underlying database schema, and it provides a user-friendly environment through which users can graphically form and execute their queries.

Keywords: ontologies, relational databases, SPARQL, web interface

Procedia PDF Downloads 246

8026 An Analysis of Sequential Pattern Mining on Databases Using Approximate Sequential Patterns

Authors: J. Suneetha, Vijayalaxmi

Abstract:

Sequential pattern mining involves applying data mining methods to large data repositories to extract usage patterns. Sequential pattern mining methodologies are used to analyze the data and identify patterns. The patterns have been used to implement efficient systems that can make recommendations based on previously observed patterns, make predictions, improve the usability of systems, detect events, and in general help in making strategic product decisions. This paper examines the performance of approximate sequential pattern mining, defined as identifying patterns approximately shared by many sequences. Approximate sequential patterns can effectively summarize and represent the databases by identifying the underlying trends in the data. An extensive and systematic performance study is conducted over synthetic and real data. The results demonstrate that ApproxMAP is effective and scalable in mining large sequence databases with long patterns.

Keywords: multiple data, performance analysis, sequential pattern, sequence database scalability

Procedia PDF Downloads 302

8025 Generating Insights from Data Using a Hybrid Approach

Authors: Allmin Susaiyah, Aki Härmä, Milan Petković

Abstract:

Automatic generation of insights from data using insight mining systems (IMS) is useful in many applications, such as personal health tracking, patient monitoring, and business process management. Existing IMS face challenges in controlling insight extraction, scaling to large databases, and generalising to unseen domains. In this work, we propose a hybrid approach consisting of rule-based and neural components for generating insights from data while overcoming the aforementioned challenges. Firstly, a rule-based data2CNL component is used to extract statistically significant insights from data and represent them in a controlled natural language (CNL). Secondly, a BERTSum-based CNL2NL component is used to convert these CNLs into natural language texts. We improve the model using task-specific and domain-specific fine-tuning. Our approach has been evaluated using statistical techniques and standard evaluation metrics. We overcame the aforementioned challenges and observed significant improvement with domain-specific fine-tuning.

Keywords: data mining, insight mining, natural language generation, pre-trained language models

Procedia PDF Downloads 72

8024 The Role of Cyfra 21-1 in Diagnosing Non Small Cell Lung Cancer (NSCLC)

Authors: H. J. T. Kevin Mozes, Dyah Purnamasari

Abstract:

Background: Lung cancer is the fourth most common cancer in Indonesia. Non-small cell lung cancer (NSCLC) accounts for 85% of all lung cancer cases. The indistinct signs and symptoms of NSCLC sometimes lead to misdiagnosis. The gold standard for diagnosing NSCLC is histopathological biopsy, which is invasive. Cyfra 21-1 is a tumor marker that can be found in the intermediate protein structure of the epithelium. The accuracy of Cyfra 21-1 in diagnosing NSCLC is not yet known, so this report seeks to answer that question. Methods: A literature search was performed using the online databases Proquest and Pubmed. Literature was then selected according to inclusion and exclusion criteria. The selected literature was appraised using the criteria of validity and importance. Results: Of the six journals appraised, five are valid. The sensitivity reported across the five ranges from 50% to 84.5%, while the specificity ranges from 87.8% to 94.4%. The likelihood ratio across the appraised literature ranges from 5.09 to 10.54, which is categorized as Intermediate High. Conclusion: Serum Cyfra 21-1 is a sensitive and very specific tumor marker for the diagnosis of non-small cell lung cancer (NSCLC).
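For readers unfamiliar with how a likelihood ratio follows from sensitivity and specificity, here is a minimal sketch using the standard definitions. It assumes the figure reported above is the positive likelihood ratio, which the abstract does not state, and the input values are hypothetical rather than taken from any of the appraised studies.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity), with inputs as fractions."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity, with inputs as fractions."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    return (1.0 - sensitivity) / specificity


# Hypothetical study: 70% sensitivity, 90% specificity.
print(round(positive_likelihood_ratio(0.70, 0.90), 2))  # about 7
print(round(negative_likelihood_ratio(0.70, 0.90), 2))  # about 0.33
```

Because each study has its own sensitivity and specificity pair, a likelihood ratio has to be computed per study; the ends of the reported sensitivity and specificity ranges cannot simply be combined.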

Keywords: cyfra 21-1, diagnosis, nonsmall cell lung cancer, NSCLC, tumor marker

Procedia PDF Downloads 202

8023 3D Object Retrieval Based on Similarity Calculation in 3D Computer Aided Design Systems

Authors: Ahmed Fradi

Abstract:

Recent technological advances in the acquisition, modeling, and processing of three-dimensional (3D) object data have led to the creation of models stored in huge databases, which are used in various domains such as computer vision, augmented reality, the game industry, medicine, CAD (computer-aided design), and 3D printing. At the same time, industry is benefiting from powerful modeling tools that enable designers to produce 3D models easily and quickly. The great ease of acquiring and modeling 3D objects makes it possible to create large 3D model databases, which then become difficult to navigate. Indexing 3D objects therefore appears to be a necessary and promising solution for managing this type of data, extracting model information, retrieving an existing model, or calculating the similarity between 3D objects. The objective of the proposed research is to develop a framework allowing easy and fast access to 3D objects in a CAD model database, with a specific indexing algorithm to find objects similar to a reference model. Our main objectives are to study existing methods for calculating the similarity of 3D objects (essentially shape-based methods), specifying the characteristics of each method as well as the differences between them, and then to propose a new approach for indexing and comparing 3D models that is suitable for our case study and is based on some of the previously studied methods. Our proposed approach is finally illustrated by an implementation and evaluated in a professional context.
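To make the idea of retrieving models similar to a reference concrete, the sketch below ranks indexed models by the cosine similarity of their shape descriptors. The abstract does not say which descriptor or similarity measure the authors use, so the fixed-length numeric descriptors, the model names, and the choice of cosine similarity are assumptions for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length descriptor vectors."""
    if len(u) != len(v):
        raise ValueError("descriptors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero descriptor carries no shape information
    return dot / (norm_u * norm_v)


def rank_by_similarity(reference, index):
    """Return (model_name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(reference, desc)) for name, desc in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical index: model name -> precomputed shape descriptor.
index = {
    "bracket": [1.0, 0.0, 0.0],
    "gear": [0.0, 1.0, 0.0],
    "plate": [1.0, 1.0, 0.0],
}
for name, score in rank_by_similarity([1.0, 0.0, 0.0], index):
    print(name, round(score, 3))
```

In a real CAD database the descriptors would be computed once at indexing time, so that a query only compares the reference descriptor against stored vectors.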

Keywords: CAD, 3D object retrieval, shape based retrieval, similarity calculation

Procedia PDF Downloads 230

8022 Ontology-Driven Knowledge Discovery and Validation from Admission Databases: A Structural Causal Model Approach for Polytechnic Education in Nigeria

Authors: Bernard Igoche Igoche, Olumuyiwa Matthew, Peter Bednar, Alexander Gegov

Abstract:

This study presents an ontology-driven approach for knowledge discovery and validation from admission databases in Nigerian polytechnic institutions. The research aims to address the challenges of extracting meaningful insights from vast amounts of admission data and utilizing them for decision-making and process improvement. The proposed methodology combines the knowledge discovery in databases (KDD) process with a structural causal model (SCM) ontological framework. The admission database of Benue State Polytechnic Ugbokolo (Benpoly) is used as a case study. The KDD process is employed to mine and distill knowledge from the database, while the SCM ontology is designed to identify and validate the important features of the admission process. The SCM validation is performed using the conditional independence test (CIT) criteria, and an algorithm is developed to implement the validation process. The identified features are then used for machine learning (ML) modeling and prediction of admission status. The results demonstrate the adequacy of the SCM ontological framework in representing the admission process and the high predictive accuracies achieved by the ML models, with k-nearest neighbors (KNN) and support vector machine (SVM) achieving 92% accuracy. The study concludes that the proposed ontology-driven approach contributes to the advancement of educational data mining and provides a foundation for future research in this domain.
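The prediction step can be pictured with a small k-nearest-neighbours classifier, one of the two model families named above. This is a generic sketch, not the study's pipeline: the two numeric features, the training records, and the labels are invented, and the real features are those validated by the SCM ontology, which the abstract does not list.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k training records
    closest to `query` in Euclidean distance.

    `train` is a list of (feature_tuple, label) pairs.
    """
    if k <= 0 or k > len(train):
        raise ValueError("k must be between 1 and the number of training records")
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Invented records: (normalised exam score, normalised interview score).
train = [
    ((0.9, 0.8), "admitted"),
    ((0.8, 0.9), "admitted"),
    ((0.7, 0.7), "admitted"),
    ((0.2, 0.1), "rejected"),
    ((0.1, 0.3), "rejected"),
]
print(knn_predict(train, (0.85, 0.75)))  # admitted
print(knn_predict(train, (0.15, 0.20)))  # rejected
```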

Keywords: admission databases, educational data mining, machine learning, ontology-driven knowledge discovery, polytechnic education, structural causal model

Procedia PDF Downloads 17

8021 A Framework for SQL Learning: Linking Learning Taxonomy, Cognitive Model and Cross Cutting Factors

Authors: Huda Al Shuaily, Karen Renaud

Abstract:

Databases comprise the foundation of most software systems. System developers inevitably write code to query these databases. The de facto language for querying is SQL and this, consequently, is the default language taught by higher education institutions. There is evidence that learners find it hard to master SQL, harder than mastering other programming languages such as Java. Educators do not agree about explanations for this seeming anomaly. Further investigation may well reveal the reasons. In this paper, we report on our investigations into how novices learn SQL, the actual problems they experience when writing SQL, as well as the differences between expert and novice SQL query writers. We conclude by presenting a model of SQL learning that should inform the instructional material design process better to support the SQL learning process.

Keywords: pattern, SQL, learning, model

Procedia PDF Downloads 230

8020 Formulation of a Rapid Earthquake Risk Ranking Criteria for National Bridges in the National Capital Region Affected by the West Valley Fault Using GIS Data Integration

Authors: George Mariano Soriano

Abstract:

In this study, a Rapid Earthquake Risk Ranking Criteria was formulated by integrating various existing maps and databases by the Department of Public Works and Highways (DPWH) and Philippine Institute of Volcanology and Seismology (PHIVOLCS). Utilizing Geographic Information System (GIS) software, the above-mentioned maps and databases were used in extracting seismic hazard parameters and bridge vulnerability characteristics in order to rank the seismic damage risk rating of bridges in the National Capital Region.

Keywords: bridge, earthquake, GIS, hazard, risk, vulnerability

Procedia PDF Downloads 373

8019 Patient-Specific Modeling Algorithm for Medical Data Based on AUC

Authors: Guilherme Ribeiro, Alexandre Oliveira, Antonio Ferreira, Shyam Visweswaran, Gregory Cooper

Abstract:

Patient-specific models are instance-based learning algorithms that take advantage of the particular features of the patient case at hand to predict an outcome. We introduce two patient-specific algorithms based on the decision tree paradigm that use the area under the ROC curve (AUC) as a metric to select an attribute. We apply the patient-specific algorithms to predict outcomes in several datasets, including medical datasets. Compared to the patient-specific decision path (PSDP) entropy-based and CART methods, the AUC-based patient-specific decision path models performed equivalently on AUC. Our results provide support for patient-specific methods being a promising approach for making clinical predictions.
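A rough sketch of using AUC to choose an attribute follows. It computes AUC through the pairwise (Mann-Whitney) formulation and picks the attribute whose values separate the two classes best. This is only an illustration of the general idea; the selection rule, the attribute names, and the data are assumptions, and the authors' patient-specific decision path algorithms are not reproduced here.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def select_attribute(rows, labels):
    """Pick the attribute whose AUC is farthest from 0.5 (hypothetical rule)."""
    attributes = rows[0].keys()
    return max(
        attributes,
        key=lambda a: abs(auc(labels, [row[a] for row in rows]) - 0.5),
    )


# Invented patient records and outcomes (1 = event, 0 = no event).
rows = [
    {"age": 70, "marker": 0.9},
    {"age": 50, "marker": 0.8},
    {"age": 65, "marker": 0.2},
    {"age": 55, "marker": 0.1},
]
labels = [1, 1, 0, 0]
print(select_attribute(rows, labels))  # marker
```

Measuring distance from 0.5 rather than raw AUC treats an attribute that ranks cases in the opposite direction as equally informative.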

Keywords: approach instance-based, area under the ROC curve, patient-specific decision path, clinical predictions

Procedia PDF Downloads 446

8018 A Protein-Wave Alignment Tool for Frequency Related Homologies Identification in Polypeptide Sequences

Authors: Victor Prevost, Solene Landerneau, Michel Duhamel, Joel Sternheimer, Olivier Gallet, Pedro Ferrandiz, Marwa Mokni

Abstract:

The search for homologous proteins is one of the ongoing challenges in biology and bioinformatics. Traditionally, a pair of proteins is thought to be homologous when they originate from the same ancestral protein. In such a case, their sequences share similarities, and advanced scientific research effort is spent to investigate this question. On this basis, we propose the Protein-Wave Alignment Tool ("P-WAT") developed within the framework of the France Relance 2030 plan. Our work takes into consideration the mass-related wave aspect of protein biosynthesis, by associating specific frequencies to each amino acid according to its mass. Amino acids are then regrouped within their mass category. This way, our algorithm produces specific alignments in addition to those obtained with a common amino acid coding system. For this purpose, we develop the original "P-WAT" algorithm, able to address large protein databases, with different attributes such as species, protein names, etc. that allow us to align users' requests with a set of specific protein sequences. The primary intent of this algorithm is to achieve efficient alignments, in this specific conceptual frame, by minimizing execution costs and information loss. Our algorithm identifies sequence similarities by searching for matches of sub-sequences of different sizes, referred to as primers. Our algorithm relies on Boolean operations upon a dot plot matrix to identify primer amino acids common to both proteins which are likely to be part of a significant alignment of peptides. From those primers, dynamic programming-like traceback operations generate alignments and alignment scores based on an adjusted PAM250 matrix.
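The dot plot step can be sketched as a Boolean matrix whose diagonal runs of matches are candidate primers. The code below is a simplified illustration, not P-WAT: the function names and the minimum run length are assumptions, the scoring and traceback stages are omitted, and the only grouping shown is isoleucine with leucine, which share the same mass.

```python
def dot_plot(seq_a, seq_b, group=None):
    """Boolean matrix: True where residues match, optionally after mapping
    each residue to a category (for example a mass category)."""
    def key(residue):
        return group.get(residue, residue) if group else residue
    return [[key(a) == key(b) for b in seq_b] for a in seq_a]


def diagonal_runs(matrix, min_len=3):
    """Return (row, col, length) for each maximal diagonal run of True
    values that is at least `min_len` long."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    runs = []
    for i in range(rows):
        for j in range(cols):
            if not matrix[i][j]:
                continue
            if i > 0 and j > 0 and matrix[i - 1][j - 1]:
                continue  # inside a run that started earlier on this diagonal
            length = 0
            while (i + length < rows and j + length < cols
                   and matrix[i + length][j + length]):
                length += 1
            if length >= min_len:
                runs.append((i, j, length))
    return runs


seq_a, seq_b = "MKTAYIAK", "GGKTAYLL"
# Exact matching finds the shared segment KTAY.
print(diagonal_runs(dot_plot(seq_a, seq_b)))
# Treating I and L as one category extends the run by one residue.
same_mass = {"I": "IL", "L": "IL"}
print(diagonal_runs(dot_plot(seq_a, seq_b, same_mass)))
```

Grouping residues into categories makes the matrix denser, so more and longer candidate runs appear than with exact residue matching.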

Keywords: protein, alignment, homologous, Genodic

Procedia PDF Downloads 77

8017 JREM: An Approach for Formalising Models in the Requirements Phase with JSON and NoSQL Databases

Authors: Aitana Alonso-Nogueira, Helia Estévez-Fernández, Isaías García

Abstract:

This paper presents an approach to reduce some of the current flaws in the requirements phase of the software development process. It takes the software requirements of an application, builds a conceptual model of them, and formalizes the model as JSON documents. This formal model is stored in a document-oriented NoSQL database, MongoDB, chosen for its advantages in flexibility and efficiency. In addition, this paper underlines the contributions of the detailed approach and shows some applications and benefits for future work in the field of automatic code generation using model-driven engineering tools.
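To give a feel for what formalizing a requirement as a JSON document might look like, here is a minimal sketch using only the Python standard library. The field names and the set of required fields are hypothetical, since the abstract does not publish the JREM schema, and storing the document in MongoDB is left out.

```python
import json

# Hypothetical minimal schema for one requirement document.
REQUIRED_FIELDS = {"id", "type", "description", "actors"}


def to_requirement_document(requirement: dict) -> str:
    """Check that the required fields are present and serialise to JSON."""
    missing = REQUIRED_FIELDS - requirement.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return json.dumps(requirement, sort_keys=True, indent=2)


requirement = {
    "id": "REQ-001",
    "type": "functional",
    "description": "A registered user can reset their password by email.",
    "actors": ["registered user"],
    "depends_on": [],
}
print(to_requirement_document(requirement))
```

A document-oriented store accepts such documents without a fixed table layout, which is the flexibility the abstract refers to: optional fields like `depends_on` can be added per requirement.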

Keywords: conceptual modelling, JSON, NoSQL databases, requirements engineering, software development

Procedia PDF Downloads 352

8016 Knowledge Discovery from Production Databases for Hierarchical Process Control

Authors: Pavol Tanuska, Pavel Vazan, Michal Kebisek, Dominika Jurovata

Abstract:

The paper presents the results of a project focused on using knowledge discovered from production systems for the needs of hierarchical process control. One of the main project goals was to propose a knowledge discovery model for process control. Specific data mining methods and techniques were used for defined process control problems. The knowledge gained was applied to a real production system, and the proposed solution was thereby verified. The paper documents how newly discovered knowledge can be applied in real hierarchical process control. The opportunities for applying the proposed knowledge discovery model to hierarchical process control are also specified.

Keywords: hierarchical process control, knowledge discovery from databases, neural network, process control

Procedia PDF Downloads 445

8015 De-Novo Structural Elucidation from Mass/NMR Spectra

Authors: Ismael Zamora, Elisabeth Ortega, Tatiana Radchenko, Guillem Plasencia

Abstract:

Structure elucidation of unknown substances based on mass spectra (MS) data is an unresolved problem that affects many different fields of application. A recent overview of the software available for structure elucidation of small molecules has shown the demand for an efficient computational tool able to perform structure elucidation of unknown small molecules and peptides. We developed an algorithm for De-Novo fragment analysis based on MS data that proposes a set of scored and ranked structures compatible with the MS and MSMS spectra. Several different algorithms were developed depending on the initial set of fragments and the structure-building processes. In all cases, several scores for the final molecule ranking were computed. The algorithms were validated with small and medium-sized databases (DB) using the eleven test set compounds. Similar results were obtained from any of the databases that contained the fragments of the expected compound. In summary, we present an algorithm for De-Novo fragment analysis based on mass spectrometry (MS) data only; it proposes a set of scored and ranked structures, was validated on different types of databases, and showed good results as a proof of concept. Moreover, the solutions proposed from mass spectrometry were submitted to NMR spectra prediction in order to determine which of the proposed structures was compatible with the collected NMR spectra.

Keywords: De Novo, structure elucidation, mass spectrometry, NMR

Procedia PDF Downloads 253

8014 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis continuously provide huge volumes of biological data, which are available on the web. Such advances involve and require powerful data integration techniques to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires complex queries across multiple autonomous, heterogeneous, and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 409