Search results for: data mining analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25603

24763 Knowledge Discovery from Production Databases for Hierarchical Process Control

Authors: Pavol Tanuska, Pavel Vazan, Michal Kebisek, Dominika Jurovata

Abstract:

The paper presents the results of a project focused on using knowledge discovered from production systems for the needs of hierarchical process control. One of the main project goals was to propose a knowledge discovery model for process control. Specific data mining methods and techniques were used for defined process control problems. The gained knowledge was applied to a real production system, and thus the proposed solution was verified. The paper documents how newly discovered knowledge can be applied in real hierarchical process control. It also specifies the opportunities for applying the proposed knowledge discovery model to hierarchical process control.

Keywords: hierarchical process control, knowledge discovery from databases, neural network, process control

Procedia PDF Downloads 481
24762 Copyright Clearance for Artificial Intelligence Training Data: Challenges and Solutions

Authors: Erva Akin

Abstract:

The use of copyrighted material for machine learning purposes is a challenging issue in the field of artificial intelligence (AI). While machine learning algorithms require large amounts of data to train and improve their accuracy and creativity, the use of copyrighted material without permission from the authors may infringe on their intellectual property rights. To overcome the copyright hurdles to data sharing, access, and re-use, the use of copyrighted material for machine learning purposes may be considered permissible under certain circumstances. For example, if the copyright holder has given permission to use the data through a licensing agreement, then the use for machine learning purposes may be lawful. It is also argued that copying for non-expressive purposes that do not involve conveying expressive elements to the public, such as automated data extraction, should not be seen as infringing. The focus of such ‘copy-reliant technologies’ is on understanding language rules, styles, and syntax, and no creative ideas are being used. However, the non-expressive use defense sits within the framework of the fair use doctrine, which allows the use of copyrighted material for research or educational purposes. Questions arise because the fair use doctrine is not available in EU law; instead, the InfoSoc Directive provides for a rigid system of exclusive rights with a list of exceptions and limitations. One could only argue that non-expressive uses of copyrighted material for machine learning purposes do not constitute a ‘reproduction’ in the first place. Nevertheless, the use of machine learning with copyrighted material is difficult because EU copyright law applies to the mere use of the works. Two solutions can be proposed to address the problem of copyright clearance for AI training data.
The first is to introduce a broad exception for text and data mining, either mandatorily or for commercial and scientific purposes, or to permit the reproduction of works for non-expressive purposes. The second is that copyright laws should permit the reproduction of works for non-expressive purposes, which opens the door to discussions regarding the transposition of the fair use principle from the US into EU law. Both solutions aim to provide more space for AI developers to operate and encourage greater freedom, which could lead to more rapid innovation in the field. The Data Governance Act presents a significant opportunity to advance these debates. Finally, issues concerning the balance of general public interests and legitimate private interests in machine learning training data must be addressed. In my opinion, it is crucial that robot-creation output should fall into the public domain. Machines depend on human creativity, innovation, and expression. To encourage technological advancement and innovation, freedom of expression and business operation must be prioritised.

Keywords: artificial intelligence, copyright, data governance, machine learning

Procedia PDF Downloads 83
24761 Factors of Social Media Platforms on Consumer Behavior

Authors: Zebider Asire Munyelet, Yibeltal Chanie Manie

Abstract:

In the modern digital landscape, the rise of social media platforms has become synonymous with the evolution of online consumer behavior. This study investigates the complex relationship between social media and the purchasing decisions of online buyers. Through an extensive review of existing literature and empirical research, the aim is to comprehensively analyze the multidimensional impact that social media exerts on the various stages of the online buyer's journey. The investigation explores how social media platforms serve as influential channels for information dissemination, product discovery, and consumer engagement. Additionally, the study delves into the psychological aspects underlying the role of social media in shaping buyer preferences, perceptions, and trust in online transactions. The methodologies employed include both quantitative and qualitative analyses, incorporating surveys, interviews, and data analytics to derive meaningful insights. Statistical models are applied to identify patterns in online buyer behavior concerning product awareness, brand loyalty, and decision-making processes. The expected outcomes of this research not only contribute to the academic understanding of the dynamic interplay between social media and online buyer behavior but also offer practical implications for marketers, e-commerce platforms, and policymakers.

Keywords: consumer behavior, social media, online purchasing, online transaction

Procedia PDF Downloads 76
24760 Digital Transformation: Actionable Insights to Optimize the Building Performance

Authors: Jovian Cheung, Thomas Kwok, Victor Wong

Abstract:

Buildings are entwined with smart city developments. Building performance relies heavily on electrical and mechanical (E&M) systems and services, accounting for about 40 percent of global energy use. By integrating technological advances and energy- and operation-efficiency initiatives into buildings, people can raise building performance and enhance the sustainability of the built environment in their daily lives. Digital transformation in buildings is a profound development through which the city leverages the changes and opportunities of digital technologies. To optimize building performance, an intelligent power quality and energy management system has been developed for transforming data into actions. The system is formed by interfacing and integrating legacy metering and internet of things technologies in the building and applying big data techniques. It provides the operation and energy profile of a building together with actionable insights, which make it possible to optimize building performance by raising people's awareness of E&M services and energy consumption, predicting the operation of E&M systems, benchmarking building performance, and prioritizing asset and energy management opportunities. The intelligent power quality and energy management system comprises four elements, namely the Integrated Building Performance Map, Building Performance Dashboard, Power Quality Analysis, and Energy Performance Analysis. It provides a predictive operation sequence for E&M systems in response to the built environment and building activities. The system collects the live operating conditions of E&M systems over time to identify abnormal system performance, predict failure trends, and alert users before an anticipated system failure. The actionable insights collected can also be used for system design enhancement in the future.
This paper will illustrate how the intelligent power quality and energy management system provides an operation and energy profile to optimize building performance and actionable insights to revitalize an existing building into a smart building. The system is driving building performance optimization and supporting the development of Hong Kong into an admired smart city.

Keywords: intelligent buildings, internet of things technologies, big data analytics, predictive operation and maintenance, building performance

Procedia PDF Downloads 157
24759 Recent Findings of Late Bronze Age Mining and Archaeometallurgy Activities in the Mountain Region of Colchis (Southern Lechkhumi, Georgia)

Authors: Rusudan Chagelishvili, Nino Sulava, Tamar Beridze, Nana Rezesidze, Nikoloz Tatuashvili

Abstract:

The South Caucasus is one of the most important centers of prehistoric metallurgy, known for its Colchian bronze culture. Modern Lechkhumi (historical Mountainous Colchis), where the existence of prehistoric metallurgy is confirmed by the discovery of many artifacts, is a part of this area. Studies focused on prehistoric smelting sites, related artefacts, and ore deposits have been conducted in Lechkhumi during the last ten years. More than 20 prehistoric smelting sites and artefacts associated with metallurgical activities (ore roasting furnaces, slags, crucible, and tuyères fragments) have been identified so far. Within the framework of integrated studies, it was established that these sites were operating in the 13th-9th centuries B.C. and were used for copper smelting. Palynological studies of slags revealed that chestnut (Castanea sativa) and hornbeam (Carpinus sp.) wood were used as smelting fuel. Geological exploration-analytical studies revealed that copper ore mining, processing, and smelting sites were distributed close to each other. Despite recent complex data, signs of prehistoric mines (trenches) have not been found in this part of the study area so far. Since 2018, the archaeological-geological exploration has been focused on the southern part of Lechkhumi and has covered the areas of the villages Okureshi and Opitara. Several copper smelting sites (Okureshi 1 and 2, Opitara 1), as well as a Colchian Bronze culture settlement, have been identified here. Three mine workings have been found in the narrow gorge of the river Rtkhmelebisgele in the vicinity of the village Opitara. In order to establish a link between the Opitara-Okureshi archaeometallurgical sites, Late Bronze Age settlements, and mines, various scientific analytical methods have been applied: petrography of mineralized rock and slags, and atomic absorption spectrophotometry (AAS) analysis.
The careful examination of Opitara mine workings revealed that there is a striking difference between the mine #1 on the right bank of the river and mines #2 and #3 on the left bank. The first one has all characteristic features of the Soviet period mine working (e. g. high portal with angular ribs and roof showing signs of blasting). In contrast, mines #2 and #3, which are located very close to each other, have round-shaped portals/entrances, low roofs, and fairly smooth ribs and are filled with thick layers of river sediments and collapsed weathered rock mass. A thorough review of the publications related to prehistoric mine workings revealed some striking similarities between mines #2 and #3 with their worldwide analogues. Apparently, the ore extraction from these mines was conducted by fire-setting applying primitive tools. It was also established that mines are cut in Jurassic mineralized volcanic rocks. Ore minerals (chalcopyrite, pyrite, galena) are related to calcite and quartz veins. The results obtained through the petrochemical and petrography studies of mineralized rock samples from Opitara mines and prehistoric slags are in complete correlation with each other, establishing the direct link between copper mining and smelting within the study area. Acknowledgment: This work was supported by the Shota Rustaveli National Science Foundation of Georgia (grant # FR-19-13022).

Keywords: archaeometallurgy, Mountainous Colchis, mining, ore minerals

Procedia PDF Downloads 180
24758 Blue Economy and Marine Mining

Authors: Fani Sakellariadou

Abstract:

The Blue Economy includes all marine-based and marine-related activities. They correspond to established, emerging as well as unborn ocean-based industries. Seabed mining is an emerging marine-based activity; its operations depend particularly on cutting-edge science and technology. The 21st century will face a crisis in resources as a consequence of the world’s population growth and the rising standard of living. The natural capital stored in the global ocean is decisive for it to provide a wide range of sustainable ecosystem services. Seabed mineral deposits were identified as having a high potential for critical elements and base metals. They have a crucial role in the fast evolution of green technologies. The major categories of marine mineral deposits are deep-sea deposits, including cobalt-rich ferromanganese crusts, polymetallic nodules, phosphorites, and deep-sea muds, as well as shallow-water deposits including marine placers. Seabed mining operations may take place within continental shelf areas of nation-states. In international waters, the International Seabed Authority (ISA) has entered into 15-year contracts for deep-seabed exploration with 21 contractors. These contracts are for polymetallic nodules (18 contracts), polymetallic sulfides (7 contracts), and cobalt-rich ferromanganese crusts (5 contracts). Exploration areas are located in the Clarion-Clipperton Zone, the Indian Ocean, the Mid Atlantic Ridge, the South Atlantic Ocean, and the Pacific Ocean. Potential environmental impacts of deep-sea mining include habitat alteration, sediment disturbance, plume discharge, toxic compounds release, light and noise generation, and air emissions. 
They could cause burial and smothering of benthic species, health problems for marine species, biodiversity loss, reduced photosynthetic mechanism, behavior change and masking of acoustic communication for mammals and fish, heavy metals bioaccumulation up the food web, a decrease in the content of dissolved oxygen, and climate change. An important concern related to deep-sea mining is our knowledge gap regarding deep-sea bio-communities. The ecological consequences for the remote, unique, fragile, and little-understood deep-sea ecosystems and their inhabitants are still largely unknown. The blue economy conceptualizes oceans as developing spaces supplying socio-economic benefits for current and future generations but also protecting, supporting, and restoring biodiversity and ecological productivity. In that sense, people should apply holistic management and assess the impacts of marine mining on ecosystem services, including the categories of provisioning, regulating, supporting, and cultural services. Because of the variety in environmental parameters, the range in sea depth, the diversity in the characteristics of marine species, and the possible proximity to other existing maritime industries, marine mining impacts on the ability of ecosystems to support people and nature span a wide range. In conclusion, the use of the untapped potential of the global ocean demands a responsible and sustainable attitude. Moreover, there is a need to change our lifestyle and move beyond the philosophy of single-use. Living in a throw-away society based on a linear approach to resource consumption, humans are putting too much pressure on the natural environment. By applying modern, sustainable, and eco-friendly approaches according to the principle of the circular economy, a substantial amount of natural resource savings will be achieved. Acknowledgement: This work is part of the MAREE project, financially supported by the Division VI of IUPAC.
This work has been partly supported by the University of Piraeus Research Center.

Keywords: blue economy, deep-sea mining, ecosystem services, environmental impacts

Procedia PDF Downloads 83
24757 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications: A Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its own syntax and structure. The objective is to explore various techniques and methodologies for validating, comparing, and integrating JSON data with XML data and vice versa. By understanding the process of checking JSON data against XML data, testers, developers, and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.
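As an illustration of what checking JSON data against XML data can look like, here is a minimal Python sketch. It is not the paper's method: the helper names (`xml_to_dict`, `normalize`, `json_matches_xml`) and the sample order documents are hypothetical, XML attributes are ignored, and JSON scalars are compared as strings because XML text carries no types.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Convert an XML element into nested dicts/strings (attributes ignored)."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = xml_to_dict(child)
        if child.tag in result:
            # Repeated tags become a list, mirroring a JSON array.
            existing = result[child.tag]
            if not isinstance(existing, list):
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def normalize(value):
    """Render JSON scalars as strings so they compare with XML text."""
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    if isinstance(value, bool):
        return "true" if value else "false"
    if value is None:
        return ""
    return str(value)


def json_matches_xml(json_text, xml_text):
    """Return True when both documents carry the same data."""
    json_data = normalize(json.loads(json_text))
    root = ET.fromstring(xml_text)
    xml_data = {root.tag: xml_to_dict(root)}
    return json_data == xml_data


# Hypothetical sample documents describing the same order.
json_doc = '{"order": {"id": 42, "item": ["pen", "ink"]}}'
xml_doc = "<order><id>42</id><item>pen</item><item>ink</item></order>"
print(json_matches_xml(json_doc, xml_doc))
```

A real test suite would also need a policy for attributes, namespaces, number formatting (for example `1` versus `1.0`), and single-element arrays, none of which this sketch handles.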

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 140
24756 Using Machine Learning Techniques to Extract Useful Information from Dark Data

Authors: Nigar Hussain

Abstract:

Dark data is a subset of big data: it is data that we fail to use for future decisions. Existing work has many open issues, and some of them call for powerful tools and adequate techniques for utilizing dark data, which would enable users to exploit its quality, adaptability, speed, reduced processing time, execution, and accessibility. Another issue is how to use dark data to extract helpful information and make better decisions. In this paper, we propose improved strategies to remove the dark side from dark data. Using a supervised model and machine learning techniques, we utilized dark data and achieved an F1 score of 89.48%.
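For readers unfamiliar with the reported metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows the standard computation in Python; the function name and the counts are hypothetical and are not taken from the paper's experiments.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Standard F1: harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct detections, 2 false alarms, 2 misses.
print(f1_score(8, 2, 2))
```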

Keywords: big data, dark data, machine learning, heatmap, random forest

Procedia PDF Downloads 28
24755 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion on individual levels. Thus, the companies opt to automize this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to a bag of n-gram representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. Transfer learning paradigm that has been commonly used in computer vision (CV) problems started to shape NLP approaches and language models (LM) lately. This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations that are obtained by training it on the large datasets using self-supervised learning objectives. The PTMs are further fine-tuned by a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). 
The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. BERT is a monolingual model in our case and is based on transformer neural networks. It uses masked language modeling and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction were not used, since the experiments showed that their contribution to model performance was insignificant even though Turkish is a highly agglutinative and inflective language. The results show that deep learning methods with pre-trained models and fine-tuning achieve about an 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy, while the MUSE model achieved around 88% and SVM around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.
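The traditional baseline mentioned above represents each comment as a bag of n-grams before passing it to a classifier such as an SVM. The following Python sketch shows only that representation step, assuming whitespace tokenization and word-level unigrams and bigrams; the function names and the two sample comments are hypothetical and are not drawn from the study's corpus.

```python
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a comment."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams


def bag_of_ngrams(text, vocabulary, n_max=2):
    """Count how often each vocabulary n-gram occurs in the comment."""
    counts = Counter(word_ngrams(text, n_max))
    return [counts[term] for term in vocabulary]


# Hypothetical comments; a real corpus would hold many labelled examples.
comments = ["great service", "not great service"]
vocabulary = sorted({gram for c in comments for gram in word_ngrams(c)})
vectors = [bag_of_ngrams(c, vocabulary) for c in comments]

print(vocabulary)
print(vectors)
```

The bigram "not great" is what lets a linear classifier separate the second comment from the first, which unigram counts alone would represent less distinctly.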

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 146
24754 Genomics of Aquatic Adaptation

Authors: Agostinho Antunes

Abstract:

The completion of the human genome sequencing in 2003 opened a new perspective on the importance of whole genome sequencing projects, and currently multiple species are having their genomes completely sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms also provides the framework to better understand the genetic makeup of such species and related ones, making it possible to explore the genetic changes underlying the evolution of diverse phenotypic traits. Here, recent results from our group, retrieved from comparative evolutionary genomic analyses of selected marine animal species, will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: comparative genomics, adaptive evolution, bioinformatics, phylogenetics, genome mining

Procedia PDF Downloads 533
24753 Decision Support System for Diagnosis of Breast Cancer

Authors: Oluwaponmile D. Alao

Abstract:

In this paper, two models have been developed to ascertain the best network for the diagnosis of breast cancer. Breast cancer is a disease that requires the attention of the medical practitioner. Experience has shown that misdiagnosis of the disease is a major challenge in the medical field. Therefore, designing a system with adequate performance will help make diagnosis of the disease faster and more accurate. In this paper, two models, a backpropagation neural network and a support vector machine, have been developed. The performance obtained is also compared with that of previously reported algorithms to ascertain the best algorithm.

Keywords: breast cancer, data mining, neural network, support vector machine

Procedia PDF Downloads 347
24752 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR), which will come into force on 25 May 2018, will create a new legal framework for the protection of personal data in the European Union. Article 20 of the GDPR introduces a right to data portability. This right allows data subjects to receive the personal data which they have provided to a data controller in a structured, commonly used, and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating the transfer of personal data between IT environments (e.g. applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 473
24751 Application of Building Information Modeling in Energy Management of Individual Departments Occupying University Facilities

Authors: Kung-Jen Tu, Danny Vernatha

Abstract:

To assist individual departments within universities in their energy management tasks, this study explores the application of Building Information Modeling in establishing the ‘BIM based Energy Management Support System’ (BIM-EMSS). The BIM-EMSS consists of six components: (1) sensors installed for each occupant and each piece of equipment, (2) electricity sub-meters (constantly logging the lighting, HVAC, and socket electricity consumption of each room), (3) BIM models of all rooms within individual departments’ facilities, (4) a data warehouse (for storing occupancy status and logged electricity consumption data), (5) a building energy management system that provides energy managers with various energy management functions, and (6) an energy simulation tool (such as eQuest) that generates real-time 'standard energy consumption' data against which 'actual energy consumption' data are compared and energy efficiency is evaluated. Through the building energy management system, the energy manager is able to (a) have a 3D visualization (BIM model) of each room, in which the occupancy and equipment status detected by the sensors and the logged electricity consumption data are displayed constantly; (b) perform real-time energy consumption analysis to compare the actual and standard energy consumption profiles of a space; (c) obtain energy consumption anomaly detection warnings for certain rooms so that corrective energy management actions can be taken (a data mining technique is employed to analyze the relation between the space occupancy pattern and the current space equipment setting to indicate an anomaly, such as when appliances turn on without occupancy); and (d) perform historical energy consumption analysis to review monthly and annual energy consumption profiles and compare them against historical energy profiles.
The BIM-EMSS was further implemented in a research lab in the Department of Architecture of NTUST in Taiwan, and the implementation results are presented to illustrate how it can be used to assist individual departments within universities in their energy management tasks.
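The anomaly example given above (appliances drawing power in an unoccupied room) can be illustrated with a simple rule. The Python sketch below is a stand-in for the data mining technique the abstract mentions, not a reproduction of it; the `RoomReading` record, the room names, and the 50 W standby threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RoomReading:
    room: str
    occupied: bool        # occupancy status reported by the sensors
    power_watts: float    # electricity draw logged by the sub-meter


def find_anomalies(readings, standby_watts=50.0):
    """Flag rooms that draw more than standby power while unoccupied."""
    return [
        reading.room
        for reading in readings
        if not reading.occupied and reading.power_watts > standby_watts
    ]


# Hypothetical snapshot of three rooms.
readings = [
    RoomReading("lab-101", occupied=True, power_watts=820.0),
    RoomReading("lab-102", occupied=False, power_watts=640.0),
    RoomReading("lab-103", occupied=False, power_watts=12.0),
]
print(find_anomalies(readings))
```

A fixed threshold is the simplest possible choice; a learned model could instead derive the expected consumption for each occupancy pattern from the logged history.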

Keywords: database, electricity sub-meters, energy anomaly detection, sensor

Procedia PDF Downloads 307
24750 Artificial Intelligence and Governance in Relevance to Satellites in Space

Authors: Anwesha Pathak

Abstract:

With the increasing number of satellites and space debris, space traffic management (STM) becomes crucial. AI can aid in STM by predicting and preventing potential collisions, optimizing satellite trajectories, and managing orbital slots. Governance frameworks need to address the integration of AI algorithms in STM to ensure safe and sustainable satellite activities. AI and governance play significant roles in the context of satellite activities in space. Artificial intelligence (AI) technologies, such as machine learning and computer vision, can be utilized to process vast amounts of data received from satellites. AI algorithms can analyse satellite imagery, detect patterns, and extract valuable information for applications like weather forecasting, urban planning, agriculture, disaster management, and environmental monitoring. AI can assist in automating and optimizing satellite operations. Autonomous decision-making systems can be developed using AI to handle routine tasks like orbit control, collision avoidance, and antenna pointing. These systems can improve efficiency, reduce human error, and enable real-time responsiveness in satellite operations. AI technologies can be leveraged to enhance the security of satellite systems. AI algorithms can analyze satellite telemetry data to detect anomalies, identify potential cyber threats, and mitigate vulnerabilities. Governance frameworks should encompass regulations and standards for securing satellite systems against cyberattacks and ensuring data privacy. AI can optimize resource allocation and utilization in satellite constellations. By analyzing user demands, traffic patterns, and satellite performance data, AI algorithms can dynamically adjust the deployment and routing of satellites to maximize coverage and minimize latency. Governance frameworks need to address fair and efficient resource allocation among satellite operators to avoid monopolistic practices. 
Satellite activities involve multiple countries and organizations. Governance frameworks should encourage international cooperation, information sharing, and standardization to address common challenges, ensure interoperability, and prevent conflicts. AI can facilitate cross-border collaborations by providing data analytics and decision support tools for shared satellite missions and data sharing initiatives. AI and governance are critical aspects of satellite activities in space. They enable efficient and secure operations, ensure responsible and ethical use of AI technologies, and promote international cooperation for the benefit of all stakeholders involved in the satellite industry.

Keywords: satellite, space debris, traffic, threats, cyber security

Procedia PDF Downloads 76
24749 A Comparative Study on Automatic Feature Classification Methods of Remote Sensing Images

Authors: Lee Jeong Min, Lee Mi Hee, Eo Yang Dam

Abstract:

Geospatial feature extraction is a very important issue in remote sensing research. Image classification has traditionally been based on statistical techniques, but in recent years data mining and machine learning techniques for automated image processing have been applied to remote sensing, with a focus on the possibility of improved results. In this study, artificial neural network and decision tree techniques are applied to classify high-resolution satellite images, the results are compared with those of maximum likelihood classification (MLC), which is a statistical technique, and the pros and cons of each technique are analyzed.
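As background for the statistical baseline, maximum likelihood classification assigns each pixel to the class whose fitted distribution gives the pixel value the highest likelihood. The Python sketch below shows the idea for a single spectral band with one Gaussian per class; real MLC uses multivariate Gaussians over several bands, and the class names and training values here are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_class(samples):
    """Estimate the mean and variance of one class from training pixels."""
    return mean(samples), pvariance(samples)


def log_likelihood(value, mu, var):
    """Log of the Gaussian density N(mu, var) evaluated at value."""
    return -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)


def classify(value, models):
    """Pick the class with the highest log-likelihood for the pixel."""
    return max(models, key=lambda name: log_likelihood(value, *models[name]))


# Hypothetical single-band training pixels for two land-cover classes.
training = {
    "water": [20, 22, 25, 23],
    "vegetation": [80, 85, 90, 88],
}
models = {name: fit_class(samples) for name, samples in training.items()}
print(classify(30, models))
```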

Keywords: remote sensing, artificial neural network, decision tree, maximum likelihood classification

Procedia PDF Downloads 347
24748 The Implementation of Corporate Social Responsibility to Contribute the Isolated District and the Drop behind District to Overcome the Poverty, Study Cases: PT. Kaltim Prima Coal (KPC) Sanggata, East Borneo, Indonesia

Authors: Sri Suryaningsum

Abstract:

The ‘Best Practice Model’ recognition awarded by the government for the successful implementation of the corporate social responsibility program at PT. Kaltim Prima Coal, which operates in the isolated district of Sanggata, could serve as a reference for other companies seeking to improve social welfare in their surrounding areas, especially companies operating in isolated areas of Indonesia. Kaltim Prima Coal acts as a catalyst for development, fostering the independence of the districts surrounding its mining operation from the village level to the regency level. Its programs are organized into the seven program fields of corporate social responsibility and are carried out with stakeholders: the village government, the sub-district government, the regency, and citizens. One of the best programs implemented at PT. Kaltim Prima Coal concerns resettlement, which was completed based on the Asian Development Bank Resettlement Best Practice and the International Finance Corporation Resettlement Action Plan. Through the resettled residences, this program contributed to the development of the isolated and neglected district.

Keywords: CSR, isolated, neglected, poverty, mining industry

Procedia PDF Downloads 247
24747 Explainable Graph Attention Networks

Authors: David Pham, Yongfeng Zhang

Abstract:

Graphs are an important structure for data storage and computation. Recent years have seen the success of deep learning on graphs such as Graph Neural Networks (GNN) on various data mining and machine learning tasks. However, most of the deep learning models on graphs cannot easily explain their predictions and are thus often labelled as “black boxes.” For example, Graph Attention Network (GAT) is a frequently used GNN architecture, which adopts an attention mechanism to carefully select the neighborhood nodes for message passing and aggregation. However, it is difficult to explain why certain neighbors are selected while others are not and how the selected neighbors contribute to the final classification result. In this paper, we present a graph learning model called Explainable Graph Attention Network (XGAT), which integrates graph attention modeling and explainability. We use a single model to target both the accuracy and explainability of problem spaces and show that in the context of graph attention modeling, we can design a unified neighborhood selection strategy that selects appropriate neighbor nodes for both better accuracy and enhanced explainability. To justify this, we conduct extensive experiments to better understand the behavior of our model under different conditions and show an increase in both accuracy and explainability.
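
As a rough illustration of the attention mechanism that a standard GAT uses to weight neighbors during aggregation (not the XGAT model itself, whose details the abstract does not give), the sketch below computes softmax-normalized attention coefficients for one node over its neighbors. The feature values, the attention vector `a`, and the function names are hypothetical, and the linear projection of node features is assumed to have been applied already.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU non-linearity applied to the raw attention score."""
    return x if x > 0 else slope * x

def attention_coefficients(h_i, neighbors, a):
    """Softmax-normalized attention weights of node i over its neighbors.

    h_i       : projected feature vector of the target node (list of floats)
    neighbors : dict mapping neighbor id -> projected feature vector
    a         : attention vector of length 2 * len(h_i) (assumed to be learned)
    """
    scores = {}
    for j, h_j in neighbors.items():
        concat = h_i + h_j  # list concatenation [h_i || h_j]
        scores[j] = leaky_relu(sum(w * x for w, x in zip(a, concat)))
    # Subtract the maximum score for numerical stability before exponentiating.
    max_score = max(scores.values())
    exps = {j: math.exp(s - max_score) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}

# Hypothetical toy input: one target node with two neighbors "u" and "v".
alpha = attention_coefficients(
    h_i=[1.0, 0.0],
    neighbors={"u": [1.0, 0.0], "v": [0.0, 1.0]},
    a=[0.5, 0.5, 1.0, -1.0],
)
```

Inspecting `alpha` shows how strongly each neighbor contributes to the aggregated representation, which is the kind of quantity an explainability method for attention models would examine.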

Keywords: explainable AI, graph attention network, graph neural network, node classification

Procedia PDF Downloads 199
24746 An Approach to Building a Recommendation Engine for Travel Applications Using Genetic Algorithms and Neural Networks

Authors: Adrian Ionita, Ana-Maria Ghimes

Abstract:

The lack of features, design and the lack of promoting an integrated booking application are some of the reasons why most online travel platforms only offer automation of old booking processes, being limited to the integration of a smaller number of services without addressing the user experience. This paper represents a practical study on how to improve travel applications creating user-profiles through data-mining based on neural networks and genetic algorithms. Choices made by users and their ‘friends’ in the ‘social’ network context can be considered input data for a recommendation engine. The purpose of using these algorithms and this design is to improve user experience and to deliver more features to the users. The paper aims to highlight a broader range of improvements that could be applied to travel applications in terms of design and service integration, while the main scientific approach remains the technical implementation of the neural network solution. The motivation of the technologies used is also related to the initiative of some online booking providers that have made the fact that they use some ‘neural network’ related designs public. These companies use similar Big-Data technologies to provide recommendations for hotels, restaurants, and cinemas with a neural network based recommendation engine for building a user ‘DNA profile’. This implementation of the ‘profile’ a collection of neural networks trained from previous user choices, can improve the usability and design of any type of application.

Keywords: artificial intelligence, big data, cloud computing, DNA profile, genetic algorithms, machine learning, neural networks, optimization, recommendation system, user profiling

Procedia PDF Downloads 163
24745 Clustering of Association Rules of ISIS & Al-Qaeda Based on Similarity Measures

Authors: Tamanna Goyal, Divya Bansal, Sanjeev Sofat

Abstract:

In the analysis of world-threatening terrorist attacks, early detection, distinction, and prediction are effective diagnostic techniques, and many data mining and statistical approaches exist to ensure functionally accurate and precise analysis of terrorism data. The computational extraction of derived patterns is a non-trivial task which comprises specific domain discovery by means of sophisticated algorithm design and analysis. This paper proposes an approach for similarity extraction by obtaining the useful attributes from the available datasets of terrorist attacks and then applying a feature selection technique based on statistical impurity measures, followed by clustering techniques on the basis of similarity measures. On the basis of the degree of participation of attributes in the rules, the associative dependencies between the attacks are analyzed. Consequently, to compute the similarity among the discovered rules, we applied a weighted similarity measure. Finally, the rules are grouped using hierarchical clustering. We have applied the approach to an open source dataset to determine the usability and efficiency of our technique, and a literature search was also carried out to support the efficiency and accuracy of our results.
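
The last two steps (a weighted similarity between rules, then hierarchical grouping) can be sketched as follows. The abstract does not specify the measure or the linkage, so this sketch assumes a weighted Jaccard similarity over the attribute sets of each rule and single-linkage agglomerative clustering with a stopping threshold; the attribute names, weights, and function names are placeholders.

```python
def weighted_similarity(rule_a, rule_b, weights):
    """Weighted Jaccard similarity between two rules given as attribute sets.

    Attributes missing from `weights` default to weight 1.0 (an assumption).
    """
    shared = sum(weights.get(x, 1.0) for x in rule_a & rule_b)
    union = sum(weights.get(x, 1.0) for x in rule_a | rule_b)
    return shared / union if union else 0.0

def cluster_rules(rules, weights, threshold):
    """Single-linkage agglomerative clustering of rule indices.

    Repeatedly merges the two most similar clusters and stops once the best
    available similarity falls below `threshold`.
    """
    clusters = [[i] for i in range(len(rules))]
    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                sim = max(
                    weighted_similarity(rules[i], rules[j], weights)
                    for i in clusters[p]
                    for j in clusters[q]
                )
                if best is None or sim > best[0]:
                    best = (sim, p, q)
        if best[0] < threshold:
            break
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

# Placeholder rules: each rule is the set of attributes that participate in it.
rules = [
    frozenset({"a1", "a2"}),
    frozenset({"a1", "a2", "a3"}),
    frozenset({"a4", "a5"}),
]
weights = {"a1": 2.0, "a2": 1.5}  # hypothetical attribute participation weights
groups = cluster_rules(rules, weights, threshold=0.5)
```

With these placeholder inputs the first two rules share their heavily weighted attributes and end up in one group, while the third rule stays on its own.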

Keywords: association rules, clustering, similarity measure, statistical approaches

Procedia PDF Downloads 320
24744 The Investigation of Enzymatic Activity in the Soils Under the Impact of Metallurgical Industrial Activity in Lori Marz, Armenia

Authors: T. H. Derdzyan, K. A. Ghazaryan, G. A. Gevorgyan

Abstract:

Beta-glucosidase, chitinase, leucine-aminopeptidase, acid phosphomonoesterase and acetate-esterase enzyme activities in the soils under the impact of metallurgical industrial activity in Lori marz (district) were investigated. The results of the study showed that the activities of the investigated enzymes in the soils decreased with increasing distance from the Shamlugh copper mine, the Chochkan tailings storage facility and the ore transportation road. Statistical analysis revealed that the activities of the enzymes were significantly positively correlated with each other across the observation sites, which indicated that enzyme activities were affected by the same anthropogenic factor. The investigations showed that the soils were polluted with heavy metals (Cu, Pb, As, Co, Ni, Zn) due to copper mining activity in this territory. The results of Pearson correlation analysis revealed a significant negative correlation between heavy metal pollution degree (Nemerow integrated pollution index) and soil enzyme activity. All of this indicated that copper mining activity in this territory, by causing heavy metal pollution of the soils, resulted in the inhibition of the activities of the enzymes, which act as biological catalysts that decompose organic materials and facilitate the cycling of nutrients.
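
For readers unfamiliar with the two statistics named above, the following minimal sketch shows how they are commonly computed. It assumes the usual form of the Nemerow integrated pollution index, built from single-factor indices P_i = C_i / S_i (measured concentration over a reference value), and the standard Pearson correlation coefficient. All numeric values are placeholders, not data from the study.

```python
import math

def nemerow_index(concentrations, reference_values):
    """Nemerow integrated pollution index.

    Combines the mean and the maximum of the single-factor indices
    P_i = C_i / S_i as sqrt((P_mean^2 + P_max^2) / 2).
    """
    p = [concentrations[m] / reference_values[m] for m in concentrations]
    p_mean = sum(p) / len(p)
    return math.sqrt((p_mean ** 2 + max(p) ** 2) / 2)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Placeholder inputs: pollution index and enzyme activity at three sites.
site_pollution = [3.0, 2.0, 1.0]
site_enzyme_activity = [10.0, 20.0, 30.0]
r = pearson_r(site_pollution, site_enzyme_activity)
```

A value of `r` close to -1 corresponds to the kind of negative relationship between pollution degree and enzyme activity that the abstract reports.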

Keywords: Armenia, metallurgical industrial activity, heavy metal pollution, soil enzyme activity

Procedia PDF Downloads 296
24743 Aviation versus Aerospace: A Differential Analysis of Workforce Jobs via Text Mining

Authors: Sarah Werner, Michael J. Pritchard

Abstract:

From pilots to engineers, the skills development within the aerospace industry is exceptionally broad. Employers often struggle with finding the right mixture of qualified skills to fill their organizational demands. This effort to find qualified talent is further complicated by the industrial delineation between two key areas: aviation and aerospace. In a broad sense, the aerospace industry overlaps with the aviation industry. In turn, the aviation industry is a smaller sector segment within the context of the broader definition of the aerospace industry. Furthermore, it could be conceptually argued that -in practice- there is little distinction between these two sectors (i.e., aviation and aerospace). However, through our unstructured text analysis of over 6,000 job listings captured, our team found a clear delineation between aviation-related jobs and aerospace-related jobs. Using techniques in natural language processing, our research identifies an integrated workforce skill pattern that clearly breaks between these two sectors. While the aviation sector has largely maintained its need for pilots, mechanics, and associated support personnel, the staffing needs of the aerospace industry are being progressively driven by integrative engineering needs. Increasingly, this is leading many aerospace-based organizations towards the acquisition of 'system level' staffing requirements. This research helps to better align higher educational institutions with the current industrial staffing complexities within the broader aerospace sector.

Keywords: aerospace industry, job demand, text mining, workforce development

Procedia PDF Downloads 272
24742 A Schema of Building an Efficient Quality Gate throughout the Software Development with Tools

Authors: Le Chen

Abstract:

This paper presents an efficient tool platform scheme to ensure quality protection throughout the software development process. The main principle is to manage the information of the requirements, design, development, testing, operation and maintenance processes with proper tools, and to set up the quality standards of each process. Through the tools’ display and summary of quality standards, the quality standards can be visualized and made ready for policy decisions, which is called a Quality Gate in this paper. In addition, the tools are integrated to achieve the exchange and linking of information, which greatly improves operational efficiency. In this paper, the feasibility of the scheme is verified by practical application in development projects, and further improvements to the overall information display and data mining are proposed.

Keywords: efficiency, quality gate, software process, tools

Procedia PDF Downloads 358
24741 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their rivals. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 641
24740 A Case Study of Ontology-Based Sentiment Analysis for Fan Pages

Authors: C. -L. Huang, J. -H. Ho

Abstract:

Social media has become more and more important in our lives. Many enterprises promote their services and products to fans via social media. The positive or negative sentiment of feedback from fans is very important for enterprises seeking to improve their products, services, and promotion activities. The purpose of this paper is to understand the sentiment of fans’ responses by analyzing the responses posted by fans on Facebook. The entity and aspect of each fan’s response were analyzed based on a predefined ontology. The ontology for cell phone sentiment analysis consists of the following aspect categories at the top level: overall, shape, hardware, brand, price, and service. Each category consists of several sub-categories. All aspects of a fan’s response were found based on the ontology, and their corresponding sentiment terms were found using a lexicon-based approach. The sentiment scores for aspects of fan responses were obtained by summarizing the sentiment terms in the responses. The frequency of 'like' was also weighted in the sentiment score calculation. Three famous cell phone fan pages on Facebook were selected as demonstration cases to evaluate the performance of the proposed methodology. Human judgment by several domain experts was also collected for performance comparison. The performance of the proposed approach was as good as that of human judgment on precision, recall and F1-measure.
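
The pipeline described above (find aspects via an ontology, find sentiment terms via a lexicon, weight by 'like' frequency) can be illustrated with a deliberately simplified sketch. The aspect terms, the lexicon entries, and the logarithmic 'like' weighting are all assumptions made for illustration; the abstract does not state the actual ontology contents or weighting formula. The sketch also assigns the polarity of the whole response to every aspect it mentions, which is coarser than a per-aspect analysis.

```python
import math

# Hypothetical fragment of an aspect ontology: category -> trigger terms.
ASPECT_TERMS = {
    "hardware": {"battery", "camera", "screen"},
    "price": {"price", "cost"},
    "service": {"support", "warranty"},
}

# Hypothetical sentiment lexicon: term -> polarity.
SENTIMENT_LEXICON = {
    "great": 1, "good": 1, "poor": -1, "terrible": -1, "expensive": -1,
}

def score_response(text, likes=0):
    """Return {aspect: score} for one fan response.

    The polarity is the sum of lexicon values over all tokens; it is scaled
    by 1 + ln(1 + likes), an assumed way of weighting 'like' frequency.
    """
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    polarity = sum(SENTIMENT_LEXICON.get(t, 0) for t in tokens)
    weight = 1.0 + math.log1p(likes)
    token_set = set(tokens)
    return {
        aspect: polarity * weight
        for aspect, terms in ASPECT_TERMS.items()
        if terms & token_set
    }

scores = score_response("Great camera and good screen", likes=0)
```

Summing such per-response scores by aspect over a whole fan page gives a page-level view that could then be compared against expert judgment.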

Keywords: opinion mining, ontology, sentiment analysis, text mining

Procedia PDF Downloads 232
24739 Effect of Heavy Metals on the Life History Trait of Heterocephalobellus sp. and Cephalobus sp. (Nematode: Cephalobidae) Collected from a Small-Scale Mining Site, Davao de Oro, Philippines

Authors: Alissa Jane S. Mondejar, Florifern C. Paglinawan, Nanette Hope N. Sumaya, Joey Genevieve T. Martinez, Mylah Villacorte-Tabelin

Abstract:

Mining is associated with increased heavy metals in the environment, and heavy metal contamination disrupts the activities of soil fauna, such as nematodes, causing changes in the function of the soil ecosystem. Previous studies found that nematode community composition and diversity indices were strongly affected by heavy metals (e.g., Pb, Cu, and Zn). In this study, the influence of heavy metals on nematode survivability and reproduction was investigated. The life history of the free-living nematodes Heterocephalobellus sp. and Cephalobus sp. (Rhabditida: Cephalobidae) was assessed using the hanging drop technique, a technique often used in life history trait experiments. The nematodes were exposed to different temperatures, i.e., 20°C, 25°C, and 30°C, in different groups (control and heavy metal exposed) and fed with the same bacterial density of 1×10⁹ Escherichia coli cells ml⁻¹ for 30 days. Results showed that increasing temperature and exposure to heavy metals had a significant influence on the survivability and egg production of both species. Heterocephalobellus sp. and Cephalobus sp., when exposed to 20°C, survived longer and produced few eggs, but without subsequent hatching. Life history parameters of Heterocephalobellus sp. showed that the values of net production rate (R0), fecundity (mx), which is the same value as the total fertility rate (TFR), generation times (G0, G1, and Gh) and population doubling time (PDT) were higher in the control group. However, a lower rate of natural increase (rm) was observed since generation times were higher. Meanwhile, the life history parameters of Cephalobus sp. showed that the value of net production rate (R0) was higher in the exposed group. Fecundity (mx), which is the same value as the TFR, along with G0, G1, Gh, and PDT, was higher in the control group. However, a lower rate of natural increase (rm) was observed since generation times were higher.
In conclusion, temperature and exposure to heavy metals had a negative influence on the life history of the nematodes; however, further experiments should be considered.
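
To make the relationship between the reported parameters concrete, the sketch below computes them from an age-specific life table using common textbook approximations: R0 as the sum of lx·mx, a cohort generation time T as the lx·mx-weighted mean age, rm approximated as ln(R0)/T, and PDT as ln(2)/rm. The abstract does not say which estimators the authors used for G0, G1, and Gh, so this is an assumed simplification, and the input values are placeholders.

```python
import math

def life_table_parameters(ages, survivorship, fecundity):
    """Approximate life history parameters from an age-specific life table.

    ages         : age x at each observation (e.g. days)
    survivorship : l_x, proportion of the cohort alive at age x
    fecundity    : m_x, mean offspring per individual at age x
    """
    r0 = sum(l * m for l, m in zip(survivorship, fecundity))
    # Cohort generation time: mean age of reproduction weighted by l_x * m_x.
    t = sum(x * l * m for x, l, m in zip(ages, survivorship, fecundity)) / r0
    rm = math.log(r0) / t  # approximate intrinsic rate of natural increase
    # Doubling time is only meaningful for a growing population (rm > 0).
    pdt = math.log(2) / rm if rm > 0 else float("inf")
    return {"R0": r0, "T": t, "rm": rm, "PDT": pdt}

# Placeholder life table, not data from the study.
params = life_table_parameters(
    ages=[2, 4],
    survivorship=[1.0, 0.5],
    fecundity=[2.0, 4.0],
)
```

Because rm is approximated as ln(R0)/T, a longer generation time lowers rm for a given R0, which matches the abstract's remark that rm was lower where generation times were higher.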

Keywords: artisanal and small-scale gold mining (ASGM), hanging drop method, heavy metals, life history trait

Procedia PDF Downloads 97
24738 Understanding Student Engagement through Sentiment Analytics of Response Times to Electronically Shared Feedback

Authors: Yaxin Bi, Peter Nicholl

Abstract:

The rapid advancement of information and communication technologies (ICT) is profoundly influencing every aspect of Higher Education. It has transformed traditional teaching, learning, assessment and feedback into a new era of Digital Education. This also introduces many challenges in capturing and understanding student engagement with their studies in Higher Education. The School of Computing at Ulster University has developed a Feedback And Notification (FAN) Online tool that has been used to send students links to personalized feedback on their submitted assessments and to record students’ frequency of review of the shared feedback as well as the speed of collection. The students initially receive a personal email containing a URL link that maps to the feedback created by the academic marker. This feedback is typically a Word or PDF report including comments and the final mark for the work submitted approximately three weeks before. When the student clicks on the link, the student’s personal feedback is viewable in the browser. The FAN tool provides the academic marker with a report that includes when and how often a student viewed the feedback via the link. This paper presents an investigation into student engagement through analyzing the interaction timestamps and frequency of review by the student. We have proposed an approach to modeling interaction timestamps and use sentiment classification techniques to analyze the data collected over the last five years for a set of modules. The data studied covers a number of final-year and second-year modules in the School of Computing. The paper presents the details of the quantitative analysis methods and further describes students’ interactions with the feedback over time on each module studied.
We have projected the students into different groups of engagement based on sentiment analysis results and then provide a suggestion of early targeted intervention for the set of students seen to be under-performing via our proposed model.

Keywords: feedback, engagement, interaction modelling, sentiment analysis

Procedia PDF Downloads 103
24737 Advancement of Computer Science Research in Nigeria: A Bibliometric Analysis of the Past Three Decades

Authors: Temidayo O. Omotehinwa, David O. Oyewola, Friday J. Agbo

Abstract:

This study aims to gather a proper perspective of the development landscape of Computer Science research in Nigeria. Therefore, a bibliometric analysis of 4,333 bibliographic records of Computer Science research in Nigeria in the last 31 years (1991-2021) was carried out. The bibliographic data were extracted from the Scopus database and analyzed using VOSviewer and the bibliometrix R package through the biblioshiny web interface. The findings of this study revealed that Computer Science research in Nigeria has a growth rate of 24.19%. The most developed and well-studied research areas in the Computer Science field in Nigeria are machine learning, data mining, and deep learning. The social structure analysis result revealed that there is a need for improved international collaborations. Sparsely established collaborations are largely influenced by geographic proximity. The funding analysis result showed that Computer Science research in Nigeria is under-funded. The findings of this study will be useful for researchers conducting Computer Science related research. Experts can gain insights into how to develop a strategic framework that will advance the field in a more impactful manner. Government agencies and policymakers can also utilize the outcome of this research to develop strategies for improved funding for Computer Science research.

Keywords: bibliometric analysis, biblioshiny, computer science, Nigeria, science mapping

Procedia PDF Downloads 112
24736 Toxic Metal and Radiological Risk Assessment of Soil, Water and Vegetables around a Gold Mine Turned Residential Area in Mokuro Area of Ile-Ife, Osun State Nigeria: An Implications for Human Health

Authors: Grace O. Akinlade, Danjuma D. Maza, Oluwakemi O. Olawolu, Delight O. Babalola, John A. O. Oyekunle, Joshua O. Ojo

Abstract:

The Mokuro area of Ile-Ife, South West Nigeria, was well known for gold mining in the past (about twenty years ago). However, the place has since been reclaimed and converted to a residential area without any environmental risk assessment of the impact of the mining tailings on the environment. Soil, water, and plant samples were collected from 4 different locations around the mine-turned-residential area. Soil samples were pulverized and sieved into finer particles, while the plant samples were dried and pulverized. All the samples were digested and analyzed for As, Pb, Cd, and Zn using atomic absorption spectroscopy (AAS). From the analysis results, the hazard index (HI) was then calculated for the metals. For the radiological analysis, the soil and plant samples were air dried and pulverized, then weighed, after which the samples were packed into special, properly sealed containers to prevent radon gas leakage. After the sealing, the samples were kept for 28 days to attain secular equilibrium. The concentrations of 40K, 238U, and 232Th in the samples were measured using a cesium iodide (CsI) spectrometer and URSA software. The AAS analysis showed that As, Pb, Cd (toxic metals), and Zn (an essential trace metal) are in concentrations lower than permissible limits in the plant and soil samples, while the water samples had concentrations higher than permissible limits. The calculated hazard indices (HI) show that HI for water is >1 and that of plants and soil is <1. The gamma spectrometry results show activity concentrations above the recommended limits for all the soil and plant samples collected from the area. Only the water samples have activity concentrations below the recommended limit. Consequently, the absorbed dose, annual effective dose, and excess lifetime cancer risk are all above the recommended safe limit for all the samples except for the water samples.
In conclusion, all the samples collected from the area are either contaminated with toxic metals or they pose radiological hazards to the consumers. Further detailed study is therefore recommended in order to be able to advise the residents appropriately.
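
The hazard index comparison (HI > 1 for water, HI < 1 for plants and soil) follows the usual non-carcinogenic risk convention: a hazard quotient is the estimated daily intake of a metal divided by its reference dose, and HI is the sum of the quotients. The sketch below assumes a simplified ingestion model with continuous lifetime exposure, so the exposure-frequency and averaging-time terms cancel. The concentrations, intake rate, body weight, and reference doses are placeholders, not values from the study or from any regulatory table.

```python
def chronic_daily_intake(conc_mg_per_l, intake_l_per_day, body_weight_kg):
    """Daily intake in mg per kg body weight per day via water ingestion.

    Simplification: assumes exposure every day over the averaging period.
    """
    return conc_mg_per_l * intake_l_per_day / body_weight_kg

def hazard_index(intake_by_metal, reference_dose_by_metal):
    """Sum of hazard quotients HQ = intake / reference dose over all metals."""
    return sum(
        intake_by_metal[m] / reference_dose_by_metal[m]
        for m in intake_by_metal
    )

# Placeholder metals "X" and "Y" with made-up concentrations and doses.
intakes = {
    "X": chronic_daily_intake(0.7, 2.0, 70.0),
    "Y": chronic_daily_intake(0.35, 2.0, 70.0),
}
reference_doses = {"X": 0.04, "Y": 0.02}  # hypothetical reference doses
hi = hazard_index(intakes, reference_doses)
```

An HI above 1 is conventionally read as a potential non-carcinogenic health concern, which is the threshold the abstract applies.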

Keywords: toxic metals, gamma spectrometry, Ile-Ife, radiological hazards, gold mining

Procedia PDF Downloads 57
24735 Business Intelligence Dashboard Solutions for Improving Decision Making Process: A Focus on Prostate Cancer

Authors: Mona Isazad Mashinchi, Davood Roshan Sangachin, Francis J. Sullivan, Dietrich Rebholz-Schuhmann

Abstract:

Background: Decision-making processes are nowadays driven by data, data analytics and Business Intelligence (BI). BI as a software platform can provide a wide variety of capabilities such as organization memory, information integration, insight creation and presentation capabilities. Visualizing data through dashboards is one of the BI solutions (for a variety of areas) which helps managers in the decision making processes to expose the most informative information at a glance. In the healthcare domain to date, dashboard presentations are more frequently used to track performance related metrics and less frequently used to monitor those quality parameters which relate directly to patient outcomes. Providing effective and timely care for patients and improving the health outcome are highly dependent on presenting and visualizing data and information. Objective: In this research, the focus is on the presentation capabilities of BI to design a dashboard for prostate cancer (PC) data that allows better decision making for the patients, the hospital and the healthcare system related to a cancer dataset. The aim of this research is to customize a retrospective PC dataset in a dashboard interface to give a better understanding of data in the categories (risk factors, treatment approaches, disease control and side effects) which matter most to patients as well as other stakeholders. By presenting the outcome in the dashboard we address one of the major targets of a value-based health care (VBHC) delivery model which is measuring the value and presenting the outcome to different actors in HC industry (such as patients and doctors) for a better decision making. Method: For visualizing the stored data to users, three interactive dashboards based on the PC dataset have been developed (using the Tableau Software) to provide better views to the risk factors, treatment approaches, and side effects. 
Results: Many benefits were derived from the interactive graphs and tables in the dashboards, which helped users to easily visualize the patients at risk, to better understand the relationship between a patient's status after treatment and their initial status before treatment, and to make better decisions about treatments with fewer side effects given the patient's status. Conclusions: Building a well-designed and informative dashboard depends on three important factors: the users, the goals and the data types. A dashboard's hierarchies, drilling, and graphical features can guide doctors to better navigate through information. The features of the interactive PC dashboard not only let doctors ask specific questions and filter the results based on key performance indicators (KPI) such as Gleason Grade and Patient's Age and Status, but may also help patients to better understand different treatment outcomes, such as side effects over time, and to have an active role in their treatment decisions. Currently, we are extending the results to a real-time interactive dashboard in which users (both patients and doctors) can easily explore the data by choosing preferred attributes and data to make better near real-time decisions.

Keywords: business intelligence, dashboard, decision making, healthcare, prostate cancer, value-based healthcare

Procedia PDF Downloads 141
24734 High-Throughput Artificial Guide RNA Sequence Design for Type I, II and III CRISPR/Cas-Mediated Genome Editing

Authors: Farahnaz Sadat Golestan Hashemi, Mohd Razi Ismail, Mohd Y. Rafii

Abstract:

A huge revolution has emerged in genome engineering with the discovery of CRISPR (clustered regularly interspaced short palindromic repeats) and CRISPR-associated system genes (Cas) in bacteria. The function of the type II Streptococcus pyogenes (Sp) CRISPR/Cas9 system has been confirmed in various species. Other S. thermophilus (St) CRISPR-Cas systems, CRISPR1-Cas and CRISPR3-Cas, have also been reported to prevent phage infection. The CRISPR1-Cas system interferes by cleaving foreign dsDNA entering the cell in a length-specific and orientation-dependent manner. The S. thermophilus CRISPR3-Cas system also acts by cleaving phage dsDNA genomes at the same specific position inside the targeted protospacer as observed in the CRISPR1-Cas system. It is worth mentioning that, for effective DNA cleavage activity, RNA-guided Cas9 orthologs require their own specific PAM (protospacer adjacent motif) sequences. Activity levels are based on the sequence of the protospacer and specific combinations of favorable PAM bases. Therefore, based on the specific length and sequence of the PAM followed by a constant length of target site for the three orthologs of Cas9 protein, a well-organized procedure is required for high-throughput and accurate mining of possible target sites in a large genomic dataset. Consequently, we created a reliable procedure to explore potential gRNA sequences for type I (Streptococcus thermophilus), II (Streptococcus pyogenes), and III (Streptococcus thermophilus) CRISPR/Cas systems. To mine CRISPR target sites, four different searching modes of sgRNA binding to the target DNA strand were applied. These searching modes are as follows: i) coding strand searching, ii) anti-coding strand searching, iii) both strand searching, and iv) paired-gRNA searching. The output of such a procedure highlights the power of comparative genome mining for different CRISPR/Cas systems.
This could yield a repertoire of Cas9 variants with expanded capabilities of gRNA design, and will pave the way for further advances in genome and epigenome engineering.
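
The first three searching modes can be sketched as a pattern scan for a fixed-length protospacer followed by a PAM, run on the given strand and on its reverse complement. This is a minimal sketch rather than the authors' procedure: it assumes the commonly cited NGG PAM and 20-nt protospacer for S. pyogenes Cas9 as defaults, takes the PAM as a parameter so other orthologs can be searched with their own motifs, and omits paired-gRNA searching. Function names are hypothetical.

```python
import re

# Subset of IUPAC nucleotide codes translated to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "W": "[AT]", "R": "[AG]", "Y": "[CT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_targets(seq, pam="NGG", length=20):
    """Return (start, protospacer, pam) for every site on the given strand.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pam_regex = "".join(IUPAC[base] for base in pam)
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (length, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

def find_targets_both_strands(seq, pam="NGG", length=20):
    """Search the coding strand and the anti-coding strand.

    Start positions for the anti-coding strand are relative to the
    reverse-complemented sequence, not to the input sequence.
    """
    seq = seq.upper()
    return {
        "coding": find_targets(seq, pam, length),
        "anti_coding": find_targets(reverse_complement(seq), pam, length),
    }

# Toy sequence: 20 A's followed by a TGG PAM.
hits = find_targets_both_strands("A" * 20 + "TGG")
```

Passing a different `pam` (and, if needed, a different `length`) adapts the same scan to another Cas9 ortholog, which is the sense in which one procedure can serve several CRISPR/Cas systems.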

Keywords: CRISPR/Cas systems, gRNA mining, Streptococcus pyogenes, Streptococcus thermophilus

Procedia PDF Downloads 257