Search results for: Data Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25268

Search results for: Data Mining

24548 A Data Envelopment Analysis Model in a Multi-Objective Optimization with Fuzzy Environment

Authors: Michael Gidey Gebru

Abstract:

Most of Data Envelopment Analysis models operate in a static environment with input and output parameters that are chosen by deterministic data. However, due to ambiguity brought on shifting market conditions, input and output data are not always precisely gathered in real-world scenarios. Fuzzy numbers can be used to address this kind of ambiguity in input and output data. Therefore, this work aims to expand crisp Data Envelopment Analysis into Data Envelopment Analysis with fuzzy environment. In this study, the input and output data are regarded as fuzzy triangular numbers. Then, the Data Envelopment Analysis model with fuzzy environment is solved using a multi-objective method to gauge the Decision Making Units' efficiency. Finally, the developed Data Envelopment Analysis model is illustrated with an application on real data 50 educational institutions.

Keywords: efficiency, Data Envelopment Analysis, fuzzy, higher education, input, output

Procedia PDF Downloads 53
24547 Smart in Performance: More to Practical Life than Hardware and Software

Authors: Faten Hatem

Abstract:

This paper promotes the importance of focusing on spatial aspects and affective factors that impact smart urbanism. This helps to better inform city governance, spatial planning, and policymaking to focus on what Smart does and what it can achieve for cities in terms of performance rather than on using the notion for prestige in a worldwide trend towards becoming a smart city. By illustrating how this style of practice compromises the social aspects and related elements of space making through an interdisciplinary comparative approach, the paper clarifies the impact of this compromise on the overall smart city performance. In response, this paper recognizes the importance of establishing a new meaning for urban progress by moving beyond improving basic services of the city to enhance the actual human experience which is essential for the development of authentic smart cities. The topic is presented under five overlooked areas that discuss the relation between smart cities’ potential and efficiency paradox, the social aspect, connectedness with nature, the human factor, and untapped resources. However, these themes are not meant to be discussed in silos, instead, they are presented to collectively examine smart cities in performance, arguing there is more to the practical life of smart cities than software and hardware inventions. The study is based on a case study approach, presenting Milton Keynes as a living example to learn from while engaging with various methods for data collection including multi-disciplinary semi-structured interviews, field observations, and data mining.

Keywords: smart design, the human in the city, human needs and urban planning, sustainability, smart cities, smart

Procedia PDF Downloads 98
24546 Hydrogeophysical Investigations And Mapping of Ingress Channels Along The Blesbokspruit Stream In The East Rand Basin Of The Witwatersrand, South Africa

Authors: Melvin Sethobya, Sithule Xanga, Sechaba Lenong, Lunga Nolakana, Gbenga Adesola

Abstract:

Mining has been the cornerstone of the South African economy for the last century. Most of the gold mining in South Africa was conducted within the Witwatersrand basin, which contributed to the rapid growth of the city of Johannesburg and capitulated the city to becoming the business and wealth capital of the country. But with gradual depletion of resources, a stoppage in the extraction of underground water from mines and other factors relating to survival of the mining operations over a lengthy period, most of the mines were abandoned and left to pollute the local waterways and groundwater with toxins, heavy metal residue and increased acid mine drainage ensued. The Department of Mineral Resources and Energy commissioned a project whose aim is to monitor, maintain, and mitigate the adverse environmental impacts of polluted water mine water flowing into local streams affecting local ecosystems and livelihoods downstream. As part of mitigation efforts, the diagnosis and monitoring of groundwater or surface water polluted sites has become important. Geophysical surveys, in particular, Resistivity and Magnetics surveys, were selected as some of most suitable techniques for investigation of local ingress points along of one the major streams cutting through the Witwatersrand basin, namely the Blesbokspruit, which is found in the eastern part of the basin. The aim of the surveys was to provide information that could be used to assist in determining possible water loss/ ingress from the Blesbokspriut stream. Modelling of geophysical surveys results offered an in-depth insight into the interaction and pathways of polluted water through mapping of possible ingress channels near the Blesbokspruit. The resistivity - depth profile of the surveyed site exhibit a three(3) layered model with low resistivity values (10 to 200 Ω.m) overburden, which is underlain by a moderate resistivity weathered layer (>300 Ω.m), which sits on a more resistive crystalline bedrock (>500 Ω.m). Two locations of potential ingress channels were mapped across the two traverses at the site. The magnetic survey conducted at the site mapped a major NE-SW trending regional linearment with a strong magnetic signature, which was modeled to depth beyond 100m, with the potential to act as a conduit for dispersion of stream water away from the stream, as it shared a similar orientation with the potential ingress channels as mapped using the resistivity method.

Keywords: eletrictrical resistivity, magnetics survey, blesbokspruit, ingress

Procedia PDF Downloads 61
24545 Identification of Conserved Domains and Motifs for GRF Gene Family

Authors: Jafar Ahmadi, Nafiseh Noormohammadi, Sedegeh Fabriki Ourang

Abstract:

GRF, Growth regulating factor, genes encode a novel class of plant-specific transcription factors. The GRF proteins play a role in the regulation of cell numbers in young and growing tissues and may act as transcription activations in growth and development of plants. Identification of GRF genes and their expression are important in plants to performance of the growth and development of various organs. In this study, to better understanding the structural and functional differences of GRFs family, 45 GRF proteins sequences in A. thaliana, Z. mays, O. sativa, B. napus, B. rapa, H. vulgare, and S. bicolor, have been collected and analyzed through bioinformatics data mining. As a result, in secondary structure of GRFs, the number of alpha helices was more than beta sheets and in all of them QLQ domains were completely in the biggest alpha helix. In all GRFs, QLQ, and WRC domains were completely protected except in AtGRF9. These proteins have no trans-membrane domain and due to have nuclear localization signals act in nuclear and they are component of unstable proteins in the test tube.

Keywords: domain, gene family, GRF, motif

Procedia PDF Downloads 455
24544 Crime Prevention with Artificial Intelligence

Authors: Mehrnoosh Abouzari, Shahrokh Sahraei

Abstract:

Today, with the increase in quantity and quality and variety of crimes, the discussion of crime prevention has faced a serious challenge that human resources alone and with traditional methods will not be effective. One of the developments in the modern world is the presence of artificial intelligence in various fields, including criminal law. In fact, the use of artificial intelligence in criminal investigations and fighting crime is a necessity in today's world. The use of artificial intelligence is far beyond and even separate from other technologies in the struggle against crime. Second, its application in criminal science is different from the discussion of prevention and it comes to the prediction of crime. Crime prevention in terms of the three factors of the offender, the offender and the victim, following a change in the conditions of the three factors, based on the perception of the criminal being wise, and therefore increasing the cost and risk of crime for him in order to desist from delinquency or to make the victim aware of self-care and possibility of exposing him to danger or making it difficult to commit crimes. While the presence of artificial intelligence in the field of combating crime and social damage and dangers, like an all-seeing eye, regardless of time and place, it sees the future and predicts the occurrence of a possible crime, thus prevent the occurrence of crimes. The purpose of this article is to collect and analyze the studies conducted on the use of artificial intelligence in predicting and preventing crime. How capable is this technology in predicting crime and preventing it? The results have shown that the artificial intelligence technologies in use are capable of predicting and preventing crime and can find patterns in the data set. find large ones in a much more efficient way than humans. In crime prediction and prevention, the term artificial intelligence can be used to refer to the increasing use of technologies that apply algorithms to large sets of data to assist or replace police. The use of artificial intelligence in our debate is in predicting and preventing crime, including predicting the time and place of future criminal activities, effective identification of patterns and accurate prediction of future behavior through data mining, machine learning and deep learning, and data analysis, and also the use of neural networks. Because the knowledge of criminologists can provide insight into risk factors for criminal behavior, among other issues, computer scientists can match this knowledge with the datasets that artificial intelligence uses to inform them.

Keywords: artificial intelligence, criminology, crime, prevention, prediction

Procedia PDF Downloads 72
24543 The Economic Limitations of Defining Data Ownership Rights

Authors: Kacper Tomasz Kröber-Mulawa

Abstract:

This paper will address the topic of data ownership from an economic perspective, and examples of economic limitations of data property rights will be provided, which have been identified using methods and approaches of economic analysis of law. To properly build a background for the economic focus, in the beginning a short perspective of data and data ownership in the EU’s legal system will be provided. It will include a short introduction to its political and social importance and highlight relevant viewpoints. This will stress the importance of a Single Market for data but also far-reaching regulations of data governance and privacy (including the distinction of personal and non-personal data, data held by public bodies and private businesses). The main discussion of this paper will build upon the briefly referred to legal basis as well as methods and approaches of economic analysis of law.

Keywords: antitrust, data, data ownership, digital economy, property rights

Procedia PDF Downloads 76
24542 Protecting the Cloud Computing Data Through the Data Backups

Authors: Abdullah Alsaeed

Abstract:

Virtualized computing and cloud computing infrastructures are no longer fuzz or marketing term. They are a core reality in today’s corporate Information Technology (IT) organizations. Hence, developing an effective and efficient methodologies for data backup and data recovery is required more than any time. The purpose of data backup and recovery techniques are to assist the organizations to strategize the business continuity and disaster recovery approaches. In order to accomplish this strategic objective, a variety of mechanism were proposed in the recent years. This research paper will explore and examine the latest techniques and solutions to provide data backup and restoration for the cloud computing platforms.

Keywords: data backup, data recovery, cloud computing, business continuity, disaster recovery, cost-effective, data encryption.

Procedia PDF Downloads 82
24541 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 508
24540 Intelligent Software Architecture and Automatic Re-Architecting Based on Machine Learning

Authors: Gebremeskel Hagos Gebremedhin, Feng Chong, Heyan Huang

Abstract:

Software system is the combination of architecture and organized components to accomplish a specific function or set of functions. A good software architecture facilitates application system development, promotes achievement of functional requirements, and supports system reconfiguration. We describe three studies demonstrating the utility of our architecture in the subdomain of mobile office robots and identify software engineering principles embodied in the architecture. The main aim of this paper is to analyze prove architecture design and automatic re-architecting using machine learning. Intelligence software architecture and automatic re-architecting process is reorganizing in to more suitable one of the software organizational structure system using the user access dataset for creating relationship among the components of the system. The 3-step approach of data mining was used to analyze effective recovery, transformation and implantation with the use of clustering algorithm. Therefore, automatic re-architecting without changing the source code is possible to solve the software complexity problem and system software reuse.

Keywords: intelligence, software architecture, re-architecting, software reuse, High level design

Procedia PDF Downloads 114
24539 Cross-Knowledge Graph Relation Completion for Non-Isomorphic Cross-Lingual Entity Alignment

Authors: Yuhong Zhang, Dan Lu, Chenyang Bu, Peipei Li, Kui Yu, Xindong Wu

Abstract:

The Cross-Lingual Entity Alignment (CLEA) task aims to find the aligned entities that refer to the same identity from two knowledge graphs (KGs) in different languages. It is an effective way to enhance the performance of data mining for KGs with scarce resources. In real-world applications, the neighborhood structures of the same entities in different KGs tend to be non-isomorphic, which makes the representation of entities contain diverse semantic information and then poses a great challenge for CLEA. In this paper, we try to address this challenge from two perspectives. On the one hand, the cross-KG relation completion rules are designed with the alignment constraint of entities and relations to improve the topology isomorphism of two KGs. On the other hand, a representation method combining isomorphic weights is designed to include more isomorphic semantics for counterpart entities, which will benefit the CLEA. Experiments show that our model can improve the isomorphism of two KGs and the alignment performance, especially for two non-isomorphic KGs.

Keywords: knowledge graphs, cross-lingual entity alignment, non-isomorphic, relation completion

Procedia PDF Downloads 116
24538 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 367
24537 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated the vast potential in streamlining, deciding, spotting business drifts in different fields, for example, producing, fund, Information Technology. This paper gives a multi-disciplinary diagram of the research issues in enormous information and its procedures, instruments, and system identified with the privacy, data storage management, network and energy utilization, adaptation to non-critical failure and information representations. Other than this, result difficulties and openings accessible in this Big Data platform have made.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 306
24536 Efficient Fuzzy Classified Cryptographic Model for Intelligent Encryption Technique towards E-Banking XML Transactions

Authors: Maher Aburrous, Adel Khelifi, Manar Abu Talib

Abstract:

Transactions performed by financial institutions on daily basis require XML encryption on large scale. Encrypting large volume of message fully will result both performance and resource issues. In this paper a novel approach is presented for securing financial XML transactions using classification data mining (DM) algorithms. Our strategy defines the complete process of classifying XML transactions by using set of classification algorithms, classified XML documents processed at later stage using element-wise encryption. Classification algorithms were used to identify the XML transaction rules and factors in order to classify the message content fetching important elements within. We have implemented four classification algorithms to fetch the importance level value within each XML document. Classified content is processed using element-wise encryption for selected parts with "High", "Medium" or “Low” importance level values. Element-wise encryption is performed using AES symmetric encryption algorithm and proposed modified algorithm for AES to overcome the problem of computational overhead, in which substitute byte, shift row will remain as in the original AES while mix column operation is replaced by 128 permutation operation followed by add round key operation. An implementation has been conducted using data set fetched from e-banking service to present system functionality and efficiency. Results from our implementation showed a clear improvement in processing time encrypting XML documents.

Keywords: XML transaction, encryption, Advanced Encryption Standard (AES), XML classification, e-banking security, fuzzy classification, cryptography, intelligent encryption

Procedia PDF Downloads 405
24535 Survey on Big Data Stream Classification by Decision Tree

Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi

Abstract:

Nowadays, the development of computers technology and its recent applications provide access to new types of data, which have not been considered by the traditional data analysts. Two particularly interesting characteristics of such data sets include their huge size and streaming nature .Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey on the obstacles and the requirements issues classifying data streams with using decision tree. The most important issue is to maintain a balance between accuracy and efficiency, the algorithm should provide good classification performance with a reasonable time response.

Keywords: big data, data streams, classification, decision tree

Procedia PDF Downloads 517
24534 Robust and Dedicated Hybrid Cloud Approach for Secure Authorized Deduplication

Authors: Aishwarya Shekhar, Himanshu Sharma

Abstract:

Data deduplication is one of important data compression techniques for eliminating duplicate copies of repeating data, and has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. In this process, duplicate data is expunged, leaving only one copy means single instance of the data to be accumulated. Though, indexing of each and every data is still maintained. Data deduplication is an approach for minimizing the part of storage space an organization required to retain its data. In most of the company, the storage systems carry identical copies of numerous pieces of data. Deduplication terminates these additional copies by saving just one copy of the data and exchanging the other copies with pointers that assist back to the primary copy. To ignore this duplication of the data and to preserve the confidentiality in the cloud here we are applying the concept of hybrid nature of cloud. A hybrid cloud is a fusion of minimally one public and private cloud. As a proof of concept, we implement a java code which provides security as well as removes all types of duplicated data from the cloud.

Keywords: confidentiality, deduplication, data compression, hybridity of cloud

Procedia PDF Downloads 377
24533 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in all engineering and science and many other domains. The potential of large or massive data is undoubtedly significant, make sense to require new ways of thinking and learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. In this paper, the latest advances and advancements in the researches on machine learning for big data processing. First, the machine learning techniques methods in recent studies, such as deep learning, representation learning, transfer learning, active learning and distributed and parallel learning. Then focus on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 437
24532 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangement of data management and data management in Indonesia. This paper is a descriptive research with qualitative approach and collecting data from literature study. Results of this paper are comprehensive arrangement of data that have been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and protection of personal data are mutually reinforcing arrangements in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption

Procedia PDF Downloads 180
24531 Passive Attenuation of Nitrogen Species at Northern Mine Sites

Authors: Patrick Mueller, Alan Martin, Justin Stockwell, Robert Goldblatt

Abstract:

Elevated concentrations of inorganic nitrogen (N) compounds (nitrate, nitrite, and ammonia) are a ubiquitous feature to mine-influenced drainages due to the leaching of blasting residues and use of cyanide in the milling of gold ores. For many mines, the management of N is a focus for environmental protection, therefore understanding the factors controlling the speciation and behavior of N is central to effective decision making. In this paper, the passive attenuation of ammonia and nitrite is described for three northern water bodies (two lakes and a tailings pond) influenced by mining activities. In two of the water bodies, inorganic N compounds originate from explosives residues in mine water and waste rock. The third water body is a decommissioned tailings impoundment, with N compounds largely originating from the breakdown of cyanide compounds used in the processing of gold ores. Empirical observations from water quality monitoring indicate nitrification (the oxidation of ammonia to nitrate) occurs in all three waterbodies, where enrichment of nitrate occurs commensurately with ammonia depletion. The N species conversions in these systems occurred more rapidly than chemical oxidation kinetics permit, indicating that microbial mediated conversion was occurring, despite the cool water temperatures. While nitrification of ammonia and nitrite to nitrate was the primary process, in all three waterbodies nitrite was consistently present at approximately 0.5 to 2.0 % of total N, even following ammonia depletion. The persistence of trace amounts of nitrite under these conditions suggests the co-occurrence denitrification processes in the water column and/or underlying substrates. The implications for N management in mine waters are discussed.

Keywords: explosives, mining, nitrification, water

Procedia PDF Downloads 314
24530 Frequent-Pattern Tree Algorithm Application to S&P and Equity Indexes

Authors: E. Younsi, H. Andriamboavonjy, A. David, S. Dokou, B. Lemrabet

Abstract:

Software and time optimization are very important factors in financial markets, which are competitive fields, and emergence of new computer tools further stresses the challenge. In this context, any improvement of technical indicators which generate a buy or sell signal is a major issue. Thus, many tools have been created to make them more effective. This worry about efficiency has been leading in present paper to seek best (and most innovative) way giving largest improvement in these indicators. The approach consists in attaching a signature to frequent market configurations by application of frequent patterns extraction method which is here most appropriate to optimize investment strategies. The goal of proposed trading algorithm is to find most accurate signatures using back testing procedure applied to technical indicators for improving their performance. The problem is then to determine the signatures which, combined with an indicator, outperform this indicator alone. To do this, the FP-Tree algorithm has been preferred, as it appears to be the most efficient algorithm to perform this task.

Keywords: quantitative analysis, back-testing, computational models, apriori algorithm, pattern recognition, data mining, FP-tree

Procedia PDF Downloads 359
24529 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 135
24528 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)

Procedia PDF Downloads 232
24527 Factors That Contribute to Noise Induced Hearing Loss Amongst Employees at the Platinum Mine in Limpopo Province, South Africa

Authors: Livhuwani Muthelo, R. N. Malema, T. M. Mothiba

Abstract:

Long term exposure to excessive noise in the mining industry increases the risk of noise induced hearing loss, with consequences for employee’s health, productivity and the overall quality of life. Objective: The objective of this study was to investigate the factors that contribute to Noise Induced Hearing Loss amongst employees at the Platinum mine in the Limpopo Province, South Africa. Study method: A qualitative, phenomenological, exploratory, descriptive, contextual design was applied in order to explore and describe the contributory factors. Purposive non-probability sampling was used to select 10 male employees who were diagnosed with NIHL in the year 2014 in four mine shafts, and 10 managers who were involved in a Hearing Conservation Programme. The data were collected using semi-structured one-on-one interviews. A qualitative data analysis of Tesch’s approach was followed. Results: The following themes emerged: Experiences and challenges faced by employees in the work environment, hearing protective device factors and management and leadership factors. Hearing loss was caused by partial application of guidelines, policies, and procedures from the Department of Minerals and Energy. Conclusion: The study results indicate that although there are guidelines, policies, and procedures available, failure in the implementation of one element will affect the development and maintenance of employees hearing mechanism. It is recommended that the mine management should apply the guidelines, policies, and procedures and promptly repair the broken hearing protective devices.

Keywords: employees, factors, noise induced hearing loss, noise exposure

Procedia PDF Downloads 118
24526 A Deep Learning Based Approach for Dynamically Selecting Pre-processing Technique for Images

Authors: Revoti Prasad Bora, Nikita Katyal, Saurabh Yadav

Abstract:

Pre-processing plays an important role in various image processing applications. Most of the time due to the similar nature of images, a particular pre-processing or a set of pre-processing steps are sufficient to produce the desired results. However, in the education domain, there is a wide variety of images in various aspects like images with line-based diagrams, chemical formulas, mathematical equations, etc. Hence a single pre-processing or a set of pre-processing steps may not yield good results. Therefore, a Deep Learning based approach for dynamically selecting a relevant pre-processing technique for each image is proposed. The proposed method works as a classifier to detect hidden patterns in the images and predicts the relevant pre-processing technique needed for the image. This approach experimented for an image similarity matching problem but it can be adapted to other use cases too. Experimental results showed significant improvement in average similarity ranking with the proposed method as opposed to static pre-processing techniques.

Keywords: deep-learning, classification, pre-processing, computer vision, image processing, educational data mining

Procedia PDF Downloads 156
24525 Power Asymmetry and Major Corporate Social Responsibility Projects in Mhondoro-Ngezi District, Zimbabwe

Authors: A. T. Muruviwa

Abstract:

Empirical studies of the current CSR agenda have been dominated by literature from the North at the expense of the nations from the South where most TNCs are located. Therefore, owing to the limitations of the current discourse that is dominated by Western ideas such as voluntarism, philanthropy, business case and economic gains, scholars have been calling for a new CSR agenda that is South-centred and addresses the needs of developing nations. The development theme has dominated in the recent literature as scholars concerned with the relationship between business and society have tried to understand its relationship with CSR. Despite a plethora of literature on the roles of corporations in local communities and the impact of CSR initiatives, there is lack of adequate empirical evidence to help us understand the nexus between CSR and development. For all the claims made about the positive and negative consequences of CSR, there is surprisingly little information about the outcomes it delivers. This study is a response to these claims made about the developmental aspect of CSR in developing countries. It offers some empirical bases for assessing the major CSR projects that have been fulfilled by a major mining company, Zimplats in Mhondoro-Ngezi Zimbabwe. The neo-liberal idea of capitalism and market dominations has empowered TNCs to stamp their authority in the developing countries. TNCs have made their mark in developing nations as they stamp their global private authority, rivalling or implicitly challenging the state in many functions. This dominance of corporate power raises great concerns over their tendencies of abuses in terms of environmental, social and human rights concerns as well as how to make them increasingly accountable. The hegemonic power of TNCs in the developing countries has had a tremendous impact on the overall CSR practices. While TNCs are key drivers of globalization they may be acting responsibly in their Global Northern home countries where there is a combination of legal mechanisms and the fear of civil society activism associated with corporate scandals. Using a triangulated approach in which both qualitative and quantitative methods were used the study found out that most CSR projects in Zimbabwe are dominated and directed by Zimplats because of the power it possesses. Most of the major CSR projects are beneficial to the mining company as they serve the business plans of the mining company. What was deduced from the study is that the infrastructural development initiatives by Zimplats confirm that CSR is a tool to advance business obligations. This shows that although proponents of CSR might claim that business has a mandate for social obligations to society, we need not to forget the dominant idea that the primary function of CSR is to enhance the firm’s profitability.

Keywords: hegemonic power, projects, reciprocity, stakeholders

Procedia PDF Downloads 249
24524 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances of computer science and data analysis are helping to provide continuously huge volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information-integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, we focus on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 446
24523 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

As there can be cases where the use of real data is somehow limited, such as when it is hard to get access to a large volume of real data, we need to go for synthetic data generation. This produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data since the MTS is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that carries out the data categorization into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 126
24522 Allele Mining for Rice Sheath Blight Resistance by Whole-Genome Association Mapping in a Tail-End Population

Authors: Naoki Yamamoto, Hidenobu Ozaki, Taiichiro Ookawa, Youming Liu, Kazunori Okada, Aiping Zheng

Abstract:

Rice sheath blight is one of the destructive fungal diseases in rice. We have thought that rice sheath blight resistance is a polygenic trait. Host-pathogen interactions and secondary metabolites such as lignin and phytoalexins are likely to be involved in defense against R. solani. However, to our knowledge, it is still unknown how sheath blight resistance can be enhanced in rice breeding. To seek for an alternative genetic factor that contribute to sheath blight resistance, we mined relevant allelic variations from rice core collections created in Japan. Based on disease lesion length on detached leaf sheath, we selected 30 varieties of the top tail-end and the bottom tail-end, respectively, from the core collections to perform genome-wide association mapping. Re-sequencing reads for these varieties were used for calling single nucleotide polymorphisms among the 60 varieties to create a SNP panel, which contained 1,137,131 homozygous variant sites after filitering. Association mapping highlighted a locus on the long arm of chromosome 11, which is co-localized with three sheath blight QTLs, qShB11-2-TX, qShB11, and qSBR-11-2. Based on the localization of the trait-associated alleles, we identified an ankyryn repeat-containing protein gene (ANK-M) as an uncharacterized candidate factor for rice sheath blight resistance. Allelic distributions for ANK-M in the whole rice population supported the reliability of trait-allele associations. Gene expression characteristics were checked to evaluiate the functionality of ANK-M. Since an ANK-M homolog (OsPIANK1) in rice seems a basal defense regulator against rice blast and bacterial leaf blight, ANK-M may also play a role in the rice immune system.

Keywords: allele mining, GWAS, QTL, rice sheath blight

Procedia PDF Downloads 73
24521 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga. Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI have offered a platform that is user-friendly and accessible to non-technical professionals to generate synthetic data to augment existing data for further analysis. The SDV also provides for additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC and AUC curves for the data generated by adding the layer of Gaussian copula are much higher than the data generated by the CTGAN.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 74
24520 Groundwater Treatment of Thailand's Mae Moh Lignite Mine

Authors: A. Laksanayothin, W. Ariyawong

Abstract:

Mae Moh Lignite Mine is the largest open-pit mine in Thailand. The mine serves coal to the power plant about 16 million tons per year. This amount of coal can produce electricity accounting for about 10% of Nation’s electric power generation. The mining area of Mae Moh Mine is about 28 km2. At present, the deepest area of the pit is about 280 m from ground level (+40 m. MSL) and in the future the depth of the pit can reach 520 m from ground level (-200 m.MSL). As the size of the pit is quite large, the stability of the pit is seriously important. Furthermore, the preliminary drilling and extended drilling in year 1989-1996 had found high pressure aquifer under the pit. As a result, the pressure of the underground water has to be released in order to control mine pit stability. The study by the consulting experts later found that 3-5 million m3 per year of the underground water is needed to be de-watered for the safety of mining. However, the quality of this discharged water should meet the standard. Therefore, the ground water treatment facility has been implemented, aiming to reduce the amount of naturally contaminated Arsenic (As) in discharged water lower than the standard limit of 10 ppb. The treatment system consists of coagulation and filtration process. The main components include rapid mixing tanks, slow mixing tanks, sedimentation tank, thickener tank and sludge drying bed. The treatment process uses 40% FeCl3 as a coagulant. The FeCl3 will adsorb with As(V), forming floc particles and separating from the water as precipitate. After that, the sludge is dried in the sand bed and then be disposed in the secured land fill. Since 2011, the treatment plant of 12,000 m3/day has been efficiently operated. The average removal efficiency of the process is about 95%.

Keywords: arsenic, coagulant, ferric chloride, groundwater, lignite, coal mine

Procedia PDF Downloads 308
24519 Integrative Omics-Portrayal Disentangles Molecular Heterogeneity and Progression Mechanisms of Cancer

Authors: Binder Hans

Abstract:

Cancer is no longer seen as solely a genetic disease where genetic defects such as mutations and copy number variations affect gene regulation and eventually lead to aberrant cell functioning which can be monitored by transcriptome analysis. It has become obvious that epigenetic alterations represent a further important layer of (de-)regulation of gene activity. For example, aberrant DNA methylation is a hallmark of many cancer types, and methylation patterns were successfully used to subtype cancer heterogeneity. Hence, unraveling the interplay between different omics levels such as genome, transcriptome and epigenome is inevitable for a mechanistic understanding of molecular deregulation causing complex diseases such as cancer. This objective requires powerful downstream integrative bioinformatics methods as an essential prerequisite to discover the whole genome mutational, transcriptome and epigenome landscapes of cancer specimen and to discover cancer genesis, progression and heterogeneity. Basic challenges and tasks arise ‘beyond sequencing’ because of the big size of the data, their complexity, the need to search for hidden structures in the data, for knowledge mining to discover biological function and also systems biology conceptual models to deduce developmental interrelations between different cancer states. These tasks are tightly related to cancer biology as an (epi-)genetic disease giving rise to aberrant genomic regulation under micro-environmental control and clonal evolution which leads to heterogeneous cellular states. Machine learning algorithms such as self organizing maps (SOM) represent one interesting option to tackle these bioinformatics tasks. The SOMmethod enables recognizing complex patterns in large-scale data generated by highthroughput omics technologies. It portrays molecular phenotypes by generating individualized, easy to interpret images of the data landscape in combination with comprehensive analysis options. Our image-based, reductionist machine learning methods provide one interesting perspective how to deal with massive data in the discovery of complex diseases, gliomas, melanomas and colon cancer on molecular level. As an important new challenge, we address the combined portrayal of different omics data such as genome-wide genomic, transcriptomic and methylomic ones. The integrative-omics portrayal approach is based on the joint training of the data and it provides separate personalized data portraits for each patient and data type which can be analyzed by visual inspection as one option. The new method enables an integrative genome-wide view on the omics data types and the underlying regulatory modes. It is applied to high and low-grade gliomas and to melanomas where it disentangles transversal and longitudinal molecular heterogeneity in terms of distinct molecular subtypes and progression paths with prognostic impact.

Keywords: integrative bioinformatics, machine learning, molecular mechanisms of cancer, gliomas and melanomas

Procedia PDF Downloads 147