Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 25101

Search results for: clustering on flowing data

24561 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically relied on rigorous empirical studies by research institutes, as well as less reliable immediate surveys and polls, often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real time. The question is not why people did something over the last 10 years, but why they are doing it now, whether it is undesirable, and how we can have an impact to promote change immediately. Big data analytics relies heavily on government data that has been released into the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 362

24560 Analysis of the Impact of Suez Canal on the Robustness of Global Shipping Networks

Authors: Zimu Li, Zheng Wan

Abstract:

The Suez Canal plays an important role in global shipping networks and is one of the most frequently used waterways in the world. The obstruction of the canal by the ship Ever Given in March 2021, however, completely blocked the Suez Canal for a week and caused significant disruption to world trade. Therefore, it is very important to quantitatively analyze the impact of the accident on the robustness of the global shipping network. However, current research on maritime transportation networks is usually limited to local or small-scale networks in a certain region. Based on complex network theory, this study establishes a global shipping complex network covering 2713 nodes and 137830 edges by using the real trajectory data of the global marine transport ship automatic identification system in 2018. Two attack modes, deliberate (Suez Canal blocking) and random, are defined to calculate the changes in network node degree, eccentricity, clustering coefficient, network density, network isolated nodes, betweenness centrality, and closeness centrality under the two attack modes, and to quantitatively analyze the actual impact of the Suez Canal blocking on the robustness of the global shipping network. The results of the network robustness analysis show that the Suez Canal blocking was more destructive to the shipping network than random attacks of the same scale. Network connectivity and accessibility decreased significantly, and the decline decreased with the distance between the port and the canal, showing the phenomenon of distance attenuation. This study further analyzes the impact of the blocking of the Suez Canal on Chinese ports and finds that the blocking significantly interferes with China's shipping network and seriously affects China's normal trade activities. Finally, the impact on the global supply chain is analyzed, and it is found that blocking the canal will seriously damage the normal operation of the global supply chain.
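The comparison of a deliberate attack with a random one can be illustrated on a toy graph. The following is a minimal sketch, not the authors' code: the five-node network, the node names, and the choice of node "D" as a stand-in for one random removal are all hypothetical, and only three of the listed metrics (density, local clustering coefficient, isolated nodes) are computed.

```python
from itertools import combinations


def build(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def density(adj):
    """Edge density of a simple undirected graph: 2m / (n(n-1))."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


def local_clustering(adj, v):
    """Fraction of neighbor pairs of v that are directly connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def isolated_nodes(adj):
    """Nodes left without any neighbor."""
    return [v for v, nbrs in adj.items() if not nbrs]


def remove_node(adj, target):
    """Return a copy of the graph with target and its edges removed."""
    return {v: nbrs - {target} for v, nbrs in adj.items() if v != target}


# Hypothetical toy network: "canal" is a hub, "D" is reachable only through it.
edges = [("canal", "A"), ("canal", "B"), ("canal", "C"),
         ("canal", "D"), ("A", "B"), ("B", "C")]
network = build(edges)

after_targeted = remove_node(network, "canal")   # deliberate attack on the hub
after_peripheral = remove_node(network, "D")     # stand-in for a random removal

print("density:", density(network), density(after_targeted),
      density(after_peripheral))
print("isolated after targeted attack:", isolated_nodes(after_targeted))
```

In this toy case, removing the hub lowers the density and isolates a node, while removing a peripheral node does neither, which mirrors the qualitative result reported in the abstract.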

Keywords: global shipping networks, ship AIS trajectory data, main channel, complex network, eigenvalue change

Procedia PDF Downloads 175

24559 Antibacterial Evaluation, in Silico ADME and QSAR Studies of Some Benzimidazole Derivatives

Authors: Strahinja Kovačević, Lidija Jevrić, Miloš Kuzmanović, Sanja Podunavac-Kuzmanović

Abstract:

In this paper, various derivatives of benzimidazole have been evaluated against the Gram-negative bacterium Escherichia coli. For all investigated compounds, the minimum inhibitory concentration (MIC) was determined. Quantitative structure-activity relationship (QSAR) analysis attempts to find consistent relationships between the variations in the values of molecular properties and the biological activity for a series of compounds, so that these rules can be used to evaluate new chemical entities. The correlation between MIC and some absorption, distribution, metabolism and excretion (ADME) parameters was investigated, and mathematical models for predicting the antibacterial activity of this class of compounds were developed. The quality of the multiple linear regression (MLR) models was validated by the leave-one-out (LOO) technique, as well as by the calculation of statistical parameters for the developed models, and the results are discussed on the basis of the statistical data. The results of this study indicate that ADME parameters have a significant effect on the antibacterial activity of this class of compounds. Principal component analysis (PCA) and agglomerative hierarchical clustering analysis (HCA) confirmed that the investigated molecules can be classified into groups on the basis of the ADME parameters: Madin-Darby Canine Kidney cell permeability (MDCK), plasma protein binding (PPB%), human intestinal absorption (HIA%) and human colon carcinoma cell permeability (Caco-2).
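Leave-one-out validation of a regression model can be sketched as follows. This is an illustrative simplification under stated assumptions: it fits a single-descriptor least-squares line rather than the multiple linear regression used in the paper, the function names are hypothetical, and the data points are made up. The cross-validated coefficient is computed as Q² = 1 − PRESS / SS_tot.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without observation i, then predict it.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Made-up descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print("Q2 (LOO):", loo_q2(descriptor, activity))
```

A Q² close to 1 indicates that each left-out compound is predicted well by a model fitted on the remaining ones.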

Keywords: benzimidazoles, QSAR, ADME, in silico

Procedia PDF Downloads 369

24558 Effect of Fractional Flow Curves on the Heavy Oil and Light Oil Recoveries in Petroleum Reservoirs

Authors: Abdul Jamil Nazari, Shigeo Honma

Abstract:

This paper evaluates and compares the effect of fractional flow curves on heavy oil and light oil recoveries in a petroleum reservoir. Fingering of the flowing water is one of the serious problems of oil displacement by water, and another problem is estimating the amount of recoverable oil in a petroleum reservoir. To address these problems, the fractional flows of heavy oil and light oil are investigated. The fractional flow approach treats the multi-phase flow rate as a total mixed fluid and then describes the individual phases as fractions of the total flow. Laboratory experiments are carried out for two different types of oil, heavy oil and light oil, to experimentally obtain relative permeability and fractional flow curves. Application of the light oil fractional flow curve, which exhibits a regular S-shape, to the water flooding method showed that a large amount of mobile oil in the reservoir is displaced by water injection. In contrast, the fractional flow curve of heavy oil does not display an S-shape because of its high viscosity. Although the advance of the injected waterfront is faster than in light oil reservoirs, a significant amount of mobile oil remains behind the waterfront.
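The influence of oil viscosity on the fractional flow of water can be sketched with the standard expression f_w = 1 / (1 + (k_ro / k_rw)(μ_w / μ_o)), which neglects gravity and capillary pressure. The following is a minimal sketch under assumptions: the Corey-type relative permeabilities with exponent 2, the end-point saturations, and the viscosity values are illustrative choices, not the laboratory data of the paper.

```python
def fractional_flow(sw, mu_w, mu_o, swc=0.2, sor=0.2):
    """Water fractional flow at water saturation sw.

    Assumes Corey-type relative permeabilities with exponent 2 and
    neglects gravity and capillary pressure (illustrative assumptions).
    """
    # Normalized (effective) water saturation, clamped to [0, 1].
    se = (sw - swc) / (1.0 - swc - sor)
    se = min(max(se, 0.0), 1.0)
    krw = se ** 2            # water relative permeability
    kro = (1.0 - se) ** 2    # oil relative permeability
    if krw == 0.0:
        return 0.0           # no mobile water
    return 1.0 / (1.0 + (kro / krw) * (mu_w / mu_o))


# Hypothetical viscosities in mPa.s: water 1, light oil 2, heavy oil 100.
for sw in (0.3, 0.4, 0.5, 0.6, 0.7):
    light = fractional_flow(sw, mu_w=1.0, mu_o=2.0)
    heavy = fractional_flow(sw, mu_w=1.0, mu_o=100.0)
    print(f"Sw={sw:.1f}  f_w(light)={light:.3f}  f_w(heavy)={heavy:.3f}")
```

Under these assumptions, the water fraction at a given saturation is higher for the more viscous oil, which is consistent with a fast-moving waterfront that leaves mobile oil behind.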

Keywords: fractional flow, relative permeability, oil recovery, water fingering

Procedia PDF Downloads 298

24557 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed as "big data." The technique of big data mining and big data analysis is extremely helpful for business movements such as making decisions, building organisational plans, researching the market efficiently, improving sales, etc., because typical management tools cannot handle such complicated datasets. Special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations, are brought on by big data. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining, its process, along with problems and difficulties, with a focus on the unique characteristics of big data. Organizations have several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is really analyzed. The idea of the mining and analysis of data and knowledge discovery techniques that have recently been created with practical application systems is presented in this study. This article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the management's main big data and data mining challenges.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 103

24556 The Impact of Urbanisation on Sediment Concentration of Ginzo River in Katsina City, Katsina State, Nigeria

Authors: Ahmed A. Lugard, Mohammed A. Aliyu

Abstract:

This paper studied the influence of urban development and its accompanying land surface transformation on the sediment concentration of the naturally flowing Ginzo River across the city of Katsina. An opposite twin river known as the Tille River, which is less urbanized, was used for comparison with the sediment concentration of the Ginzo River in order to ascertain the effect of the urban area on sediment concentration. An instrument called the USP 61 point-integrating cableway sampler, described by Gregory and Walling (1973), was used to collect the suspended sediment samples in the wet season months of June, July, August and September. The results show that only the sample collected at the peripheral site of the city, which is mostly farmland, resembles the results at the four sites of the Tille River, which is the reference stream in the study. The difference between them was found to be only +10%, while at the other three sites of the Ginzo, which are highly urbanized, the values are 35-45% lower than those obtained at the four sites of the Tille River. In the generalized assessment, the t-distribution test applied to the two sets of data shows that there is a significant difference between the sediment concentration of the urbanized River Ginzo and that of the less urbanized River Tille. The study further found that the lower sediment concentration in the urbanized River Ginzo is attributable to concreted surfaces, tarred roads, concrete channeling of segments of the river including the river bed, and reserved open grassland areas, all within the catchment. The study therefore concludes that urbanization affects not only the hydrology of an urbanized river basin, but also the sediment concentration, which is a significant aspect of its geomorphology. This would certainly affect the flood plain of the basin at some point, which might be suitable land for cultivation.
It is recommended that further studies on the impact of urbanization on river basins should focus on all elements of geomorphology, as has been done for hydrology. This would make the work more complete, as the two disciplines are inseparable from each other. The authorities concerned should also introduce more appropriate environmental and land use management policies to arrest the menace of land degradation and related episodic events.

Keywords: environment, infiltration, river, urbanization

Procedia PDF Downloads 310

24555 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated vast potential in streamlining operations, supporting decisions, and spotting business trends in fields such as manufacturing, finance, and information technology. This paper gives a multi-disciplinary overview of the research issues in big data and of its techniques, tools, and frameworks related to privacy, data storage management, network and energy utilization, fault tolerance, and data representation. In addition, the resulting challenges and opportunities in the Big Data platform are discussed.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 305

24554 Survey on Big Data Stream Classification by Decision Tree

Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi

Abstract:

Nowadays, the development of computer technology and its recent applications provide access to new types of data that have not been considered by traditional data analysts. Two particularly interesting characteristics of such data sets are their huge size and streaming nature. Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey of the obstacles and requirements involved in classifying data streams using decision trees. The most important issue is to maintain a balance between accuracy and efficiency: the algorithm should provide good classification performance with a reasonable response time.
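The abstract does not name a specific algorithm. As one illustration of the accuracy versus efficiency trade-off, a widely used incremental learner, the Hoeffding tree, delays a split until enough stream examples have been seen, using the Hoeffding bound ε = sqrt(R² ln(1/δ) / (2n)). The sketch below assumes that setting; the function names and the example gain values are hypothetical.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean lies
    within epsilon of the mean observed over n examples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n, n_classes=2, delta=1e-7):
    """Split when the best attribute beats the runner-up by more than epsilon.

    For information gain the range R is log2(number of classes).
    """
    epsilon = hoeffding_bound(math.log2(n_classes), delta, n)
    return (best_gain - second_gain) > epsilon


# Hypothetical gains observed at a leaf after n stream examples.
for n in (100, 1000, 10000):
    print(n, round(hoeffding_bound(1.0, 1e-7, n), 4),
          should_split(0.30, 0.10, n))
```

A smaller δ gives more reliable splits but requires more examples before the tree grows, which is one concrete form of the balance between accuracy and response time.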

Keywords: big data, data streams, classification, decision tree

Procedia PDF Downloads 513

24553 Robust and Dedicated Hybrid Cloud Approach for Secure Authorized Deduplication

Authors: Aishwarya Shekhar, Himanshu Sharma

Abstract:

Data deduplication is one of the important data compression techniques for eliminating duplicate copies of repeating data, and it has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. In this process, duplicate data is expunged, leaving only one copy, that is, a single instance of the data, to be stored. An index of every piece of data is still maintained. Data deduplication is an approach for minimizing the storage space an organization requires to retain its data. In most companies, the storage systems carry identical copies of numerous pieces of data. Deduplication eliminates these additional copies by saving just one copy of the data and replacing the other copies with pointers that lead back to the primary copy. To avoid this duplication of data and to preserve confidentiality in the cloud, we apply the concept of a hybrid cloud. A hybrid cloud is a fusion of at least one public and one private cloud. As a proof of concept, we implement Java code that provides security and removes all types of duplicated data from the cloud.
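The idea of keeping a single instance of the data and replacing other copies with pointers can be sketched as a content-addressed store. This is a minimal sketch and not the authors' Java implementation: the class and method names are hypothetical, SHA-256 is an assumed choice of fingerprint, and the authorization and confidentiality layer described in the paper is omitted.

```python
import hashlib


class DedupStore:
    """Toy file-level deduplicating store (hypothetical names)."""

    def __init__(self):
        self.blocks = {}  # digest -> bytes, the single stored instance
        self.index = {}   # file name -> digest, the pointer to that instance

    def put(self, name, data):
        """Store data under name, reusing an existing identical copy."""
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)  # keep only the first copy
        self.index[name] = digest
        return digest

    def get(self, name):
        """Follow the pointer back to the stored copy."""
        return self.blocks[self.index[name]]


store = DedupStore()
store.put("report_v1.txt", b"quarterly figures")
store.put("report_copy.txt", b"quarterly figures")  # duplicate content
store.put("notes.txt", b"meeting notes")

print("files indexed:", len(store.index))
print("copies stored:", len(store.blocks))
```

Three files are indexed but only two payloads are stored, because the two identical files point at the same stored copy.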

Keywords: confidentiality, deduplication, data compression, hybridity of cloud

Procedia PDF Downloads 375

24552 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in all engineering, science, and many other domains. The potential of large or massive data is undoubtedly significant, but making sense of it requires new ways of thinking and new learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. This paper reviews the latest advances in research on machine learning for big data processing. First, the machine learning techniques used in recent studies are reviewed, such as deep learning, representation learning, transfer learning, active learning, and distributed and parallel learning. The paper then focuses on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 433

24551 Analyzing the Impact of Global Financial Crisis on Interconnectedness of Asian Stock Markets Using Network Science

Authors: Jitendra Aswani

Abstract:

In the first section of this study, the impact of the Global Financial Crisis (GFC) on the synchronization of fourteen Asian Stock Markets (ASMs), namely Hong Kong, India, Thailand, Singapore, Taiwan, Pakistan, Bangladesh, South Korea, Malaysia, Indonesia, Japan, China, the Philippines and Sri Lanka, is analysed using network science and metrics such as node degree, clustering coefficient and network density. In the second section, the spread of the crisis from the US stock market to the Asian stock markets is explained by introducing the US stock market into the existing network and developing a Minimum Spanning Tree (MST). The data used for this study are the adjusted closing prices of these indices from 6th January, 2000 to 15th September, 2013, further divided into three sub-periods: pre-crisis, during the crisis and post-crisis. Using network analysis, it is found that Asian stock markets become more interdependent during the crisis than before or after it, and that Hong Kong, India, South Korea and Japan are systemically important stock markets in the Asian region. Therefore, a failure of or shock to any of these systemically important stock markets can cause contagion to other stock markets of the region. This study is useful for global investors in portfolio management, especially during a crisis period, and for policy makers in formulating financial regulation norms, by showing the connections between the stock markets and how the system of these stock markets changes during and after a crisis.
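Building a minimum spanning tree over a market network with Kruskal's algorithm (named in the keywords) can be sketched as follows. This is a minimal sketch under assumptions: the correlation values are made up, only four markets are shown, and the conversion of a correlation ρ into a distance d = sqrt(2(1 − ρ)) is a common convention in the literature that the abstract does not confirm.

```python
import math


def kruskal_mst(nodes, weighted_edges):
    """Kruskal's algorithm with union-find; edges are (weight, a, b)."""
    parent = {v: v for v in nodes}

    def find(v):
        # Find the component root, compressing the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for weight, a, b in sorted(weighted_edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # edge joins two components: keep it
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Made-up return correlations between four markets, for illustration only.
correlations = {
    ("US", "HK"): 0.8, ("HK", "IN"): 0.7, ("HK", "JP"): 0.6,
    ("US", "IN"): 0.5, ("US", "JP"): 0.4, ("IN", "JP"): 0.3,
}
markets = ["US", "HK", "IN", "JP"]
# Assumed distance: highly correlated markets are close together.
edges = [(math.sqrt(2.0 * (1.0 - rho)), a, b)
         for (a, b), rho in correlations.items()]

mst = kruskal_mst(markets, edges)
for a, b, dist in mst:
    print(f"{a} - {b}: distance {dist:.3f}")
```

In this toy example every retained edge touches Hong Kong, so a shock entering from the US would reach the other markets through it; in the study, the tree is read in the same way to trace the spread of the crisis.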

Keywords: global financial crisis, Asian stock markets, network science, Kruskal algorithm

Procedia PDF Downloads 411

24550 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangements for data management in Indonesia. This paper is descriptive research with a qualitative approach, collecting data from a literature study. The result of this paper is a comprehensive arrangement of data that has been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and on the protection of personal data are mutually reinforcing in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption

Procedia PDF Downloads 179

24549 Parametric and Analysis Study of the Melting in Slabs Heated by a Laminar Heat Transfer Fluid in Downward and Upward Flows

Authors: Radouane Elbahjaoui, Hamid El Qarnia

Abstract:

The present work aims to investigate numerically the thermal and flow characteristics of a rectangular latent heat storage unit (LHSU) during the melting process of a phase change material (PCM). The LHSU consists of a number of vertical and identical plates of PCM separated by rectangular channels. The melting process is initiated when the LHSU is heated by a heat transfer fluid (HTF: water) flowing in channels in a downward or upward direction. The proposed study is motivated by the need to optimize the thermal performance of the LHSU by accelerating the charging process. A mathematical model is developed and a fixed-grid enthalpy formulation is adopted for modeling the melting process coupling with convection-conduction heat transfer. The finite volume method was used for discretization. The obtained numerical results are compared with experimental, analytical and numerical ones found in the literature and reasonable agreement is obtained. Thereafter, the numerical investigations were carried out to highlight the effects of the HTF flow direction and the aspect ratio of the PCM slabs on the heat transfer characteristics and thermal performance enhancement of the LHSU.

Keywords: PCM, TES, LHSU, melting

Procedia PDF Downloads 254

24548 Bi-Directional Impulse Turbine for Thermo-Acoustic Generator

Authors: A. I. Dovgjallo, A. B. Tsapkova, A. A. Shimanov

Abstract:

The paper is devoted to one type of engine with external heating, the thermoacoustic engine. In a thermoacoustic engine, heat energy is converted to acoustic energy. The acoustic energy of the oscillating gas flow must then be converted to mechanical energy, and this energy in turn must be converted to electric energy. The most widely used way of transforming acoustic energy into electric energy is a linear generator or a conventional generator with a crank mechanism. In both cases, a piston is used. The main disadvantages of using a piston are friction losses, lubrication problems and working fluid pollution, which decrease engine power and ecological efficiency. The use of a bidirectional impulse turbine as an energy converter is suggested. The distinctive feature of this kind of turbine is that the shock wave of the oscillating gas flow passing through the turbine is reflected and passes through the turbine again in the opposite direction. The direction of turbine rotation does not change in the process. Different types of bidirectional impulse turbines for thermoacoustic engines are analyzed. The Wells turbine is the simplest and least efficient of them. A radial impulse turbine has a more complicated design and is more efficient than the Wells turbine. The most appropriate type of impulse turbine was chosen. This type is an axial impulse turbine, which has a simpler design than a radial turbine and similar efficiency. The peculiarities of the impulse turbine calculation method are discussed. They include changes in gas pressure and velocity as functions of time during the generation of shock waves in the oscillating gas flow of a thermoacoustic system. In a thermoacoustic system, pressure changes continuously according to a certain law due to acoustic wave generation. The peak pressure value is the amplitude, which determines the acoustic power.
Gas flowing in a thermoacoustic system periodically changes its direction, and its mean velocity is equal to zero, but its peak values can be used to rotate a bi-directional turbine. In contrast with a feed turbine, the described turbine operates on unsteady oscillating flows with direction changes, which significantly influences the algorithm of its calculation. The calculated power output is 150 W at a rotational speed of 12000 r/min and a pressure amplitude of 1.7 kPa. Then, 3D modeling and numerical research of the impulse turbine were carried out. As a result of the numerical modeling, the main parameters of the working fluid in the turbine were obtained. On the basis of the theoretical and numerical data, a model of the impulse turbine was made on a 3D printer. An experimental unit was designed to verify the numerical modeling results. An acoustic speaker was used as the acoustic wave generator. Analysis of the acquired data shows that use of the bi-directional impulse turbine is advisable. In its characteristics as a converter, it is comparable with linear electric generators, but its lifetime will be longer and the engine itself will be smaller due to the rotary motion of the turbine.

Keywords: acoustic power, bi-directional pulse turbine, linear alternator, thermoacoustic generator

Procedia PDF Downloads 376

24547 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 135

24546 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)

Procedia PDF Downloads 231

24545 Flow Control Optimisation Using Vortex Generators in Turbine Blade

Authors: J. Karthik, G. Vinayagamurthy

Abstract:

Aerodynamic flow control is achieved by the interaction of a flowing medium with a corresponding structure so that its natural flow state is disturbed to delay the transition point. This paper explains the aerodynamic effect and optimized design of vortex generators on a turbine blade to achieve maximum flow control. The airfoil is chosen from the NREL (National Renewable Energy Laboratory) S-series airfoils, as they are characterized by good lift characteristics and lower noise. The vortex generators chosen are Ogival, Rectangular, Triangular and Tapered Fin shapes attached near the leading edge. The vortex generators are distributed from the primary to the tip of the blade section. The design wind speed is taken as 6 m/s and the computational analysis is executed. The blade surface is simulated using the k-ɛ SST model and the results are compared with X-FOIL results. The computational results are validated using wind tunnel testing of the blade at the design speed. The effect of the vortex generators on the flow characteristics is studied from the results of the analysis. By comparing the computational and test results for all shapes of vortex generators, the optimized design is achieved for effective flow control on the blade.

Keywords: flow control, vortex generators, design optimisation, CFD

Procedia PDF Downloads 399

24544 Prediction of Cutting Tool Life in Drilling of Reinforced Aluminum Alloy Composite Using a Fuzzy Method

Authors: Mohammed T. Hayajneh

Abstract:

Machining of Metal Matrix Composites (MMCs) is a very significant process and has been a main problem that draws many researchers to investigate the characteristics of MMCs during different machining processes. The poor machining properties of hard-particle-reinforced MMCs make drilling a rather interesting task. Unlike drilling of conventional materials, many serious problems can be encountered during drilling of MMCs, such as tool wear and cutting forces. Cutting tool wear is a very significant concern in industry. Cutting tool wear not only influences the quality of the drilled hole, but also affects the cutting tool life. Predicting the cutting tool life during drilling is essential for optimizing the cutting conditions. However, the relationship between tool life and cutting conditions, tool geometrical factors and workpiece material properties has not yet been established by any machining theory. In this research work, a fuzzy subtractive clustering system has been used to model the cutting tool life in drilling of Al₂O₃ particle reinforced aluminum alloy composite in order to investigate the effect of cutting conditions on cutting tool life. This investigation can help in controlling and optimizing cutting conditions when the process parameters are adjusted. The model built for predicting the tool life uses drill diameter, cutting speed, and cutting feed rate as input data. The validity of the model was confirmed by tests under various cutting conditions. Experimental results have shown the efficiency of the model in predicting cutting tool life.

Keywords: composite, fuzzy, tool life, wear

Procedia PDF Downloads 289

24543 A Review Paper on Data Mining and Genetic Algorithm

Authors: Sikander Singh Cheema, Jasmeen Kaur

Abstract:

In this paper, the concept of data mining is summarized, along with one of its important processes, knowledge discovery in databases (KDD). Data mining based on the genetic algorithm is examined, and ways of achieving data mining with the genetic algorithm are surveyed. This paper also conducts a formal review of data mining tasks and the genetic algorithm in various fields.

Keywords: data mining, KDD, genetic algorithm, descriptive mining, predictive mining

Procedia PDF Downloads 584

24542 Data-Mining Approach to Analyzing Industrial Process Information for Real-Time Monitoring

Authors: Seung-Lock Seo

Abstract:

This work presents a data-mining empirical monitoring scheme for industrial processes with partially unbalanced data. Measurement data of good operations are relatively easy to gather, but during unusual special events or faults it is generally difficult to collect process information, or almost impossible to analyze some noisy data of industrial processes. In such cases, noise filtering techniques can be used to enhance process monitoring performance on a real-time basis. In addition, pre-processing of raw process data helps to eliminate unwanted variation in industrial process data. In this work, the performance of various monitoring schemes was tested and demonstrated for discrete batch process data. The results showed that the monitoring performance was improved significantly in terms of the monitoring success rate for the given process faults.

Keywords: data mining, process data, monitoring, safety, industrial processes

Procedia PDF Downloads 391

24541 Identification of Paleogeomorphology at Kedulan Temple, Sleman, Yogyakarta

Authors: Virgina Claudia Latengke, Muhaammad Nur Arifin, Vanny Septia Sundari

Abstract:

Kedulan Temple is located in Dusun Kedulan, Sleman, Yogyakarta, Indonesia, at coordinates S 07° 44′ 57″, E 110° 28′ 17″. Kedulan Temple is a relic of life in the 3rd century AD. The temple is an exhumed landform: the primordial landform was first the surface topography, then was buried under a covering mass, and was later exposed again. This is recognized by the existence of ancient soil (paleosol) and ancient objects. Judging from the type of soil that covers the temple, there are 13 layers of lava-type material, so it is estimated that the lava that buried the temple came from 13 eruptions of Mount Merapi. The material that buries the base of the temple consists of pyroclastic surge deposits in 3 layers, each bounded by a thin layer of paleosol; the sediments are dated at 1445+/-50 yBP, 1175+/-50 yBP, and 1060+/-40 yBP. The temple was buried and dug out again at 940+/-100 yBP. The temple was then affected by an earthquake, so that the floor and foundation became uneven and most of the temple stones were thrown out of place. The temple was left alone until it was exposed to hot clouds in 1285 AD (740+/-50 yBP). It was then repeatedly buried by lava in 4 periods, in 1587 AD (360+/-50 yBP, 240+/-50 yBP, 200+/-50 yBP, and an unknown date). Studying this temple reveals the paleogeomorphological processes that occurred in Yogyakarta, especially those related to the volcanic activity of Mount Merapi. Water still flows around the temple today, so a fluvial process has begun to play a role at the temple.

Keywords: Kedulan temple, paleogeomorphology, buried, mount Merapi, Yogyakarta

Procedia PDF Downloads 167

24540 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis are helping to provide continuously growing volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 443

24539 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

Where the use of real data is limited, for example when it is hard to get access to a large volume of real data, synthetic data generation is needed. It produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data, since MTS data are now being gathered more often in various real-world systems. The GAN-generated MTS data are then fed into a transformer-based deep learning architecture that classifies the data into predefined classes. The model is further evaluated across various distinct domains by generating the corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 123

24538 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga, Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders are becoming more common as a way to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring an extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI offer a user-friendly way for non-technical professionals to generate synthetic data to augment existing data for further analysis. The SDV also provides additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC curves and AUC values for the data generated with the added Gaussian copula layer are much higher than those for the data generated by the CTGAN.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 69

24537 Exploring the Unintended Consequences of Loyalty Programs in the Gambling Sector

Authors: Violet Justine Mtonga, Cecilia Diaz

Abstract:

This paper explores the prevalence of loyalty programs in the UK gambling industry and their association with unintended consequences and harm amongst program members. The use of loyalty programs within the UK gambling industry has risen significantly, with over 40 million cards in circulation. Some research suggests that as of 2013-2014, nearly 95% of UK consumers had at least one loyalty card, with 78% being members of two or more programs, and that the average household possesses ‘22 loyalty programs’, nearly half of which tend to be used actively. The core design of loyalty programs is to create a relational ‘win-win’ approach in which value is jointly created between the parties involved through repetitive engagement. However, a main concern about the diffusion of gambling organisations’ loyalty programs amongst consumers is that organisations within the gambling industry might use them to over-influence customer engagement and potentially cause unintended harm. To help understand the complex phenomenon of the diffusion and adaptation of loyalty programs in the gambling industry, and the potential unintended outcomes, this study is theoretically underpinned by the social exchange theory of relationships, which is entrenched in the processes of social exchange of resources, rewards, and costs for long-term interactions and mutual benefits. Qualitative data were collected via in-depth interviews with 14 customers and 12 employees of UK land-based gambling firms. Data were analysed using a combination of thematic and clustering analysis to reveal the emerging themes regarding the use of loyalty cards by gambling companies and to explore subgroups within the sample. The study’s results indicate that loyalty program engagement and usage have different unintended consequences and harms, such as maladaptive gambling behaviours, risk of compulsiveness, and loyalty programs promoting gambling from home.
Furthermore, there is a strong indication of a rite of passage among loyalty program members. There is also strong evidence of other unfavorable behaviors such as amplified gambling habits and risk-taking practices. Additionally, in the pursuit of rewards, loyalty program incentives lead to overconsumption and heightened expenditure. Overall, the primary findings of this study show that loyalty programs in the gambling industry should be designed with an ethical perspective and practice.

Keywords: gambling, loyalty programs, social exchange theory, unintended harm

Procedia PDF Downloads 83

24536 CFD Simulation and Investigation of Critical Two-Phase Flow Rate in Wellhead Choke

Authors: Alireza Rafie Boldaji, Ahmad Saboonchi

Abstract:

Chokes are commonly used in oil and gas production systems. A choke is a restriction designed to control the flow rates of oil and gas wells, to prevent downstream disturbances from propagating upstream (critical flow), and to protect surface equipment against slugging at high flowing pressures. There are different methods to calculate the multiphase flow rate. One is separation followed by measurement with a one-phase flow meter; another common method is the use of a movable separator. Both are very labor-intensive and costly to operate. The method currently used is based on the differential pressure across the choke. Three groups of correlations describing two-phase flow through wellhead chokes were examined. The first group involved simple empirical equations similar to those of Gilbert, the second group comprised derived equations of two-phase flow incorporating PVT properties, and the third group is the computational method. In this article, we calculate the flow of oil and gas through a choke by simulating the two-phase flow with the computational fluid dynamics method, using Ansys Fluent for the simulation. Finally, we compare the results of the computational simulation with the empirical equations; the results show good agreement between experimental and numerical results.
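The first group of correlations mentioned above can be illustrated with a minimal sketch of a Gilbert-type equation, p_wh = c · R^m · q / S^n. The coefficient values below (c = 10, m = 0.546, n = 1.89) are the ones commonly quoted for Gilbert's correlation, and the units (pressure in psi, liquid rate in bbl/day, gas-liquid ratio in scf/bbl, choke size in 64ths of an inch) are assumptions made for this illustration; the function names are hypothetical, and nothing here is taken from the paper's own calculations.

```python
def wellhead_pressure(q, glr, choke_64ths, c=10.0, m=0.546, n=1.89):
    """Gilbert-type correlation: p_wh = c * glr**m * q / choke_64ths**n.

    q           liquid rate (assumed bbl/day)
    glr         gas-liquid ratio (assumed scf/bbl)
    choke_64ths choke bean size (assumed 64ths of an inch)
    c, m, n     empirical coefficients (defaults: values commonly
                quoted for Gilbert's correlation; an assumption here)
    """
    if min(q, glr, choke_64ths) <= 0:
        raise ValueError("q, glr and choke_64ths must be positive")
    return c * glr ** m * q / choke_64ths ** n


def choke_liquid_rate(p_wh, glr, choke_64ths, c=10.0, m=0.546, n=1.89):
    """Same correlation solved for the liquid rate q (critical flow assumed)."""
    if min(p_wh, glr, choke_64ths) <= 0:
        raise ValueError("p_wh, glr and choke_64ths must be positive")
    return p_wh * choke_64ths ** n / (c * glr ** m)


# Hypothetical operating point, for illustration only.
rate = choke_liquid_rate(p_wh=500.0, glr=800.0, choke_64ths=32.0)
```

Other correlations in the same family differ only in the values of `c`, `m`, and `n`, which is why they are exposed as parameters rather than hard-coded.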

Keywords: CFD, two-phase, choke, critical

Procedia PDF Downloads 271

24535 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name

Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing

Abstract:

Named Data Networking (NDN) replaces the IP addresses of traditional networks with data names and adopts a dynamic cache mechanism. In the existing mechanism, however, only one-to-one search can be achieved because every piece of data has a unique name corresponding to it. There is a certain mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and of the user’s interest can hardly be guaranteed. To solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption, which reduces the query overhead by optimizing the caching strategy. In this scheme, we use hash values to keep the user’s query safe from each node during the search, and likewise to protect the privacy of the requested data content.

Keywords: NDN, order-preserving encryption, fuzzy search, privacy

Procedia PDF Downloads 477

24534 Healthcare Big Data Analytics Using Hadoop

Authors: Chellammal Surianarayanan

Abstract:

The healthcare industry is generating large amounts of data, driven by various needs such as record keeping, physicians’ prescriptions, medical imaging, sensor data, Electronic Patient Records (EPR), laboratory, pharmacy, etc. Healthcare data is so big and complex that it cannot be managed by conventional hardware and software. The complexity of healthcare big data arises from the large volume of data, the velocity with which the data is accumulated, and its variety, which spans structured, semi-structured, and unstructured data. Despite this complexity, if the trends and patterns that exist within the big data are uncovered and analyzed, higher quality healthcare can be provided at lower cost. Hadoop is an open source software framework for distributed processing of large data sets across clusters of commodity hardware using a simple programming model. The core components of Hadoop include the Hadoop Distributed File System, which offers a way to store large amounts of data across multiple machines, and MapReduce, which offers a way to process large data sets with a parallel, distributed algorithm on a cluster. The Hadoop ecosystem also includes various other tools such as Hive (a SQL-like query language), Pig (a higher level query language for MapReduce), HBase (a columnar data store), etc. This paper analyzes how healthcare big data can be processed and analyzed using the Hadoop ecosystem.
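The MapReduce programming model described above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps only; it does not use the Hadoop API, and the record format (`patient_id, diagnosis`) and the sample records are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit one (diagnosis, 1) pair per input record."""
    _patient_id, diagnosis = record.split(",", 1)
    yield diagnosis.strip().lower(), 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the values collected for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical records, for illustration only.
records = ["p1, Diabetes", "p2, asthma", "p3, diabetes"]
counts = run_job(records)
print(counts)
```

On a real cluster, the map and reduce functions would run in parallel on different machines and the framework would perform the shuffle over the network; the division of work between the three steps is the same.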

Keywords: big data analytics, Hadoop, healthcare data, towards quality healthcare

Procedia PDF Downloads 407

24533 Data Disorders in Healthcare Organizations: Symptoms, Diagnoses, and Treatments

Authors: Zakieh Piri, Shahla Damanabi, Peyman Rezaii Hachesoo

Abstract:

Introduction: Healthcare organizations, like other organizations, suffer from a number of disorders such as Business Sponsor Disorder, Business Acceptance Disorder, Cultural/Political Disorder, Data Disorder, etc. As quality in healthcare mostly depends on the quality of data, we aimed to identify data disorders and their symptoms in two teaching hospitals. Methods: Using a self-constructed questionnaire, we asked 20 questions related to the quality and usability of patient data stored in patient records. The research population consisted of 150 managers, physicians, nurses, and medical record staff who were working at the time of the study. We also asked for their views on the symptoms of and treatments for any data disorders they mentioned in the questionnaire. We analyzed the answers using qualitative methods. Results: After classifying the answers, we found six main data disorders: incomplete data, missed data, late data, blurred data, manipulated data, and illegible data. The majority of participants believed they had important roles in the treatment of data disorders, while others attributed the disorders to health system problems. Discussion: As clinicians have important roles in producing data, they can easily identify the symptoms and disorders of patient data. Health information managers can also play important roles in the early detection of data disorders through proactive monitoring and periodic check-ups of data.

Keywords: data disorders, quality, healthcare, treatment

Procedia PDF Downloads 426

24532 Big Data and Analytics in Higher Education: An Assessment of Its Status, Relevance and Future in the Republic of the Philippines

Authors: Byron Joseph A. Hallar, Annjeannette Alain D. Galang, Maria Visitacion N. Gumabay

Abstract:

One of the unique challenges that the twenty-first century poses to Philippine higher education is the utilization of Big Data. The higher education system in the Philippines is generating burgeoning amounts of data, including data that can be used to produce the information and knowledge needed for accurate data-driven decision making. This study examines the status, relevance, and future of Big Data and Analytics in Philippine higher education. The insights gained from the study may be relevant to other developing nations situated similarly to the Philippines.

Keywords: big data, data analytics, higher education, Republic of the Philippines, assessment

Procedia PDF Downloads 338