Search results for: data pipeline
25006 A Study on Big Data Analytics, Applications, and Challenges
Authors: Chhavi Rana
Abstract:
The aim of the paper is to highlight existing developments in the field of big data analytics. Applications such as bioinformatics, smart infrastructure projects, healthcare, and business intelligence involve voluminous and incremental data that is hard to organise and analyse, and that can be handled using the frameworks and models in this field of study. An organisation's decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets, which will consequently benefit society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research. Keywords: big data, big data analytics, machine learning, review
Procedia PDF Downloads 95
25005 Improved K-Means Clustering Algorithm Using RHadoop with Combiner
Authors: Ji Eun Shin, Dong Hoon Lim
Abstract:
Data clustering is a common technique used in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known clustering algorithm aiming to cluster a set of data points to a predefined number of clusters. In this paper, we implement K-means algorithm based on MapReduce framework with RHadoop to make the clustering method applicable to large scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner as a function of our map output to decrease the amount of data needed to be processed by reducers. The experimental results demonstrated that K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with combiner was faster than regular algorithm without combiner as the size of data set increases.Keywords: big data, combiner, K-means clustering, RHadoop
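The combiner idea in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' RHadoop code: the function names (`map_phase`, `combine`, `reduce_phase`, `kmeans_step`) are hypothetical, and the MapReduce shuffle is simulated with an in-memory list.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_phase(split, centroids):
    """Mapper: emit (cluster_id, (point, 1)) for every point in one input split."""
    return [(nearest(p, centroids), (p, 1)) for p in split]


def combine(pairs):
    """Combiner: pre-aggregate one mapper's output into a partial sum per cluster."""
    acc = {}
    for k, (vec, n) in pairs:
        if k in acc:
            s, m = acc[k]
            acc[k] = ([a + b for a, b in zip(s, vec)], m + n)
        else:
            acc[k] = (list(vec), n)
    return list(acc.items())


def reduce_phase(pairs, centroids):
    """Reducer: merge partial sums and return the updated centroids."""
    merged = dict(combine(pairs))
    return [
        [x / merged[k][1] for x in merged[k][0]] if k in merged else list(c)
        for k, c in enumerate(centroids)
    ]


def kmeans_step(splits, centroids, use_combiner=True):
    """One K-means iteration; also reports how many records reach the reducer."""
    shuffled = []  # stands in for the MapReduce shuffle (assumption)
    for split in splits:
        out = map_phase(split, centroids)
        shuffled.extend(combine(out) if use_combiner else out)
    return reduce_phase(shuffled, centroids), len(shuffled)


if __name__ == "__main__":
    splits = [[(0, 0), (0, 2)], [(10, 10), (10, 12)]]
    start = [[0, 0], [10, 10]]
    print(kmeans_step(splits, start, use_combiner=True))
    print(kmeans_step(splits, start, use_combiner=False))
```

With the toy input above, both variants produce the same centroids, but the combiner version sends 2 records to the reducer instead of 4, which is the data reduction the abstract attributes to the combiner.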
Procedia PDF Downloads 438
25004 Framework for Integrating Big Data and Thick Data: Understanding Customers Better
Authors: Nikita Valluri, Vatcharaporn Esichaikul
Abstract:
With the popularity of data-driven decision making on the rise, this study focuses on providing an alternative outlook towards the process of decision-making. Combining quantitative and qualitative methods rooted in the social sciences, an integrated framework is presented with a focus on delivering a much more robust and efficient approach towards the concept of data-driven decision-making with respect to not only Big data but also 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector has been illustrated where the framework is put into action to yield insights and leverage business intelligence. An interpretive approach to analyze findings from both kinds of quantitative and qualitative data has been used to glean insights. Using traditional Point-of-sale data as well as an understanding of customer psychographics and preferences, techniques of data mining along with qualitative methods (such as grounded theory, ethnomethodology, etc.) are applied. This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is a modified enhancement in lieu of traditional data-driven decision-making approach, which is mainly dependent on quantitative data for decision-making.Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data
Procedia PDF Downloads 162
25003 Incremental Learning of Independent Topic Analysis
Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda
Abstract:
In this paper, we present a method of applying Independent Topic Analysis (ITA) to a growing collection of document data. The volume of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze document data. ITA extracts independent topics from document data by using Independent Component Analysis (ICA), a technique from signal processing. However, it is difficult to apply ITA to a growing collection of documents: because ITA must use all of the document data, its temporal and spatial cost is very high. Therefore, we present Incremental ITA, which extracts independent topics from a growing collection of document data. Incremental ITA updates the independent topics when document data is added, starting from the topics extracted from the data available just before the addition. We show the results of applying Incremental ITA to benchmark datasets. Keywords: text mining, topic extraction, independent, incremental, independent component analysis
Procedia PDF Downloads 309
25002 Open Data for e-Governance: Case Study of Bangladesh
Authors: Sami Kabir, Sadek Hossain Khoka
Abstract:
Open Government Data (OGD) refers to all data produced by government that are accessible in a reusable way, free of cost, to ordinary people with access to the Internet. In line with the “Digital Bangladesh” vision of the Bangladesh government, the concept of open data has been gaining momentum in the country. Opening all government data in a digital and customizable format from a single platform can enhance e-governance, which will make government more transparent to the people. This paper presents a case study, well in progress, of the OGD portal by the Bangladesh Government, which aims to link decentralized data. The initiative is intended to facilitate e-services for citizens through this one-stop web portal. The paper further discusses ways of collecting data in digital format from relevant agencies with a view to making it publicly available through this single point of access. Further, a possible layout of this web portal is presented. Keywords: e-governance, one-stop web portal, open government data, reusable data, web of data
Procedia PDF Downloads 355
25001 Local People’s Livelihoods and Coping Strategies in the Wake of a Co-management System in the Campo Ma'an National Park, Cameroon
Authors: Nchanji Yvonne Kiki, Mala William Armand, Nchanji Eileen Bogweh, Ramcilovik-Suominen Sabaheta, Kotilainen Juha
Abstract:
The Campo Ma'an National Park was created as part of an environmental and biodiversity compensation for the Chad-Cameroon Oil Pipeline Project, which was meant to help alleviate poverty and boost the livelihood of rural communities around the area. This paper examines different strategies and coping mechanisms employed by the indigenous people and local communities to deal with the national and internationally driven conservation policies and initiatives in the case of the Campo Ma'an National Park. While most literature on park management/co-management/nature conservation has focused on the negative implications for local peoples’ livelihoods, fewer studies have investigated the strategies of local people to respond to these policies and renegotiate their position in a way that enables them to continue their traditional livelihoods using the existing local knowledge systems. This study contributes to the current literature by zooming into not only the impacts of nature conservation policies but also the local individual and collective strategies and responses to such policies and initiatives. We employ a qualitative research approach using ethnomethodology and a convivial lens to analyze data collected from October to November 2018. We find that conservation policies have worsened some existing livelihoods on the one hand and constrained livelihood improvement of indigenous people and local communities (IPLC) on the other hand. Nonetheless, the IPLC has devised individual and collective coping mechanisms to deal with these conservation interventions and the negative effects they have caused. Upon exploring these mechanisms and their effectiveness, this study proposes a management approach to conservation centered on both people and nature, based on indigenous and local people's knowledge and practices, promoting nature for and by humans and strengthening both livelihood and conservation. 
We take inspiration from the convivial conservation approach and thinking by Bucher and Fletcher. Keywords: conservation policies, national park management, indigenous and local people’s experiences, livelihoods, local knowledge, coping strategies, conviviality
Procedia PDF Downloads 183
25000 Resource Framework Descriptors for Interestingness in Data
Authors: C. B. Abhilash, Kavi Mahesh
Abstract:
Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.Keywords: RDF, interestingness, knowledge base, semantic data
Procedia PDF Downloads 162
24999 Data Mining Practices: Practical Studies on the Telecommunication Companies in Jordan
Authors: Dina Ahmad Alkhodary
Abstract:
This study aimed to investigate the practices of Data Mining in the telecommunication companies in Jordan, from the viewpoint of the respondents. In order to achieve the goal of the study and test the validity of the hypotheses, the researcher designed a questionnaire to collect data from managers and staff members from the main departments in the researched companies. The results show the stages of improvement of the telecommunications companies towards Data Mining. Keywords: data, mining, development, business
Procedia PDF Downloads 498
24998 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain
Authors: Amal M. Alrayes
Abstract:
Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance. Given the importance of system and data quality to an organization, it is relevant to highlight their effect on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality that affects organizational success. Keywords: data quality, performance, system quality, Kingdom of Bahrain
Procedia PDF Downloads 493
24997 Cloud Computing in Data Mining: A Technical Survey
Authors: Ghaemi Reza, Abdollahi Hamid, Dashti Elham
Abstract:
Cloud computing poses a diversity of challenges for data mining operations, arising from the dynamic structure of data distribution as opposed to the typical database scenarios of conventional architectures. Due to the immense number of users seeking data on a daily basis, there are serious security concerns for cloud providers as well as for data providers who put their data in the cloud computing environment. Big data analytics uses compute-intensive data mining algorithms and tools (hidden Markov models, MapReduce parallel programming, the Mahout project, the Hadoop distributed file system, K-Means and K-Medoids, Apriori) that require efficient high-performance processors to produce timely results. Data mining algorithms are used to solve for or optimize the model parameters. The challenges that the operation has to meet are establishing successful transactions with the existing virtual machine environment and keeping the databases under control. Several factors have led to the shift from normal or centralized mining to distributed data mining. The approach is delivered as SaaS and uses multi-agent systems to implement the different tasks of the system. There are still some problems in data mining based on cloud computing, including the design and selection of data mining algorithms. Keywords: cloud computing, data mining, computing models, cloud services
Procedia PDF Downloads 479
24996 Cross-border Data Transfers to and from South Africa
Authors: Amy Gooden, Meshandren Naidoo
Abstract:
Genetic research and transfers of big data are not confined to a particular jurisdiction, but there is a lack of clarity regarding the legal requirements for importing and exporting such data. Using direct-to-consumer genetic testing (DTC-GT) as an example, this research assesses the status of data sharing into and out of South Africa (SA). While SA laws cover the sending of genetic data out of SA, prohibiting such transfer unless a legal ground exists, the position where genetic data comes into the country depends on the laws of the country from where it is sent – making the legal position less clear.Keywords: cross-border, data, genetic testing, law, regulation, research, sharing, South Africa
Procedia PDF Downloads 125
24995 Aspects Regarding the Structural Behaviour of Autonomous Underwater Vehicle for Emergency Response
Authors: Lucian Stefanita Grigore, Damian Gorgoteanu, Cristian Molder, Amado Stefan, Daniel Constantin
Abstract:
The purpose of this article is to present an analytical-numerical study on the structural behavior of a submerged autonomous underwater vehicle (AUV) for emergency intervention. The need for such a study was generated by the key objective of the ERL-Emergency project. The project aims to develop a system of collaborative robots for emergency response. The system consists of two robots: the first is an unmanned ground vehicle (UGV) on tracks, and the second is an AUV. The system of collaborative robots, AUV and UGV, will be used to perform missions of monitoring, intervention, and rescue. The main mission of the AUV is to dive into the maritime space of an industrial port to detect possible leaks in a pipeline transporting petroleum products. Another mission is to close and open the valves with which the pipes are provided. Finally, the AUV must be able to lift a manikin to the surface so that it can be taken to land. Numerical analysis was performed by the finite element method (FEM). The conditions for immersing the AUV at 100 m depth were simulated, and the calculations were repeated for different fluid flow rates. From a structural point of view, the stiffening areas and the enclosures in which the command-and-control elements and the accumulators are located were analyzed in particular. The conclusion of this research is that the AUV meets the established requirements very well. Keywords: analytical-numerical, emergency, FEM, robotics, underwater
Procedia PDF Downloads 150
24994 The Study of Security Techniques on Information System for Decision Making
Authors: Tejinder Singh
Abstract:
An information system (IS) is the flow of data from different levels in different directions for decision making and data operations. Data can be violated in different ways, such as manual or technical errors, data tampering, or loss of integrity. The security system of an IS, called a firewall, is affected by such violations. The flow of data among the various levels of an information system is carried by a networking system. Data flows over the network in the form of packets or frames. To protect these packets from unauthorized access and virus attacks, and to maintain the integrity level, network security is an important factor. To protect the data from being pirated, various security techniques are used. This paper presents the various security techniques and identifies different harmful attacks with the help of detailed data analysis. This paper will be beneficial for organizations in making the system more secure and effective, and in future decision making. Keywords: information systems, data integrity, TCP/IP network, vulnerability, decision, data
Procedia PDF Downloads 307
24993 Data Integration with Geographic Information System Tools for Rural Environmental Monitoring
Authors: Tamas Jancso, Andrea Podor, Eva Nagyne Hajnal, Peter Udvardy, Gabor Nagy, Attila Varga, Meng Qingyan
Abstract:
The paper deals with the conditions and circumstances of integrating remotely sensed data for rural environmental monitoring purposes. The main task is to make decisions during the integration process when we have data sources with different resolution, location, spectral channels, and dimension. In order to have exact knowledge about the integration and data fusion possibilities, it is necessary to know the properties (metadata) that characterize the data. The paper explains the joining of these data sources using their attribute data through a sample project. The resulting product will be used for rural environmental analysis. Keywords: remote sensing, GIS, metadata, integration, environmental analysis
Procedia PDF Downloads 120
24992 Analysis of Genomics Big Data in Cloud Computing Using Fuzzy Logic
Authors: Mohammad Vahed, Ana Sadeghitohidi, Majid Vahed, Hiroki Takahashi
Abstract:
In the genomics field, huge amounts of data have been produced by next-generation sequencers (NGS). Data volumes are growing very rapidly, as it is postulated that more than one billion bases will be produced per year in 2020. The growth rate of produced data is much faster than Moore's law in computer technology. This makes it more difficult to deal with genomics data, for example storing data, searching for information, and finding hidden information. An analysis platform for genomics big data therefore needs to be developed. Recently developed cloud computing enables us to deal with big data more efficiently. Hadoop is one of the frameworks for distributed computing and underlies Big Data as a Service (BDaaS). Although many services have adopted this technology, e.g. Amazon, there are few applications in the biology field. Here, we propose a new algorithm to deal more efficiently with genomics big data, e.g. sequencing data. Our algorithm consists of two parts. First, BDaaS is applied to handle the data more efficiently. Second, a hybrid method of MapReduce and fuzzy logic is applied for data processing. This step can be parallelized in implementation. Our algorithm has great potential in the computational analysis of genomics big data, e.g. de novo genome assembly and sequence similarity search. We will discuss our algorithm and its feasibility. Keywords: big data, fuzzy logic, MapReduce, Hadoop, cloud computing
Procedia PDF Downloads 299
24991 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data
Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin
Abstract:
Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.Keywords: big data, machine learning, ontology model, urban data model
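The two steps this abstract describes (pairwise correlation, then a weighted data network used for recommendation) can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the use of Pearson correlation, the `threshold` cut-off, the function names, and the toy series are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_network(series, threshold=0.5):
    """Weighted edges {(a, b): r} for dataset pairs with |r| >= threshold."""
    edges = {}
    for a, b in combinations(sorted(series), 2):
        r = pearson(series[a], series[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


def recommend(edges, name):
    """Datasets linked to `name`, strongest correlation first."""
    linked = [
        (b if a == name else a, r)
        for (a, b), r in edges.items()
        if name in (a, b)
    ]
    return sorted(linked, key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    # Hypothetical urban series, invented for this example.
    urban = {
        "traffic": [1, 2, 3, 4],
        "noise": [2, 4, 6, 8],
        "parks": [4, 3, 2, 1],
        "unrelated": [1, -1, -1, 1],
    }
    network = correlation_network(urban)
    print(recommend(network, "traffic"))
```

The edge weights play the role of the "weights of urban data" mentioned in the abstract: a user looking at one dataset is pointed to the datasets most strongly correlated with it.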
Procedia PDF Downloads 418
24990 Women In Orthopedic Surgery, A Scoping Review
Authors: Katherine van Kampen, Reva Qiu, Patricia Farrugia
Abstract:
Orthopedic surgery has fallen behind when it comes to gender diversity despite medical school classes reaching gender parity. Studies have shown that orthopedic surgery would require 117 years to reach gender parity with the trainee population, the longest time than any other specialty, including neurosurgery, urology, and otolaryngology. The barriers that face women in orthopedic surgery have been well researched, with contributing factors being on-going stereotypes of the field, lack of women mentors, and gender roles outside of the hospital. Furthermore, women in orthopedic surgery face barriers to achieve promotion, publications, and leadership roles leading to a “leaky pipeline,” resulting in less and less women in key academic roles in the field. It is a complex topic with barriers and challenges faced in medical school, residency, and throughout employment. Our scoping review seeks to understand these challenges across a temporal timeline and to further characterize such barriers and the driving factors behind them. To this date, authors did not find a scoping review that seeks to look broadly at factors impacting the decreased amount of women entering orthopedics and the factors that cause women to hit a “glass ceiling”, the idea that women will not achieve the same success as men despite the same qualifications, upon entering the field. This scoping review is the first of its kind to attempt to summarize the large body of research focusing on women in orthopedic surgery from the preconceptions in medical school impacting their desire to pursue orthopedics all the way to employment, including challenges to academic success and financial success. Literature databases will be searched with the following key terms: women, gender inequity, workforce, orthopedics, and citations will be hand searched and collected. Articles included will discuss gender inequality within orthopedics with non-english, patient related articles excluded. 
Full-text review will seek to characterize the specific barriers faced by women across medical school, residency, and employment. Themes that are expected to be highlighted are workforce data, women in orthopedic leadership, medical student perspectives on the specialty, and gender bias and discrimination in the field. Keywords: orthopedics, gender equity, workforce, women in surgery
Procedia PDF Downloads 91
24989 Data-driven Decision-Making in Digital Entrepreneurship
Authors: Abeba Nigussie Turi, Xiangming Samuel Li
Abstract:
Data-driven business models are more typical for established businesses than early-stage startups that strive to penetrate a market. This paper provided an extensive discussion on the principles of data analytics for early-stage digital entrepreneurial businesses. Here, we developed data-driven decision-making (DDDM) framework that applies to startups prone to multifaceted barriers in the form of poor data access, technical and financial constraints, to state some. The startup DDDM framework proposed in this paper is novel in its form encompassing startup data analytics enablers and metrics aligning with startups' business models ranging from customer-centric product development to servitization which is the future of modern digital entrepreneurship.Keywords: startup data analytics, data-driven decision-making, data acquisition, data generation, digital entrepreneurship
Procedia PDF Downloads 328
24988 Cryptographic Protocol for Secure Cloud Storage
Authors: Luvisa Kusuma, Panji Yudha Prakasa
Abstract:
Cloud storage, as a subservice of infrastructure as a service (IaaS) in cloud computing, is a model of networked storage in which data is stored on a server. In this paper, we propose a secure cloud storage system consisting of two main components: the client, a user who uses the cloud storage service, and the server, which provides the cloud storage service. In this system, we propose protocol schemes to guard against security attacks during data transmission. The protocols are a login protocol, an upload data protocol, a download protocol, and a push data protocol, which implement a hybrid cryptographic mechanism based on encrypting data before it is sent to the cloud, so that the cloud storage provider does not know the user's data and cannot analyze it, because there is no correspondence between data and user. Keywords: cloud storage, security, cryptographic protocol, artificial intelligence
Procedia PDF Downloads 357
24987 Decentralized Data Marketplace Framework Using Blockchain-Based Smart Contract
Authors: Meshari Aljohani, Stephan Olariu, Ravi Mukkamala
Abstract:
Data is essential for enhancing the quality of life. Its value creates chances for users to profit from data sales and purchases. Users in data marketplaces, however, must share and trade data in a secure and trusted environment while maintaining their privacy. The first main contribution of this paper is to identify enabling technologies and challenges facing the development of decentralized data marketplaces. The second main contribution is to propose a decentralized data marketplace framework based on blockchain technology. The proposed framework enables sellers and buyers to transact with more confidence. Using a security deposit, the system implements a unique approach for enforcing honesty in data exchange among anonymous individuals. Before the transaction is considered complete, the system has a time frame. As a result, users can submit disputes to the arbitrators which will review them and respond with their decision. Use cases are presented to demonstrate how these technologies help data marketplaces handle issues and challenges.Keywords: blockchain, data, data marketplace, smart contract, reputation system
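The deposit, time frame, and arbitration flow outlined in this abstract can be pictured as a small state machine. The sketch below is plain Python rather than a smart contract, and every detail is a hypothetical assumption: the class and method names, the rule that both parties lock a deposit, and the payout rule that the losing party forfeits its deposit are illustrative choices, not the framework's actual design.

```python
class EscrowTrade:
    """Toy escrow for one data sale; amounts are in arbitrary token units."""

    def __init__(self, price, deposit, dispute_window):
        self.price = price
        self.deposit = deposit  # assumed to be locked by BOTH parties
        self.dispute_window = dispute_window
        self.state = "FUNDED"
        self.deadline = None

    def deliver(self, now):
        """Seller hands over the data; the dispute time frame starts."""
        if self.state != "FUNDED":
            raise ValueError("data already delivered")
        self.state = "DELIVERED"
        self.deadline = now + self.dispute_window

    def dispute(self, now):
        """Buyer raises a dispute before the time frame expires."""
        if self.state != "DELIVERED" or now > self.deadline:
            raise ValueError("dispute window is closed")
        self.state = "DISPUTED"

    def finalize(self, now):
        """No dispute was raised in time: pay the seller, refund deposits."""
        if self.state != "DELIVERED" or now <= self.deadline:
            raise ValueError("cannot finalize yet")
        self.state = "CLOSED"
        return {"seller": self.price + self.deposit, "buyer": self.deposit}

    def arbitrate(self, seller_was_honest):
        """Arbitrators decide; the losing side forfeits its deposit (assumption)."""
        if self.state != "DISPUTED":
            raise ValueError("no open dispute")
        self.state = "CLOSED"
        pot = self.price + 2 * self.deposit
        if seller_was_honest:
            return {"seller": pot, "buyer": 0}
        return {"seller": 0, "buyer": pot}


if __name__ == "__main__":
    trade = EscrowTrade(price=100, deposit=20, dispute_window=10)
    trade.deliver(now=0)
    print(trade.finalize(now=11))
```

In both outcomes the payouts add up to the total locked amount (price plus two deposits), so the sketch never creates or destroys funds.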
Procedia PDF Downloads 158
24986 Control of Pipeline Gas Quality to Extend Gas Turbine Life
Authors: Peter J. H. Carnell, Panayiotis Theophanous
Abstract:
Natural gas due to its cleaner combustion characteristics is expected to be the most widely used fuel in the move towards less polluting and renewable energy sources. Thus, the developed world is supplied by a complex network of gas pipelines and natural gas is becoming a major source of fuel. Natural gas delivered directly from the well will differ in composition from gas derived from LNG or produced by anaerobic digestion processes. Each will also have specific contaminants and properties although gas from all sources is likely to enter the distribution system and be blended to provide the desired characteristics such as Higher Heating Value and Wobbe No. The absence of a standard gas composition poses problems when the gas is used as a chemical feedstock, in specialised furnaces or on gas turbines. The chemical industry has suffered in the past as a result of variable gas composition. Transition metal catalysts used in ammonia, methanol and hydrogen plants were easily poisoned by sulphur, chlorides and mercury reducing both activity and catalyst expected lives from years to months. These plants now concentrate on purification and conditioning of the natural gas feed using fixed bed technologies, allowing them to run for several years and having transformed their operations. Similar technologies can be applied to the power industry reducing maintenance requirements and extending the operating life of gas turbines.Keywords: gas composition, gas conditioning, gas turbines, power generation, purification
Procedia PDF Downloads 286
24985 FLIME - Fast Low Light Image Enhancement for Real-Time Video
Authors: Vinay P., Srinivas K. S.
Abstract:
Low Light Image Enhancement is of utmost importance in computer vision based tasks. Applications include vision systems for autonomous driving, night vision devices for defence systems, and low light object detection tasks. Many of the existing deep learning methods are resource intensive during the inference step and take considerable time for processing. The algorithm should take considerably less than 41 milliseconds per frame in order to process a real-time video feed at 24 frames per second, and even less for video at 30 or 60 frames per second. The paper presents a fast and efficient solution which has two main advantages: it has the potential to be used for a real-time video feed, and it can be used in low compute environments because of its lightweight nature. The proposed solution is a pipeline of three steps: the first is the use of a simple function to map input RGB values to output RGB values, the second is to balance the colors, and the final step is to adjust the contrast of the image. Hence a custom dataset is carefully prepared using images taken in low and bright lighting conditions. The preparation of the dataset, the proposed model, and the processing time are discussed in detail, and the quality of the enhanced images using different methods is shown. Keywords: low light image enhancement, real-time video, computer vision, machine learning
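The timing requirement and the three-step pipeline in this abstract can be made concrete with a small Python sketch. The per-frame budget follows directly from the frame rate; the three stage functions are stand-ins chosen for illustration (a gamma curve, gray-world balancing, and a linear contrast stretch) and are assumptions, not the functions used in FLIME.

```python
def frame_budget_ms(fps):
    """Maximum processing time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def tone_map(pixels, gamma=0.5):
    """Step 1 (assumed form): map input RGB to output RGB with a gamma curve."""
    return [tuple(255.0 * (c / 255.0) ** gamma for c in px) for px in pixels]


def balance_colors(pixels):
    """Step 2 (assumed form): gray-world balance, equalising the channel means."""
    n = len(pixels)
    means = [sum(px[i] for px in pixels) / n for i in range(3)]
    gray = sum(means) / 3.0
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [tuple(min(255.0, c * g) for c, g in zip(px, gains)) for px in pixels]


def stretch_contrast(pixels):
    """Step 3 (assumed form): linearly stretch values to the full 0..255 range."""
    lo = min(min(px) for px in pixels)
    hi = max(max(px) for px in pixels)
    if hi == lo:
        return list(pixels)
    scale = 255.0 / (hi - lo)
    return [tuple((c - lo) * scale for c in px) for px in pixels]


def enhance(pixels):
    """Run the three stages in the order given in the abstract."""
    return stretch_contrast(balance_colors(tone_map(pixels)))


if __name__ == "__main__":
    for fps in (24, 30, 60):
        print(fps, "fps ->", round(frame_budget_ms(fps), 2), "ms per frame")
    print(enhance([(10, 20, 30), (40, 50, 60)]))
```

At 24 frames per second the budget is 1000 / 24, about 41.67 ms, which matches the 41 ms figure in the abstract; at 30 and 60 frames per second it shrinks to about 33.33 ms and 16.67 ms.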
Procedia PDF Downloads 205
24984 Production Sharing Contracts Transparency Simulation
Authors: Chariton Christou, David Cornwell
Abstract:
The Production Sharing Contract (PSC) is a type of contract that is widely used today. The financial crisis made governments tightfisted, and they do not have the resources to participate in the development of a field. Therefore, more and more countries are introducing the PSC. The companies have the power and the money to develop the field in their own way. The main problem is the transparency of oil and gas companies, especially under the PSC, and how this can be achieved. Many discussions have taken place, especially in the U.K. What we are suggesting is a dynamic financial simulation with the help of a flow meter. The flow meter, installed in a pipeline, will measure the production of each field every day. The production will be the basic input of the simulation, which will calculate the profit, the costs, and more according to the information from the flow meter. In addition, it will include the terms of the contract and the costs that have been paid. From all these parameters, the simulation will be able to present the information of a field (taxes, employees, R-factor) in real time. Through this simulation, the company will share some information with the government, but not all of it. The government will know the taxes that should be paid and what its sharing percentage is. All of the other information could remain confidential to the company. Furthermore, the oil company could control the R-factor by changing the production each day to maximize its sharing percentage and, as a result, its profit. This idea aims to change the way that governments 'control' oil companies and to bring a transparency evolution to the industry. With the help of a simulation, every country could stand alongside the company and collaborate better. Keywords: production sharing contracts, transparency, simulation
Procedia PDF Downloads 375
24983 Field Deployment of Corrosion Inhibitor Developed for Sour Oil and Gas Carbon Steel Pipelines
Authors: Jeremy Moloney
Abstract:
A major oil and gas operator in western Canada producing approximately 50,000 BOE per day of sour fluids was experiencing increased water production along with decreased oil production over several years. The higher water volumes being produced meant an increase in the operator’s incumbent corrosion inhibitor (CI) chemical requirements but with reduced oil production revenues. Thus, a cost-effective corrosion inhibitor solution was sought to deliver enhanced corrosion mitigation of the carbon steel pipeline infrastructure but at reduced chemical injection dose rates. This paper presents the laboratory work conducted on the development of a corrosion inhibitor under the operator’s simulated sour operating conditions and the subsequent field testing of the product. The new CI not only provided extremely good levels of general and localized corrosion inhibition and outperformed the incumbent CI under the laboratory test conditions, but did so at vastly lower concentrations. In turn, the novel CI product allowed field chemical injection rates to be optimized and reduced by 40% compared with the incumbent whilst maintaining superior corrosion protection, resulting in significant cost savings and associated sustainability benefits for the operator.
Keywords: carbon steel, sour gas, hydrogen sulphide, localized corrosion, pitting, corrosion inhibitor
Procedia PDF Downloads 85
24982 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems
Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan
Abstract:
Parallel hybrid storage systems consist of a hierarchy of storage devices that differ in data read speed. Read speed increases as we ascend the hierarchy. Thus, migrating the application's important data that will be accessed in the near future to the uppermost level reduces the application's I/O waiting time and hence its elapsed execution time. In this research, we implement a trace-driven, two-level parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify the application's data in order to determine its near-future data accesses in parallel with its on-demand requests. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach, integrated with data mining techniques, reduces the application's elapsed execution time across a variety of traces, with a reported figure of 22%.
Keywords: hybrid storage system, data mining, recurrent neural network, support vector machine
Procedia PDF Downloads 308
24981 Discussion on Big Data and One of Its Early Training Application
Authors: Fulya Gokalp Yavuz, Mark Daniel Ward
Abstract:
This study focuses on a contemporary and unavoidable topic in Data Science and an exemplary application of it for early career building: Big Data and the Living Learning Community (LLC). Academia and industry agree on the importance of Big Data. However, both are at risk of neglecting training in this interdisciplinary area. Some traditional teaching doctrines are far from effective for Data Science. Practitioners need intuition and real-life examples of how to apply new methods to terabyte-scale data. We explain the scope of Data Science training and illustrate its early-stage application with the LLC, a National Science Foundation (NSF) funded project under the supervision of Prof. Ward since 2014. Essentially, we aim to give professors, researchers and practitioners some intuition for combining data science tools in comprehensive real-life examples, guided by mentees’ feedback. By discussing mentoring methods and the computational challenges of Big Data, we intend to underline its potential.
Keywords: Big Data, computation, mentoring, training
Procedia PDF Downloads 362
24980 Investigation of Optimal Parameter Settings in Super Duplex Stainless Steel Welding
Authors: R. M. Chandima Ratnayake, Daniel Dyakov
Abstract:
Super steel materials play a vital role in the construction and fabrication of structural, piping and pipeline components. They help minimize life cycle costs while assuring the integrity of onshore and offshore operating systems. In this context, welding of duplex stainless steel (DSS) in construction and fabrication plays a significant role in maintaining and assuring integrity at optimal expenditure over the life cycle of production and process systems as well as associated structures. In DSS welding, factors such as gap geometry, shielding gas supply rate, welding current, and the type of welding process play a vital role in the final joint performance. Hence, an experimental investigation has been performed using the engineering robust design approach (ERDA) to find the settings that yield optimal super DSS (i.e. UNS S32750) joint performance. This manuscript presents the mathematical approach and experimental design, the optimal parameter settings, and the results of the verification experiment.
Keywords: duplex stainless steel welding, engineering robust design, mathematical framework, optimal parameter settings
Procedia PDF Downloads 415
24979 Towards a Secure Storage in Cloud Computing
Authors: Mohamed Elkholy, Ahmed Elfatatry
Abstract:
Cloud computing has emerged as a flexible computing paradigm that has reshaped the Information Technology map. However, cloud computing has brought about a number of security challenges as a result of the physical distribution of computational resources and the limited control that users have over the physical storage. This situation raises many security challenges for data integrity and confidentiality as well as authentication and access control. This work proposes a security mechanism for data integrity that allows data owners to be aware of any modification made to their data. The data integrity mechanism is integrated with an extended Kerberos authentication that ensures authorized access control. The proposed mechanism protects data confidentiality even if data are stored on untrusted storage. The proposed mechanism has been evaluated against different types of attacks and proved effective in protecting cloud data storage from malicious attacks.
Keywords: access control, data integrity, data confidentiality, Kerberos authentication, cloud security
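The abstract does not describe how its integrity mechanism works. As a generic illustration of how a data owner can detect modification of data held on untrusted storage, the sketch below uses an HMAC-SHA256 tag from Python's standard library. It is a swapped-in, well-known technique and not the paper's mechanism; the key handling and the sample payload are assumptions.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that the data owner keeps locally."""
    return hmac.new(key, data, hashlib.sha256).digest()


def is_unmodified(key: bytes, data: bytes, tag: bytes) -> bool:
    """Return True if data read back from storage still matches the owner's tag."""
    # compare_digest performs a constant-time comparison to avoid timing leaks.
    return hmac.compare_digest(make_tag(key, data), tag)


owner_key = secrets.token_bytes(32)   # secret known only to the data owner
original = b"quarterly-report-v1"     # hypothetical payload sent to the cloud
tag = make_tag(owner_key, original)

print(is_unmodified(owner_key, original, tag))                         # True
print(is_unmodified(owner_key, b"quarterly-report-v1-tampered", tag))  # False
```

Because the key never leaves the owner, the storage provider cannot produce a valid tag for altered data.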
Procedia PDF Downloads 335
24978 Predicting Susceptibility to Coronary Artery Disease using Single Nucleotide Polymorphisms with a Large-Scale Data Extraction from PubMed and Validation in an Asian Population Subset
Authors: K. H. Reeta, Bhavana Prasher, Mitali Mukerji, Dhwani Dholakia, Sangeeta Khanna, Archana Vats, Shivam Pandey, Sandeep Seth, Subir Kumar Maulik
Abstract:
Introduction: Research has demonstrated a connection between coronary artery disease (CAD) and genetics. We performed deep literature mining, using both bioinformatics and manual effort, to identify polymorphisms associated with susceptibility to coronary artery disease. Further, the study sought to validate these findings in an Asian population. Methodology: In the first phase, we used an automated pipeline that organizes and presents structured information on SNPs, populations and diseases. The information was obtained by applying Natural Language Processing (NLP) techniques to approximately 28 million PubMed abstracts. To accomplish this, we used Python scripts to extract and curate disease-related data, filter out false positives, and categorize them into 24 hierarchical groups using Named Entity Recognition (NER) algorithms. From this search, a total of 466 unique PubMed Identifiers (PMIDs) and 694 Single Nucleotide Polymorphisms (SNPs) related to CAD were identified. To refine the selection, all the studies were examined manually. Specifically, SNPs that demonstrated susceptibility to CAD and exhibited a positive Odds Ratio (OR) were selected, and a final pool of 324 SNPs was compiled. The next phase involved validating the identified SNPs in DNA samples of 96 CAD patients and 37 healthy controls from the Indian population using the Global Screening Array. Results: Of the 324 SNPs, only 108 were expressed; further, 4 SNPs showed a significant difference in minor allele frequency between cases and controls. These were rs187238 of the IL-18 gene, rs731236 of the VDR gene, rs11556218 of the IL16 gene and rs5882 of the CETP gene. Prior research has reported associations of these SNPs with various pathways, such as endothelial damage, susceptibility of vitamin D receptor (VDR) polymorphisms, and reduction of HDL-cholesterol levels, ultimately leading to the development of CAD.
Among these, only rs731236 had previously been studied in the Indian population, and then only in the context of diabetes and vitamin D deficiency. This is the first report of these SNPs being associated with CAD in the Indian population. Conclusion: This pool of 324 SNPs is a unique resource that can help uncover risk associations in CAD. Here, we validated it in the Indian population. Further validation in different populations may offer valuable insights, contribute to the development of a screening tool, and help enable primary prevention strategies targeted at the vulnerable population.
Keywords: coronary artery disease, single nucleotide polymorphism, susceptible SNP, bioinformatics
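The abstract mentions Python scripts that pull SNP information out of PubMed abstracts but gives no implementation details. As a much simpler stand-in for that step, the sketch below extracts dbSNP-style identifiers (`rs` followed by digits) from free text with a regular expression. The function name `extract_rsids` is hypothetical, and the authors' pipeline, which uses NER, is certainly more involved.

```python
import re
from typing import List

# dbSNP reference identifiers have the form "rs" followed by digits, e.g. rs731236.
RSID_PATTERN = re.compile(r"\brs\d+\b")


def extract_rsids(text: str) -> List[str]:
    """Return the unique rsIDs found in text, in order of first appearance."""
    seen = {}
    for match in RSID_PATTERN.findall(text):
        seen.setdefault(match, None)  # dict keys preserve insertion order
    return list(seen)


sentence = (
    "These were rs187238 of the IL-18 gene, rs731236 of the VDR gene, "
    "rs11556218 of the IL16 gene and rs5882 of the CETP gene."
)
print(extract_rsids(sentence))
# ['rs187238', 'rs731236', 'rs11556218', 'rs5882']
```

A pattern like this only finds identifiers; linking each SNP to a disease and a population still needs the kind of curation and false-positive filtering the abstract describes.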
Procedia PDF Downloads 76
24977 Ontological Modeling Approach for Statistical Databases Publication in Linked Open Data
Authors: Bourama Mane, Ibrahima Fall, Mamadou Samba Camara, Alassane Bah
Abstract:
National Statistical Institutes hold a large volume of data, generally in a format that dictates how the information it contains is published. Each household or business data collection project includes its own dissemination platform. These existing dissemination methods therefore do not promote rapid access to information and, above all, do not offer the option of linking data for in-depth processing. In this paper, we present an approach to modeling these data in order to publish them in a format intended for the Semantic Web. Our objective is to publish all these data on a single platform and offer the option of linking them with other external data sources. The approach will be applied to data from major national surveys, such as those on employment, poverty and child labor, and the general census of the population of Senegal.
Keywords: Semantic Web, linked open data, database, statistic
Procedia PDF Downloads 175