Search results for: data streams
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24736

24526 Estimation of Delay Due to Loading–Unloading of Passengers by Buses and Reduction of Number of Lanes at Selected Intersections in Dhaka City

Authors: Sumit Roy, A. Uddin

Abstract:

One of the significant causes of increased delay at intersections under heterogeneous traffic conditions is a sudden reduction in road capacity. In this study, the delay caused by such a sudden capacity reduction is estimated. Two intersections in Dhaka city were included in the study: the Kakrail intersection and the SAARC Foara intersection. At the Kakrail intersection, a sudden reduction of road capacity is seen at three downstream legs of the intersection, caused by buses slowing down or stopping to load and unload passengers. At the SAARC Foara intersection, a sudden reduction of capacity was seen at two downstream legs. At one leg, it was due to loading and unloading of buses, and at the other leg, it was due to both loading and unloading of buses and a reduction in the number of lanes. With these considerations, the delay due to intentional stoppage or slowing down of buses and reduction of the number of lanes at these two intersections is estimated. The delay was calculated by two approaches. The first approach came from the concept of shock waves in traffic streams: the delay was calculated by determining the flow, density, and speed before and after the sudden capacity reduction. The second approach came from the deterministic analysis of queues: the delay was calculated by determining the volume, capacity, and reduced capacity of the road. After determining the delay from these two approaches, the results were compared. For this study, video of each of the two intersections was recorded for one hour at the evening peak. Necessary geometric data were also taken to determine parameters such as speed, flow, and density. The delay was calculated for one hour with one-hour data at both intersections.
In the case of the Kakrail intersection, the per-hour delays for the Kakrail circle leg were 5.79 and 7.15 minutes, for the Shantinagar cross intersection leg they were 13.02 and 15.65 minutes, and for the Paltan T intersection leg they were 3 and 1.3 minutes, for the first and second approaches respectively. In the case of the SAARC Foara intersection, the delay at the Shahbag leg was only due to intentional stopping or slowing down of buses, and was 3.2 and 3 minutes for the two approaches respectively. For the Karwan Bazar leg, the delays due to buses were 5 and 7.5 minutes for the two approaches respectively, and the delays due to reduction of the number of lanes were 2 and 1.78 minutes respectively. Measuring the delay per hour for the Kakrail leg at Kakrail circle, it is seen that, under the first approach of delay estimation, the intentional stoppage and lowering of speed by buses contribute 26.24% of the total delay at Kakrail circle. If loading and unloading of buses near intersections is forbidden, and facilities for loading and unloading of passengers are established far enough from the intersections, then the delay at intersections can be reduced significantly and the performance of the intersections can be enhanced.
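The two approaches named in this abstract can be illustrated with a minimal Python sketch. It uses the textbook shock-wave speed relation and the standard deterministic queueing triangle for a temporary capacity reduction; it is not the authors' calculation. The function names and all input values are hypothetical and are not taken from the study.

```python
def shock_wave_speed(q1, k1, q2, k2):
    """Speed (km/h) of the shock wave between two traffic states.

    q1, q2: flows (veh/h) upstream and downstream of the wave.
    k1, k2: densities (veh/km) of the same two states.
    A negative value means the wave travels upstream.
    """
    return (q2 - q1) / (k2 - k1)


def deterministic_queue_delay(volume, capacity, reduced_capacity, blockage_hours):
    """Delay caused by a temporary capacity reduction (deterministic queueing).

    volume: arrival flow (veh/h), assumed constant.
    capacity: normal capacity (veh/h).
    reduced_capacity: capacity while the blockage lasts (veh/h).
    blockage_hours: duration of the blockage (h).
    """
    if not reduced_capacity < volume < capacity:
        raise ValueError("a queue forms and clears only if reduced_capacity < volume < capacity")
    # Longest queue, reached at the moment the blockage ends.
    max_queue = (volume - reduced_capacity) * blockage_hours
    # Time from the start of the blockage until the queue has fully cleared.
    queue_duration = blockage_hours * (capacity - reduced_capacity) / (capacity - volume)
    # Area of the triangle between cumulative arrivals and departures.
    total_delay = 0.5 * max_queue * queue_duration
    vehicles_queued = volume * queue_duration
    return {
        "max_queue_veh": max_queue,
        "queue_duration_h": queue_duration,
        "total_delay_veh_h": total_delay,
        "avg_delay_min": 60.0 * total_delay / vehicles_queued,
    }


if __name__ == "__main__":
    # Illustrative inputs only (not measurements from the Dhaka intersections).
    print(shock_wave_speed(q1=1200, k1=20, q2=900, k2=90))
    print(deterministic_queue_delay(volume=1200, capacity=1800,
                                    reduced_capacity=900, blockage_hours=0.25))
```

The queueing function assumes a single blockage with constant arrival flow; repeated bus stops within the hour would need the calculation applied per stop.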

Keywords: delay, deterministic queue analysis, shock wave, passenger loading-unloading

Procedia PDF Downloads 165
24525 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically relied on rigorous empirical studies by research institutes, as well as less reliable immediate surveys and polls, often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real time. The question is not why people did something over the last 10 years, but why they are doing it now, whether it is undesirable, and how we can have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released into the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 356
24524 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed as "big data." The technique of big data mining and big data analysis is extremely helpful for business movements such as making decisions, building organisational plans, researching the market efficiently, improving sales, etc., because typical management tools cannot handle such complicated datasets. Special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations, are brought on by big data. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining, its process, along with problems and difficulties, with a focus on the unique characteristics of big data. Organizations have several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is really analyzed. The idea of the mining and analysis of data and knowledge discovery techniques that have recently been created with practical application systems is presented in this study. This article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the management's main big data and data mining challenges.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 95
24523 Channel Characteristics and Morphometry of a Part of Umtrew River, Meghalaya

Authors: Pratyashi Phukan, Ranjan Saikia

Abstract:

Morphometry incorporates the quantitative study of the area, altitude, volume, and slope profiles of a land surface and the drainage basin characteristics of the area concerned. Fluvial geomorphology includes the consideration of linear, areal and relief aspects of a fluvially originated drainage basin. The linear aspect deals with the hierarchical orders of streams, the numbers and lengths of stream segments, and the various relationships among them. The areal aspect includes the analysis of basin perimeters, basin shape, basin area, and related morphometric laws. The relief aspect incorporates, besides hypsometric, climographic and altimetric analysis, the study of absolute and relative reliefs, relief ratios, average slope, etc. In this paper we have analysed the relationship among stream velocity, channel shape, sediment load, channel width, channel depth, etc.
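As an illustration of the linear and areal aspects described above, the following Python sketch computes two common morphometric indices: the bifurcation ratio between successive stream orders and the drainage density. The abstract does not say which indices the authors used, so the choice of indices, the function names, and the sample numbers are all assumptions made for illustration.

```python
def bifurcation_ratios(stream_counts):
    """Ratio of the number of streams of each order to the next higher order.

    stream_counts[i] is the number of stream segments of order i + 1.
    """
    return [
        stream_counts[i] / stream_counts[i + 1]
        for i in range(len(stream_counts) - 1)
    ]


def drainage_density(total_stream_length_km, basin_area_km2):
    """Total stream length per unit basin area (km per km^2)."""
    return total_stream_length_km / basin_area_km2


if __name__ == "__main__":
    # Hypothetical basin: 48 first-order, 12 second-order, 3 third-order, 1 fourth-order stream.
    print(bifurcation_ratios([48, 12, 3, 1]))
    print(drainage_density(total_stream_length_km=150.0, basin_area_km2=60.0))
```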

Keywords: morphometry, hydraulic geometry, Umtrew river, Meghalaya

Procedia PDF Downloads 443
24522 In the Study of CO₂ Capacity Performance of Different Frothing Agents through Process Simulation

Authors: Muhammad Idrees, Masroor Abro, Sikandar Almani

Abstract:

The increasing CO₂ concentration in the atmosphere is regarded as one of the major challenges faced by the modern world. The average CO₂ concentration in the atmosphere reached its highest value of 414.72 ppm in 2021, as reported at the Conference of the Parties (COP26). This study focuses on (i) a comparative study of MEA, NaOH, acetic acid, and Na₂CO₃ in terms of their CO₂ capture performance, (ii) the significance of adding various frothing agents to improve the absorption capacity of Na₂CO₃, and (iii) the overall economic evaluation of the process with the help of Aspen Plus. The results suggest that the addition of frothing agents significantly increased the absorption rate of dilute sodium carbonate, from 45% to 99.9%. The effect of temperature, pressure and flow rate of the liquid and flue gas streams on CO₂ absorption capacity was also investigated. It was found that the absorption capacity of Na₂CO₃ decreased with increasing temperature of the liquid stream, decreasing flow rate of the liquid stream, and decreasing pressure of the gas stream.

Keywords: CO₂, absorbents, frothing agents, process simulation

Procedia PDF Downloads 64
24521 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated vast potential in streamlining, decision-making, and spotting business trends in different fields, for example, manufacturing, finance, and Information Technology. This paper gives a multi-disciplinary overview of the research issues in big data and its techniques, tools, and frameworks related to privacy, data storage management, network and energy utilization, fault tolerance, and data representation. In addition, the resulting challenges and opportunities available in this Big Data platform are presented.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 291
24520 Diffusion of Social Innovation in Thai Community Enterprises

Authors: Thanisa Sirithaporn

Abstract:

The study aims to examine the diffusion of social innovation among Thai Community Enterprises in conjunction with a singular case study of a medium-sized corporation that has successfully transitioned from a charitable foundation to a sustainable, profitable entity creating value for both shareholders and the communities in which it operates. It seeks to bridge the gap between different streams of aligned research in the fields of diffusion, social innovation, and community enterprises into a more cohesive conceptual framework and thus to better understand the historical and current impediments that have resulted in so many enterprises failing to be sustainable. The methodology is mixed and dual phased. The initial quantitative phase uses a questionnaire as the main research instrument distributed among community enterprises throughout Thailand which will provide the themes for the qualitative phase through semi-structured interviews with key stakeholders at a commercial enterprise actively engaged in social innovation. The findings seek to present a more comprehensive conceptual framework and actionable guidelines to aid community enterprises to develop social innovation in a sustainable manner that creates value to its beneficiaries.

Keywords: diffusion, community enterprises, social innovation, Thailand

Procedia PDF Downloads 123
24519 Robust and Dedicated Hybrid Cloud Approach for Secure Authorized Deduplication

Authors: Aishwarya Shekhar, Himanshu Sharma

Abstract:

Data deduplication is one of the important data compression techniques for eliminating duplicate copies of repeating data, and has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. In this process, duplicate data is expunged, leaving only a single instance of the data to be stored, although an index of all the data is still maintained. Data deduplication is an approach for minimizing the amount of storage space an organization requires to retain its data. In most companies, the storage systems carry identical copies of numerous pieces of data. Deduplication eliminates these additional copies by saving just one copy of the data and replacing the other copies with pointers that lead back to the primary copy. To avoid this duplication of data and to preserve confidentiality in the cloud, we apply the concept of the hybrid cloud. A hybrid cloud is a fusion of at least one public and one private cloud. As a proof of concept, we implement Java code which provides security and removes all types of duplicated data from the cloud.
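The single-copy-plus-pointers idea described in this abstract can be sketched in a few lines of Python. This is a minimal content-hash illustration, not the authors' Java implementation; it leaves out the hybrid-cloud authorization and encryption aspects, and the class and method names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct content."""

    def __init__(self):
        self._blocks = {}  # content digest -> stored bytes (the single copy)
        self._index = {}   # file name -> content digest (the pointer)

    def put(self, name, data):
        """Store data under name; return True only if new bytes were written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blocks
        if is_new:
            self._blocks[digest] = data
        # The index entry is kept for every file, duplicate or not.
        self._index[name] = digest
        return is_new

    def get(self, name):
        """Follow the pointer from the file name back to the stored copy."""
        return self._blocks[self._index[name]]

    def stored_bytes(self):
        """Total size of the unique content actually held."""
        return sum(len(block) for block in self._blocks.values())


if __name__ == "__main__":
    store = DedupStore()
    store.put("report_v1.txt", b"quarterly figures")
    store.put("report_copy.txt", b"quarterly figures")  # duplicate content
    print(store.stored_bytes())  # size of one copy, not two
```

Whole-file hashing is used here for brevity; block-level deduplication would apply the same lookup to fixed-size or content-defined chunks.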

Keywords: confidentiality, deduplication, data compression, hybridity of cloud

Procedia PDF Downloads 368
24518 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in all engineering and science domains and many others. The potential of large or massive data is undoubtedly significant, but making sense of it requires new ways of thinking and new learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. This paper reviews the latest advances in research on machine learning for big data processing. First, it covers the machine learning techniques in recent studies, such as deep learning, representation learning, transfer learning, active learning, and distributed and parallel learning. It then focuses on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 424
24517 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangement of data management in Indonesia. It is descriptive research with a qualitative approach, collecting data from a literature study. The result of this paper is a comprehensive arrangement of data that has been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and on protection of personal data are mutually reinforcing in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption

Procedia PDF Downloads 166
24516 Geomorphologic Evolution of the Southern Habble-Rud River Basin, North of Iran

Authors: Maryam Jaberi, Siavosh Shayan, Mojtaba Yamani

Abstract:

The Habble-Rud River basin (HR), up to 100 km in length, is one of the largest watersheds draining into the deserts north of Central Iran (Dasht-e Kavir). The stream runs obliquely with a NE-SW trend, flowing along the southern range of the central Alborz Mountains and the northern border of Central Iran. Over its final ~17 km, it suddenly changes direction to a southern trend, takes on a meandering morphology, passes through the Alborz Mountain ridge, and flows into the Garmsar plain, where it forms one of the largest alluvial fans in Iran, the vast Garmsar alluvial fan with an area of 476 km². This study was carried out through morphometric analyses, longitudinal river profiles, and the study of geomorphic evidence such as fluvial terraces, gypsum-salt domes, seismic data, and satellite images. It aimed to investigate the changes in the pattern of rivers in the southern part of the HR river basin. The southern part of the HR river basin, located at the southern foothills of the Central Alborz, is characterized by thrust faults (the Sorkheh-Kalut and Garmsar faults), folds, diapirs and an arid climate. The activity of more than 10 salt domes belonging to the Oligocene-Miocene period has considerably influenced the pattern of streams in this region. Dissolution of these domes has not only reduced the quality of water and soil resources, but has also led to the formation of badlands and gullies. Our results indicated that the pattern of rivers in the southern part of the HR river basin was influenced by the discharge of the HR river in the Quaternary, geological structure, subsidence of Central Iran, and vertical uplift of the Alborz mountains. These agents caused the formation of meanders in the southern part of the HR River and the evolution of seasonal rivers like Shoor-Darre and Garmabsar.

Keywords: geomorphologic evaluation, rivers pattern, Habble-Rud River basin, seasonal rivers

Procedia PDF Downloads 491
24515 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 130
24514 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the recommendation chosen for implementation. Although the data warehouses are logically federated, the implementation uses a single database system, which presented many advantages, such as cost reduction and improved data access for global users, allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)

Procedia PDF Downloads 225
24513 Case Study on Innovative Aquatic-Based Bioeconomy for Chlorella sorokiniana

Authors: Iryna Atamaniuk, Hannah Boysen, Nils Wieczorek, Natalia Politaeva, Iuliia Bazarnova, Kerstin Kuchta

Abstract:

Over the last decade, due to climate change and a strategy of natural resource preservation, interest in aquatic biomass has dramatically increased. Along with mitigation of environmental pressure and connection of waste streams (including CO2 and heat emissions), microalgae bioeconomy can supply the food, feed, pharmaceutical and power industries with a number of value-added products. Furthermore, in comparison to conventional biomass, microalgae can be cultivated in a wide range of conditions without compromising food and feed production, thus addressing issues associated with negative social and environmental impacts. This paper presents the state-of-the-art technology for microalgae bioeconomy, from the cultivation process to the production of valuable components and by-streams. The microalga Chlorella sorokiniana was cultivated in the pilot-scale innovation concept in Hamburg (Germany) using different systems such as a raceway pond (5000 L) and flat panel reactors (8 x 180 L). In order to achieve optimum growth conditions along with suitable cellular composition for the further extraction of the value-added components, process parameters such as light intensity, temperature and pH are continuously monitored. Nutrient needs were met by the addition of micro- and macro-nutrients to the medium to ensure autotrophic growth conditions for the microalgae. The cultivation was followed by downstream processing and extraction of lipids, proteins and saccharides. Lipid extraction is conducted in repeated-batch semi-automatic mode using the hot extraction method according to Randall. Hexane and ethanol are used as solvents at ratios of 9:1 and 1:9, respectively. Depending on the cell disruption method and the solvent ratio, the total lipid content showed significant variations between 8.1% and 13.9%.
The highest percentage of extracted biomass was reached with a sample pretreated with microwave digestion using 90% hexane and 10% ethanol as solvents. Protein content in the microalgae was determined by two different methods: Total Kjeldahl Nitrogen (TKN), which was then converted to protein content, and the Bradford method using Brilliant Blue G-250 dye. The results showed a good correlation between both methods, with protein content in the range of 39.8–47.1%. Characterization of neutral and acid saccharides from the microalgae was conducted by the phenol-sulfuric acid method at two wavelengths, 480 nm and 490 nm. The average concentration of neutral and acid saccharides under the optimal cultivation conditions was 19.5% and 26.1%, respectively. Subsequently, biomass residues are used as substrate for anaerobic digestion at laboratory scale. The methane concentration, which was measured on a daily basis, showed some variation between samples after the extraction steps but was in the range between 48% and 55%. CO2 formed during the fermentation process and after combustion in the Combined Heat and Power unit can potentially be used within the cultivation process as a carbon source for the photoautotrophic synthesis of biomass.

Keywords: bioeconomy, lipids, microalgae, proteins, saccharides

Procedia PDF Downloads 232
24512 A Review Paper on Data Mining and Genetic Algorithm

Authors: Sikander Singh Cheema, Jasmeen Kaur

Abstract:

In this paper, the concept of data mining is summarized, along with one of its important processes, KDD. Data mining based on genetic algorithms is examined, and ways to implement data mining with genetic algorithms are surveyed. This paper also conducts a formal review of data mining tasks and genetic algorithms in various fields.

Keywords: data mining, KDD, genetic algorithm, descriptive mining, predictive mining

Procedia PDF Downloads 578
24511 Data-Mining Approach to Analyzing Industrial Process Information for Real-Time Monitoring

Authors: Seung-Lock Seo

Abstract:

This work presents a data-mining empirical monitoring scheme for industrial processes with partially unbalanced data. Measurement data of good operations are relatively easy to gather, but during unusual special events or faults it is generally difficult to collect process information, and almost impossible to analyze some noisy data of industrial processes. In such cases, noise filtering techniques can be used to enhance process monitoring performance on a real-time basis. In addition, pre-processing of raw process data helps to eliminate unwanted variation in industrial process data. In this work, the performance of various monitoring schemes was tested and demonstrated for discrete batch process data. The results showed that the monitoring performance improved significantly in terms of the monitoring success rate for given process faults.

Keywords: data mining, process data, monitoring, safety, industrial processes

Procedia PDF Downloads 384
24510 Development of Integrated Solid Waste Management Plan for Industrial Estates of Pakistan

Authors: Mehak Masood

Abstract:

This paper aims to design an integrated solid waste management plan for industrial estates, taking Sundar Industrial Estate as a case model. The issue of solid waste management is on the rise in Pakistan, especially in the industrial sector. In this regard, the concept of developing and establishing industrial estates is gaining popularity. Without a proper solid waste management plan, it is very difficult to manage the day-to-day affairs of industrial estates. An industrial estate contains clusters of different types of industrial units. It is necessary to identify the different types of solid waste streams from each industrial cluster within the estate. Primary and secondary data collection, waste assessment, waste segregation and weighing, and field surveys were essential elements of the study. Wastes from each industrial process were identified and quantified. Currently 130 industries are in production, but after full colonization of the estate this number would reach 385. Detailed process flow diagrams were made to characterize the recyclable and non-recyclable waste. From the study it was calculated that about 12354.1 kg/capita/day of solid waste is being generated in Sundar Industrial Estate. After the full colonization of the industrial estate, the estimated quantity will be 4756328.5 kg/capita/day. Furthermore, the solid waste generated by each industrial sector was estimated. Suggestions for collection and transportation are given, and environment-friendly solid waste management practices are suggested. If an effective integrated waste management system is developed and implemented, it will conserve natural resources, create jobs, reduce poverty, protect the environment, save collection, transportation and disposal costs, and extend the life of disposal sites.
A major outcome of this study is an integrated solid waste management plan for the Sundar Industrial Estate which requires immediate implementation.

Keywords: integrated solid waste management plan, industrial estates, Sundar Industrial Estate, Pakistan

Procedia PDF Downloads 476
24509 Soils Properties of Alfisols in the Nicoya Peninsula, Guanacaste, Costa Rica

Authors: Elena Listo, Miguel Marchamalo

Abstract:

This research studies the properties of soils located in the watershed of the Jabillo River in Guanacaste province, Costa Rica. The soils are classified as Alfisols (T. Haplustalfs), in the flatter parts with grazing as Fluventic Haplustalfs, or as a consequence of bad drainage as F. Epiaqualfs. The objective of this project is to define the status of the soil, and to use remote sensing as a tool for analyzing the evolution of land use and determining the water balance of the watershed in order to improve the efficiency of the water collecting systems. Soil samples were analyzed from trial pits taken from secondary forests, degraded pastures, mature teak plantation, and regrowth of Tectona grandis L. F., a species that has developed favorably in the area. Furthermore, to complete the study, infiltration measurements were taken with an artificial rainfall simulator, and soil compaction was studied with a penetrometer, at points strategically selected from the different land uses. Regarding remote sensing, nearly 40 data samples were collected per plot of land. The source of radiation is reflected sunlight from the upper side and the underside of leaves, bare soil, streams, roads and logs, and soil samples. Infiltration reached high levels. The highest values were found in the secondary forest and mature plantation, due to a high proportion of organic matter, relatively low bulk density, and high hydraulic conductivity. Teak regrowth had a low rate of infiltration; the soil compaction studies showed partial compaction over 50 cm. The secondary forest presented a compaction layer from 15 cm to 30 cm deep, and the degraded pasture, as a result of grazing, in the first 15 cm. In this area, the Alfisols have a high content of iron oxides, which causes higher reflectivity close to the infrared region of the electromagnetic spectrum (around 700 nm), as a result of clay texture.
Specifically in the teak plantation, where the reflectivity reaches values of 90%, this is due to the high content of clay in relation to the others. In conclusion, the protective function of secondary forests is reaffirmed with regard to erosion and high rate of infiltration. In humid climates and permeable soils, the decrease of runoff is smaller; however, percolation increases. The remote sensing indicates that, being clay soils, they retain moisture better, which means a low reflectivity despite their fine texture.

Keywords: alfisols, Costa Rica, infiltration, remote sensing

Procedia PDF Downloads 677
24508 Investigation of Biogas from Slaughterhouse and Dairy Farm Waste

Authors: Saadelnour Abdueljabbar Adam

Abstract:

Wastes from slaughterhouses in most towns in Sudan are often poorly managed and sometimes discharged into adjoining streams due to poor implementation of standards, causing environmental and public health hazards; there is also a large amount of manure from dairy farms. This paper presents a solution for organic waste from dairy cow farms and slaughterhouses. We present the findings of an experimental investigation of biogas production in which cow manure, blood, and rumen content were mixed at three proportions: 72.3%, 61%, and 39% manure; 6%, 8.5%, and 22% blood; and 21.7%, 30.5%, and 39% rumen content by volume for bio-digesters 1, 2, and 3 respectively. This paper analyses the quantitative and qualitative composition of the biogas: gas content and the concentration of methane. The highest biogas output, 0.116 L/g dry matter, together with high-quality biogas of 85% methane, came from bio-digester 1, with the mixture of 72.3% manure, 6% blood, and 21.7% rumen content, which is useful for combustion and energy production. Bio-digesters 2 and 3 gave 0.012 L/g dry matter and 0.013 L/g dry matter respectively, with a weak methane concentration (50%).

Keywords: anaerobic digestion, bio-digester, blood, cow manure, rumen content

Procedia PDF Downloads 553
24507 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis are continuously producing huge volumes of biological data, which are available on the web. Such advances require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 434
24506 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

As there can be cases where the use of real data is somehow limited, such as when it is hard to get access to a large volume of real data, we need to go for synthetic data generation. This produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data since the MTS is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that carries out the data categorization into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 112
24505 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga, Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI offer a user-friendly way for non-technical professionals to generate synthetic data to augment existing data for further analysis. The SDV also provides additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC curves and AUC values for the data generated by adding the layer of Gaussian copula are much higher than those for the data generated by the CTGAN alone.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 56
24504 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name

Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing

Abstract:

Named Data Networking (NDN) replaces the IP address of the traditional network with a data name and adopts a dynamic cache mechanism. In the existing mechanism, however, only one-to-one search can be achieved because every piece of data has a unique name corresponding to it. There is a certain mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and the user’s interest can hardly be guaranteed. To solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption, which reduces the query overhead by optimizing the caching strategy. In this scheme, we use hash values to keep the user’s query safe from each node during the search, and likewise to protect the privacy of the requested data content.

Keywords: NDN, order-preserving encryption, fuzzy search, privacy

Procedia PDF Downloads 469
24503 Development of Electronic Waste Management Framework at College of Design Art, Design and Technology

Authors: Wafula Simon Peter, Kimuli Nabayego Ibtihal, Nabaggala Kimuli Nashua

Abstract:

The worldwide use of information and communications technology (ICT) equipment and other electronic equipment is growing and, consequently, there is a growing amount of equipment that becomes waste after its time in use. This growth is expected to accelerate since equipment lifetime decreases with time and growing consumption. As a result, e-waste is one of the fastest-growing waste streams globally. The United Nations University (UNU) calculates in its second Global E-waste Monitor that 44.7 million metric tonnes (Mt) of e-waste were generated globally in 2016. The study population was 80 respondents, from which a sample of 69 respondents was selected using simple and purposive sampling techniques. This research was carried out to investigate the problem of e-waste and come up with a framework to improve e-waste management. The objective of the study was to develop a framework for improving e-waste management at the College of Engineering, Design, Art and Technology (CEDAT). This was achieved by breaking it down into specific objectives: establishing the policy and other regulatory frameworks used in e-waste management at CEDAT, determining the effectiveness of the e-waste management practices at CEDAT, establishing the critical challenges constraining e-waste management at the College, and developing a framework for e-waste management. The study reviewed the e-waste regulatory framework used at the college and then collected data, which were used to come up with a framework. The study also established that a weak policy and regulatory framework, lack of proper infrastructure, improper disposal of e-waste and a general lack of awareness of e-waste and the magnitude of the problem are the critical challenges of e-waste management. In conclusion, the policy and regulatory framework should be revised, localized and strengthened to contextually address the problem. Awareness campaigns, the development of proper infrastructure and extensive research to establish the volumes and magnitude of the problem will also help. The study recommends a framework for the improvement of e-waste management.

Keywords: e-waste, treatment, disposal, computers, model, management policy and guidelines

Procedia PDF Downloads 64
24502 Healthcare Big Data Analytics Using Hadoop

Authors: Chellammal Surianarayanan

Abstract:

The healthcare industry is generating large amounts of data driven by various needs such as record keeping, physicians’ prescriptions, medical imaging, sensor data, Electronic Patient Records (EPR), laboratory, pharmacy, etc. Healthcare data is so big and complex that it cannot be managed by conventional hardware and software. The complexity of healthcare big data arises from the large volume of data, the velocity with which the data is accumulated, and its variety, which spans structured, semi-structured and unstructured data. Despite this complexity, if the trends and patterns that exist within the big data are uncovered and analyzed, higher quality healthcare at lower cost can be provided. Hadoop is an open source software framework for distributed processing of large data sets across clusters of commodity hardware using a simple programming model. The core components of Hadoop include the Hadoop Distributed File System, which offers a way to store large amounts of data across multiple machines, and MapReduce, which offers a way to process large data sets with a parallel, distributed algorithm on a cluster. The Hadoop ecosystem also includes various other tools such as Hive (a SQL-like query language), Pig (a higher level query language for MapReduce), HBase (a columnar data store), etc. In this paper, an analysis has been done of how healthcare big data can be processed and analyzed using the Hadoop ecosystem.
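
As an illustration of the MapReduce programming model mentioned in this abstract, the following is a minimal single-process sketch in plain Python. It does not use Hadoop itself; the record fields (`patient_id`, `diagnosis`) and all function names are hypothetical and are not taken from the paper. It only shows the map, shuffle and reduce phases on a toy list of patient records.

```python
from collections import defaultdict

# Hypothetical toy records; a real job would read these from HDFS.
RECORDS = [
    {"patient_id": 1, "diagnosis": "diabetes"},
    {"patient_id": 2, "diagnosis": "asthma"},
    {"patient_id": 3, "diagnosis": "diabetes"},
]

def map_phase(record):
    """Map step: emit one (key, 1) pair per record."""
    yield (record["diagnosis"], 1)

def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a count."""
    return key, sum(values)

def run_job(records):
    """Run map, shuffle and reduce in sequence, in one process."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

if __name__ == "__main__":
    print(run_job(RECORDS))
```

On a cluster, the map and reduce functions would run in parallel on different machines and the framework would perform the shuffle; here everything runs sequentially to keep the sketch self-contained.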

Keywords: big data analytics, Hadoop, healthcare data, towards quality healthcare

Procedia PDF Downloads 393
24501 Data Disorders in Healthcare Organizations: Symptoms, Diagnoses, and Treatments

Authors: Zakieh Piri, Shahla Damanabi, Peyman Rezaii Hachesoo

Abstract:

Introduction: Healthcare organizations, like other organizations, suffer from a number of disorders such as Business Sponsor Disorder, Business Acceptance Disorder, Cultural/Political Disorder, Data Disorder, etc. As quality in healthcare mostly depends on the quality of data, we aimed to identify data disorders and their symptoms in two teaching hospitals. Methods: Using a self-constructed questionnaire, we asked 20 questions related to the quality and usability of patient data stored in patient records. The research population consisted of 150 managers, physicians, nurses and medical record staff who were working at the time of the study. We also asked for their views about the symptoms of and treatments for any data disorders they mentioned in the questionnaire. We analyzed the answers using qualitative methods. Results: After classifying the answers, we found six main data disorders: incomplete data, missed data, late data, blurred data, manipulated data, and illegible data. The majority of participants believed they had important roles in the treatment of data disorders, while others attributed the disorders to health system problems. Discussion: As clinicians have important roles in producing data, they can easily identify symptoms and disorders of patient data. Health information managers can also play important roles in the early detection of data disorders through proactive monitoring and periodic check-ups of data.

Keywords: data disorders, quality, healthcare, treatment

Procedia PDF Downloads 419
24500 Big Data and Analytics in Higher Education: An Assessment of Its Status, Relevance and Future in the Republic of the Philippines

Authors: Byron Joseph A. Hallar, Annjeannette Alain D. Galang, Maria Visitacion N. Gumabay

Abstract:

One of the unique challenges posed by the twenty-first century to Philippine higher education is the utilization of Big Data. The higher education system in the Philippines is generating burgeoning amounts of data containing relevant content that can be used to generate the information and knowledge needed for accurate data-driven decision making. This study examines the status, relevance and future of Big Data and Analytics in Philippine higher education. The insights gained from the study may be relevant to other developing nations similarly situated as the Philippines.

Keywords: big data, data analytics, higher education, republic of the philippines, assessment

Procedia PDF Downloads 326
24499 Removal of VOCs from Gas Streams with Double Perovskite-Type Catalyst

Authors: Kuan Lun Pan, Moo Been Chang

Abstract:

Volatile organic compounds (VOCs) are among the major air contaminants, and they can react with nitrogen oxides (NOx) in the atmosphere under solar irradiation to form ozone (O3) and peroxyacetyl nitrate (PAN), leading to environmental hazards. In addition, some VOCs are toxic at low concentration levels and cause adverse effects on human health. How to effectively reduce VOC emissions has become an important issue. Thermal catalysis is regarded as an effective way to remove VOCs because it provides an oxidation route that converts VOCs into carbon dioxide (CO2) and water (H2O(g)). Single perovskite-type catalysts are promising for VOC removal, and they have good potential to replace noble metals due to their good activity and high thermal stability. Single perovskites can generally be described as ABO3 or A2BO4, where the A-site is often a rare earth element or an alkaline metal. Typically, the B-site is a transition metal cation (Fe, Cu, Ni, Co, or Mn). The catalytic properties of perovskites mainly rely on the nature, oxidation states and arrangement of the B-site cation. Interestingly, single perovskites can be further synthesized to form double perovskite-type catalysts, which can simply be represented by A2B′B″O6. Likewise, the A-site stands for an alkaline metal or rare earth element, and B′ and B″ are transition metals. Double perovskites possess unique surface properties. Structurally, the B-sites form a three-dimensional ordered arrangement in which B′O6 and B″O6 octahedra alternate and share corners along three directions of the crystal lattice, while the A-site cations occupy the voids between the octahedra. This structure has attracted considerable attention due to the specific alternating arrangement of the B-sites. Therefore, double perovskites may have more variations than single perovskites, and this greater variation may promote catalytic performance. It is expected that the activity of double perovskites is higher than that of single perovskites toward VOC removal. 
In this study, a double perovskite-type catalyst (La2CoMnO6) is prepared and evaluated for VOC removal. Single perovskites, including LaCoO3 and LaMnO3, are also tested for comparison. Toluene (C7H8) is one of the important VOCs commonly used in chemical processes. In addition to its wide application, C7H8 has high toxicity at low concentrations. Therefore, C7H8 is selected as the target compound in this study. Experimental results indicate that the double perovskite (La2CoMnO6) has better activity than the single perovskites. In particular, C7H8 can be completely oxidized to CO2 at 300 °C when La2CoMnO6 is applied. Characterization of the catalysts indicates that the double perovskite has unique surface properties and higher amounts of lattice oxygen, leading to higher activity. In the durability test, La2CoMnO6 maintains a high C7H8 removal efficiency of 100% at 300 °C and 30,000 h⁻¹, and it also shows good resistance to CO2 (5%) and H2O(g) (5%) in the gas streams tested. For the various VOCs tested, including isopropyl alcohol (C3H8O), ethanal (C2H4O), and ethylene (C2H4), an efficiency as high as 100% could be achieved with the double perovskite-type catalyst operated at 300 °C, indicating that double perovskites are promising catalysts for VOC removal. Possible mechanisms will be elucidated in this paper.

Keywords: volatile organic compounds, Toluene (C7H8), double perovskite-type catalyst, catalysis

Procedia PDF Downloads 151
24498 Assessment of Ground Water Potential Zone: A Case Study of Paramakudi Taluk, Ramanathapuram, Tamilnadu, India

Authors: Shri Devi

Abstract:

This study was conducted to identify the groundwater potential zones in Paramakudi taluk, Ramanathapuram, Tamilnadu, India, with a total areal extent of 745 sq. km. Various thematic maps were prepared for the study area, such as soil, geology, geomorphology, drainage and land use, using the toposheet at a scale of 1:50,000. The digital elevation model (DEM) was generated from contours at a 10 m interval, and the slope map was also prepared. The groundwater potential zone of the region was obtained using weighted overlay analysis, for which all the thematic maps were overlaid in ArcGIS 10.2. For this output, ranks were assigned to the parameters of each thematic layer, and the layers were given different weightages: 25% to soil, 25% to geomorphology, 25% to land use/land cover, 15% to slope, 5% to lineament and 5% to drainage streams. Using these, the overall potential zone map was prepared and overlaid with the village map to identify the regions with good, moderate and low groundwater potential.
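
The weighted overlay described in this abstract can be illustrated for a single map cell with a short Python sketch. The layer weights are the percentages stated in the abstract; the 1 to 5 rank scale, the class thresholds and the function names are assumptions made only for this example and are not the authors' ArcGIS workflow.

```python
# Layer weights in percent, as stated in the abstract (they sum to 100).
WEIGHTS = {
    "soil": 25,
    "geomorphology": 25,
    "land_use": 25,
    "slope": 15,
    "lineament": 5,
    "drainage": 5,
}

def weighted_overlay(ranks):
    """Return the weighted score of one cell from its per-layer ranks.

    `ranks` maps each layer name to a rank; a 1 (poor) to 5 (good)
    scale is assumed here for illustration.
    """
    if set(ranks) != set(WEIGHTS):
        raise ValueError("ranks must cover exactly the weighted layers")
    return sum(WEIGHTS[name] * ranks[name] for name in WEIGHTS) / 100.0

def classify(score, low=2.0, high=3.5):
    """Map a score to a potential class; the thresholds are hypothetical."""
    if score < low:
        return "low"
    if score < high:
        return "moderate"
    return "good"

if __name__ == "__main__":
    # Hypothetical ranks for one cell.
    cell = {"soil": 4, "geomorphology": 5, "land_use": 3,
            "slope": 4, "lineament": 2, "drainage": 1}
    score = weighted_overlay(cell)
    print(score, classify(score))
```

In a GIS the same weighted sum is evaluated for every raster cell; the sketch only makes the arithmetic behind the weightages explicit.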

Keywords: GIS, ground water, Paramakudi, weighted overlay analysis

Procedia PDF Downloads 324
24497 Data Management and Analytics for Intelligent Grid

Authors: G. Julius P. Roy, Prateek Saxena, Sanjeev Singh

Abstract:

Two decades ago, power distribution utilities would collect data from their customers at intervals of at least one month. The advent of SmartGrid and AMI has since increased the sampling frequency, leading to a 1,000- to 10,000-fold increase in data quantity. This increase is notable and led to the term Big Data being coined in utilities. The power distribution industry is one of the largest to handle huge and complex data, both for keeping history and for turning the data into something meaningful. The majority of utilities around the globe are adopting SmartGrid technologies as a mass implementation and are primarily focusing on strategic interdependence and synergies of the big data coming from new information sources like AMI and intelligent SCADA. There is therefore a rising need for new models of data management and a renewed focus on analytics to dissect data into descriptive, predictive and prescriptive subsets. The goal of this paper is to bring load disaggregation into the smart energy toolkit for commercial usage.

Keywords: data management, analytics, energy data analytics, smart grid, smart utilities

Procedia PDF Downloads 765