Search results for: data scarcity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24415

Search results for: data scarcity

24265 Protecting the Cloud Computing Data Through the Data Backups

Authors: Abdullah Alsaeed

Abstract:

Virtualized computing and cloud computing infrastructures are no longer fuzz or marketing term. They are a core reality in today’s corporate Information Technology (IT) organizations. Hence, developing an effective and efficient methodologies for data backup and data recovery is required more than any time. The purpose of data backup and recovery techniques are to assist the organizations to strategize the business continuity and disaster recovery approaches. In order to accomplish this strategic objective, a variety of mechanism were proposed in the recent years. This research paper will explore and examine the latest techniques and solutions to provide data backup and restoration for the cloud computing platforms.

Keywords: data backup, data recovery, cloud computing, business continuity, disaster recovery, cost-effective, data encryption.

Procedia PDF Downloads 55
24264 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 484
24263 Customer Data Analysis Model Using Business Intelligence Tools in Telecommunication Companies

Authors: Monica Lia

Abstract:

This article presents a customer data analysis model using business intelligence tools for data modelling, transforming, data visualization and dynamic reports building. Economic organizational customer’s analysis is made based on the information from the transactional systems of the organization. The paper presents how to develop the data model starting for the data that companies have inside their own operational systems. The owned data can be transformed into useful information about customers using business intelligence tool. For a mature market, knowing the information inside the data and making forecast for strategic decision become more important. Business Intelligence tools are used in business organization as support for decision-making.

Keywords: customer analysis, business intelligence, data warehouse, data mining, decisions, self-service reports, interactive visual analysis, and dynamic dashboards, use cases diagram, process modelling, logical data model, data mart, ETL, star schema, OLAP, data universes

Procedia PDF Downloads 392
24262 Evaluation of Shale Gas Resource Potential of Cambay Basin, Gujarat, India

Authors: Vaishali Sharma, Anirbid Sircar

Abstract:

Energy is one of the most eminent and fundamental strategic commodity, scarcity of which may poses great impact on the functioning of the entire commodity. According to the present study, the estimated reserves of gas in India as on 31.03.2015 stood at 1427.15 BCM. It is expected that the gas demand is set to grow significantly at a CAGR of 7% from 226.7 MMSCMD in 2012-13 to 713.5 MMSCMD in 2009-30. To bridge the gap between the demand and supply of energy, the interest towards the exploration and exploitation of unconventional resources like – Shale gas, Coal bed methane, Gas hydrates, tight gas etc has immensed. Nowadays, Shale gas prospects are emerging rapidly as a promising energy source globally. The United States of America (USA) has 240 TCF of proved reserves of shale gas and presently contributed more than 17% of total gas production. As compared to USA, shale gas production in India is at nascent stage. A resource potential of around 2000 TCF is estimated and according to preliminary data analysis, basins like Gondwana, Cambay, Krishna – Godavari, Cauvery, Assam-Arakan, Rajasthan, Vindhyan, and Bengal are the most promising shale gas basins. In the present study, the careful evaluation of Cambay Shale (Indian Shale) properties like geological age, lithology, depth, organically rich thickness, TOC, thermal maturity, porosity, permeability, clay content, quartz content, Kerogen type, Hydrocarbon window etc. has been done. And then the detailed comparison of Indian shale with USA shale will be discussed. This study investigates qualitative and quantitative nature of potential shale basins which will be helpful from exploration and exploitation point of view.

Keywords: shale, shale gas, energy source, lithology

Procedia PDF Downloads 264
24261 Probabilistic Life Cycle Assessment of the Nano Membrane Toilet

Authors: A. Anastasopoulou, A. Kolios, T. Somorin, A. Sowale, Y. Jiang, B. Fidalgo, A. Parker, L. Williams, M. Collins, E. J. McAdam, S. Tyrrel

Abstract:

Developing countries are nowadays confronted with great challenges related to domestic sanitation services in view of the imminent water scarcity. Contemporary sanitation technologies established in these countries are likely to pose health risks unless waste management standards are followed properly. This paper provides a solution to sustainable sanitation with the development of an innovative toilet system, called Nano Membrane Toilet (NMT), which has been developed by Cranfield University and sponsored by the Bill & Melinda Gates Foundation. The particular technology converts human faeces into energy through gasification and provides treated wastewater from urine through membrane filtration. In order to evaluate the environmental profile of the NMT system, a deterministic life cycle assessment (LCA) has been conducted in SimaPro software employing the Ecoinvent v3.3 database. The particular study has determined the most contributory factors to the environmental footprint of the NMT system. However, as sensitivity analysis has identified certain critical operating parameters for the robustness of the LCA results, adopting a stochastic approach to the Life Cycle Inventory (LCI) will comprehensively capture the input data uncertainty and enhance the credibility of the LCA outcome. For that purpose, Monte Carlo simulations, in combination with an artificial neural network (ANN) model, have been conducted for the input parameters of raw material, produced electricity, NOX emissions, amount of ash and transportation of fertilizer. The given analysis has provided the distribution and the confidence intervals of the selected impact categories and, in turn, more credible conclusions are drawn on the respective LCIA (Life Cycle Impact Assessment) profile of NMT system. Last but not least, the specific study will also yield essential insights into the methodological framework that can be adopted in the environmental impact assessment of other complex engineering systems subject to a high level of input data uncertainty.

Keywords: sanitation systems, nano-membrane toilet, lca, stochastic uncertainty analysis, Monte Carlo simulations, artificial neural network

Procedia PDF Downloads 200
24260 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 341
24259 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed as "big data." The technique of big data mining and big data analysis is extremely helpful for business movements such as making decisions, building organisational plans, researching the market efficiently, improving sales, etc., because typical management tools cannot handle such complicated datasets. Special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations, are brought on by big data. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining, its process, along with problems and difficulties, with a focus on the unique characteristics of big data. Organizations have several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is really analyzed. The idea of the mining and analysis of data and knowledge discovery techniques that have recently been created with practical application systems is presented in this study. This article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the management's main big data and data mining challenges.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 80
24258 Overcoming the Problems Affecting Drip Irrigation System through the Design of an Efficient Filtration and Flushing System

Authors: Stephen A. Akinlabi, Esther T. Akinlabi

Abstract:

The drip irrigation system is one of the important areas that affect the livelihood of farmers directly. The use of drip irrigation system has been the most efficient system compared to the other types of irrigations systems because the drip irrigation helps to save water and increase the productivity of crops. But like any other system, it can be considered inefficient when the filters and the emitters get clogged while in operation. The efficiency of the entire system is reduced when the emitters are clogged and blocked. This consequently impact and affect the farm operations which may result in scarcity of farm products and increase the demand. This design work focuses on how to overcome some of the challenges affecting drip irrigation system through the design of an efficient filtration and flushing system.

Keywords: drip irrigation system, filters, soil texture, mechanical engineering design, analysis

Procedia PDF Downloads 342
24257 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated the vast potential in streamlining, deciding, spotting business drifts in different fields, for example, producing, fund, Information Technology. This paper gives a multi-disciplinary diagram of the research issues in enormous information and its procedures, instruments, and system identified with the privacy, data storage management, network and energy utilization, adaptation to non-critical failure and information representations. Other than this, result difficulties and openings accessible in this Big Data platform have made.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 277
24256 Survey on Big Data Stream Classification by Decision Tree

Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi

Abstract:

Nowadays, the development of computers technology and its recent applications provide access to new types of data, which have not been considered by the traditional data analysts. Two particularly interesting characteristics of such data sets include their huge size and streaming nature .Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey on the obstacles and the requirements issues classifying data streams with using decision tree. The most important issue is to maintain a balance between accuracy and efficiency, the algorithm should provide good classification performance with a reasonable time response.

Keywords: big data, data streams, classification, decision tree

Procedia PDF Downloads 486
24255 Robust and Dedicated Hybrid Cloud Approach for Secure Authorized Deduplication

Authors: Aishwarya Shekhar, Himanshu Sharma

Abstract:

Data deduplication is one of important data compression techniques for eliminating duplicate copies of repeating data, and has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. In this process, duplicate data is expunged, leaving only one copy means single instance of the data to be accumulated. Though, indexing of each and every data is still maintained. Data deduplication is an approach for minimizing the part of storage space an organization required to retain its data. In most of the company, the storage systems carry identical copies of numerous pieces of data. Deduplication terminates these additional copies by saving just one copy of the data and exchanging the other copies with pointers that assist back to the primary copy. To ignore this duplication of the data and to preserve the confidentiality in the cloud here we are applying the concept of hybrid nature of cloud. A hybrid cloud is a fusion of minimally one public and private cloud. As a proof of concept, we implement a java code which provides security as well as removes all types of duplicated data from the cloud.

Keywords: confidentiality, deduplication, data compression, hybridity of cloud

Procedia PDF Downloads 355
24254 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in all engineering and science and many other domains. The potential of large or massive data is undoubtedly significant, make sense to require new ways of thinking and learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. In this paper, the latest advances and advancements in the researches on machine learning for big data processing. First, the machine learning techniques methods in recent studies, such as deep learning, representation learning, transfer learning, active learning and distributed and parallel learning. Then focus on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 406
24253 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangement of data management and data management in Indonesia. This paper is a descriptive research with qualitative approach and collecting data from literature study. Results of this paper are comprehensive arrangement of data that have been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and protection of personal data are mutually reinforcing arrangements in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption

Procedia PDF Downloads 152
24252 Impact of Collieries on Groundwater in Damodar River Basin

Authors: Rajkumar Ghosh

Abstract:

The industrialization of coal mining and related activities has a significant impact on groundwater in the surrounding areas of the Damodar River. The Damodar River basin, located in eastern India, is known as the "Ruhr of India" due to its abundant coal reserves and extensive coal mining and industrial operations. One of the major consequences of collieries on groundwater is the contamination of water sources. Coal mining activities often involve the excavation and extraction of coal through underground or open-pit mining methods. These processes can release various pollutants and chemicals into the groundwater, including heavy metals, acid mine drainage, and other toxic substances. As a result, the quality of groundwater in the Damodar River region has deteriorated, making it unsuitable for drinking, irrigation, and other purposes. The high concentration of heavy metals, such as arsenic, lead, and mercury, in the groundwater has posed severe health risks to the local population. Prolonged exposure to contaminated water can lead to various health problems, including skin diseases, respiratory issues, and even long-term ailments like cancer. The contamination has also affected the aquatic ecosystem, harming fish populations and other organisms dependent on the river's water. Moreover, the excessive extraction of groundwater for industrial processes, including coal washing and cooling systems, has resulted in a decline in the water table and depletion of aquifers. This has led to water scarcity and reduced availability of water for agricultural activities, impacting the livelihoods of farmers in the region. Efforts have been made to mitigate these issues through the implementation of regulations and improved industrial practices. However, the historical legacy of coal industrialization continues to impact the groundwater in the Damodar River area. Remediation measures, such as the installation of water treatment plants and the promotion of sustainable mining practices, are essential to restore the quality of groundwater and ensure the well-being of the affected communities. In conclusion, the coal industrialization in the Damodar River surrounding has had a detrimental impact on groundwater. This research focuses on soil subsidence induced by the over-exploitation of ground water for dewatering open pit coal mines. Soil degradation happens in arid and semi-arid regions as a result of land subsidence in coal mining region, which reduces soil fertility. Depletion of aquifers, contamination, and water scarcity are some of the key challenges resulting from these activities. It is crucial to prioritize sustainable mining practices, environmental conservation, and the provision of clean drinking water to mitigate the long-lasting effects of collieries on the groundwater resources in the region.

Keywords: coal mining, groundwater, soil subsidence, water table, damodar river

Procedia PDF Downloads 50
24251 Evaluation of a Hybrid System for Renewable Energy in a Small Island in Greece

Authors: M. Bertsiou, E. Feloni, E. Baltas

Abstract:

The proper management of the water supply and electricity is the key issue, especially in small islands, where sustainability has been combined with the autonomy and covering of water needs and the fast development in potential sectors of economy. In this research work a hybrid system in Fournoi island (Icaria), a small island of Aegean, has been evaluated in order to produce hydropower and cover water demands, as it can provide solutions to acute problems, such as the water scarcity or the instability of local power grids. The meaning and the utility of hybrid system and the cooperation with a desalination plant has also been considered. This kind of project has not yet been widely applied, so the consideration will give us valuable information about the storage of water and the controlled distribution of the generated clean energy. This process leads to the conclusions about the functioning of the system and the profitability of this project, covering the demand for water and electricity.

Keywords: hybrid system, water, electricity, island

Procedia PDF Downloads 296
24250 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 113
24249 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)

Procedia PDF Downloads 214
24248 Anti-lipidemic and Hematinic Potentials of Moringa Oleifera Leaves: A Clinical Trial on Type 2 Diabetic Subjects in a Rural Nigerian Community

Authors: Ifeoma C. Afiaenyi, Elizabeth K. Ngwu, Rufina N. B. Ayogu

Abstract:

Diabetes has crept into the rural areas of Nigeria, causing devastating effects on its sufferers; most of them could not afford diabetic medications. Moringa oleifera has been used extensively in animal models to demonstrate its antilipidaemic and haematinic qualities; however, there is a scarcity of data on the effect of graded levels of Moringa oleifera leaves on the lipid profile and hematological parameters in human diabetic subjects. The study determined the effect of Moringa oleifera leaves on the lipid profile and hematological parameters of type 2 diabetic subjects in Ukehe, a rural Nigerian community. Twenty-four adult male and female diabetic subjects were purposively selected for the study. These subjects were shared into four groups of six subjects each. The diets used in the study were isocaloric. A control group (diabetics, group 1) was fed diets without Moringa oleifera leaves. Experimental groups 2, 3 and 4 received 20g, 40g and 60g of Moringa oleifera leaves daily, respectively, in addition to the diets. The subjects' lipid profile and hematological parameters were measured prior to the feeding trial and at the end of the feeding trial. The feeding trial lasted for fourteen days. The data obtained were analyzed using the computer program Statistical Product for Service Solution (SPSS) for windows version 21. A Paired-samples t-test was used to compare the means of values collected before and after the feeding trial within the groups and significance was accepted at p < 0.05. There was a non-significant (p > 0.05) decrease in the mean total cholesterol of the subjects in groups 1, 2 and 3 after the feeding trial. There was a non-significant (p > 0.05) decrease in the mean triglyceride levels of the subjects in group 1 after the feeding trial. Groups 1 and 3 subjects had a non-significant (p > 0.05) decrease in their mean low-density lipoprotein (LDL) cholesterol after the feeding trial. Groups 1, 2 and 4 had a significant (p < 0.05) increase in their mean high-density lipoprotein (HDL) cholesterol after the feeding trial. A significant (p < 0.05) decrease in the mean hemoglobin level was observed only in group 4 subjects. Similarly, there was a significant (p < 0.05) decrease in the mean packed cell volume of group 4 subjects. It was only in group 4 that a significant (p < 0.05) decrease in the mean white blood cells of the subjects was also observed. The changes observed in the parameters assessed were not dose-dependent. Therefore, a similar study of longer duration and more samples is imperative to authenticate these results.

Keywords: anemia, diabetic subjects, lipid profile, moringa oleifera

Procedia PDF Downloads 158
24247 A Survey on Ambient Intelligence in Agricultural Technology

Authors: C. Angel, S. Asha

Abstract:

Despite the advances made in various new technologies, application of these technologies for agriculture still remains a formidable task, as it involves integration of diverse domains for monitoring the different process involved in agricultural management. Advances in ambient intelligence technology represents one of the most powerful technology for increasing the yield of agricultural crops and to mitigate the impact of water scarcity, climatic change and methods for managing pests, weeds, and diseases. This paper proposes a GPS-assisted, machine to machine solutions that combine information collected by multiple sensors for the automated management of paddy crops. To maintain the economic viability of paddy cultivation, the various techniques used in agriculture are discussed and a novel system which uses ambient intelligence technique is proposed in this paper. The ambient intelligence based agricultural system gives a great scope.

Keywords: ambient intelligence, agricultural technology, smart agriculture, precise farming

Procedia PDF Downloads 574
24246 A Review Paper on Data Mining and Genetic Algorithm

Authors: Sikander Singh Cheema, Jasmeen Kaur

Abstract:

In this paper, the concept of data mining is summarized and its one of the important process i.e KDD is summarized. The data mining based on Genetic Algorithm is researched in and ways to achieve the data mining Genetic Algorithm are surveyed. This paper also conducts a formal review on the area of data mining tasks and genetic algorithm in various fields.

Keywords: data mining, KDD, genetic algorithm, descriptive mining, predictive mining

Procedia PDF Downloads 565
24245 Data-Mining Approach to Analyzing Industrial Process Information for Real-Time Monitoring

Authors: Seung-Lock Seo

Abstract:

This work presents a data-mining empirical monitoring scheme for industrial processes with partially unbalanced data. Measurement data of good operations are relatively easy to gather, but in unusual special events or faults it is generally difficult to collect process information or almost impossible to analyze some noisy data of industrial processes. At this time some noise filtering techniques can be used to enhance process monitoring performance in a real-time basis. In addition, pre-processing of raw process data is helpful to eliminate unwanted variation of industrial process data. In this work, the performance of various monitoring schemes was tested and demonstrated for discrete batch process data. It showed that the monitoring performance was improved significantly in terms of monitoring success rate of given process faults.

Keywords: data mining, process data, monitoring, safety, industrial processes

Procedia PDF Downloads 368
24244 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances of computer science and data analysis are helping to provide continuously huge volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information-integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, we focus on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 415
24243 Utilization of “Adlai” (Coix lacryma-jobi L) Flour as Wheat Flour Extender in Selected Baked Products in the Philippines

Authors: Rolando B. Llave Jr.

Abstract:

In many countries, wheat flour is used an essential component in production/preparation of bread and other baked products considered to have a significant role in man’s diet. Partial replacement of wheat flour with other flours (composite flour) in preparation of the said products is seen as a solution to the scarcity of wheat flour (in non-wheat producing countries), and improved nourishment. In composite flour, other flours may come from cereals, legumes, root crops, and those that are rich in starch. Many countries utilize whatever is locally available. “Adlai” or Job’s tears is a tall cereal plant that belongs to the same family of grass as wheat, rice, and corn. In some countries, it is used as an ingredient in producing many dishes and alcoholic and non-alcoholic beverages. As part of the Food Staple Self-Sufficiency Program (FSSP) of the Department of Agriculture (DA) in the Philippines, “adlai” is being promoted as alternative food source for the Filipinos. In this study, the grits coming from the seeds of “adlai” were turned into flour. The resulting flour was then used as partial replacement for wheat flour in selected baked products namely “pan de sal” (salt bread), cupcakes and cookies. The supplementation of “adlai” flour ranged 20%-45% with 20%-35% for “pan de sal”; 30%-45% for cupcakes; and 25% - 40% for cookies. The study was composed of four (4) phases. Phase I was product formulation studies. Phase II included the acceptability test/sensory evaluation of the baked products where the statistical analysis of the data gathered followed. Phase III was the computation of the theoretical protein content of the most acceptable “pan de sal”, cupcake and cookie, and lastly, in Phase IV, cost benefit was analyzed, specifically in terms of the direct material cost.

Keywords: “adlai”, composite flour, supplementation, sensory evaluation

Procedia PDF Downloads 817
24242 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

As there can be cases where the use of real data is somehow limited, such as when it is hard to get access to a large volume of real data, we need to go for synthetic data generation. This produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data since the MTS is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that carries out the data categorization into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 90
24241 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga. Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI have offered a platform that is user-friendly and accessible to non-technical professionals to generate synthetic data to augment existing data for further analysis. The SDV also provides for additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC and AUC curves for the data generated by adding the layer of Gaussian copula are much higher than the data generated by the CTGAN.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 32
24240 Unequal Traveling: How School District System and School District Housing Characteristics Shape the Duration of Families Commuting

Authors: Geyang Xia

Abstract:

In many countries, governments have responded to the growing demand for educational resources through school district systems, and there is substantial evidence that school district systems have been effective in promoting inter-district and inter-school equity in educational resources. However, the scarcity of quality educational resources has brought about varying levels of education among different school districts, making it a common choice for many parents to buy a house in the school district where a quality school is located, and they are even willing to bear huge commuting costs for this purpose. Moreover, this is evidenced by the fact that parents of families in school districts with quality education resources have longer average commute lengths and longer average commute distances than parents in average school districts. This "unequal traveling" under the influence of the school district system is more common in school districts at the primary level of education. This further reinforces the differential hierarchy of educational resources and raises issues of inequitable educational public services, education-led residential segregation, and gentrification of school district housing. Against this background, this paper takes Nanjing, a famous educational city in China, as a case study and selects the school districts where the top 10 public elementary schools are located. The study first identifies the spatio-temporal behavioral trajectory dataset of these high-quality school district households by using spatial vector data, decrypted cell phone signaling data, and census data. Then, by constructing a "house-school-work (HSW)" commuting pattern of the population in the school district where the high-quality educational resources are located, and based on the classification of the HSW commuting pattern of the population, school districts with long employment hours were identified. Ultimately, the mechanisms and patterns inherent in this unequal commuting are analyzed in terms of six aspects, including the centrality of school district location, functional diversity, and accessibility. The results reveal that the "unequal commuting" of Nanjing's high-quality school districts under the influence of the school district system occurs mainly in the peripheral areas of the city, and the schools matched with these high-quality school districts are mostly branches of prestigious schools in the built-up areas of the city's core. At the same time, the centrality of school district location and the diversity of functions are the most important influencing factors of unequal commuting in high-quality school districts. Based on the research results, this paper proposes strategies to optimize the spatial layout of high-quality educational resources and corresponding transportation policy measures.

Keywords: school-district system, high quality school district, commuting pattern, unequal traveling

Procedia PDF Downloads 64
24239 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name

Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing

Abstract:

Named Data Networking (NDN) replaces IP address of traditional network with data name, and adopts dynamic cache mechanism. In the existing mechanism, however, only one-to-one search can be achieved because every data has a unique name corresponding to it. There is a certain mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and user’s interest can hardly be guaranteed. In order to solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption to reduce the query overhead by optimizing the caching strategy. In this scheme, we use hash value to ensure the user’s query safe from each node in the process of search, so does the privacy of the requiring data content.

Keywords: NDN, order-preserving encryption, fuzzy search, privacy

Procedia PDF Downloads 448
24238 Hydro-Climatological, Geological, Hydrogeological and Geochemical Study of the Coastal Aquifer System of Chiba Watershed (Cape Bon Peninsula)

Authors: Khawla Askri, Mohamed Haythem Msaddek, AbdelAziz Sebei

Abstract:

Climate change combined with the increase in anthropogenic activities will affect coastal groundwater systems around the world and, more particularly, the Cap Bon region in the North East of Tunisia. This study aims to study the impact of climate change and human stress on the salinization and quantification of groundwater in the Wadi Chiba watershed. In this regard, a hydro-climatological study and a hydrogeological study were carried out based on the characterization of the aquifer system of the eastern coast at the level of the watershed of Wadi Chiba in order to seek to identify, first of all, the degradation of the state of the aquifer on the quantitative level by the study of the piezometric and its evolution over time. Secondly, we sought to identify the degradation of the state of the aquifer qualitatively by using the geochemical method, in particular the major elements, to assess the mineralization of the aquifer water and understand its hydrogeochemical functioning. The study of the Na + / Cl- and Ca2 + / Mg2 + chemical relationships confirmed the presence of a marine intrusion downstream of the Wadi Chiba watershed northeast of Cap-Bon accompanied by a piezometric depression. For this purpose, we proceeded to: 1) Mapping of both piezometric data and salinity. 2) The interpretation of the mapping results. 3)Identification of the origin of the localized deterioration in the quality of the aquifer water. Finally, the analysis of the results showed that the scarcity of water is already forcing human actions in the Chiba watershed due to the irrigation of agricultural lands and the overexploitation of the water table in the study area.

Keywords: climate change, human activities, water table, Wadi Chiba watershed, piezometric depression, marine intrusion

Procedia PDF Downloads 53
24237 Healthcare Big Data Analytics Using Hadoop

Authors: Chellammal Surianarayanan

Abstract:

Healthcare industry is generating large amounts of data driven by various needs such as record keeping, physician’s prescription, medical imaging, sensor data, Electronic Patient Record(EPR), laboratory, pharmacy, etc. Healthcare data is so big and complex that they cannot be managed by conventional hardware and software. The complexity of healthcare big data arises from large volume of data, the velocity with which the data is accumulated and different varieties such as structured, semi-structured and unstructured nature of data. Despite the complexity of big data, if the trends and patterns that exist within the big data are uncovered and analyzed, higher quality healthcare at lower cost can be provided. Hadoop is an open source software framework for distributed processing of large data sets across clusters of commodity hardware using a simple programming model. The core components of Hadoop include Hadoop Distributed File System which offers way to store large amount of data across multiple machines and MapReduce which offers way to process large data sets with a parallel, distributed algorithm on a cluster. Hadoop ecosystem also includes various other tools such as Hive (a SQL-like query language), Pig (a higher level query language for MapReduce), Hbase(a columnar data store), etc. In this paper an analysis has been done as how healthcare big data can be processed and analyzed using Hadoop ecosystem.

Keywords: big data analytics, Hadoop, healthcare data, towards quality healthcare

Procedia PDF Downloads 379
24236 Data Disorders in Healthcare Organizations: Symptoms, Diagnoses, and Treatments

Authors: Zakieh Piri, Shahla Damanabi, Peyman Rezaii Hachesoo

Abstract:

Introduction: Healthcare organizations like other organizations suffer from a number of disorders such as Business Sponsor Disorder, Business Acceptance Disorder, Cultural/Political Disorder, Data Disorder, etc. As quality in healthcare care mostly depends on the quality of data, we aimed to identify data disorders and its symptoms in two teaching hospitals. Methods: Using a self-constructed questionnaire, we asked 20 questions in related to quality and usability of patient data stored in patient records. Research population consisted of 150 managers, physicians, nurses, medical record staff who were working at the time of study. We also asked their views about the symptoms and treatments for any data disorders they mentioned in the questionnaire. Using qualitative methods we analyzed the answers. Results: After classifying the answers, we found six main data disorders: incomplete data, missed data, late data, blurred data, manipulated data, illegible data. The majority of participants believed in their important roles in treatment of data disorders while others believed in health system problems. Discussion: As clinicians have important roles in producing of data, they can easily identify symptoms and disorders of patient data. Health information managers can also play important roles in early detection of data disorders by proactively monitoring and periodic check-ups of data.

Keywords: data disorders, quality, healthcare, treatment

Procedia PDF Downloads 406