Search results for: accurate data

25892 Variability of Hydrological Modeling of the Blue Nile

Authors: Abeer Samy, Oliver C. Saavedra Valeriano, Abdelazim Negm

Abstract:

The Blue Nile Basin is the most important tributary of the Nile River. Egypt and Sudan are almost dependent on water originated from the Blue Nile. This multi-dependency creates conflicts among the three countries Egypt, Sudan, and Ethiopia making the management of these conflicts as an international issue. Good assessment of the water resources of the Blue Nile is an important to help in managing such conflicts. Hydrological models are good tool for such assessment. This paper presents a critical review of the nature and variability of the climate and hydrology of the Blue Nile Basin as a first step of using hydrological modeling to assess the water resources of the Blue Nile. Many several attempts are done to develop basin-scale hydrological modeling on the Blue Nile. Lumped and semi distributed models used averages of meteorological inputs and watershed characteristics in hydrological simulation, to analyze runoff for flood control and water resource management. Distributed models include the temporal and spatial variability of catchment conditions and meteorological inputs to allow better representation of the hydrological process. The main challenge of all used models was to assess the water resources of the basin is the shortage of the data needed for models calibration and validation. It is recommended to use distributed model for their higher accuracy to cope with the great variability and complexity of the Blue Nile basin and to collect sufficient data to have more sophisticated and accurate hydrological modeling.

Keywords: Blue Nile Basin, climate change, hydrological modeling, watershed

Procedia PDF Downloads 364

25891 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 508

25890 Customer Data Analysis Model Using Business Intelligence Tools in Telecommunication Companies

Authors: Monica Lia

Abstract:

This article presents a customer data analysis model using business intelligence tools for data modelling, transforming, data visualization and dynamic reports building. Economic organizational customer’s analysis is made based on the information from the transactional systems of the organization. The paper presents how to develop the data model starting for the data that companies have inside their own operational systems. The owned data can be transformed into useful information about customers using business intelligence tool. For a mature market, knowing the information inside the data and making forecast for strategic decision become more important. Business Intelligence tools are used in business organization as support for decision-making.

Keywords: customer analysis, business intelligence, data warehouse, data mining, decisions, self-service reports, interactive visual analysis, and dynamic dashboards, use cases diagram, process modelling, logical data model, data mart, ETL, star schema, OLAP, data universes

Procedia PDF Downloads 428

25889 Predicting Costs in Construction Projects with Machine Learning: A Detailed Study Based on Activity-Level Data

Authors: Soheila Sadeghi

Abstract:

Construction projects are complex and often subject to significant cost overruns due to the multifaceted nature of the activities involved. Accurate cost estimation is crucial for effective budget planning and resource allocation. Traditional methods for predicting overruns often rely on expert judgment or analysis of historical data, which can be time-consuming, subjective, and may fail to consider important factors. However, with the increasing availability of data from construction projects, machine learning techniques can be leveraged to improve the accuracy of overrun predictions. This study applied machine learning algorithms to enhance the prediction of cost overruns in a case study of a construction project. The methodology involved the development and evaluation of two machine learning models: Random Forest and Neural Networks. Random Forest can handle high-dimensional data, capture complex relationships, and provide feature importance estimates. Neural Networks, particularly Deep Neural Networks (DNNs), are capable of automatically learning and modeling complex, non-linear relationships between input features and the target variable. These models can adapt to new data, reduce human bias, and uncover hidden patterns in the dataset. The findings of this study demonstrate that both Random Forest and Neural Networks can significantly improve the accuracy of cost overrun predictions compared to traditional methods. The Random Forest model also identified key cost drivers and risk factors, such as changes in the scope of work and delays in material delivery, which can inform better project risk management. However, the study acknowledges several limitations. First, the findings are based on a single construction project, which may limit the generalizability of the results to other projects or contexts. Second, the dataset, although comprehensive, may not capture all relevant factors influencing cost overruns, such as external economic conditions or political factors. Third, the study focuses primarily on cost overruns, while schedule overruns are not explicitly addressed. Future research should explore the application of machine learning techniques to a broader range of projects, incorporate additional data sources, and investigate the prediction of both cost and schedule overruns simultaneously.

Keywords: cost prediction, machine learning, project management, random forest, neural networks

Procedia PDF Downloads 51

25888 A Machine Learning Approach for Efficient Resource Management in Construction Projects

Authors: Soheila Sadeghi

Abstract:

Construction projects are complex and often subject to significant cost overruns due to the multifaceted nature of the activities involved. Accurate cost estimation is crucial for effective budget planning and resource allocation. Traditional methods for predicting overruns often rely on expert judgment or analysis of historical data, which can be time-consuming, subjective, and may fail to consider important factors. However, with the increasing availability of data from construction projects, machine learning techniques can be leveraged to improve the accuracy of overrun predictions. This study applied machine learning algorithms to enhance the prediction of cost overruns in a case study of a construction project. The methodology involved the development and evaluation of two machine learning models: Random Forest and Neural Networks. Random Forest can handle high-dimensional data, capture complex relationships, and provide feature importance estimates. Neural Networks, particularly Deep Neural Networks (DNNs), are capable of automatically learning and modeling complex, non-linear relationships between input features and the target variable. These models can adapt to new data, reduce human bias, and uncover hidden patterns in the dataset. The findings of this study demonstrate that both Random Forest and Neural Networks can significantly improve the accuracy of cost overrun predictions compared to traditional methods. The Random Forest model also identified key cost drivers and risk factors, such as changes in the scope of work and delays in material delivery, which can inform better project risk management. However, the study acknowledges several limitations. First, the findings are based on a single construction project, which may limit the generalizability of the results to other projects or contexts. Second, the dataset, although comprehensive, may not capture all relevant factors influencing cost overruns, such as external economic conditions or political factors. Third, the study focuses primarily on cost overruns, while schedule overruns are not explicitly addressed. Future research should explore the application of machine learning techniques to a broader range of projects, incorporate additional data sources, and investigate the prediction of both cost and schedule overruns simultaneously.

Keywords: resource allocation, machine learning, optimization, data-driven decision-making, project management

Procedia PDF Downloads 36

25887 Life Stage Customer Segmentation by Fine-Tuning Large Language Models

Authors: Nikita Katyal, Shaurya Uppal

Abstract:

This paper tackles the significant challenge of accurately classifying customers within a retailer’s customer base. Accurate classification is essential for developing targeted marketing strategies that effectively engage this important demographic. To address this issue, we propose a method that utilizes Large Language Models (LLMs). By employing LLMs, we analyze the metadata associated with product purchases derived from historical data to identify key product categories that act as distinguishing factors. These categories, such as baby food, eldercare products, or family-sized packages, offer valuable insights into the likely household composition of customers, including families with babies, families with kids/teenagers, families with pets, households caring for elders, or mixed households. We segment high-confidence customers into distinct categories by integrating historical purchase behavior with LLM-powered product classification. This paper asserts that life stage segmentation can significantly enhance e-commerce businesses’ ability to target the appropriate customers with tailored products and campaigns, thereby augmenting sales and improving customer retention. Additionally, the paper details the data sources, model architecture, and evaluation metrics employed for the segmentation task.

Keywords: LLMs, segmentation, product tags, fine-tuning, target segments, marketing communication

Procedia PDF Downloads 21

25886 An Overview of the Wind and Wave Climate in the Romanian Nearshore

Authors: Liliana Rusu

Abstract:

The goal of the proposed work is to provide a more comprehensive picture of the wind and wave climate in the Romanian nearshore, using the results provided by numerical models. The Romanian coastal environment is located in the western side of the Black Sea, the more energetic part of the sea, an area with heavy maritime traffic and various offshore operations. Information about the wind and wave climate in the Romanian waters is mainly based on observations at Gloria drilling platform (70 km from the coast). As regards the waves, the measurements of the wave characteristics are not so accurate due to the method used, being also available for a limited period. For this reason, the wave simulations that cover large temporal and spatial scales represent an option to describe better the wave climate. To assess the wind climate in the target area spanning 1992–2016, data provided by the NCEP-CFSR (U.S. National Centers for Environmental Prediction - Climate Forecast System Reanalysis) and consisting in wind fields at 10m above the sea level are used. The high spatial and temporal resolution of the wind fields is good enough to represent the wind variability over the area. For the same 25-year period, as considered for the wind climate, this study characterizes the wave climate from a wave hindcast data set that uses NCEP-CFSR winds as input for a model system SWAN (Simulating WAves Nearshore) based. The wave simulation results with a two-level modelling scale have been validated against both in situ measurements and remotely sensed data. The second level of the system, with a higher resolution in the geographical space (0.02°×0.02°), is focused on the Romanian coastal environment. The main wave parameters simulated at this level are used to analyse the wave climate. The spatial distributions of the wind speed, wind direction and the mean significant wave height have been computed as the average of the total data. As resulted from the amount of data, the target area presents a generally moderate wave climate that is affected by the storm events developed in the Black Sea basin. Both wind and wave climate presents high seasonal variability. All the results are computed as maps that help to find the more dangerous areas. A local analysis has been also employed in some key locations corresponding to highly sensitive areas, as for example the main Romanian harbors.

Keywords: numerical simulations, Romanian nearshore, waves, wind

Procedia PDF Downloads 343

25885 Physicochemical Characterization of Coastal Aerosols over the Mediterranean Comparison with Weather Research and Forecasting-Chem Simulations

Authors: Stephane Laussac, Jacques Piazzola, Gilles Tedeschi

Abstract:

Estimation of the impact of atmospheric aerosols on the climate evolution is an important scientific challenge. One of a major source of particles is constituted by the oceans through the generation of sea-spray aerosols. In coastal areas, marine aerosols can affect air quality through their ability to interact chemically and physically with other aerosol species and gases. The integration of accurate sea-spray emission terms in modeling studies is then required. However, it was found that sea-spray concentrations are not represented with the necessary accuracy in some situations, more particularly at short fetch. In this study, the WRF-Chem model was implemented on a North-Western Mediterranean coastal region. WRF-Chem is the Weather Research and Forecasting (WRF) model online-coupled with chemistry for investigation of regional-scale air quality which simulates the emission, transport, mixing, and chemical transformation of trace gases and aerosols simultaneously with the meteorology. One of the objectives was to test the ability of the WRF-Chem model to represent the fine details of the coastal geography to provide accurate predictions of sea spray evolution for different fetches and the anthropogenic aerosols. To assess the performance of the model, a comparison between the model predictions using a local emission inventory and the physicochemical analysis of aerosol concentrations measured for different wind direction on the island of Porquerolles located 10 km south of the French Riviera is proposed.

Keywords: sea-spray aerosols, coastal areas, sea-spray concentrations, short fetch, WRF-Chem model

Procedia PDF Downloads 194

25884 Modelling Biological Treatment of Dye Wastewater in SBR Systems Inoculated with Bacteria by Artificial Neural Network

Authors: Yasaman Sanayei, Alireza Bahiraie

Abstract:

This paper presents a systematic methodology based on the application of artificial neural networks for sequencing batch reactor (SBR). The SBR is a fill-and-draw biological wastewater technology, which is specially suited for nutrient removal. Employing reactive dye by Sphingomonas paucimobilis bacteria at sequence batch reactor is a novel approach of dye removal. The influent COD, MLVSS, and reaction time were selected as the process inputs and the effluent COD and BOD as the process outputs. The best possible result for the discrete pole parameter was a= 0.44. In orderto adjust the parameters of ANN, the Levenberg-Marquardt (LM) algorithm was employed. The results predicted by the model were compared to the experimental data and showed a high correlation with R2> 0.99 and a low mean absolute error (MAE). The results from this study reveal that the developed model is accurate and efficacious in predicting COD and BOD parameters of the dye-containing wastewater treated by SBR. The proposed modeling approach can be applied to other industrial wastewater treatment systems to predict effluent characteristics. Note that SBR are normally operated with constant predefined duration of the stages, thus, resulting in low efficient operation. Data obtained from the on-line electronic sensors installed in the SBR and from the control quality laboratory analysis have been used to develop the optimal architecture of two different ANN. The results have shown that the developed models can be used as efficient and cost-effective predictive tools for the system analysed.

Keywords: artificial neural network, COD removal, SBR, Sphingomonas paucimobilis

Procedia PDF Downloads 412

25883 Orbit Determination Modeling with Graphical Demonstration

Authors: Assem M. F. Sallam, Ah. El-S. Makled

Abstract:

In this paper, there is an implementation, verification, and graphical demonstration of a software application, which can be used swiftly over different preliminary orbit determination methods. A passive orbit determination method is used in this study to determine the location of a satellite or a flying body. It is named a passive orbit determination because it depends on observation without the use of any aids (radio and laser) installed on satellite. In order to understand how these methods work and how their output is accurate when compared with available verification data, the built models help in knowing the different inputs used with each method. Output from the different orbit determination methods (Gibbs, Lambert, and Gauss) will be compared with each other and verified by the data obtained from Satellite Tool Kit (STK) application. A modified model including all of the orbit determination methods using the same input will be introduced to investigate different models output (orbital parameters) for the same input (azimuth, elevation, and time). Simulation software is implemented using MATLAB. A Graphical User Interface (GUI) application named OrDet is produced using the GUI of MATLAB. It includes all the available used inputs and it outputs the current Classical Orbital Elements (COE) of satellite under observation. Produced COE are then used to propagate for a complete revolution and plotted on a 3-D view. Modified model which uses an adapter to allow same input parameters, passes these parameters to the preliminary orbit determination methods under study. Result from all orbit determination methods yield exactly the same COE output, which shows the equality of concept in determination of satellite’s location, but with different numerical methods.

Keywords: orbit determination, STK, Matlab-GUI, satellite tracking

Procedia PDF Downloads 276

25882 Neural Networks Models for Measuring Hotel Users Satisfaction

Authors: Asma Ameur, Dhafer Malouche

Abstract:

Nowadays, user comments on the Internet have an important impact on hotel bookings. This confirms that the e-reputation issue can influence the likelihood of customer loyalty to a hotel. In this way, e-reputation has become a real differentiator between hotels. For this reason, we have a unique opportunity in the opinion mining field to analyze the comments. In fact, this field provides the possibility of extracting information related to the polarity of user reviews. This sentimental study (Opinion Mining) represents a new line of research for analyzing the unstructured textual data. Knowing the score of e-reputation helps the hotelier to better manage his marketing strategy. The score we then obtain is translated into the image of hotels to differentiate between them. Therefore, this present research highlights the importance of hotel satisfaction ‘scoring. To calculate the satisfaction score, the sentimental analysis can be manipulated by several techniques of machine learning. In fact, this study treats the extracted textual data by using the Artificial Neural Networks Approach (ANNs). In this context, we adopt the aforementioned technique to extract information from the comments available in the ‘Trip Advisor’ website. This actual paper details the description and the modeling of the ANNs approach for the scoring of online hotel reviews. In summary, the validation of this used method provides a significant model for hotel sentiment analysis. So, it provides the possibility to determine precisely the polarity of the hotel users reviews. The empirical results show that the ANNs are an accurate approach for sentiment analysis. The obtained results show also that this proposed approach serves to the dimensionality reduction for textual data’ clustering. Thus, this study provides researchers with a useful exploration of this technique. Finally, we outline guidelines for future research in the hotel e-reputation field as comparing the ANNs with other technique.

Keywords: clustering, consumer behavior, data mining, e-reputation, machine learning, neural network, online hotel ‘reviews, opinion mining, scoring

Procedia PDF Downloads 136

25881 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 370

25880 Case Study of Sexual Violence Victim Assessment in Semarang Regency

Authors: Sujana T, Kurniasari MD, Ayakeding AM

Abstract:

Background: Sexual violence is one of the violence with high incidence in Indonesia. Purpose: This research aims to describe the implementation of sexual violence victim assessment in Semarang Regency. Method: This research is a qualitative research with embeded single case study design. Data is analized with two units of analysis. The first unit of analysis is victim’s examiner with minimum one year of work experience. Semi-structured interview method is used to obtain the data. The second unit of analysis is document related. The data is taken by observing the pathway and description of every document and how it supported each implementation of assessment. Results: This study is resulted with three themes, which are: The first theme is assessments of sexual violence in Semarang regency has been standardized. The laws of the Republic of Indonesia have regulated the handling of victims of sexual violence in outline. Victims of sexual violence can be dealt with by the police, the Integrated Service Center for Women and Children Empowerment and the Regional General Hospital. Each examination site has different operational procedures standards for dealing with victims of sexual violence. Cooperation with family and witnesses is also required in the review process to obtain accurate results and evidence; The second idea that resulted from this study is there are inhibits factors in the assessments process. Victims sometimes feel embarrassed and reluctant to recount the chronological events during reporting. The examining officer should be able to approach and build a trust to convince the victim to be able to cooperate. The third theme is there are other things to consider in the process of assessing victims of sexual violence. Ensuring implementation in accordance with applicable operational procedures standards, providing exclusive examination rooms, counseling and safeguarding the privacy of victims are important to be considered in the assessment.

Keywords: assessment, case study, Semarang regency, sexual violence

Procedia PDF Downloads 138

25879 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed as "big data." The technique of big data mining and big data analysis is extremely helpful for business movements such as making decisions, building organisational plans, researching the market efficiently, improving sales, etc., because typical management tools cannot handle such complicated datasets. Special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations, are brought on by big data. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining, its process, along with problems and difficulties, with a focus on the unique characteristics of big data. Organizations have several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is really analyzed. The idea of the mining and analysis of data and knowledge discovery techniques that have recently been created with practical application systems is presented in this study. This article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the management's main big data and data mining challenges.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 108

25878 Artificial Intelligence Methods in Estimating the Minimum Miscibility Pressure Required for Gas Flooding

Authors: Emad A. Mohammed

Abstract:

Utilizing the capabilities of Data Mining and Artificial Intelligence in the prediction of the minimum miscibility pressure (MMP) required for multi-contact miscible (MCM) displacement of reservoir petroleum by hydrocarbon gas flooding using Fuzzy Logic models and Artificial Neural Network models will help a lot in giving accurate results. The factors affecting the (MMP) as it is proved from the literature and from the dataset are as follows: XC2-6: Intermediate composition in the oil-containing C2-6, CO2 and H2S, in mole %, XC1: Amount of methane in the oil (%),T: Temperature (°C), MwC7+: Molecular weight of C7+ (g/mol), YC2+: Mole percent of C2+ composition in injected gas (%), MwC2+: Molecular weight of C2+ in injected gas. Fuzzy Logic and Neural Networks have been used widely in prediction and classification, with relatively high accuracy, in different fields of study. It is well known that the Fuzzy Inference system can handle uncertainty within the inputs such as in our case. The results of this work showed that our proposed models perform better with higher performance indices than other emprical correlations.

Keywords: MMP, gas flooding, artificial intelligence, correlation

Procedia PDF Downloads 143

25877 Sea-Spray Calculations Using the MESO-NH Model

Authors: Alix Limoges, William Bruch, Christophe Yohia, Jacques Piazzola

Abstract:

A number of questions arise concerning the long-term impact of the contribution of marine aerosol fluxes generated at the air-sea interface on the occurrence of intense events (storms, floods, etc.) in the coastal environment. To this end, knowledge is needed on sea-spray emission rates and the atmospheric dynamics of the corresponding particles. Our aim is to implement the mesoscale model MESO-NH on the study area using an accurate sea-spray source function to estimate heat fluxes and impact on the precipitations. Based on an original and complete sea-spray source function, which covers a large size spectrum since taking into consideration the sea-spray produced by both bubble bursting and surface tearing process, we propose a comparison between model simulations and experimental data obtained during an oceanic scientific cruise on board the navy ship Atalante. The results show the relevance of the sea-spray flux calculations as well as their impact on the heat fluxes and AOD.

Keywords: atmospheric models, sea-spray source, sea-spray dynamics, aerosols

Procedia PDF Downloads 147

25876 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated the vast potential in streamlining, deciding, spotting business drifts in different fields, for example, producing, fund, Information Technology. This paper gives a multi-disciplinary diagram of the research issues in enormous information and its procedures, instruments, and system identified with the privacy, data storage management, network and energy utilization, adaptation to non-critical failure and information representations. Other than this, result difficulties and openings accessible in this Big Data platform have made.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 311

25875 A Hybrid Recommendation System Based on Association Rules

Authors: Ahmed Mohammed Alsalama

Abstract:

Recommendation systems are widely used in e-commerce applications. The engine of a current recommendation system recommends items to a particular user based on user preferences and previous high ratings. Various recommendation schemes such as collaborative filtering and content-based approaches are used to build a recommendation system. Most of the current recommendation systems were developed to fit a certain domain such as books, articles, and movies. We propose a hybrid framework recommendation system to be applied on two-dimensional spaces (User x Item) with a large number of Users and a small number of Items. Moreover, our proposed framework makes use of both favorite and non-favorite items of a particular user. The proposed framework is built upon the integration of association rules mining and the content-based approach. The results of experiments show that our proposed framework can provide accurate recommendations to users.

Keywords: data mining, association rules, recommendation systems, hybrid systems

Procedia PDF Downloads 451

25874 Survey on Big Data Stream Classification by Decision Tree

Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi

Abstract:

Nowadays, the development of computers technology and its recent applications provide access to new types of data, which have not been considered by the traditional data analysts. Two particularly interesting characteristics of such data sets include their huge size and streaming nature .Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey on the obstacles and the requirements issues classifying data streams with using decision tree. The most important issue is to maintain a balance between accuracy and efficiency, the algorithm should provide good classification performance with a reasonable time response.

Keywords: big data, data streams, classification, decision tree

Procedia PDF Downloads 520

25873 Robust and Dedicated Hybrid Cloud Approach for Secure Authorized Deduplication

Authors: Aishwarya Shekhar, Himanshu Sharma

Abstract:

Data deduplication is one of important data compression techniques for eliminating duplicate copies of repeating data, and has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. In this process, duplicate data is expunged, leaving only one copy means single instance of the data to be accumulated. Though, indexing of each and every data is still maintained. Data deduplication is an approach for minimizing the part of storage space an organization required to retain its data. In most of the company, the storage systems carry identical copies of numerous pieces of data. Deduplication terminates these additional copies by saving just one copy of the data and exchanging the other copies with pointers that assist back to the primary copy. To ignore this duplication of the data and to preserve the confidentiality in the cloud here we are applying the concept of hybrid nature of cloud. A hybrid cloud is a fusion of minimally one public and private cloud. As a proof of concept, we implement a java code which provides security as well as removes all types of duplicated data from the cloud.

Keywords: confidentiality, deduplication, data compression, hybridity of cloud

Procedia PDF Downloads 381

25872 Reinforced Concrete Bridge Deck Condition Assessment Methods Using Ground Penetrating Radar and Infrared Thermography

Authors: Nicole M. Martino

Abstract:

Reinforced concrete bridge deck condition assessments primarily use visual inspection methods, where an inspector looks for and records locations of cracks, potholes, efflorescence and other signs of probable deterioration. Sounding is another technique used to diagnose the condition of a bridge deck, however this method listens for damage within the subsurface as the surface is struck with a hammer or chain. Even though extensive procedures are in place for using these inspection techniques, neither one provides the inspector with a comprehensive understanding of the internal condition of a bridge deck – the location where damage originates from. In order to make accurate estimates of repair locations and quantities, in addition to allocating the necessary funding, a total understanding of the deck’s deteriorated state is key. The research presented in this paper collected infrared thermography and ground penetrating radar data from reinforced concrete bridge decks without an asphalt overlay. These decks were of various ages and their condition varied from brand new, to in need of replacement. The goals of this work were to first verify that these nondestructive evaluation methods could identify similar areas of healthy and damaged concrete, and then to see if combining the results of both methods would provide a higher confidence than if the condition assessment was completed using only one method. The results from each method were presented as plan view color contour plots. The results from one of the decks assessed as a part of this research, including these plan view plots, are presented in this paper. Furthermore, in order to answer the interest of transportation agencies throughout the United States, this research developed a step-by-step guide which demonstrates how to collect and assess a bridge deck using these nondestructive evaluation methods. This guide addresses setup procedures on the deck during the day of data collection, system setups and settings for different bridge decks, data post-processing for each method, and data visualization and quantification.

Keywords: bridge deck deterioration, ground penetrating radar, infrared thermography, NDT of bridge decks

Procedia PDF Downloads 153

25871 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in all engineering and science and many other domains. The potential of large or massive data is undoubtedly significant, make sense to require new ways of thinking and learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. In this paper, the latest advances and advancements in the researches on machine learning for big data processing. First, the machine learning techniques methods in recent studies, such as deep learning, representation learning, transfer learning, active learning and distributed and parallel learning. Then focus on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 443

25870 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangement of data management and data management in Indonesia. This paper is a descriptive research with qualitative approach and collecting data from literature study. Results of this paper are comprehensive arrangement of data that have been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and protection of personal data are mutually reinforcing arrangements in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption

Procedia PDF Downloads 181

25869 Heterogeneity of Soil Moisture and Its Impacts on the Mountainous Watershed Hydrology in Northwest China

Authors: Chansheng He, Zhongfu Wang, Xiao Bai, Jie Tian, Xin Jin

Abstract:

Heterogeneity of soil hydraulic properties directly affects hydrological processes at different scales. Understanding heterogeneity of soil hydraulic properties such as soil moisture is therefore essential for modeling watershed ecohydrological processes, particularly in hard to access, topographically complex mountainous watersheds. This study maps spatial variations of soil moisture by in situ observation network that consists of sampling points, zones, and tributaries, and monitors corresponding hydrological variables of air and soil temperatures, evapotranspiration, infiltration, and runoff in the Upper Reach of the Heihe River Watershed, a second largest inland river (terminal lake) with a drainage area of over 128,000 km² in Northwest China. Subsequently, the study uses a hydrological model, SWAT (Soil and Water Assessment Tool) to simulate the effects of heterogeneity of soil moisture on watershed hydrological processes. The spatial clustering method, Full-Order-CLK was employed to derive five soil heterogeneous zones (Configuration 97, 80, 65, 40, and 20) for soil input to SWAT. Results show the simulations by the SWAT model with the spatially clustered soil hydraulic information from the field sampling data had much better representation of the soil heterogeneity and more accurate performance than the model using the average soil property values for each soil type derived from the coarse soil datasets. Thus, incorporating detailed field sampling soil heterogeneity data greatly improves performance in hydrologic modeling.

Keywords: heterogeneity, soil moisture, SWAT, up-scaling

Procedia PDF Downloads 345

25868 Development of Stability Indicating Method and Characterization of Degradation Impurity of Nirmaltrelvir in Its Self-Emulsifying Drug Delivery System

Authors: Ravi Patel, Ravisinh Solanki, Dignesh Khunt

Abstract:

A stability-indicating reverse phase high performance liquid chromatography (RP-HPLC) method was developed and validated for estimating Nirmatrelvir in its self-emulsifying drug delivery system (SEDDS). The separation of Nirmatrelvir and its degradation products was accomplished by employing an Agilent Zorbax Eclipse plus C18 (250 mm x 4.6 mm, 5 µm) column, through which the mobile phase 5 mM phosphate buffer (pH 4.0) as mobile phase A and Acetonitrile as mobile phase B in a ratio of (40:60 % v/v) was pumped at a flow rate of 1.0 mL/min, through the HPLC system. Chromatographic separation and elution were monitored by a photo-diode array detector at 210 nm. Stress studies have been employed to evaluate this method's ability to indicate stability. Nirmatrelvir was exposed to several stress conditions, such as acid, alkali, oxidative, photolytic, and thermal degradations. Significant degradation was observed during acid and alkali hydrolysis, and the resulting degradation product was successfully separated from the Nirmatrelvir peak, preventing any interference. Furthermore, the primary degradant produced under alkali degradation conditions was identified using UPLC-ESI-TQ-MS/MS. The method was validated in accordance with the International Council on Harmonization (ICH) and found to be selective, precise, accurate, linear, and robust. The apparent permeability of Nirmatrelvir SEDDS was 4.20 ± 0.21×10-6 cm/sec, and the average proportion of free drug recovered was 0.5%. The method developed in this study was feasible and accurate for routine quality control evaluation of Nirmatrelvir SEDDS.

Keywords: Nirmatrelvir, SEDDS, degradation study, HPLC, LC-MS/MS

Procedia PDF Downloads 15

25867 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 136

25866 Accuracy of Small Field of View CBCT in Determining Endodontic Working Length

Authors: N. L. S. Ahmad, Y. L. Thong, P. Nambiar

Abstract:

An in vitro study was carried out to evaluate the feasibility of small field of view (FOV) cone beam computed tomography (CBCT) in determining endodontic working length. The objectives were to determine the accuracy of CBCT in measuring the estimated preoperative working lengths (EPWL), endodontic working lengths (EWL) and file lengths. Access cavities were prepared in 27 molars. For each root canal, the baseline electronic working length was determined using an EAL (Raypex 5). The teeth were then divided into overextended, non-modified and underextended groups and the lengths were adjusted accordingly. Imaging and measurements were made using the respective software of the RVG (Kodak RVG 6100) and CBCT units (Kodak 9000 3D). Root apices were then shaved and the apical constrictions viewed under magnification to measure the control working lengths. The paired t-test showed a statistically significant difference between CBCT EPWL and control length but the difference was too small to be clinically significant. From the Bland Altman analysis, the CBCT method had the widest range of 95% limits of agreement, reflecting its greater potential of error. In measuring file lengths, RVG had a bigger window of 95% limits of agreement compared to CBCT. Conclusions: (1) The clinically insignificant underestimation of the preoperative working length using small FOV CBCT showed that it is acceptable for use in the estimation of preoperative working length. (2) Small FOV CBCT may be used in working length determination but it is not as accurate as the currently practiced method of using the EAL. (3) It is also more accurate than RVG in measuring file lengths.

Keywords: accuracy, CBCT, endodontics, measurement

Procedia PDF Downloads 307

25865 Analysis on Prediction Models of TBM Performance and Selection of Optimal Input Parameters

Authors: Hang Lo Lee, Ki Il Song, Hee Hwan Ryu

Abstract:

An accurate prediction of TBM(Tunnel Boring Machine) performance is very difficult for reliable estimation of the construction period and cost in preconstruction stage. For this purpose, the aim of this study is to analyze the evaluation process of various prediction models published since 2000 for TBM performance, and to select the optimal input parameters for the prediction model. A classification system of TBM performance prediction model and applied methodology are proposed in this research. Input and output parameters applied for prediction models are also represented. Based on these results, a statistical analysis is performed using the collected data from shield TBM tunnel in South Korea. By performing a simple regression and residual analysis utilizinFg statistical program, R, the optimal input parameters are selected. These results are expected to be used for development of prediction model of TBM performance.

Keywords: TBM performance prediction model, classification system, simple regression analysis, residual analysis, optimal input parameters

Procedia PDF Downloads 307

25864 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)

Procedia PDF Downloads 235

25863 Discriminating Between Energy Drinks and Sports Drinks Based on Their Chemical Properties Using Chemometric Methods

Authors: Robert Cazar, Nathaly Maza

Abstract:

Energy drinks and sports drinks are quite popular among young adults and teenagers worldwide. Some concerns regarding their health effects – particularly those of the energy drinks - have been raised based on scientific findings. Differentiating between these two types of drinks by means of their chemical properties seems to be an instructive task. Chemometrics provides the most appropriate strategy to do so. In this study, a discrimination analysis of the energy and sports drinks has been carried out applying chemometric methods. A set of eleven samples of available commercial brands of drinks – seven energy drinks and four sports drinks – were collected. Each sample was characterized by eight chemical variables (carbohydrates, energy, sugar, sodium, pH, degrees Brix, density, and citric acid). The data set was standardized and examined by exploratory chemometric techniques such as clustering and principal component analysis. As a preliminary step, a variable selection was carried out by inspecting the variable correlation matrix. It was detected that some variables are redundant, so they can be safely removed, leaving only five variables that are sufficient for this analysis. They are sugar, sodium, pH, density, and citric acid. Then, a hierarchical clustering `employing the average – linkage criterion and using the Euclidian distance metrics was performed. It perfectly separates the two types of drinks since the resultant dendogram, cut at the 25% similarity level, assorts the samples in two well defined groups, one of them containing the energy drinks and the other one the sports drinks. Further assurance of the complete discrimination is provided by the principal component analysis. The projection of the data set on the first two principal components – which retain the 71% of the data information – permits to visualize the distribution of the samples in the two groups identified in the clustering stage. Since the first principal component is the discriminating one, the inspection of its loadings consents to characterize such groups. The energy drinks group possesses medium to high values of density, citric acid, and sugar. The sports drinks group, on the other hand, exhibits low values of those variables. In conclusion, the application of chemometric methods on a data set that features some chemical properties of a number of energy and sports drinks provides an accurate, dependable way to discriminate between these two types of beverages.

Keywords: chemometrics, clustering, energy drinks, principal component analysis, sports drinks

Procedia PDF Downloads 105