Search results for: large amounts of data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29706

29436 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis when the amount of data is large. In fact, because of its volume, its nature (semi-structured or unstructured) and its variety, big data cannot be analyzed efficiently via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of computation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes to this algorithm are presented, and then a version of the extended algorithm is defined in order to make it applicable to a huge quantity of data.
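
One way to read the parallelization point above: the impurity of a candidate CART split depends only on class counts, and class counts can be computed separately on each data partition and then summed. The following is a minimal sketch under that assumption, not the extended algorithm the paper defines; the function names and the toy partitions are hypothetical.

```python
from collections import Counter

def partition_counts(rows, feature, threshold):
    """Map step: class counts on each side of the split, for one partition."""
    left, right = Counter(), Counter()
    for features, label in rows:
        side = left if features[feature] <= threshold else right
        side[label] += 1
    return left, right

def gini(counts):
    """Gini impurity of a class-count table (0.0 for an empty table)."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(partitions, feature, threshold):
    """Reduce step: sum the per-partition counts, then score the split."""
    left, right = Counter(), Counter()
    for rows in partitions:
        part_left, part_right = partition_counts(rows, feature, threshold)
        left.update(part_left)    # Counter.update adds counts
        right.update(part_right)
    n_left, n_right = sum(left.values()), sum(right.values())
    return (n_left * gini(left) + n_right * gini(right)) / (n_left + n_right)

# Hypothetical toy data, already spread over two partitions.
partitions = [
    [({"x": 1.0}, "a"), ({"x": 2.0}, "a")],
    [({"x": 3.0}, "b"), ({"x": 4.0}, "b")],
]
best = min([1.0, 2.0, 3.0], key=lambda t: split_impurity(partitions, "x", t))
```

Only the small count tables travel between workers in this scheme, so the raw rows never need to be gathered in one place.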

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 131
29435 Non-Linear Regression Modeling for Composite Distributions

Authors: Mostafa Aminzadeh, Min Deng

Abstract:

Modeling loss data is an important part of actuarial science. Actuaries use models to predict future losses and manage financial risk, which can be beneficial for marketing purposes. In the insurance industry, small claims happen frequently while large claims are rare. Traditional distributions such as Normal, Exponential, and inverse-Gaussian are not suitable for describing insurance data, which often show skewness and fat tails. Several authors have studied classical and Bayesian inference for parameters of composite distributions, such as Exponential-Pareto, Weibull-Pareto, and Inverse Gamma-Pareto. These models separate small to moderate losses from large losses using a threshold parameter. This research introduces a computational approach using a nonlinear regression model for loss data that relies on multiple predictors. Simulation studies were conducted to assess the accuracy of the proposed estimation method. The simulations confirmed that the proposed method provides precise estimates for regression parameters. It's important to note that this approach can be applied to datasets if goodness-of-fit tests confirm that the composite distribution under study fits the data well. To demonstrate the computations, a real data set from the insurance industry is analyzed. A Mathematica code uses the Fisher information algorithm as an iteration method to obtain the maximum likelihood estimation (MLE) of regression parameters.
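
To illustrate the kind of iteration the abstract refers to, here is a minimal Fisher scoring sketch for a much simpler model than the composite distributions studied in the paper: exponentially distributed losses with mean exp(b0 + b1·x). The model, the function name, and the synthetic data are assumptions made for illustration only. For this model the Fisher information is the constant matrix XᵀX, so each step solves a 2×2 system.

```python
import math

def fisher_scoring(xs, ys, iterations=25):
    """Fit y ~ Exponential(mean = exp(b0 + b1 * x)) by Fisher scoring."""
    n = len(xs)
    b0, b1 = math.log(sum(ys) / n), 0.0   # start from the intercept-only fit
    # Fisher information for this model: [[n, sx], [sx, sxx]] (constant).
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx
    for _ in range(iterations):
        # Score components use the residual-like term y / mu - 1.
        r = [y / math.exp(b0 + b1 * x) - 1.0 for x, y in zip(xs, ys)]
        u0 = sum(r)
        u1 = sum(x * ri for x, ri in zip(xs, r))
        # Update: beta <- beta + I^{-1} * score, with the 2x2 inverse written out.
        b0 += (sxx * u0 - sx * u1) / det
        b1 += (n * u1 - sx * u0) / det
    return b0, b1

# Synthetic, noise-free data so that the MLE equals the generating values.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [math.exp(0.5 + 0.2 * x) for x in xs]
b0_hat, b1_hat = fisher_scoring(xs, ys)
```

With real loss data the residual terms would not vanish, and a convergence check on the step size would normally replace the fixed iteration count.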

Keywords: maximum likelihood estimation, fisher scoring method, non-linear regression models, composite distributions

Procedia PDF Downloads 4
29434 Multidimensional Item Response Theory Models for Practical Application in Large Tests Designed to Measure Multiple Constructs

Authors: Maria Fernanda Ordoñez Martinez, Alvaro Mauricio Montenegro

Abstract:

This work presents a statistical methodology for measuring and finding constructs in Latent Semantic Analysis. This approach uses the qualities of Factor Analysis in binary data with interpretations from Item Response Theory. More precisely, we propose initially reducing dimensionality with specific use of Principal Component Analysis for the linguistic data and then producing axes of groups made from a clustering analysis of the semantic data. This approach allows the user to give meaning to previous clusters and find the real latent structure presented by the data. The methodology is applied to a set of real semantic data, presenting impressive results for coherence, speed and precision.

Keywords: semantic analysis, factorial analysis, dimension reduction, penalized logistic regression

Procedia PDF Downloads 426
29433 Keynote Talk: The Role of Internet of Things in the Smart Cities Power System

Authors: Abdul-Rahman Al-Ali

Abstract:

As the number of mobile devices grows exponentially, an estimated 50 billion devices are expected to be connected to the Internet by the year 2020. At the end of this decade, an average of eight connected devices per person worldwide is expected. The 50 billion devices are not only mobile phones and data browsing gadgets, but also machine-to-machine and man-to-machine devices. With such growing numbers of devices, the Internet of Things (I.o.T) concept has recently become one of the emerging technologies. Within the smart grid technologies, smart home appliances, Intelligent Electronic Devices (IED) and Distributed Energy Resources (DER) are major I.o.T objects that can be addressed using IPv6. These objects are called the smart grid internet of things (SG-I.o.T). The SG-I.o.T generates big data that requires high-speed computing infrastructure, widespread computer networks, big data storage, software, and platform services. A utility company's control and data centers cannot handle such a large number of devices, high-speed processing, and massive data storage. Building a large data center infrastructure takes a long time; it also requires widespread communication networks and huge capital investment. Maintaining and upgrading the control and data centers' infrastructure and communication networks, as well as updating and renewing software licenses, collectively require additional cost. This can be overcome by utilizing emerging computing paradigms such as cloud computing, which can be used as a smart grid enabler to replace the utilities' legacy data centers. The talk will highlight the role of I.o.T, cloud computing services and their development models within the smart grid technologies.

Keywords: intelligent electronic devices (IED), distributed energy resources (DER), internet, smart home appliances

Procedia PDF Downloads 309
29432 Degradation of EE2 by Different Consortium of Enriched Nitrifying Activated Sludge

Authors: Pantip Kayee

Abstract:

17α-ethinylestradiol (EE2) is a recalcitrant micropollutant which is found in small amounts in municipal wastewater, but these small amounts still adversely affect the reproductive function of aquatic organisms. Past evidence suggested that full-scale WWTPs equipped with a nitrification process enhanced the removal of EE2 from municipal wastewater. EE2 has been shown to be transformed by ammonia oxidizing bacteria (AOB) via co-metabolism. This research aims to clarify the EE2 degradation pattern by different consortia of ammonia oxidizing microorganisms (AOM), including ammonia oxidizing archaea (AOA), and to investigate the contribution of the existing ammonia monooxygenase (AMO) and newly synthesized AMO. The results showed that AOA or AOB of the N. oligotropha cluster in enriched nitrifying activated sludge (NAS) from 2 mM and 5 mM, commonly found in municipal WWTPs, could degrade EE2 in wastewater via co-metabolism. Moreover, the investigation of the contribution of the existing AMO and newly synthesized AMO demonstrated that the newly synthesized AMO enzyme may perform ammonia oxidation rather than the existing AMO enzyme, or that the existing AMO enzyme may have only a small capacity to oxidize ammonia.

Keywords: 17α-ethinylestradiol, nitrification, ammonia oxidizing bacteria, ammonia oxidizing archaea

Procedia PDF Downloads 276
29431 Determination of the Bank's Customer Risk Profile: Data Mining Applications

Authors: Taner Ersoz, Filiz Ersoz, Seyma Ozbilge

Abstract:

In this study, the clients who applied to a bank branch for a loan were analyzed through data mining. The study data comprised information such as the amounts of loans received by personal and SME clients working with the bank branch, installment numbers, number of delays in loan installments, payments available in other banks and number of banks to which they were in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers were identified on the decision tree. These customer types were classified by rating those posing a risk to the bank branch, and the customers were classified according to the risk ratings.

Keywords: client classification, loan suitability, risk rating, CART analysis

Procedia PDF Downloads 326
29430 Conductivity-Depth Inversion of Large Loop Transient Electromagnetic Sounding Data over Layered Earth Models

Authors: Ravi Ande, Mousumi Hazari

Abstract:

One of the common geophysical techniques for mapping subsurface geo-electrical structures, extensive hydro-geological research, and engineering and environmental geophysics applications is the use of time domain electromagnetic (TDEM)/transient electromagnetic (TEM) soundings. A large transmitter loop for energising the ground and a small receiver loop or magnetometer for recording the transient voltage or magnetic field in the air or on the surface of the earth, with the receiver at the center of the loop or at any random point inside or outside the source loop, make up a large loop TEM system. In general, one can acquire data using one of the configurations with a large loop source, namely, with the receiver at the center point of the loop (central loop method), at an arbitrary in-loop point (in-loop method), coincident with the transmitter loop (coincidence-loop method), and at an arbitrary offset loop point (offset-loop method), respectively. Because of the mathematical simplicity associated with the expressions of EM fields, as compared to the in-loop and offset-loop systems, the central loop system (for ground surveys) and coincident loop system (for ground as well as airborne surveys) have been developed and used extensively for the exploration of mineral and geothermal resources, for mapping contaminated groundwater caused by hazardous waste and thickness of permafrost layer. Because a proper analytical expression for the TEM response over the layered earth model for the large loop TEM system does not exist, the forward problem used in this inversion scheme is first formulated in the frequency domain and then it is transformed in the time domain using Fourier cosine or sine transforms. Using the EMLCLLER algorithm, the forward computation is initially carried out in the frequency domain. 
As a result, the EMLCLLER modified the forward calculation scheme in NLSTCI to compute frequency domain responses before converting them to the time domain using Fourier cosine and/or sine transforms.

Keywords: time domain electromagnetic (TDEM), TEM system, geoelectrical sounding structure, Fourier cosine

Procedia PDF Downloads 82
29429 Social Data Aggregator and Locator of Knowledge (STALK)

Authors: Rashmi Raghunandan, Sanjana Shankar, Rakshitha K. Bhat

Abstract:

Social media contributes a vast amount of data and information about individuals to the internet. This project will greatly reduce the need for unnecessary manual analysis of large and diverse social media profiles by filtering out and combining the useful information from various social media profiles, eliminating irrelevant data. It differs from the existing social media aggregators in that it does not provide a consolidated view of various profiles. Instead, it provides consolidated INFORMATION derived from the subject’s posts and other activities. It also allows analysis over multiple profiles and analytics based on several profiles. We strive to provide a query system to provide a natural language answer to questions when a user does not wish to go through the entire profile. The information provided can be filtered according to the different use cases it is used for.

Keywords: social network, analysis, Facebook, Linkedin, git, big data

Procedia PDF Downloads 428
29428 Experimental Evaluation of Succinct Ternary Tree

Authors: Dmitriy Kuptsov

Abstract:

Tree data structures, such as binary or in general k-ary trees, are essential in computer science. The applications of these data structures range from data search and retrieval to sorting and ranking algorithms. Naive implementations of these data structures can consume prohibitively large volumes of random access memory, limiting their applicability in certain solutions. Thus, in these cases, a more advanced representation of these data structures is essential. In this paper we present the design of a compact version of the ternary tree data structure and demonstrate the results of an experimental evaluation using the static dictionary problem. We compare these results with the results for binary and regular ternary trees. The evaluation shows that our design, in the best case, consumes up to 12 times less memory (for the dictionary used in our experimental evaluation) than a regular ternary tree and in certain configurations shows performance comparable to regular ternary trees. We have evaluated the performance of the algorithms using both 32- and 64-bit operating systems.

Keywords: algorithms, data structures, succinct ternary tree, performance evaluation
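
For orientation, the regular (pointer-based) baseline in such a comparison can be sketched as a ternary search tree used as a static dictionary. This assumes that "ternary tree" here means a ternary search tree; it is not the succinct design the paper presents, and all names are hypothetical. Each node carries three child pointers, which is the per-node overhead a compact representation aims to remove.

```python
class Node:
    """One character plus three child links: lower, equal (next char), higher."""
    __slots__ = ("ch", "lo", "eq", "hi", "is_end")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_end = False

def insert(node, word, i=0):
    """Insert word[i:] below node and return the (possibly new) subtree root."""
    ch = word[i]
    if node is None:
        node = Node(ch)
    if ch < node.ch:
        node.lo = insert(node.lo, word, i)
    elif ch > node.ch:
        node.hi = insert(node.hi, word, i)
    elif i + 1 < len(word):
        node.eq = insert(node.eq, word, i + 1)
    else:
        node.is_end = True
    return node

def contains(node, word):
    """Return True if word was inserted as a complete key."""
    i = 0
    while node is not None and word:
        ch = word[i]
        if ch < node.ch:
            node = node.lo
        elif ch > node.ch:
            node = node.hi
        elif i + 1 < len(word):
            node = node.eq
            i += 1
        else:
            return node.is_end
    return False

def build(words):
    """Build a static dictionary from an iterable of non-empty strings."""
    root = None
    for w in words:
        if w:
            root = insert(root, w)
    return root

root = build(["cat", "car", "dog"])
```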

Procedia PDF Downloads 152
29427 Study of Management of Waste Construction Materials in Civil Engineering Projects

Authors: Jalindar R. Patil, Harish P. Gayakwad

Abstract:

The increased economic growth across the globe as well as urbanization in developing countries have led to extensive construction activities that generate large amounts of waste. Material wastage in construction projects results in huge financial setbacks for builders and contractors. In addition, it may also have significant effects on aesthetics, health, and the general environment. However, in many cities across the globe, construction waste material management is still a problem. This paper discusses methods for the management of waste construction materials. The objectives of this seminar are to identify the significant sources of construction waste globally, to improve the performance of construction waste management by identifying its major barriers, and to determine the cost impact on the construction project. These wastes need to be managed, and their impacts need to be ascertained, to pave the way for their proper management. The seminar includes the details of construction waste management with reference to a construction project. The application of construction waste management in civil engineering projects is described in terms of the reduction in construction wastes.

Keywords: civil engineering, construction materials, waste management, construction activities

Procedia PDF Downloads 508
29426 Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System

Authors: Karima Qayumi, Alex Norta

Abstract:

The rapid generation of a high volume and a broad variety of data from the application of new technologies poses challenges for the generation of business intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods for the purposes of developing their business. Therefore, the recently decentralized data management environment relies on a distributed computing paradigm. While data are stored in highly distributed systems, the implementation of distributed data-mining techniques is a challenge. The aim of this technique is to gather knowledge from every domain and all the datasets stemming from distributed resources. As agent technologies offer significant contributions for managing the complexity of distributed systems, we consider this for next-generation data-mining processes. To demonstrate agent-based business intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.

Keywords: agent-oriented modeling (AOM), business intelligence model (BIM), distributed data mining (DDM), multi-agent system (MAS)

Procedia PDF Downloads 418
29425 User Intention Generation with Large Language Models Using Chain-of-Thought Prompting

Authors: Gangmin Li, Fan Yang

Abstract:

Personalized recommendation is crucial for any recommendation system. One of the techniques for personalized recommendation is to identify the user's intention. Traditional user intention identification uses the user’s selection when facing multiple items. This modeling relies primarily on historical behaviour data, resulting in challenges such as the cold start, unintended choice, and failure to capture intention when items are new. Motivated by recent advancements in Large Language Models (LLMs) like ChatGPT, we present an approach for user intention identification by embracing LLMs with Chain-of-Thought (CoT) prompting. We use the initial user profile as input to LLMs and design a collection of prompts to align the LLM's response through various recommendation tasks encompassing rating prediction, search and browse history, user clarification, etc. Our tests on real-world datasets demonstrate the improvements in recommendation achieved by explicit user intention identification, with that intention merged into a user model.

Keywords: personalized recommendation, generative user modelling, user intention identification, large language models, chain-of-thought prompting

Procedia PDF Downloads 34
29424 Landslide Susceptibility Analysis in the St. Lawrence Lowlands Using High Resolution Data and Failure Plane Analysis

Authors: Kevin Potoczny, Katsuichiro Goda

Abstract:

The St. Lawrence lowlands extend from Ottawa to Quebec City and are known for large deposits of sensitive Leda clay. Leda clay deposits are responsible for many large landslides, such as the 1993 Lemieux and 2010 St. Jude (4 fatalities) landslides. Due to the large extent and sensitivity of Leda clay, regional hazard analysis for landslides is an important tool in risk management. A 2018 regional study by Farzam et al. on the susceptibility of Leda clay slopes to landslide hazard uses 1 arc second topographical data. A qualitative method known as Hazus is used to estimate susceptibility by checking for various criteria in a location and determining a susceptibility rating on a scale of 0 (no susceptibility) to 10 (very high susceptibility). These criteria are slope angle, geological group, soil wetness, and distance from waterbodies. Given the flat nature of the St. Lawrence lowlands, the current assessment fails to capture local slopes, such as the St. Jude site. Additionally, the data did not allow one to analyze failure planes accurately. This study substantially improves the analysis performed by Farzam et al. in two aspects. First, regional assessment with high resolution data allows for identification of local sites that may previously have been rated as low susceptibility. This then provides the opportunity to conduct a more refined analysis on the failure plane of the slope. Slopes derived from 1 arc second data are relatively gentle (0-10 degrees) across the region; however, the 1- and 2-meter resolution 2022 HRDEM provided by NRCAN shows that short, steep slopes are present. At a regional level, 1 arc second data can underestimate the susceptibility of short, steep slopes, which can be dangerous as Leda clay landslides behave retrogressively and travel upwards into flatter terrain. At the location of the St. Jude landslide, slope differences are significant.
1 arc second data shows a maximum slope of 12.80 degrees and a mean slope of 4.72 degrees, while the HRDEM data shows a maximum slope of 56.67 degrees and a mean slope of 10.72 degrees. This equates to a difference of three susceptibility levels when the soil is dry and one susceptibility level when wet. GIS software is used to create a regional susceptibility map across the St. Lawrence lowlands at 1- and 2-meter resolutions. Failure planes are necessary to differentiate between small and large landslides, which have so far been ignored in regional analysis. Leda clay failures can only retrogress as far as their failure planes, so the regional analysis must be able to transition smoothly into a more robust local analysis. It is expected that slopes within the region previously assessed at low susceptibility scores contain local areas of high susceptibility. The goal is to create opportunities for local failure plane analysis to be undertaken, which has not been possible before. Due to the low resolution of previous regional analyses, any slope near a waterbody could be considered hazardous. However, high-resolution regional analysis would allow for more precise determination of hazard sites.
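
The resolution effect described above can be reproduced on a synthetic profile. The sketch below is an illustration only: the terrain (a 10 m high bank rising over 10 m of horizontal distance on otherwise flat ground) is invented, the 30 m spacing is a rough stand-in for 1 arc second data, and none of the values come from the study's DEMs.

```python
import math

def elevation(x):
    """Hypothetical profile: flat, then a 45-degree bank from x=100 to x=110 m."""
    if x < 100.0:
        return 0.0
    if x <= 110.0:
        return x - 100.0
    return 10.0

def max_slope_deg(spacing, length=300.0):
    """Steepest slope (degrees) between neighbouring samples at a given spacing."""
    n = int(length // spacing)
    xs = [k * spacing for k in range(n + 1)]
    steepest = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        rise = abs(elevation(x1) - elevation(x0))
        steepest = max(steepest, math.degrees(math.atan2(rise, x1 - x0)))
    return steepest

fine = max_slope_deg(1.0)     # 1 m sampling resolves the bank itself
coarse = max_slope_deg(30.0)  # ~30 m sampling spreads the rise over one cell
```

At the coarse spacing the whole 10 m rise is averaged over a 30 m cell, so the bank appears far gentler than it is, which is the underestimation the abstract attributes to 1 arc second data.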

Keywords: hazus, high-resolution DEM, leda clay, regional analysis, susceptibility

Procedia PDF Downloads 58
29423 Adaption Model for Building Agile Pronunciation Dictionaries Using Phonemic Distance Measurements

Authors: Akella Amarendra Babu, Rama Devi Yellasiri, Natukula Sainath

Abstract:

Whereas human beings can easily learn and adopt pronunciation variations, machines need training before being put into use. Also, humans keep a minimal vocabulary, with pronunciation variations stored in the front-end of their memory for ready reference, while machines keep the entire pronunciation dictionary for ready reference. Supervised methods are used for the preparation of pronunciation dictionaries; these take large amounts of manual effort, cost, and time and are not suitable for real-time use. This paper presents an unsupervised adaptation model for building agile and dynamic pronunciation dictionaries online. These methods mimic the human approach to learning new pronunciations in real time. A new algorithm for measuring sound distances, called Dynamic Phone Warping, is presented and tested. Performance of the system is measured using an adaptation model, and the precision metric is found to be better than 86 percent.
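
The abstract does not specify how Dynamic Phone Warping scores sound distances, so the sketch below substitutes a plain edit-distance alignment over phoneme sequences to show the dynamic-programming idea named in the keywords. The unit costs, the function name, and the example phoneme symbols are all assumptions.

```python
def phoneme_distance(a, b):
    """Edit distance between two phoneme sequences with unit costs.

    This is ordinary Levenshtein alignment, used here as a stand-in; a real
    phonemic measure would likely weight substitutions by acoustic similarity.
    """
    prev = list(range(len(b) + 1))          # distance from empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]                          # distance to empty prefix of b
        for j, pb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if pa == pb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]

# Two hypothetical pronunciations of the same word, differing in one vowel.
variant_1 = ["t", "ah", "m", "ey", "t", "ow"]
variant_2 = ["t", "ah", "m", "aa", "t", "ow"]
distance = phoneme_distance(variant_1, variant_2)
```

A small distance between a newly heard pronunciation and an existing dictionary entry is the kind of signal an adaptation model could use to decide that the two are variants of one word.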

Keywords: pronunciation variations, dynamic programming, machine learning, natural language processing

Procedia PDF Downloads 160
29422 Developing an Intervention Program to Promote Healthy Eating in a Catering System Based on Qualitative Research Results

Authors: O. Katz-Shufan, T. Simon-Tuval, L. Sabag, L. Granek, D. R. Shahar

Abstract:

Meals provided at catering systems are a common source of workers' nutrition and were found to contribute high amounts of calories and fat. Thus, eating catering food daily can lead to overweight and chronic diseases. On the other hand, the institutional dining room may be an ideal environment for implementation of intervention programs that promote healthy eating. This may improve diners' lifestyle and reduce their prevalence of overweight, obesity and chronic diseases. The significance of this study is in developing an intervention program based on the diners’ dietary habits, preferences and their attitudes towards various intervention programs. In addition, a successful catering-based intervention program may have a significant effect simultaneously on a large group of diners, leading to improved nutrition, healthier lifestyle, and disease prevention on a large scale. In order to develop the intervention program, we conducted a qualitative study. We interviewed 13 diners who eat regularly at catering systems, using a semi-structured interview. The interviews were recorded, transcribed and then analyzed by the thematic method, which identifies, analyzes and reports themes within the data. The interviews revealed several major themes, including the diners' expectation to be provided with healthy food choices; their request for nutrition-expert involvement in planning the meals; and the diners' feeling that there is a conflict between the sensory attractiveness of the food and its nutritional quality. In the context of the catering-based intervention programs, the diners prefer scientific and clear messages focusing on labeling healthy dishes only, as opposed to the labeling of unhealthy dishes; they were interested in a nutritional education program to accompany the intervention program.
Based on these findings, we have developed an intervention program that includes changes in the food served, such as replacing several menu items and improving the nutritional quality of some of the recipes, as well as environmental changes, such as changing the location of some food items presented on the buffet, placing positive nutritional labels on healthy dishes and running an ongoing healthy nutrition campaign, all accompanied by a nutrition education program. The intervention program is currently being tested for its impact on health outcomes and its cost-effectiveness.

Keywords: catering system, food services, intervention, nutrition policy, public health, qualitative research

Procedia PDF Downloads 182
29421 Fast Bayesian Inference of Multivariate Block-Nearest Neighbor Gaussian Process (NNGP) Models for Large Data

Authors: Carlos Gonzales, Zaida Quiroz, Marcos Prates

Abstract:

Several spatial variables collected at the same location that share a common spatial distribution can be modeled simultaneously through a multivariate geostatistical model that takes into account the correlation between these variables and the spatial autocorrelation. The main goal of this model is to perform spatial prediction of these variables in the region of study. Here we focus on a geostatistical multivariate formulation that relies on sharing common spatial random effect terms. In particular, the first response variable can be modeled by a mean that incorporates a shared random spatial effect, while the other response variables depend on this shared spatial term, in addition to specific random spatial effects. Each spatial random effect is defined through a Gaussian process with a valid covariance function, but in order to improve the computational efficiency when the data are large, each Gaussian process is approximated by a Gaussian random Markov field (GRMF), specifically by the block nearest neighbor Gaussian process (Block-NNGP). This approach involves dividing the spatial domain into several dependent blocks under certain constraints, where the cross blocks allow capturing the spatial dependence on a large scale, while each individual block captures the spatial dependence on a smaller scale. The multivariate geostatistical model belongs to the class of Latent Gaussian Models; thus, to achieve fast Bayesian inference, the integrated nested Laplace approximation (INLA) method is used. The good performance of the proposed model is shown through simulations and applications for massive data.

Keywords: Block-NNGP, geostatistics, gaussian process, GRMF, INLA, multivariate models.

Procedia PDF Downloads 83
29420 Efficient Principal Components Estimation of Large Factor Models

Authors: Rachida Ouysse

Abstract:

This paper proposes a constrained principal components (CnPC) estimator for efficient estimation of large-dimensional factor models when errors are cross-sectionally correlated and the number of cross-sections (N) may be larger than the number of observations (T). Although the principal components (PC) method is consistent for any path of the panel dimensions, it is inefficient because the errors are treated as homoskedastic and uncorrelated. The new CnPC exploits the assumption of bounded cross-sectional dependence, which defines Chamberlain and Rothschild’s (1983) approximate factor structure, as an explicit constraint and solves a constrained PC problem. The CnPC method is computationally equivalent to the PC method applied to a regularized form of the data covariance matrix. Unlike maximum likelihood type methods, the CnPC method does not require inverting a large covariance matrix and thus is valid for panels with N ≥ T. The paper derives a convergence rate and an asymptotic normality result for the CnPC estimators of the common factors. We provide feasible estimators and show in a simulation study that they are more accurate than the PC estimator, especially for panels with N larger than T, and the generalized PC type estimators, especially for panels with N almost as large as T.

Keywords: high dimensionality, unknown factors, principal components, cross-sectional correlation, shrinkage regression, regularization, pseudo-out-of-sample forecasting

Procedia PDF Downloads 139
29419 Genodata: The Human Genome Variation Using BigData

Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta

Abstract:

Since the accomplishment of the Human Genome Project, there has been an unparalleled escalation in the sequencing of genomic data. This project has been the first major leap in the field of medical research, especially in genomics. This project won accolades by using a concept called Bigdata, which was earlier used extensively to gain value for business. Bigdata makes use of data sets which are generally in the form of files of terabyte, petabyte, or exabyte size; these data sets were traditionally used and managed using Excel sheets and RDBMS. The voluminous data made the process tedious and time consuming, and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and more efficient. This paper focuses on using SPARK, which is gaining momentum with the advancement of BigData technologies. Cloud Storage is an effective medium for storing the large data sets generated by genetic research and the result sets produced by SPARK analysis.

Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop

Procedia PDF Downloads 243
29418 Quantitative Analysis of Nutrient Inflow from River and Groundwater to Imazu Bay in Fukuoka, Japan

Authors: Keisuke Konishi, Yoshinari Hiroshiro, Kento Terashima, Atsushi Tsutsumi

Abstract:

Imazu Bay plays an important role for endangered species such as horseshoe crabs and black-faced spoonbills that stay in the bay for spawning or the passing of winter. However, this bay is semi-enclosed with slow water exchange, which could lead to eutrophication under the condition of excess nutrient inflow to the bay. Therefore, quantification of nutrient inflow is of great importance. Generally, analysis of nutrient inflow to the bays takes into consideration nutrient inflow from only the river, but that from groundwater should not be ignored for more accurate results. The main objective of this study is to estimate the amounts of nutrient inflow from river and groundwater to Imazu Bay by analyzing water budget in Zuibaiji River Basin and loads of T-N, T-P, NO3-N and NH4-N. The water budget computation in the basin is performed using groundwater recharge model and quasi three-dimensional two-phase groundwater flow model, and the multiplication of the measured amount of nutrient inflow with the computed discharge gives the total amount of nutrient inflow to the bay. In addition, in order to evaluate nutrient inflow to the bay, the result is compared with nutrient inflow from geologically similar river basins. The result shows that the discharge is 3.50×10⁷ m³/year from the river and 1.04×10⁷ m³/year from groundwater. The submarine groundwater discharge accounts for approximately 23 % of the total discharge, which is large compared to the other river basins. It is also revealed that the total nutrient inflow is not particularly large. The sum of NO3-N and NH4-N loadings from groundwater is less than 10 % of that from the river because of denitrification in groundwater. The Shin Seibu Sewage Treatment Plant located below the observation points discharges treated water of 15,400 m³/day and plans to increase it. However, the loads of T-N and T-P from the treatment plant are 3.9 mg/L and 0.19 mg/L, so that it does not contribute a lot to eutrophication.
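
The arithmetic behind these figures can be sketched as follows. The discharge values and concentrations are the ones quoted in the abstract; the function names, the 365-day year, and the choice to express loads in kg/year are assumptions for illustration. The conversion relies on 1 mg/L being equal to 1 g/m³.

```python
RIVER_DISCHARGE = 3.50e7        # m^3/year, from the abstract
GROUNDWATER_DISCHARGE = 1.04e7  # m^3/year, from the abstract

def groundwater_share(river, groundwater):
    """Fraction of total discharge that arrives as groundwater."""
    return groundwater / (river + groundwater)

def annual_load_kg(concentration_mg_per_l, discharge_m3_per_year):
    """Load = concentration x discharge.

    1 mg/L equals 1 g/m^3, so the product is in g/year; divide by 1000 for kg.
    """
    return concentration_mg_per_l * discharge_m3_per_year / 1000.0

share = groundwater_share(RIVER_DISCHARGE, GROUNDWATER_DISCHARGE)

# Treatment plant: 15,400 m^3/day, converted with an assumed 365-day year.
plant_discharge = 15_400 * 365
plant_tn_load = annual_load_kg(3.9, plant_discharge)   # T-N at 3.9 mg/L
plant_tp_load = annual_load_kg(0.19, plant_discharge)  # T-P at 0.19 mg/L
```

The computed share reproduces the roughly 23 % groundwater contribution reported in the abstract.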

Keywords: Eutrophication, groundwater recharge model, nutrient inflow, quasi three-dimensional two-phase groundwater flow model, submarine groundwater discharge

Procedia PDF Downloads 444
29417 Large-Scale Electroencephalogram Biometrics through Contrastive Learning

Authors: Mostafa ‘Neo’ Mohsenvand, Mohammad Rasool Izadi, Pattie Maes

Abstract:

EEG-based biometrics (user identification) has been explored on small datasets of no more than 157 subjects. Here we show that the accuracy of modern supervised methods falls rapidly as the number of users increases to a few thousand. Moreover, supervised methods require a large amount of labeled data for training which limits their applications in real-world scenarios where acquiring data for training should not take more than a few minutes. We show that using contrastive learning for pre-training, it is possible to maintain high accuracy on a dataset of 2130 subjects while only using a fraction of labels. We compare 5 different self-supervised tasks for pre-training of the encoder where our proposed method achieves the accuracy of 96.4%, improving the baseline supervised models by 22.75% and the competing self-supervised model by 3.93%. We also study the effects of the length of the signal and the number of channels on the accuracy of the user-identification models. Our results reveal that signals from temporal and frontal channels contain more identifying features compared to other channels.
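
As a rough illustration of contrastive pre-training in general, the sketch below computes an InfoNCE-style loss over paired embeddings in plain Python. The abstract does not say which loss or encoder the authors use, so the function name, the temperature value, and the pairing scheme (two views of the same recording as a positive pair) are assumptions made for illustration only.

```python
import math

def _normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss for a batch of paired embeddings.

    anchors[i] and positives[i] are embeddings of two views of the same
    sample; every other positive in the batch serves as a negative.
    """
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    losses = []
    for i, a_i in enumerate(a):
        # Similarity of anchor i to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching pair (index i) as the target.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy embeddings: correctly paired views give a much lower loss than shuffled ones.
views = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
matched = info_nce_loss(views, views)
mismatched = info_nce_loss(views, views[1:] + views[:1])
print(matched < mismatched)  # prints True
```

Minimising such a loss pulls embeddings of the same sample together and pushes different samples apart, which is why the pre-trained encoder needs few labels afterwards.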

Keywords: brainprint, contrastive learning, electroencephalogram, self-supervised learning, user identification

Procedia PDF Downloads 146
29416 Automated Testing to Detect Instance Data Loss in Android Applications

Authors: Anusha Konduru, Zhiyong Shan, Preethi Santhanam, Vinod Namboodiri, Rajiv Bagai

Abstract:

The number of mobile applications is growing rapidly, each addressing the requirements of many users. However, rapid development and enhancement cycles introduce many underlying defects. Android apps create and handle a large variety of 'instance' data that has to persist across runs, such as the current navigation route, workout results, antivirus settings, or game state. Due to the nature of Android, an app can be paused, sent into the background, or killed at any time. If the instance data is not saved and restored between runs, in addition to data loss, partially saved or corrupted data can crash the app upon resume or restart. However, it is difficult for the programmer to test this manually for all activities. The result is data loss: data entered by the user is not saved when an interruption occurs. This degrades the user experience because the user must re-enter the information after every interruption. Automated testing to detect such data loss is therefore important for improving the user experience. This research proposes a tool, DroidDL, a data loss detector for Android, which detects instance data loss in a given Android application. We tested 395 applications and found 12 with data loss issues. The approach proved highly accurate and reliable at finding apps with this defect, and Android developers can use it to avoid such errors.

Keywords: Android, automated testing, activity, data loss

Procedia PDF Downloads 224
29415 Managing Construction and Demolition Wastes - A Case Study of Multi Triagem, Lda

Authors: Cláudia Moço, Maria Santos, Carlos Arsénio, Débora Mendes, Miguel Oliveira, José Paulo Da Silva

Abstract:

The construction industry generates large amounts of waste all over the world. About 450 million tons of construction and demolition wastes (C&DW) are produced annually in the European Union. C&DW are highly heterogeneous materials in size and composition, which makes them very difficult to manage. Directive n.º 2008/98/CE, of the European Parliament and of the Council of 6 November establishes that 70 % of the C&DW have to be recycled by 2020. To evaluate possible applications of these materials, a detailed physical, chemical and environmental characterization is necessary. Multi Triagem, Lda. is a company located in Algarve (Portugal) and was supported by the European Regional Development Fund (grant QREN 30307 Multivalor) to quantify and characterize the received C&DW, in order to evaluate their possible applications. This evaluation, performed in collaboration with the University of Algarve, involves a detailed physical, chemical and environmental characterization of the received C&DW. In this work we report on the amounts, trial procedures and properties of the C&DW received over a period of fifteen months. In this period the company received C&DW coming from 393 different origins. The total amount was 32.458 tons, mostly mixtures containing concrete, masonry/mortar and soil/rock. Most of the C&DW came from demolition and excavation works. The organic/inert components, namely metal, glass, wood and plastics, were screened first and account for about 3 % of the received materials. The remaining materials were screened and grouped according to their origin and contents, the latter evaluated by visual inspection. Twenty-five samples were prepared and submitted to a detailed physical, chemical and environmental analysis. The C&DW aggregates show lower quality properties than natural aggregates for concrete preparation and unbound layers of road pavements. However, chemical analyses indicated that most samples are environmentally safe. Continuous monitoring of the presence of heavy metals and organic compounds is needed in order to perform a proper screening of the C&DW. C&DW aggregates provide a good alternative to natural aggregates.

Keywords: construction and demolition wastes, waste classification, waste composition, waste screening

Procedia PDF Downloads 338
29414 Design and Development of a Platform for Analyzing Spatio-Temporal Data from Wireless Sensor Networks

Authors: Walid Fantazi

Abstract:

The development of sensor technology (such as microelectromechanical systems (MEMS), wireless communications, embedded systems, distributed processing and wireless sensor applications) has contributed to a broad range of wireless sensor network (WSN) applications that are capable of collecting a large amount of spatiotemporal data in real time. These systems require real-time data processing to manage storage and to query the data they process. To cover these needs, we propose in this paper a Snapshot spatiotemporal data model based on object-oriented concepts. This model saves storage and reduces data redundancy, which makes spatiotemporal queries easier to execute and shortens analysis time. Further, to ensure the robustness of the system and to eliminate congestion in main memory, we propose an in-RAM spatiotemporal indexing technique called Captree*. As a result, we offer an RIA (Rich Internet Application)-based SOA application architecture that allows remote monitoring and control.

Keywords: WSN, indexing data, SOA, RIA, geographic information system

Procedia PDF Downloads 241
29413 Managing HR Knowledge in a Large Privately Owned Enterprise: An Empirical Case Analysis

Authors: Cindy Wang-Cowham, Judy Ningyu Tang

Abstract:

The paper contributes to the scarce literature on HR knowledge management. Drawing on the knowledge management literature, the authors define the meaning of HR knowledge and propose that there are social mechanisms in organizations that facilitate the management and sharing of HR knowledge. Instead of investigating the subject in large multinational corporations, the present paper examines it in a large Chinese privately owned enterprise, which has an international standing. The main finding of the case analysis is that communication and feedback play a pivotal role when managing HR knowledge. Social mechanisms can stimulate communication and feedback between employees, thus facilitating knowledge exchange.

Keywords: HR knowledge, knowledge management, large privately owned enterprises, China

Procedia PDF Downloads 516
29412 Large Neural Networks Learning From Scratch With Very Few Data and Without Explicit Regularization

Authors: Christoph Linse, Thomas Martinetz

Abstract:

Recent findings have shown that Neural Networks also generalize in over-parametrized regimes with zero training error. This is surprising, since it runs completely against traditional machine learning wisdom. In our empirical study we fortify these findings in the domain of fine-grained image classification. We show that very large Convolutional Neural Networks with millions of weights do learn with only a handful of training samples and without image augmentation, explicit regularization or pretraining. We train the architectures ResNet18, ResNet101 and VGG19 on subsets of the difficult benchmark datasets Caltech101, CUB_200_2011, FGVCAircraft, Flowers102 and StanfordCars with 100 classes and more, perform a comprehensive comparative study and draw implications for the practical application of CNNs. Finally, we show that VGG19 with 140 million weights learns to distinguish airplanes and motorbikes with up to 95% accuracy using only 20 training samples per class.

Keywords: convolutional neural networks, fine-grained image classification, generalization, image recognition, over-parameterized, small data sets

Procedia PDF Downloads 72
29411 Identification of Potential Large Scale Floating Solar Sites in Peninsular Malaysia

Authors: Nur Iffika Ruslan, Ahmad Rosly Abbas, Munirah Stapah@Salleh, Nurfaziera Rahim

Abstract:

Increased concern about and awareness of the environmental hazards of burning fossil fuels for energy have become the major factor driving the transition toward green energy. An additional 2,000 MW of renewable energy is expected by 2025 following the implementation of Large Scale Solar projects in Peninsular Malaysia, including Large Scale Floating Solar projects. Floating Solar has advantages over its land-based counterparts; for example, the requirement for land acquisition is relatively insignificant. As part of the site selection process established by TNB Research Sdn. Bhd., a set of mandatory and rejection criteria has been developed in order to identify only sites that are feasible for the future development of Large Scale Floating Solar power plants. A total of 85 lakes and reservoirs were identified within Peninsular Malaysia. Only lakes and reservoirs with a minimum surface area of 120 acres are considered as potential sites for the development of Large Scale Floating Solar power plants. The result indicates a total of 10 potential Large Scale Floating Solar sites, located in Selangor, Johor, Perak, Pulau Pinang, Perlis and Pahang. This paper elaborates on the various mandatory and rejection criteria, as well as on the site selection process required to identify potential (suitable) Large Scale Floating Solar sites in Peninsular Malaysia.
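
The one screening rule the abstract states explicitly, a minimum surface area of 120 acres, can be sketched as a simple filter. The other mandatory and rejection criteria are not described in the abstract and are left out; the class name, function name, and sample entries below are hypothetical and are not data from the study.

```python
from dataclasses import dataclass

MIN_SURFACE_AREA_ACRES = 120  # threshold stated in the abstract

@dataclass
class WaterBody:
    name: str
    surface_area_acres: float

def screen_by_area(water_bodies, min_area=MIN_SURFACE_AREA_ACRES):
    """Keep only water bodies that meet the minimum surface area."""
    return [w for w in water_bodies if w.surface_area_acres >= min_area]

# Hypothetical entries for illustration only; not data from the study.
candidates = [
    WaterBody("Lake A", 310.0),
    WaterBody("Reservoir B", 95.0),
    WaterBody("Lake C", 120.0),
]

shortlisted = screen_by_area(candidates)
print([w.name for w in shortlisted])  # prints ['Lake A', 'Lake C']
```

A fuller screening would chain further filters of the same shape, one per criterion.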

Keywords: Large Scale Floating Solar, Peninsular Malaysia, Potential Sites, Renewable Energy

Procedia PDF Downloads 168
29410 Cascaded Transcritical/Supercritical CO2 Cycles and Organic Rankine Cycles to Recover Low-Temperature Waste Heat and LNG Cold Energy Simultaneously

Authors: Haoshui Yu, Donghoi Kim, Truls Gundersen

Abstract:

Low-temperature waste heat is abundant in the process industries, and large amounts of Liquefied Natural Gas (LNG) cold energy are discarded without being recovered properly in LNG terminals. Power generation is an effective way to utilize low-temperature waste heat and LNG cold energy simultaneously. Organic Rankine Cycles (ORCs) and CO2 power cycles are promising technologies to convert low-temperature waste heat and LNG cold energy into electricity. If waste heat and LNG cold energy are utilized simultaneously in one system, the combined system may outperform separate systems utilizing low-temperature waste heat and LNG cold energy, respectively. Low-temperature waste heat acts as the heat source and LNG regasification acts as the heat sink in the combined system. Due to the large temperature difference between the heat source and the heat sink, cascaded power cycle configurations are proposed in this paper. Cascaded power cycles can improve the energy efficiency of the system considerably. In this study, the cycle operating at a higher temperature to recover waste heat is called the top cycle, and the cycle operating at a lower temperature to utilize LNG cold energy is called the bottom cycle. The top cycle condensation heat is used as the heat source in the bottom cycle. The top cycle can be an ORC, transcritical CO2 (tCO2) cycle or supercritical CO2 (sCO2) cycle, while the bottom cycle can only be an ORC due to its low temperature range. However, the thermodynamic paths of the tCO2 cycle and sCO2 cycle are different from that of an ORC. The tCO2 cycle and the sCO2 cycle perform better than an ORC for sensible waste heat recovery due to a better temperature match with the waste heat source. Different combinations of the tCO2 cycle, sCO2 cycle and ORC are compared to screen the best configurations of the cascaded power cycles. The influence of the working fluid and the operating conditions is also investigated in this study. Each configuration is modeled and optimized in Aspen HYSYS. The results show that cascaded tCO2/ORC performs better than cascaded ORC/ORC and cascaded sCO2/ORC for the case study.

Keywords: LNG cold energy, low-temperature waste heat, organic Rankine cycle, supercritical CO₂ cycle, transcritical CO₂ cycle

Procedia PDF Downloads 242
29409 Moral Obligation as a Governor to Skeptical Theism's Relativism

Authors: Peter J. Morgan

Abstract:

In response to evidential arguments from evil, Stephen Wykstra presents CORNEA (Condition of Reasonable Epistemic Access) as a foundational principle for Skeptical Theism which urges one to think in terms of what can be expected in a given situation. The use of CORNEA results in skepticism regarding the ability of human ken to know divine levels of knowledge in instances of intense evil. However, William Rowe presents a critique of Skeptical Theism that questions its ability to argue successfully for theism. Rowe contends that siding with Skeptical Theism is akin to boarding a trolley car that does not stop. Contra Wykstra, Rowe observes that, for all that can be known, there could be greater amounts of evils than goods, and the goods that are seen may not be the best possible goods. This amounts to a mortally challenging critique of Skeptical Theism. However, there is a brake on Rowe’s Trolley. This paper makes the argument that the ubiquitous presence of Moral Obligation (MO) serves as a braking system for Rowe’s Trolley. When the rider begins to feel lost in an epistemic stalemate of good and evil it is MO that turns the tide: MO serves as evidence towards the good on a basic human level, and it is a reminder that God’s character will result in actions towards the good.

Keywords: CORNEA, moral obligation, problem of evil, skeptical theism

Procedia PDF Downloads 197
29408 Degradation Model for UK Railway Drainage System

Authors: Yiqi Wu, Simon Tait, Andrew Nichols

Abstract:

Management of UK railway drainage assets is challenging due to the large amounts of historical assets with long asset life cycles. A major concern for asset managers is to maintain the required performance economically and efficiently while complying with the relevant regulation and legislation. As the majority of the drainage assets are buried underground and are often difficult or costly to examine, it is important for asset managers to understand and model the degradation process in order to foresee the upcoming reduction in asset performance and conduct proactive maintenance accordingly. In this research, a Markov chain approach is used to model the deterioration process of rail drainage assets. The study is based on historical condition scores and characteristics of drainage assets across the whole railway network in England, Scotland, and Wales. The model is used to examine the effect of various characteristics on the probabilities of degradation, for example, the regional difference in probabilities of degradation, and how material and shape can influence the deterioration process for chambers, channels, and pipes.
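
To make the Markov chain idea concrete, the sketch below propagates a distribution over condition states through a transition matrix. The three states, the transition probabilities, and the function names are hypothetical and are not taken from the study; the authors estimate their probabilities from historical condition scores.

```python
def step(distribution, transition):
    """Advance a condition-state distribution by one inspection interval."""
    n = len(distribution)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]

def project(distribution, transition, periods):
    """Apply the transition matrix repeatedly for the given number of periods."""
    for _ in range(periods):
        distribution = step(distribution, transition)
    return distribution

# Hypothetical 3-state example (good, fair, poor); NOT estimated from real data.
# Each row sums to 1, and the zeros below the diagonal mean that an asset
# never improves without maintenance.
TRANSITION = [
    [0.90, 0.08, 0.02],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]

start = [1.0, 0.0, 0.0]  # every asset starts in the "good" state
after_five = project(start, TRANSITION, 5)
print([round(p, 3) for p in after_five])
```

Fitting a separate matrix per region, material, or shape would allow the kind of comparison of degradation probabilities the abstract describes.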

Keywords: deterioration, degradation, Markov models, probability, railway drainage

Procedia PDF Downloads 206
29407 Nazca: A Context-Based Matching Method for Searching Heterogeneous Structures

Authors: Karine B. de Oliveira, Carina F. Dorneles

Abstract:

Structure-level matching is the problem of combining elements of a structure, which can be represented as entities, classes, XML elements, web forms, and so on. This is a challenge due to the large number of distinct representations of semantically similar structures. This paper describes a structure-based matching method applied to search for different representations in data sources, considering the similarity between elements of two structures and the data source context. Using real data sources, we have conducted an experimental study comparing our approach with our baseline implementation and with another important schema matching approach. We demonstrate that our proposal reaches higher precision than the baseline.

Keywords: context, data source, index, matching, search, similarity, structure

Procedia PDF Downloads 349