Search results for: inverse models of data envelopment analysis
44368 Observed Changes in Constructed Precipitation at High Resolution in Southern Vietnam
Authors: Nguyen Tien Thanh, Günter Meon
Abstract:
Precipitation plays a key role in water cycle, defining the local climatic conditions and in ecosystem. It is also an important input parameter for water resources management and hydrologic models. With spatial continuous data, a certainty of discharge predictions or other environmental factors is unquestionably better than without. This is, however, not always willingly available to acquire for a small basin, especially for coastal region in Vietnam due to a low network of meteorological stations (30 stations) on long coast of 3260 km2. Furthermore, available gridded precipitation datasets are not fine enough when applying to hydrologic models. Under conditions of global warming, an application of spatial interpolation methods is a crucial for the climate change impact studies to obtain the spatial continuous data. In recent research projects, although some methods can perform better than others do, no methods draw the best results for all cases. The objective of this paper therefore, is to investigate different spatial interpolation methods for daily precipitation over a small basin (approximately 400 km2) located in coastal region, Southern Vietnam and find out the most efficient interpolation method on this catchment. The five different interpolation methods consisting of cressman, ordinary kriging, regression kriging, dual kriging and inverse distance weighting have been applied to identify the best method for the area of study on the spatio-temporal scale (daily, 10 km x 10 km). A 30-year precipitation database was created and merged into available gridded datasets. Finally, observed changes in constructed precipitation were performed. The results demonstrate that the method of ordinary kriging interpolation is an effective approach to analyze the daily precipitation. The mixed trends of increasing and decreasing monthly, seasonal and annual precipitation have documented at significant levels.Keywords: interpolation, precipitation, trend, vietnam
Procedia PDF Downloads 27544367 Prediction of Mechanical Strength of Multiscale Hybrid Reinforced Cementitious Composite
Authors: Salam Alrekabi, A. B. Cundy, Mohammed Haloob Al-Majidi
Abstract:
Novel multiscale hybrid reinforced cementitious composites based on carbon nanotubes (MHRCC-CNT), and carbon nanofibers (MHRCC-CNF) are new types of cement-based material fabricated with micro steel fibers and nanofilaments, featuring superior strain hardening, ductility, and energy absorption. This study focused on established models to predict the compressive strength, and direct and splitting tensile strengths of the produced cementitious composites. The analysis was carried out based on the experimental data presented by the previous author’s study, regression analysis, and the established models that available in the literature. The obtained models showed small differences in the predictions and target values with experimental verification indicated that the estimation of the mechanical properties could be achieved with good accuracy.Keywords: multiscale hybrid reinforced cementitious composites, carbon nanotubes, carbon nanofibers, mechanical strength prediction
Procedia PDF Downloads 16144366 Fuzzy-Machine Learning Models for the Prediction of Fire Outbreak: A Comparative Analysis
Authors: Uduak Umoh, Imo Eyoh, Emmauel Nyoho
Abstract:
This paper compares fuzzy-machine learning algorithms such as Support Vector Machine (SVM), and K-Nearest Neighbor (KNN) for the predicting cases of fire outbreak. The paper uses the fire outbreak dataset with three features (Temperature, Smoke, and Flame). The data is pre-processed using Interval Type-2 Fuzzy Logic (IT2FL) algorithm. Min-Max Normalization and Principal Component Analysis (PCA) are used to predict feature labels in the dataset, normalize the dataset, and select relevant features respectively. The output of the pre-processing is a dataset with two principal components (PC1 and PC2). The pre-processed dataset is then used in the training of the aforementioned machine learning models. K-fold (with K=10) cross-validation method is used to evaluate the performance of the models using the matrices – ROC (Receiver Operating Curve), Specificity, and Sensitivity. The model is also tested with 20% of the dataset. The validation result shows KNN is the better model for fire outbreak detection with an ROC value of 0.99878, followed by SVM with an ROC value of 0.99753.Keywords: Machine Learning Algorithms , Interval Type-2 Fuzzy Logic, Fire Outbreak, Support Vector Machine, K-Nearest Neighbour, Principal Component Analysis
Procedia PDF Downloads 18144365 Extreme Temperature Forecast in Mbonge, Cameroon Through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution
Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph
Abstract:
In this paper, temperature extremes are forecast by employing the block maxima method of the generalized extreme value (GEV) distribution to analyse temperature data from the Cameroon Development Corporation (CDC). By considering two sets of data (raw data and simulated data) and two (stationary and non-stationary) models of the GEV distribution, return levels analysis is carried out and it was found that in the stationary model, the return values are constant over time with the raw data, while in the simulated data the return values show an increasing trend with an upper bound. In the non-stationary model, the return levels of both the raw data and simulated data show an increasing trend with an upper bound. This clearly shows that although temperatures in the tropics show a sign of increase in the future, there is a maximum temperature at which there is no exceedance. The results of this paper are very vital in agricultural and environmental research.Keywords: forecasting, generalized extreme value (GEV), meteorology, return level
Procedia PDF Downloads 47744364 A Structuring and Classification Method for Assigning Application Areas to Suitable Digital Factory Models
Authors: R. Hellmuth
Abstract:
The method of factory planning has changed a lot, especially when it is about planning the factory building itself. Factory planning has the task of designing products, plants, processes, organization, areas, and the building of a factory. Regular restructuring is becoming more important in order to maintain the competitiveness of a factory. Restrictions in new areas, shorter life cycles of product and production technology as well as a VUCA world (Volatility, Uncertainty, Complexity and Ambiguity) lead to more frequent restructuring measures within a factory. A digital factory model is the planning basis for rebuilding measures and becomes an indispensable tool. Furthermore, digital building models are increasingly being used in factories to support facility management and manufacturing processes. The main research question of this paper is, therefore: What kind of digital factory model is suitable for the different areas of application during the operation of a factory? First, different types of digital factory models are investigated, and their properties and usabilities for use cases are analysed. Within the scope of investigation are point cloud models, building information models, photogrammetry models, and these enriched with sensor data are examined. It is investigated which digital models allow a simple integration of sensor data and where the differences are. Subsequently, possible application areas of digital factory models are determined by means of a survey and the respective digital factory models are assigned to the application areas. Finally, an application case from maintenance is selected and implemented with the help of the appropriate digital factory model. It is shown how a completely digitalized maintenance process can be supported by a digital factory model by providing information. Among other purposes, the digital factory model is used for indoor navigation, information provision, and display of sensor data. In summary, the paper shows a structuring of digital factory models that concentrates on the geometric representation of a factory building and its technical facilities. A practical application case is shown and implemented. Thus, the systematic selection of digital factory models with the corresponding application cases is evaluated.Keywords: building information modeling, digital factory model, factory planning, maintenance
Procedia PDF Downloads 11044363 Study on Flexible Diaphragm In-Plane Model of Irregular Multi-Storey Industrial Plant
Authors: Cheng-Hao Jiang, Mu-Xuan Tao
Abstract:
The rigid diaphragm model may cause errors in the calculation of internal forces due to neglecting the in-plane deformation of the diaphragm. This paper thus studies the effects of different diaphragm in-plane models (including in-plane rigid model and in-plane flexible model) on the seismic performance of structures. Taking an actual industrial plant as an example, the seismic performance of the structure is predicted using different floor diaphragm models, and the analysis errors caused by different diaphragm in-plane models including deformation error and internal force error are calculated. Furthermore, the influence of the aspect ratio on the analysis errors is investigated. Finally, the code rationality is evaluated by assessing the analysis errors of the structure models whose floors were determined as rigid according to the code’s criterion. It is found that different floor models may cause great differences in the distribution of structural internal forces, and the current code may underestimate the influence of the floor in-plane effect.Keywords: industrial plant, diaphragm, calculating error, code rationality
Procedia PDF Downloads 14044362 Modeling the Demand for the Healthcare Services Using Data Analysis Techniques
Authors: Elizaveta S. Prokofyeva, Svetlana V. Maltseva, Roman D. Zaitsev
Abstract:
Rapidly evolving modern data analysis technologies in healthcare play a large role in understanding the operation of the system and its characteristics. Nowadays, one of the key tasks in urban healthcare is to optimize the resource allocation. Thus, the application of data analysis in medical institutions to solve optimization problems determines the significance of this study. The purpose of this research was to establish the dependence between the indicators of the effectiveness of the medical institution and its resources. Hospital discharges by diagnosis; hospital days of in-patients and in-patient average length of stay were selected as the performance indicators and the demand of the medical facility. The hospital beds by type of care, medical technology (magnetic resonance tomography, gamma cameras, angiographic complexes and lithotripters) and physicians characterized the resource provision of medical institutions for the developed models. The data source for the research was an open database of the statistical service Eurostat. The choice of the source is due to the fact that the databases contain complete and open information necessary for research tasks in the field of public health. In addition, the statistical database has a user-friendly interface that allows you to quickly build analytical reports. The study provides information on 28 European for the period from 2007 to 2016. For all countries included in the study, with the most accurate and complete data for the period under review, predictive models were developed based on historical panel data. An attempt to improve the quality and the interpretation of the models was made by cluster analysis of the investigated set of countries. The main idea was to assess the similarity of the joint behavior of the variables throughout the time period under consideration to identify groups of similar countries and to construct the separate regression models for them. Therefore, the original time series were used as the objects of clustering. The hierarchical agglomerate algorithm k-medoids was used. The sampled objects were used as the centers of the clusters obtained, since determining the centroid when working with time series involves additional difficulties. The number of clusters used the silhouette coefficient. After the cluster analysis it was possible to significantly improve the predictive power of the models: for example, in the one of the clusters, MAPE error was only 0,82%, which makes it possible to conclude that this forecast is highly reliable in the short term. The obtained predicted values of the developed models have a relatively low level of error and can be used to make decisions on the resource provision of the hospital by medical personnel. The research displays the strong dependencies between the demand for the medical services and the modern medical equipment variable, which highlights the importance of the technological component for the successful development of the medical facility. Currently, data analysis has a huge potential, which allows to significantly improving health services. Medical institutions that are the first to introduce these technologies will certainly have a competitive advantage.Keywords: data analysis, demand modeling, healthcare, medical facilities
Procedia PDF Downloads 14444361 Strategy Management of Soybean (Glycine max L.) for Dealing with Extreme Climate through the Use of Cropsyst Model
Authors: Aminah Muchdar, Nuraeni, Eddy
Abstract:
The aims of the research are: (1) to verify the cropsyst plant model of experimental data in the field of soybean plants and (2) to predict planting time and potential yield soybean plant with the use of cropsyst model. This research is divided into several stages: (1) first calibration stage which conducted in the field from June until September 2015.(2) application models stage, where the data obtained from calibration in the field will be included in cropsyst models. The required data models are climate data, ground data/soil data,also crop genetic data. The relationship between the obtained result in field with simulation cropsyst model indicated by Efficiency Index (EF) which the value is 0,939.That is showing that cropsyst model is well used. From the calculation result RRMSE which the value is 1,922%.That is showing that comparative fault prediction results from simulation with result obtained in the field is 1,92%. The conclusion has obtained that the prediction of soybean planting time cropsyst based models that have been made valid for use. and the appropriate planting time for planting soybeans mainly on rain-fed land is at the end of the rainy season, in which the above study first planting time (June 2, 2015) which gives the highest production, because at that time there was still some rain. Tanggamus varieties more resistant to slow planting time cause the percentage decrease in the yield of each decade is lower than the average of all varieties.Keywords: soybean, Cropsyst, calibration, efficiency Index, RRMSE
Procedia PDF Downloads 17944360 Simulation of Optimal Runoff Hydrograph Using Ensemble of Radar Rainfall and Blending of Runoffs Model
Authors: Myungjin Lee, Daegun Han, Jongsung Kim, Soojun Kim, Hung Soo Kim
Abstract:
Recently, the localized heavy rainfall and typhoons are frequently occurred due to the climate change and the damage is becoming bigger. Therefore, we may need a more accurate prediction of the rainfall and runoff. However, the gauge rainfall has the limited accuracy in space. Radar rainfall is better than gauge rainfall for the explanation of the spatial variability of rainfall but it is mostly underestimated with the uncertainty involved. Therefore, the ensemble of radar rainfall was simulated using error structure to overcome the uncertainty and gauge rainfall. The simulated ensemble was used as the input data of the rainfall-runoff models for obtaining the ensemble of runoff hydrographs. The previous studies discussed about the accuracy of the rainfall-runoff model. Even if the same input data such as rainfall is used for the runoff analysis using the models in the same basin, the models can have different results because of the uncertainty involved in the models. Therefore, we used two models of the SSARR model which is the lumped model, and the Vflo model which is a distributed model and tried to simulate the optimum runoff considering the uncertainty of each rainfall-runoff model. The study basin is located in Han river basin and we obtained one integrated runoff hydrograph which is an optimum runoff hydrograph using the blending methods such as Multi-Model Super Ensemble (MMSE), Simple Model Average (SMA), Mean Square Error (MSE). From this study, we could confirm the accuracy of rainfall and rainfall-runoff model using ensemble scenario and various rainfall-runoff model and we can use this result to study flood control measure due to climate change. Acknowledgements: This work is supported by the Korea Agency for Infrastructure Technology Advancement(KAIA) grant funded by the Ministry of Land, Infrastructure and Transport (Grant 18AWMP-B083066-05).Keywords: radar rainfall ensemble, rainfall-runoff models, blending method, optimum runoff hydrograph
Procedia PDF Downloads 28044359 A Demonstration of How to Employ and Interpret Binary IRT Models Using the New IRT Procedure in SAS 9.4
Authors: Ryan A. Black, Stacey A. McCaffrey
Abstract:
Over the past few decades, great strides have been made towards improving the science in the measurement of psychological constructs. Item Response Theory (IRT) has been the foundation upon which statistical models have been derived to increase both precision and accuracy in psychological measurement. These models are now being used widely to develop and refine tests intended to measure an individual's level of academic achievement, aptitude, and intelligence. Recently, the field of clinical psychology has adopted IRT models to measure psychopathological phenomena such as depression, anxiety, and addiction. Because advances in IRT measurement models are being made so rapidly across various fields, it has become quite challenging for psychologists and other behavioral scientists to keep abreast of the most recent developments, much less learn how to employ and decide which models are the most appropriate to use in their line of work. In the same vein, IRT measurement models vary greatly in complexity in several interrelated ways including but not limited to the number of item-specific parameters estimated in a given model, the function which links the expected response and the predictor, response option formats, as well as dimensionality. As a result, inferior methods (a.k.a. Classical Test Theory methods) continue to be employed in efforts to measure psychological constructs, despite evidence showing that IRT methods yield more precise and accurate measurement. To increase the use of IRT methods, this study endeavors to provide a comprehensive overview of binary IRT models; that is, measurement models employed on test data consisting of binary response options (e.g., correct/incorrect, true/false, agree/disagree). Specifically, this study will cover the most basic binary IRT model, known as the 1-parameter logistic (1-PL) model dating back to over 50 years ago, up until the most recent complex, 4-parameter logistic (4-PL) model. Binary IRT models will be defined mathematically and the interpretation of each parameter will be provided. Next, all four binary IRT models will be employed on two sets of data: 1. Simulated data of N=500,000 subjects who responded to four dichotomous items and 2. A pilot analysis of real-world data collected from a sample of approximately 770 subjects who responded to four self-report dichotomous items pertaining to emotional consequences to alcohol use. Real-world data were based on responses collected on items administered to subjects as part of a scale-development study (NIDA Grant No. R44 DA023322). IRT analyses conducted on both the simulated data and analyses of real-world pilot will provide a clear demonstration of how to construct, evaluate, and compare binary IRT measurement models. All analyses will be performed using the new IRT procedure in SAS 9.4. SAS code to generate simulated data and analyses will be available upon request to allow for replication of results.Keywords: instrument development, item response theory, latent trait theory, psychometrics
Procedia PDF Downloads 35644358 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring
Authors: Hyun-Woo Cho
Abstract:
Detection of incipient abnormal events in production processes is important to improve safety and reliability of manufacturing operations and reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods are evaluated using real data. The results shows that the calibration model based on supervised probabilistic model yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.Keywords: calibration model, monitoring, quality improvement, feature selection
Procedia PDF Downloads 35544357 A Comparative Study of Regional Climate Models and Global Coupled Models over Uttarakhand
Authors: Sudip Kumar Kundu, Charu Singh
Abstract:
As a great physiographic divide, the Himalayas affecting a large system of water and air circulation which helps to determine the climatic condition in the Indian subcontinent to the south and mid-Asian highlands to the north. It creates obstacles by defending chill continental air from north side into India in winter and also defends rain-bearing southwesterly monsoon to give up maximum precipitation in that area in monsoon season. Nowadays extreme weather conditions such as heavy precipitation, cloudburst, flash flood, landslide and extreme avalanches are the regular happening incidents in the region of North Western Himalayan (NWH). The present study has been planned to investigate the suitable model(s) to find out the rainfall pattern over that region. For this investigation, selected models from Coordinated Regional Climate Downscaling Experiment (CORDEX) and Coupled Model Intercomparison Project Phase 5 (CMIP5) has been utilized in a consistent framework for the period of 1976 to 2000 (historical). The ability of these driving models from CORDEX domain and CMIP5 has been examined according to their capability of the spatial distribution as well as time series plot of rainfall over NWH in the rainy season and compared with the ground-based Indian Meteorological Department (IMD) gridded rainfall data set. It is noted from the analysis that the models like MIROC5 and MPI-ESM-LR from the both CORDEX and CMIP5 provide the best spatial distribution of rainfall over NWH region. But the driving models from CORDEX underestimates the daily rainfall amount as compared to CMIP5 driving models as it is unable to capture daily rainfall data properly when it has been plotted for time series (TS) individually for the state of Uttarakhand (UK) and Himachal Pradesh (HP). So finally it can be said that the driving models from CMIP5 are better than CORDEX domain models to investigate the rainfall pattern over NWH region.Keywords: global warming, rainfall, CMIP5, CORDEX, NWH
Procedia PDF Downloads 16944356 Random Matrix Theory Analysis of Cross-Correlation in the Nigerian Stock Exchange
Authors: Chimezie P. Nnanwa, Thomas C. Urama, Patrick O. Ezepue
Abstract:
In this paper we use Random Matrix Theory to analyze the eigen-structure of the empirical correlations of 82 stocks which are consistently traded in the Nigerian Stock Exchange (NSE) over a 4-year study period 3 August 2009 to 26 August 2013. We apply the Marchenko-Pastur distribution of eigenvalues of a purely random matrix to investigate the presence of investment-pertinent information contained in the empirical correlation matrix of the selected stocks. We use hypothesised standard normal distribution of eigenvector components from RMT to assess deviations of the empirical eigenvectors to this distribution for different eigenvalues. We also use the Inverse Participation Ratio to measure the deviation of eigenvectors of the empirical correlation matrix from RMT results. These preliminary results on the dynamics of asset price correlations in the NSE are important for improving risk-return trade-offs associated with Markowitz’s portfolio optimization in the stock exchange, which is pursued in future work.Keywords: correlation matrix, eigenvalue and eigenvector, inverse participation ratio, portfolio optimization, random matrix theory
Procedia PDF Downloads 34444355 Engagement Analysis Using DAiSEE Dataset
Authors: Naman Solanki, Souraj Mondal
Abstract:
With the world moving towards online communication, the video datastore has exploded in the past few years. Consequently, it has become crucial to analyse participant’s engagement levels in online communication videos. Engagement prediction of people in videos can be useful in many domains, like education, client meetings, dating, etc. Video-level or frame-level prediction of engagement for a user involves the development of robust models that can capture facial micro-emotions efficiently. For the development of an engagement prediction model, it is necessary to have a widely-accepted standard dataset for engagement analysis. DAiSEE is one of the datasets which consist of in-the-wild data and has a gold standard annotation for engagement prediction. Earlier research done using the DAiSEE dataset involved training and testing standard models like CNN-based models, but the results were not satisfactory according to industry standards. In this paper, a multi-level classification approach has been introduced to create a more robust model for engagement analysis using the DAiSEE dataset. This approach has recorded testing accuracies of 0.638, 0.7728, 0.8195, and 0.866 for predicting boredom level, engagement level, confusion level, and frustration level, respectively.Keywords: computer vision, engagement prediction, deep learning, multi-level classification
Procedia PDF Downloads 11444354 Benchmarking Machine Learning Approaches for Forecasting Hotel Revenue
Authors: Rachel Y. Zhang, Christopher K. Anderson
Abstract:
A critical aspect of revenue management is a firm’s ability to predict demand as a function of price. Historically hotels have used simple time series models (regression and/or pick-up based models) owing to the complexities of trying to build casual models of demands. Machine learning approaches are slowly attracting attention owing to their flexibility in modeling relationships. This study provides an overview of approaches to forecasting hospitality demand – focusing on the opportunities created by machine learning approaches, including K-Nearest-Neighbors, Support vector machine, Regression Tree, and Artificial Neural Network algorithms. The out-of-sample performances of above approaches to forecasting hotel demand are illustrated by using a proprietary sample of the market level (24 properties) transactional data for Las Vegas NV. Causal predictive models can be built and evaluated owing to the availability of market level (versus firm level) data. This research also compares and contrast model accuracy of firm-level models (i.e. predictive models for hotel A only using hotel A’s data) to models using market level data (prices, review scores, location, chain scale, etc… for all hotels within the market). The prospected models will be valuable for hotel revenue prediction given the basic characters of a hotel property or can be applied in performance evaluation for an existed hotel. The findings will unveil the features that play key roles in a hotel’s revenue performance, which would have considerable potential usefulness in both revenue prediction and evaluation.Keywords: hotel revenue, k-nearest-neighbors, machine learning, neural network, prediction model, regression tree, support vector machine
Procedia PDF Downloads 13244353 Evaluation of QSRR Models by Sum of Ranking Differences Approach: A Case Study of Prediction of Chromatographic Behavior of Pesticides
Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević
Abstract:
The present study deals with the selection of the most suitable quantitative structure-retention relationship (QSRR) models which should be used in prediction of the retention behavior of basic, neutral, acidic and phenolic pesticides which belong to different classes: fungicides, herbicides, metabolites, insecticides and plant growth regulators. Sum of ranking differences (SRD) approach can give a different point of view on selection of the most consistent QSRR model. SRD approach can be applied not only for ranking of the QSRR models, but also for detection of similarity or dissimilarity among them. Applying the SRD analysis, the most similar models can be found easily. In this study, selection of the best model was carried out on the basis of the reference ranking (“golden standard”) which was defined as the row average values of logarithm of retention time (logtr) defined by high performance liquid chromatography (HPLC). Also, SRD analysis based on experimental logtr values as reference ranking revealed similar grouping of the established QSRR models already obtained by hierarchical cluster analysis (HCA).Keywords: chemometrics, chromatography, pesticides, sum of ranking differences
Procedia PDF Downloads 37544352 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis
Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan
Abstract:
Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through Vader and Roberta model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and TFIDF Vectorization, and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.Keywords: counter vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis
Procedia PDF Downloads 8844351 Analyzing the Relationship between the Spatial Characteristics of Cultural Structure, Activities, and the Tourism Demand
Authors: Deniz Karagöz
Abstract:
This study is attempt to comprehend the relationship between the spatial characteristics of cultural structure, activities and the tourism demand in Turkey. The analysis divided into four parts. The first part consisted of a cultural structure and cultural activity (CSCA) index provided by principal component analysis. The analysis determined four distinct dimensions, namely, cultural activity/structure, accessing culture, consumption, and cultural management. The exploratory spatial data analysis employed to determine the spatial models of cultural structure and cultural activities in 81 provinces in Turkey. Global Moran I indices is used to ascertain the cultural activities and the structural clusters. Finally, the relationship between the cultural activities/cultural structure and tourism demand was analyzed. The raw/original data of the study official databases. The data on the cultural structure and activities gathered from the Turkish Statistical Institute and the data related to the tourism demand was provided by the Republic of Turkey Ministry of Culture and Tourism.Keywords: cultural activities, cultural structure, spatial characteristics, tourism demand, Turkey
Procedia PDF Downloads 56044350 Stability Analysis of Modelling the Effect of Vaccination and Novel Quarantine-Adjusted Incidence on the Spread of Newcastle Disease
Authors: Nurudeen O. Lasisi, Sirajo Abdulrahman, Abdulkareem A. Ibrahim
Abstract:
Newcastle disease is an infection of domestic poultry and other bird species with the virulent Newcastle disease virus (NDV). In this paper, we study the dynamics of the modeling of the Newcastle disease virus (NDV) using a novel quarantine-adjusted incidence. The comparison of Vaccination, linear incident rate and novel quarantine-adjusted incident rate in the models are discussed. The dynamics of the models yield disease-free and endemic equilibrium states.The effective reproduction numbers of the models are computed in order to measure the relative impact of an individual bird or combined intervention for effective disease control. We showed the local and global stability of endemic equilibrium states of the models and we found that the stability of endemic equilibrium states of models are globally asymptotically stable if the effective reproduction numbers of the models equations are greater than a unit.Keywords: effective reproduction number, Endemic state, Mathematical model, Newcastle disease virus, novel quarantine-adjusted incidence, stability analysis
Procedia PDF Downloads 12144349 Discrimination in Insurance Pricing: A Textual-Analysis Perspective
Authors: Ruijuan Bi
Abstract:
Discrimination in insurance pricing is a topic of increasing concern, particularly in the context of the rapid development of big data and artificial intelligence. There is a need to explore the various forms of discrimination, such as direct and indirect discrimination, proxy discrimination, algorithmic discrimination, and unfair discrimination, and understand their implications in insurance pricing models. This paper aims to analyze and interpret the definitions of discrimination in insurance pricing and explore measures to reduce discrimination. It utilizes a textual analysis methodology, which involves gathering qualitative data from relevant literature on definitions of discrimination. The research methodology focuses on exploring the various forms of discrimination and their implications in insurance pricing models. Through textual analysis, this paper identifies the specific characteristics and implications of each form of discrimination in the general insurance industry. This research contributes to the theoretical understanding of discrimination in insurance pricing. By analyzing and interpreting relevant literature, this paper provides insights into the definitions of discrimination and the laws and regulations surrounding it. This theoretical foundation can inform future empirical research on discrimination in insurance pricing using relevant theories of probability theory.Keywords: algorithmic discrimination, direct and indirect discrimination, proxy discrimination, unfair discrimination, insurance pricing
Procedia PDF Downloads 7344348 An As-Is Analysis and Approach for Updating Building Information Models and Laser Scans
Authors: Rene Hellmuth
Abstract:
Factory planning has the task of designing products, plants, processes, organization, areas, and the construction of a factory. The requirements for factory planning and the building of a factory have changed in recent years. Regular restructuring of the factory building is becoming more important in order to maintain the competitiveness of a factory. Restrictions in new areas, shorter life cycles of product and production technology as well as a VUCA world (Volatility, Uncertainty, Complexity & Ambiguity) lead to more frequent restructuring measures within a factory. A building information model (BIM) is the planning basis for rebuilding measures and becomes an indispensable data repository to be able to react quickly to changes. Use as a planning basis for restructuring measures in factories only succeeds if the BIM model has adequate data quality. Under this aspect and the industrial requirement, three data quality factors are particularly important for this paper regarding the BIM model: up-to-dateness, completeness, and correctness. The research question is: how can a BIM model be kept up to date with required data quality and which visualization techniques can be applied in a short period of time on the construction site during conversion measures? An as-is analysis is made of how BIM models and digital factory models (including laser scans) are currently being kept up to date. Industrial companies are interviewed, and expert interviews are conducted. Subsequently, the results are evaluated, and a procedure conceived how cost-effective and timesaving updating processes can be carried out. The availability of low-cost hardware and the simplicity of the process are of importance to enable service personnel from facility mnagement to keep digital factory models (BIM models and laser scans) up to date. The approach includes the detection of changes to the building, the recording of the changing area, and the insertion into the overall digital twin. Finally, an overview of the possibilities for visualizations suitable for construction sites is compiled. An augmented reality application is created based on an updated BIM model of a factory and installed on a tablet. Conversion scenarios with costs and time expenditure are displayed. A user interface is designed in such a way that all relevant conversion information is available at a glance for the respective conversion scenario. A total of three essential research results are achieved: As-is analysis of current update processes for BIM models and laser scans, development of a time-saving and cost-effective update process and the conception and implementation of an augmented reality solution for BIM models suitable for construction sites.Keywords: building information modeling, digital factory model, factory planning, restructuring
Procedia PDF Downloads 11444347 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory
Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan
Abstract:
Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.Keywords: data fusion, Dempster-Shafer theory, data mining, event detection
Procedia PDF Downloads 41044346 A Comparative Analysis of E-Government Quality Models
Authors: Abdoullah Fath-Allah, Laila Cheikhi, Rafa E. Al-Qutaish, Ali Idri
Abstract:
Many quality models have been used to measure e-government portals quality. However, the absence of an international consensus for e-government portals quality models results in many differences in terms of quality attributes and measures. The aim of this paper is to compare and analyze the existing e-government quality models proposed in literature (those that are based on ISO standards and those that are not) in order to propose guidelines to build a good and useful e-government portals quality model. Our findings show that, there is no e-government portal quality model based on the new international standard ISO 25010. Besides that, the quality models are not based on a best practice model to allow agencies to both; measure e-government portals quality and identify missing best practices for those portals.Keywords: e-government, portal, best practices, quality model, ISO, standard, ISO 25010, ISO 9126
Procedia PDF Downloads 56044345 Frailty Models for Modeling Heterogeneity: Simulation Study and Application to Quebec Pension Plan
Authors: Souad Romdhane, Lotfi Belkacem
Abstract:
When referring to actuarial analysis of lifetime, only models accounting for observable risk factors have been developed. Within this context, Cox proportional hazards model (CPH model) is commonly used to assess the effects of observable covariates as gender, age, smoking habits, on the hazard rates. These covariates may fail to fully account for the true lifetime interval. This may be due to the existence of another random variable (frailty) that is still being ignored. The aim of this paper is to examine the shared frailty issue in the Cox proportional hazard model by including two different parametric forms of frailty into the hazard function. Four estimated methods are used to fit them. The performance of the parameter estimates is assessed and compared between the classical Cox model and these frailty models through a real-life data set from the Quebec Pension Plan and then using a more general simulation study. This performance is investigated in terms of the bias of point estimates and their empirical standard errors in both fixed and random effect parts. Both the simulation and the real dataset studies showed differences between classical Cox model and shared frailty model.Keywords: life insurance-pension plan, survival analysis, risk factors, cox proportional hazards model, multivariate failure-time data, shared frailty, simulations study
Procedia PDF Downloads 35944344 Parameters Identification of Granular Soils around PMT Test by Inverse Analysis
Authors: Younes Abed
Abstract:
The successful application of in-situ testing of soils heavily depends on development of interpretation methods of tests. The pressuremeter test simulates the expansion of a cylindrical cavity and because it has well defined boundary conditions, it is more unable to rigorous theoretical analysis (i. e. cavity expansion theory) then most other in-situ tests. In this article, and in order to make the identification process more convenient, we propose a relatively simple procedure which involves the numerical identification of some mechanical parameters of a granular soil, especially, the elastic modulus and the friction angle from a pressuremeter curve. The procedure, applied here to identify the parameters of generalised prager model associated to the Drucker & Prager criterion from a pressuremeter curve, is based on an inverse analysis approach, which consists of minimizing the function representing the difference between the experimental curve and the curve obtained by integrating the model along the loading path in in-situ testing. The numerical process implemented here is based on the established finite element program. We present a validation of the proposed approach by a database of tests on expansion of cylindrical cavity. This database consists of four types of tests; thick cylinder tests carried out on the Hostun RF sand, pressuremeter tests carried out on the Hostun sand, in-situ pressuremeter tests carried out at the site of Fos with marine self-boring pressuremeter and in-situ pressuremeter tests realized on the site of Labenne with Menard pressuremeter.Keywords: granular soils, cavity expansion, pressuremeter test, finite element method, identification procedure
Procedia PDF Downloads 29244343 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining
Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride
Abstract:
In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning
Procedia PDF Downloads 13444342 A Study of Non Linear Partial Differential Equation with Random Initial Condition
Authors: Ayaz Ahmad
Abstract:
In this work, we present the effect of noise on the solution of a partial differential equation (PDE) in three different setting. We shall first consider random initial condition for two nonlinear dispersive PDE the non linear Schrodinger equation and the Kortteweg –de vries equation and analyse their effect on some special solution , the soliton solutions.The second case considered a linear partial differential equation , the wave equation with random initial conditions allow to substantially decrease the computational and data storage costs of an algorithm to solve the inverse problem based on the boundary measurements of the solution of this equation. Finally, the third example considered is that of the linear transport equation with a singular drift term, when we shall show that the addition of a multiplicative noise term forbids the blow up of solutions under a very weak hypothesis for which we have finite time blow up of a solution in the deterministic case. Here we consider the problem of wave propagation, which is modelled by a nonlinear dispersive equation with noisy initial condition .As observed noise can also be introduced directly in the equations.Keywords: drift term, finite time blow up, inverse problem, soliton solution
Procedia PDF Downloads 21544341 In-Context Meta Learning for Automatic Designing Pretext Tasks for Self-Supervised Image Analysis
Authors: Toktam Khatibi
Abstract:
Self-supervised learning (SSL) includes machine learning models that are trained on one aspect and/or one part of the input to learn other aspects and/or part of it. SSL models are divided into two different categories, including pre-text task-based models and contrastive learning ones. Pre-text tasks are some auxiliary tasks learning pseudo-labels, and the trained models are further fine-tuned for downstream tasks. However, one important disadvantage of SSL using pre-text task solving is defining an appropriate pre-text task for each image dataset with a variety of image modalities. Therefore, it is required to design an appropriate pretext task automatically for each dataset and each downstream task. To the best of our knowledge, the automatic designing of pretext tasks for image analysis has not been considered yet. In this paper, we present a framework based on In-context learning that describes each task based on its input and output data using a pre-trained image transformer. Our proposed method combines the input image and its learned description for optimizing the pre-text task design and its hyper-parameters using Meta-learning models. The representations learned from the pre-text tasks are fine-tuned for solving the downstream tasks. We demonstrate that our proposed framework outperforms the compared ones on unseen tasks and image modalities in addition to its superior performance for previously known tasks and datasets.Keywords: in-context learning (ICL), meta learning, self-supervised learning (SSL), vision-language domain, transformers
Procedia PDF Downloads 8044340 The Application of Data Mining Technology in Building Energy Consumption Data Analysis
Authors: Liang Zhao, Jili Zhang, Chongquan Zhong
Abstract:
Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into the data mining technology to determine its application in the analysis of building energy consumption data including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature are reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.Keywords: data mining, data analysis, prediction, optimization, building operational performance
Procedia PDF Downloads 85244339 Off-Grid Sparse Inverse Synthetic Aperture Imaging by Basis Shift Algorithm
Authors: Mengjun Yang, Zhulin Zong, Jie Gao
Abstract:
In this paper, a new and robust algorithm is proposed to achieve high resolution for inverse synthetic aperture radar (ISAR) imaging in the compressive sensing (CS) framework. Traditional CS based methods have to assume that unknown scatters exactly lie on the pre-divided grids; otherwise, their reconstruction performance dropped significantly. In this processing algorithm, several basis shifts are utilized to achieve the same effect as grid refinement does. The detailed implementation of the basis shift algorithm is presented in this paper. From the simulation we can see that using the basis shift algorithm, imaging precision can be improved. The effectiveness and feasibility of the proposed method are investigated by the simulation results.Keywords: ISAR imaging, sparse reconstruction, off-grid, basis shift
Procedia PDF Downloads 265