Search results for: ERA-5 analysis data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 41038

Search results for: ERA-5 analysis data

40408 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 34
40407 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

The Geographically Weighted Regression (GWR) was proposed in geography literature to allow relationship in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. Therefore, it was used the set of exploratory spatial data analysis tools (ESDA) and confirmatory analysis using GWR. It was made the calibration a non-spatial model, evaluation the nature of the regression curve, election of the variables by stepwise process and multicollinearity analysis. After the evaluation of the non-spatial model was processed the spatial-regression model, statistic evaluation of the intercept and verification of its effect on calibration. In an analysis of Spearman’s correlation the results between deforestation and livestock was +0.783 and with soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of agricultural activity of soybeans associated to maize and cotton crops. The GWR is a very effective tool presenting results closer to the reality of deforestation in the Cerrado when compared with other analysis.

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 344
40406 Wave Velocity-Rock Property Relationships in Shallow Marine Libyan Carbonate Reservoir

Authors: Tarek S. Duzan, Abdulaziz F. Ettir

Abstract:

Wave velocities, Core and Log petrophysical data were collected from recently drilled four new wells scattered through-out the Dahra/Jofra (PL-5) Reservoir. The collected data were analyzed for the relationships of Wave Velocities with rock property such as Porosity, permeability and Bulk Density. Lots of Literature review reveals a number of differing results and conclusions regarding wave velocities (Compressional Waves (Vp) and Shear Waves (Vs)) versus rock petrophysical property relationships, especially in carbonate reservoirs. In this paper, we focused on the relationships between wave velocities (Vp , Vs) and the ratio Vp/Vs with rock properties for shallow marine libyan carbonate reservoir (Real Case). Upon data analysis, a relationship between petrophysical properties and wave velocities (Vp, Vs) and the ratio Vp/Vs has been found. Porosity and bulk density properties have shown exponential relationship with wave velocities, while permeability has shown a power relationship in the interested zone. It is also clear that wave velocities (Vp , Vs) seems to be a good indicator for the lithology change with true vertical depth. Therefore, it is highly recommended to use the output relationships to predict porosity, bulk density and permeability of the similar reservoir type utilizing the most recent seismic data.

Keywords: conventional core analysis (porosity, permeability bulk density) data, VS wave and P-wave velocities, shallow carbonate reservoir in D/J field

Procedia PDF Downloads 312
40405 A Crowdsourced Homeless Data Collection System and Its Econometric Analysis

Authors: Praniil Nagaraj

Abstract:

This paper proposes a method to collect homeless data using crowdsourcing and presents an approach to analyze the data, demonstrating its potential to strengthen existing and future policies aimed at promoting socio-economic equilibrium. The 2022 Annual Homeless Assessment Report (AHAR) to Congress highlighted alarming statistics, emphasizing the need for effective decision-making and budget allocation within local planning bodies known as Continuums of Care (CoC). This paper's contributions can be categorized into three main areas. Firstly, a unique method for collecting homeless data is introduced, utilizing a user-friendly smartphone app (currently available for Android). The app enables the general public to quickly record information about homeless individuals, including the number of people and details about their living conditions. The collected data, including date, time, and location, is anonymized and securely transmitted to the cloud. It is anticipated that an increasing number of users motivated to contribute to society will adopt the app, thus expanding the data collection efforts. Duplicate data is addressed through simple classification methods, and historical data is utilized to fill in missing information. The second contribution of this paper is the description of data analysis techniques applied to the collected data. By combining this new data with existing information, statistical regression analysis is employed to gain insights into various aspects, such as distinguishing between unsheltered and sheltered homeless populations, as well as examining their correlation with factors like unemployment rates, housing affordability, and labor demand. Initial data is collected in San Francisco, while pre-existing information is drawn from three cities: San Francisco, New York City, and Washington D.C., facilitating the conduction of simulations. The third contribution focuses on demonstrating the practical implications of the data processing results. The challenges faced by key stakeholders, including charitable organizations and local city governments, are taken into consideration. Two case studies are presented as examples. The first case study explores improving the efficiency of food and necessities distribution, as well as medical assistance, driven by charitable organizations. The second case study examines the correlation between micro-geographic budget expenditure by local city governments and homeless information to justify budget allocation and expenditures. The ultimate objective of this endeavor is to enable the continuous enhancement of the quality of life for the underprivileged. It is hoped that through increased crowdsourcing of data from the public, the Generosity Curve and the Need Curve will intersect, leading to a better world for all.

Keywords: crowdsourcing, homelessness, socio-economic policies, statistical analysis

Procedia PDF Downloads 43
40404 Validity and Reliability of Competency Assessment Implementation (CAI) Instrument Using Rasch Model

Authors: Nurfirdawati Muhamad Hanafi, Azmanirah Ab Rahman, Marina Ibrahim Mukhtar, Jamil Ahmad, Sarebah Warman

Abstract:

This study was conducted to generate empirical evidence on validity and reliability of the item of Competency Assessment Implementation (CAI) Instrument using Rasch Model for polythomous data aided by Winstep software version 3.68. The construct validity was examined by analyzing the point-measure correlation index (PTMEA), in fit and outfit MNSQ values; meanwhile the reliability was examined by analyzing item reliability index. A survey technique was used as the major method with the CAI instrument on 156 teachers from vocational schools. The results have shown that the reliability of CAI Instrument items were between 0.80 and 0.98. PTMEA Correlation is in positive values, in which the item is able to distinguish between the ability of the respondent. Statistical data obtained shows that out of 154 items, 12 items from the instrument suggested to be omitted. This study is hoped could bring a new direction to the process of data analysis in educational research.

Keywords: competency assessment, reliability, validity, item analysis

Procedia PDF Downloads 425
40403 A Dynamic Spatial Panel Data Analysis on Renter-Occupied Multifamily Housing DC

Authors: Jose Funes, Jeff Sauer, Laixiang Sun

Abstract:

This research examines determinants of multifamily housing development and spillovers in the District of Columbia. A range of socioeconomic factors related to income distribution, productivity, and land use policies are thought to influence the development in contemporary U.S. multifamily housing markets. The analysis leverages data from the American Community Survey to construct panel datasets spanning from 2010 to 2019. Using spatial regression, we identify several socioeconomic measures and land use policies both positively and negatively associated with new housing supply. We contextualize housing estimates related to race in relation to uneven development in the contemporary D.C. housing supply.

Keywords: neighborhood effect, sorting, spatial spillovers, multifamily housing

Procedia PDF Downloads 81
40402 Changes in the Subjective Interpretation of Poverty Due to COVID-19: The Case of a Peripheral County of Hungary

Authors: Eszter Siposne Nandori

Abstract:

The paper describes how the subjective interpretation of poverty changed during the COVID-19 pandemic. The results of data collection at the end of 2020 are compared to the results of a similar survey from 2019. The methods of systematic data collection are used to collect data about the beliefs of the population about poverty. The analysis is carried out in Borsod-Abaúj-Zemplén County, one of the most backward areas in Hungary. The paper concludes that poverty is mainly linked to material values, and it did not change from 2019 to 2020. Some slight changes, however, highlight the effect of the pandemic: poverty is increasingly seen as a generational problem in 2020, and another important change is that isolation became more closely related to poverty.

Keywords: Hungary, interpretation of poverty, pandemic, systematic data collection, subjective poverty

Procedia PDF Downloads 106
40401 Estimating the Ladder Angle and the Camera Position From a 2D Photograph Based on Applications of Projective Geometry and Matrix Analysis

Authors: Inigo Beckett

Abstract:

In forensic investigations, it is often the case that the most potentially useful recorded evidence derives from coincidental imagery, recorded immediately before or during an incident, and that during the incident (e.g. a ‘failure’ or fire event), the evidence is changed or destroyed. To an image analysis expert involved in photogrammetric analysis for Civil or Criminal Proceedings, traditional computer vision methods involving calibrated cameras is often not appropriate because image metadata cannot be relied upon. This paper presents an approach for resolving this problem, considering in particular and by way of a case study, the angle of a simple ladder shown in a photograph. The UK Health and Safety Executive (HSE) guidance document published in 2014 (INDG455) advises that a leaning ladder should be erected at 75 degrees to the horizontal axis. Personal injury cases can arise in the construction industry because a ladder is too steep or too shallow. Ad-hoc photographs of such ladders in their incident position provide a basis for analysis of their angle. This paper presents a direct approach for ascertaining the position of the camera and the angle of the ladder simultaneously from the photograph(s) by way of a workflow that encompasses a novel application of projective geometry and matrix analysis. Mathematical analysis shows that for a given pixel ratio of directly measured collinear points (i.e. features that lie on the same line segment) from the 2D digital photograph with respect to a given viewing point, we can constrain the 3D camera position to a surface of a sphere in the scene. Depending on what we know about the ladder, we can enforce another independent constraint on the possible camera positions which enables us to constrain the possible positions even further. Experiments were conducted using synthetic and real-world data. The synthetic data modeled a vertical plane with a ladder on a horizontally flat plane resting against a vertical wall. The real-world data was captured using an Apple iPhone 13 Pro and 3D laser scan survey data whereby a ladder was placed in a known location and angle to the vertical axis. For each case, we calculated camera positions and the ladder angles using this method and cross-compared them against their respective ‘true’ values.

Keywords: image analysis, projective geometry, homography, photogrammetry, ladders, Forensics, Mathematical modeling, planar geometry, matrix analysis, collinear, cameras, photographs

Procedia PDF Downloads 31
40400 Explanatory Variables for Crash Injury Risk Analysis

Authors: Guilhermina Torrao

Abstract:

An extensive number of studies have been conducted to determine the factors which influence crash injury risk (CIR); however, uncertainties inherent to selected variables have been neglected. A review of existing literature is required to not only obtain an overview of the variables and measures but also ascertain the implications when comparing studies without a systematic view of variable taxonomy. Therefore, the aim of this literature review is to examine and report on peer-reviewed studies in the field of crash analysis and to understand the implications of broad variations in variable selection in CIR analysis. The objective of this study is to demonstrate the variance in variable selection and classification when modeling injury risk involving occupants of light vehicles by presenting an analytical review of the literature. Based on data collected from 64 journal publications reported over the past 21 years, the analytical review discusses the variables selected by each study across an organized list of predictors for CIR analysis and provides a better understanding of the contribution of accident and vehicle factors to injuries acquired by occupants of light vehicles. A cross-comparison analysis demonstrates that almost half the studies (48%) did not consider vehicle design specifications (e.g., vehicle weight), whereas, for those that did, the vehicle age/model year was the most selected explanatory variable used by 41% of the literature studies. For those studies that included speed risk factor in their analyses, the majority (64%) used the legal speed limit data as a ‘proxy’ of vehicle speed at the moment of a crash, imposing limitations for CIR analysis and modeling. Despite the proven efficiency of airbags in minimizing injury impact following a crash, only 22% of studies included airbag deployment data. A major contribution of this study is to highlight the uncertainty linked to explanatory variable selection and identify opportunities for improvements when performing future studies in the field of road injuries.

Keywords: crash, exploratory, injury, risk, variables, vehicle

Procedia PDF Downloads 115
40399 Contextual Sentiment Analysis with Untrained Annotators

Authors: Lucas A. Silva, Carla R. Aguiar

Abstract:

This work presents a proposal to perform contextual sentiment analysis using a supervised learning algorithm and disregarding the extensive training of annotators. To achieve this goal, a web platform was developed to perform the entire procedure outlined in this paper. The main contribution of the pipeline described in this article is to simplify and automate the annotation process through a system of analysis of congruence between the notes. This ensured satisfactory results even without using specialized annotators in the context of the research, avoiding the generation of biased training data for the classifiers. For this, a case study was conducted in a blog of entrepreneurship. The experimental results were consistent with the literature related annotation using formalized process with experts.

Keywords: sentiment analysis, untrained annotators, naive bayes, entrepreneurship, contextualized classifier

Procedia PDF Downloads 377
40398 Spatial Analysis of the Impact of City Developments Degradation of Green Space in Urban Fringe Eastern City of Yogyakarta Year 2005-2010

Authors: Pebri Nurhayati, Rozanah Ahlam Fadiyah

Abstract:

In the development of the city often use rural areas that can not be separated from the change in land use that lead to the degradation of urban green space in the city fringe. In the long run, the degradation of green open space this can impact on the decline of ecological, psychological and public health. Therefore, this research aims to (1) determine the relationship between the parameters of the degradation rate of urban development with green space, (2) develop a spatial model of the impact of urban development on the degradation of green open space with remote sensing techniques and Geographical Information Systems in an integrated manner. This research is a descriptive research with data collection techniques of observation and secondary data . In the data analysis, to interpret the direction of urban development and degradation of green open space is required in 2005-2010 ASTER image with NDVI. Of interpretation will generate two maps, namely maps and map development built land degradation green open space. Secondary data related to the rate of population growth, the level of accessibility, and the main activities of each city map is processed into a population growth rate, the level of accessibility maps, and map the main activities of the town. Each map is used as a parameter to map the degradation of green space and analyzed by non-parametric statistical analysis using Crosstab thus obtained value of C (coefficient contingency). C values were then compared with the Cmaximum to determine the relationship. From this research will be obtained in the form of modeling spatial map of the City Development Impact Degradation Green Space in Urban Fringe eastern city of Yogyakarta 2005-2010. In addition, this research also generate statistical analysis of the test results of each parameter to the degradation of green open space in the Urban Fringe eastern city of Yogyakarta 2005-2010.

Keywords: spatial analysis, urban development, degradation of green space, urban fringe

Procedia PDF Downloads 295
40397 Artificial Intelligence Approach to Water Treatment Processes: Case Study of Daspoort Treatment Plant, South Africa

Authors: Olumuyiwa Ojo, Masengo Ilunga

Abstract:

Artificial neural network (ANN) has broken the bounds of the convention programming, which is actually a function of garbage in garbage out by its ability to mimic the human brain. Its ability to adopt, adapt, adjust, evaluate, learn and recognize the relationship, behavior, and pattern of a series of data set administered to it, is tailored after the human reasoning and learning mechanism. Thus, the study aimed at modeling wastewater treatment process in order to accurately diagnose water control problems for effective treatment. For this study, a stage ANN model development and evaluation methodology were employed. The source data analysis stage involved a statistical analysis of the data used in modeling in the model development stage, candidate ANN architecture development and then evaluated using a historical data set. The model was developed using historical data obtained from Daspoort Wastewater Treatment plant South Africa. The resultant designed dimensions and model for wastewater treatment plant provided good results. Parameters considered were temperature, pH value, colour, turbidity, amount of solids and acidity. Others are total hardness, Ca hardness, Mg hardness, and chloride. This enables the ANN to handle and represent more complex problems that conventional programming is incapable of performing.

Keywords: ANN, artificial neural network, wastewater treatment, model, development

Procedia PDF Downloads 134
40396 Algorithms used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data like GIS data is important to reduce the data and extract information. Therefore, the development of new techniques and tools that support the human in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. Thus, we introduce a set of database primitives or basic operations for spatial data mining which are sufficient to express most of the spatial data mining algorithms from the literature. This approach has several advantages. Similar to the relational standard language SQL, the use of standard primitives will speed-up the development of new data mining algorithms and will also make them more portable. We introduced a database-oriented framework for spatial data mining which is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths were defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives by a commercial DBMS were presented.

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 440
40395 Unlocking the Puzzle of Borrowing Adult Data for Designing Hybrid Pediatric Clinical Trials

Authors: Rajesh Kumar G

Abstract:

A challenging aspect of any clinical trial is to carefully plan the study design to meet the study objective in optimum way and to validate the assumptions made during protocol designing. And when it is a pediatric study, there is the added challenge of stringent guidelines and difficulty in recruiting the necessary subjects. Unlike adult trials, there is not much historical data available for pediatrics, which is required to validate assumptions for planning pediatric trials. Typically, pediatric studies are initiated as soon as approval is obtained for a drug to be marketed for adults, so with the adult study historical information and with the available pediatric pilot study data or simulated pediatric data, the pediatric study can be well planned. Generalizing the historical adult study for new pediatric study is a tedious task; however, it is possible by integrating various statistical techniques and utilizing the advantage of hybrid study design, which will help to achieve the study objective in a smoother way even with the presence of many constraints. This research paper will explain how well the hybrid study design can be planned along with integrated technique (SEV) to plan the pediatric study; In brief the SEV technique (Simulation, Estimation (using borrowed adult data and applying Bayesian methods)) incorporates the use of simulating the planned study data and getting the desired estimates to Validate the assumptions.This method of validation can be used to improve the accuracy of data analysis, ensuring that results are as valid and reliable as possible, which allow us to make informed decisions well ahead of study initiation. With professional precision, this technique based on the collected data allows to gain insight into best practices when using data from historical study and simulated data alike.

Keywords: adaptive design, simulation, borrowing data, bayesian model

Procedia PDF Downloads 57
40394 Analysis of Brownfield Soil Contamination Using Local Government Planning Data

Authors: Emma E. Hellawell, Susan J. Hughes

Abstract:

BBrownfield sites are currently being redeveloped for residential use. Information on soil contamination on these former industrial sites is collected as part of the planning process by the local government. This research project analyses this untapped resource of environmental data, using site investigation data submitted to a local Borough Council, in Surrey, UK. Over 150 site investigation reports were collected and interrogated to extract relevant information. This study involved three phases. Phase 1 was the development of a database for soil contamination information from local government reports. This database contained information on the source, history, and quality of the data together with the chemical information on the soil that was sampled. Phase 2 involved obtaining site investigation reports for development within the study area and extracting the required information for the database. Phase 3 was the data analysis and interpretation of key contaminants to evaluate typical levels of contaminants, their distribution within the study area, and relating these results to current guideline levels of risk for future site users. Preliminary results for a pilot study using a sample of the dataset have been obtained. This pilot study showed there is some inconsistency in the quality of the reports and measured data, and careful interpretation of the data is required. Analysis of the information has found high levels of lead in shallow soil samples, with mean and median levels exceeding the current guidance for residential use. The data also showed elevated (but below guidance) levels of potentially carcinogenic polyaromatic hydrocarbons. Of particular concern from the data was the high detection rate for asbestos fibers. These were found at low concentrations in 25% of the soil samples tested (however, the sample set was small). Contamination levels of the remaining chemicals tested were all below the guidance level for residential site use. These preliminary pilot study results will be expanded, and results for the whole local government area will be presented at the conference. The pilot study has demonstrated the potential for this extensive dataset to provide greater information on local contamination levels. This can help inform regulators and developers and lead to more targeted site investigations, improving risk assessments, and brownfield development.

Keywords: Brownfield development, contaminated land, local government planning data, site investigation

Procedia PDF Downloads 126
40393 Leveraging Power BI for Advanced Geotechnical Data Analysis and Visualization in Mining Projects

Authors: Elaheh Talebi, Fariba Yavari, Lucy Philip, Lesley Town

Abstract:

The mining industry generates vast amounts of data, necessitating robust data management systems and advanced analytics tools to achieve better decision-making processes in the development of mining production and maintaining safety. This paper highlights the advantages of Power BI, a powerful intelligence tool, over traditional Excel-based approaches for effectively managing and harnessing mining data. Power BI enables professionals to connect and integrate multiple data sources, ensuring real-time access to up-to-date information. Its interactive visualizations and dashboards offer an intuitive interface for exploring and analyzing geotechnical data. Advanced analytics is a collection of data analysis techniques to improve decision-making. Leveraging some of the most complex techniques in data science, advanced analytics is used to do everything from detecting data errors and ensuring data accuracy to directing the development of future project phases. However, while Power BI is a robust tool, specific visualizations required by geotechnical engineers may have limitations. This paper studies the capability to use Python or R programming within the Power BI dashboard to enable advanced analytics, additional functionalities, and customized visualizations. This dashboard provides comprehensive tools for analyzing and visualizing key geotechnical data metrics, including spatial representation on maps, field and lab test results, and subsurface rock and soil characteristics. Advanced visualizations like borehole logs and Stereonet were implemented using Python programming within the Power BI dashboard, enhancing the understanding and communication of geotechnical information. Moreover, the dashboard's flexibility allows for the incorporation of additional data and visualizations based on the project scope and available data, such as pit design, rock fall analyses, rock mass characterization, and drone data. This further enhances the dashboard's usefulness in future projects, including operation, development, closure, and rehabilitation phases. Additionally, this helps in minimizing the necessity of utilizing multiple software programs in projects. This geotechnical dashboard in Power BI serves as a user-friendly solution for analyzing, visualizing, and communicating both new and historical geotechnical data, aiding in informed decision-making and efficient project management throughout various project stages. Its ability to generate dynamic reports and share them with clients in a collaborative manner further enhances decision-making processes and facilitates effective communication within geotechnical projects in the mining industry.

Keywords: geotechnical data analysis, power BI, visualization, decision-making, mining industry

Procedia PDF Downloads 73
40392 A Method for Reduction of Association Rules in Data Mining

Authors: Diego De Castro Rodrigues, Marcelo Lisboa Rocha, Daniela M. De Q. Trevisan, Marcos Dias Da Conceicao, Gabriel Rosa, Rommel M. Barbosa

Abstract:

The use of association rules algorithms within data mining is recognized as being of great value in the knowledge discovery in databases. Very often, the number of rules generated is high, sometimes even in databases with small volume, so the success in the analysis of results can be hampered by this quantity. The purpose of this research is to present a method for reducing the quantity of rules generated with association algorithms. Therefore, a computational algorithm was developed with the use of a Weka Application Programming Interface, which allows the execution of the method on different types of databases. After the development, tests were carried out on three types of databases: synthetic, model, and real. Efficient results were obtained in reducing the number of rules, where the worst case presented a gain of more than 50%, considering the concepts of support, confidence, and lift as measures. This study concluded that the proposed model is feasible and quite interesting, contributing to the analysis of the results of association rules generated from the use of algorithms.

Keywords: data mining, association rules, rules reduction, artificial intelligence

Procedia PDF Downloads 146
40391 Analysis of Noodle Production Process at Yan Hu Food Manufacturing: Basis for Production Improvement

Authors: Rhadinia Tayag-Relanes, Felina C. Young

Abstract:

This study was conducted to analyze the noodle production process at Yan Hu Food Manufacturing for the basis of production improvement. The study utilized the PDCA approach and record review in the gathering of data for the calendar year 2019 from August to October data of the noodle products miki, canton, and misua. Causal-comparative research was used in this study; it attempts to establish cause-effect relationships among the variables such as descriptive statistics and correlation, both were used to compute the data gathered. The study found that miki, canton, and misua production has different cycle time sets for each production and has different production outputs in every set of its production process and a different number of wastages. The company has not yet established its allowable rejection rate/ wastage; instead, this paper used a 1% wastage limit. The researcher recommended the following: machines used for each process of the noodle product must be consistently maintained and monitored; an assessment of all the production operators by checking their performance statistically based on the output and the machine performance; a root cause analysis for finding the solution must be conducted; and an improvement on the recording system of the input and output of the production process of noodle product should be established to eliminate the poor recording of data.

Keywords: production, continuous improvement, process, operations, PDCA

Procedia PDF Downloads 44
40390 Innovation in Traditional Game: A Case Study of Trainee Teachers' Learning Experiences

Authors: Malathi Balakrishnan, Cheng Lee Ooi, Chander Vengadasalam

Abstract:

The purpose of this study is to explore a case study of trainee teachers’ learning experience on innovating traditional games during the traditional game carnival. It explores issues arising from multiple case studies of trainee teachers learning experiences in innovating traditional games. A qualitative methodology was adopted through observations, semi-structured interviews and reflective journals’ content analysis of trainee teachers’ learning experiences creating and implementing innovative traditional games. Twelve groups of 36 trainee teachers who registered for Sports and Physical Education Management Course were the participants for this research during the traditional game carnival. Semi structured interviews were administrated after the trainee teachers learning experiences in creating innovative traditional games. Reflective journals were collected after carnival day and the content analyzed. Inductive data analysis was used to evaluate various data sources. All the collected data were then evaluated through the Nvivo data analysis process. Inductive reasoning was interpreted based on the Self Determination Theory (SDT). The findings showed that the trainee teachers had positive game participation experiences, game knowledge about traditional games and positive motivation to innovate the game. The data also revealed the influence of themes like cultural significance and creativity. It can be concluded from the findings that the organized game carnival, as a requirement of course work by the Institute of Teacher Training Malaysia, was able to enhance teacher trainers’ innovative thinking skills. The SDT, as a multidimensional approach to motivation, was utilized. Therefore, teacher trainers may have more learning experiences using the SDT.

Keywords: learning experiences, innovation, traditional games, trainee teachers

Procedia PDF Downloads 313
40389 Towards an Enhanced Compartmental Model for Profiling Malware Dynamics

Authors: Jessemyn Modiini, Timothy Lynar, Elena Sitnikova

Abstract:

We present a novel enhanced compartmental model for malware spread analysis in cyber security. This paper applies cyber security data features to epidemiological compartmental models to model the infectious potential of malware. Compartmental models are most efficient for calculating the infectious potential of a disease. In this paper, we discuss and profile epidemiologically relevant data features from a Domain Name System (DNS) dataset. We then apply these features to epidemiological compartmental models to network traffic features. This paper demonstrates how epidemiological principles can be applied to the novel analysis of key cybersecurity behaviours and trends and provides insight into threat modelling above that of kill-chain analysis. In applying deterministic compartmental models to a cyber security use case, the authors analyse the deficiencies and provide an enhanced stochastic model for cyber epidemiology. This enhanced compartmental model (SUEICRN model) is contrasted with the traditional SEIR model to demonstrate its efficacy.

Keywords: cybersecurity, epidemiology, cyber epidemiology, malware

Procedia PDF Downloads 91
40388 Financial Reports and Common Ownership: An Analysis of the Mechanisms Common Owners Use to Induce Anti-Competitive Behavior

Authors: Kevin Smith

Abstract:

Publicly traded company in the US are legally obligated to host earnings calls that discuss their most recent financial reports. During these calls, investors are able to ask these companies questions about these financial reports and on the future direction of the company. This paper examines whether common institutional owners use these calls as a way to indirectly signal to companies in their portfolio to not take actions that could hurt the common owner's interests. This paper uses transcripts taken from the earnings calls of the six largest health insurance companies in the US from 2014 to 2019. This data is analyzed using text analysis and sentiment analysis to look for patterns in the statements made by common owners. The analysis found that common owners where more likely to recommend against direct price competition and instead redirect the insurance companies towards more passive actions, like investing in new technologies. This result indicates a mechanism that common owners use to reduce competition in the health insurance market.

Keywords: common ownership, text analysis, sentiment analysis, machine learning

Procedia PDF Downloads 58
40387 Detection of Abnormal Process Behavior in Copper Solvent Extraction by Principal Component Analysis

Authors: Kirill Filianin, Satu-Pia Reinikainen, Tuomo Sainio

Abstract:

Frequent measurements of product steam quality create a data overload that becomes more and more difficult to handle. In the current study, plant history data with multiple variables was successfully treated by principal component analysis to detect abnormal process behavior, particularly, in copper solvent extraction. The multivariate model is based on the concentration levels of main process metals recorded by the industrial on-stream x-ray fluorescence analyzer. After mean-centering and normalization of concentration data set, two-dimensional multivariate model under principal component analysis algorithm was constructed. Normal operating conditions were defined through control limits that were assigned to squared score values on x-axis and to residual values on y-axis. 80 percent of the data set were taken as the training set and the multivariate model was tested with the remaining 20 percent of data. Model testing showed successful application of control limits to detect abnormal behavior of copper solvent extraction process as early warnings. Compared to the conventional techniques of analyzing one variable at a time, the proposed model allows to detect on-line a process failure using information from all process variables simultaneously. Complex industrial equipment combined with advanced mathematical tools may be used for on-line monitoring both of process streams’ composition and final product quality. Defining normal operating conditions of the process supports reliable decision making in a process control room. Thus, industrial x-ray fluorescence analyzers equipped with integrated data processing toolbox allows more flexibility in copper plant operation. The additional multivariate process control and monitoring procedures are recommended to apply separately for the major components and for the impurities. Principal component analysis may be utilized not only in control of major elements’ content in process streams, but also for continuous monitoring of plant feed. The proposed approach has a potential in on-line instrumentation providing fast, robust and cheap application with automation abilities.

Keywords: abnormal process behavior, failure detection, principal component analysis, solvent extraction

Procedia PDF Downloads 293
40386 Wavelet Based Advanced Encryption Standard Algorithm for Image Encryption

Authors: Ajish Sreedharan

Abstract:

With the fast evolution of digital data exchange, security information becomes much important in data storage and transmission. Due to the increasing use of images in industrial process, it is essential to protect the confidential image data from unauthorized access. As encryption process is applied to the whole image in AES ,it is difficult to improve the efficiency. In this paper, wavelet decomposition is used to concentrate the main information of image to the low frequency part. Then, AES encryption is applied to the low frequency part. The high frequency parts are XORed with the encrypted low frequency part and a wavelet reconstruction is applied. Theoretical analysis and experimental results show that the proposed algorithm has high efficiency, and satisfied security suits for image data transmission.

Keywords: discrete wavelet transforms, AES, dynamic SBox

Procedia PDF Downloads 419
40385 Using Data from Foursquare Web Service to Represent the Commercial Activity of a City

Authors: Taras Agryzkov, Almudena Nolasco-Cirugeda, Jose L. Oliver, Leticia Serrano-Estrada, Leandro Tortosa, Jose F. Vicent

Abstract:

This paper aims to represent the commercial activity of a city taking as source data the social network Foursquare. The city of Murcia is selected as case study, and the location-based social network Foursquare is the main source of information. After carrying out a reorganisation of the user-generated data extracted from Foursquare, it is possible to graphically display on a map the various city spaces and venues –especially those related to commercial, food and entertainment sector businesses. The obtained visualisation provides information about activity patterns in the city of Murcia according to the people`s interests and preferences and, moreover, interesting facts about certain characteristics of the town itself.

Keywords: social networks, spatial analysis, data visualization, geocomputation, Foursquare

Procedia PDF Downloads 403
40384 Analysis of Sediment Distribution around Karang Sela Coral Reef Using Multibeam Backscatter

Authors: Razak Zakariya, Fazliana Mustajap, Lenny Sharinee Sakai

Abstract:

A sediment map is quite important in the marine environment. The sediment itself contains thousands of information that can be used for other research. This study was conducted by using a multibeam echo sounder Reson T20 on 15 August 2020 at the Karang Sela (coral reef area) at Pulau Bidong. The study aims to identify the sediment type around the coral reef by using bathymetry and backscatter data. The sediment in the study area was collected as ground truthing data to verify the classification of the seabed. A dry sieving method was used to analyze the sediment sample by using a sieve shaker. PDS 2000 software was used for data acquisition, and Qimera QPS version 2.4.5 was used for processing the bathymetry data. Meanwhile, FMGT QPS version 7.10 processes the backscatter data. Then, backscatter data were analyzed by using the maximum likelihood classification tool in ArcGIS version 10.8 software. The result identified three types of sediments around the coral which were very coarse sand, coarse sand, and medium sand.

Keywords: sediment type, MBES echo sounder, backscatter, ArcGIS

Procedia PDF Downloads 65
40383 Biomechanical Performance of the Synovial Capsule of the Glenohumeral Joint with a BANKART Lesion through Finite Element Analysis

Authors: Duvert A. Puentes T., Javier A. Maldonado E., Ivan Quintero., Diego F. Villegas

Abstract:

Mechanical Computation is a great tool to study the performance of complex models. An example of it is the study of the human body structure. This paper took advantage of different types of software to make a 3D model of the glenohumeral joint and apply a finite element analysis. The main objective was to study the change in the biomechanical properties of the joint when it presents an injury. Specifically, a BANKART lesion, which consists in the detachment of the anteroinferior labrum from the glenoid. Stress and strain distribution of the soft tissues were the focus of this study. First, a 3D model was made of a joint without any pathology, as a control sample, using segmentation software for the bones with the support of medical imagery and a cadaveric model to represent the soft tissue. The joint was built to simulate a compression and external rotation test using CAD to prepare the model in the adequate position. When the healthy model was finished, it was submitted to a finite element analysis and the results were validated with experimental model data. With the validated model, it was sensitized to obtain the best mesh measurement. Finally, the geometry of the 3D model was changed to imitate a BANKART lesion. Then, the contact zone of the glenoid with the labrum was slightly separated simulating a tissue detachment. With this new geometry, the finite element analysis was applied again, and the results were compared with the control sample created initially. With the data gathered, this study can be used to improve understanding of the labrum tears. Nevertheless, it is important to remember that the computational analysis are approximations and the initial data was taken from an in vitro assay.

Keywords: biomechanics, computational model, finite elements, glenohumeral joint, bankart lesion, labrum

Procedia PDF Downloads 144
40382 Enhancing Large Language Models' Data Analysis Capability with Planning-and-Execution and Code Generation Agents: A Use Case for Southeast Asia Real Estate Market Analytics

Authors: Kien Vu, Jien Min Soh, Mohamed Jahangir Abubacker, Piyawut Pattamanon, Soojin Lee, Suvro Banerjee

Abstract:

Recent advances in Generative Artificial Intelligence (GenAI), in particular Large Language Models (LLMs) have shown promise to disrupt multiple industries at scale. However, LLMs also present unique challenges, notably, these so-called "hallucination" which is the generation of outputs that are not grounded in the input data that hinders its adoption into production. Common practice to mitigate hallucination problem is utilizing Retrieval Agmented Generation (RAG) system to ground LLMs'response to ground truth. RAG converts the grounding documents into embeddings, retrieve the relevant parts with vector similarity between user's query and documents, then generates a response that is not only based on its pre-trained knowledge but also on the specific information from the retrieved documents. However, the RAG system is not suitable for tabular data and subsequent data analysis tasks due to multiple reasons such as information loss, data format, and retrieval mechanism. In this study, we have explored a novel methodology that combines planning-and-execution and code generation agents to enhance LLMs' data analysis capabilities. The approach enables LLMs to autonomously dissect a complex analytical task into simpler sub-tasks and requirements, then convert them into executable segments of code. In the final step, it generates the complete response from output of the executed code. When deployed beta version on DataSense, the property insight tool of PropertyGuru, the approach yielded promising results, as it was able to provide market insights and data visualization needs with high accuracy and extensive coverage by abstracting the complexities for real-estate agents and developers from non-programming background. In essence, the methodology not only refines the analytical process but also serves as a strategic tool for real estate professionals, aiding in market understanding and enhancement without the need for programming skills. The implication extends beyond immediate analytics, paving the way for a new era in the real estate industry characterized by efficiency and advanced data utilization.

Keywords: large language model, reasoning, planning and execution, code generation, natural language processing, prompt engineering, data analysis, real estate, data sense, PropertyGuru

Procedia PDF Downloads 58
40381 A Comprehensive Survey and Improvement to Existing Privacy Preserving Data Mining Techniques

Authors: Tosin Ige

Abstract:

Ethics must be a condition of the world, like logic. (Ludwig Wittgenstein, 1889-1951). As important as data mining is, it possess a significant threat to ethics, privacy, and legality, since data mining makes it difficult for an individual or consumer (in the case of a company) to control the accessibility and usage of his data. This research focuses on Current issues and the latest research and development on Privacy preserving data mining methods as at year 2022. It also discusses some advances in those techniques while at the same time highlighting and providing a new technique as a solution to an existing technique of privacy preserving data mining methods. This paper also bridges the wide gap between Data mining and the Web Application Programing Interface (web API), where research is urgently needed for an added layer of security in data mining while at the same time introducing a seamless and more efficient way of data mining.

Keywords: data, privacy, data mining, association rule, privacy preserving, mining technique

Procedia PDF Downloads 146
40380 Concept for Knowledge out of Sri Lankan Non-State Sector: Performances of Higher Educational Institutes and Successes of Its Sector

Authors: S. Jeyarajan

Abstract:

Concept of knowledge is discovered from conducted study for successive Competition in Sri Lankan Non-State Higher Educational Institutes. The Concept discovered out of collected Knowledge Management Practices from Emerald inside likewise reputed literatures and of Non-State Higher Educational sector. A test is conducted to reveal existences and its reason behind of these collected practices in Sri Lankan Non-State Higher Education Institutes. Further, unavailability of such study and uncertain on number of participants for data collection in the Sri Lankan context contributed selection of research method as qualitative method, which used attributes of Delphi Method to manage those likewise uncertainty. Data are collected under Dramaturgical Method, which contributes efficient usage of the Delphi method. Grounded theory is selected as data analysis techniques, which is conducted in intermixed discourse to manage different perspectives of data that are collected systematically through perspective and modified snowball sampling techniques. Data are then analysed using Grounded Theory Development Techniques in Intermix discourses to manage differences in Data. Consequently, Agreement in the results of Grounded theories and of finding in the Foreign Study is discovered in the analysis whereas present study conducted as Qualitative Research and The Foreign Study conducted as Quantitative Research. As such, the Present study widens the discovery in the Foreign Study. Further, having discovered reason behind of the existences, the Present result shows Concept for Knowledge from Sri Lankan Non-State sector to manage higher educational Institutes in successful manner.

Keywords: adherence of snowball sampling into perspective sampling, Delphi method in qualitative method, grounded theory development in intermix discourses of analysis, knowledge management for success of higher educational institutes

Procedia PDF Downloads 156
40379 Automatic Lead Qualification with Opinion Mining in Customer Relationship Management Projects

Authors: Victor Radich, Tania Basso, Regina Moraes

Abstract:

Lead qualification is one of the main procedures in Customer Relationship Management (CRM) projects. Its main goal is to identify potential consumers who have the ideal characteristics to establish a profitable and long-term relationship with a certain organization. Social networks can be an important source of data for identifying and qualifying leads since interest in specific products or services can be identified from the users’ expressed feelings of (dis)satisfaction. In this context, this work proposes the use of machine learning techniques and sentiment analysis as an extra step in the lead qualification process in order to improve it. In addition to machine learning models, sentiment analysis or opinion mining can be used to understand the evaluation that the user makes of a particular service, product, or brand. The results obtained so far have shown that it is possible to extract data from social networks and combine the techniques for a more complete classification.

Keywords: lead qualification, sentiment analysis, opinion mining, machine learning, CRM, lead scoring

Procedia PDF Downloads 63