Search results for: custom dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1312

142 Handling, Exporting and Archiving Automated Mineralogy Data Using TESCAN TIMA

Authors: Marek Dosbaba

Abstract:

Within the mining sector, SEM-based Automated Mineralogy (AM) has been the standard application for quickly and efficiently handling mineral processing tasks. Over the last decade, the trend has been to analyze larger numbers of samples, often with a higher level of detail. This has necessitated a shift from interactive sample analysis performed by an operator using an SEM to an increased reliance on offline processing to analyze and report the data. In response to this trend, the TESCAN TIMA Mineral Analyzer is designed to quickly create a virtual copy of the studied samples, thereby preserving all the necessary information. Depending on the selected data acquisition mode, TESCAN TIMA can perform hyperspectral mapping and save an X-ray spectrum for each pixel or segment, respectively. This approach allows the user to browse through elemental distribution maps of all elements detectable by means of energy dispersive spectroscopy. Re-evaluation of the existing data for the presence of previously unconsidered elements is possible without the need to repeat the analysis. Additional tiers of data, such as secondary electron or cathodoluminescence images, can also be recorded. To take full advantage of these information-rich datasets, TIMA utilizes a new archiving tool introduced by TESCAN. The dataset size can be reduced for long-term storage, and all information can be recovered on demand in case of renewed interest. TESCAN TIMA is optimized for network storage of its datasets because of the larger data storage capacity of servers compared to local drives, which also allows multiple users to access the data remotely. This goes hand in hand with support for remote control of the entire data acquisition process. TESCAN also brings a newly extended open-source data format that allows other applications to extract, process and report AM data. This offers the ability to link TIMA data to large databases feeding plant performance dashboards or geometallurgical models. The traditional tabular particle-by-particle or grain-by-grain export process is preserved and can be customized with scripts to include user-defined particle/grain properties.

Keywords: Tescan, electron microscopy, mineralogy, SEM, automated mineralogy, database, TESCAN TIMA, open format, archiving, big data

Procedia PDF Downloads 93
141 Hybrid Knowledge and Data-Driven Neural Networks for Diffuse Optical Tomography Reconstruction in Medical Imaging

Authors: Paola Causin, Andrea Aspri, Alessandro Benfenati

Abstract:

Diffuse Optical Tomography (DOT) is an emergent medical imaging technique which employs NIR light to estimate the spatial distribution of optical coefficients in biological tissues for diagnostic purposes, in a noninvasive and non-ionizing manner. DOT reconstruction is a severely ill-conditioned problem due to the prevalent scattering of light in the tissue. In this contribution, we present our research on hybrid knowledge-driven/data-driven approaches which exploit the existence of well-assessed physical models and build neural networks upon them, integrating the availability of data. Namely, since in this context regularization procedures are mandatory to obtain a reasonable reconstruction [1], we explore the use of neural networks as tools to include prior information on the solution. 2. Materials and Methods. The idea underlying our approach is to leverage neural networks to solve PDE-constrained inverse problems of the form $q^* = \arg\min_q D(y, \tilde{y})$ (1), where $D$ is a loss function which typically contains a discrepancy measure (or data fidelity) term plus other possible ad-hoc designed terms enforcing specific constraints. In the context of inverse problems like (1), one seeks the optimal set of physical parameters $q$, given the set of observations $y$. Moreover, $\tilde{y}$ is the computable approximation of $y$, which may be obtained from a neural network but also in a classic way via the resolution of a PDE with given input coefficients (forward problem, Fig. 1, box 2). Due to the severe ill-conditioning of the reconstruction problem, we adopt a two-fold approach: i) we restrict the solutions (optical coefficients) to lie in a lower-dimensional subspace generated by auto-decoder type networks; this procedure forms priors on the solution (Fig. 1, box 1); ii) we use regularization procedures of the type $\hat{q}^* = \arg\min_q D(y, \tilde{y}) + R(q)$, where $R(q)$ is a regularization functional depending on regularization parameters which can be fixed a priori or learned via a neural network in a data-driven modality. To further improve the generalizability of the proposed framework, we also infuse physics knowledge via soft penalty constraints (Fig. 1, box 3) in the overall optimization procedure (Fig. 1, box 4). 3. Discussion and Conclusion. DOT reconstruction is severely hindered by ill-conditioning. The combined use of data-driven and knowledge-driven elements is beneficial and allows improved results to be obtained, especially with a restricted dataset and in the presence of variable sources of noise.
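As an illustration of problem (1), the following is a minimal sketch (not the authors' DOT solver) that minimizes a quadratic discrepancy D plus a Tikhonov-type regularizer R(q), assuming a linearized forward operator A; the operator, noise level, step size and regularization weight are illustrative choices.

import numpy as np

# Minimal sketch of problem (1): q* = argmin_q D(y, y~(q)) + R(q), with a
# linearized forward model y~(q) = A @ q, quadratic data fidelity and a
# Tikhonov regularizer R(q) = lam * ||q||^2. A, y and lam are illustrative.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 200))            # ill-posed: fewer measurements than unknowns
q_true = np.zeros(200); q_true[90:110] = 1.0
y = A @ q_true + 0.01 * rng.normal(size=50)

lam, step = 1e-2, 1e-3
q = np.zeros(200)
for _ in range(5000):
    residual = A @ q - y                  # discrepancy term D(y, y~)
    grad = A.T @ residual + lam * q       # gradient of fidelity + regularizer
    q -= step * grad

print("relative error:", np.linalg.norm(q - q_true) / np.linalg.norm(q_true))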

Keywords: inverse problem in tomography, deep learning, diffuse optical tomography, regularization

Procedia PDF Downloads 58
140 Supplier Carbon Footprint Methodology Development for Automotive Original Equipment Manufacturers

Authors: Nur A. Ɩzdemir, Sude Erkin, Hatice K. GĆ¼ney, Cemre S. Atılgan, Enes Huylu, HĆ¼seyin Y. Altıntaş, Aysemin Top, Ɩzak Durmuş

Abstract:

Carbon emissions produced during a product's life cycle, from the extraction of raw materials up to waste disposal and market consumption activities, are major contributors to global warming. In light of the science-based targets (SBT) leading the way to a zero-carbon economy for the sustainable growth of companies, carbon footprint reporting of purchased goods has become critical for identifying hotspots and best practices for emission reduction opportunities. In line with Ford Otosan's corporate sustainability strategy, research was conducted to evaluate the carbon footprint of purchased products in accordance with Scope 3 of the Greenhouse Gas Protocol (GHG). The purpose of this paper is to develop a systematic and transparent methodology to calculate the carbon footprint of products produced by automotive OEMs (Original Equipment Manufacturers) within the context of automotive supply chain management. To begin with, primary material data corresponding to the company's three distinct vehicle types, the Light Commercial Vehicle (Courier), the Medium Commercial Vehicle (Transit and Transit Custom) and the Heavy Commercial Vehicle (F-MAX), were collected through IMDS (International Material Data System). The obtained material data were classified as metals, plastics, liquids, electronics and others to gain insight into the overall material distribution of the produced vehicles and were matched to the SimaPro Ecoinvent 3 database, which is one of the most extensive databases for modelling material data related to the product life cycle. The product life cycle analysis was carried out within the framework of the ISO 14040-14044 standards by addressing their requirements and procedures. A comprehensive literature review and cooperation with suppliers were undertaken to identify the production methods of parts used in the vehicles and to find out the amount of scrap generated during part production. Cumulative weight and material information, together with the related production processes of the components, were listed and multiplied by current sales figures. The results of the study provide a model of the carbon footprint of products and processes based on a scientific approach to drive sustainable growth by setting straightforward, science-based emission reduction targets. Hence, this study aims to identify the hotspots and, correspondingly, to show how carbon footprint estimates can be integrated into the company's supply chain management by defining suitable actions in line with climate science. According to the emission values arising from the production phase, including raw material extraction and material processing, for the Ford Otosan vehicles studied here, GHG emissions from the production of metals used for the HCV, MCV and LCV account for more than half of the carbon footprint of a vehicle's production. Correspondingly, aluminum and steel have the largest share among all material types; achieving carbon neutrality in the steel and aluminum industries is of great significance to the world and will also have an immense impact on the automobile industry. A strategic product sustainability plan that includes the use of secondary materials, conversion to green energy and low-energy process design is required to reduce the emissions of steel, aluminum and plastics, given the projected increase in total volume by 2030.
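The aggregation step described above (part mass multiplied by a material emission factor and by sales volume, grouped by material class) can be sketched as follows; the column names, emission factors and figures are illustrative placeholders, not Ford Otosan or ecoinvent data.

import pandas as pd

# Minimal sketch of the cradle-to-gate aggregation described above.
parts = pd.DataFrame({
    "vehicle":        ["LCV", "LCV", "HCV"],
    "material_class": ["steel", "plastic", "aluminum"],
    "mass_kg":        [310.0, 45.0, 120.0],   # cumulative part weight per vehicle
    "units_sold":     [50000, 50000, 12000],
})
# kg CO2e per kg of material (placeholder cradle-to-gate factors)
factors = {"steel": 2.1, "aluminum": 11.0, "plastic": 3.0}

parts["kg_co2e"] = parts.apply(
    lambda r: r.mass_kg * factors[r.material_class] * r.units_sold, axis=1)
print(parts.groupby(["vehicle", "material_class"])["kg_co2e"].sum())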

Keywords: automotive, carbon footprint, IMDS, scope 3, SimaPro, sustainability

Procedia PDF Downloads 89
139 Measuring the Embodied Energy of Construction Materials and Their Associated Cost Through Building Information Modelling

Authors: Ahmad Odeh, Ahmad Jrade

Abstract:

Energy assessment is an evidently significant factor when evaluating the sustainability of structures, especially at the early design stage. Today, design practices revolve around the selection of materials that reduce operational energy yet meet their disciplinary needs. Operational energy represents a substantial part of a building's lifecycle energy usage, but the fact remains that embodied energy is an important aspect often unaccounted for in the carbon footprint. At the moment, little or no consideration is given to embodied energy, mainly due to the complexity of its calculation and the various factors involved. The equipment used, the fuel needed and the electricity required for each material vary with location, and thus the embodied energy will differ for each project. Moreover, the method and technique used in manufacturing, transporting and putting a material in place will have a significant influence on its embodied energy. This anomaly has made it difficult to calculate or even benchmark the usage of such energies. This paper presents a model aimed at helping designers select construction materials based on their embodied energy. Moreover, it presents a systematic approach that uses an efficient method of calculation and ultimately provides new insight into construction material selection. The model is developed in a BIM environment targeting the quantification of embodied energy for construction materials through the three main stages of their life: manufacturing, transportation and placement. The model contains three major databases, each of which covers a set of the most commonly used construction materials. The first dataset holds information about the energy required to manufacture each type of material, the second includes information about the energy required for transporting the materials, while the third stores information about the energy required by the tools and cranes needed to place an item in its intended location. The model provides designers with sets of all available construction materials and their associated embodied energies to use for selection during the design process. Through geospatial data and dimensional material analysis, the model will also be able to automatically calculate the distance between the factories and the construction site. To remain within the sustainability criteria set by LEED, a final database is created and used to calculate the overall construction cost based on RSMeans cost data and then automatically recalculate the costs for any modifications. Design criteria that include both operational and embodied energies will cause designers to re-evaluate the current material selection for cost, energy and, most importantly, sustainability.
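The three-database lookup described above can be sketched as follows; every energy coefficient, material name and distance is an illustrative placeholder rather than a value from the model's actual databases.

# Minimal sketch of the three-database lookup described above.
MANUFACTURE_MJ_PER_KG = {"concrete": 1.1, "steel": 20.1, "timber": 7.4}
TRANSPORT_MJ_PER_KG_KM = {"truck": 0.0045}      # energy to move 1 kg over 1 km
PLACEMENT_MJ_PER_KG = {"crane": 0.02, "manual": 0.0}

def embodied_energy(material, mass_kg, distance_km, transport="truck", placement="crane"):
    """Embodied energy (MJ) over manufacturing, transportation and placement."""
    manufacturing = MANUFACTURE_MJ_PER_KG[material] * mass_kg
    transportation = TRANSPORT_MJ_PER_KG_KM[transport] * mass_kg * distance_km
    placing = PLACEMENT_MJ_PER_KG[placement] * mass_kg
    return manufacturing + transportation + placing

# e.g. 10 t of steel hauled 120 km from factory to site and placed by crane
print(embodied_energy("steel", 10_000, 120))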

Keywords: building information modelling, energy, life cycle analysis, sustainability

Procedia PDF Downloads 255
138 On the Semantics and Pragmatics of 'Be Able To': Modality and Actualisation

Authors: BenoĆ®t Leclercq, Ilse Depraetere

Abstract:

The goal of this presentation is to shed new light on the semantics and pragmatics of be able to. It presents the results of a corpus analysis based on data from the BNC (British National Corpus) and discusses these results in light of a specific stance on the semantics-pragmatics interface that takes recent developments into account. Be able to is often discussed in relation to can and could, all of which can be used to express ability. Such an onomasiological approach often results in the identification of usage constraints for each expression. In the case of be able to, it is the formal properties of the modal expression (unlike can and could, be able to has non-finite forms) that are in the foreground, and the modal expression is described as the verb that conveys future ability. Be able to is also argued to express actualised ability in the past (I was able to/could open the door). This presentation aims to provide a more accurate pragmatic-semantic profile of be able to, one that is based on extensive data analysis and embedded in a very explicit view of the semantics-pragmatics interface. A random sample of 3000 examples (1000 for each modal verb) extracted from the BNC was analysed to address the following issues. First, the challenge is to identify the exact semantic range of be able to. The results show that, contrary to general assumption, be able to does not only express ability but shares most of the root meanings usually associated with the possibility modals can and could. The data reveal that what is called opportunity is, in fact, the most frequent meaning of be able to. Second, attention will be given to the notion of actualisation. It is commonly argued that be able to is the preferred form when the residue actualises: (1) The only reason he was able to do that was because of the restriction (BNC, spoken) (2) It is only through my imaginative shuffling of the aces that we are able to stay ahead of the pack. (BNC, written) Although this notion has been studied in detail within formal semantic approaches, empirical data is crucially lacking, and it is unclear whether actualisation constitutes a conventional (and distinguishing) property of be able to. The empirical analysis provides solid evidence that actualisation is indeed a conventional feature of the modal. Furthermore, the dataset reveals that be able to expresses actualised 'opportunities' and not actualised 'abilities'. In the final part of this paper, attention will be given to the theoretical implications of the empirical findings, and in particular to the following paradox: how can the same expression encode both modal meaning (non-factual) and actualisation (factual)? It will be argued that this largely depends on one's conception of the semantics-pragmatics interface, and that this need not be an issue when actualisation (unlike modality) is analysed as a generalised conversational implicature and thus is considered part of the conventional pragmatic layer of be able to.

Keywords: actualisation, modality, pragmatics, semantics

Procedia PDF Downloads 109
137 Using Mathematical Models to Predict the Academic Performance of Students from Initial Courses in Engineering School

Authors: MartĆ­n Pratto Burgos

Abstract:

The Engineering School of the University of the Republic in Uruguay has offered an Introductory Mathematical Course since the second semester of 2019. This course has been designed to assist students in preparing themselves for the math courses that are essential for engineering degrees, namely Math1, Math2 and Math3 in this research. The research proposes to build a model that can accurately predict a student's activity and academic progress based on their performance in the three essential mathematical courses. Additionally, there is a need for a model that can forecast the influence of the Introductory Mathematical Course on approval of the three essential courses during the first academic year. The techniques used are Principal Component Analysis and predictive modelling using the Generalised Linear Model. The dataset includes information from 5135 engineering students and 12 different characteristics based on activity and course performance. Two models are created in the R programming language for data that follow a binomial distribution. Model 1 retains variables whose p-values are less than 0.05, and Model 2 uses the stepAIC function to remove variables and obtain the lowest AIC score. After using Principal Component Analysis, the main components represented on the y-axis are the approval of the Introductory Mathematical Course, and on the x-axis the approval of the Math1 and Math2 courses as well as student activity three years after taking the Introductory Mathematical Course. Model 2, which considered the student's activity, performed the best with an AUC of 0.81 and an accuracy of 84%. According to Model 2, a student's engagement in school activities will continue for three years after the approval of the Introductory Mathematical Course, because they have successfully completed the Math1 and Math2 courses; passing the Math3 course does not have any effect on the student's activity. Concerning academic progress, the best fit is Model 1, with an AUC of 0.56 and an accuracy rate of 91%. The model indicates that if a student passes the three first-year courses, they will progress according to the timeline set by the curriculum. Both models show that the Introductory Mathematical Course does not directly affect the student's activity and academic progress. The best model to explain the impact of the Introductory Mathematical Course on the three first-year courses was Model 1, with an AUC of 0.76 and 98% accuracy. The model shows that if students pass the Introductory Mathematical Course, it will help them to pass the Math1 and Math2 courses without affecting their performance in the Math3 course. Combining the three predictive models, if students pass the Math1 and Math2 courses, they will stay active for three years after taking the Introductory Mathematical Course and will continue following the recommended engineering curriculum. Additionally, the Introductory Mathematical Course helps students to pass Math1 and Math2 when they start Engineering School. The models obtained in the research do not consider the time students took to pass the three math courses, but they can successfully assess courses in the university curriculum.
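A minimal sketch of the binomial-GLM step described above (the study itself works in R with the stepAIC function); the synthetic predictors stand in for the 12 activity and course-performance characteristics, and the coefficients are arbitrary.

import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

# Fit a binomial (logistic) GLM and report its AUC, as in the workflow above.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))                        # e.g. intro-course grade, Math1, Math2
logits = 1.2 * X[:, 0] + 0.8 * X[:, 1] - 0.1 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logits)))       # 1 = student stays active

model = sm.GLM(y, sm.add_constant(X), family=sm.families.Binomial()).fit()
print(model.summary())
print("AUC:", roc_auc_score(y, model.predict(sm.add_constant(X))))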

Keywords: machine-learning, engineering, university, education, computational models

Procedia PDF Downloads 64
136 Ethnic Xenophobia as Symbolic Politics: An Explanation of Anti-Migrant Activity from Brussels to Beirut

Authors: Annamarie Rannou, Horace Bartilow

Abstract:

Global concerns about xenophobic activity are on the rise across developed and developing countries. And yet, social science scholarship has almost exclusively examined xenophobia as a prejudice of advanced western nations. This research argues that the fields of study related to xenophobia must be re-conceptualized within a framework of ethnicity in order to level the playing field for cross-regional inquiry. This study develops a new concept of ethnic xenophobia and integrates existing explanations of anti-migrant expression into theories of ethnic threat. We argue specifically that political elites convert economic, political, and social threats at the national level into ethnic xenophobic activity in order to gain or maintain political advantage among their native selectorate. We expand on Stuart Kaufman's theory of symbolic politics to underscore the methods of mobilization used against migrants and the power of elite discourse in moments of national crises. An original dataset is used to examine over 35,000 cases of ethnic xenophobic activity targeting refugees. Wordscores software is used to develop a unique measure of anti-migrant elite rhetoric which captures the symbolic discourse of elites in their mobilization of ethnic xenophobic activism. We use a Structural Equation Model (SEM) to test the causal pathways of the theory across seventy-two developed and developing countries from 1990 to 2016. A framework of Most Different Systems Design (MDSD) is also applied to two pairs of developed-developing country cases, including Kenya and the Netherlands and Lebanon and the United States. This study sheds tremendous light on an underrepresented area of comparative research in migration studies. It shows that the causal elements of anti-migrant activity are far more similar than existing research suggests, which has major implications for policy makers, practitioners, and academics in fields of migration protection and advocacy. It speaks directly to the mobilization of myths surrounding refugees, in particular, and the nationalization of narratives of migration that may be neutralized by the development of deeper associational relationships between natives and migrants.

Keywords: refugees, ethnicity, symbolic politics, elites, migration, comparative politics

Procedia PDF Downloads 130
135 Construction of Graph Signal Modulations via Graph Fourier Transform and Its Applications

Authors: Xianwei Zheng, Yuan Yan Tang

Abstract:

The classical windowed Fourier transform has been widely used in signal processing, image processing, machine learning and pattern recognition. The related Gabor transform is powerful enough to capture the texture information of any given dataset. Recently, researchers in the emerging field of graph signal processing have devoted themselves to developing a theory for handling so-called graph signals. Within this developing theory, the windowed graph Fourier transform has been constructed to establish a time-frequency analysis framework for graph signals. The windowed graph Fourier transform is defined using the translation and modulation operators of graph signals, following calculations similar to those of the classical windowed Fourier transform. Specifically, the translation and modulation operators of graph signals are defined using the Laplacian eigenvectors as follows. For a given graph signal, its translation is defined in a manner similar to its definition in classical signal processing: the classical translation operator can be expressed using the Fourier atoms, and graph signal translation is defined analogously using the Laplacian eigenvectors. The modulation of a graph signal can also be established using the Laplacian eigenvectors. The windowed graph Fourier transform based on these two operators has been applied to obtain time-frequency representations of graph signals. Fundamentally, this modulation operator is defined similarly to classical modulation by multiplying a graph signal with the entries of each Fourier atom. However, a single Laplacian eigenvector entry cannot play the same role as a Fourier atom, and this definition ignores the relationship between the translation and modulation operators. In this paper, a new definition of the modulation operator is proposed and thus another time-frequency framework for graph signals is constructed. The relationship between the translation and modulation operations can be established through the Fourier transform: for any signal, the Fourier transform of its translation is the modulation of its Fourier transform. Thus, the modulation of any signal can be defined as the inverse Fourier transform of the translation of its Fourier transform. Similarly, the graph modulation of any graph signal can therefore be defined as the inverse graph Fourier transform of the translation of its graph Fourier transform. This novel definition of the graph modulation operator establishes a relationship between the translation and modulation operations. The new modulation operation and the original translation operation are applied to construct a new framework for graph signal time-frequency analysis. Furthermore, a windowed graph Fourier frame theory is developed. Necessary and sufficient conditions for constructing windowed graph Fourier frames, tight frames and dual frames are presented in this paper. The novel graph signal time-frequency analysis framework is applied to signals defined on well-known graphs, e.g., the Minnesota road graph and random graphs. Experimental results show that the novel framework captures new features of graph signals.
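A minimal numpy sketch of the construction described above: the graph Fourier transform via Laplacian eigenvectors, the usual localization-based graph translation, and the proposed modulation as the inverse graph Fourier transform of the translated graph Fourier transform. The ring graph and test signal are toy stand-ins, not the paper's experimental setup.

import numpy as np

N = 16
A = np.zeros((N, N))
for i in range(N):                       # ring graph adjacency
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1
L = np.diag(A.sum(1)) - A                # combinatorial graph Laplacian
lam, U = np.linalg.eigh(L)               # eigenvalues / eigenvectors (Fourier basis)

gft = lambda f: U.T @ f                  # graph Fourier transform
igft = lambda fh: U @ fh                 # inverse graph Fourier transform

def translate(f, i):
    """Localize (translate) signal f around vertex i."""
    return np.sqrt(N) * (U @ (gft(f) * U[i, :]))

def modulate(f, k):
    """Proposed modulation: inverse GFT of the translation of the GFT of f."""
    return igft(translate(gft(f), k))

f = igft(np.exp(-lam))                   # a smooth test signal (spectral decay)
print(modulate(f, 3))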

Keywords: graph signals, windowed graph Fourier transform, windowed graph Fourier frames, vertex frequency analysis

Procedia PDF Downloads 322
134 Assessing the Influence of Station Density on Geostatistical Prediction of Groundwater Levels in a Semi-arid Watershed of Karnataka

Authors: Sakshi Dhumale, Madhushree C., Amba Shetty

Abstract:

The effect of station density on the geostatistical prediction of groundwater levels is of critical importance to ensure accurate and reliable predictions. Monitoring station density directly impacts the accuracy and reliability of geostatistical predictions by influencing the model's ability to capture localized variations and small-scale features in groundwater levels. This is particularly crucial in regions with complex hydrogeological conditions and significant spatial heterogeneity. Insufficient station density can result in larger prediction uncertainties, as the model may struggle to adequately represent the spatial variability and correlation patterns of the data. On the other hand, an optimal distribution of monitoring stations enables effective coverage of the study area and captures the spatial variability of groundwater levels more comprehensively. In this study, we investigate the effect of station density on the predictive performance of groundwater levels using the geostatistical technique of Ordinary Kriging. The research utilizes groundwater level data collected from 121 observation wells within the semi-arid Berambadi watershed, gathered over a six-year period (2010-2015) from the Indian Institute of Science (IISc), Bengaluru. The dataset is partitioned into seven subsets representing varying sampling densities, ranging from 15% (12 wells) to 100% (121 wells) of the total well network. The results obtained from different monitoring networks are compared against the existing groundwater monitoring network established by the Central Ground Water Board (CGWB). The findings of this study demonstrate that higher station densities significantly enhance the accuracy of geostatistical predictions for groundwater levels. The increased number of monitoring stations enables improved interpolation accuracy and captures finer-scale variations in groundwater levels. These results shed light on the relationship between station density and the geostatistical prediction of groundwater levels, emphasizing the importance of appropriate station densities to ensure accurate and reliable predictions. The insights gained from this study have practical implications for designing and optimizing monitoring networks, facilitating effective groundwater level assessments, and enabling sustainable management of groundwater resources.
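A minimal sketch of ordinary kriging and of the station-density comparison described above; the exponential variogram, its parameters and the synthetic wells are illustrative, not the Berambadi monitoring data.

import numpy as np

rng = np.random.default_rng(0)

def variogram(h, sill=1.0, range_=5.0, nugget=0.05):
    """Exponential semivariogram (illustrative parameters)."""
    return nugget + sill * (1 - np.exp(-h / range_))

def ok_predict(xy_obs, z_obs, xy_new):
    """Ordinary kriging prediction at a single location xy_new."""
    n = len(z_obs)
    d = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=-1)
    gamma = variogram(d)
    np.fill_diagonal(gamma, 0.0)
    K = np.empty((n + 1, n + 1))
    K[:n, :n] = gamma
    K[n, :], K[:, n] = 1.0, 1.0            # unbiasedness constraint
    K[n, n] = 0.0
    k = np.append(variogram(np.linalg.norm(xy_obs - xy_new, axis=1)), 1.0)
    w = np.linalg.solve(K, k)
    return w[:n] @ z_obs

# compare a sparse (15%) and a full (100%) network at one unmonitored point
xy = rng.uniform(0, 20, size=(121, 2))                    # 121 observation wells
z = np.sin(xy[:, 0] / 4) + 0.1 * rng.normal(size=121)     # synthetic groundwater levels
target = np.array([10.0, 10.0])
for frac in (0.15, 1.0):
    idx = rng.choice(121, int(frac * 121), replace=False)
    print(f"{int(frac * 100):3d}% of wells ->", ok_predict(xy[idx], z[idx], target))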

Keywords: station density, geostatistical prediction, groundwater levels, monitoring networks, interpolation accuracy, spatial variability

Procedia PDF Downloads 34
133 Optimization for Autonomous Robotic Construction by Visual Guidance through Machine Learning

Authors: Yangzhi Li

Abstract:

Network-based transfer of information and performance customization are now viable methods of digital industrial production in the era of Industry 4.0. Robot platforms and network platforms have grown more important in digital design and construction. The pressing need for novel building techniques is driven by the growing labor scarcity problem and increased awareness of construction safety. Robotic approaches in construction research are regarded as an extension of operational and production tools. Several technologies related to autonomous robot recognition, including high-performance computing, physical system modeling, extensive sensor coordination and deep learning on large datasets, have not yet been fully explored in intelligent construction, and relevant transdisciplinary theory and practice research still has specific gaps. Optimizing high-performance computing and autonomous visual guidance and recognition technologies improves the robot's grasp of the scene and its capacity for autonomous operation. Intelligent vision guidance for industrial robots faces a serious camera-calibration issue, and the use of intelligent visual guidance and identification technologies in industrial production imposes strict accuracy requirements; visual recognition systems therefore face precision challenges that directly impact the effectiveness and quality of industrial production, necessitating stronger research on positioning precision in recognition technology. To best facilitate the handling of complicated components, an approach for the visual recognition of parts utilizing machine learning algorithms is proposed. This study identifies the position of target components by detecting information at the boundary and corners of a dense point cloud and determining the aspect ratio in accordance with the guidelines for the modularization of building components. To collect and use components, operational processing systems assign them to the same coordinate system based on their locations and postures. Inclination detection from the RGB image and verification from the depth image are used to determine a component's present posture. Finally, a virtual environment model for the robot's obstacle-avoidance route is constructed using the point cloud information.
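A minimal sketch of the boundary, corner and aspect-ratio step described above; the random point cloud stands in for a scanned building component, and the axis-aligned bounding box is a simplifying assumption.

import numpy as np

rng = np.random.default_rng(2)
cloud = rng.uniform([0, 0, 0], [1.2, 0.3, 0.1], size=(5000, 3))  # a plank-like part

mins, maxs = cloud.min(axis=0), cloud.max(axis=0)    # axis-aligned bounding box
extent = maxs - mins
aspect_ratio = extent.max() / extent.min()           # used to match modular part types

# eight corner points of the bounding box, later mapped into the robot's frame
corners = np.array([[x, y, z] for x in (mins[0], maxs[0])
                               for y in (mins[1], maxs[1])
                               for z in (mins[2], maxs[2])])
print("extent:", extent.round(3), "aspect ratio:", round(aspect_ratio, 2))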

Keywords: robotic construction, robotic assembly, visual guidance, machine learning

Procedia PDF Downloads 67
132 Design and Implementation of Generative Models for Odor Classification Using Electronic Nose

Authors: Kumar Shashvat, Amol P. Bhondekar

Abstract:

Among the five senses, odor is the most evocative and the least understood. Odor testing has long seemed mysterious, and odor data have been treated as folklore by most practitioners. The problem of recognizing and classifying odors is therefore important to solve: the ability to smell an artifact and predict whether it is still of use or has become undesirable for consumption is worth capturing in a model. The general industrial standard for this classification is color based; however, odor can be a better classifier than color, and incorporating it into a machine would be highly useful. Various discriminative approaches have been used for cataloguing the odors of peas, trees and cashews. Discriminative approaches offer good predictive performance and have been widely used in many applications, but they are incapable of making effective use of unlabeled information. In such scenarios, generative approaches have better applicability, as they are able to handle problems such as settings where the variability in the range of possible input vectors is enormous. Generative models are integrated into machine learning either to model data directly or as an intermediate step in forming a probability density function. The algorithms used here for classification of the odor of cashews are Linear Discriminant Analysis and the Naive Bayes classifier. Linear Discriminant Analysis is a method used in data classification, pattern recognition and machine learning to discover a linear combination of features that characterizes or separates two or more classes of objects or events. The Naive Bayes algorithm is a classification approach based on Bayes' rule and a set of conditional independence assumptions. Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem. The main advantage of using generative models is that they make stronger assumptions about the data, specifically about the distribution of the predictors given the response variables. The electronic instrument used for artificial odor sensing and classification is an electronic nose, a device designed to imitate the human sense of smell by providing an analysis of individual chemicals or chemical mixtures. The experimental results have been evaluated in terms of the performance measures accuracy, precision and recall. They show that the overall performance of Linear Discriminant Analysis was better than that of the Naive Bayes classifier on the cashew dataset.
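A minimal sketch comparing the two classifiers named above; the synthetic features stand in for electronic-nose sensor responses for cashew samples, so the reported scores are not those of the study.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for electronic-nose sensor readings and class labels.
X, y = make_classification(n_samples=400, n_features=8, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("LDA", LinearDiscriminantAnalysis()), ("Naive Bayes", GaussianNB())]:
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    print(name,
          "accuracy:", round(accuracy_score(y_te, pred), 3),
          "precision:", round(precision_score(y_te, pred), 3),
          "recall:", round(recall_score(y_te, pred), 3))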

Keywords: odor classification, generative models, naive bayes, linear discriminant analysis

Procedia PDF Downloads 363
131 Using Arellano-Bover/Blundell-Bond Estimator in Dynamic Panel Data Analysis ā€“ Case of Finnish Housing Price Dynamics

Authors: Janne Engblom, Elias Oikarinen

Abstract:

A panel dataset is one that follows a given sample of individuals over time and thus provides multiple observations on each individual in the sample. Panel data models include a variety of fixed and random effects models, which form a wide range of linear models. A special class of panel data models is dynamic in nature. A complication regarding a dynamic panel data model that includes the lagged dependent variable is endogeneity bias of the estimates. Several approaches have been developed to account for this problem. In this paper, the panel models were estimated using the Arellano-Bover/Blundell-Bond generalized method of moments (GMM) estimator, which is an extension of the Arellano-Bond model in which past values, and different transformations of past values, of the potentially problematic independent variable are used as instruments together with other instrumental variables. The Arellano-Bover/Blundell-Bond estimator augments Arellano-Bond by making the additional assumption that first differences of the instrument variables are uncorrelated with the fixed effects. This allows the introduction of more instruments and can dramatically improve efficiency. It builds a system of two equations (the original equation and the transformed one) and is also known as system GMM. In this study, Finnish housing price dynamics were examined empirically using the Arellano-Bover/Blundell-Bond estimation technique together with OLS. The aim of the analysis was to provide a comparison between conventional fixed-effects panel data models and dynamic panel data models. The Arellano-Bover/Blundell-Bond estimator is suitable for this analysis for a number of reasons: it is a general estimator designed for situations with 1) a linear functional relationship; 2) one left-hand-side variable that is dynamic, depending on its own past realizations; 3) independent variables that are not strictly exogenous, meaning they are correlated with past and possibly current realizations of the error; 4) fixed individual effects; and 5) heteroskedasticity and autocorrelation within individuals but not across them. Based on data for 14 Finnish cities over 1988-2012, the differences in short-run housing price dynamics estimates were considerable when different models and instruments were used. In particular, the use of different instrumental variables caused the model estimates and their statistical significance to vary. This was particularly clear when comparing OLS estimates with those of different dynamic panel data models. Estimates provided by dynamic panel data models were more in line with the theory of housing price dynamics.
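The estimator's logic can be summarized by the following sketch in generic notation (not the authors' exact housing-price specification): the level and first-differenced equations together with the moment conditions that system GMM exploits.

% Sketch of the dynamic panel model and the system GMM moment conditions
% (generic notation, illustrative only).
\begin{align*}
  y_{it} &= \alpha\, y_{i,t-1} + \boldsymbol{\beta}'\mathbf{x}_{it} + \mu_i + \varepsilon_{it}
    && \text{(level equation with fixed effect } \mu_i\text{)}\\
  \Delta y_{it} &= \alpha\, \Delta y_{i,t-1} + \boldsymbol{\beta}'\Delta\mathbf{x}_{it} + \Delta\varepsilon_{it}
    && \text{(first-differenced equation)}\\
  \mathbb{E}\left[\, y_{i,t-s}\, \Delta\varepsilon_{it} \right] &= 0, \quad s \ge 2
    && \text{(Arellano--Bond instruments for the differences)}\\
  \mathbb{E}\left[\, \Delta y_{i,t-1}\,(\mu_i + \varepsilon_{it}) \right] &= 0
    && \text{(additional Blundell--Bond instruments for the levels)}
\end{align*}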

Keywords: dynamic model, fixed effects, panel data, price dynamics

Procedia PDF Downloads 1456
130 Assessing Online Learning Paths in an Learning Management Systems Using a Data Mining and Machine Learning Approach

Authors: Alvaro Figueira, Bruno Cabral

Abstract:

Nowadays, students are used to being assessed through an online platform. Educators have moved on from a period in which they endured the transition from paper to digital. The use of a diversified set of question types that range from quizzes to open questions is currently common in most university courses. In many courses today, the evaluation methodology also fosters students' online participation in forums, the download and upload of modified files, or even participation in group activities. At the same time, new pedagogy theories that promote the active participation of students in the learning process, and the systematic use of problem-based learning, are being adopted using an eLearning system for that purpose. However, although these activities can generate a lot of feedback for students, it is usually restricted to the assessment of well-defined online tasks. In this article, we propose an automatic system that informs students of abnormal deviations from a 'correct' learning path in the course. Our approach is based on the premise that obtaining this information earlier in the semester may give students and educators an opportunity to resolve a potential problem with the student's current online behaviour in the course. Our goal is to prevent situations that have a significant probability of leading to a poor grade and, eventually, to failure. In the major learning management systems (LMS) currently available, the interaction between the students and the system itself is registered in log files in the form of records that mark the beginning of actions performed by the user. Our proposed system uses that logged information to derive new information: the time each student spends on each activity, the time and order of the resources used by the student and, finally, the online resource usage pattern. Then, using the grades assigned to the students in previous years, we built a learning dataset that is used to feed a machine learning meta classifier. The produced classification model is then used to predict the grade a learning path is heading towards in the current year. This approach serves not only the teacher but also the student, who receives automatic feedback on her current situation with past years as a reference. Our system can be applied to online courses that integrate the use of an online platform that stores user actions in a log file and that has access to other students' evaluations. The system is based on a data mining process on the log files and on a self-feedback machine learning algorithm that works paired with the Moodle LMS.
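A minimal sketch of the log-to-features-to-prediction pipeline described above; the log layout, feature names and grades are illustrative and do not reflect the actual Moodle log schema.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy log: one row per user action (a placeholder for LMS log records).
log = pd.DataFrame({
    "student": [1, 1, 1, 2, 2],
    "resource": ["quiz1", "forum", "quiz2", "quiz1", "quiz2"],
    "timestamp": pd.to_datetime(
        ["2024-02-01 10:00", "2024-02-01 10:20", "2024-02-03 09:00",
         "2024-02-02 11:00", "2024-02-10 15:00"]),
})
log = log.sort_values(["student", "timestamp"])
log["time_spent_min"] = (log.groupby("student")["timestamp"].diff()
                            .dt.total_seconds().div(60).fillna(0))

# one row of features per student: total time online and number of resources used
features = log.groupby("student").agg(total_min=("time_spent_min", "sum"),
                                      n_resources=("resource", "nunique"))

past_grades = pd.Series({1: "pass", 2: "fail"})        # outcomes from previous years
clf = RandomForestClassifier(random_state=0).fit(features, past_grades.loc[features.index])
print(clf.predict(features))                            # predicted outcome for current paths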

Keywords: data mining, e-learning, grade prediction, machine learning, student learning path

Procedia PDF Downloads 107
129 Integrating Data Mining with Case-Based Reasoning for Diagnosing Sorghum Anthracnose

Authors: Mariamawit T. Belete

Abstract:

Cereal production and marketing are the means of livelihood for millions of households in Ethiopia. However, cereal production is constrained by technical and socio-economic factors. Among the technical factors, cereal crop diseases are major contributors to low yields. The aim of this research is to integrate data mining with a knowledge-based system for sorghum anthracnose disease diagnosis that assists agricultural experts and development agents in making timely decisions. The anthracnose diagnosis system gathers information from the Melkassa agricultural research center and attempts to score anthracnose on a severity scale. The empirical research is designed around data exploration, modeling, and confirmatory procedures for hypothesis testing and prediction in order to draw sound conclusions. WEKA (Waikato Environment for Knowledge Analysis) was employed for the modeling. Knowledge-based systems have adopted a variety of approaches based on the knowledge representation method; case-based reasoning (CBR) is one of the popular approaches used in knowledge-based systems. CBR is a problem-solving strategy that uses previous cases to solve new problems. The system utilizes hidden knowledge extracted by employing clustering algorithms, specifically K-means clustering, on a sampled anthracnose dataset. Clustered cases with their centroid values are mapped to jCOLIBRI, and the integrator application is then created using NetBeans with JDK 8.0.2. The important parts of a case-based reasoning model include case retrieval (the similarity-measuring stage), reuse (which allows the domain expert to adapt the retrieved case solution to suit the current case), revision (to test the solution), and retention (to store the confirmed solution in the case base for future use). Evaluation of the system was done for both system performance and user acceptance. For testing the prototype, seven test cases were used. Experimental results show that the system achieves average precision and recall values of 70% and 83%, respectively. User acceptance testing was also performed with five domain experts, and an average acceptance of 83% was achieved. Although the results of this study are promising, further investigation of hybrid approaches such as rule-based reasoning, and of a pictorial retrieval process, is recommended.
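A minimal sketch of the clustering-plus-retrieval idea described above; the feature matrix stands in for sampled anthracnose records rather than the Melkassa data, and the similarity measure is a simple Euclidean distance.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
cases = rng.uniform(0, 1, size=(60, 4))      # e.g. symptom scores, humidity, variety, stage
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(cases)

def retrieve(new_case):
    """Return indices of stored cases in the cluster most similar to the new case."""
    cluster = km.predict(new_case.reshape(1, -1))[0]
    members = np.where(km.labels_ == cluster)[0]
    # rank cluster members by Euclidean similarity to the query (CBR retrieval step)
    order = np.argsort(np.linalg.norm(cases[members] - new_case, axis=1))
    return members[order][:3]

print(retrieve(rng.uniform(0, 1, size=4)))   # top-3 most similar past cases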

Keywords: sorghum anthracnose, data mining, case based reasoning, integration

Procedia PDF Downloads 67
128 Dividend Policy in Family Controlling Firms from a Governance Perspective: Empirical Evidence in Thailand

Authors: Tanapond S.

Abstract:

Typically, most controlling firms are family firms, which are widespread and important for economic growth, particularly in the Asia-Pacific region. The unique characteristics of the controlling families tend to play an important role in determining corporate policies such as dividend policy. Given the complexity of the family business phenomenon, the empirical evidence has been unclear on how the families behind business groups influence dividend policy in Asian markets, where cross-shareholdings and pyramidal structures are prevalent. Dividend policy, as an important determinant of firm value, can also be used to examine the effect of the controlling families behind business groups on strategic decision-making from a governance perspective and in terms of agency problems. The purpose of this paper is to investigate the impact of ownership structure and concentration, which are influential internal corporate governance mechanisms in family firms, on dividend decision-making. Using panel data and constructing a unique dataset of family ownership and control by hand-collecting information on the nonfinancial companies listed on the Stock Exchange of Thailand (SET) between 2000 and 2015, the study finds that family firms with large stakes distribute higher dividends than family firms with small stakes. Family ownership can mitigate the agency problems and the expropriation of minority investors in family firms. To provide insight into the distinction between ownership rights and control rights, this study examines specific firm characteristics, including the degree of concentration of controlling shareholders, by classifying family ownership into different categories. The results show that controlling families with a large deviation between voting rights and cash flow rights have more power and are associated with lower dividend payments. These situations become worse when the second blockholders are also families. To the best knowledge of the researcher, this study is the first to examine the association between family firms' characteristics and dividend policy from a corporate governance perspective in Thailand, an environment with weak investor protection and high ownership concentration. This research also underscores the importance of family control, especially in a context in which family business groups and pyramidal structures are prevalent. As a result, academics and policy makers can develop markets and corporate policies to mitigate agency problems.

Keywords: agency theory, dividend policy, family control, Thailand

Procedia PDF Downloads 266
127 Profiling Risky Code Using Machine Learning

Authors: Zunaira Zaman, David Bohannon

Abstract:

This study explores the application of machine learning (ML) for detecting security vulnerabilities in source code. The research aims to assist organizations with large application portfolios and limited security testing capabilities in prioritizing security activities. ML-based approaches offer benefits such as increased confidence scores, false positives and negatives tuning, and automated feedback. The initial approach using natural language processing techniques to extract features achieved 86% accuracy during the training phase but suffered from overfitting and performed poorly on unseen datasets during testing. To address these issues, the study proposes using the abstract syntax tree (AST) for Java and C++ codebases to capture code semantics and structure and generate path-context representations for each function. The Code2Vec model architecture is used to learn distributed representations of source code snippets for training a machine-learning classifier for vulnerability prediction. The study evaluates the performance of the proposed methodology using two datasets and compares the results with existing approaches. The Devign dataset yielded 60% accuracy in predicting vulnerable code snippets and helped resist overfitting, while the Juliet Test Suite predicted specific vulnerabilities such as OS-Command Injection, Cryptographic, and Cross-Site Scripting vulnerabilities. The Code2Vec model achieved 75% accuracy and a 98% recall rate in predicting OS-Command Injection vulnerabilities. The study concludes that even partial AST representations of source code can be useful for vulnerability prediction. The approach has the potential for automated intelligent analysis of source code, including vulnerability prediction on unseen source code. State-of-the-art models using natural language processing techniques and CNN models with ensemble modelling techniques did not generalize well on unseen data and faced overfitting issues. However, predicting vulnerabilities in source code using machine learning poses challenges such as high dimensionality and complexity of source code, imbalanced datasets, and identifying specific types of vulnerabilities. Future work will address these challenges and expand the scope of the research.
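A minimal sketch of the path-context idea behind Code2Vec, using Python's own ast module as a stand-in (the study parses Java and C++ code); the example snippet and the path construction are illustrative, not the model's actual feature extraction.

import ast
from itertools import combinations

# A path context pairs two terminal AST nodes with the chain of node types
# connecting them through their lowest common ancestor.
SOURCE = "def login(cmd):\n    return os.system('ping ' + cmd)\n"

def terminals(node, trail=()):
    """Yield (terminal_node_repr, ancestor_type_path) pairs."""
    trail = trail + (type(node).__name__,)
    children = list(ast.iter_child_nodes(node))
    if not children:
        yield ast.dump(node), trail
    for child in children:
        yield from terminals(child, trail)

leaves = list(terminals(ast.parse(SOURCE)))
for (a, pa), (b, pb) in list(combinations(leaves, 2))[:3]:
    # length of the common ancestor prefix of the two paths
    shared = next((i for i, (x, y) in enumerate(zip(pa, pb)) if x != y),
                  min(len(pa), len(pb)))
    path = pa[shared:][::-1] + pb[shared:]      # up from a, then down to b
    print((a, path, b))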

Keywords: code embeddings, neural networks, natural language processing, OS command injection, software security, code properties

Procedia PDF Downloads 90
126 Surface Elevation Dynamics Assessment Using Digital Elevation Models, Light Detection and Ranging, GPS and Geospatial Information Science Analysis: Ecosystem Modelling Approach

Authors: Ali K. M. Al-Nasrawi, Uday A. Al-Hamdany, Sarah M. Hamylton, Brian G. Jones, Yasir M. Alyazichi

Abstract:

Surface elevation dynamics have always responded to disturbance regimes. Creating Digital Elevation Models (DEMs) to detect surface dynamics has led to the development of several methods, devices and data clouds. DEMs can provide accurate and quick results cost-efficiently in comparison with traditional geomatics survey techniques. Nowadays, remote sensing datasets have become a primary source for creating DEMs, including LiDAR point clouds processed with GIS analytic tools. However, these data need to be tested for error detection and correction. This paper evaluates various DEMs from different data sources over time for Apple Orchard Island, a coastal site in southeastern Australia, in order to detect surface dynamics. Subsequently, 30 chosen locations were examined in the field to test the error of the DEM surface detection using high-resolution global positioning systems (GPS). Results show significant surface elevation changes on Apple Orchard Island. Accretion occurred on most of the island, while surface elevation loss due to erosion is limited to the northern and southern parts. Concurrently, a differential correction and validation method was applied to identify errors in the dataset. The resultant DEMs demonstrated a small error ratio (≤ 3%) relative to the gathered datasets when compared with the fieldwork survey using RTK-GPS. As modern modelling approaches need to become more effective and accurate, applying several tools to create different DEMs on a multi-temporal scale allows predictions within practical time and cost frames, with more comprehensive coverage and greater accuracy. For the eco-geomorphic context, such insights into ecosystem dynamics at a coastal intertidal system are valuable for assessing the accuracy of the predicted eco-geomorphic risk and for sustainable conservation management. This framework for evaluating historical and current anthropogenic and environmental stressors on coastal surface elevation dynamics could be profitably applied worldwide.
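A minimal sketch of DEM differencing and GPS-based error checking as described above; the two toy elevation grids and checkpoint values are placeholders for the Apple Orchard Island rasters, which would normally be read from files (e.g., with a raster I/O library).

import numpy as np

dem_2010 = np.array([[1.10, 1.05], [0.95, 0.90]])      # elevations (m), epoch 1
dem_2015 = np.array([[1.18, 1.04], [0.99, 0.85]])      # elevations (m), epoch 2

change = dem_2015 - dem_2010                           # positive = accretion, negative = erosion
print("mean elevation change (m):", change.mean().round(3))

# accuracy check against RTK-GPS checkpoints: (row, col, surveyed elevation)
checkpoints = [(0, 0, 1.17), (1, 1, 0.86)]
errors = np.array([dem_2015[r, c] - z for r, c, z in checkpoints])
print("RMSE vs RTK-GPS (m):", np.sqrt((errors ** 2).mean()).round(3))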

Keywords: DEMs, eco-geomorphic-dynamic processes, geospatial information science, remote sensing, surface elevation changes

Procedia PDF Downloads 256
125 In silico Designing of Imidazo [4,5-b] Pyridine as a Probable Lead for Potent Decaprenyl Phosphoryl-β-D-Ribose 2′-Epimerase (DprE1) Inhibitors as Antitubercular Agents

Authors: Jineetkumar Gawad, Chandrakant Bonde

Abstract:

Tuberculosis (TB) is a major worldwide concern whose control has been exacerbated by HIV and by the rise of multidrug-resistant (MDR-TB) and extensively drug-resistant (XDR-TB) strains of Mycobacterium tuberculosis. The interest in newer and faster-acting antitubercular drugs is greater than ever, and searching for potent compounds is both a need and a challenge for researchers. Here, we tried to design a lead for inhibition of the decaprenylphosphoryl-β-D-ribose 2′-epimerase (DprE1) enzyme. Arabinose is an essential constituent of the mycobacterial cell wall. DprE1 is a flavoenzyme that converts decaprenylphosphoryl-D-ribose into decaprenylphosphoryl-2-keto-ribose, which is an intermediate in the biosynthetic pathway of arabinose. Subsequently, DprE2 converts the keto-ribose into decaprenylphosphoryl-D-arabinose. A selection of 23 compounds from the azaindole series was taken for the computational study, and they were drawn using MarvinSketch. Ligands were prepared using the Maestro molecular modeling interface, Schrodinger, v10.5. Common pharmacophore hypotheses were developed by applying dataset thresholds to yield active and inactive sets of compounds; 326 hypotheses were developed. On the basis of survival score, ADRRR (survival score: 5.453) was selected. The selected pharmacophore hypothesis was subjected to virtual screening, resulting in 1000 hits. The hits were prepared and docked with protein 4KW5 (oxidoreductase inhibitor), which was downloaded in .pdb format from the RCSB Protein Data Bank. The protein was prepared using the Protein Preparation Wizard: it was preprocessed and the workspace was analyzed using the OPLS 2005 force field. The Glide grid was generated by picking a single atom in the molecule. Prepared ligands were docked with the prepared protein 4KW5 using Glide docking. After docking, the top five compounds (5223, 5812, 0661, 0662 and 2945) were selected on the basis of Glide score, with Glide docking scores of -8.928, -8.534, -8.412, -8.411 and -8.351, respectively. Interactions between ligand and protein were observed, specifically with HIS 132, LYS 418, TRY 230 and ASN 385. Pi-pi stacking was observed in a few compounds with the basic imidazo[4,5-b]pyridine ring. The parent compounds had a basic azaindole ring, but after Glide docking we obtained compounds with imidazo[4,5-b]pyridine as the basic ring. That might be the new lead in the drug discovery process.

Keywords: DprE1 inhibitors, in silico drug designing, imidazo[4,5-b]pyridine, lead, tuberculosis

Procedia PDF Downloads 132
124 Tumor Size and Lymph Node Metastasis Detection in Colon Cancer Patients Using MR Images

Authors: Mohammadreza Hedyehzadeh, Mahdi Yousefi

Abstract:

Colon cancer is one of the most common cancers, and its prevalence is predicted to increase due to poor eating habits. Nowadays, because of people's busy lifestyles, the consumption of fast food is increasing; therefore, the diagnosis and treatment of this disease are of particular importance. To determine the best treatment approach for each specific colon cancer patient, the oncologist needs to know the stage of the tumor. The most common method to determine the tumor stage is the TNM staging system. In this system, M indicates the presence of metastasis, N indicates the extent of spread to the lymph nodes, and T indicates the size of the tumor. It is clear that in order to determine all three of these parameters, an imaging method must be used, and the gold-standard imaging protocols for this purpose are CT and PET/CT. In CT imaging, the use of X-rays means that the cancer risk and the absorbed dose of the patient are high, while for PET/CT there is a lack of access to the device due to its high cost. Therefore, in this study, we aimed to estimate the tumor size and the extent of its spread to the lymph nodes using MR images. More than 1300 MR images were collected from the TCIA portal; in the first step (pre-processing), histogram equalization was applied to improve image quality and the images were resized to a common size. Two expert radiologists, who have worked on colon cancer cases for more than 21 years, segmented the images and extracted the tumor regions. The next step is feature extraction from the segmented images, followed by classification of the data into three classes: T0N0, T3N1 and T3N2. In this article, the VGG-16 convolutional neural network has been used to perform both of the above-mentioned tasks, i.e., feature extraction and classification. This network has 13 convolution layers for feature extraction and three fully connected layers with the softmax activation function for classification. In order to validate the proposed method, a 10-fold cross-validation scheme was used in which the data were randomly divided into three parts: training (70% of the data), validation (10% of the data) and the rest for testing. This was repeated 10 times; each time, the accuracy, sensitivity and specificity of the model were calculated, and the average of the ten repetitions is reported as the result. The accuracy, specificity and sensitivity of the proposed method on the testing dataset were 89.09%, 95.8% and 96.4%, respectively. Compared with previous studies, the use of a safe imaging technique (MRI) and the avoidance of predefined hand-crafted imaging features to determine the stage of colon cancer patients are among the advantages of this study.
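A minimal sketch of the VGG-16 setup described above, reusing the convolutional layers as a feature extractor with a small softmax head for the three stage classes; the input size, head layers and training settings are illustrative, not the study's exact configuration.

import tensorflow as tf

# VGG-16 convolutional base as a fixed feature extractor, plus a softmax head
# for the three classes (T0N0, T3N1, T3N2). Pretrained ImageNet weights are an
# illustrative choice here.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),   # T0N0, T3N1, T3N2
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_images, train_labels, validation_data=(val_images, val_labels))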

Keywords: colon cancer, VGG-16, magnetic resonance imaging, tumor size, lymph node metastasis

Procedia PDF Downloads 42
123 Comparison of Parametric and Bayesian Survival Regression Models in Simulated and HIV Patient Antiretroviral Therapy Data: Case Study of Alamata Hospital, North Ethiopia

Authors: Zeytu G. Asfaw, Serkalem K. Abrha, Demisew G. Degefu

Abstract:

Background: HIV/AIDS remains a major public health problem in Ethiopia and heavily affects people of productive and reproductive age. We aimed to compare the performance of parametric survival analysis and Bayesian survival analysis using simulations and a real-data application focused on determining predictors of HIV patient survival. Methods: Parametric survival models based on the exponential, Weibull, log-normal, log-logistic, Gompertz and generalized gamma distributions were considered. A simulation study was carried out with two different prior specifications: informative and noninformative priors. A retrospective cohort study was implemented for HIV-infected patients under Highly Active Antiretroviral Therapy in Alamata General Hospital, North Ethiopia. Results: A total of 320 HIV patients were included in the study, of whom 52.19% were female and 47.81% male. According to Kaplan-Meier survival estimates for the two sex groups, females showed better survival times than their male counterparts. The median survival time of HIV patients was 79 months. During the follow-up period, 89 deaths (27.81%) and 231 censored individuals (72.19%) were registered. The average baseline cluster of differentiation 4 (CD4) cell count for HIV/AIDS patients was 126.01, but after a three-year antiretroviral therapy follow-up the average CD4 cell count was 305.74, which is quite encouraging. Age, functional status, tuberculosis screening, past opportunistic infection, baseline CD4 cell count, World Health Organization clinical stage, sex, marital status, employment status, occupation type and baseline weight were found to be statistically significant factors for longer survival of HIV patients. The standard errors of all covariates in the Bayesian log-normal survival model were smaller than those in the classical model. Hence, Bayesian survival analysis showed better performance than classical parametric survival analysis when subjective data analysis was performed by considering expert opinions and historical knowledge about the parameters. Conclusions: HIV/AIDS patient mortality could be reduced through timely antiretroviral therapy, with special attention to the significant factors. Moreover, the Bayesian log-normal survival model was preferable to the classical log-normal survival model for determining predictors of HIV patient survival.
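A minimal sketch of fitting a classical log-normal survival model with right censoring by maximum likelihood; the simulated covariate plays the role of a predictor such as baseline CD4 count, and this is not the Bayesian fit described above.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(4)
n = 300
x = rng.normal(size=n)                                  # standardized covariate
t = np.exp(3.0 + 0.5 * x + 0.8 * rng.normal(size=n))    # true survival times (months)
c = rng.uniform(20, 120, size=n)                        # administrative censoring times
time = np.minimum(t, c)
event = (t <= c).astype(float)                          # 1 = death observed, 0 = censored

def neg_loglik(params):
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)
    z = (np.log(time) - b0 - b1 * x) / sigma
    # density contribution for events, survival contribution for censored records
    ll = event * (norm.logpdf(z) - np.log(sigma * time)) + (1 - event) * norm.logsf(z)
    return -ll.sum()

fit = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], method="Nelder-Mead")
print("intercept, covariate effect, sigma:", fit.x[0], fit.x[1], np.exp(fit.x[2]))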

Keywords: antiretroviral therapy (ART), Bayesian analysis, HIV, log-normal, parametric survival models

Procedia PDF Downloads 168
122 Computational Characterization of Electronic Charge Transfer in Interfacial Phospholipid-Water Layers

Authors: Samira Baghbanbari, A. B. P. Lever, Payam S. Shabestari, Donald Weaver

Abstract:

Existing signal transmission models, although undoubtedly useful, have proven insufficient to explain the full complexity of information transfer within the central nervous system. The development of transformative models will necessitate a more comprehensive understanding of neuronal lipid membrane electrophysiology. Pursuant to this goal, the role of highly organized interfacial phospholipid-water layers emerges as a promising case study. A series of phospholipids in neural-glial gap junction interfaces, as well as cholesterol molecules, have been computationally modelled using high-performance density functional theory (DFT) calculations. Subsequent 'charge decomposition analysis' calculations have revealed a net transfer of charge from phospholipid orbitals through the organized interfacial water layer before ultimately finding its way to cholesterol acceptor molecules. The specific pathway of charge transfer from phospholipid via water layers towards cholesterol has been mapped in detail. Cholesterol is an essential membrane component that is overrepresented in neuronal membranes as compared to other mammalian cells; given this relative abundance, its apparent role as an electronic acceptor may prove to be a relevant factor in further signal transmission studies of the central nervous system. The timescales over which this electronic charge transfer occurs have also been evaluated by utilizing a system design that systematically increases the number of water molecules separating lipids and cholesterol. Memory loss through hydrogen-bonded networks in water can occur at femtosecond timescales, whereas existing action potential-based models are limited to micro- or nanosecond scales. As such, the development of future models that attempt to explain faster-timescale signal transmission in the central nervous system may benefit from our work, which provides additional information regarding fast-timescale energy transfer mechanisms occurring through interfacial water. The study's dataset includes six distinct phospholipids and a collection of cholesterol molecules. Ten optimized geometric characteristics (features) were employed to conduct binary classification through an artificial neural network (ANN), differentiating cholesterol from the various phospholipids; this stems from our understanding that all lipids within the first group function as electronic charge donors, while cholesterol serves as an electronic charge acceptor.
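A minimal sketch of the binary ANN classification step mentioned above is given below, assuming ten geometric descriptors per molecule. The feature values, network size, and labels are placeholders, not the authors' optimized geometries or architecture.

```python
# Hedged sketch: binary ANN classifier separating cholesterol (acceptor) from
# phospholipids (donors) using ten geometric features; data are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))        # ten optimized geometric features (assumed)
y = rng.integers(0, 2, size=200)      # 1 = cholesterol, 0 = phospholipid

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000,
                                  random_state=0))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```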

Keywords: charge transfer, signal transmission, phospholipids, water layers, ANN

Procedia PDF Downloads 47
121 Influence of Atmospheric Circulation Patterns on Dust Pollution Transport during the Harmattan Period over West Africa

Authors: Ayodeji Oluleye

Abstract:

This study used the Total Ozone Mapping Spectrometer (TOMS) Aerosol Index (AI) and a thirty-year reanalysis dataset (1983-2012) to investigate the influence of atmospheric circulation on dust transport during the Harmattan period over West Africa. The Harmattan dust mobilization and atmospheric circulation pattern were evaluated using a kernel density estimate, which shows the areas where most points are concentrated between the variables. The evolution of the Inter-Tropical Discontinuity (ITD), Sea Surface Temperature (SST) over the Gulf of Guinea, and the North Atlantic Oscillation (NAO) index during the Harmattan period (November-March) was also analyzed, and the average ITD positions, SST, and NAO were examined on a daily basis. Pearson correlation analysis was also employed to assess the effect of atmospheric circulation on Harmattan dust transport. The results show that the departure (increase) of TOMS AI values from the long-term mean (1.64) occurred from around 21 December, which signifies the rich dust days during the winter period. Strong TOMS AI signals were observed from January to March, with the maximum occurring in the later months (February and March). The inter-annual variability of TOMS AI revealed that the rich dust years were 1984-1985, 1987-1988, 1997-1998, 1999-2000, and 2002-2004. A significantly poor dust year was found between 2005 and 2006 in all the periods. The study found strong north-easterly (NE) trade winds over most of the Sahelian region of West Africa during the winter months, with the maximum wind speed reaching 8.61 m/s in January. The strength of the NE winds determines the extent of dust transport to the coast of the Gulf of Guinea during winter. This study has confirmed that the presence of the Harmattan is strongly dependent on the SST over the Atlantic Ocean and the ITD position. The loci of the average SST and ITD positions over West Africa could be described by polynomial functions. The study concludes that the evolution of the near-surface wind field at 925 hPa and the variations of SST and ITD positions are the major large-scale atmospheric circulation systems driving the emission, distribution, and transport of Harmattan dust aerosols over West Africa. However, the influence of the NAO was shown to have a less significant effect on Harmattan dust transport over the region.
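The statistical treatment described above, a Pearson correlation between daily TOMS AI and a circulation variable plus a kernel density estimate, can be sketched as follows. The synthetic ITD and AI series, their coupling, and the grid limits are assumptions standing in for the real reanalysis and TOMS data.

```python
# Hedged sketch of the correlation and kernel-density steps, on synthetic data.
import numpy as np
from scipy.stats import pearsonr, gaussian_kde

rng = np.random.default_rng(1)
n_days = 150                                   # roughly one November-March season
itd_lat = 8 + 4 * rng.random(n_days)           # ITD latitude (deg N), assumed range
toms_ai = 1.6 + 0.3 * (12 - itd_lat) + rng.normal(0, 0.4, n_days)  # toy AI series

r, p = pearsonr(itd_lat, toms_ai)
print(f"Pearson r = {r:.2f}, p = {p:.3g}")

# 2-D kernel density estimate over the (ITD, AI) plane to locate the densest region.
kde = gaussian_kde(np.vstack([itd_lat, toms_ai]))
grid_itd, grid_ai = np.meshgrid(np.linspace(8, 12, 50), np.linspace(0.5, 3, 50))
density = kde(np.vstack([grid_itd.ravel(), grid_ai.ravel()])).reshape(50, 50)
print("ITD latitude at peak density:", round(grid_itd.ravel()[density.argmax()], 2))
```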

Keywords: atmospheric circulation, dust aerosols, Harmattan, West Africa

Procedia PDF Downloads 294
120 Different Data-Driven Bivariate Statistical Approaches to Landslide Susceptibility Mapping (Uzundere, Erzurum, Turkey)

Authors: Azimollah Aleshzadeh, Enver Vural Yavuz

Abstract:

The main goal of this study is to produce landslide susceptibility maps using different data-driven bivariate statistical approaches, namely the entropy weight method (EWM), evidence belief function (EBF), and information content model (ICM), in Uzundere county, Erzurum province, in the north-eastern part of Turkey. Past landslide occurrences were identified and mapped from an interpretation of high-resolution satellite images and earlier reports, as well as by carrying out field surveys. In total, 42 landslide incidence polygons were mapped using ArcGIS 10.4.1 software and randomly split into a construction dataset of 70% (30 landslide incidences) for building the EWM, EBF, and ICM models, while the remaining 30% (12 landslide incidences) were used for verification purposes. Twelve layers of landslide-predisposing parameters were prepared, including total surface radiation, maximum relief, soil groups, standard curvature, distance to stream/river sites, distance to the road network, surface roughness, land use pattern, engineering geological rock group, topographical elevation, orientation of slope, and terrain slope gradient. The relationships between the landslide-predisposing parameters and the landslide inventory map were determined using the different statistical models (EWM, EBF, and ICM). The model results were validated with landslide incidences that were not used during model construction. In addition, receiver operating characteristic curves were applied, and the area under the curve (AUC) was determined for the different susceptibility maps using the success (construction data) and prediction (verification data) rate curves. The results revealed that the AUC success rates are 0.7055, 0.7221, and 0.7368, while the prediction rates are 0.6811, 0.6997, and 0.7105 for the EWM, EBF, and ICM models, respectively. Consequently, the landslide susceptibility maps were classified into five susceptibility classes: very low, low, moderate, high, and very high. Additionally, the proportion of construction and verification landslide incidences falling in the high and very high landslide susceptibility classes of each map was determined. The results showed that the EWM, EBF, and ICM models produced satisfactory accuracy. The obtained landslide susceptibility maps may be useful for future natural hazard mitigation studies and planning purposes for environmental protection.
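To make the bivariate idea concrete, the sketch below computes information-content-style weights for a single predisposing factor from landslide density per class, sums them into a susceptibility index, and scores it with an ROC AUC. The raster cells, class definitions, and weight formula are illustrative assumptions, not the authors' exact ICM, EWM, or EBF formulations.

```python
# Rough sketch of a bivariate information-content weighting and AUC validation,
# using synthetic grid cells and one predisposing factor.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n_cells = 10_000
slope_class = rng.integers(0, 5, size=n_cells)                 # one factor, 5 classes
landslide = (rng.random(n_cells) < 0.01 * (slope_class + 1)).astype(int)

# Weight per class: log of class landslide density relative to overall density.
overall = landslide.mean()
weights = np.array([
    np.log((landslide[slope_class == c].mean() + 1e-9) / overall)
    for c in range(5)
])
susceptibility = weights[slope_class]   # a real model would sum weights over all 12 factors

print("class weights:", np.round(weights, 3))
print("AUC (success rate analogue):", round(roc_auc_score(landslide, susceptibility), 3))
```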

Keywords: entropy weight method, evidence belief function, information content model, landslide susceptibility mapping

Procedia PDF Downloads 121
119 Spectrogram Pre-Processing to Improve Isotopic Identification to Discriminate Gamma and Neutrons Sources

Authors: Mustafa Alhamdi

Abstract:

Industrial applications for classifying gamma-ray and neutron events are investigated in this study using deep machine learning. Identification using convolutional and recursive neural networks has shown a significant improvement in prediction accuracy in a variety of applications. The ability to identify the isotope type and activity from spectral information depends on feature extraction methods, followed by classification. The features extracted from the spectrum profiles try to find patterns and relationships that represent the actual spectrum energy in a low-dimensional space. Increasing the level of separation between classes in feature space improves the possibility of enhancing classification accuracy. The nonlinear feature extraction performed by neural networks involves a variety of transformations and mathematical optimizations, while principal component analysis relies on linear transformations to extract features and subsequently improve classification accuracy. In this paper, the isotope spectrum information has been preprocessed by finding the frequency components relative to time and using them as a training dataset. The Fourier transform used to extract the frequency components has been optimized with a suitable windowing function. Training and validation samples of different isotope profiles interacting with a CdTe crystal have been simulated using Geant4. The readout electronic noise has been simulated by optimizing the mean and variance of a normal distribution. Ensemble learning, by combining the votes of many models, managed to improve the classification accuracy of the neural networks. Discriminating gamma and neutron events in a single prediction approach with deep machine learning has shown high accuracy. The paper's findings show that classification accuracy can be improved by applying the spectrogram preprocessing stage to the gamma and neutron spectra of different isotopes. Tuning the deep machine learning models by hyperparameter optimization enhanced the separation in the latent space and provided the ability to extend the number of detected isotopes in the training database. Ensemble learning contributed significantly to improving the final prediction.
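The spectrogram pre-processing step can be illustrated as below: a windowed short-time Fourier transform converts a simulated detector pulse into a time-frequency image that would feed the classifier. The pulse shape, sampling rate, noise level, and window settings are assumptions for demonstration, not the Geant4/CdTe simulation parameters.

```python
# Illustrative sketch of spectrogram pre-processing on a toy detector signal.
import numpy as np
from scipy.signal import spectrogram

rng = np.random.default_rng(3)
fs = 1_000_000                        # sampling rate (Hz), assumed
t = np.arange(0, 0.01, 1 / fs)
pulse = np.exp(-t * 2e3) * np.sin(2 * np.pi * 50e3 * t)   # toy detector pulse
signal = pulse + 0.05 * rng.normal(size=t.size)           # additive readout noise

# A Hann window limits spectral leakage; the segment length trades time
# resolution against frequency resolution.
f, seg_t, Sxx = spectrogram(signal, fs=fs, window="hann",
                            nperseg=256, noverlap=128)
features = np.log1p(Sxx)              # log-scaled spectrogram used as training input
print("spectrogram shape (freq bins, time bins):", features.shape)
```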

Keywords: machine learning, nuclear physics, Monte Carlo simulation, noise estimation, feature extraction, classification

Procedia PDF Downloads 134
118 Systematic Evaluation of Convolutional Neural Network on Land Cover Classification from Remotely Sensed Images

Authors: Eiman Kattan, Hong Wei

Abstract:

In using a Convolutional Neural Network (CNN) for classification, there is a set of hyperparameters available for configuration. This study aims to evaluate the impact of a range of parameters in a CNN architecture, i.e. AlexNet, on land cover classification based on four remotely sensed datasets. The evaluation tests the influence of a set of hyperparameters on classification performance. The parameters concerned are epoch values, batch size, and convolutional filter size against input image size. Thus, a set of experiments was conducted to assess the effectiveness of the selected parameters using two implementation approaches, namely pre-trained and fine-tuned. We first explore the number of epochs under several selected batch size values (32, 64, 128 and 200). The impact of the kernel size of the convolutional filters (1, 3, 5, 7, 10, 15, 20, 25 and 30) was evaluated against the image sizes under testing (64, 96, 128, 180 and 224), which gave us insight into the relationship between the size of the convolutional filters and the image size. To generalise the validation, four remote sensing datasets, AID, RSD, UCMerced and RSCCN, which have different land covers and are publicly available, were used in the experiments. These datasets have a wide diversity of input data, such as number of classes, amount of labelled data, and texture patterns. A specifically designed interactive deep learning GPU training platform for image classification (NVIDIA DIGITS) was employed in the experiments; it has shown efficiency in both training and testing. The results show that increasing the number of epochs leads to a higher accuracy rate, as expected; however, the convergence state is highly dependent on the dataset. For the batch size evaluation, a larger batch size slightly decreases classification accuracy compared to a small batch size. For example, selecting a batch size of 32 on the RSCCN dataset achieves an accuracy rate of 90.34% at the 11th epoch, while decreasing the epoch value to one makes the accuracy rate drop to 74%. At the other extreme, increasing the batch size to 200 reduces the accuracy rate at the 11th epoch to 86.5%, and to 63% when using one epoch only. On the other hand, the choice of kernel size is only loosely related to the dataset; from a practical point of view, a filter size of 20 produces 70.4286%. The final experiment, on image size, shows that accuracy improves with larger inputs, but this performance gain comes at a considerable computational cost. These conclusions open opportunities toward better classification performance in various applications such as planetary remote sensing.
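A grid evaluation of this kind can be sketched as below: a small CNN trained under several batch size and epoch settings, reporting validation accuracy for each combination. The toy model, random data, image size, and chosen grid values are stand-ins, not the AlexNet configuration or the AID/RSD/UCMerced/RSCCN datasets used in the study.

```python
# Hypothetical sketch of a batch-size/epoch grid evaluation with tf.keras.
import numpy as np
import tensorflow as tf

def small_cnn(input_shape=(64, 64, 3), n_classes=10, kernel=3):
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, kernel, activation="relu",
                               input_shape=input_shape),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, kernel, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

X = np.random.rand(200, 64, 64, 3).astype("float32")   # placeholder image patches
y = np.random.randint(0, 10, size=200)                  # placeholder land cover labels

for batch_size in (32, 64, 128, 200):
    for epochs in (1, 11):
        model = small_cnn()
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        hist = model.fit(X, y, batch_size=batch_size, epochs=epochs,
                         validation_split=0.2, verbose=0)
        print(f"batch={batch_size:3d} epochs={epochs:2d} "
              f"val_acc={hist.history['val_accuracy'][-1]:.3f}")
```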

Keywords: CNNs, hyperparameters, remote sensing, land cover, land use

Procedia PDF Downloads 152
117 Cartographic Depiction and Visualization of Wetlands Changes in the North-Western States of India

Authors: Bansal Ashwani

Abstract:

Cartographic depiction and visualization of wetland changes is an important tool to map spatio-temporal information about wetland dynamics effectively and to comprehend the response of these water bodies in maintaining groundwater and the surrounding ecosystem. This is true for the states of north-western India, i.e., J&K, Himachal, Punjab, and Haryana, which are endowed with several natural wetlands in the flood plains or along the courses of their rivers. Thus, the present study documents, analyses and reconstructs the lost wetlands that existed in the flood plains of the major river basins of these states, i.e., Chenab, Jhelum, Satluj, Beas, Ravi, and Ghagar, at the beginning of the 20th century. To achieve this objective, the study has used multi-temporal, high- to medium-resolution satellite datasets since the 1960s, e.g., Corona (1960s/70s), Landsat (1990s-2017) and Sentinel (2017). The Sentinel (2017) satellite image has been used for making the wetland inventory owing to its comparatively higher spatial resolution with multi-spectral bands. In addition, historical records, repeated photographs, historical maps, and field observations including geomorphological evidence were also used. The water index techniques, i.e., band ratioing, the normalized difference water index (NDWI), and the modified NDWI (MNDWI), have been compared and used to map the wetlands. The wetland types found in the north-western states have been categorized under the 19 classes suggested by the Space Applications Centre, India. This enables the researcher to produce a wetland inventory and a series of cartographic representations that include overlaying multiple temporal wetland extent vectors. Preliminary results show a general state of wetland shrinkage since the 1960s, with the rate of area loss varying from one wetland to another. In addition, it is observed that the majority of wetlands have not been documented so far and many do not even have names. Moreover, the purpose is to draw attention to their loss in addition to establishing a baseline dataset that can be a tool for wetland planning and management. Finally, the applicability of cartographic depiction and visualization, historical map sources, repeated photographs and remote sensing data for the reconstruction of long-term wetland fluctuations, especially in the northern part of India, will be addressed.
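The water index computation referred to above is simple to express; the sketch below computes NDWI (green vs. NIR) and MNDWI (green vs. SWIR) on made-up reflectance arrays and thresholds them into a water mask. The band values and the zero threshold are illustrative assumptions, not calibrated values from the Sentinel or Landsat scenes used in the study.

```python
# Minimal sketch of NDWI/MNDWI water index mapping on synthetic band arrays.
import numpy as np

rng = np.random.default_rng(5)
green = rng.uniform(0.02, 0.4, size=(100, 100))   # surface reflectance, assumed
nir   = rng.uniform(0.02, 0.5, size=(100, 100))
swir  = rng.uniform(0.01, 0.4, size=(100, 100))

eps = 1e-9                                         # avoid division by zero
ndwi  = (green - nir)  / (green + nir  + eps)      # NDWI highlights open water vs. vegetation
mndwi = (green - swir) / (green + swir + eps)      # MNDWI better suppresses built-up noise

water_mask = mndwi > 0.0                           # common rule-of-thumb threshold
print("water fraction (MNDWI > 0):", round(water_mask.mean(), 3))
```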

Keywords: cartographic depiction and visualization, wetland changes, NDWI/MNDWI, geomorphological evidence and remote sensing

Procedia PDF Downloads 245
116 Escalation of Commitment and Turnover in Top Management Teams

Authors: Dmitriy V. Chulkov

Abstract:

Escalation of commitment is defined as continuation of a project after receiving negative information about it. While literature in management and psychology identified various factors contributing to escalation behavior, this phenomenon has received little analysis in economics, potentially due to the apparent irrationality of escalation. In this study, we present an economic model of escalation with asymmetric information in a principal-agent setup where the agents are responsible for a project selection decision and discover the outcome of the project before the principal. Our theoretical model complements the existing literature on several accounts. First, we link the incentive to escalate commitment to a project with the turnover decision by the manager. When a manager learns the outcome of the project and stops it, this reveals that a mistake was made. There is an incentive to continue failing projects and avoid admitting the mistake. This incentive is enhanced when the agent may voluntarily resign from the firm before the outcome of the failing project is revealed, and thus not bear the full extent of reputation damage due to project failure. As long as some successful managers leave the firm for extraneous reasons, outside firms find it difficult to link failing projects with certainty to managers that left a firm. Second, we demonstrate that non-CEO managers have reputation concerns separate from those of the CEO, and thus may escalate commitment to projects they oversee, when such escalation can attenuate damage to reputation from impending project failure. Such incentive for escalation will be present for non-CEO managers if the CEO delegates responsibility for a project to a non-CEO executive. If reputation matters for promotion to the CEO, the incentive for a rising executive to escalate in order to protect reputation is distinct from that of a CEO. Third, our theoretical model is supported by empirical analysis of changes in the firm's operations measured by the presence of discontinued operations at the time of turnover among the top four members of the top management team. Discontinued operations are indicative of termination of failing projects at a firm. The empirical results demonstrate that in a large dataset of over three thousand publicly traded U.S. firms for a period from 1993 to 2014 turnover by top executives significantly increases the likelihood that the firm discontinues operations. Furthermore, the type of turnover matters as this effect is strongest when at least one non-CEO member of the top management team leaves the firm and when the CEO departure is due to a voluntary resignation and not to a retirement or illness. Empirical results are consistent with the predictions of the theoretical model and suggest that escalation of commitment is primarily observed in decisions by non-CEO members of the top management team.
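A stylized sketch of the kind of empirical test described, not the authors' specification, is shown below: a logistic regression of whether a firm reports discontinued operations on indicators of executive turnover in a synthetic firm-year panel. The variable names, the size control, and the coefficients used to generate the data are hypothetical.

```python
# Stylized sketch: logit of discontinued operations on turnover indicators,
# estimated with statsmodels on a synthetic firm-year panel.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 3000
panel = pd.DataFrame({
    "non_ceo_departure": rng.binomial(1, 0.15, n),   # any non-CEO top-team exit
    "ceo_resignation":   rng.binomial(1, 0.05, n),   # voluntary CEO resignation
    "log_assets":        rng.normal(7, 1.5, n),      # assumed size control
})
# Generate the outcome with a positive turnover effect, mimicking the reported pattern.
logit_p = -2.0 + 0.6 * panel["non_ceo_departure"] + 0.4 * panel["ceo_resignation"]
panel["discontinued_ops"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(panel[["non_ceo_departure", "ceo_resignation", "log_assets"]])
model = sm.Logit(panel["discontinued_ops"], X).fit(disp=False)
print(model.summary().tables[1])
```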

Keywords: discontinued operations, escalation of commitment, executive turnover, top management teams

Procedia PDF Downloads 351
115 Application of Improved Semantic Communication Technology in Remote Sensing Data Transmission

Authors: Tingwei Shu, Dong Zhou, Chengjun Guo

Abstract:

Semantic communication is an emerging form of communication that realizes intelligent communication by extracting the semantic information of data at the source, transmitting it, and recovering the data at the receiving end. It can effectively solve the problem of data transmission in situations of large data volume, low SNR and restricted bandwidth. With the development of deep learning, semantic communication has matured further and is gradually being applied in the fields of the Internet of Things, Unmanned Aerial Vehicle cluster communication, remote sensing scenarios, etc. We propose an improved semantic communication system for situations where the data volume is huge and spectrum resources are limited during the transmission of remote sensing images. At the transmitter, the semantic information of the remote sensing images must be extracted, but there are some problems. The traditional semantic communication system based on a Convolutional Neural Network cannot take into account both the global and local semantic information of the image, which results in less-than-ideal image recovery at the receiving end. Therefore, we adopt an improved Vision-Transformer-based structure as the semantic encoder instead of the mainstream CNN-based one to extract the image semantic features. In this paper, we first perform pre-processing operations on the remote sensing images to improve their resolution in order to obtain images with more semantic information. We use the wavelet transform to decompose the image into high-frequency and low-frequency components, perform bilinear interpolation on the high-frequency components and bicubic interpolation on the low-frequency components, and finally perform the inverse wavelet transform to obtain the preprocessed image. We adopt the improved Vision-Transformer structure as the semantic coder to extract and transmit the semantic information of remote sensing images. The Vision-Transformer structure can better handle the huge data volume and extract better image semantic features, and it adopts a multi-layer self-attention mechanism to better capture the correlation between semantic features and reduce redundant features. Secondly, to improve the coding efficiency, we reduce the quadratic complexity of the self-attention mechanism to linear so as to improve the image data processing speed of the model. We conducted experimental simulations on the RSOD dataset and compared the designed system with a semantic communication system based on CNN and with image coding methods such as BPG and JPEG to verify that the method can effectively alleviate the problem of excessive data volume and improve the performance of image data communication.
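The wavelet-based resolution enhancement step can be sketched as follows, assuming a single-band image: decompose with a 2-D wavelet transform, upsample the low-frequency approximation bicubically and the high-frequency details bilinearly, then reconstruct with the inverse transform. The Haar wavelet, the 2x upscaling factor, and the random input are illustrative assumptions, not the paper's exact pre-processing settings.

```python
# Sketch of the decompose-interpolate-reconstruct pre-processing idea,
# using PyWavelets and OpenCV on a synthetic single-band image.
import numpy as np
import pywt
import cv2

img = np.random.rand(128, 128).astype(np.float32)   # stand-in remote sensing band

# Decompose into a low-frequency approximation (cA) and high-frequency details.
cA, (cH, cV, cD) = pywt.dwt2(img, "haar")

def upscale(band, interpolation):
    h, w = band.shape
    return cv2.resize(band, (2 * w, 2 * h), interpolation=interpolation)

cA_up = upscale(cA, cv2.INTER_CUBIC)    # bicubic on the low-frequency component
cH_up = upscale(cH, cv2.INTER_LINEAR)   # bilinear on the high-frequency components
cV_up = upscale(cV, cv2.INTER_LINEAR)
cD_up = upscale(cD, cv2.INTER_LINEAR)

# Inverse wavelet transform yields an image with doubled resolution.
enhanced = pywt.idwt2((cA_up, (cH_up, cV_up, cD_up)), "haar")
print("input:", img.shape, "-> enhanced:", enhanced.shape)
```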

Keywords: semantic communication, transformer, wavelet transform, data processing

Procedia PDF Downloads 63
114 Applying GIS Geographic Weighted Regression Analysis to Assess Local Factors Impeding Smallholder Farmers from Participating in Agribusiness Markets: A Case Study of Vihiga County, Western Kenya

Authors: Mwehe Mathenge, Ben G. J. S. Sonneveld, Jacqueline E. W. Broerse

Abstract:

Smallholder farmers are important drivers of agricultural productivity, food security, and poverty reduction in Sub-Saharan Africa. However, they are faced with myriad challenges in their efforts to participate in agribusiness markets. How the geographically explicit factors existing at the local level interact to impede smallholder farmers' decision to participate (or not) in agribusiness markets is not well understood. Deconstructing the spatial complexity of the local environment could provide a deeper insight into how geographically explicit determinants promote or impede resource-poor smallholder farmers from participating in agribusiness. This paper's objective was to identify, map, and analyze local spatial autocorrelation in the factors that impede poor smallholders from participating in agribusiness markets. Data were collected using geocoded researcher-administered survey questionnaires from 392 households in Western Kenya. Three spatial statistics methods in a geographic information system (GIS) were used to analyze the data: Global Moran's I, Cluster and Outlier Analysis (Anselin Local Moran's I), and geographically weighted regression. The Global Moran's I results reveal the presence of spatial patterns in the dataset that were not caused by spatial randomness of the data. Subsequently, the Anselin Local Moran's I results identified spatially and statistically significant local spatial clustering (hot spots and cold spots) in the factors hindering smallholder participation. Finally, the geographically weighted regression results unearthed the specific geographically explicit factors impeding market participation in the study area. The results confirm that geographically explicit factors are indispensable in influencing smallholder farming decisions, and policymakers should take cognizance of them. Additionally, this research demonstrated how geospatially explicit analysis conducted at the local level, using geographically disaggregated data, could help in identifying households and localities where the most impoverished and resource-poor smallholder households reside. In designing spatially targeted interventions, policymakers could benefit from geospatial analysis methods in understanding the complex geographic factors and processes that interact to influence smallholder farmers' decision-making processes and choices.
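The first step of the workflow, a Global Moran's I test for spatial autocorrelation, is sketched below in a self-contained form. The household coordinates, the variable, and the distance-band neighbourhood are synthetic placeholders for the geocoded survey data; a full analysis would add the local (Anselin) statistic and the geographically weighted regression, e.g. with dedicated spatial libraries.

```python
# Compact sketch of Global Moran's I on synthetic geocoded household data.
import numpy as np

rng = np.random.default_rng(8)
n = 392
coords = rng.uniform(0, 10, size=(n, 2))          # household locations (arbitrary units)
x = coords[:, 0] * 0.5 + rng.normal(0, 1, n)      # a spatially patterned factor

# Row-standardised binary weights: neighbours are households within a distance band.
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
W = ((dist > 0) & (dist < 1.5)).astype(float)
row_sums = W.sum(axis=1, keepdims=True)
W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

# Moran's I = (n / S0) * (z' W z) / (z' z), with z the mean-centred variable.
z = x - x.mean()
moran_i = (n / W.sum()) * (z @ W @ z) / (z @ z)
print("Global Moran's I:", round(moran_i, 3))      # values near 0 suggest spatial randomness
```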

Keywords: agribusiness markets, GIS, smallholder farmers, spatial statistics, disaggregated spatial data

Procedia PDF Downloads 127
113 Real Estate Trend Prediction with Artificial Intelligence Techniques

Authors: Sophia Liang Zhou

Abstract:

For investors, businesses, consumers, and governments, an accurate assessment of future housing prices is crucial to critical decisions in resource allocation, policy formation, and investment strategies. Previous studies are contradictory about the macroeconomic determinants of housing prices and have largely focused on one or two areas using point prediction. This study aims to develop data-driven models to accurately predict future housing market trends in different markets. This work studied five different metropolitan areas representing different market trends and compared three time-lag situations: no lag, 6-month lag, and 12-month lag. Linear regression (LR), random forest (RF), and artificial neural network (ANN) models were employed to model the real estate price using datasets with the S&P/Case-Shiller home price index and 12 demographic and macroeconomic features, such as gross domestic product (GDP), resident population, and personal income, in five metropolitan areas: Boston, Dallas, New York, Chicago, and San Francisco. The data from March 2005 to December 2018 were collected from the Federal Reserve Bank, FBI, and Freddie Mac. In the original data, some factors are monthly, some quarterly, and some yearly. Thus, two methods of imputing missing values, backfilling and interpolation, were compared. The models were evaluated by accuracy, mean absolute error, and root mean square error. The LR and ANN models outperformed the RF model due to RF's inherent limitations. Both the ANN and LR methods generated predictive models with high accuracy (>95%). It was found that personal income, GDP, population, and measures of debt consistently appeared as the most important factors. It was also found that the technique used to impute missing values in the dataset and the implementation of a time lag can have a significant influence on model performance and require further investigation. The best-performing models varied for each area, but the backfilled 12-month-lag LR models and the interpolated no-lag ANN models showed the best stable performance overall, with accuracies >95% for each city. This study reveals the influence of input variables in different markets. It also provides evidence to support future studies to identify the optimal time lag and data imputation methods for establishing accurate predictive models.
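A hedged sketch of the modelling comparison follows: lag the macroeconomic features by 12 months, then fit linear regression, random forest, and a small neural network and compare mean absolute error on a temporal hold-out. The synthetic monthly frame, feature names, and model settings are stand-ins for the Case-Shiller index and the macroeconomic series actually used.

```python
# Illustrative sketch: 12-month-lagged features feeding LR, RF, and ANN models.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
months = pd.date_range("2005-03-01", "2018-12-01", freq="MS")
df = pd.DataFrame({
    "gdp": rng.normal(0, 1, len(months)).cumsum(),        # placeholder macro series
    "income": rng.normal(0, 1, len(months)).cumsum(),
    "population": np.linspace(0, 5, len(months)),
}, index=months)
df["hpi"] = 100 + 2 * df["gdp"] + rng.normal(0, 1, len(months))  # toy price index

LAG = 12                                                   # predict from features 12 months earlier
X = df[["gdp", "income", "population"]].shift(LAG).dropna()
y = df["hpi"].loc[X.index]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False, test_size=0.2)

models = [
    ("LR", LinearRegression()),
    ("RF", RandomForestRegressor(random_state=0)),
    ("ANN", make_pipeline(StandardScaler(),
                          MLPRegressor(hidden_layer_sizes=(32,), max_iter=5000,
                                       random_state=0))),
]
for name, model in models:
    model.fit(X_tr, y_tr)
    print(f"{name}: MAE = {mean_absolute_error(y_te, model.predict(X_te)):.2f}")
```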

Keywords: linear regression, random forest, artificial neural network, real estate price prediction

Procedia PDF Downloads 86