Search results for: recursive%20%20least%20square%20algorithm
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 88

Search results for: recursive%20%20least%20square%20algorithm

28 Accelerating Quantum Chemistry Calculations: Machine Learning for Efficient Evaluation of Electron-Repulsion Integrals

Authors: Nishant Rodrigues, Nicole Spanedda, Chilukuri K. Mohan, Arindam Chakraborty

Abstract:

A crucial objective in quantum chemistry is the computation of the energy levels of chemical systems. This task requires electron-repulsion integrals as inputs, and the steep computational cost of evaluating these integrals poses a major numerical challenge in efficient implementation of quantum chemical software. This work presents a moment-based machine-learning approach for the efficient evaluation of electron-repulsion integrals. These integrals were approximated using linear combinations of a small number of moments. Machine learning algorithms were applied to estimate the coefficients in the linear combination. A random forest approach was used to identify promising features using a recursive feature elimination approach, which performed best for learning the sign of each coefficient but not the magnitude. A neural network with two hidden layers were then used to learn the coefficient magnitudes along with an iterative feature masking approach to perform input vector compression, identifying a small subset of orbitals whose coefficients are sufficient for the quantum state energy computation. Finally, a small ensemble of neural networks (with a median rule for decision fusion) was shown to improve results when compared to a single network.

Keywords: quantum energy calculations, atomic orbitals, electron-repulsion integrals, ensemble machine learning, random forests, neural networks, feature extraction

Procedia PDF Downloads 68
27 Eye Tracking Syntax in Language Education

Authors: Marcus Maia

Abstract:

The present study reports and discusses the use of eye tracking qualitative data in reading workshops in Brazilian middle and high schools and in Generative Syntax and Sentence Processing courses at the undergraduate and graduate levels at the Federal University of Rio de Janeiro, respectively. Both endeavors take the sentential level as the proper object to be metacognitively explored in language education (cf. Chomsky, Gallego & Ott, 2019) to develop innate science forming capacity and knowledge of language. In both projects, non-discrepant qualitative eye tracking data collected and quantitatively analyzed in experimental syntax and psycholinguistic studies carried out in Lapex (Experimental Psycholinguistics Laboratory of the Federal University of Rio de Janeiro) were displayed to students as a point of departure, triggering discussions. Classes would generally start with the display of videos showing eye tracking data, such as gaze plots and heatmaps from several studies in Psycholinguistics and Experimental Syntax that we had already developed in our laboratory. The videos usually triggered discussions with students about linguistic and psycholinguistic issues, such as the reading of sentences for gist, garden-path sentences, syntactic and semantic anomalies, the filled-gap effect, island effects, direct and indirect cause, and recursive constructions, among other topics. Active, problem-solving based methodologies were employed with the objective of stimulating student participation. The communication also discusses the importance of developing full literacy, epistemic vigilance and intellectual self-defense in an infodemic world in the lines of Maia (2022).

Keywords: reading, educational psycholinguistics, eye-tracking, active methodology

Procedia PDF Downloads 29
26 Very Large Scale Integration Architecture of Finite Impulse Response Filter Implementation Using Retiming Technique

Authors: S. Jalaja, A. M. Vijaya Prakash

Abstract:

Recursive combination of an algorithm based on Karatsuba multiplication is exploited to design a generalized transpose and parallel Finite Impulse Response (FIR) Filter. Mid-range Karatsuba multiplication and Carry Save adder based on Karatsuba multiplication reduce time complexity for higher order multiplication implemented up to n-bit. As a result, we design modified N-tap Transpose and Parallel Symmetric FIR Filter Structure using Karatsuba algorithm. The mathematical formulation of the FFA Filter is derived. The proposed architecture involves significantly less area delay product (APD) then the existing block implementation. By adopting retiming technique, hardware cost is reduced further. The filter architecture is designed by using 90 nm technology library and is implemented by using cadence EDA Tool. The synthesized result shows better performance for different word length and block size. The design achieves switching activity reduction and low power consumption by applying with and without retiming for different combination of the circuit. The proposed structure achieves more than a half of the power reduction by adopting with and without retiming techniques compared to the earlier design structure. As a proof of the concept for block size 16 and filter length 64 for CKA method, it achieves a 51% as well as 70% less power by applying retiming technique, and for CSA method it achieves a 57% as well as 77% less power by applying retiming technique compared to the previously proposed design.

Keywords: carry save adder Karatsuba multiplication, mid range Karatsuba multiplication, modified FFA and transposed filter, retiming

Procedia PDF Downloads 204
25 Design of Two-Channel Quadrature Mirror Filter Banks Using a Transformation Approach

Authors: Ju-Hong Lee, Yi-Lin Shieh

Abstract:

Two-dimensional (2-D) quadrature mirror filter (QMF) banks have been widely considered for high-quality coding of image and video data at low bit rates. Without implementing subband coding, a 2-D QMF bank is required to have an exactly linear-phase response without magnitude distortion, i.e., the perfect reconstruction (PR) characteristics. The design problem of 2-D QMF banks with the PR characteristics has been considered in the literature for many years. This paper presents a transformation approach for designing 2-D two-channel QMF banks. Under a suitable one-dimensional (1-D) to two-dimensional (2-D) transformation with a specified decimation/interpolation matrix, the analysis and synthesis filters of the QMF bank are composed of 1-D causal and stable digital allpass filters (DAFs) and possess the 2-D doubly complementary half-band (DC-HB) property. This facilitates the design problem of the two-channel QMF banks by finding the real coefficients of the 1-D recursive DAFs. The design problem is formulated based on the minimax phase approximation for the 1-D DAFs. A novel objective function is then derived to obtain an optimization for 1-D minimax phase approximation. As a result, the problem of minimizing the objective function can be simply solved by using the well-known weighted least-squares (WLS) algorithm in the minimax (L∞) optimal sense. The novelty of the proposed design method is that the design procedure is very simple and the designed 2-D QMF bank achieves perfect magnitude response and possesses satisfactory phase response. Simulation results show that the proposed design method provides much better design performance and much less design complexity as compared with the existing techniques.

Keywords: Quincunx QMF bank, doubly complementary filter, digital allpass filter, WLS algorithm

Procedia PDF Downloads 206
24 Analysis of Real Time Seismic Signal Dataset Using Machine Learning

Authors: Sujata Kulkarni, Udhav Bhosle, Vijaykumar T.

Abstract:

Due to the closeness between seismic signals and non-seismic signals, it is vital to detect earthquakes using conventional methods. In order to distinguish between seismic events and non-seismic events depending on their amplitude, our study processes the data that come from seismic sensors. The authors suggest a robust noise suppression technique that makes use of a bandpass filter, an IIR Wiener filter, recursive short-term average/long-term average (STA/LTA), and Carl short-term average (STA)/long-term average for event identification (LTA). The trigger ratio used in the proposed study to differentiate between seismic and non-seismic activity is determined. The proposed work focuses on significant feature extraction for machine learning-based seismic event detection. This serves as motivation for compiling a dataset of all features for the identification and forecasting of seismic signals. We place a focus on feature vector dimension reduction techniques due to the temporal complexity. The proposed notable features were experimentally tested using a machine learning model, and the results on unseen data are optimal. Finally, a presentation using a hybrid dataset (captured by different sensors) demonstrates how this model may also be employed in a real-time setting while lowering false alarm rates. The planned study is based on the examination of seismic signals obtained from both individual sensors and sensor networks (SN). A wideband seismic signal from BSVK and CUKG station sensors, respectively located near Basavakalyan, Karnataka, and the Central University of Karnataka, makes up the experimental dataset.

Keywords: Carl STA/LTA, features extraction, real time, dataset, machine learning, seismic detection

Procedia PDF Downloads 65
23 Understanding and Explaining Urban Resilience and Vulnerability: A Framework for Analyzing the Complex Adaptive Nature of Cities

Authors: Richard Wolfel, Amy Richmond

Abstract:

Urban resilience and vulnerability are critical concepts in the modern city due to the increased sociocultural, political, economic, demographic, and environmental stressors that influence current urban dynamics. Urban scholars need help explaining urban resilience and vulnerability. First, cities are dominated by people, which is challenging to model, both from an explanatory and a predictive perspective. Second, urban regions are highly recursive in nature, meaning they not only influence human action, but the structures of cities are constantly changing due to human actions. As a result, explanatory frameworks must continuously evolve as humans influence and are influenced by the urban environment in which they operate. Finally, modern cities have populations, sociocultural characteristics, economic flows, and environmental impacts on order of magnitude well beyond the cities of the past. As a result, the frameworks that seek to explain the various functions of a city that influence urban resilience and vulnerability must address the complex adaptive nature of cities and the interaction of many distinct factors that influence resilience and vulnerability in the city. This project develops a taxonomy and framework for organizing and explaining urban vulnerability. The framework is built on a well-established political development model that includes six critical classes of urban dynamics: political presence, political legitimacy, political participation, identity, production, and allocation. In addition, the framework explores how environmental security and technology influence and are influenced by the six elements of political development. The framework aims to identify key tipping points in society that act as influential agents of urban vulnerability in a region. This will help analysts and scholars predict and explain the influence of both physical and human geographical stressors in a dense urban area.

Keywords: urban resilience, vulnerability, sociocultural stressors, political stressors

Procedia PDF Downloads 85
22 Uplift Segmentation Approach for Targeting Customers in a Churn Prediction Model

Authors: Shivahari Revathi Venkateswaran

Abstract:

Segmenting customers plays a significant role in churn prediction. It helps the marketing team with proactive and reactive customer retention. For the reactive retention, the retention team reaches out to customers who already showed intent to disconnect by giving some special offers. When coming to proactive retention, the marketing team uses churn prediction model, which ranks each customer from rank 1 to 100, where 1 being more risk to churn/disconnect (high ranks have high propensity to churn). The churn prediction model is built by using XGBoost model. However, with the churn rank, the marketing team can only reach out to the customers based on their individual ranks. To profile different groups of customers and to frame different marketing strategies for targeted groups of customers are not possible with the churn ranks. For this, the customers must be grouped in different segments based on their profiles, like demographics and other non-controllable attributes. This helps the marketing team to frame different offer groups for the targeted audience and prevent them from disconnecting (proactive retention). For segmentation, machine learning approaches like k-mean clustering will not form unique customer segments that have customers with same attributes. This paper finds an alternate approach to find all the combination of unique segments that can be formed from the user attributes and then finds the segments who have uplift (churn rate higher than the baseline churn rate). For this, search algorithms like fast search and recursive search are used. Further, for each segment, all customers can be targeted using individual churn ranks from the churn prediction model. Finally, a UI (User Interface) is developed for the marketing team to interactively search for the meaningful segments that are formed and target the right set of audience for future marketing campaigns and prevent them from disconnecting.

Keywords: churn prediction modeling, XGBoost model, uplift segments, proactive marketing, search algorithms, retention, k-mean clustering

Procedia PDF Downloads 42
21 Frequency Selective Filters for Estimating the Equivalent Circuit Parameters of Li-Ion Battery

Authors: Arpita Mondal, Aurobinda Routray, Sreeraj Puravankara, Rajashree Biswas

Abstract:

The most difficult part of designing a battery management system (BMS) is battery modeling. A good battery model can capture the dynamics which helps in energy management, by accurate model-based state estimation algorithms. So far the most suitable and fruitful model is the equivalent circuit model (ECM). However, in real-time applications, the model parameters are time-varying, changes with current, temperature, state of charge (SOC), and aging of the battery and this make a great impact on the performance of the model. Therefore, to increase the equivalent circuit model performance, the parameter estimation has been carried out in the frequency domain. The battery is a very complex system, which is associated with various chemical reactions and heat generation. Therefore, it’s very difficult to select the optimal model structure. As we know, if the model order is increased, the model accuracy will be improved automatically. However, the higher order model will face the tendency of over-parameterization and unfavorable prediction capability, while the model complexity will increase enormously. In the time domain, it becomes difficult to solve higher order differential equations as the model order increases. This problem can be resolved by frequency domain analysis, where the overall computational problems due to ill-conditioning reduce. In the frequency domain, several dominating frequencies can be found in the input as well as output data. The selective frequency domain estimation has been carried out, first by estimating the frequencies of the input and output by subspace decomposition, then by choosing the specific bands from the most dominating to the least, while carrying out the least-square, recursive least square and Kalman Filter based parameter estimation. In this paper, a second order battery model consisting of three resistors, two capacitors, and one SOC controlled voltage source has been chosen. For model identification and validation hybrid pulse power characterization (HPPC) tests have been carried out on a 2.6 Ah LiFePO₄ battery.

Keywords: equivalent circuit model, frequency estimation, parameter estimation, subspace decomposition

Procedia PDF Downloads 108
20 Advancements in Predicting Diabetes Biomarkers: A Machine Learning Epigenetic Approach

Authors: James Ladzekpo

Abstract:

Background: The urgent need to identify new pharmacological targets for diabetes treatment and prevention has been amplified by the disease's extensive impact on individuals and healthcare systems. A deeper insight into the biological underpinnings of diabetes is crucial for the creation of therapeutic strategies aimed at these biological processes. Current predictive models based on genetic variations fall short of accurately forecasting diabetes. Objectives: Our study aims to pinpoint key epigenetic factors that predispose individuals to diabetes. These factors will inform the development of an advanced predictive model that estimates diabetes risk from genetic profiles, utilizing state-of-the-art statistical and data mining methods. Methodology: We have implemented a recursive feature elimination with cross-validation using the support vector machine (SVM) approach for refined feature selection. Building on this, we developed six machine learning models, including logistic regression, k-Nearest Neighbors (k-NN), Naive Bayes, Random Forest, Gradient Boosting, and Multilayer Perceptron Neural Network, to evaluate their performance. Findings: The Gradient Boosting Classifier excelled, achieving a median recall of 92.17% and outstanding metrics such as area under the receiver operating characteristics curve (AUC) with a median of 68%, alongside median accuracy and precision scores of 76%. Through our machine learning analysis, we identified 31 genes significantly associated with diabetes traits, highlighting their potential as biomarkers and targets for diabetes management strategies. Conclusion: Particularly noteworthy were the Gradient Boosting Classifier and Multilayer Perceptron Neural Network, which demonstrated potential in diabetes outcome prediction. We recommend future investigations to incorporate larger cohorts and a wider array of predictive variables to enhance the models' predictive capabilities.

Keywords: diabetes, machine learning, prediction, biomarkers

Procedia PDF Downloads 16
19 Collaborative Approaches in Achieving Sustainable Private-Public Transportation Services in Inner-City Areas: A Case of Durban Minibus Taxis

Authors: Lonna Mabandla, Godfrey Musvoto

Abstract:

Transportation is a catalytic feature in cities. Transport and land use activity are interdependent and have a feedback loop between how land is developed and how transportation systems are designed and used. This recursive relationship between land use and transportation is reflected in how public transportation routes internal to the inner-city enhance accessibility, therefore creating spaces that are conducive to business activity, while the business activity also informs public transportation routes. It is for this reason that the focus of this research is on public transportation within inner-city areas where the dynamic is evident. Durban is the chosen case study where the dominating form of public transportation within the central business district (CBD) is minibus taxis. The paradox here is that minibus taxis still form part of the informal economy even though they are the leading form of public transportation in South Africa. There have been many attempts to formalise this industry to follow more regulatory practices, but minibus taxis are privately owned, therefore complicating any proposed intervention. The argument of this study is that the application of collaborative planning through a sustainable partnership between the public and private sectors will improve the social and environmental sustainability of public transportation. One of the major challenges that exist within such collaborative endeavors is power dynamics. As a result, a key focus of the study is on power relations. Practically, power relations should be observed over an extended period, specifically when the different stakeholders engage with each other, to reflect valid data. However, a lengthy data collection process was not possible to observe during the data collection phase of this research. Instead, interviews were conducted focusing on existing procedural planning practices between the inner-city minibus taxi association (South and North Beach Taxi Association), the eThekwini Transport Authority (ETA), and the eThekwini Town Planning Department. Conclusions and recommendations were then generated based on these data.

Keywords: collaborative planning, sustainability, public transport, minibus taxis

Procedia PDF Downloads 30
18 Lineup Optimization Model of Basketball Players Based on the Prediction of Recursive Neural Networks

Authors: Wang Yichen, Haruka Yamashita

Abstract:

In recent years, in the field of sports, decision making such as member in the game and strategy of the game based on then analysis of the accumulated sports data are widely attempted. In fact, in the NBA basketball league where the world's highest level players gather, to win the games, teams analyze the data using various statistical techniques. However, it is difficult to analyze the game data for each play such as the ball tracking or motion of the players in the game, because the situation of the game changes rapidly, and the structure of the data should be complicated. Therefore, it is considered that the analysis method for real time game play data is proposed. In this research, we propose an analytical model for "determining the optimal lineup composition" using the real time play data, which is considered to be difficult for all coaches. In this study, because replacing the entire lineup is too complicated, and the actual question for the replacement of players is "whether or not the lineup should be changed", and “whether or not Small Ball lineup is adopted”. Therefore, we propose an analytical model for the optimal player selection problem based on Small Ball lineups. In basketball, we can accumulate scoring data for each play, which indicates a player's contribution to the game, and the scoring data can be considered as a time series data. In order to compare the importance of players in different situations and lineups, we combine RNN (Recurrent Neural Network) model, which can analyze time series data, and NN (Neural Network) model, which can analyze the situation on the field, to build the prediction model of score. This model is capable to identify the current optimal lineup for different situations. In this research, we collected all the data of accumulated data of NBA from 2019-2020. Then we apply the method to the actual basketball play data to verify the reliability of the proposed model.

Keywords: recurrent neural network, players lineup, basketball data, decision making model

Procedia PDF Downloads 100
17 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules, which can be applied to classification trees. In this research, we used online social network dataset and all of its attributes (e.g., Node features, labels, etc.) to determine what constitutes an above average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we utilize overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them into different groups based on their properties and similarities. In data mining, recursive partitioning is being utilized to probe the structure of a data set, which allow us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions, with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the result classification methods and utilized decision trees (with k-fold cross validation to prune the tree). The method performed on this dataset is decision trees. Decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 116
16 Climate Change and Migration in the Semi-arid Tropic and Eastern Regions of India: Exploring Alternative Adaptation Strategies

Authors: Gauri Sreekumar, Sabuj Kumar Mandal

Abstract:

Contributing about 18% to India’s Gross Domestic Product, the agricultural sector plays a significant role in the Indian rural economy. Despite being the primary source of livelihood for more than half of India’s population, most of them are marginal and small farmers facing several challenges due to agro-climatic shocks. Climate change is expected to increase the risk in the regions that are highly agriculture dependent. With systematic and scientific evidence of changes in rainfall, temperature and other extreme climate events, migration started to emerge as a survival strategy for the farm households. In this backdrop, our present study aims to combine the two strands of literature and attempts to explore whether migration is the only adaptation strategy for the farmers once they experience crop failures due adverse climatic condition. Combining the temperature and rainfall information from the weather data provided by the Indian Meteorological Department with the household level panel data on Indian states belonging to the Eastern and Semi-Arid Tropics regions from the Village Dynamics in South Asia (VDSA) collected by the International Crop Research Institute for the Semi-arid Tropics, we form a rich panel data for the years 2010-2014. A Recursive Econometric Model is used to establish the three-way nexus between climate change-yield-migration while addressing the role of irrigation and local non-farm income diversification. Using Three Stage Least Squares Estimation method, we find that climate change induced yield loss is a major driver of farmers’ migration. However, irrigation and local level non-farm income diversification are found to mitigate the adverse impact of climate change on migration. Based on our empirical results, we suggest for enhancing irrigation facilities and making local non-farm income diversification opportunities available to increase farm productivity and thereby reduce farmers’ migration.

Keywords: climate change, migration, adaptation, mitigation

Procedia PDF Downloads 37
15 A Heteroskedasticity Robust Test for Contemporaneous Correlation in Dynamic Panel Data Models

Authors: Andreea Halunga, Chris D. Orme, Takashi Yamagata

Abstract:

This paper proposes a heteroskedasticity-robust Breusch-Pagan test of the null hypothesis of zero cross-section (or contemporaneous) correlation in linear panel-data models, without necessarily assuming independence of the cross-sections. The procedure allows for either fixed, strictly exogenous and/or lagged dependent regressor variables, as well as quite general forms of both non-normality and heteroskedasticity in the error distribution. The asymptotic validity of the test procedure is predicated on the number of time series observations, T, being large relative to the number of cross-section units, N, in that: (i) either N is fixed as T→∞; or, (ii) N²/T→0, as both T and N diverge, jointly, to infinity. Given this, it is not expected that asymptotic theory would provide an adequate guide to finite sample performance when T/N is "small". Because of this, we also propose and establish asymptotic validity of, a number of wild bootstrap schemes designed to provide improved inference when T/N is small. Across a variety of experimental designs, a Monte Carlo study suggests that the predictions from asymptotic theory do, in fact, provide a good guide to the finite sample behaviour of the test when T is large relative to N. However, when T and N are of similar orders of magnitude, discrepancies between the nominal and empirical significance levels occur as predicted by the first-order asymptotic analysis. On the other hand, for all the experimental designs, the proposed wild bootstrap approximations do improve agreement between nominal and empirical significance levels, when T/N is small, with a recursive-design wild bootstrap scheme performing best, in general, and providing quite close agreement between the nominal and empirical significance levels of the test even when T and N are of similar size. Moreover, in comparison with the wild bootstrap "version" of the original Breusch-Pagan test our experiments indicate that the corresponding version of the heteroskedasticity-robust Breusch-Pagan test appears reliable. As an illustration, the proposed tests are applied to a dynamic growth model for a panel of 20 OECD countries.

Keywords: cross-section correlation, time-series heteroskedasticity, dynamic panel data, heteroskedasticity robust Breusch-Pagan test

Procedia PDF Downloads 405
14 Quantifying Automation in the Architectural Design Process via a Framework Based on Task Breakdown Systems and Recursive Analysis: An Exploratory Study

Authors: D. M. Samartsev, A. G. Copping

Abstract:

As with all industries, architects are using increasing amounts of automation within practice, with approaches such as generative design and use of AI becoming more commonplace. However, the discourse on the rate at which the architectural design process is being automated is often personal and lacking in objective figures and measurements. This results in confusion between people and barriers to effective discourse on the subject, in turn limiting the ability of architects, policy makers, and members of the public in making informed decisions in the area of design automation. This paper proposes the use of a framework to quantify the progress of automation within the design process. The use of a reductionist analysis of the design process allows it to be quantified in a manner that enables direct comparison across different times, as well as locations and projects. The methodology is informed by the design of this framework – taking on the aspects of a systematic review but compressed in time to allow for an initial set of data to verify the validity of the framework. The use of such a framework of quantification enables various practical uses such as predicting the future of the architectural industry with regards to which tasks will be automated, as well as making more informed decisions on the subject of automation on multiple levels ranging from individual decisions to policy making from governing bodies such as the RIBA. This is achieved by analyzing the design process as a generic task that needs to be performed, then using principles of work breakdown systems to split the task of designing an entire building into smaller tasks, which can then be recursively split further as required. Each task is then assigned a series of milestones that allow for the objective analysis of its automation progress. By combining these two approaches it is possible to create a data structure that describes how much various parts of the architectural design process are automated. The data gathered in the paper serves the dual purposes of providing the framework with validation, as well as giving insights into the current situation of automation within the architectural design process. The framework can be interrogated in many ways and preliminary analysis shows that almost 40% of the architectural design process has been automated in some practical fashion at the time of writing, with the rate at which progress is made slowly increasing over the years, with the majority of tasks in the design process reaching a new milestone in automation in less than 6 years. Additionally, a further 15% of the design process is currently being automated in some way, with various products in development but not yet released to the industry. Lastly, various limitations of the framework are examined in this paper as well as further areas of study.

Keywords: analysis, architecture, automation, design process, technology

Procedia PDF Downloads 75
13 Spectrogram Pre-Processing to Improve Isotopic Identification to Discriminate Gamma and Neutrons Sources

Authors: Mustafa Alhamdi

Abstract:

Industrial application to classify gamma rays and neutron events is investigated in this study using deep machine learning. The identification using a convolutional neural network and recursive neural network showed a significant improvement in predication accuracy in a variety of applications. The ability to identify the isotope type and activity from spectral information depends on feature extraction methods, followed by classification. The features extracted from the spectrum profiles try to find patterns and relationships to present the actual spectrum energy in low dimensional space. Increasing the level of separation between classes in feature space improves the possibility to enhance classification accuracy. The nonlinear nature to extract features by neural network contains a variety of transformation and mathematical optimization, while principal component analysis depends on linear transformations to extract features and subsequently improve the classification accuracy. In this paper, the isotope spectrum information has been preprocessed by finding the frequencies components relative to time and using them as a training dataset. Fourier transform implementation to extract frequencies component has been optimized by a suitable windowing function. Training and validation samples of different isotope profiles interacted with CdTe crystal have been simulated using Geant4. The readout electronic noise has been simulated by optimizing the mean and variance of normal distribution. Ensemble learning by combing voting of many models managed to improve the classification accuracy of neural networks. The ability to discriminate gamma and neutron events in a single predication approach using deep machine learning has shown high accuracy using deep learning. The paper findings show the ability to improve the classification accuracy by applying the spectrogram preprocessing stage to the gamma and neutron spectrums of different isotopes. Tuning deep machine learning models by hyperparameter optimization of neural network models enhanced the separation in the latent space and provided the ability to extend the number of detected isotopes in the training database. Ensemble learning contributed significantly to improve the final prediction.

Keywords: machine learning, nuclear physics, Monte Carlo simulation, noise estimation, feature extraction, classification

Procedia PDF Downloads 121
12 Machine Learning in Patent Law: How Genetic Breeding Algorithms Challenge Modern Patent Law Regimes

Authors: Stefan Papastefanou

Abstract:

Artificial intelligence (AI) is an interdisciplinary field of computer science with the aim of creating intelligent machine behavior. Early approaches to AI have been configured to operate in very constrained environments where the behavior of the AI system was previously determined by formal rules. Knowledge was presented as a set of rules that allowed the AI system to determine the results for specific problems; as a structure of if-else rules that could be traversed to find a solution to a particular problem or question. However, such rule-based systems typically have not been able to generalize beyond the knowledge provided. All over the world and especially in IT-heavy industries such as the United States, the European Union, Singapore, and China, machine learning has developed to be an immense asset, and its applications are becoming more and more significant. It has to be examined how such products of machine learning models can and should be protected by IP law and for the purpose of this paper patent law specifically, since it is the IP law regime closest to technical inventions and computing methods in technical applications. Genetic breeding models are currently less popular than recursive neural network method and deep learning, but this approach can be more easily described by referring to the evolution of natural organisms, and with increasing computational power; the genetic breeding method as a subset of the evolutionary algorithms models is expected to be regaining popularity. The research method focuses on patentability (according to the world’s most significant patent law regimes such as China, Singapore, the European Union, and the United States) of AI inventions and machine learning. Questions of the technical nature of the problem to be solved, the inventive step as such, and the question of the state of the art and the associated obviousness of the solution arise in the current patenting processes. Most importantly, and the key focus of this paper is the problem of patenting inventions that themselves are developed through machine learning. The inventor of a patent application must be a natural person or a group of persons according to the current legal situation in most patent law regimes. In order to be considered an 'inventor', a person must actually have developed part of the inventive concept. The mere application of machine learning or an AI algorithm to a particular problem should not be construed as the algorithm that contributes to a part of the inventive concept. However, when machine learning or the AI algorithm has contributed to a part of the inventive concept, there is currently a lack of clarity regarding the ownership of artificially created inventions. Since not only all European patent law regimes but also the Chinese and Singaporean patent law approaches include identical terms, this paper ultimately offers a comparative analysis of the most relevant patent law regimes.

Keywords: algorithms, inventor, genetic breeding models, machine learning, patentability

Procedia PDF Downloads 89
11 Predicting Football Player Performance: Integrating Data Visualization and Machine Learning

Authors: Saahith M. S., Sivakami R.

Abstract:

In the realm of football analytics, particularly focusing on predicting football player performance, the ability to forecast player success accurately is of paramount importance for teams, managers, and fans. This study introduces an elaborate examination of predicting football player performance through the integration of data visualization methods and machine learning algorithms. The research entails the compilation of an extensive dataset comprising player attributes, conducting data preprocessing, feature selection, model selection, and model training to construct predictive models. The analysis within this study will involve delving into feature significance using methodologies like Select Best and Recursive Feature Elimination (RFE) to pinpoint pertinent attributes for predicting player performance. Various machine learning algorithms, including Random Forest, Decision Tree, Linear Regression, Support Vector Regression (SVR), and Artificial Neural Networks (ANN), will be explored to develop predictive models. The evaluation of each model's performance utilizing metrics such as Mean Squared Error (MSE) and R-squared will be executed to gauge their efficacy in predicting player performance. Furthermore, this investigation will encompass a top player analysis to recognize the top-performing players based on the anticipated overall performance scores. Nationality analysis will entail scrutinizing the player distribution based on nationality and investigating potential correlations between nationality and player performance. Positional analysis will concentrate on examining the player distribution across various positions and assessing the average performance of players in each position. Age analysis will evaluate the influence of age on player performance and identify any discernible trends or patterns associated with player age groups. The primary objective is to predict a football player's overall performance accurately based on their individual attributes, leveraging data-driven insights to enrich the comprehension of player success on the field. By amalgamating data visualization and machine learning methodologies, the aim is to furnish valuable tools for teams, managers, and fans to effectively analyze and forecast player performance. This research contributes to the progression of sports analytics by showcasing the potential of machine learning in predicting football player performance and offering actionable insights for diverse stakeholders in the football industry.

Keywords: football analytics, player performance prediction, data visualization, machine learning algorithms, random forest, decision tree, linear regression, support vector regression, artificial neural networks, model evaluation, top player analysis, nationality analysis, positional analysis

Procedia PDF Downloads 10
10 Accounting for Downtime Effects in Resilience-Based Highway Network Restoration Scheduling

Authors: Zhenyu Zhang, Hsi-Hsien Wei

Abstract:

Highway networks play a vital role in post-disaster recovery for disaster-damaged areas. Damaged bridges in such networks can disrupt the recovery activities by impeding the transportation of people, cargo, and reconstruction resources. Therefore, rapid restoration of damaged bridges is of paramount importance to long-term disaster recovery. In the post-disaster recovery phase, the key to restoration scheduling for a highway network is prioritization of bridge-repair tasks. Resilience is widely used as a measure of the ability to recover with which a network can return to its pre-disaster level of functionality. In practice, highways will be temporarily blocked during the downtime of bridge restoration, leading to the decrease of highway-network functionality. The failure to take downtime effects into account can lead to overestimation of network resilience. Additionally, post-disaster recovery of highway networks is generally divided into emergency bridge repair (EBR) in the response phase and long-term bridge repair (LBR) in the recovery phase, and both of EBR and LBR are different in terms of restoration objectives, restoration duration, budget, etc. Distinguish these two phases are important to precisely quantify highway network resilience and generate suitable restoration schedules for highway networks in the recovery phase. To address the above issues, this study proposes a novel resilience quantification method for the optimization of long-term bridge repair schedules (LBRS) taking into account the impact of EBR activities and restoration downtime on a highway network’s functionality. A time-dependent integer program with recursive functions is formulated for optimally scheduling LBR activities. Moreover, since uncertainty always exists in the LBRS problem, this paper extends the optimization model from the deterministic case to the stochastic case. A hybrid genetic algorithm that integrates a heuristic approach into a traditional genetic algorithm to accelerate the evolution process is developed. The proposed methods are tested using data from the 2008 Wenchuan earthquake, based on a regional highway network in Sichuan, China, consisting of 168 highway bridges on 36 highways connecting 25 cities/towns. The results show that, in this case, neglecting the bridge restoration downtime can lead to approximately 15% overestimation of highway network resilience. Moreover, accounting for the impact of EBR on network functionality can help to generate a more specific and reasonable LBRS. The theoretical and practical values are as follows. First, the proposed network recovery curve contributes to comprehensive quantification of highway network resilience by accounting for the impact of both restoration downtime and EBR activities on the recovery curves. Moreover, this study can improve the highway network resilience from the organizational dimension by providing bridge managers with optimal LBR strategies.

Keywords: disaster management, highway network, long-term bridge repair schedule, resilience, restoration downtime

Procedia PDF Downloads 121
9 Recognition of Spelling Problems during the Text in Progress: A Case Study on the Comments Made by Portuguese Students Newly Literate

Authors: E. Calil, L. A. Pereira

Abstract:

The acquisition of orthography is a complex process, involving both lexical and grammatical questions. This learning occurs simultaneously with the domain of multiple textual aspects (e.g.: graphs, punctuation, etc.). However, most of the research on orthographic acquisition focus on this acquisition from an autonomous point of view, separated from the process of textual production. This means that their object of analysis is the production of words selected by the researcher or the requested sentences in an experimental and controlled setting. In addition, the analysis of the Spelling Problems (SP) are identified by the researcher on the sheet of paper. Considering the perspective of Textual Genetics, from an enunciative approach, this study will discuss the SPs recognized by dyads of newly literate students, while they are writing a text collaboratively. Six proposals of textual production were registered, requested by a 2nd year teacher of a Portuguese Primary School between January and March 2015. In our case study we discuss the SPs recognized by the dyad B and L (7 years old). We adopted as a methodological tool the Ramos System audiovisual record. This system allows real-time capture of the text in process and of the face-to-face dialogue between both students and their teacher, and also captures the body movements and facial expressions of the participants during textual production proposals in the classroom. In these ecological conditions of multimodal registration of collaborative writing, we could identify the emergence of SP in two dimensions: i. In the product (finished text): SP identification without recursive graphic marks (without erasures) and the identification of SPs with erasures, indicating the recognition of SP by the student; ii. In the process (text in progress): identification of comments made by students about recognized SPs. Given this, we’ve analyzed the comments on identified SPs during the text in progress. These comments characterize a type of reformulation referred to as Commented Oral Erasure (COE). The COE has two enunciative forms: Simple Comment (SC) such as ' 'X' is written with 'Y' '; or Unfolded Comment (UC), such as ' 'X' is written with 'Y' because...'. The spelling COE may also occur before or during the SP (Early Spelling Recognition - ESR) or after the SP has been entered (Later Spelling Recognition - LSR). There were 631 words entered in the 6 stories written by the B-L dyad, 145 of them containing some type of SP. During the text in progress, the students recognized orally 174 SP, 46 of which were identified in advance (ESRs) and 128 were identified later (LSPs). If we consider that the 88 erasure SPs in the product indicate some form of SP recognition, we can observe that there were twice as many SPs recognized orally. The ESR was characterized by SC when students asked their colleague or teacher how to spell a given word. The LSR presented predominantly UC, verbalizing meta-orthographic arguments, mostly made by L. These results indicate that writing in dyad is an important didactic strategy for the promotion of metalinguistic reflection, favoring the learning of spelling.

Keywords: collaborative writing, erasure, learning, metalinguistic awareness, spelling, text production

Procedia PDF Downloads 139
8 Seismic Perimeter Surveillance System (Virtual Fence) for Threat Detection and Characterization Using Multiple ML Based Trained Models in Weighted Ensemble Voting

Authors: Vivek Mahadev, Manoj Kumar, Neelu Mathur, Brahm Dutt Pandey

Abstract:

Perimeter guarding and protection of critical installations require prompt intrusion detection and assessment to take effective countermeasures. Currently, visual and electronic surveillance are the primary methods used for perimeter guarding. These methods can be costly and complicated, requiring careful planning according to the location and terrain. Moreover, these methods often struggle to detect stealthy and camouflaged insurgents. The object of the present work is to devise a surveillance technique using seismic sensors that overcomes the limitations of existing systems. The aim is to improve intrusion detection, assessment, and characterization by utilizing seismic sensors. Most of the similar systems have only two types of intrusion detection capability viz., human or vehicle. In our work we could even categorize further to identify types of intrusion activity such as walking, running, group walking, fence jumping, tunnel digging and vehicular movements. A virtual fence of 60 meters at GCNEP, Bahadurgarh, Haryana, India, was created by installing four underground geophones at a distance of 15 meters each. The signals received from these geophones are then processed to find unique seismic signatures called features. Various feature optimization and selection methodologies, such as LightGBM, Boruta, Random Forest, Logistics, Recursive Feature Elimination, Chi-2 and Pearson Ratio were used to identify the best features for training the machine learning models. The trained models were developed using algorithms such as supervised support vector machine (SVM) classifier, kNN, Decision Tree, Logistic Regression, Naïve Bayes, and Artificial Neural Networks. These models were then used to predict the category of events, employing weighted ensemble voting to analyze and combine their results. The models were trained with 1940 training events and results were evaluated with 831 test events. It was observed that using the weighted ensemble voting increased the efficiency of predictions. In this study we successfully developed and deployed the virtual fence using geophones. Since these sensors are passive, do not radiate any energy and are installed underground, it is impossible for intruders to locate and nullify them. Their flexibility, quick and easy installation, low costs, hidden deployment and unattended surveillance make such systems especially suitable for critical installations and remote facilities with difficult terrain. This work demonstrates the potential of utilizing seismic sensors for creating better perimeter guarding and protection systems using multiple machine learning models in weighted ensemble voting. In this study the virtual fence achieved an intruder detection efficiency of over 97%.

Keywords: geophone, seismic perimeter surveillance, machine learning, weighted ensemble method

Procedia PDF Downloads 38
7 The Bidirectional Effect between Parental Burnout and the Child’s Internalized and/or Externalized Behaviors

Authors: Aline Woine, Moïra Mikolajczak, Virginie Dardier, Isabelle Roskam

Abstract:

Background information: Becoming a parent is said to be the happiest event one can ever experience in one’s life. This popular (and almost absolute) truth–which no reasonable and decent human being would ever dare question on pain of being singled out as a bad parent–contrasts with the nuances that reality offers. Indeed, while many parents do thrive in their parenting role, some others falter and become progressively overwhelmed by their parenting role, ineluctably caught in a spiral of exhaustion. Parental burnout (henceforth PB) sets in when parental demands (stressors) exceed parental resources. While it is now generally acknowledged that PB affects the parent’s behavior in terms of neglect and violence toward their offspring, little is known about the impact that the syndrome might have on the children’s internalized (anxious and depressive symptoms, somatic complaints, etc.) and/or externalized (irritability, violence, aggressiveness, conduct disorder, oppositional disorder, etc.) behaviors. Furthermore, at the time of writing, to our best knowledge, no research has yet tested the reverse effect, namely, that of the child's internalized and/or externalized behaviors on the onset and/or maintenance of parental burnout symptoms. Goals and hypotheses: The present pioneering research proposes to fill an important gap in the existing literature related to PB by investigating the bidirectional effect between PB and the child’s internalized and/or externalized behaviors. Relying on a cross-lagged longitudinal study with three waves of data collection (4 months apart), our study tests a transactional model with bidirectional and recursive relations between observed variables and at the three waves, as well as autoregressive paths and cross-sectional correlations. Methods: As we write this, wave-two data are being collected via Qualtrics, and we expect a final sample of about 600 participants composed of French-speaking (snowball sample) and English-speaking (Prolific sample) parents. Structural equation modeling is employed using Stata version 17. In order to retain as much statistical power as possible, we use all available data and therefore apply the maximum likelihood with a missing value (mlmv) as the method of estimation to compute the parameter estimates. To limit (in so far is possible) the shared method variance bias in the evaluation of the child’s behavior, the study relies on a multi-informant evaluation approach. Expected results: We expect our three-wave longitudinal study to show that PB symptoms (measured at T1) raise the occurrence/intensity of the child’s externalized and/or internalized behaviors (measured at T2 and T3). We further expect the child’s occurrence/intensity of externalized and/or internalized behaviors (measured at T1) to augment the risk for PB (measured at T2 and T3). Conclusion: Should our hypotheses be confirmed, our results will make an important contribution to the understanding of both PB and children’s behavioral issues, thereby opening interesting theoretical and clinical avenues.

Keywords: exhaustion, structural equation modeling, cross-lagged longitudinal study, violence and neglect, child-parent relationship

Procedia PDF Downloads 43
6 Electromagnetic Simulation Based on Drift and Diffusion Currents for Real-Time Systems

Authors: Alexander Norbach

Abstract:

The script in this paper describes the use of advanced simulation environment using electronic systems (Microcontroller, Operational Amplifiers, and FPGA). The simulation may be used for all dynamic systems with the diffusion and the ionisation behaviour also. By additionally required observer structure, the system works with parallel real-time simulation based on diffusion model and the state-space representation for other dynamics. The proposed deposited model may be used for electrodynamic effects, including ionising effects and eddy current distribution also. With the script and proposed method, it is possible to calculate the spatial distribution of the electromagnetic fields in real-time. For further purpose, the spatial temperature distribution may be used also. With upon system, the uncertainties, unknown initial states and disturbances may be determined. This provides the estimation of the more precise system states for the required system, and additionally, the estimation of the ionising disturbances that occur due to radiation effects. The results have shown that a system can be also developed and adopted specifically for space systems with the real-time calculation of the radiation effects only. Electronic systems can take damage caused by impacts with charged particle flux in space or radiation environment. In order to be able to react to these processes, it must be calculated within a shorter time that ionising radiation and dose is present. All available sensors shall be used to observe the spatial distributions. By measured value of size and known location of the sensors, the entire distribution can be calculated retroactively or more accurately. With the formation, the type of ionisation and the direct effect to the systems and thus possible prevent processes can be activated up to the shutdown. The results show possibilities to perform more qualitative and faster simulations independent of kind of systems space-systems and radiation environment also. The paper gives additionally an overview of the diffusion effects and their mechanisms. For the modelling and derivation of equations, the extended current equation is used. The size K represents the proposed charge density drifting vector. The extended diffusion equation was derived and shows the quantising character and has similar law like the Klein-Gordon equation. These kinds of PDE's (Partial Differential Equations) are analytically solvable by giving initial distribution conditions (Cauchy problem) and boundary conditions (Dirichlet boundary condition). For a simpler structure, a transfer function for B- and E- fields was analytically calculated. With known discretised responses g₁(k·Ts) and g₂(k·Ts), the electric current or voltage may be calculated using a convolution; g₁ is the direct function and g₂ is a recursive function. The analytical results are good enough for calculation of fields with diffusion effects. Within the scope of this work, a proposed model of the consideration of the electromagnetic diffusion effects of arbitrary current 'waveforms' has been developed. The advantage of the proposed calculation of diffusion is the real-time capability, which is not really possible with the FEM programs available today. It makes sense in the further course of research to use these methods and to investigate them thoroughly.

Keywords: advanced observer, electrodynamics, systems, diffusion, partial differential equations, solver

Procedia PDF Downloads 102
5 The Biosphere as a Supercomputer Directing and Controlling Evolutionary Processes

Authors: Igor A. Krichtafovitch

Abstract:

The evolutionary processes are not linear. Long periods of quiet and slow development turn to rather rapid emergences of new species and even phyla. During Cambrian explosion, 22 new phyla were added to the previously existed 3 phyla. Contrary to the common credence the natural selection or a survival of the fittest cannot be accounted for the dominant evolution vector which is steady and accelerated advent of more complex and more intelligent living organisms. Neither Darwinism nor alternative concepts including panspermia and intelligent design propose a satisfactory solution for these phenomena. The proposed hypothesis offers a logical and plausible explanation of the evolutionary processes in general. It is based on two postulates: a) the Biosphere is a single living organism, all parts of which are interconnected, and b) the Biosphere acts as a giant biological supercomputer, storing and processing the information in digital and analog forms. Such supercomputer surpasses all human-made computers by many orders of magnitude. Living organisms are the product of intelligent creative action of the biosphere supercomputer. The biological evolution is driven by growing amount of information stored in the living organisms and increasing complexity of the biosphere as a single organism. Main evolutionary vector is not a survival of the fittest but an accelerated growth of the computational complexity of the living organisms. The following postulates may summarize the proposed hypothesis: biological evolution as a natural life origin and development is a reality. Evolution is a coordinated and controlled process. One of evolution’s main development vectors is a growing computational complexity of the living organisms and the biosphere’s intelligence. The intelligent matter which conducts and controls global evolution is a gigantic bio-computer combining all living organisms on Earth. The information is acting like a software stored in and controlled by the biosphere. Random mutations trigger this software, as is stipulated by Darwinian Evolution Theories, and it is further stimulated by the growing demand for the Biosphere’s global memory storage and computational complexity. Greater memory volume requires a greater number and more intellectually advanced organisms for storing and handling it. More intricate organisms require the greater computational complexity of biosphere in order to keep control over the living world. This is an endless recursive endeavor with accelerated evolutionary dynamic. New species emerge when two conditions are met: a) crucial environmental changes occur and/or global memory storage volume comes to its limit and b) biosphere computational complexity reaches critical mass capable of producing more advanced creatures. The hypothesis presented here is a naturalistic concept of life creation and evolution. The hypothesis logically resolves many puzzling problems with the current state evolution theory such as speciation, as a result of GM purposeful design, evolution development vector, as a need for growing global intelligence, punctuated equilibrium, happening when two above conditions a) and b) are met, the Cambrian explosion, mass extinctions, happening when more intelligent species should replace outdated creatures.

Keywords: supercomputer, biological evolution, Darwinism, speciation

Procedia PDF Downloads 127
4 Video Club as a Pedagogical Tool to Shift Teachers’ Image of the Child

Authors: Allison Tucker, Carolyn Clarke, Erin Keith

Abstract:

Introduction: In education, the determination to uncover privileged practices requires critical reflection to be placed at the center of both pre-service and in-service teacher education. Confronting deficit thinking about children’s abilities and shifting to holding an image of the child as capable and competent is necessary for teachers to engage in responsive pedagogy that meets children where they are in their learning and builds on strengths. This paper explores the ways in which early elementary teachers' perceptions of the assets of children might shift through the pedagogical use of video clubs. Video club is a pedagogical practice whereby teachers record and view short videos with the intended purpose of deepening their practices. The use of video club as a learning tool has been an extensively documented practice. In this study, a video club is used to watch short recordings of playing children to identify the assets of their students. Methodology: The study on which this paper is based asks the question: What are the ways in which teachers’ image of the child and teaching practices evolve through the use of video club focused on the strengths of children demonstrated during play? Using critical reflection, it aims to identify and describe participants’ experiences of examining their personally held image of the child through the pedagogical tool video club, and how that image influences their practices, specifically in implementing play pedagogy. Teachers enrolled in a graduate-level play pedagogy course record and watch videos of their own students as a means to notice and reflect on the learning that happens during play. Using a co-constructed viewing protocol, teachers identify student strengths and consider their pedagogical responses. Video club provides a framework for teachers to critically reflect in action, return to the video to rewatch the children or themselves and discuss their noticings with colleagues. Critical reflection occurs when there is focused attention on identifying the ways in which actions perpetuate or challenge issues of inherent power in education. When the image of the child held by the teacher is from a deficit position and is influenced by hegemonic dimensions of practice, critical reflection is essential in naming and addressing power imbalances, biases, and practices that are harmful to children and become barriers to their thriving. The data is comprised of teacher reflections, analyzed using phenomenology. Phenomenology seeks to understand and appreciate how individuals make sense of their experiences. Teacher reflections are individually read, and researchers determine pools of meaning. Categories are identified by each researcher, after which commonalities are named through a recursive process of returning to the data until no more themes emerge or saturation is reached. Findings: The final analysis and interpretation of the data are forthcoming. However, emergent analysis of the data collected using teacher reflections reveals the ways in which the use of video club grew teachers’ awareness of their image of the child. It shows video club as a promising pedagogical tool when used with in-service teachers to prompt opportunities for play and to challenge deficit thinking about children and their abilities to thrive in learning.

Keywords: asset-based teaching, critical reflection, image of the child, video club

Procedia PDF Downloads 73
3 On the Influence of Sleep Habits for Predicting Preterm Births: A Machine Learning Approach

Authors: C. Fernandez-Plaza, I. Abad, E. Diaz, I. Diaz

Abstract:

Births occurring before the 37th week of gestation are considered preterm births. A threat of preterm is defined as the beginning of regular uterine contractions, dilation and cervical effacement between 23 and 36 gestation weeks. To author's best knowledge, the factors that determine the beginning of the birth are not completely defined yet. In particular, the incidence of sleep habits on preterm births is weekly studied. The aim of this study is to develop a model to predict the factors affecting premature delivery on pregnancy, based on the above potential risk factors, including those derived from sleep habits and light exposure at night (introduced as 12 variables obtained by a telephone survey using two questionnaires previously used by other authors). Thus, three groups of variables were included in the study (maternal, fetal and sleep habits). The study was approved by Research Ethics Committee of the Principado of Asturias (Spain). An observational, retrospective and descriptive study was performed with 481 births between January 1, 2015 and May 10, 2016 in the University Central Hospital of Asturias (Spain). A statistical analysis using SPSS was carried out to compare qualitative and quantitative variables between preterm and term delivery. Chi-square test qualitative variable and t-test for quantitative variables were applied. Statistically significant differences (p < 0.05) between preterm vs. term births were found for primiparity, multi-parity, kind of conception, place of residence or premature rupture of membranes and interruption during nights. In addition to the statistical analysis, machine learning methods to look for a prediction model were tested. In particular, tree based models were applied as the trade-off between performance and interpretability is especially suitable for this study. C5.0, recursive partitioning, random forest and tree bag models were analysed using caret R-package. Cross validation with 10-folds and parameter tuning to optimize the methods were applied. In addition, different noise reduction methods were applied to the initial data using NoiseFiltersR package. The best performance was obtained by C5.0 method with Accuracy 0.91, Sensitivity 0.93, Specificity 0.89 and Precision 0.91. Some well known preterm birth factors were identified: Cervix Dilation, maternal BMI, Premature rupture of membranes or nuchal translucency analysis in the first trimester. The model also identifies other new factors related to sleep habits such as light through window, bedtime on working days, usage of electronic devices before sleeping from Mondays to Fridays or change of sleeping habits reflected in the number of hours, in the depth of sleep or in the lighting of the room. IF dilation < = 2.95 AND usage of electronic devices before sleeping from Mondays to Friday = YES and change of sleeping habits = YES, then preterm is one of the predicting rules obtained by C5.0. In this work a model for predicting preterm births is developed. It is based on machine learning together with noise reduction techniques. The method maximizing the performance is the one selected. This model shows the influence of variables related to sleep habits in preterm prediction.

Keywords: machine learning, noise reduction, preterm birth, sleep habit

Procedia PDF Downloads 111
2 Production Factor Coefficients Transition through the Lens of State Space Model

Authors: Kanokwan Chancharoenchai

Abstract:

Economic growth can be considered as an important element of countries’ development process. For developing countries, like Thailand, to ensure the continuous growth of the economy, the Thai government usually implements various policies to stimulate economic growth. They may take the form of fiscal, monetary, trade, and other policies. Because of these different aspects, understanding factors relating to economic growth could allow the government to introduce the proper plan for the future economic stimulating scheme. Consequently, this issue has caught interest of not only policymakers but also academics. This study, therefore, investigates explanatory variables for economic growth in Thailand from 2005 to 2017 with a total of 52 quarters. The findings would contribute to the field of economic growth and become helpful information to policymakers. The investigation is estimated throughout the production function with non-linear Cobb-Douglas equation. The rate of growth is indicated by the change of GDP in the natural logarithmic form. The relevant factors included in the estimation cover three traditional means of production and implicit effects, such as human capital, international activity and technological transfer from developed countries. Besides, this investigation takes the internal and external instabilities into account as proxied by the unobserved inflation estimation and the real effective exchange rate (REER) of the Thai baht, respectively. The unobserved inflation series are obtained from the AR(1)-ARCH(1) model, while the unobserved REER of Thai baht is gathered from naive OLS-GARCH(1,1) model. According to empirical results, the AR(|2|) equation which includes seven significant variables, namely capital stock, labor, the imports of capital goods, trade openness, the REER of Thai baht uncertainty, one previous GDP, and the world financial crisis in 2009 dummy, presents the most suitable model. The autoregressive model is assumed constant estimator that would somehow cause the unbias. However, this is not the case of the recursive coefficient model from the state space model that allows the transition of coefficients. With the powerful state space model, it provides the productivity or effect of each significant factor more in detail. The state coefficients are estimated based on the AR(|2|) with the exception of the one previous GDP and the 2009 world financial crisis dummy. The findings shed the light that those factors seem to be stable through time since the occurrence of the world financial crisis together with the political situation in Thailand. These two events could lower the confidence in the Thai economy. Moreover, state coefficients highlight the sluggish rate of machinery replacement and quite low technology of capital goods imported from abroad. The Thai government should apply proactive policies via taxation and specific credit policy to improve technological advancement, for instance. Another interesting evidence is the issue of trade openness which shows the negative transition effect along the sample period. This could be explained by the loss of price competitiveness to imported goods, especially under the widespread implementation of free trade agreement. The Thai government should carefully handle with regulations and the investment incentive policy by focusing on strengthening small and medium enterprises.

Keywords: autoregressive model, economic growth, state space model, Thailand

Procedia PDF Downloads 122
1 Evaluation of Random Forest and Support Vector Machine Classification Performance for the Prediction of Early Multiple Sclerosis from Resting State FMRI Connectivity Data

Authors: V. Saccà, A. Sarica, F. Novellino, S. Barone, T. Tallarico, E. Filippelli, A. Granata, P. Valentino, A. Quattrone

Abstract:

The work aim was to evaluate how well Random Forest (RF) and Support Vector Machine (SVM) algorithms could support the early diagnosis of Multiple Sclerosis (MS) from resting-state functional connectivity data. In particular, we wanted to explore the ability in distinguishing between controls and patients of mean signals extracted from ICA components corresponding to 15 well-known networks. Eighteen patients with early-MS (mean-age 37.42±8.11, 9 females) were recruited according to McDonald and Polman, and matched for demographic variables with 19 healthy controls (mean-age 37.55±14.76, 10 females). MRI was acquired by a 3T scanner with 8-channel head coil: (a)whole-brain T1-weighted; (b)conventional T2-weighted; (c)resting-state functional MRI (rsFMRI), 200 volumes. Estimated total lesion load (ml) and number of lesions were calculated using LST-toolbox from the corrected T1 and FLAIR. All rsFMRIs were pre-processed using tools from the FMRIB's Software Library as follows: (1) discarding of the first 5 volumes to remove T1 equilibrium effects, (2) skull-stripping of images, (3) motion and slice-time correction, (4) denoising with high-pass temporal filter (128s), (5) spatial smoothing with a Gaussian kernel of FWHM 8mm. No statistical significant differences (t-test, p < 0.05) were found between the two groups in the mean Euclidian distance and the mean Euler angle. WM and CSF signal together with 6 motion parameters were regressed out from the time series. We applied an independent component analysis (ICA) with the GIFT-toolbox using the Infomax approach with number of components=21. Fifteen mean components were visually identified by two experts. The resulting z-score maps were thresholded and binarized to extract the mean signal of the 15 networks for each subject. Statistical and machine learning analysis were then conducted on this dataset composed of 37 rows (subjects) and 15 features (mean signal in the network) with R language. The dataset was randomly splitted into training (75%) and test sets and two different classifiers were trained: RF and RBF-SVM. We used the intrinsic feature selection of RF, based on the Gini index, and recursive feature elimination (rfe) for the SVM, to obtain a rank of the most predictive variables. Thus, we built two new classifiers only on the most important features and we evaluated the accuracies (with and without feature selection) on test-set. The classifiers, trained on all the features, showed very poor accuracies on training (RF:58.62%, SVM:65.52%) and test sets (RF:62.5%, SVM:50%). Interestingly, when feature selection by RF and rfe-SVM were performed, the most important variable was the sensori-motor network I in both cases. Indeed, with only this network, RF and SVM classifiers reached an accuracy of 87.5% on test-set. More interestingly, the only misclassified patient resulted to have the lowest value of lesion volume. We showed that, with two different classification algorithms and feature selection approaches, the best discriminant network between controls and early MS, was the sensori-motor I. Similar importance values were obtained for the sensori-motor II, cerebellum and working memory networks. These findings, in according to the early manifestation of motor/sensorial deficits in MS, could represent an encouraging step toward the translation to the clinical diagnosis and prognosis.

Keywords: feature selection, machine learning, multiple sclerosis, random forest, support vector machine

Procedia PDF Downloads 216