Search results for: random forest algorithm
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6083

Search results for: random forest algorithm

4433 Next Generation Radiation Risk Assessment and Prediction Tools Generation Applying AI-Machine (Deep) Learning Algorithms

Authors: Selim M. Khan

Abstract:

Indoor air quality is strongly influenced by the presence of radioactive radon (222Rn) gas. Indeed, exposure to high 222Rn concentrations is unequivocally linked to DNA damage and lung cancer and is a worsening issue in North American and European built environments, having increased over time within newer housing stocks as a function of as yet unclear variables. Indoor air radon concentration can be influenced by a wide range of environmental, structural, and behavioral factors. As some of these factors are quantitative while others are qualitative, no single statistical model can determine indoor radon level precisely while simultaneously considering all these variables across a complex and highly diverse dataset. The ability of AI- machine (deep) learning to simultaneously analyze multiple quantitative and qualitative features makes it suitable to predict radon with a high degree of precision. Using Canadian and Swedish long-term indoor air radon exposure data, we are using artificial deep neural network models with random weights and polynomial statistical models in MATLAB to assess and predict radon health risk to human as a function of geospatial, human behavioral, and built environmental metrics. Our initial artificial neural network with random weights model run by sigmoid activation tested different combinations of variables and showed the highest prediction accuracy (>96%) within the reasonable iterations. Here, we present details of these emerging methods and discuss strengths and weaknesses compared to the traditional artificial neural network and statistical methods commonly used to predict indoor air quality in different countries. We propose an artificial deep neural network with random weights as a highly effective method for assessing and predicting indoor radon.

Keywords: radon, radiation protection, lung cancer, aI-machine deep learnng, risk assessment, risk prediction, Europe, North America

Procedia PDF Downloads 86
4432 Design and Development of an Algorithm to Predict Fluctuations of Currency Rates

Authors: Nuwan Kuruwitaarachchi, M. K. M. Peiris, C. N. Madawala, K. M. A. R. Perera, V. U. N Perera

Abstract:

Dealing with businesses with the foreign market always took a special place in a country’s economy. Political and social factors came into play making currency rate changes fluctuate rapidly. Currency rate prediction has become an important factor for larger international businesses since large amounts of money exchanged between countries. This research focuses on comparing the accuracy of mainly three models; Autoregressive Integrated Moving Average (ARIMA), Artificial Neural Networks(ANN) and Support Vector Machines(SVM). series of data import, export, USD currency exchange rate respect to LKR has been selected for training using above mentioned algorithms. After training the data set and comparing each algorithm, it was able to see that prediction in SVM performed better than other models. It was improved more by combining SVM and SVR models together.

Keywords: ARIMA, ANN, FFNN, RMSE, SVM, SVR

Procedia PDF Downloads 193
4431 Using the Smith-Waterman Algorithm to Extract Features in the Classification of Obesity Status

Authors: Rosa Figueroa, Christopher Flores

Abstract:

Text categorization is the problem of assigning a new document to a set of predetermined categories, on the basis of a training set of free-text data that contains documents whose category membership is known. To train a classification model, it is necessary to extract characteristics in the form of tokens that facilitate the learning and classification process. In text categorization, the feature extraction process involves the use of word sequences also known as N-grams. In general, it is expected that documents belonging to the same category share similar features. The Smith-Waterman (SW) algorithm is a dynamic programming algorithm that performs a local sequence alignment in order to determine similar regions between two strings or protein sequences. This work explores the use of SW algorithm as an alternative to feature extraction in text categorization. The dataset used for this purpose, contains 2,610 annotated documents with the classes Obese/Non-Obese. This dataset was represented in a matrix form using the Bag of Word approach. The score selected to represent the occurrence of the tokens in each document was the term frequency-inverse document frequency (TF-IDF). In order to extract features for classification, four experiments were conducted: the first experiment used SW to extract features, the second one used unigrams (single word), the third one used bigrams (two word sequence) and the last experiment used a combination of unigrams and bigrams to extract features for classification. To test the effectiveness of the extracted feature set for the four experiments, a Support Vector Machine (SVM) classifier was tuned using 20% of the dataset. The remaining 80% of the dataset together with 5-Fold Cross Validation were used to evaluate and compare the performance of the four experiments of feature extraction. Results from the tuning process suggest that SW performs better than the N-gram based feature extraction. These results were confirmed by using the remaining 80% of the dataset, where SW performed the best (accuracy = 97.10%, weighted average F-measure = 97.07%). The second best was obtained by the combination of unigrams-bigrams (accuracy = 96.04, weighted average F-measure = 95.97) closely followed by the bigrams (accuracy = 94.56%, weighted average F-measure = 94.46%) and finally unigrams (accuracy = 92.96%, weighted average F-measure = 92.90%).

Keywords: comorbidities, machine learning, obesity, Smith-Waterman algorithm

Procedia PDF Downloads 286
4430 Using ε Value in Describe Regular Languages by Using Finite Automata, Operation on Languages and the Changing Algorithm Implementation

Authors: Abdulmajid Mukhtar Afat

Abstract:

This paper aims at introducing nondeterministic finite automata with ε value which is used to perform some operations on languages. a program is created to implement the algorithm that converts nondeterministic finite automata with ε value (ε-NFA) to deterministic finite automata (DFA).The program is written in c++ programming language. The program inputs are FA 5-tuples from text file and then classifies it into either DFA/NFA or ε -NFA. For DFA, the program will get the string w and decide whether it is accepted or rejected. The tracking path for an accepted string is saved by the program. In case of NFA or ε-NFA automation, the program changes the automation to DFA to enable tracking and to decide if the string w exists in the regular language or not.

Keywords: DFA, NFA, ε-NFA, eclose, finite automata, operations on languages

Procedia PDF Downloads 477
4429 Probability-Based Damage Detection of Structures Using Kriging Surrogates and Enhanced Ideal Gas Molecular Movement Algorithm

Authors: M. R. Ghasemi, R. Ghiasi, H. Varaee

Abstract:

Surrogate model has received increasing attention for use in detecting damage of structures based on vibration modal parameters. However, uncertainties existing in the measured vibration data may lead to false or unreliable output result from such model. In this study, an efficient approach based on Monte Carlo simulation is proposed to take into account the effect of uncertainties in developing a surrogate model. The probability of damage existence (PDE) is calculated based on the probability density function of the existence of undamaged and damaged states. The kriging technique allows one to genuinely quantify the surrogate error, therefore it is chosen as metamodeling technique. Enhanced version of ideal gas molecular movement (EIGMM) algorithm is used as main algorithm for model updating. The developed approach is applied to detect simulated damage in numerical models of 72-bar space truss and 120-bar dome truss. The simulation results show the proposed method can perform well in probability-based damage detection of structures with less computational effort compared to direct finite element model.

Keywords: probability-based damage detection (PBDD), Kriging, surrogate modeling, uncertainty quantification, artificial intelligence, enhanced ideal gas molecular movement (EIGMM)

Procedia PDF Downloads 228
4428 Frailty Models for Modeling Heterogeneity: Simulation Study and Application to Quebec Pension Plan

Authors: Souad Romdhane, Lotfi Belkacem

Abstract:

When referring to actuarial analysis of lifetime, only models accounting for observable risk factors have been developed. Within this context, Cox proportional hazards model (CPH model) is commonly used to assess the effects of observable covariates as gender, age, smoking habits, on the hazard rates. These covariates may fail to fully account for the true lifetime interval. This may be due to the existence of another random variable (frailty) that is still being ignored. The aim of this paper is to examine the shared frailty issue in the Cox proportional hazard model by including two different parametric forms of frailty into the hazard function. Four estimated methods are used to fit them. The performance of the parameter estimates is assessed and compared between the classical Cox model and these frailty models through a real-life data set from the Quebec Pension Plan and then using a more general simulation study. This performance is investigated in terms of the bias of point estimates and their empirical standard errors in both fixed and random effect parts. Both the simulation and the real dataset studies showed differences between classical Cox model and shared frailty model.

Keywords: life insurance-pension plan, survival analysis, risk factors, cox proportional hazards model, multivariate failure-time data, shared frailty, simulations study

Procedia PDF Downloads 350
4427 Optimization of Water Pipeline Routes Using a GIS-Based Multi-Criteria Decision Analysis and a Geometric Search Algorithm

Authors: Leon Mortari

Abstract:

The Metropolitan East region of Rio de Janeiro state, Brazil, faces a historic water scarcity. Among the alternatives studied to solve this situation, the possibility of adduction of the available water in the reservoir Lagoa de Juturnaíba to supply the region's municipalities stands out. The allocation of a linear engineering project must occur through an evaluation of different aspects, such as altitude, slope, proximity to roads, distance from watercourses, land use and occupation, and physical and chemical features of the soil. This work aims to apply a multi-criteria model that combines geoprocessing techniques, decision-making, and geometric search algorithm to optimize a hypothetical adductor system in the scenario of expanding the water supply system that serves this region, known as Imunana-Laranjal, using the Lagoa de Juturnaíba as the source. It is proposed in this study, the construction of a spatial database related to the presented evaluation criteria, treatment and rasterization of these data, and standardization and reclassification of this information in a Geographic Information System (GIS) platform. The methodology involves the integrated analysis of these criteria, using their relative importance defined by weighting them based on expert consultations and the Analytic Hierarchy Process (AHP) method. Three approaches are defined for weighting the criteria by AHP: the first treats all criteria as equally important, the second considers weighting based on a pairwise comparison matrix, and the third establishes a hierarchy based on the priority of the criteria. For each approach, a distinct group of weightings is defined. In the next step, map algebra tools are used to overlay the layers and generate cost surfaces, that indicates the resistance to the passage of the adductor route, using the three groups of weightings. The Dijkstra algorithm, a geometric search algorithm, is then applied to these cost surfaces to find an optimized path within the geographical space, aiming to minimize resources, time, investment, maintenance, and environmental and social impacts.

Keywords: geometric search algorithm, GIS, pipeline, route optimization, spatial multi-criteria analysis model

Procedia PDF Downloads 11
4426 Parametric Analysis and Optimal Design of Functionally Graded Plates Using Particle Swarm Optimization Algorithm and a Hybrid Meshless Method

Authors: Foad Nazari, Seyed Mahmood Hosseini, Mohammad Hossein Abolbashari, Mohammad Hassan Abolbashari

Abstract:

The present study is concerned with the optimal design of functionally graded plates using particle swarm optimization (PSO) algorithm. In this study, meshless local Petrov-Galerkin (MLPG) method is employed to obtain the functionally graded (FG) plate’s natural frequencies. Effects of two parameters including thickness to height ratio and volume fraction index on the natural frequencies and total mass of plate are studied by using the MLPG results. Then the first natural frequency of the plate, for different conditions where MLPG data are not available, is predicted by an artificial neural network (ANN) approach which is trained by back-error propagation (BEP) technique. The ANN results show that the predicted data are in good agreement with the actual one. To maximize the first natural frequency and minimize the mass of FG plate simultaneously, the weighted sum optimization approach and PSO algorithm are used. However, the proposed optimization process of this study can provide the designers of FG plates with useful data.

Keywords: optimal design, natural frequency, FG plate, hybrid meshless method, MLPG method, ANN approach, particle swarm optimization

Procedia PDF Downloads 357
4425 Prediction of Bariatric Surgery Publications by Using Different Machine Learning Algorithms

Authors: Senol Dogan, Gunay Karli

Abstract:

Identification of relevant publications based on a Medline query is time-consuming and error-prone. An all based process has the potential to solve this problem without any manual work. To the best of our knowledge, our study is the first to investigate the ability of machine learning to identify relevant articles accurately. 5 different machine learning algorithms were tested using 23 predictors based on several metadata fields attached to publications. We find that the Boosted model is the best-performing algorithm and its overall accuracy is 96%. In addition, specificity and sensitivity of the algorithm is 97 and 93%, respectively. As a result of the work, we understood that we can apply the same procedure to understand cancer gene expression big data.

Keywords: prediction of publications, machine learning, algorithms, bariatric surgery, comparison of algorithms, boosted, tree, logistic regression, ANN model

Procedia PDF Downloads 198
4424 Implications of Optimisation Algorithm on the Forecast Performance of Artificial Neural Network for Streamflow Modelling

Authors: Martins Y. Otache, John J. Musa, Abayomi I. Kuti, Mustapha Mohammed

Abstract:

The performance of an artificial neural network (ANN) is contingent on a host of factors, for instance, the network optimisation scheme. In view of this, the study examined the general implications of the ANN training optimisation algorithm on its forecast performance. To this end, the Bayesian regularisation (Br), Levenberg-Marquardt (LM), and the adaptive learning gradient descent: GDM (with momentum) algorithms were employed under different ANN structural configurations: (1) single-hidden layer, and (2) double-hidden layer feedforward back propagation network. Results obtained revealed generally that the gradient descent with momentum (GDM) optimisation algorithm, with its adaptive learning capability, used a relatively shorter time in both training and validation phases as compared to the Levenberg- Marquardt (LM) and Bayesian Regularisation (Br) algorithms though learning may not be consummated; i.e., in all instances considering also the prediction of extreme flow conditions for 1-day and 5-day ahead, respectively especially using the ANN model. In specific statistical terms on the average, model performance efficiency using the coefficient of efficiency (CE) statistic were Br: 98%, 94%; LM: 98 %, 95 %, and GDM: 96 %, 96% respectively for training and validation phases. However, on the basis of relative error distribution statistics (MAE, MAPE, and MSRE), GDM performed better than the others overall. Based on the findings, it is imperative to state that the adoption of ANN for real-time forecasting should employ training algorithms that do not have computational overhead like the case of LM that requires the computation of the Hessian matrix, protracted time, and sensitivity to initial conditions; to this end, Br and other forms of the gradient descent with momentum should be adopted considering overall time expenditure and quality of the forecast as well as mitigation of network overfitting. On the whole, it is recommended that evaluation should consider implications of (i) data quality and quantity and (ii) transfer functions on the overall network forecast performance.

Keywords: streamflow, neural network, optimisation, algorithm

Procedia PDF Downloads 144
4423 Visualization Tool for EEG Signal Segmentation

Authors: Sweeti, Anoop Kant Godiyal, Neha Singh, Sneh Anand, B. K. Panigrahi, Jayasree Santhosh

Abstract:

This work is about developing a tool for visualization and segmentation of Electroencephalograph (EEG) signals based on frequency domain features. Change in the frequency domain characteristics are correlated with change in mental state of the subject under study. Proposed algorithm provides a way to represent the change in the mental states using the different frequency band powers in form of segmented EEG signal. Many segmentation algorithms have been suggested in literature having application in brain computer interface, epilepsy and cognition studies that have been used for data classification. But the proposed method focusses mainly on the better presentation of signal and that’s why it could be a good utilization tool for clinician. Algorithm performs the basic filtering using band pass and notch filters in the range of 0.1-45 Hz. Advanced filtering is then performed by principal component analysis and wavelet transform based de-noising method. Frequency domain features are used for segmentation; considering the fact that the spectrum power of different frequency bands describes the mental state of the subject. Two sliding windows are further used for segmentation; one provides the time scale and other assigns the segmentation rule. The segmented data is displayed second by second successively with different color codes. Segment’s length can be selected as per need of the objective. Proposed algorithm has been tested on the EEG data set obtained from University of California in San Diego’s online data repository. Proposed tool gives a better visualization of the signal in form of segmented epochs of desired length representing the power spectrum variation in data. The algorithm is designed in such a way that it takes the data points with respect to the sampling frequency for each time frame and so it can be improved to use in real time visualization with desired epoch length.

Keywords: de-noising, multi-channel data, PCA, power spectra, segmentation

Procedia PDF Downloads 386
4422 Equalization Algorithm for the Optical OFDM System Based on the Fractional Fourier Transform

Authors: A. Cherifi, B. Bouazza, A. O. Dahmane, B. Yagoubi

Abstract:

Transmission over Optical channels will introduce inter-symbol interference (ISI) as well as inter-channel (or inter-carrier) interference (ICI). To decrease the effects of ICI, this paper proposes equalizer for the Optical OFDM system based on the fractional Fourier transform (FrFFT). In this FrFT-OFDM system, traditional Fourier transform is replaced by fractional Fourier transform to modulate and demodulate the data symbols. The equalizer proposed consists of sampling the received signal in the different time per time symbol. Theoretical analysis and numerical simulation are discussed.

Keywords: OFDM, (FrFT) fractional fourier transform, optical OFDM, equalization algorithm

Procedia PDF Downloads 420
4421 Genetic Algorithms for Parameter Identification of DC Motor ARMAX Model and Optimal Control

Authors: A. Mansouri, F. Krim

Abstract:

This paper presents two techniques for DC motor parameters identification. We propose a numerical method using the adaptive extensive recursive least squares (AERLS) algorithm for real time parameters estimation. This algorithm, based on minimization of quadratic criterion, is realized in simulation for parameters identification of DC motor autoregressive moving average with extra inputs (ARMAX). As advanced technique, we use genetic algorithms (GA) identification with biased estimation for high dynamic performance speed regulation. DC motors are extensively used in variable speed drives, for robot and solar panel trajectory control. GA effectiveness is derived through comparison of the two approaches.

Keywords: ARMAX model, DC motor, AERLS, GA, optimization, parameter identification, PID speed regulation

Procedia PDF Downloads 364
4420 Predicting Wealth Status of Households Using Ensemble Machine Learning Algorithms

Authors: Habtamu Ayenew Asegie

Abstract:

Wealth, as opposed to income or consumption, implies a more stable and permanent status. Due to natural and human-made difficulties, households' economies will be diminished, and their well-being will fall into trouble. Hence, governments and humanitarian agencies offer considerable resources for poverty and malnutrition reduction efforts. One key factor in the effectiveness of such efforts is the accuracy with which low-income or poor populations can be identified. As a result, this study aims to predict a household’s wealth status using ensemble Machine learning (ML) algorithms. In this study, design science research methodology (DSRM) is employed, and four ML algorithms, Random Forest (RF), Adaptive Boosting (AdaBoost), Light Gradient Boosted Machine (LightGBM), and Extreme Gradient Boosting (XGBoost), have been used to train models. The Ethiopian Demographic and Health Survey (EDHS) dataset is accessed for this purpose from the Central Statistical Agency (CSA)'s database. Various data pre-processing techniques were employed, and the model training has been conducted using the scikit learn Python library functions. Model evaluation is executed using various metrics like Accuracy, Precision, Recall, F1-score, area under curve-the receiver operating characteristics (AUC-ROC), and subjective evaluations of domain experts. An optimal subset of hyper-parameters for the algorithms was selected through the grid search function for the best prediction. The RF model has performed better than the rest of the algorithms by achieving an accuracy of 96.06% and is better suited as a solution model for our purpose. Following RF, LightGBM, XGBoost, and AdaBoost algorithms have an accuracy of 91.53%, 88.44%, and 58.55%, respectively. The findings suggest that some of the features like ‘Age of household head’, ‘Total children ever born’ in a family, ‘Main roof material’ of their house, ‘Region’ they lived in, whether a household uses ‘Electricity’ or not, and ‘Type of toilet facility’ of a household are determinant factors to be a focal point for economic policymakers. The determinant risk factors, extracted rules, and designed artifact achieved 82.28% of the domain expert’s evaluation. Overall, the study shows ML techniques are effective in predicting the wealth status of households.

Keywords: ensemble machine learning, households wealth status, predictive model, wealth status prediction

Procedia PDF Downloads 29
4419 Radiomics: Approach to Enable Early Diagnosis of Non-Specific Breast Nodules in Contrast-Enhanced Magnetic Resonance Imaging

Authors: N. D'Amico, E. Grossi, B. Colombo, F. Rigiroli, M. Buscema, D. Fazzini, G. Cornalba, S. Papa

Abstract:

Purpose: To characterize, through a radiomic approach, the nature of nodules considered non-specific by expert radiologists, recognized in magnetic resonance mammography (MRm) with T1-weighted (T1w) sequences with paramagnetic contrast. Material and Methods: 47 cases out of 1200 undergoing MRm, in which the MRm assessment gave uncertain classification (non-specific nodules), were admitted to the study. The clinical outcome of the non-specific nodules was later found through follow-up or further exams (biopsy), finding 35 benign and 12 malignant. All MR Images were acquired at 1.5T, a first basal T1w sequence and then four T1w acquisitions after the paramagnetic contrast injection. After a manual segmentation of the lesions, done by a radiologist, and the extraction of 150 radiomic features (30 features per 5 subsequent times) a machine learning (ML) approach was used. An evolutionary algorithm (TWIST system based on KNN algorithm) was used to subdivide the dataset into training and validation test and to select features yielding the maximal amount of information. After this pre-processing, different machine learning systems were applied to develop a predictive model based on a training-testing crossover procedure. 10 cases with a benign nodule (follow-up older than 5 years) and 18 with an evident malignant tumor (clear malignant histological exam) were added to the dataset in order to allow the ML system to better learn from data. Results: NaiveBayes algorithm working on 79 features selected by a TWIST system, resulted to be the best performing ML system with a sensitivity of 96% and a specificity of 78% and a global accuracy of 87% (average values of two training-testing procedures ab-ba). The results showed that in the subset of 47 non-specific nodules, the algorithm predicted the outcome of 45 nodules which an expert radiologist could not identify. Conclusion: In this pilot study we identified a radiomic approach allowing ML systems to perform well in the diagnosis of a non-specific nodule at MR mammography. This algorithm could be a great support for the early diagnosis of malignant breast tumor, in the event the radiologist is not able to identify the kind of lesion and reduces the necessity for long follow-up. Clinical Relevance: This machine learning algorithm could be essential to support the radiologist in early diagnosis of non-specific nodules, in order to avoid strenuous follow-up and painful biopsy for the patient.

Keywords: breast, machine learning, MRI, radiomics

Procedia PDF Downloads 258
4418 An Observation Approach of Reading Order for Single Column and Two Column Layout Template

Authors: In-Tsang Lin, Chiching Wei

Abstract:

Reading order is an important task in many digitization scenarios involving the preservation of the logical structure of a document. From the paper survey, it finds that the state-of-the-art algorithm could not fulfill to get the accurate reading order in the portable document format (PDF) files with rich formats, diverse layout arrangement. In recent years, most of the studies on the analysis of reading order have targeted the specific problem of associating layout components with logical labels, while less attention has been paid to the problem of extracting relationships the problem of detecting the reading order relationship between logical components, such as cross-references. Over 3 years of development, the company Foxit has demonstrated the layout recognition (LR) engine in revision 20601 to eager for the accuracy of the reading order. The bounding box of each paragraph can be obtained correctly by the Foxit LR engine, but the result of reading-order is not always correct for single-column, and two-column layout format due to the table issue, formula issue, and multiple mini separated bounding box and footer issue. Thus, the algorithm is developed to improve the accuracy of the reading order based on the Foxit LR structure. In this paper, a creative observation method (Here called the MESH method) is provided here to open a new chance in the research of the reading-order field. Here two important parameters are introduced, one parameter is the number of the bounding box on the right side of the present bounding box (NRight), and another parameter is the number of the bounding box under the present bounding box (Nunder). And the normalized x-value (x/the whole width), the normalized y-value (y/the whole height) of each bounding box, the x-, and y- position of each bounding box were also put into consideration. Initial experimental results of single column layout format demonstrate a 19.33% absolute improvement in accuracy of the reading-order over 7 PDF files (total 150 pages) using our proposed method based on the LR structure over the baseline method using the LR structure in 20601 revision, which its accuracy of the reading-order is 72%. And for two-column layout format, the preliminary results demonstrate a 44.44% absolute improvement in accuracy of the reading-order over 2 PDF files (total 18 pages) using our proposed method based on the LR structure over the baseline method using the LR structure in 20601 revision, which its accuracy of the reading-order is 0%. Until now, the footer issue and a part of multiple mini separated bounding box issue can be solved by using the MESH method. However, there are still three issues that cannot be solved, such as the table issue, formula issue, and the random multiple mini separated bounding boxes. But the detection of the table position and the recognition of the table structure are out of the scope in this paper, and there is needed another research. In the future, the tasks are chosen- how to detect the table position in the page and to extract the content of the table.

Keywords: document processing, reading order, observation method, layout recognition

Procedia PDF Downloads 169
4417 Scientific Linux Cluster for BIG-DATA Analysis (SLBD): A Case of Fayoum University

Authors: Hassan S. Hussein, Rania A. Abul Seoud, Amr M. Refaat

Abstract:

Scientific researchers face in the analysis of very large data sets that is increasing noticeable rate in today’s and tomorrow’s technologies. Hadoop and Spark are types of software that developed frameworks. Hadoop framework is suitable for many Different hardware platforms. In this research, a scientific Linux cluster for Big Data analysis (SLBD) is presented. SLBD runs open source software with large computational capacity and high performance cluster infrastructure. SLBD composed of one cluster contains identical, commodity-grade computers interconnected via a small LAN. SLBD consists of a fast switch and Gigabit-Ethernet card which connect four (nodes). Cloudera Manager is used to configure and manage an Apache Hadoop stack. Hadoop is a framework allows storing and processing big data across the cluster by using MapReduce algorithm. MapReduce algorithm divides the task into smaller tasks which to be assigned to the network nodes. Algorithm then collects the results and form the final result dataset. SLBD clustering system allows fast and efficient processing of large amount of data resulting from different applications. SLBD also provides high performance, high throughput, high availability, expandability and cluster scalability.

Keywords: big data platforms, cloudera manager, Hadoop, MapReduce

Procedia PDF Downloads 352
4416 A Method for Solving a Bi-Objective Transportation Problem under Fuzzy Environment

Authors: Sukhveer Singh, Sandeep Singh

Abstract:

A bi-objective fuzzy transportation problem with the objectives to minimize the total fuzzy cost and fuzzy time of transportation without according priorities to them is considered. To the best of our knowledge, there is no method in the literature to find efficient solutions of the bi-objective transportation problem under uncertainty. In this paper, a bi-objective transportation problem in an uncertain environment has been formulated. An algorithm has been proposed to find efficient solutions of the bi-objective transportation problem under uncertainty. The proposed algorithm avoids the degeneracy and gives the optimal solution faster than other existing algorithms for the given uncertain transportation problem.

Keywords: uncertain transportation problem, efficient solution, ranking function, fuzzy transportation problem

Procedia PDF Downloads 511
4415 Alternating Expectation-Maximization Algorithm for a Bilinear Model in Isoform Quantification from RNA-Seq Data

Authors: Wenjiang Deng, Tian Mou, Yudi Pawitan, Trung Nghia Vu

Abstract:

Estimation of isoform-level gene expression from RNA-seq data depends on simplifying assumptions, such as uniform reads distribution, that are easily violated in real data. Such violations typically lead to biased estimates. Most existing methods provide a bias correction step(s), which is based on biological considerations, such as GC content–and applied in single samples separately. The main problem is that not all biases are known. For example, new technologies such as single-cell RNA-seq (scRNA-seq) may introduce new sources of bias not seen in bulk-cell data. This study introduces a method called XAEM based on a more flexible and robust statistical model. Existing methods are essentially based on a linear model Xβ, where the design matrix X is known and derived based on the simplifying assumptions. In contrast, XAEM considers Xβ as a bilinear model with both X and β unknown. Joint estimation of X and β is made possible by simultaneous analysis of multi-sample RNA-seq data. Compared to existing methods, XAEM automatically performs empirical correction of potentially unknown biases. XAEM implements an alternating expectation-maximization (AEM) algorithm, alternating between estimation of X and β. For speed XAEM utilizes quasi-mapping for read alignment, thus leading to a fast algorithm. Overall XAEM performs favorably compared to other recent advanced methods. For simulated datasets, XAEM obtains higher accuracy for multiple-isoform genes, particularly for paralogs. In a differential-expression analysis of a real scRNA-seq dataset, XAEM achieves substantially greater rediscovery rates in an independent validation set.

Keywords: alternating EM algorithm, bias correction, bilinear model, gene expression, RNA-seq

Procedia PDF Downloads 134
4414 MapReduce Algorithm for Geometric and Topological Information Extraction from 3D CAD Models

Authors: Ahmed Fradi

Abstract:

In a digital world in perpetual evolution and acceleration, data more and more voluminous, rich and varied, the new software solutions emerged with the Big Data phenomenon offer new opportunities to the company enabling it not only to optimize its business and to evolve its production model, but also to reorganize itself to increase competitiveness and to identify new strategic axes. Design and manufacturing industrial companies, like the others, face these challenges, data represent a major asset, provided that they know how to capture, refine, combine and analyze them. The objective of our paper is to propose a solution allowing geometric and topological information extraction from 3D CAD model (precisely STEP files) databases, with specific algorithm based on the programming paradigm MapReduce. Our proposal is the first step of our future approach to 3D CAD object retrieval.

Keywords: Big Data, MapReduce, 3D object retrieval, CAD, STEP format

Procedia PDF Downloads 535
4413 Development of Evolutionary Algorithm by Combining Optimization and Imitation Approach for Machine Learning in Gaming

Authors: Rohit Mittal, Bright Keswani, Amit Mithal

Abstract:

This paper provides a sense about the application of computational intelligence techniques used to develop computer games, especially car racing. For the deep sense and knowledge of artificial intelligence, this paper is divided into various sections that is optimization, imitation, innovation and combining approach of optimization and imitation. This paper is mainly concerned with combining approach which tells different aspects of using fitness measures and supervised learning techniques used to imitate aspects of behavior. The main achievement of this paper is based on modelling player behaviour and evolving new game content such as racing tracks as single car racing on single track.

Keywords: evolution algorithm, genetic, optimization, imitation, racing, innovation, gaming

Procedia PDF Downloads 632
4412 Nelder-Mead Parametric Optimization of Elastic Metamaterials with Artificial Neural Network Surrogate Model

Authors: Jiaqi Dong, Qing-Hua Qin, Yi Xiao

Abstract:

Some of the most fundamental challenges of elastic metamaterials (EMMs) optimization can be attributed to the high consumption of computational power resulted from finite element analysis (FEA) simulations that render the optimization process inefficient. Furthermore, due to the inherent mesh dependence of FEA, minuscule geometry features, which often emerge during the later stages of optimization, induce very fine elements, resulting in enormously high time consumption, particularly when repetitive solutions are needed for computing the objective function. In this study, a surrogate modelling algorithm is developed to reduce computational time in structural optimization of EMMs. The surrogate model is constructed based on a multilayer feedforward artificial neural network (ANN) architecture, trained with prepopulated eigenfrequency data prepopulated from FEA simulation and optimized through regime selection with genetic algorithm (GA) to improve its accuracy in predicting the location and width of the primary elastic band gap. With the optimized ANN surrogate at the core, a Nelder-Mead (NM) algorithm is established and its performance inspected in comparison to the FEA solution. The ANNNM model shows remarkable accuracy in predicting the band gap width and a reduction of time consumption by 47%.

Keywords: artificial neural network, machine learning, mechanical metamaterials, Nelder-Mead optimization

Procedia PDF Downloads 120
4411 A Laundry Algorithm for Colored Textiles

Authors: H. E. Budak, B. Arslan-Ilkiz, N. Cakmakci, I. Gocek, U. K. Sahin, H. Acikgoz-Tufan, M. H. Arslan

Abstract:

The aim of this study is to design a novel laundry algorithm for colored textiles which have significant decoloring problem. During the experimental work, bleached knitted single jersey fabric made of 100% cotton and dyed with reactive dyestuff was utilized, since according to a conducted survey textiles made of cotton are the most demanded textile products in the textile market by the textile consumers and for coloration of textiles reactive dyestuffs are the ones that are the most commonly used in the textile industry for dyeing cotton-made products. Therefore, the fabric used in this study was selected and purchased in accordance with the survey results. The fabric samples cut out of this fabric were dyed with different dyeing parameters by using Remazol Brilliant Red 3BS dyestuff in Gyrowash machine at laboratory conditions. From the alternative reactive-dyed cotton fabric samples, the ones that have high tendency to color loss were determined and examined. Accordingly, the parameters of the dyeing process used for these fabric samples were evaluated and the dyeing process which was chosen to be used for causing high tendency to color loss for the cotton fabrics was determined in order to reveal the level of improvement in color loss during this study clearly. Afterwards, all of the untreated fabric samples cut out of the fabric purchased were dyed with the dyeing process selected. When dyeing process was completed, an experimental design was created for the laundering process by using Minitab® program considering temperature, time and mechanical action as parameters. All of the washing experiments were performed in domestic washing machine. 16 washing experiments were performed with 8 different experimental conditions and 2 repeats for each condition. After each of the washing experiments, water samples of the main wash of the laundering process were measured with UV spectrophotometer. The values obtained were compared with the calibration curve of the materials used for the dyeing process. The results of the washing experiments were statistically analyzed with Minitab® program. According to the results, the most suitable washing algorithm to be used in terms of the parameters temperature, time and mechanical action for domestic washing machines for minimizing fabric color loss was chosen. The laundry algorithm proposed in this study have the ability of minimalizing the problem of color loss of colored textiles in washing machines by eliminating the negative effects of the parameters of laundering process on color of textiles without compromising the fundamental effects of basic cleaning action being performed properly. Therefore, since fabric color loss is minimized with this washing algorithm, dyestuff residuals will definitely be lower in the grey water released from the laundering process. In addition to this, with this laundry algorithm it is possible to wash and clean other types of textile products with proper cleaning effect and minimized color loss.

Keywords: color loss, laundry algorithm, textiles, domestic washing process

Procedia PDF Downloads 337
4410 Monocular Visual Odometry for Three Different View Angles by Intel Realsense T265 with the Measurement of Remote

Authors: Heru Syah Putra, Aji Tri Pamungkas Nurcahyo, Chuang-Jan Chang

Abstract:

MOIL-SDK method refers to the spatial angle that forms a view with a different perspective from the Fisheye image. Visual Odometry forms a trusted application for extending projects by tracking using image sequences. A real-time, precise, and persistent approach that is able to contribute to the work when taking datasets and generate ground truth as a reference for the estimates of each image using the FAST Algorithm method in finding Keypoints that are evaluated during the tracking process with the 5-point Algorithm with RANSAC, as well as produce accurate estimates the camera trajectory for each rotational, translational movement on the X, Y, and Z axes.

Keywords: MOIL-SDK, intel realsense T265, Fisheye image, monocular visual odometry

Procedia PDF Downloads 123
4409 A Multifactorial Algorithm to Automate Screening of Drug-Induced Liver Injury Cases in Clinical and Post-Marketing Settings

Authors: Osman Turkoglu, Alvin Estilo, Ritu Gupta, Liliam Pineda-Salgado, Rajesh Pandey

Abstract:

Background: Hepatotoxicity can be linked to a variety of clinical symptoms and histopathological signs, posing a great challenge in the surveillance of suspected drug-induced liver injury (DILI) cases in the safety database. Additionally, the majority of such cases are rare, idiosyncratic, highly unpredictable, and tend to demonstrate unique individual susceptibility; these qualities, in turn, lend to a pharmacovigilance monitoring process that is often tedious and time-consuming. Objective: Develop a multifactorial algorithm to assist pharmacovigilance physicians in identifying high-risk hepatotoxicity cases associated with DILI from the sponsor’s safety database (Argus). Methods: Multifactorial selection criteria were established using Structured Query Language (SQL) and the TIBCO Spotfire® visualization tool, via a combination of word fragments, wildcard strings, and mathematical constructs, based on Hy’s law criteria and pattern of injury (R-value). These criteria excluded non-eligible cases from monthly line listings mined from the Argus safety database. The capabilities and limitations of these criteria were verified by comparing a manual review of all monthly cases with system-generated monthly listings over six months. Results: On an average, over a period of six months, the algorithm accurately identified 92% of DILI cases meeting established criteria. The automated process easily compared liver enzyme elevations with baseline values, reducing the screening time to under 15 minutes as opposed to multiple hours exhausted using a cognitively laborious, manual process. Limitations of the algorithm include its inability to identify cases associated with non-standard laboratory tests, naming conventions, and/or incomplete/incorrectly entered laboratory values. Conclusions: The newly developed multifactorial algorithm proved to be extremely useful in detecting potential DILI cases, while heightening the vigilance of the drug safety department. Additionally, the application of this algorithm may be useful in identifying a potential signal for DILI in drugs not yet known to cause liver injury (e.g., drugs in the initial phases of development). This algorithm also carries the potential for universal application, due to its product-agnostic data and keyword mining features. Plans for the tool include improving it into a fully automated application, thereby completely eliminating a manual screening process.

Keywords: automation, drug-induced liver injury, pharmacovigilance, post-marketing

Procedia PDF Downloads 142
4408 Compressed Sensing of Fetal Electrocardiogram Signals Based on Joint Block Multi-Orthogonal Least Squares Algorithm

Authors: Xiang Jianhong, Wang Cong, Wang Linyu

Abstract:

With the rise of medical IoT technologies, Wireless body area networks (WBANs) can collect fetal electrocardiogram (FECG) signals to support telemedicine analysis. The compressed sensing (CS)-based WBANs system can avoid the sampling of a large amount of redundant information and reduce the complexity and computing time of data processing, but the existing algorithms have poor signal compression and reconstruction performance. In this paper, a Joint block multi-orthogonal least squares (JBMOLS) algorithm is proposed. We apply the FECG signal to the Joint block sparse model (JBSM), and a comparative study of sparse transformation and measurement matrices is carried out. A FECG signal compression transmission mode based on Rbio5.5 wavelet, Bernoulli measurement matrix, and JBMOLS algorithm is proposed to improve the compression and reconstruction performance of FECG signal by CS-based WBANs. Experimental results show that the compression ratio (CR) required for accurate reconstruction of this transmission mode is increased by nearly 10%, and the runtime is saved by about 30%.

Keywords: telemedicine, fetal ECG, compressed sensing, joint sparse reconstruction, block sparse signal

Procedia PDF Downloads 115
4407 Design and Implementation of DC-DC Converter with Inc-Cond Algorithm

Authors: Mustafa Engin Başoğlu, Bekir Çakır

Abstract:

The most important component affecting the efficiency of photovoltaic power systems are solar panels. Efficiency of these systems are significantly affected because of being low efficiency of solar panel. Therefore, solar panels should be operated under maximum power point conditions through a power converter. In this study, design boost converter with maximum power point tracking (MPPT) operation has been designed and performed with Incremental Conductance (Inc-Cond) algorithm by using direct duty control. Furthermore, it is shown that performance of boost converter with MPPT operation fails under low load resistance connection.

Keywords: boost converter, incremental conductance (Inc-Cond), MPPT, solar panel

Procedia PDF Downloads 1035
4406 Thermodynamics of Random Copolymers in Solution

Authors: Maria Bercea, Bernhard A. Wolf

Abstract:

The thermodynamic behavior for solutions of poly (methyl methacrylate-ran-t-butyl methacrylate) of variable composition as compared with the corresponding homopolymers was investigated by light scattering measurements carried out for dilute solutions and vapor pressure measurements of concentrated solutions. The complex dependencies of the Flory Huggins interaction parameter on concentration and copolymer composition in solvents of different polarity (toluene and chloroform) can be understood by taking into account the ability of the polymers to rearrange in a response to changes in their molecular surrounding. A recent unified thermodynamic approach was used for modeling the experimental data, being able to describe the behavior of the different solutions by means of two adjustable parameters, one representing the effective number of solvent segments and another one accounting for the interactions between the components. Thus, it was investigated how the solvent quality changes with the composition of the copolymers through the Gibbs energy of mixing as a function of polymer concentration. The largest reduction of the Gibbs energy at a given composition of the system was observed for the best solvent. The present investigation proves that the new unified thermodynamic approach is a general concept applicable to homo- and copolymers, independent of the chain conformation or shape, molecular and chemical architecture of the components and of other dissimilarities, such as electrical charges.

Keywords: random copolymers, Flory Huggins interaction parameter, Gibbs energy of mixing, chemical architecture

Procedia PDF Downloads 273
4405 Strategies for Conserving Ecosystem Functions of the Aravalli Range to Combat Land Degradation: Case of Kishangarh and Tijara Tehsil in Rajasthan, India

Authors: Saloni Khandelwal

Abstract:

The Aravalli hills are one of the oldest and most distinctive mountain chains of peninsular India spanning in around 692 Km. More than 60% of it falls in the state of Rajasthan and influences ecological equilibrium in about 30% of the state. Because of natural and human-induced activities, physical gaps in the Aravallis are increasing, new gaps are coming up, and its physical structure is changing. There are no strict regulations to protect and monitor the Aravallis and no comprehensive research and study has been done for the enhancement of ecosystem functions of these ranges. Through this study, various factors leading to Aravalli’s degradation are identified and its impacts on selected areas are analyzed. A literature study is done to identify factors responsible for the degradation. To understand the severity of the problem at the lowest level, two tehsils from different districts in Rajasthan, which are the most affected due to illegal mining and increasing physical gaps are selected for the study. Case-1 of three-gram panchayats in Kishangarh Tehsil of Ajmer district focuses on the expanding physical gaps in the Aravalli range, and case-2 of three-gram panchayats in Tijara Tehsil of Alwar district focuses on increasing illegal mining in the Aravalli range. For measuring the degradation, physical, biological and social indicators are identified through literature review and for both the cases analysis is done on the basis of these indicators. Primary survey and focus group discussions are done with villagers, mining owners, illegal miners, and various government officials to understand dependency of people on the Aravalli and its importance to them along with the impact of degradation on their livelihood and environment. From the analysis, it has been found that green cover is continuously decreasing in both cases, dense forest areas do not exist now, the groundwater table is depleting at a very fast rate, soil is losing its moisture resulting in low yield and shift in agriculture. Wild animals which were easily seen earlier are now extinct. Cattles of villagers are dependent on the forest area in the Aravalli range for food, but with a decrease in fodder, their cattle numbers are decreasing. There is a decrease in agricultural land and an increase in scrub and salt-affected land. Analysis of various national and state programmes, acts which were passed to conserve biodiversity has been done showing that none of them is helping much to protect the Aravalli. For conserving the Aravalli and its forest areas, regional level and local level initiatives are required and are proposed in this study. This study is an attempt to formulate conservation and management strategies for the Aravalli range. These strategies will help in improving biodiversity which can lead to the revival of its ecosystem functions. It will also help in curbing the pollution at the regional and local level. All this will lead to the sustainable development of the region.

Keywords: Aravalli, ecosystem, LULC, Rajasthan

Procedia PDF Downloads 126
4404 Hardware Implementation on Field Programmable Gate Array of Two-Stage Algorithm for Rough Set Reduct Generation

Authors: Tomasz Grzes, Maciej Kopczynski, Jaroslaw Stepaniuk

Abstract:

The rough sets theory developed by Prof. Z. Pawlak is one of the tools that can be used in the intelligent systems for data analysis and processing. Banking, medicine, image recognition and security are among the possible fields of utilization. In all these fields, the amount of the collected data is increasing quickly, but with the increase of the data, the computation speed becomes the critical factor. Data reduction is one of the solutions to this problem. Removing the redundancy in the rough sets can be achieved with the reduct. A lot of algorithms of generating the reduct were developed, but most of them are only software implementations, therefore have many limitations. Microprocessor uses the fixed word length, consumes a lot of time for either fetching as well as processing of the instruction and data; consequently, the software based implementations are relatively slow. Hardware systems don’t have these limitations and can process the data faster than a software. Reduct is the subset of the decision attributes that provides the discernibility of the objects. For the given decision table there can be more than one reduct. Core is the set of all indispensable condition attributes. None of its elements can be removed without affecting the classification power of all condition attributes. Moreover, every reduct consists of all the attributes from the core. In this paper, the hardware implementation of the two-stage greedy algorithm to find the one reduct is presented. The decision table is used as an input. Output of the algorithm is the superreduct which is the reduct with some additional removable attributes. First stage of the algorithm is calculating the core using the discernibility matrix. Second stage is generating the superreduct by enriching the core with the most common attributes, i.e., attributes that are more frequent in the decision table. Described above algorithm has two disadvantages: i) generating the superreduct instead of reduct, ii) additional first stage may be unnecessary if the core is empty. But for the systems focused on the fast computation of the reduct the first disadvantage is not the key problem. The core calculation can be achieved with a combinational logic block, and thus add respectively little time to the whole process. Algorithm presented in this paper was implemented in Field Programmable Gate Array (FPGA) as a digital device consisting of blocks that process the data in a single step. Calculating the core is done by the comparators connected to the block called 'singleton detector', which detects if the input word contains only single 'one'. Calculating the number of occurrences of the attribute is performed in the combinational block made up of the cascade of the adders. The superreduct generation process is iterative and thus needs the sequential circuit for controlling the calculations. For the research purpose, the algorithm was also implemented in C language and run on a PC. The times of execution of the reduct calculation in a hardware and software were considered. Results show increase in the speed of data processing.

Keywords: data reduction, digital systems design, field programmable gate array (FPGA), reduct, rough set

Procedia PDF Downloads 209