Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 3999

Search results for: forecasting accuracy

3699 The Impact of the Cross Race Effect on Eyewitness Identification

Abstract:

Eyewitness identification is arguably one of the most utilized practices within our legal system; however, exoneration cases indicate that this practice may lead to inaccurate identifications and erroneous convictions. The purpose of this study was to examine the influence of the cross-race effect, the phenomenon in which people are able to more easily and accurately identify faces from within their own racial category, on the accuracy of eyewitness identification. Participants watched three separate videos of a perpetrator trying to steal a bicycle. In each video, the perpetrator was of a different race or gender: a Black male, a White male, and a White female. After watching each video, participants were asked to recall everything they could about the perpetrator they witnessed. The initial results did not show the expected impact of the cross-race effect on eyewitness identification accuracy. These surprising results are discussed in terms of cross-race bias and recognition theory as well as applied implications.

Keywords: cross race effect, eyewitness identification, own-race bias, racial profiling

Procedia PDF Downloads 148

3698 DenseNet and Autoencoder Architecture for COVID-19 Chest X-Ray Image Classification and Improved U-Net Lung X-Ray Segmentation

Authors: Jonathan Gong

Abstract:

Purpose: AI-driven solutions are at the forefront of many pathology and medical imaging methods. Using algorithms designed to better the experience of medical professionals within their respective fields, the efficiency and accuracy of diagnosis can improve. In particular, X-rays are a fast and relatively inexpensive test that can diagnose diseases. In recent years, X-rays have not been widely used to detect and diagnose COVID-19. The underuse of X-rays is mainly due to their low diagnostic accuracy and confounding with pneumonia, another respiratory disease. However, research in this field has suggested that artificial neural networks can successfully diagnose COVID-19 with high accuracy. Models and Data: The dataset used is the COVID-19 Radiography Database. This dataset includes images and masks of chest X-rays under the labels of COVID-19, normal, and pneumonia. The classification model developed uses an autoencoder and a pre-trained convolutional neural network (DenseNet201) to provide transfer learning to the model. The model then uses a deep neural network to finalize the feature extraction and predict the diagnosis for the input image. This model was trained on 4035 images and validated on 807 images separate from the ones used for training. The images used to train the classification model include an important feature: the pictures are cropped beforehand to eliminate distractions when training the model. The image segmentation model uses an improved U-Net architecture. This model is used to extract the lung mask from the chest X-ray image. The model is trained on 8577 images and validated on a validation split of 20%. These models are evaluated using the external dataset for validation. The models' accuracy, precision, recall, F1-score, IoU, and loss are calculated. Results: The classification model achieved an accuracy of 97.65% and a loss of 0.1234 when differentiating COVID-19-infected, pneumonia-infected, and normal lung X-rays. The segmentation model achieved an accuracy of 97.31% and an IoU of 0.928. Conclusion: The models proposed can detect COVID-19, pneumonia, and normal lungs with high accuracy and derive the lung mask from a chest X-ray with similarly high accuracy. The hope is for these models to elevate the experience of medical professionals and provide insight into the future of the methods used.
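The intersection-over-union (IoU) figure reported for the segmentation model can be illustrated with a minimal sketch. The function name, the nested-list mask format, and the toy masks are assumptions made for illustration; they are not taken from the paper.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Convention assumed here: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Hypothetical 2x3 masks: 2 pixels overlap, 4 pixels are set in at least one mask.
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [0, 1, 1]]
score = iou(predicted, reference)
```

A value of 1.0 means the predicted lung mask matches the reference exactly, while values near 0 indicate little overlap.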

Keywords: artificial intelligence, convolutional neural networks, deep learning, image processing, machine learning

Procedia PDF Downloads 114

3697 Electricity Load Modeling: An Application to Italian Market

Authors: Giovanni Masala, Stefania Marica

Abstract:

Forecasting electricity load plays a crucial role in decision making and planning for economic purposes. Moreover, in the light of the recent privatization and deregulation of the power industry, forecasting future electricity load has turned out to be a very challenging problem. Empirical data on electricity load highlight a clear seasonal behavior (higher load during the winter season), which is partly due to climatic effects. We also emphasize the presence of load periodicity on a weekly basis (electricity load is usually lower on weekends or holidays) and on a daily basis (electricity load is clearly influenced by the hour). Finally, a long-term trend may depend on the general economic situation (for example, industrial production affects electricity load). All these features must be captured by the model. The purpose of this paper is therefore to build an hourly electricity load model. The deterministic component of the model requires non-linear regression and Fourier series, while we investigate the stochastic component through econometric tools. The model's parameters are calibrated using data from the Italian market over a six-year period (2007–2012). Then, we perform a Monte Carlo simulation in order to compare the simulated data with the real data (both in-sample and out-of-sample inspection). The reliability of the model is assessed with standard tests, which indicate a good fit of the simulated values.
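As a rough illustration of how a Fourier series can capture the daily periodicity described above, the following sketch fits a mean plus a few harmonics to an hourly series. It assumes the series covers a whole number of 24-hour periods, in which case the least-squares coefficients reduce to simple projections. The function names and the synthetic data are hypothetical, and the weekly, seasonal, trend, and stochastic components of the paper's model are not shown.

```python
import math


def fit_harmonics(load, n_harmonics=2, period=24):
    """Fit mean + sum_k (a_k cos + b_k sin) to a uniformly sampled series.

    Assumes len(load) is a whole number of periods, so the least-squares
    solution equals the discrete Fourier projections computed below.
    """
    n = len(load)
    if n == 0 or n % period != 0:
        raise ValueError("series must cover a whole number of periods")
    if n_harmonics >= period / 2:
        raise ValueError("too many harmonics for this period")
    mean = sum(load) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        omega = 2.0 * math.pi * k / period
        a_k = 2.0 / n * sum(y * math.cos(omega * t) for t, y in enumerate(load))
        b_k = 2.0 / n * sum(y * math.sin(omega * t) for t, y in enumerate(load))
        coeffs.append((a_k, b_k))
    return mean, coeffs


def evaluate(t, mean, coeffs, period=24):
    """Evaluate the fitted deterministic daily profile at hour index t."""
    value = mean
    for k, (a_k, b_k) in enumerate(coeffs, start=1):
        omega = 2.0 * math.pi * k / period
        value += a_k * math.cos(omega * t) + b_k * math.sin(omega * t)
    return value


# Synthetic two-day hourly load (made-up numbers, not Italian market data).
hours = range(48)
synthetic_load = [
    10.0 + 3.0 * math.cos(2 * math.pi * t / 24) + 2.0 * math.sin(2 * math.pi * t / 24)
    for t in hours
]
mean_load, harmonics = fit_harmonics(synthetic_load)
```

The residual between the observed load and `evaluate(t, ...)` would then be the input to the stochastic component.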

Keywords: ARMA-GARCH process, electricity load, fitting tests, Fourier series, Monte Carlo simulation, non-linear regression

Procedia PDF Downloads 386

3696 A Study of the Performance Parameter for Recommendation Algorithm Evaluation

Authors: C. Rana, S. K. Jain

Abstract:

The enormous amount of Web data has made its efficient use a challenge in the past few years. A range of techniques has been applied to tackle this problem; prominent among them are personalization and recommender systems. These tools assist users in finding relevant information on the web, and most e-commerce websites apply them in one way or another. In the past decade, a large number of recommendation algorithms have been proposed. However, there has not been much research on the evaluation criteria for these algorithms. As a result, traditional accuracy and classification metrics, which provide only a static view, are still used for evaluation. This paper studies how the evolution of user preferences over time can be mapped in a recommender system using a new evaluation methodology that explicitly uses the time dimension. We also present the different types of experimental setup that are generally used for recommender system evaluation. Furthermore, major accuracy metrics, and metrics that go beyond accuracy as researched in the past few years, are discussed in detail.

Keywords: collaborative filtering, data mining, evolutionary, clustering, algorithm, recommender systems

Procedia PDF Downloads 399

3695 Revealing of the Wave-Like Process in Kinetics of the Structural Steel Radiation Degradation

Authors: E. A. Krasikov

Abstract:

The dependence of material properties on neutron irradiation intensity (flux) is a key problem when data from accelerated irradiation of materials in test reactors are used to forecast their working capacity under realistic (practical) operating conditions. Investigations of the dependence of reactor pressure vessel steel radiation degradation on fast neutron fluence (embrittlement kinetics) at low flux reveal instability in the form of scatter in the experimental data and wave-like sections in the embrittlement kinetics. The observed oscillation of steel degradation is a sign of a cyclic self-recovery transformation of the steel structure, as takes place in self-organization processes. This assumption is supported by similar 'anomalous' data found in scientific publications and by our own additional experiments. The data obtained stimulate the search for ways to manage the radiation stability of structural steel (for example, by nanostructure modification to intensify the annihilation of radiation defects) in order to create an intelligent self-recovering material. Expected results: development of radiation degradation theory and mechanisms; more adequate models of radiation embrittlement; improved surveillance specimen programs; methods and facilities for using accelerated irradiation data to forecast the working capacity of materials under realistic (practical) operating conditions; and ways of creating radiation-stable, self-recovering intelligent materials.

Keywords: degradation, radiation, steel, wave-like kinetics

Procedia PDF Downloads 292

3694 Robust Image Registration Based on an Adaptive Normalized Mutual Information Metric

Authors: Huda Algharib, Amal Algharib, Hanan Algharib, Ali Mohammad Alqudah

Abstract:

Image registration is an important topic for many imaging systems and computer vision applications. Standard image registration techniques, such as mutual information- and normalized mutual information-based methods, have limited performance because they do not consider the spatial information or the relationships between neighbouring pixels or voxels. In addition, the amount of image noise may significantly affect the registration accuracy. Therefore, this paper proposes an efficient method that explicitly considers the relationships between adjacent pixels: the gradient information of the reference and scene images is extracted first, and then the cosine similarity of the extracted gradient information is computed and used to improve the accuracy of the standard normalized mutual information measure. Our experimental results on different data types (i.e., CT, MRI and thermal images) show that the proposed method outperforms a number of image registration techniques in terms of accuracy.
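The two ingredients named in the abstract can be sketched as follows. This is only an illustration under stated assumptions: normalized mutual information is taken in the common form (H(A) + H(B)) / H(A, B), images are flattened lists of integer intensities, and the way the paper combines the cosine similarity of gradients with the NMI measure is not specified in the abstract and is therefore not shown. All function names are hypothetical.

```python
import math
from collections import Counter


def _entropy(counts, total):
    """Shannon entropy (natural log) of a histogram given as raw counts."""
    return -sum((c / total) * math.log(c / total) for c in counts)


def normalized_mutual_information(a, b):
    """NMI = (H(A) + H(B)) / H(A, B) for two equally sized intensity lists."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")
    n = len(a)
    h_a = _entropy(Counter(a).values(), n)
    h_b = _entropy(Counter(b).values(), n)
    h_ab = _entropy(Counter(zip(a, b)).values(), n)
    if h_ab == 0.0:
        raise ValueError("both images are constant; NMI is undefined")
    return (h_a + h_b) / h_ab


def cosine_similarity(u, v):
    """Cosine of the angle between two gradient vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Toy flattened "images" (hypothetical values).
reference = [0, 0, 1, 1]
aligned = [0, 0, 1, 1]
unrelated = [0, 1, 0, 1]
nmi_aligned = normalized_mutual_information(reference, aligned)
nmi_unrelated = normalized_mutual_information(reference, unrelated)
```

With this definition, NMI ranges from 1 for statistically independent images to 2 for identical ones, so a registration search would try to maximize it.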

Keywords: image registration, mutual information, image gradients, image transformations

Procedia PDF Downloads 236

3693 Automated Driving Deep Neural Networks Model Accuracy and Performance Assessment in a Simulated Environment

Authors: David Tena-Gago, Jose M. Alcaraz Calero, Qi Wang

Abstract:

The evolution and integration of automated vehicles have become more and more tangible in recent years. State-of-the-art technological advances in the field of camera-based Artificial Intelligence (AI) and computer vision greatly favor the performance and reliability of the Advanced Driver Assistance System (ADAS), leading to a greater knowledge of vehicular operation and resembling human behavior. However, the exclusive use of this technology still seems insufficient to control vehicular operation at 100%. To reveal the degree of accuracy of the current camera-based automated driving AI modules, this paper studies the structure and behavior of one of the main solutions in a controlled testing environment. The results obtained clearly outline the lack of reliability when using exclusively the AI model in the perception stage, thereby entailing using additional complementary sensors to improve its safety and performance.

Keywords: accuracy assessment, AI-driven mobility, artificial intelligence, automated vehicles

Procedia PDF Downloads 96

3692 Performance and Emission Prediction in a Biodiesel Engine Fuelled with Honge Methyl Ester Using RBF Neural Networks

Authors: Shiva Kumar, G. S. Vijay, Srinivas Pai P., Shrinivasa Rao B. R.

Abstract:

In the present study, RBF neural networks were used for predicting the performance and emission parameters of a biodiesel engine. Engine experiments were carried out in a four-stroke diesel engine using blends of diesel and Honge methyl ester as the fuel. Performance parameters like BTE, BSEC, Tech and emissions from the engine were measured. These experimental results were used for ANN modeling. RBF center initialization was done by random selection and by using clustering techniques. The network was trained using fixed and varying widths for the RBF units. It was observed that the RBF results were in good agreement with the experimental results. Networks trained using the clustering technique gave better results than those using random selection of centers, in terms of reduced MRE and increased prediction accuracy. The average MRE for the performance parameters was 3.25% with a prediction accuracy of 98%, and for emissions it was 10.4% with a prediction accuracy of 80%.
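The mean relative error (MRE) used above to compare network configurations can be computed as in the sketch below. The formula assumed here is the usual mean of |predicted - observed| / |observed| expressed in percent. The abstract does not state how its "prediction accuracy" figure is derived, so that quantity is not reproduced. The sample values are invented.

```python
def mean_relative_error(observed, predicted):
    """Mean relative error in percent; observed values must be non-zero."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical brake thermal efficiency values (%), not taken from the paper.
observed_bte = [20.0, 25.0, 40.0]
predicted_bte = [21.0, 24.0, 40.0]
mre_percent = mean_relative_error(observed_bte, predicted_bte)
```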

Keywords: radial basis function networks, emissions, performance parameters, fuzzy c means

Procedia PDF Downloads 543

3691 Downscaling Seasonal Sea Surface Temperature Forecasts over the Mediterranean Sea Using Deep Learning

Authors: Redouane Larbi Boufeniza, Jing-Jia Luo

Abstract:

This study assesses the suitability of deep learning (DL) for downscaling sea surface temperature (SST) over the Mediterranean Sea in the context of seasonal forecasting. We design a set of experiments that compare different DL configurations and deploy the best-performing architecture to downscale one-month lead forecasts of June–September (JJAS) SST from the Nanjing University of Information Science and Technology Climate Forecast System version 1.0 (NUIST-CFS1.0) for the period of 1982–2020. We have also introduced predictors over a larger area to include information about the main large-scale circulations that drive SST over the Mediterranean Sea region, which improves the downscaling results. Finally, we validate the raw model and downscaled forecasts in terms of both deterministic and probabilistic verification metrics, as well as their ability to reproduce the observed precipitation extreme and spell indicator indices. The results showed that the convolutional neural network (CNN)-based downscaling consistently improves the raw model forecasts, with lower bias and more accurate representations of the observed mean and extreme SST spatial patterns. Besides, the CNN-based downscaling yields a much more accurate forecast of extreme SST and spell indicators and reduces the significant relevant biases exhibited by the raw model predictions. Moreover, our results show that the CNN-based downscaling yields better skill scores than the raw model forecasts over most portions of the Mediterranean Sea. The results demonstrate the potential usefulness of CNN in downscaling seasonal SST predictions over the Mediterranean Sea, particularly in providing improved forecast products.

Keywords: Mediterranean Sea, sea surface temperature, seasonal forecasting, downscaling, deep learning

Procedia PDF Downloads 63

3690 Predicting Recessions with Bivariate Dynamic Probit Model: The Czech and German Case

Authors: Lukas Reznak, Maria Reznakova

Abstract:

Recession of an economy has a profound negative effect on all involved stakeholders. It follows that timely prediction of recessions has been of utmost interest both in theoretical research and in practical macroeconomic modelling. The current mainstream of recession prediction is based on standard OLS models of continuous GDP using macroeconomic data. This approach is not suitable for two reasons: the standard continuous models are proving to be obsolete, and the macroeconomic data are unreliable, often revised many years retroactively. The aim of the paper is to explore a different branch of recession forecasting research and verify the findings on real data from the Czech Republic and Germany. In the paper, the authors present a family of discrete choice probit models with parameters estimated by the method of maximum likelihood. In the basic form, the probits model a univariate series of recessions and expansions in the economic cycle for a given country. The majority of the paper deals with more complex model structures, namely dynamic and bivariate extensions. The dynamic structure models the autoregressive nature of recessions, taking into consideration previous economic activity to predict the development in subsequent periods. Bivariate extensions utilize information from a foreign economy by incorporating correlation of error terms and thus modelling the dependencies of the two countries. Bivariate models predict a bivariate time series of economic states in both economies and thus enhance the predictive performance. Reliable and readily available data are a vital enabler of timely and successful recession forecasting. Leading indicators, namely the yield curve and the stock market indices, represent an ideal data base, as this information is available in advance and does not undergo any retroactive revisions. Equally important, the combination of yield curve and stock market indices reflects a range of macroeconomic and financial market investors' trends which influence the economic cycle. These theoretical approaches are applied to real data from the Czech Republic and Germany. Two models for each country were identified, one for in-sample and one for out-of-sample predictive purposes. All four followed a bivariate structure, while three contained a dynamic component.
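A univariate dynamic probit of the kind described can be sketched as follows. The specification assumed here, P(recession at t) = Φ(α + β · indicator at t-1 + γ · state at t-1), is one common textbook form and is not necessarily the authors' exact model. The coefficient values are invented, and the bivariate extension with correlated error terms is not shown.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def recession_probability(lagged_indicator, lagged_state, alpha, beta, gamma):
    """Dynamic probit: probability of recession given a lagged leading
    indicator (e.g. a yield spread) and the previous state (1 = recession)."""
    index = alpha + beta * lagged_indicator + gamma * lagged_state
    return normal_cdf(index)


# Hypothetical coefficients: a lower spread and a previous recession
# both raise the predicted recession probability.
alpha, beta, gamma = -1.0, -0.8, 1.5
p_after_expansion = recession_probability(0.5, 0, alpha, beta, gamma)
p_after_recession = recession_probability(0.5, 1, alpha, beta, gamma)
```

In practice the coefficients would be estimated by maximum likelihood, as the abstract states; the sketch only shows how a fitted model turns inputs into a probability.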

Keywords: bivariate probit, leading indicators, recession forecasting, Czech Republic, Germany

Procedia PDF Downloads 236

3689 Modelling Flood Events in Botswana (Palapye) for Protecting Roads Structure against Floods

Authors: Thabo M. Bafitlhile, Adewole Oladele

Abstract:

Botswana has long been affected by floods and is still experiencing them. Flooding occurs mostly in the North-West, North-East, and parts of the Central district due to heavy rainfall experienced in these areas. The torrential rains have destroyed homes and roads, flooded dams and fields, and destroyed livestock and livelihoods. Palapye is one area in the Central district that has been experiencing floods ever since 1995, when its greatest flood on record occurred. Heavy storms result in floods and inundation; this has been exacerbated by poor or absent drainage structures. Since floods are a part of nature, they have existed and will continue to exist, causing further destruction. Furthermore, flooding of highways plays a major role in the erosion and destruction of road structures. Already today, many culverts, trenches, and other drainage facilities lack the capacity to deal with the current frequency of extreme flows. Future changes in the pattern of hydroclimatic events will have implications for the design and maintenance costs of roads. An increase in rainfall and severe weather events can affect the demand for emergency responses. Therefore, flood forecasting and warning is a prerequisite for successful mitigation of flood damage. In flood-prone areas like Palapye, preventive measures should be taken to reduce possible adverse effects of floods on the environment, including road structures. This paper therefore attempts to estimate the return periods associated with large storms of different magnitudes from recorded historical rainfall depths using a statistical method. The method of annual maxima was used to select data sets for the rainfall analysis. In the statistical method, the Type 1 extreme value (Gumbel), Log Normal, and Log Pearson 3 distributions were all applied to the annual maximum series for the Palapye area to produce IDF curves. The Kolmogorov-Smirnov and chi-squared tests were used to confirm the appropriateness of the fitted distributions for the location; the data do fit the distributions used to predict expected frequencies. This will be a beneficial tool for urgent flood forecasting and water resource administration, as proper drainage design will be based on the estimated flood events and will help to reclaim and protect the road structures from the adverse impacts of floods.
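The step from an annual maximum series to a design rainfall depth can be sketched for the Gumbel (Type 1 extreme value) case. The sketch assumes method-of-moments parameter estimates, which may differ from the fitting procedure used in the paper, and the rainfall depths are invented for illustration rather than Palapye records.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates (location, scale) of a Gumbel distribution."""
    scale = statistics.stdev(annual_maxima) * math.sqrt(6.0) / math.pi
    location = statistics.mean(annual_maxima) - EULER_GAMMA * scale
    return location, scale


def gumbel_cdf(x, location, scale):
    """Probability that the annual maximum does not exceed x."""
    return math.exp(-math.exp(-(x - location) / scale))


def return_level(return_period_years, location, scale):
    """Depth exceeded on average once every `return_period_years` years."""
    non_exceedance = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(non_exceedance))


# Hypothetical annual maximum daily rainfall depths in mm.
annual_maxima_mm = [62.0, 85.5, 47.3, 110.2, 73.8, 91.0, 58.4, 66.9]
location, scale = fit_gumbel_moments(annual_maxima_mm)
depth_10yr = return_level(10, location, scale)
depth_100yr = return_level(100, location, scale)
```

Repeating this for several storm durations and return periods would give the points from which IDF curves are drawn.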

Keywords: drainage, estimate, evaluation, floods, flood forecasting

Procedia PDF Downloads 355

3688 In-door Localization Algorithm and Appropriate Implementation Using Wireless Sensor Networks

Authors: Adeniran K. Ademuwagun, Alastair Allen

Abstract:

The dependence between RSS and distance in an enclosed environment is an important consideration because it is a factor that can influence the reliability of any localization algorithm founded on RSS. Several algorithms effectively reduce the variance of RSS to improve localization or accuracy performance. Our proposed algorithm essentially avoids this pitfall and is consequently highly adaptable in the face of erratic radio signals. Using 3 anchors in close proximity to each other, we are able to establish that RSS can be used as a reliable indicator for localization with an acceptable degree of accuracy. Inherent in this concept is the ability of each prospective anchor to validate (guarantee) the position or the proximity of the other 2 anchors involved in the localization, and vice versa. This procedure ensures that the uncertainties of radio signals due to multipath effects in enclosed environments are minimized. A major driver of this idea is the implicit topological relationship among sensors due to raw radio signal strength. The algorithm is an area-based algorithm; however, it does not trade accuracy for precision (i.e., the size of the returned area).

Keywords: anchor nodes, centroid algorithm, communication graph, radio signal strength

Procedia PDF Downloads 492

3687 An Accurate Computer-Aided Diagnosis: CAD System for Diagnosis of Aortic Enlargement by Using Convolutional Neural Networks

Authors: Mahdi Bazarganigilani

Abstract:

Aortic enlargement, also known as an aortic aneurysm, can occur when the walls of the aorta become weak. This disease can become deadly if overlooked and undiagnosed. In this paper, a computer-aided diagnosis (CAD) system was introduced to accurately diagnose aortic enlargement from chest x-ray images. An enhanced convolutional neural network (CNN) was employed and then trained by transfer learning by using three different main areas from the original images. The areas included the left lung, heart, and right lung. The accuracy of the system was then evaluated on 1001 samples by using 4-fold cross-validation. A promising accuracy of 90% was achieved in terms of the F-measure indicator. The results showed using different areas from the original image in the training phase of CNN could increase the accuracy of predictions. This encouraged the author to evaluate this method on a larger dataset and even on different CAD systems for further enhancement of this methodology.

Keywords: computer-aided diagnosis systems, aortic enlargement, chest X-ray, image processing, convolutional neural networks

Procedia PDF Downloads 141

3686 The Effect of Explicit Focus on Form on Second Language Learning Writing Performance

Authors: Keivan Seyyedi, Leila Esmaeilpour, Seyed Jamal Sadeghi

Abstract:

Investigating the effectiveness of explicit focus on form on the written performance of the EFL learners was the aim of this study. To provide empirical support for this study, sixty male English learners were selected and randomly assigned into two groups of explicit focus on form and meaning focused. Narrative writing was employed for data collection. To measure writing performance, participants were required to narrate a story. They were given 20 minutes to finish the task and were asked to write at least 150 words. The participants’ output was coded then analyzed utilizing Independent t-test for grammatical accuracy and fluency of learners’ performance. Results indicated that learners in explicit focus on form group appear to benefit from error correction and rule explanation as two pedagogical techniques of explicit focus on form with respect to accuracy, but regarding fluency they did not yield any significant differences compared to the participants of meaning-focused group.

Keywords: explicit focus on form, rule explanation, accuracy, fluency

Procedia PDF Downloads 494

3685 Effects of Topic Familiarity on Linguistic Aspects in EFL Learners’ Writing Performance

Authors: Jeong-Won Lee, Kyeong-Ok Yoon

Abstract:

The current study aimed to investigate the effects of topic familiarity and language proficiency on linguistic aspects (lexical complexity, syntactic complexity, accuracy, and fluency) in EFL learners’ argumentative essays. For the study, 64 college students were asked to write argumentative essays on two different topics (Driving and Smoking) chosen with topic familiarity in mind. The students were divided into two language proficiency groups (high-level and intermediate) according to their English writing proficiency. The findings of the study are as follows: 1) the participants exhibited lower levels of lexical and syntactic complexity as well as accuracy when performing writing tasks with unfamiliar topics; and 2) the high-level students demonstrated the use of a wider range of vocabulary and longer and more complex structures, and produced more accurate and lengthier texts than their intermediate peers. Discussion and pedagogical implications for the instruction of writing classes in EFL contexts were addressed.

Keywords: topic familiarity, complexity, accuracy, fluency

Procedia PDF Downloads 38

3684 A Developmental Survey of Local Stereo Matching Algorithms

Authors: André Smith, Amr Abdel-Dayem

Abstract:

This paper presents an overview of the history and development of stereo matching algorithms. Details from its inception, up to relatively recent techniques are described, noting challenges that have been surmounted across these past decades. Different components of these are explored, though focus is directed towards the local matching techniques. While global approaches have existed for some time, and demonstrated greater accuracy than their counterparts, they are generally quite slow. Many strides have been made more recently, allowing local methods to catch up in terms of accuracy, without sacrificing the overall performance.

Keywords: developmental survey, local stereo matching, rectification, stereo correspondence

Procedia PDF Downloads 276

3683 The Combination Of Aortic Dissection Detection Risk Score (ADD-RS) With D-dimer As A Diagnostic Tool To Exclude The Diagnosis Of Acute Aortic Syndrome (AAS)

Authors: Mohamed Hamada Abdelkader Fayed

Abstract:

Background: To evaluate the diagnostic accuracy of the ADD-RS with D-dimer as a screening test to exclude AAS. Methods: We searched for studies examining the diagnostic accuracy of ADD-RS plus D-dimer to exclude the diagnosis of AAS in MEDLINE, Embase, and the Cochrane register of trials up to 31 December 2020. Results: We identified 3 studies using ADD-RS with D-dimer as a diagnostic tool for AAS, involving 3261 patients, in whom AAS was diagnosed in 559 (17.14%). Overall, the pooled sensitivities were 97.6% (95% CI 95.6, 99.6) at ADD-RS ≤ 1 (low-risk group) with D-dimer and 97.4% (95% CI 95.4, 99.4) at ADD-RS > 1 (high-risk group) with D-dimer; the failure rates were 0.48% in the low-risk group and 4.3% in the high-risk group, respectively. Conclusions: ADD-RS with D-dimer was a useful screening test with high sensitivity to exclude acute aortic syndrome.

Keywords: aortic dissection detection risk score, D-dimer, acute aortic syndrome, diagnostic accuracy

Procedia PDF Downloads 204

3682 Evaluation of Spatial Distribution Prediction for Site-Scale Soil Contaminants Based on Partition Interpolation

Authors: Pengwei Qiao, Sucai Yang, Wenxia Wei

Abstract:

Soil pollution has become an important issue in China. Accurate spatial distribution prediction of pollutants with interpolation methods is the basis for soil remediation at a site. However, a relatively strong variability of pollutants would decrease the prediction accuracy. Theoretically, partition interpolation can result in accurate prediction results. In order to verify the applicability of partition interpolation for a site, benzo(b)fluoranthene (BbF) in four soil layers was adopted as the research object in this paper. IDW (inverse distance weighting)-, RBF (radial basis function)-, and OK (ordinary kriging)-based partition interpolation accuracies were evaluated, and their influential factors were analyzed; then, the uncertainty and applicability of partition interpolation were determined. Three conclusions were drawn. (1) The prediction error of partitioned interpolation decreased by 70% compared to unpartitioned interpolation. (2) Partition interpolation reduced the impact of high CV (coefficient of variation) and high concentration values on the prediction accuracy. (3) The prediction accuracy of IDW-based partition interpolation was higher than that of RBF- and OK-based partition interpolation, and it was suitable for the identification of highly polluted areas at a contaminated site. These results provide a useful method to obtain relatively accurate spatial distribution information of pollutants and to identify highly polluted areas, which is important for soil pollution remediation at a site.
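Inverse distance weighting, the interpolator that performed best in this study, can be sketched in a few lines. The power parameter of 2 and the sample tuples are assumptions for illustration. Partition interpolation would, under the reading assumed here, apply such an interpolator separately within each sub-region of the site, using only the samples that fall inside that sub-region.

```python
def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    `samples` is a list of (sx, sy, value) tuples. A query point that
    coincides with a sample returns that sample's value directly.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        squared_distance = (sx - x) ** 2 + (sy - y) ** 2
        if squared_distance == 0.0:
            return value
        weight = 1.0 / squared_distance ** (power / 2.0)
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical BbF concentrations (mg/kg) at two sampling points.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(samples, 1.0, 0.0)
```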

Keywords: accuracy, applicability, partition interpolation, site, soil pollution, uncertainty

Procedia PDF Downloads 132

3681 Oil Producing Wells Using a Technique of Gas Lift on Prosper Software

Authors: Nikhil Yadav, Shubham Verma

Abstract:

Gas lift is a common technique used to optimize oil production in wells. Prosper software is a powerful tool for modeling and optimizing gas lift systems in oil wells. This review paper examines the effectiveness of Prosper software in optimizing gas lift systems in oil-producing wells. The literature review identified several studies that demonstrated the use of Prosper software to adjust injection rate, depth, and valve characteristics to optimize gas lift system performance. The results showed that Prosper software can significantly improve production rates and reduce operating costs in oil-producing wells. However, the accuracy of the model depends on the accuracy of the input data, and the cost of Prosper software can be high. Therefore, further research is needed to improve the accuracy of the model and evaluate the cost-effectiveness of using Prosper software in gas lift system optimization.

Keywords: gas lift, prosper software, injection rate, operating costs, oil-producing wells

Procedia PDF Downloads 64

3680 Rain Gauges Network Optimization in Southern Peninsular Malaysia

Authors: Mohd Khairul Bazli Mohd Aziz, Fadhilah Yusof, Zulkifli Yusop, Zalina Mohd Daud, Mohammad Afif Kasno

Abstract:

Recent developed rainfall network design techniques have been discussed and compared by many researchers worldwide due to the demand of acquiring higher levels of accuracy from collected data. In many studies, rain-gauge networks are designed to provide good estimation for areal rainfall and for flood modelling and prediction. In a certain study, even using lumped models for flood forecasting, a proper gauge network can significantly improve the results. Therefore existing rainfall network in Johor must be optimized and redesigned in order to meet the required level of accuracy preset by rainfall data users. The well-known geostatistics method (variance-reduction method) that is combined with simulated annealing was used as an algorithm of optimization in this study to obtain the optimal number and locations of the rain gauges. Rain gauge network structure is not only dependent on the station density; station location also plays an important role in determining whether information is acquired accurately. The existing network of 84 rain gauges in Johor is optimized and redesigned by using rainfall, humidity, solar radiation, temperature and wind speed data during monsoon season (November – February) for the period of 1975 – 2008. Three different semivariogram models which are Spherical, Gaussian and Exponential were used and their performances were also compared in this study. Cross validation technique was applied to compute the errors and the result showed that exponential model is the best semivariogram. It was found that the proposed method was satisfied by a network of 64 rain gauges with the minimum estimated variance and 20 of the existing ones were removed and relocated. An existing network may consist of redundant stations that may make little or no contribution to the network performance for providing quality data. Therefore, two different cases were considered in this study. 
The first case considered the removed stations, which were optimally relocated to new locations to investigate their influence on the calculated estimated variance, and the second case explored the possibility of relocating all 84 existing stations to new locations to determine the optimal positions. The relocation of the stations in both cases showed that the new optimal locations reduced the estimated variance, demonstrating that location plays an important role in determining the optimal network.
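
The combination of a semivariogram-based objective with simulated annealing can be illustrated with a minimal sketch. This is not the authors' implementation: the objective `proxy_variance` is a simplified stand-in for the kriging estimation variance, the exponential semivariogram parameters (`sill`, `range_`, `nugget`) are assumed values, and the function names and cooling schedule are hypothetical.

```python
import math
import random


def exponential_semivariogram(h, sill=1.0, range_=10.0, nugget=0.0):
    """Exponential model: gamma(h) = nugget + sill * (1 - exp(-h / range_))."""
    return nugget + sill * (1.0 - math.exp(-h / range_))


def proxy_variance(selected, targets):
    """Stand-in objective (assumption): mean semivariogram value between each
    target point and its nearest selected gauge. A real variance-reduction
    method would use the kriging estimation variance instead."""
    total = 0.0
    for t in targets:
        nearest = min(math.dist(t, s) for s in selected)
        total += exponential_semivariogram(nearest)
    return total / len(targets)


def anneal(candidates, targets, k, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Choose k of the candidate sites by simulated annealing.
    Requires k < len(candidates) so that a swap is always possible."""
    rng = random.Random(seed)
    current = rng.sample(range(len(candidates)), k)
    cost = proxy_variance([candidates[i] for i in current], targets)
    best, best_cost = list(current), cost
    temp = t0
    for _ in range(steps):
        # Propose a neighbour: swap one selected site for an unselected one.
        unselected = [i for i in range(len(candidates)) if i not in current]
        proposal = list(current)
        proposal[rng.randrange(k)] = rng.choice(unselected)
        new_cost = proxy_variance([candidates[i] for i in proposal], targets)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if new_cost < cost or rng.random() < math.exp((cost - new_cost) / temp):
            current, cost = proposal, new_cost
            if cost < best_cost:
                best, best_cost = list(current), cost
        temp *= cooling
    return best, best_cost
```

Accepting occasional worse moves while the temperature is high lets the search escape local minima; as the temperature falls, the procedure behaves increasingly like a greedy swap search.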

Keywords: geostatistics, simulated annealing, semivariogram, optimization

Procedia PDF Downloads 288

3679 A Study of Permission-Based Malware Detection Using Machine Learning

Authors: Ratun Rahman, Rafid Islam, Akin Ahmed, Kamrul Hasan, Hasan Mahmud

Abstract:

Malware is becoming more prevalent, and several threat categories have risen dramatically in recent years. This paper provides a bird's-eye view of the world of malware analysis. This study investigates the efficiency of five machine learning methods (Naive Bayes, K-Nearest Neighbor, Decision Tree, Random Forest, and TensorFlow Decision Forest), combined with features extracted from Android permissions, in categorizing applications as harmful or benign. The test set consists of 1,168 samples (602 malware and 566 benign Android applications), each with 948 features (permissions). Using the permission-based dataset, the machine learning algorithms produce accuracy rates above 80%, except the Naive Bayes algorithm, which reaches 65% accuracy. Of the considered algorithms, TensorFlow Decision Forest performed best, with an accuracy of 90%.
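
To make the idea of permission-based classification concrete, the following is a minimal sketch of a Bernoulli Naive Bayes classifier over binary permission vectors. It is an illustration only: the six toy samples and the choice of three permission columns are invented for the example and are not drawn from the study's dataset, and the function names are hypothetical.

```python
import math

# Columns (illustrative): SEND_SMS, READ_CONTACTS, INTERNET. 1 = permission requested.
TOY_FEATURES = [
    [1, 1, 1], [1, 0, 1], [1, 1, 0],   # toy "malware" samples
    [0, 0, 1], [0, 1, 1], [0, 0, 0],   # toy "benign" samples
]
TOY_LABELS = ["malware"] * 3 + ["benign"] * 3


def train_bernoulli_nb(features, labels):
    """Estimate the log prior and Laplace-smoothed P(feature = 1 | class)."""
    model = {}
    n_features = len(features[0])
    for cls in sorted(set(labels)):
        rows = [x for x, y in zip(features, labels) if y == cls]
        log_prior = math.log(len(rows) / len(features))
        probs = [
            (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
            for j in range(n_features)
        ]
        model[cls] = (log_prior, probs)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for binary vector x."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, probs) in model.items():
        score = log_prior
        for xi, p in zip(x, probs):
            score += math.log(p if xi else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With the toy data above, `predict(train_bernoulli_nb(TOY_FEATURES, TOY_LABELS), [1, 1, 1])` returns `"malware"`, because the SMS permission appears only in the toy malware rows.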

Keywords: android malware detection, machine learning, malware, malware analysis

Procedia PDF Downloads 143

3678 Shark Detection and Classification with Deep Learning

Authors: Jeremy Jenrette, Z. Y. C. Liu, Pranav Chimote, Edward Fox, Trevor Hastie, Francesco Ferretti

Abstract:

Suitable shark conservation depends on well-informed population assessments. Direct methods such as scientific surveys and fisheries monitoring are adequate for defining population statuses, but species-specific indices of abundance and distribution coming from these sources are rare for most shark species. We can rapidly fill these information gaps by boosting media-based remote monitoring efforts with machine learning and automation. We created a database of shark images by sourcing 24,546 images covering 219 species of sharks from the web application sharkPulse and the social network Instagram. We used object detection to extract shark features and inflate this database to 53,345 images. We packaged object-detection and image classification models into a Shark Detector bundle. We developed the Shark Detector to recognize and classify sharks from videos and images using transfer learning and convolutional neural networks (CNNs). We applied these models to common data-generation approaches for sharks: boosting training datasets, processing baited remote camera footage and online videos, and data-mining Instagram. We examined the accuracy of each model and tested genus and species prediction correctness as a function of training data quantity. The Shark Detector located sharks in baited remote footage and YouTube videos with an average accuracy of 89%, and classified located subjects to the species level with 69% accuracy (n = 8 species). The Shark Detector sorted heterogeneous datasets of images sourced from Instagram with 91% accuracy and classified species with 70% accuracy (n = 17 species). Data-mining Instagram can inflate training datasets and increase the Shark Detector’s accuracy as well as facilitate archiving of historical and novel shark observations. Base accuracy of genus prediction was 68% across 25 genera. The average base accuracy of species prediction within each genus class was 85%. The Shark Detector can classify 45 species. 
All data-generation methods were processed without manual interaction. As media-based remote monitoring strives to dominate methods for observing sharks in nature, we developed an open-source Shark Detector to facilitate common identification applications. Prediction accuracy of the software pipeline increases as more images are added to the training dataset. We provide public access to the software on our GitHub page.

Keywords: classification, data mining, Instagram, remote monitoring, sharks

Procedia PDF Downloads 100

3677 Random Forest Classification for Population Segmentation

Authors: Regina Chua

Abstract:

To reduce the costs of re-fielding a large survey, a Random Forest classifier was applied to measure the accuracy of classifying individuals into their assigned segments with the fewest possible questions. Given a long survey, one needed to determine the most predictive ten or fewer questions that would accurately assign new individuals to custom segments. Furthermore, the solution needed to be quick in its classification and usable in non-Python environments. In this paper, a supervised Random Forest classifier was modeled on a dataset with 7,000 individuals, 60 questions, and 254 features. The Random Forest consisted of an iterative collection of individual decision trees that result in a predicted segment with robust precision and recall scores compared to a single tree. A random 70-30 stratified sampling for training the algorithm was used, and accuracy trade-offs at different depths for each segment were identified. Ultimately, the Random Forest classifier performed at 87% accuracy at a depth of 10 with 20 instead of 254 features and 10 instead of 60 questions. With an acceptable accuracy in prioritizing feature selection, new tools were developed for non-Python environments: a worksheet with a formulaic version of the algorithm and an embedded function to predict the segment of an individual in real-time. Random Forest was determined to be an optimal classification model by its feature selection, performance, processing speed, and flexible application in other environments.
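
The 70-30 stratified sampling step can be sketched as follows. This is a minimal illustration, assuming that "stratified" means each segment contributes the same train/test proportion; the function name and the rounding rule are hypothetical choices, not details from the paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, test_indices) so that every segment label keeps
    roughly the same train/test proportion as the full dataset."""
    rng = random.Random(seed)
    by_segment = defaultdict(list)
    for index, label in enumerate(labels):
        by_segment[label].append(index)

    train, test = [], []
    for label in sorted(by_segment):
        indices = by_segment[label]
        rng.shuffle(indices)                      # random order within a segment
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)
```

Splitting within each segment, rather than across the whole dataset, keeps small segments represented in both the training and the held-out portion.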

Keywords: machine learning, supervised learning, data science, random forest, classification, prediction, predictive modeling

Procedia PDF Downloads 82

3676 Experiments on Weakly-Supervised Learning on Imperfect Data

Authors: Yan Cheng, Yijun Shao, James Rudolph, Charlene R. Weir, Beth Sahlmann, Qing Zeng-Treitler

Abstract:

Supervised predictive models require labeled data for training purposes. Complete and accurate labeled data, i.e., a ‘gold standard’, is not always available, and imperfectly labeled data may need to serve as an alternative. An important question is if the accuracy of the labeled data creates a performance ceiling for the trained model. In this study, we trained several models to recognize the presence of delirium in clinical documents using data with annotations that are not completely accurate (i.e., weakly-supervised learning). In the external evaluation, the support vector machine model with a linear kernel performed best, achieving an area under the curve of 89.3% and accuracy of 88%, surpassing the 80% accuracy of the training sample. We then generated a set of simulated data and carried out a series of experiments which demonstrated that models trained on imperfect data can (but do not always) outperform the accuracy of the training data, e.g., the area under the curve for some models is higher than 80% when trained on the data with an error rate of 40%. Our experiments also showed that the error resistance of linear modeling is associated with larger sample size, error type, and linearity of the data (all p-values < 0.001). In conclusion, this study sheds light on the usefulness of imperfect data in clinical research via weakly-supervised learning.
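
The claim that a model can outperform the accuracy of its own training labels can be illustrated with a toy simulation. The sketch below is an assumption-laden example, not the authors' experimental setup: it uses one Gaussian feature per class, symmetric random label flips, and a nearest-centroid threshold classifier, and all parameter values are invented.

```python
import random


def noisy_label_experiment(n=2000, noise_rate=0.2, seed=0):
    """Train a 1-D centroid classifier on noisy labels and score it
    against the true labels. Returns (label_accuracy, model_accuracy)."""
    rng = random.Random(seed)
    xs, truth, noisy = [], [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        xs.append(rng.gauss(2.0 if label else -2.0, 1.0))
        truth.append(label)
        # Flip the label with probability noise_rate (symmetric noise).
        noisy.append(1 - label if rng.random() < noise_rate else label)

    # Fit on the NOISY labels: the threshold is the midpoint of the two
    # class centroids (assumes the class-1 centroid lies above class 0).
    ones = [x for x, y in zip(xs, noisy) if y == 1]
    zeros = [x for x, y in zip(xs, noisy) if y == 0]
    threshold = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
    predictions = [1 if x > threshold else 0 for x in xs]

    label_accuracy = sum(a == b for a, b in zip(noisy, truth)) / n
    model_accuracy = sum(p == t for p, t in zip(predictions, truth)) / n
    return label_accuracy, model_accuracy
```

Because the flips are symmetric, they pull both centroids toward each other without moving the midpoint much, so the learned threshold stays close to the true decision boundary. With asymmetric or feature-dependent errors this would not necessarily hold, which is consistent with the abstract's caveat that models "can (but do not always)" exceed the label accuracy.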

Keywords: weakly-supervised learning, support vector machine, prediction, delirium, simulation

Procedia PDF Downloads 179

3675 Enhancing Temporal Extrapolation of Wind Speed Using a Hybrid Technique: A Case Study in West Coast of Denmark

Authors: B. Elshafei, X. Mao

Abstract:

The demand for renewable energy is increasing significantly, and major investments are being made in the wind power generation industry as a leading source of clean energy. The wind energy sector depends entirely on, and is driven by, the prediction of wind speed, which by its nature is highly stochastic. This study employs deep multi-fidelity Gaussian process regression to predict wind speeds over medium-term time horizons. Data from the RUNE experiment on the west coast of Denmark were provided by the Technical University of Denmark and represent the wind speed across the study area between December 2015 and March 2016. The study aims to investigate the effect of pre-processing the data by denoising the signal using the empirical wavelet transform (EWT) and of using the vector components of wind speed to increase the number of input data layers for data fusion with deep multi-fidelity Gaussian process regression (GPR). The outcomes were compared using root mean square error (RMSE). The results demonstrated a significant increase in prediction accuracy: using the vector components of the wind speed as additional predictors yields more accurate predictions than strategies that ignore them, reflecting the importance of including all sub-data and pre-processing signals in wind speed forecasting models.
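
Two of the quantities mentioned above, the vector components of wind speed and the RMSE, are easy to state in code. The sketch below is illustrative only: it assumes the meteorological convention in which direction is the bearing the wind blows from, which the abstract does not specify, and the function names are hypothetical.

```python
import math


def wind_components(speed, direction_deg):
    """Split wind speed and direction into u (eastward) and v (northward)
    components, assuming direction is where the wind blows FROM."""
    theta = math.radians(direction_deg)
    u = -speed * math.sin(theta)
    v = -speed * math.cos(theta)
    return u, v


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

Under this convention, a 10 m/s wind from the north (0 degrees) has u close to 0 and v equal to -10, meaning the air is moving southward.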

Keywords: data fusion, Gaussian process regression, signal denoise, temporal extrapolation

Procedia PDF Downloads 126

3674 The Role of Information Technology in Supply Chain Management

Authors: V. Jagadeesh, K. Venkata Subbaiah, P. Govinda Rao

Abstract:

This paper explains the significance of information technology tools and software packages in supply chain management (SCM) for managing the entire supply chain. With the aid of these tools and packages, material, financial, and information flows can be managed effectively and efficiently in order to deliver the right quantity of goods at the right quality and the right time, using the right methods and technology. Information technology plays a vital role in streamlining sales forecasting, demand planning, inventory control, and transportation in supply networks, and finally in production planning and scheduling. It achieves these objectives by streamlining business processes and integrating them within the enterprise and its extended enterprise. SCM starts with the customer and involves a sequence of activities spanning customer, retailer, distributor, manufacturer, and supplier within the supply chain framework. It is the process of integrating demand planning, supply network planning, and production planning and control. Forecasting indicates the direction for planning raw materials in order to meet production planning requirements. Inventory control and transportation planning allocate the optimal or economic order quantity and use the shortest possible routes to deliver goods to the customer. Production planning and control use the optimal resource mix in order to meet capacity requirements planning. These operations can be achieved by using appropriate information technology tools and software packages for supply chain management.

Keywords: supply chain management, information technology, business process, extended enterprise

Procedia PDF Downloads 366

3673 Preparation of Wireless Networks and Security; Challenges in Efficient Accession of Encrypted Data in Healthcare

Authors: M. Zayoud, S. Oueida, S. Ionescu, P. AbiChar

Abstract:

Background: A wireless sensor network comprises diverse information technology tools and is widely applied in a range of domains, including military surveillance, weather forecasting, and earthquake forecasting. Wireless sensor networks are continually being strengthened, yet security issues usually emerge during professional application. Thus, essential technological tools need to be assessed for the secure aggregation of data. Moreover, such practices have to be incorporated into healthcare practices in a way that serves the mutual interest. Objective: Aggregation of encrypted data has been assessed through a homomorphic stream cipher to assure its effectiveness and to provide optimum solutions for the field of healthcare. Methods: An experimental design was adopted, which used a newly developed cipher along with CPU-constrained devices. Modular additions were also employed to evaluate the nature of the aggregated data. The processes of the homomorphic stream cipher are highlighted through different sensors and modular additions. Results: The homomorphic stream cipher was found to be a simple and secure process that allows efficient aggregation of encrypted data. In addition, the application has led to the improvement of healthcare practices. Statistical values can be easily computed through aggregation on the basis of the selected cipher. The variance, mean, and standard deviation of the sensed data have also been computed through the selected tool. Conclusion: The homomorphic stream cipher can be an ideal tool for the appropriate aggregation of data. It should also provide suitable solutions for the healthcare sector.
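
The way modular addition permits aggregation of encrypted sensor data can be sketched as follows. This is a generic additively homomorphic stream-cipher construction written for illustration, not the specific cipher evaluated in the paper: the use of HMAC-SHA256 as the keystream generator, the modulus, the nonce handling, and all function names are assumptions.

```python
import hashlib
import hmac
import math

MODULUS = 2 ** 32  # must exceed the largest possible aggregate value


def keystream_value(key, nonce):
    """Derive a pseudo-random keystream value from a sensor key and a nonce."""
    digest = hmac.new(key, nonce.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % MODULUS


def encrypt(value, key, nonce):
    """Stream-cipher encryption with modular addition instead of XOR."""
    return (value + keystream_value(key, nonce)) % MODULUS


def aggregate(ciphertexts):
    """An intermediate node adds ciphertexts without decrypting them."""
    return sum(ciphertexts) % MODULUS


def decrypt_sum(aggregated, keys, nonce):
    """The sink removes the sum of all keystream values to recover the total."""
    key_total = sum(keystream_value(k, nonce) for k in keys) % MODULUS
    return (aggregated - key_total) % MODULUS


def encrypted_statistics(readings, keys, nonce):
    """Recover mean, variance, and standard deviation from encrypted sums of
    the readings and of their squares (sent under a separate nonce)."""
    ct_values = [encrypt(r, k, nonce) for r, k in zip(readings, keys)]
    ct_squares = [encrypt(r * r, k, nonce + 1) for r, k in zip(readings, keys)]
    total = decrypt_sum(aggregate(ct_values), keys, nonce)
    total_sq = decrypt_sum(aggregate(ct_squares), keys, nonce + 1)
    n = len(readings)
    mean = total / n
    variance = max(total_sq / n - mean ** 2, 0.0)
    return mean, variance, math.sqrt(variance)
```

For four sensors reporting 36, 37, 38, and 39, the sink recovers a sum of 150 and a sum of squares of 5630, giving a mean of 37.5 and a variance of 1.25, without any intermediate node seeing an individual reading. Each nonce must be used only once per key, as with any stream cipher.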

Keywords: aggregation, cipher, homomorphic stream, encryption

Procedia PDF Downloads 244

3672 The Effect of Information vs. Reasoning Gap Tasks on the Frequency of Conversational Strategies and Accuracy in Speaking among Iranian Intermediate EFL Learners

Authors: Hooriya Sadr Dadras, Shiva Seyed Erfani

Abstract:

Speaking skills merit meticulous attention both on the side of the learners and the teachers. In particular, accuracy is a critical component to guarantee the messages to be conveyed through conversation because a wrongful change may adversely alter the content and purpose of the talk. Different types of tasks have served teachers to meet numerous educational objectives. Besides, negotiation of meaning and the use of different strategies have been areas of concern in socio-cultural theories of SLA. Negotiation of meaning is among the conversational processes which have a crucial role in facilitating the understanding and expression of meaning in a given second language. Conversational strategies are used during interaction when there is a breakdown in communication that leads to the interlocutor attempting to remedy the gap through talk. Therefore, this study was an attempt to investigate if there was any significant difference between the effect of reasoning gap tasks and information gap tasks on the frequency of conversational strategies used in negotiation of meaning in classrooms on one hand, and on the accuracy in speaking of Iranian intermediate EFL learners on the other. After a pilot study to check the practicality of the treatments, at the outset of the main study, the Preliminary English Test was administered to ensure the homogeneity of 87 out of 107 participants who attended the intact classes of a 15 session term in one control and two experimental groups. Also, speaking sections of PET were used as pretest and posttest to examine their speaking accuracy. The tests were recorded and transcribed to estimate the percentage of the number of the clauses with no grammatical errors in the total produced clauses to measure the speaking accuracy. In all groups, the grammatical points of accuracy were instructed and the use of conversational strategies was practiced. 
Then, different kinds of reasoning gap tasks (matchmaking, deciding on the course of action, and working out a time table) and information gap tasks (restoring an incomplete chart, spot the differences, arranging sentences into stories, and guessing game) were used in the experimental groups during treatment sessions, and the students were required to practice conversational strategies when doing speaking tasks. The conversations throughout the term were recorded and transcribed to count the frequency of the conversational strategies used in all groups. The results of statistical analysis demonstrated that applying both the reasoning gap tasks and information gap tasks significantly affected the frequency of conversational strategies through negotiation. Of the two, the reasoning gap tasks had a more significant impact on encouraging the negotiation of meaning and increasing the frequency of conversational strategies in every session. The findings also indicated that both task types could help learners significantly improve their speaking accuracy. Here, applying the reasoning gap tasks was more effective than the information gap tasks in improving the level of learners’ speaking accuracy.

Keywords: accuracy in speaking, conversational strategies, information gap tasks, reasoning gap tasks

Procedia PDF Downloads 296

3671 SNR Classification Using Multiple CNNs

Authors: Thinh Ngo, Paul Rad, Brian Kelley

Abstract:

Noise estimation is essential in today's wireless systems for power control, adaptive modulation, interference suppression, and quality of service. Deep learning (DL) has already been applied in the physical layer for modulation and signal classification. An unacceptably low accuracy of less than 50% is found to undermine the traditional application of DL classification to SNR prediction. In this paper, we use a divide-and-conquer algorithm and a classifier fusion method to simplify SNR classification and thereby enhance DL learning and prediction. Specifically, multiple CNNs are used for classification rather than a single CNN. Each CNN performs a binary classification against a single SNR value with two labels: less than, or greater than or equal to. Together, the CNNs are combined to classify effectively over SNR values in the range −20 ≤ SNR ≤ 32 dB. We use pre-trained CNNs to predict SNR over a wide range of joint channel parameters, including multiple Doppler shifts (0, 60, 120 Hz), power-delay profiles, and signal-modulation types (QPSK, 16QAM, 64QAM). The approach achieves an individual SNR prediction accuracy of 92%, a composite accuracy of 70%, and prediction convergence one order of magnitude faster than that of traditional estimation.
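
The divide-and-conquer idea, where each binary classifier answers "is the SNR at least this threshold?" and the answers are fused into one estimate, can be sketched without any neural network. In the example below the CNNs are replaced by an idealized stand-in (`ideal_votes`), the 4 dB threshold spacing is an assumed value, and the vote-counting fusion rule is one plausible choice rather than the paper's stated method.

```python
def ideal_votes(snr_db, thresholds):
    """Stand-in for the bank of binary CNNs: classifier i votes 1 when it
    judges SNR >= thresholds[i], else 0. Real classifiers would be noisy."""
    return [1 if snr_db >= t else 0 for t in thresholds]


def fuse_votes(thresholds, votes):
    """Fuse binary votes into an SNR interval (lower, upper) in dB.

    thresholds must be sorted ascending. Counting positive votes, rather than
    taking the highest positive threshold, limits the damage of a single
    inconsistent classifier to one bin. None marks an open-ended bound."""
    positives = sum(votes)
    lower = thresholds[positives - 1] if positives > 0 else None
    upper = thresholds[positives] if positives < len(thresholds) else None
    return lower, upper


# Assumed grid: one classifier every 4 dB from -20 dB to 32 dB.
THRESHOLDS = list(range(-20, 33, 4))
```

For a true SNR of 5 dB, the seven classifiers with thresholds from −20 to 4 dB vote 1, so `fuse_votes(THRESHOLDS, ideal_votes(5.0, THRESHOLDS))` returns `(4, 8)`. If one of those classifiers erred, the estimate would shift by a single bin rather than fail outright.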

Keywords: classification, CNN, deep learning, prediction, SNR

Procedia PDF Downloads 123

3670 Machine Learning for Disease Prediction Using Symptoms and X-Ray Images

Authors: Ravija Gunawardana, Banuka Athuraliya

Abstract:

Machine learning has emerged as a powerful tool for disease diagnosis and prediction. The use of machine learning algorithms has the potential to improve the accuracy of disease prediction, thereby enabling medical professionals to provide more effective and personalized treatments. This study focuses on developing a machine-learning model for disease prediction using symptoms and X-ray images. The importance of this study lies in its potential to assist medical professionals in accurately diagnosing diseases, thereby improving patient outcomes. Respiratory diseases are a significant cause of morbidity and mortality worldwide, and chest X-rays are commonly used in the diagnosis of these diseases. However, accurately interpreting X-ray images requires significant expertise and can be time-consuming, making it difficult to diagnose respiratory diseases in a timely manner. By incorporating machine learning algorithms, we can significantly enhance disease prediction accuracy, ultimately leading to better patient care. The study utilized the Mask R-CNN algorithm, which is a state-of-the-art method for object detection and segmentation in images, to process chest X-ray images. The model was trained and tested on a large dataset of patient information, which included both symptom data and X-ray images. The performance of the model was evaluated using a range of metrics, including accuracy, precision, recall, and F1-score. The results showed that the model achieved an accuracy rate of over 90%, indicating that it was able to accurately detect and segment regions of interest in the X-ray images. In addition to X-ray images, the study also incorporated symptoms as input data for disease prediction. The study used three different classifiers, namely Random Forest, K-Nearest Neighbor and Support Vector Machine, to predict diseases based on symptoms. These classifiers were trained and tested using the same dataset of patient information as the X-ray model. 
The results showed promising accuracy rates for predicting diseases using symptoms, with the ensemble learning techniques significantly improving the accuracy of disease prediction. The study's findings indicate that the use of machine learning algorithms can significantly enhance disease prediction accuracy, ultimately leading to better patient care. The model developed in this study has the potential to assist medical professionals in diagnosing respiratory diseases more accurately and efficiently. However, it is important to note that the accuracy of the model can be affected by several factors, including the quality of the X-ray images, the size of the dataset used for training, and the complexity of the disease being diagnosed. In conclusion, the study demonstrated the potential of machine learning algorithms for disease prediction using symptoms and X-ray images. The use of these algorithms can improve the accuracy of disease diagnosis, ultimately leading to better patient care. Further research is needed to validate the model's accuracy and effectiveness in a clinical setting and to expand its application to other diseases.
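
The evaluation metrics named in the abstract, and one simple way of combining several classifiers, can be sketched as follows. The abstract does not say which ensemble technique was used, so the majority vote here is an assumption, and the function names and example labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine one predicted label per classifier into a single label.
    Ties go to the label that appears first in the input."""
    return Counter(predictions).most_common(1)[0][0]


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1-score for one class treated as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For instance, if a Random Forest and a Support Vector Machine both predict "pneumonia" while K-Nearest Neighbor predicts "normal", `majority_vote` returns "pneumonia". Reporting precision and recall alongside accuracy matters in diagnosis because a model can score high accuracy on an imbalanced dataset while still missing most positive cases.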

Keywords: K-nearest neighbor, Mask R-CNN, random forest, support vector machine

Procedia PDF Downloads 125