Search results for: statistical classifiers
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4063

Search results for: statistical classifiers

3613 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded presentations of natural language that in text form. However, it is unknown how much linguistic information is encoded and how. In this paper, we construct several corresponding probing tasks for multiple linguistic information to clarify the encoding capabilities of different language models and performed a visual display. We firstly obtain word presentations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small scale of parameters and unsupervised tasks are then applied on these word vectors to discriminate their capability to encode corresponding linguistic information. The constructed probe tasks contain both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the grammatical aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 83
3612 Comprehensive Review of Adversarial Machine Learning in PDF Malware

Authors: Preston Nabors, Nasseh Tabrizi

Abstract:

Portable Document Format (PDF) files have gained significant popularity for sharing and distributing documents due to their universal compatibility. However, the widespread use of PDF files has made them attractive targets for cybercriminals, who exploit vulnerabilities to deliver malware and compromise the security of end-user systems. This paper reviews notable contributions in PDF malware detection, including static, dynamic, signature-based, and hybrid analysis. It presents a comprehensive examination of PDF malware detection techniques, focusing on the emerging threat of adversarial sampling and the need for robust defense mechanisms. The paper highlights the vulnerability of machine learning classifiers to evasion attacks. It explores adversarial sampling techniques in PDF malware detection to produce mimicry and reverse mimicry evasion attacks, which aim to bypass detection systems. Improvements for future research are identified, including accessible methods, applying adversarial sampling techniques to malicious payloads, evaluating other models, evaluating the importance of features to malware, implementing adversarial defense techniques, and conducting comprehensive examination across various scenarios. By addressing these opportunities, researchers can enhance PDF malware detection and develop more resilient defense mechanisms against adversarial attacks.

Keywords: adversarial attacks, adversarial defense, adversarial machine learning, intrusion detection, PDF malware, malware detection, malware detection evasion

Procedia PDF Downloads 21
3611 Electroencephalography Correlates of Memorability While Viewing Advertising Content

Authors: Victor N. Anisimov, Igor E. Serov, Ksenia M. Kolkova, Natalia V. Galkina

Abstract:

The problem of memorability of the advertising content is closely connected with the key issues of neuromarketing. The memorability of the advertising content contributes to the marketing effectiveness of the promoted product. Significant directions of studying the phenomenon of memorability are the memorability of the brand (detected through the memorability of the logo) and the memorability of the product offer (detected through the memorization of dynamic audiovisual advertising content - commercial). The aim of this work is to reveal the predictors of memorization of static and dynamic audiovisual stimuli (logos and commercials). An important direction of the research was revealing differences in psychophysiological correlates of memorability between static and dynamic audiovisual stimuli. We assumed that static and dynamic images are perceived in different ways and may have a difference in the memorization process. Objective methods of recording psychophysiological parameters while watching static and dynamic audiovisual materials are well suited to achieve the aim. The electroencephalography (EEG) method was performed with the aim of identifying correlates of the memorability of various stimuli in the electrical activity of the cerebral cortex. All stimuli (in the groups of statics and dynamics separately) were divided into 2 groups – remembered and not remembered based on the results of the questioning method. The questionnaires were filled out by survey participants after viewing the stimuli not immediately, but after a time interval (for detecting stimuli recorded through long-term memorization). Using statistical method, we developed the classifier (statistical model) that predicts which group (remembered or not remembered) stimuli gets, based on psychophysiological perception. The result of the statistical model was compared with the results of the questionnaire. Conclusions: Predictors of the memorability of static and dynamic stimuli have been identified, which allows prediction of which stimuli will have a higher probability of remembering. Further developments of this study will be the creation of stimulus memory model with the possibility of recognizing the stimulus as previously seen or new. Thus, in the process of remembering the stimulus, it is planned to take into account the stimulus recognition factor, which is one of the most important tasks for neuromarketing.

Keywords: memory, commercials, neuromarketing, EEG, branding

Procedia PDF Downloads 231
3610 Cat Stool as an Additive Aggregate to Garden Bricks

Authors: Mary Joy B. Amoguis, Alonah Jane D. Labtic, Hyna Wary Namoca, Aira Jane V. Original

Abstract:

Animal waste has been rapidly increasing due to the growing animal population and the lack of innovative waste management practices. In a country like the Philippines, animal waste is rampant. This study aims to minimize animal waste by producing garden bricks using cat stool as an additive. The research study analyzes different levels of concentration to determine the most efficient combination in terms of compressive strength and durability of cat stool as an additive to garden bricks. The researcher's first collects the cat stool and incinerates the different concentrations. The first concentration is 25% cat stool and 75% cement mixture. The second concentration is 50% cat stool and 50% cement mixture. And the third concentration is 75% cat stool and 25% cement mixture. The researchers analyze the statistical data using one-way ANOVA, and the statistical analysis revealed a significant difference compared to the controlled variable. The research findings show an inversely proportional relationship: the higher the concentration of cat stool additive, the lower the compressive strength of the bricks, and the lower the concentration of cat stool additive, the higher the compressive strength of the bricks.

Keywords: cat stool, garden bricks, cement, concentrations, animal wastes, compressive strength, durability, one-way ANOVA, additive, incineration, aggregates, stray cats

Procedia PDF Downloads 40
3609 Systematic Identification of Noncoding Cancer Driver Somatic Mutations

Authors: Zohar Manber, Ran Elkon

Abstract:

Accumulation of somatic mutations (SMs) in the genome is a major driving force of cancer development. Most SMs in the tumor's genome are functionally neutral; however, some cause damage to critical processes and provide the tumor with a selective growth advantage (termed cancer driver mutations). Current research on functional significance of SMs is mainly focused on finding alterations in protein coding sequences. However, the exome comprises only 3% of the human genome, and thus, SMs in the noncoding genome significantly outnumber those that map to protein-coding regions. Although our understanding of noncoding driver SMs is very rudimentary, it is likely that disruption of regulatory elements in the genome is an important, yet largely underexplored mechanism by which somatic mutations contribute to cancer development. The expression of most human genes is controlled by multiple enhancers, and therefore, it is conceivable that regulatory SMs are distributed across different enhancers of the same target gene. Yet, to date, most statistical searches for regulatory SMs have considered each regulatory element individually, which may reduce statistical power. The first challenge in considering the cumulative activity of all the enhancers of a gene as a single unit is to map enhancers to their target promoters. Such mapping defines for each gene its set of regulating enhancers (termed "set of regulatory elements" (SRE)). Considering multiple enhancers of each gene as one unit holds great promise for enhancing the identification of driver regulatory SMs. However, the success of this approach is greatly dependent on the availability of comprehensive and accurate enhancer-promoter (E-P) maps. To date, the discovery of driver regulatory SMs has been hindered by insufficient sample sizes and statistical analyses that often considered each regulatory element separately. In this study, we analyzed more than 2,500 whole-genome sequence (WGS) samples provided by The Cancer Genome Atlas (TCGA) and The International Cancer Genome Consortium (ICGC) in order to identify such driver regulatory SMs. Our analyses took into account the combinatorial aspect of gene regulation by considering all the enhancers that control the same target gene as one unit, based on E-P maps from three genomics resources. The identification of candidate driver noncoding SMs is based on their recurrence. We searched for SREs of genes that are "hotspots" for SMs (that is, they accumulate SMs at a significantly elevated rate). To test the statistical significance of recurrence of SMs within a gene's SRE, we used both global and local background mutation rates. Using this approach, we detected - in seven different cancer types - numerous "hotspots" for SMs. To support the functional significance of these recurrent noncoding SMs, we further examined their association with the expression level of their target gene (using gene expression data provided by the ICGC and TCGA for samples that were also analyzed by WGS).

Keywords: cancer genomics, enhancers, noncoding genome, regulatory elements

Procedia PDF Downloads 90
3608 Characteristics of Cumulative Distribution Function of Grown Crack Size at Specified Fatigue Crack Propagation Life under Different Maximum Fatigue Loads in AZ31

Authors: Seon Soon Choi

Abstract:

Magnesium alloy has been widely used in structure such as an automobile. It is necessary to consider probabilistic characteristics of a structural material because a fatigue behavior of a structure has a randomness and uncertainty. The purpose of this study is to find the characteristics of the cumulative distribution function (CDF) of the grown crack size at a specified fatigue crack propagation life and to investigate a statistical crack propagation in magnesium alloys. The statistical fatigue data of the grown crack size are obtained through the fatigue crack propagation (FCP) tests under different maximum fatigue load conditions conducted on the replicated specimens of magnesium alloys. The 3-parameter Weibull distribution is used to find the CDF of grown crack size. The CDF of grown crack size in case of larger maximum fatigue load has longer tail in below 10 percent and above 90 percent. The fatigue failure occurs easily as the tail of CDF of grown crack size becomes long. The fatigue behavior under the larger maximum fatigue load condition shows more rapid propagation and failure mode.

Keywords: cumulative distribution function, fatigue crack propagation, grown crack size, magnesium alloys, maximum fatigue load

Procedia PDF Downloads 269
3607 Examines the Proportionality between the Needs of Industry and Technical and Vocational Training of Male and Female Vocational Schools

Authors: Khalil Aryanfar, Pariya Gholipor, Elmira Hafez

Abstract:

This study examines the proportionality between the needs of industry and technical and vocational training of male and female vocational schools. The research method was descriptive that was conducted in two parts: documentary analysis and needs assessment and Delphi method was used in the need assessment. The statistical population of the study included 312 individuals from the industry sector employers and 52 of them were selected through stratified random sampling. Methods of data collection in this study, upstream documents include: document of the development of technical and vocational training, Statistical Yearbook 1393 in Tehran, the available documents in Isfahan Planning Department, the findings indicate that there is an almost proportionality between the needs of industry and Vocational training of male and female vocational schools in fields of welding, industrial electronics, electro technique, industrial drawing, auto mechanics, design, packaging, machine tool, metalworking, construction, accounting, computer graphics and the Administrative Affairs. The findings indicate that there is no proportionality between the needs of industry and Vocational training of male and female vocational schools in fields of Thermal - cooling systems, building electricity, building drawing, interior architecture, car electricity and motor repair.

Keywords: needs assessment, technical and vocational training, industry

Procedia PDF Downloads 434
3606 Effectiveness of Homoeopathic Medicine Conium Maculatum 200 C for Management of Pyuria

Authors: Amir Ashraf

Abstract:

Homoeopathy is an alternative system of medicine discovered by German physician Samuel Hahnemann in 1796. It has been used by several people for various health conditions globally for more than last 200 years. In India, homoeopathy is considered as a major system of alternative medicine. Homoeopathy is found effective in various medical conditions including Pyuria. Pyuria is the condition in which pus cells are found in urine. Homoeopathy is very useful for reducing pus cells, and homeopathically potentized Conium Mac (Hemlock) is an important remedy commonly used for reducing pyuria. Aim: To reduce the amount pus cells found in urine using Conium Mac 200C. Methods: Design. Small N Design. Samples: Purposive Sampling with 5 cases diagnosed as pyuria. Tools: Personal Data Schedule and ICD-10 Criteria for Pyuria. Techniques: Potentized homoeopathic medicine, Conium Mac 200th potency is used. Statistical Analysis: The statistical analyses were done using non-parametric tests. Results: There is significant pre/post difference has been identified. Conclusion: Homoeopathic potency, Conium Mac 200 C is effective in reducing the increased level of pus cells found in urine samples.

Keywords: homoeopathy, alternative medicine, Pyuria, Conim Mac, small N design, non-parametric tests, homeopathic physician, Ashirvad Hospital, Kannur

Procedia PDF Downloads 314
3605 Statistical Model to Examine the Impact of the Inflation Rate and Real Interest Rate on the Bahrain Economy

Authors: Ghada Abo-Zaid

Abstract:

Introduction: Oil is one of the most income source in Bahrain. Low oil price influence on the economy growth and the investment rate in Bahrain. For example, the economic growth was 3.7% in 2012, and it reduced to 2.9% in 2015. Investment rate was 9.8% in 2012, and it is reduced to be 5.9% and -12.1% in 2014 and 2015, respectively. The inflation rate is increased to the peak point in 2013 with 3.3 %. Objectives: The objectives here are to build statistical models to examine the effect of the interest rate inflation rate on the growth economy in Bahrain from 2000 to 2018. Methods: This study based on 18 years, and the multiple regression model is used for the analysis. All of the missing data are omitted from the analysis. Results: Regression model is used to examine the association between the Growth national product (GNP), the inflation rate, and real interest rate. We found that (i) Increase the real interest rate decrease the GNP. (ii) Increase the inflation rate does not effect on the growth economy in Bahrain since the average of the inflation rate was almost 2%, and this is considered as a low percentage. Conclusion: There is a positive impact of the real interest rate on the GNP in Bahrain. While the inflation rate does not show any negative influence on the GNP as the inflation rate was not large enough to effect negatively on the economy growth rate in Bahrain.

Keywords: growth national product, egypt, regression model, interest rate

Procedia PDF Downloads 137
3604 Predicting Stack Overflow Accepted Answers Using Features and Models with Varying Degrees of Complexity

Authors: Osayande Pascal Omondiagbe, Sherlock a Licorish

Abstract:

Stack Overflow is a popular community question and answer portal which is used by practitioners to solve technology-related challenges during software development. Previous studies have shown that this forum is becoming a substitute for official software programming languages documentation. While tools have looked to aid developers by presenting interfaces to explore Stack Overflow, developers often face challenges searching through many possible answers to their questions, and this extends the development time. To this end, researchers have provided ways of predicting acceptable Stack Overflow answers by using various modeling techniques. However, less interest is dedicated to examining the performance and quality of typically used modeling methods, and especially in relation to models’ and features’ complexity. Such insights could be of practical significance to the many practitioners that use Stack Overflow. This study examines the performance and quality of various modeling methods that are used for predicting acceptable answers on Stack Overflow, drawn from 2014, 2015 and 2016. Our findings reveal significant differences in models’ performance and quality given the type of features and complexity of models used. Researchers examining classifiers’ performance and quality and features’ complexity may leverage these findings in selecting suitable techniques when developing prediction models.

Keywords: feature selection, modeling and prediction, neural network, random forest, stack overflow

Procedia PDF Downloads 115
3603 Identifying and Quantifying Factors Affecting Traffic Crash Severity under Heterogeneous Traffic Flow

Authors: Praveen Vayalamkuzhi, Veeraragavan Amirthalingam

Abstract:

Studies on safety on highways are becoming the need of the hour as over 400 lives are lost every day in India due to road crashes. In order to evaluate the factors that lead to different levels of crash severity, it is necessary to investigate the level of safety of highways and their relation to crashes. In the present study, an attempt is made to identify the factors that contribute to road crashes and to quantify their effect on the severity of road crashes. The study was carried out on a four-lane divided rural highway in India. The variables considered in the analysis includes components of horizontal alignment of highway, viz., straight or curve section; time of day, driveway density, presence of median; median opening; gradient; operating speed; and annual average daily traffic. These variables were considered after a preliminary analysis. The major complexities in the study are the heterogeneous traffic and the speed variation between different classes of vehicles along the highway. To quantify the impact of each of these factors, statistical analyses were carried out using Logit model and also negative binomial regression. The output from the statistical models proved that the variables viz., horizontal components of the highway alignment; driveway density; time of day; operating speed as well as annual average daily traffic show significant relation with the severity of crashes viz., fatal as well as injury crashes. Further, the annual average daily traffic has significant effect on the severity compared to other variables. The contribution of highway horizontal components on crash severity is also significant. Logit models can predict crashes better than the negative binomial regression models. The results of the study will help the transport planners to look into these aspects at the planning stage itself in the case of highways operated under heterogeneous traffic flow condition.

Keywords: geometric design, heterogeneous traffic, road crash, statistical analysis, level of safety

Procedia PDF Downloads 271
3602 Regional Flood-Duration-Frequency Models for Norway

Authors: Danielle M. Barna, Kolbjørn Engeland, Thordis Thorarinsdottir, Chong-Yu Xu

Abstract:

Design flood values give estimates of flood magnitude within a given return period and are essential to making adaptive decisions around land use planning, infrastructure design, and disaster mitigation. Often design flood values are needed at locations with insufficient data. Additionally, in hydrologic applications where flood retention is important (e.g., floodplain management and reservoir design), design flood values are required at different flood durations. A statistical approach to this problem is a development of a regression model for extremes where some of the parameters are dependent on flood duration in addition to being covariate-dependent. In hydrology, this is called a regional flood-duration-frequency (regional-QDF) model. Typically, the underlying statistical distribution is chosen to be the Generalized Extreme Value (GEV) distribution. However, as the support of the GEV distribution depends on both its parameters and the range of the data, special care must be taken with the development of the regional model. In particular, we find that the GEV is problematic when developing a GAMLSS-type analysis due to the difficulty of proposing a link function that is independent of the unknown parameters and the observed data. We discuss these challenges in the context of developing a regional QDF model for Norway.

Keywords: design flood values, bayesian statistics, regression modeling of extremes, extreme value analysis, GEV

Procedia PDF Downloads 57
3601 Examination of the Main Behavioral Patterns of Male and Female Students in Islamic Azad University

Authors: Sobhan Sobhani

Abstract:

This study examined the behavioral patterns of student and their determinants according to the "symbolic interaction" sociological perspective in the form of 7 hypotheses. Behavioral patterns of students were classified in 8 categories: religious, scientific, political, artistic, sporting, national, parents and teachers. They were evaluated by student opinions by a five-point Likert rating scale. The statistical population included all male and female students of Islamic Azad University, Behabahan branch, among which 600 patients (268 females and 332 males) were selected randomly. The following statistical methods were used: frequency and percentage, mean, t-test, Pearson correlation coefficient and multi-way analysis of variance. The results obtained from statistical analysis showed that: 1-There is a significant difference between male and female students in terms of disposition to religious figures, artists, teachers and parents. 2-There is a significant difference between students of urban and rural areas in terms of assuming behavioral patterns of religious, political, scientific, artistic, national figures and teachers. 3-The most important criterion for selecting behavioral patterns of students is intellectual understanding with the pattern. 4-The most important factor influencing the behavioral patterns of male and female students is parents followed by friends. 5-Boys are affected by teachers, the Internet and satellite programs more than girls. Girls assume behavioral patterns from books more than boys. 6-There is a significant difference between students in human sciences, technical, medical and engineering disciplines in terms of selecting religious and political figures as behavioral patterns. 7-There is a significant difference between students belonging to different subcultures in terms of assuming behavioral patterns of religious, scientific and cultural figures. 8-Between the first and fourth year students in terms of selecting behavioral patterns, there is a significant difference only in selecting religious figures. 9-There is a significant negative correlation between the education level of parents and the selection of religious and political figures and teachers. 10-There is a significant negative correlation between family income and the selection of political and religious figures.

Keywords: behavioral patterns, behavioral patterns, male and female students, Islamic Azad University

Procedia PDF Downloads 344
3600 Simulation Based Analysis of Gear Dynamic Behavior in Presence of Multiple Cracks

Authors: Ahmed Saeed, Sadok Sassi, Mohammad Roshun

Abstract:

Gears are important components with a vital role in many rotating machines. One of the common gear failure causes is tooth fatigue crack; however, its early detection is still a challenging task. The objective of this study is to develop a numerical model that simulates the effect of teeth cracks on the resulting gears vibrations and permits consequently to perform an early fault detection. In contrast to other published papers, this work incorporates the possibility of multiple simultaneous cracks with different depths. As cracks alter significantly the stiffness of the tooth, finite element software is used to determine the stiffness variation with respect to the angular position, for different combinations of crack orientation and depth. A simplified six degrees of freedom nonlinear lumped parameter model of a one-stage spur gear system is proposed to study the vibration with and without cracks. The model developed for calculating the stiffness with the crack permitted to update the physical parameters of the second-degree-of-freedom equations of motions describing the vibration of the gearbox. The vibration simulation results of the gearbox were by obtained using Simulink/Matlab. The effect of one crack with different levels was studied thoroughly. The change in the mesh stiffness and the vibration response were found to be consistent with previously published works. In addition, various statistical time domain parameters were considered. They showed different degrees of sensitivity toward the crack depth. Multiple cracks were also introduced at different locations and the vibration response along with the statistical parameters were obtained again for a general case of degradation (increase in crack depth, crack number and crack locations). It was found that although some parameters increase in value as the deterioration level increases, they show almost no change or even decrease when the number of cracks increases. Therefore, the use of any statistical parameters could be misleading if not considered in an appropriate way.

Keywords: Spur gear, cracked tooth, numerical simulation, time-domain parameters

Procedia PDF Downloads 249
3599 Impact of Yogic Exercise on Cardiovascular Function on Selected College Students of High Altitude

Authors: Benu Gupta

Abstract:

The purpose of the study was to assess the impact of yogic exercise on cardiovascular exercises on selected college students of high altitude. The research was conducted on college students of high altitude in Shimla for their cardiovascular function [Blood Pressure (BP), VO2 Max (TLC) and Pulse Rate (PR)] in respect to yogic exercise. Total 139 students were randomly selected from Himachal University colleges in Shimla. The study was conducted in three phases. The subjects were identified in the first phase of research program then further in next phase they were physiologically tested, and yogic exercise battery was operated in different time frame. The entire subjects were treated with three months yogic exercise. The entire lot of students were again evaluated physiologically [(Cardiovascular measurement: Blood Pressure (BP), VO2 Max (TLC) and Pulse Rate (PR)] with standard equipments. The statistical analyses of the variance (PR, BP (SBP & DBP) and TLC) were done. The result reveals that there was a significant difference in TLC; whereas there was no significant difference in PR. For BP statistical analysis suggests no significant difference were formed. Result showed that the BP of the participants were more inclined towards normal standard BP i.e. 120-80 mmHg.

Keywords: cardiovascular function, college students, high altitude, yogic exercise

Procedia PDF Downloads 209
3598 Remote Assessment and Change Detection of GreenLAI of Cotton Crop Using Different Vegetation Indices

Authors: Ganesh B. Shinde, Vijaya B. Musande

Abstract:

Cotton crop identification based on the timely information has significant advantage to the different implications of food, economic and environment. Due to the significant advantages, the accurate detection of cotton crop regions using supervised learning procedure is challenging problem in remote sensing. Here, classifiers on the direct image are played a major role but the results are not much satisfactorily. In order to further improve the effectiveness, variety of vegetation indices are proposed in the literature. But, recently, the major challenge is to find the better vegetation indices for the cotton crop identification through the proposed methodology. Accordingly, fuzzy c-means clustering is combined with neural network algorithm, trained by Levenberg-Marquardt for cotton crop classification. To experiment the proposed method, five LISS-III satellite images was taken and the experimentation was done with six vegetation indices such as Simple Ratio, Normalized Difference Vegetation Index, Enhanced Vegetation Index, Green Atmospherically Resistant Vegetation Index, Wide-Dynamic Range Vegetation Index, Green Chlorophyll Index. Along with these indices, Green Leaf Area Index is also considered for investigation. From the research outcome, Green Atmospherically Resistant Vegetation Index outperformed with all other indices by reaching the average accuracy value of 95.21%.

Keywords: Fuzzy C-Means clustering (FCM), neural network, Levenberg-Marquardt (LM) algorithm, vegetation indices

Procedia PDF Downloads 294
3597 Statistical Tools for SFRA Diagnosis in Power Transformers

Authors: Rahul Srivastava, Priti Pundir, Y. R. Sood, Rajnish Shrivastava

Abstract:

For the interpretation of the signatures of sweep frequency response analysis(SFRA) of transformer different types of statistical techniques serves as an effective tool for doing either phase to phase comparison or sister unit comparison. In this paper with the discussion on SFRA several statistics techniques like cross correlation coefficient (CCF), root square error (RSQ), comparative standard deviation (CSD), Absolute difference, mean square error(MSE),Min-Max ratio(MM) are presented through several case studies. These methods require sample data size and spot frequencies of SFRA signatures that are being compared. The techniques used are based on power signal processing tools that can simplify result and limits can be created for the severity of the fault occurring in the transformer due to several short circuit forces or due to ageing. The advantages of using statistics techniques for analyzing of SFRA result are being indicated through several case studies and hence the results are obtained which determines the state of the transformer.

Keywords: absolute difference (DABS), cross correlation coefficient (CCF), mean square error (MSE), min-max ratio (MM-ratio), root square error (RSQ), standard deviation (CSD), sweep frequency response analysis (SFRA)

Procedia PDF Downloads 675
3596 Computer-Aided Classification of Liver Lesions Using Contrasting Features Difference

Authors: Hussein Alahmer, Amr Ahmed

Abstract:

Liver cancer is one of the common diseases that cause the death. Early detection is important to diagnose and reduce the incidence of death. Improvements in medical imaging and image processing techniques have significantly enhanced interpretation of medical images. Computer-Aided Diagnosis (CAD) systems based on these techniques play a vital role in the early detection of liver disease and hence reduce liver cancer death rate.  This paper presents an automated CAD system consists of three stages; firstly, automatic liver segmentation and lesion’s detection. Secondly, extracting features. Finally, classifying liver lesions into benign and malignant by using the novel contrasting feature-difference approach. Several types of intensity, texture features are extracted from both; the lesion area and its surrounding normal liver tissue. The difference between the features of both areas is then used as the new lesion descriptors. Machine learning classifiers are then trained on the new descriptors to automatically classify liver lesions into benign or malignant. The experimental results show promising improvements. Moreover, the proposed approach can overcome the problems of varying ranges of intensity and textures between patients, demographics, and imaging devices and settings.

Keywords: CAD system, difference of feature, fuzzy c means, lesion detection, liver segmentation

Procedia PDF Downloads 297
3595 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed as "big data." The technique of big data mining and big data analysis is extremely helpful for business movements such as making decisions, building organisational plans, researching the market efficiently, improving sales, etc., because typical management tools cannot handle such complicated datasets. Special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations, are brought on by big data. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining, its process, along with problems and difficulties, with a focus on the unique characteristics of big data. Organizations have several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is really analyzed. The idea of the mining and analysis of data and knowledge discovery techniques that have recently been created with practical application systems is presented in this study. This article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the management's main big data and data mining challenges.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 89
3594 Socio-Economic and Psychological Factors of Moscow Population Deviant Behavior: Sociological and Statistical Research

Authors: V. Bezverbny

Abstract:

The actuality of the project deals with stable growing of deviant behavior’ statistics among Moscow citizens. During the recent years the socioeconomic health, wealth and life expectation of Moscow residents is regularly growing up, but the limits of crime and drug addiction have grown up seriously. Another serious Moscow problem has been economical stratification of population. The cost of identical residential areas differs at 2.5 times. The project is aimed at complex research and the development of methodology for main factors and reasons evaluation of deviant behavior growing in Moscow. The main project objective is finding out the links between the urban environment quality and dynamics of citizens’ deviant behavior in regional and municipal aspect using the statistical research methods and GIS modeling. The conducted research allowed: 1) to evaluate the dynamics of deviant behavior in Moscow different administrative districts; 2) to describe the reasons of crime increasing, drugs addiction, alcoholism, suicides tendencies among the city population; 3) to develop the city districts classification based on the level of the crime rate; 4) to create the statistical database containing the main indicators of Moscow population deviant behavior in 2010-2015 including information regarding crime level, alcoholism, drug addiction, suicides; 5) to present statistical indicators that characterize the dynamics of Moscow population deviant behavior in condition of expanding the city territory; 6) to analyze the main sociological theories and factors of deviant behavior for concretization the deviation types; 7) to consider the main theoretical statements of the city sociology devoted to the reasons for deviant behavior in megalopolis conditions. To explore the level of deviant behavior’ factors differentiation, the questionnaire was worked out, and sociological survey involved more than 1000 people from different districts of the city was conducted. Sociological survey allowed to study the socio-economical and psychological factors of deviant behavior. It also included the Moscow residents’ open-ended answers regarding the most actual problems in their districts and reasons of wish to leave their place. The results of sociological survey lead to the conclusion that the main factors of deviant behavior in Moscow are high level of social inequality, large number of illegal migrants and bums, nearness of large transport hubs and stations on the territory, ineffective work of police, alcohol availability and drug accessibility, low level of psychological comfort for Moscow citizens, large number of building projects.

Keywords: deviant behavior, megapolis, Moscow, urban environment, social stratification

Procedia PDF Downloads 172
3593 Hand Gesture Interpretation Using Sensing Glove Integrated with Machine Learning Algorithms

Authors: Aqsa Ali, Aleem Mushtaq, Attaullah Memon, Monna

Abstract:

In this paper, we present a low cost design for a smart glove that can perform sign language recognition to assist the speech impaired people. Specifically, we have designed and developed an Assistive Hand Gesture Interpreter that recognizes hand movements relevant to the American Sign Language (ASL) and translates them into text for display on a Thin-Film-Transistor Liquid Crystal Display (TFT LCD) screen as well as synthetic speech. Linear Bayes Classifiers and Multilayer Neural Networks have been used to classify 11 feature vectors obtained from the sensors on the glove into one of the 27 ASL alphabets and a predefined gesture for space. Three types of features are used; bending using six bend sensors, orientation in three dimensions using accelerometers and contacts at vital points using contact sensors. To gauge the performance of the presented design, the training database was prepared using five volunteers. The accuracy of the current version on the prepared dataset was found to be up to 99.3% for target user. The solution combines electronics, e-textile technology, sensor technology, embedded system and machine learning techniques to build a low cost wearable glove that is scrupulous, elegant and portable.

Keywords: American sign language, assistive hand gesture interpreter, human-machine interface, machine learning, sensing glove

Procedia PDF Downloads 272
3592 Multi-Elemental Analysis Using Inductively Coupled Plasma Mass Spectrometry for the Geographical Origin Discrimination of Greek Giant Beans “Gigantes Elefantes”

Authors: Eleni C. Mazarakioti, Anastasios Zotos, Anna-Akrivi Thomatou, Efthimios Kokkotos, Achilleas Kontogeorgos, Athanasios Ladavos, Angelos Patakas

Abstract:

“Gigantes Elefantes” is a particularly dynamic crop of giant beans cultivated in western Macedonia (Greece). This variety of large beans growing in this area and specifically in the regions of Prespes and Kastoria is a protected designation of origin (PDO) species with high nutritional quality. Mislabeling of geographical origin and blending with unidentified samples are common fraudulent practices in Greek food market with financial and possible health consequences. In the last decades, multi-elemental composition analysis has been used in identifying the geographical origin of foods and agricultural products. In an attempt to discriminate the authenticity of Greek beans, multi-elemental analysis (Ag, Al, As, B, Ba, Be, Ca, Cd, Co, Cr, Cs, Cu, Fe, Ga, Ge, K, Li, Mg, Mn, Mo, Na, Nb, Ni, P, Pb, Rb, Re, Se, Sr, Ta, Ti, Tl, U, V, W, Zn, Zr) was performed by inductively coupled plasma mass spectrometry (ICP-MS) on 320 samples of beans, originated from Greece (Prespes and Kastoria), China and Poland. All samples were collected during the autumn of 2021. The obtained data were analysed by principal component analysis (PCA), an unsupervised statistical method, which allows for to reduce of the dimensionality of the enormous datasets. Statistical analysis revealed a clear separation of beans that had been cultivated in Greece compared with those from China and Poland. An adequate discrimination of geographical origin between bean samples originating from the two Greece regions, Prespes and Kastoria, was also evident. Our results suggest that multi-elemental analysis combined with the appropriate multivariate statistical method could be a useful tool for bean’s geographical authentication. Acknowledgment: This research has been financed by the Public Investment Programme/General Secretariat for Research and Innovation, under the call “YPOERGO 3, code 2018SE01300000: project title: ‘Elaboration and implementation of methodology for authenticity and geographical origin assessment of agricultural products.

Keywords: geographical origin, authenticity, multi-elemental analysis, beans, ICP-MS, PCA

Procedia PDF Downloads 52
3591 Application of Stochastic Models to Annual Extreme Streamflow Data

Authors: Karim Hamidi Machekposhti, Hossein Sedghi

Abstract:

This study was designed to find the best stochastic model (using of time series analysis) for annual extreme streamflow (peak and maximum streamflow) of Karkheh River at Iran. The Auto-regressive Integrated Moving Average (ARIMA) model used to simulate these series and forecast those in future. For the analysis, annual extreme streamflow data of Jelogir Majin station (above of Karkheh dam reservoir) for the years 1958–2005 were used. A visual inspection of the time plot gives a little increasing trend; therefore, series is not stationary. The stationarity observed in Auto-Correlation Function (ACF) and Partial Auto-Correlation Function (PACF) plots of annual extreme streamflow was removed using first order differencing (d=1) in order to the development of the ARIMA model. Interestingly, the ARIMA(4,1,1) model developed was found to be most suitable for simulating annual extreme streamflow for Karkheh River. The model was found to be appropriate to forecast ten years of annual extreme streamflow and assist decision makers to establish priorities for water demand. The Statistical Analysis System (SAS) and Statistical Package for the Social Sciences (SPSS) codes were used to determinate of the best model for this series.

Keywords: stochastic models, ARIMA, extreme streamflow, Karkheh river

Procedia PDF Downloads 129
3590 Development of Trigger Tool to Identify Adverse Drug Events From Warfarin Administered to Patient Admitted in Medical Wards of Chumphae Hospital

Authors: Puntarikorn Rungrattanakasin

Abstract:

Objectives: To develop the trigger tool to warn about the risk of bleeding as an adverse event from warfarin drug usage during admission in Medical Wards of Chumphae Hospital. Methods: A retrospective study was performed by reviewing the medical records for the patients admitted between June 1st,2020- May 31st, 2021. ADEs were evaluated by Naranjo’s algorithm. The international normalized ratio (INR) and events of bleeding during admissions were collected. Statistical analyses, including Chi-square test and Reciever Operating Characteristic (ROC) curve for optimal INR threshold, were used for the study. Results: Among the 139 admissions, the INR range was found to vary between 0.86-14.91, there was a total of 15 bleeding events, out of which 9 were mild, and 6 were severe. The occurrence of bleeding started whenever the INR was greater than 2.5 and reached the statistical significance (p <0.05), which was in concordance with the ROC curve and yielded 100 % sensitivity and 60% specificity in the detection of a bleeding event. In this regard, the INR greater than 2.5 was considered to be an optimal threshold to alert promptly for bleeding tendency. Conclusions: The INR value of greater than 2.5 (>2.5) would be an appropriate trigger tool to warn of the risk of bleeding for patients taking warfarin in Chumphae Hospital.

Keywords: trigger tool, warfarin, risk of bleeding, medical wards

Procedia PDF Downloads 128
3589 Dynamical Relation of Poisson Spike Trains in Hodkin-Huxley Neural Ion Current Model and Formation of Non-Canonical Bases, Islands, and Analog Bases in DNA, mRNA, and RNA at or near the Transcription

Authors: Michael Fundator

Abstract:

Groundbreaking application of biomathematical and biochemical research in neural networks processes to formation of non-canonical bases, islands, and analog bases in DNA and mRNA at or near the transcription that contradicts the long anticipated statistical assumptions for the distribution of bases and analog bases compounds is implemented through statistical and stochastic methods apparatus with addition of quantum principles, where the usual transience of Poisson spike train becomes very instrumental tool for finding even almost periodical type of solutions to Fokker-Plank stochastic differential equation. Present article develops new multidimensional methods of finding solutions to stochastic differential equations based on more rigorous approach to mathematical apparatus through Kolmogorov-Chentsov continuity theorem that allows the stochastic processes with jumps under certain conditions to have γ-Holder continuous modification that is used as basis for finding analogous parallels in dynamics of neutral networks and formation of analog bases and transcription in DNA.

Keywords: Fokker-Plank stochastic differential equation, Kolmogorov-Chentsov continuity theorem, neural networks, translation and transcription

Procedia PDF Downloads 385
3588 Design of an Ensemble Learning Behavior Anomaly Detection Framework

Authors: Abdoulaye Diop, Nahid Emad, Thierry Winter, Mohamed Hilia

Abstract:

Data assets protection is a crucial issue in the cybersecurity field. Companies use logical access control tools to vault their information assets and protect them against external threats, but they lack solutions to counter insider threats. Nowadays, insider threats are the most significant concern of security analysts. They are mainly individuals with legitimate access to companies information systems, which use their rights with malicious intents. In several fields, behavior anomaly detection is the method used by cyber specialists to counter the threats of user malicious activities effectively. In this paper, we present the step toward the construction of a user and entity behavior analysis framework by proposing a behavior anomaly detection model. This model combines machine learning classification techniques and graph-based methods, relying on linear algebra and parallel computing techniques. We show the utility of an ensemble learning approach in this context. We present some detection methods tests results on an representative access control dataset. The use of some explored classifiers gives results up to 99% of accuracy.

Keywords: cybersecurity, data protection, access control, insider threat, user behavior analysis, ensemble learning, high performance computing

Procedia PDF Downloads 104
3587 Fault Detection and Isolation in Sensors and Actuators of Wind Turbines

Authors: Shahrokh Barati, Reza Ramezani

Abstract:

Due to the countries growing attention to the renewable energy producing, the demand for energy from renewable energy has gone up among the renewable energy sources; wind energy is the fastest growth in recent years. In this regard, in order to increase the availability of wind turbines, using of Fault Detection and Isolation (FDI) system is necessary. Wind turbines include of various faults such as sensors fault, actuator faults, network connection fault, mechanical faults and faults in the generator subsystem. Although, sensors and actuators have a large number of faults in wind turbine but have discussed fewer in the literature. Therefore, in this work, we focus our attention to design a sensor and actuator fault detection and isolation algorithm and Fault-tolerant control systems (FTCS) for Wind Turbine. The aim of this research is to propose a comprehensive fault detection and isolation system for sensors and actuators of wind turbine based on data-driven approaches. To achieve this goal, the features of measurable signals in real wind turbine extract in any condition. The next step is the feature selection among the extract in any condition. The next step is the feature selection among the extracted features. Features are selected that led to maximum separation networks that implemented in parallel and results of classifiers fused together. In order to maximize the reliability of decision on fault, the property of fault repeatability is used.

Keywords: FDI, wind turbines, sensors and actuators faults, renewable energy

Procedia PDF Downloads 380
3586 Assessing the Self-Directed Learning Skills of the Undergraduate Nursing Students in a Medical University in Bahrain: A Quantitative Study

Authors: Catherine Mary Abou-Zaid

Abstract:

This quantitative study discusses the concerns with the self-directed learning (SDL) skills of the undergraduate nursing students in a medical university in Bahrain. The nursing undergraduate student SDL study was conducted taking all 4 years and compiling data collected from the students themselves by survey questionnaire. The aim of the study is to understand and change the attitudes of self-directed learning among the undergraduate students. The SDL of the undergraduate student nurses has been noticed to be lacking and motivation to actually perform without supervision while out-with classrooms are very low. Their use of the resources available on the virtual learning environment and also within the university is not as good as it should be for a university student at this level. They do not use them to their own advantage. They are not prepared for the transition from high school to an academic environment such as a university or college. For some students it is the first time in their academic lives that they have faced sharing a classroom with the opposite sex. For some this is a major issue and we as academics need to be aware of all issues that they come to higher education with. Design Methodology: The design methodology that was chosen was a quantitative design using convenience sampling of the students who would be asked to complete survey questionnaire. This sampling method was chosen because of the time constraint. This was completed by the undergraduate students themselves while in class. The questionnaire was analyzed by the statistical package for social sciences (SPSS), the results interpreted by the researcher and the findings published in the paper. The analyzed data will also be reported on and from this information we as educators will be able to see the student’s weaknesses regarding self-directed learning. The aims and objectives of the research will be used as recommendations for the improvement of resources for the students to improve their SDL skills. Conclusion: The results will be able to give the educators an insight to how we can change the self-directed learning techniques of the students and enable them to embrace the skills and to focus more on being self-directed in their studies rather than having to be put on to a SDL pathway from the educators themselves. This evidence will come from the analysis of the statistical data. It may even change the way in which the students are selected for the nursing programme. These recommendations will be reported to the head of school and also to the nursing faculty.

Keywords: self-directed learning, undergraduate students, transition, statistical package for social sciences (SPSS), higher education

Procedia PDF Downloads 295
3585 Coastal Environment: Statistical Analysis and Geomorphic Impact on Urban Tourism in Lagos, Portugal

Authors: Magdalena Kuleta

Abstract:

Ponta de Piedade (37º05 ' N, 08º40 ' W) is an area located in the southern part of the Lagos municipality, which include an abrasive and accumulative type of coastline. It is the one of the main touristic destinations of the city. The dynamic development of the attractiveness of the coast, is related with the expansion of the new tourism infrastructure and urban tourism products. These products are: transportation, sightseeing and entertainment in the form of the boat trips. Each type of excursion refers to the different product. This progress brings also many risks associated primarily with landslides cliffs. Natural conditions affecting the coast, create a huge impact on the evolution of urban tourism management. Based on observation, statistical analysis and survey method, author compare the period of six years from 2012 to 2016 in terms of the number of tourists, number and diversity of attractions, most frequently dialled products and infrastructure changes in the city. Carried methodology is based on data belonging to Turismo Portugal and the tourist company Days of Adventure. Main result, is to indicate the essence of the income from coastal tourism into the city development and how does it influence on the marketing and promoting of urban tourism in Lagos.

Keywords: geomorphology of the coast in Lagos, market and promotion, quality of tourism service, urban tourism products

Procedia PDF Downloads 294
3584 Effect of Electronic Banking on the Performance of Deposit Money Banks in Nigeria: Using ATM and Mobile Phone as a Case Study

Authors: Charity Ifunanya Osakwe, Victoria Ogochuchukwu Obi-Nwosu, Chima Kenneth Anachedo

Abstract:

The study investigates how automated teller machines (ATM) and mobile banking affect deposit money banks in the Nigerian economy. The study made use of time series data which were obtained from the Central Bank of Nigeria Statistical Bulletin from 2009 to 2021. The Central Bank of Nigeria (CBN) data on automated teller machine and mobile phones were used to proxy electronic banking while total deposit in banks proxied the performance of deposit money banks. The analysis for the study was done using ordinary least square econometric technique with the aid of economic view statistical package. The results show that the automated teller machine has a positive and significant effect on the total deposits of deposit money banks in Nigeria and that making use of deposits of deposit money banks in Nigeria. It was concluded in the study that e-banking has equally increased banking access to customers and also created room for banks to expand their operations to more customers. The study recommends that banks in Nigeria should prioritize the expansion and maintenance of ATM networks as well as continue to invest in and develop more mobile banking services.

Keywords: electronic, banking, automated teller machines, mobile, deposit

Procedia PDF Downloads 33