Search results for: statistical classifiers
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4120

3700 Stock Market Prediction Using Convolutional Neural Network That Learns from a Graph

Authors: Mo-Se Lee, Cheol-Hwi Ahn, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, the convolutional neural network (CNN), known as an effective solution for recognizing and classifying images, has been widely applied to classification and prediction problems in various fields. In this study, we apply CNN to stock market prediction, one of the most challenging tasks in machine learning research. Specifically, we propose to use CNN as a binary classifier that predicts stock market direction (up or down) from a graph given as input. That is, our proposal is to build a machine learning algorithm that mimics a person who looks at the graph and predicts whether the trend will go up or down. Our proposed model consists of four steps. In the first step, it divides the dataset into intervals of 5, 10, 15, and 20 days. In step 2, it creates graphs for each interval. In step 3, CNN classifiers are trained using the graphs generated in the previous step. In step 4, it optimizes the hyperparameters of the trained model using the validation dataset. To validate our model, we will apply it to the prediction of KOSPI200 for 1,986 days over eight years (from 2009 to 2016). The experimental dataset will include 14 technical indicators such as CCI, Momentum, and ROC, and the daily closing price of KOSPI200 of the Korean stock market.
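The first step of the pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sliding-window scheme, the next-day labeling rule, the function name `make_windows`, and the sample prices are all assumptions. Graph rendering and CNN training (steps 2 to 4) are omitted.

```python
from typing import Dict, List, Tuple

Sample = Tuple[List[float], int]


def make_windows(closes: List[float], window: int) -> List[Sample]:
    """Pair each `window`-day slice of closing prices with a direction label.

    Assumption: the label compares the close on the day after the window
    with the last close inside the window (1 = up, 0 = down or flat).
    """
    samples: List[Sample] = []
    # Stop early enough that every window has a following day to label against.
    for start in range(len(closes) - window):
        chunk = closes[start:start + window]
        next_close = closes[start + window]
        label = 1 if next_close > chunk[-1] else 0
        samples.append((chunk, label))
    return samples


# Made-up closing prices, for illustration only.
prices = [100.0, 101.5, 101.0, 102.2, 103.0, 102.5, 104.1]

# One sample set per interval length named in the abstract.
windows_by_length: Dict[int, List[Sample]] = {
    w: make_windows(prices, w) for w in (5, 10, 15, 20)
}
```

With only seven prices, the 5-day interval yields two samples and the longer intervals yield none; a real series such as the 1,986 trading days mentioned above would populate all four.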

Keywords: convolutional neural network, deep learning, Korean stock market, stock market prediction

Procedia PDF Downloads 423
3699 Effects of a Student-Centered Approach to Assessment on Students' Attitudes towards 'Applied Statistics' Course

Authors: Anduela Lile

Abstract:

The purpose of this cross-sectional study was to investigate the effectiveness of teaching and learning Statistics from a student-centered perspective in higher education institutions. Statistics education has emphasized the application of tangible and interesting examples in order to motivate students to learn statistical concepts. Participants in this study were 112 bachelor students enrolled in the ‘Applied Statistics’ course at the Sports University of Tirana. Experimental group students received a student-centered teaching approach; control group students received an instructor-centered teaching approach. The study reports that the student-centered group had statistically significantly higher assessment scores (52.1 ± 18.9) at the end of the evaluation than the instructor-centered group (61.8 ± 16.4), (t(108) = 2.848, p = 0.005). The results suggest that a student-centered perspective can improve students' positive attitude toward statistical methods and motivate project work. Therefore, the findings of this study may be very useful to higher education institutions in establishing their learning strategies, especially for courses related to Statistics.

Keywords: student-centered, instructor-centered, course assessment, learning outcomes, applied statistics

Procedia PDF Downloads 277
3698 Surface Quality Improvement of Abrasive Waterjet Cutting for Spacecraft Structure

Authors: Tarek M. Ahmed, Ahmed S. El Mesalamy, Amro M. Youssef, Tawfik T. El Midany

Abstract:

Abrasive waterjet (AWJ) machining is considered one of the most powerful cutting processes. It can be used for cutting heat-sensitive, hard, and reflective materials. Aluminum 2024 is a high-strength alloy that is widely used in the aerospace and aviation industries. This paper aims to improve the AWJ cutting of this aluminum alloy and to investigate the effect of AWJ control parameters on surface geometry quality. Design of experiments (DoE) is used to establish an experimental matrix. Statistical modeling is used to relate the cutting parameters (pressure, speed, and distance between the nozzle and cut surface) to the responses (taper angle and surface roughness). The results revealed a tangible improvement in productivity by using AWJ processing. The taper kerf angle can be improved by decreasing standoff distance and speed and increasing water pressure, while decreasing cutting speed, pressure, and the distance between the nozzle and cut surface improves the surface roughness within the operating window of the cutting parameters.

Keywords: abrasive waterjet machining, machining of aluminum alloy, non-traditional cutting, statistical modeling

Procedia PDF Downloads 246
3697 Prevalence of Breast Cancer Molecular Subtypes at a Tertiary Cancer Institute

Authors: Nahush Modak, Meena Pangarkar, Anand Pathak, Ankita Tamhane

Abstract:

Background: Breast cancer is a prominent cause of cancer and mortality among women. This study presents a statistical analysis of a cohort of over 250 patients with breast cancer diagnosed by oncologists using immunohistochemistry (IHC). IHC was performed using ER, PR, HER2, and Ki-67 antibodies. Materials and methods: Formalin-fixed, paraffin-embedded tissue samples were obtained surgically, and the standard protocol was followed for fixation, grossing, tissue processing, embedding, cutting, and IHC. The Ventana Benchmark XT machine was used for automated IHC of the samples. The antibodies used were supplied by F. Hoffmann-La Roche Ltd. Statistical analysis was performed using SPSS for Windows. The statistical tests performed were the chi-squared test and correlation tests at p < .01. The raw data were collected and provided by the National Cancer Institute, Jamtha, India. Results: Luminal B was the most prevalent molecular subtype of breast cancer at our institute; a chi-squared test of homogeneity was performed to test for equality of distribution. The worse prognostic indicator for breast cancer depends upon expression of Ki-67 and HER2 protein in cancerous cells. Our study was done at p < .01, and significant dependence was observed. No dependence of molecular subtype on age was found. Similarly, age is independent of Ki-67 expression. A chi-squared test was performed on the human epidermal growth factor receptor 2 (HER2) statuses of patients, and strong dependence was observed between the percentage of Ki-67 expression and HER2 (+/-) status, which shows that the value of Ki-67 depends upon HER2 expression in cancerous cells (p < .01). Surprisingly, dependence was also observed between Ki-67 and PR at p < .01. This shows that progesterone receptor (PR) proteins are over-expressed when there is an elevation in the expression of Ki-67 protein.
Conclusion: We conclude that Luminal B is the most prevalent molecular subtype at the National Cancer Institute, Jamtha, India. No significant correlation was found between age and Ki-67 expression in any molecular subtype, and no dependence or correlation exists between patients’ age and molecular subtype. We also found that, in the cohort of 257 patients, no patient diagnosed with Luminal A shows a Ki-67 value >14%. Statistically, extremely significant values were observed for the dependence of PR+HER2- and PR-HER2+ scores on Ki-67 expression (p < .01). HER2 is an important prognostic factor in breast cancer. The chi-squared test for HER2 and Ki-67 shows that the expression of Ki-67 depends upon HER2 status. Moreover, Ki-67 cannot be used as a standalone prognostic factor for determining breast cancer.

Keywords: breast cancer molecular subtypes, correlation, immunohistochemistry, Ki-67 and HR, statistical analysis

Procedia PDF Downloads 118
3696 Characterization of Climatic Drought in the Saiss Plateau (Morocco) Using Statistical Indices

Authors: Abdeghani Qadem

Abstract:

Climate change is now an undeniable reality with increasing impacts on water systems worldwide, especially leading to severe drought episodes. The Southern Mediterranean region is particularly affected by this drought, which can have devastating consequences on water resources. Morocco, due to its geographical location in North Africa and the Southern Mediterranean, is especially vulnerable to these effects of climate change, particularly drought. In this context, this article focuses on the study of climate variability and drought characteristics in the Saiss Plateau region and its adjacent areas with the Middle Atlas, using specific statistical indices. The study begins by analyzing the annual precipitation variation, with a particular emphasis on data homogenization and gap filling using a regional vector. Then, the analysis delves into drought episodes in the region, using the Standardized Precipitation Index (SPI) over a 12-month period. The central objective is to accurately assess significant drought changes between 1980 and 2015, based on data collected from nine meteorological stations located in the study area.
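The 12-month accumulation behind the index can be sketched as below. This is a simplified stand-in, not the SPI as normally computed: the full index fits a probability distribution (commonly a gamma distribution) to the accumulated precipitation before transforming to a standard normal score, whereas this sketch only standardizes rolling 12-month totals as z-scores. The function names and the monthly values are assumptions for illustration.

```python
from statistics import mean, stdev
from typing import List


def rolling_totals(monthly: List[float], window: int = 12) -> List[float]:
    """Sum precipitation over each consecutive `window`-month period."""
    return [
        sum(monthly[i - window + 1:i + 1])
        for i in range(window - 1, len(monthly))
    ]


def standardized_anomalies(totals: List[float]) -> List[float]:
    """Z-score each total against the mean and sample standard deviation.

    Simplification: the real SPI uses a fitted distribution, not a plain z-score.
    """
    mu = mean(totals)
    sigma = stdev(totals)
    return [(t - mu) / sigma for t in totals]


# Made-up monthly precipitation (mm): a normal year followed by a dry year.
year_one = [50.0, 40.0, 30.0, 20.0, 10.0, 0.0, 0.0, 5.0, 20.0, 40.0, 60.0, 70.0]
monthly_precip = year_one + [v / 2 for v in year_one]

totals_12 = rolling_totals(monthly_precip)
z_scores = standardized_anomalies(totals_12)
```

Negative values mark 12-month periods drier than the series average; in this made-up series the scores fall steadily as the dry year replaces the normal one.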

Keywords: climate variability, regional vector, drought, standardized precipitation index, Saiss Plateau, middle atlas

Procedia PDF Downloads 62
3695 Impact of Instagram Food Bloggers on Consumer (Generation Z) Decision Making Process in Islamabad. Pakistan

Authors: Tabinda Sadiq, Tehmina Ashfaq Qazi, Hoor Shumail

Abstract:

Recently, the advent of emerging technology has created a new generation of restaurant marketing. This study explores the aspects that influence customers’ decision-making process in selecting a restaurant after reading food bloggers' reviews online. The motivation behind this research is to investigate the correlation between the credibility of the source and customers' attitude toward restaurant visits. Drawing on source credibility theory, the researchers collected data by distributing a survey questionnaire through Google Forms. A non-probability purposive sampling technique was used. The questionnaire used a predeveloped and validated scale by Ohanian to measure the relationship. Data were collected from 250 respondents in order to investigate the influence of food bloggers on Gen Z's decision-making process. SPSS version 26 was used for statistical testing and data analysis. The findings of the survey revealed a moderate positive correlation between the variables. It can therefore be concluded that food bloggers do have an impact on Generation Z's decision-making process.

Keywords: credibility, decision making, food bloggers, generation z, e-wom

Procedia PDF Downloads 68
3694 The Use of Boosted Multivariate Trees in Medical Decision-Making for Repeated Measurements

Authors: Ebru Turgal, Beyza Doganay Erdogan

Abstract:

Machine learning aims to model the relationship between the response and features. Medical decision-making researchers would like to make decisions about patients’ course and treatment by examining repeated measurements over time. The boosting approach is now used in machine learning as an influential tool for these aims. The aim of this study is to show the use of multivariate tree boosting in this field. The main reason for using this approach in decision-making is that it eases the handling of complex relationships. To show how the multivariate tree boosting method can be used to identify important features and feature-time interactions, we used data collected retrospectively from Ankara University Chest Diseases Department records. The dataset includes repeated PF ratio measurements. The follow-up time is planned for 120 hours. A set of different models is tested. In conclusion, classification with a weighted combination of classifiers is a reliable method, as has been shown with simulations several times. Furthermore, time-varying variables will be taken into consideration within this concept, and it could be possible to make accurate decisions about regression and survival problems.

Keywords: boosted multivariate trees, longitudinal data, multivariate regression tree, panel data

Procedia PDF Downloads 201
3693 An Investigation of Surface Water Quality in an Industrial Area Using Integrated Approaches

Authors: Priti Saha, Biswajit Paul

Abstract:

Rapid urbanization and industrialization have increased the pollution load in surface water bodies. However, these water bodies are a major source of water for drinking, irrigation, industrial activities, and fishery. Therefore, water quality assessment is of paramount importance for evaluating their suitability for all these purposes. This study evaluates the surface water quality of an industrial city in eastern India by integrating interdisciplinary techniques. The multi-purpose Water Quality Index (WQI) assesses the suitability for drinking, irrigation, and fishery of forty-eight sampling locations, of which 8.33% have excellent water quality (WQI: 0-25) for fishery, and 10.42%, 20.83%, and 45.83% have good quality (WQI: 25-50), which represents suitability for drinking, irrigation, and fishery, respectively. The industrial water quality was assessed through the Ryznar Stability Index (RSI), which affirmed that only 6.25% of sampling locations have neither corrosive nor scale-forming properties (RSI: 6.2-6.8). Integrating these statistical analyses with a geographical information system (GIS) helps in spatial assessment. It identifies the regions where the water quality is suitable for drinking, irrigation, fishery, and industrial activities. This research demonstrates the effectiveness of statistical and GIS techniques for water quality assessment.
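The two index thresholds quoted in the abstract can be expressed as a small sketch. The Ryznar index is computed with its standard definition, RSI = 2 × pHs − pH, where pHs is the saturation pH (taken here as an input rather than derived from water chemistry). The function names, the handling of the shared boundary at WQI = 25, the "unclassified" label for values the abstract does not cover, and the sample readings are assumptions.

```python
def wqi_class(wqi: float) -> str:
    """Map a Water Quality Index value to the classes named in the abstract."""
    if 0 <= wqi <= 25:
        return "excellent"
    if 25 < wqi <= 50:  # assumption: a value of exactly 25 counts as excellent
        return "good"
    return "unclassified"  # the abstract gives no classes above 50


def ryznar_index(ph: float, ph_saturation: float) -> float:
    """Ryznar Stability Index: RSI = 2 * pHs - pH."""
    return 2.0 * ph_saturation - ph


def is_stable(rsi: float) -> bool:
    """True when RSI lies in the 6.2-6.8 band quoted in the abstract."""
    return 6.2 <= rsi <= 6.8


# Made-up readings for one sampling location.
sample_rsi = ryznar_index(ph=7.5, ph_saturation=7.0)
sample_summary = (wqi_class(42.0), sample_rsi, is_stable(sample_rsi))
```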

Keywords: surface water, water quality assessment, water quality index, spatial assessment

Procedia PDF Downloads 174
3692 An Appraisal of Maintenance Management Practices in Federal University Dutse and Jigawa State Polytechnic Dutse, Nigeria

Authors: Aminu Mubarak Sadis

Abstract:

This study appraised the maintenance management practice in Federal University Dutse and Jigawa State Polytechnic Dutse, in Nigeria. The Physical Planning, Works and Maintenance Departments of the two higher institutions (Federal University Dutse and Jigawa State Polytechnic) are responsible for production and maintenance management of their physical assets. Over-enrollment has been a common problem in higher institutions in Nigeria. Data were collected through administered questionnaires, followed by oral interviews to authenticate the completed questionnaires. A random sampling technique was used to select 150 respondents across the two institutions (Federal University Dutse and Jigawa State Polytechnic Dutse). The data collected were analyzed using the Statistical Package for the Social Sciences (SPSS) and t-test statistical techniques. The conclusion was that maintenance management activities are yet to be given appropriate attention among the functions of the university and polytechnic, which are crucial to improving teaching, learning, and research. The units responsible for maintaining and managing facilities should focus on their stated functions and effect changes where possible.

Keywords: appraisal, maintenance management, university, Polytechnic, practices

Procedia PDF Downloads 240
3691 The Concept of Neurostatistics as a Neuroscience

Authors: Igwenagu Chinelo Mercy

Abstract:

This study is on the concept of Neurostatistics in relation to neuroscience. Neuroscience, also known as neurobiology, is the scientific study of the nervous system. In the study of neuroscience, it has been noted that brain function and its relation to the process of acquiring knowledge and behaviour can be better explained by the use of various interrelated methods. The scope of neuroscience has broadened over time to include different approaches used to study the nervous system at different scales. Neurostatistics, on the other hand, is viewed in this study as a statistical concept that uses techniques similar to neuron mechanisms to solve some problems, especially in the field of life science. This study is imperative in this era of artificial intelligence and machine learning in the sense that a clear understanding of the technique and its proper application could assist in solving some medical disorders that are mainly associated with the nervous system. This will also help in the layman’s understanding of the technique of the nervous system in order to overcome some of the health challenges associated with it. For this concept to be well understood, an illustrative example using a brain-associated disorder was used for demonstration. Structural equation modelling was adopted in the analysis. The results clearly show the link between the techniques of the statistical model and the nervous system. Hence, based on this study, the appropriateness of applying Neurostatistics in relation to neuroscience could be based on the understanding of the behavioural pattern of both concepts.

Keywords: brain, neurons, neuroscience, neurostatistics, structural equation modeling

Procedia PDF Downloads 69
3690 The Trend of Injuries in Building Fire in Tehran from 2002 to 2012

Authors: Mohammadreza Ashouri, Majid Bayatian

Abstract:

Analysis of fire data is a way for the implementation of any plan to improve the level of safety in cities. Such an analysis is able to reveal signs of changes in a given period and can be used as a measure of safety. The information of about 66,341 fires (from 2002 to 2012) released by Tehran Safety Services and Fire-Fighting Organization and data on the population and the number of households provided by Tehran Municipality and the Statistical Yearbook of Iran were extracted. Using the data, the fire changes, the rate of injuries, and mortality rate were determined and analyzed. The rate of injuries and mortality rate of fires per one million population of Tehran were 59.58% and 86.12%, respectively. During the study period, the number of fires and fire stations increased by 104.38% and 102.63%, respectively. Most fires (9.21%) happened in the 4th District of Tehran. The results showed that the recorded fire data have not been systematically planned for fire prevention since one of the ways to reduce injuries caused by fires is to develop a systematic plan for necessary actions in emergency situations. To determine a reliable source for fire prevention, the stages, definitions of working processes and the cause and effect chains should be considered. Therefore, a comprehensive statistical system should be developed for reported and recorded fire data.

Keywords: fire statistics, fire analysis, accident prevention, Tehran

Procedia PDF Downloads 177
3689 Non-Targeted Adversarial Image Classification Attack-Region Modification Methods

Authors: Bandar Alahmadi, Lethia Jackson

Abstract:

Machine learning models are used today in many real-life applications. The safety and security of such models are important so that their results are as accurate as possible. One challenge for machine learning model security is the adversarial examples attack. Adversarial examples are designed by the attacker to cause the machine learning model to misclassify the input. We propose a method to generate adversarial examples to attack image classifiers. We modify successfully classified images so that a classifier misclassifies them after the modification. In our method, we do not update the whole image; instead, we detect the important region, modify it, place it back into the original image, and then run it through a classifier. The algorithm modifies the detected region using two methods. First, it adds an abstract image matrix behind the detected image matrix. Then, it performs a rotation attack to rotate the detected region around its axes and embeds the trace of the image in the image background. Finally, the attacked region is placed in its original position, from where it was removed, and a smoothing filter is applied to blend the background with the foreground. We tested our method on a cascade classifier, where the algorithm was efficient: the classifier confidence dropped to almost zero. We also tried it on a CNN (convolutional neural network) with a higher setting, and the algorithm also worked successfully.

Keywords: adversarial examples, attack, computer vision, image processing

Procedia PDF Downloads 333
3688 Reduction of Defects Using Seven Quality Control Tools for Productivity Improvement at Automobile Company

Authors: Abdul Sattar Jamali, Imdad Ali Memon, Maqsood Ahmed Memon

Abstract:

Production quality near zero defects is an objective of every manufacturing and service organization. In order to maintain and improve quality by reducing defects, statistical tools are used by many organizations. Many statistical tools are available to assess quality. Among them, the traditional seven quality control (7QC) tools have been widely used in manufacturing and the automobile industry. Therefore, the 7QC tools were applied at an automobile company in Pakistan. A preliminary survey was done for the implementation of the 7QC tools in the assembly line. During the preliminary survey, two inspection points were chosen for data collection: the chassis line and the trim line. Defect data at the chassis line and trim line were collected in order to reduce defects, which ultimately improves productivity. Each of the 7QC tools has its benefits, as observed from the results. Flow charts were developed for a better understanding of the inspection points for data collection. Check sheets were developed to help collect defect data. Histograms represent the severity level of defects. Pareto charts show the cumulative effect of defects. Cause and effect diagrams were developed to find the root causes of each defect. Scatter diagrams showed whether defects were increasing or decreasing. P-control charts were developed to show out-of-control points beyond the limits for corrective actions. The successful implementation of the 7QC tools at the inspection points led to a considerable reduction in defect levels: in the chassis line, defects were reduced from 132 to 13, a 90% reduction; in the trim line, defects were reduced from 157 to 28, an 82% reduction.
As the automobile company exercised only a few of the 7QC tools, it is not fully reaping the benefits of their application. Therefore, it is suggested that the company establish a mechanism for applying the 7QC tools in every section.
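The reported reductions and the p-chart limits mentioned above can be checked with a short sketch. The percentage calculation uses the defect counts from the abstract. The p-chart part uses the usual three-sigma limits for a proportion with a constant sample size; the inspection counts, the sample size of 100, and the function names are assumptions for illustration, since the abstract does not give the underlying data.

```python
import math
from typing import List, Tuple


def percent_reduction(before: int, after: int) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100.0


def p_chart_limits(defectives: List[int], sample_size: int) -> Tuple[float, float, float]:
    """Return (lower limit, center line, upper limit) for a p-chart.

    Assumes every subgroup has the same `sample_size`.
    """
    p_bar = sum(defectives) / (len(defectives) * sample_size)
    sigma = math.sqrt(p_bar * (1.0 - p_bar) / sample_size)
    lower = max(0.0, p_bar - 3.0 * sigma)  # a proportion cannot go below zero
    upper = min(1.0, p_bar + 3.0 * sigma)
    return lower, p_bar, upper


chassis_drop = percent_reduction(132, 13)  # counts from the abstract
trim_drop = percent_reduction(157, 28)     # counts from the abstract

# Hypothetical defective-unit counts from four inspections of 100 units each.
lcl, center, ucl = p_chart_limits([4, 6, 5, 5], sample_size=100)
```

Rounded to whole percentages, the two reductions agree with the 90% and 82% figures reported for the chassis and trim lines.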

Keywords: check sheet, cause and effect diagram, control chart, histogram

Procedia PDF Downloads 320
3687 Statistical Analysis of the Impact of Maritime Transport Gross Domestic Product (GDP) on Nigeria’s Economy

Authors: Kehinde Peter Oyeduntan, Kayode Oshinubi

Abstract:

Nigeria is referred to as the ‘Giant of Africa’ due to its high population, land mass, and large economy. However, it still trails far behind many smaller economies on the continent in terms of maritime operations. Because the maritime industry is the spark plug for national growth, housing the most crucial infrastructure that generates wealth for a nation, it is worrisome that a nation with six seaports lags in maritime activities. In this research, we studied how the Gross Domestic Product (GDP) of maritime transport influences the Nigerian economy. To do this, we applied Simple Linear Regression (SLR), Support Vector Machine (SVM), Polynomial Regression Model (PRM), Generalized Additive Model (GAM), and Generalized Linear Mixed Model (GLMM) to model the relationship between the nation’s Total GDP (TGDP) and the Maritime Transport GDP (MGDP) using time series data covering 20 years. The results showed that the MGDP is statistically significant to the Nigerian economy. Among the statistical tools applied, the PRM of order 4 describes the relationship better than the other methods. The recommendations presented in this study will guide policy makers and help improve the economy of Nigeria in terms of its GDP.

Keywords: maritime transport, economy, GDP, regression, port

Procedia PDF Downloads 148
3686 A Statistical Approach to Air Pollution in Mexico City and It's Impacts on Well-Being

Authors: Ana B. Carrera-Aguilar, Rodrigo T. Sepulveda-Hirose, Diego A. Bernal-Gurrusquieta, Francisco A. Ramirez Casas

Abstract:

In recent years, Mexico City has presented high levels of atmospheric pollution; the city is also an example of inequality and poverty that impact metropolitan areas around the world. This combination of social and economic exclusion, coupled with high levels of pollution evidence the loss of well-being among the population. The effect of air pollution on quality of life is an area of study that has been overlooked. The purpose of this study is to find relations between air quality and quality of life in Mexico City through statistical analysis of a regression model and principal component analysis of several atmospheric contaminants (CO, NO₂, ozone, particulate matter, SO₂) and well-being indexes (HDI, poverty, inequality, life expectancy and health care index). The data correspond to official information (INEGI, SEDEMA, and CEPAL) for 2000-2018. Preliminary results show that the Human Development Index (HDI) is affected by the impacts of pollution, and its indicators are reduced in the presence of contaminants. It is necessary to promote a strong interest in this issue in Mexico City. Otherwise, the problem will not only remain but will worsen affecting those who have less and the population well-being in a generalized way.

Keywords: air quality, Mexico City, quality of life, statistics

Procedia PDF Downloads 136
3685 Isolation and Classification of Red Blood Cells in Anemic Microscopic Images

Authors: Jameela Ali Alkrimi, Abdul Rahim Ahmad, Azizah Suliman, Loay E. George

Abstract:

Red blood cells (RBCs) are among the most commonly and intensively studied types of blood cells in cell biology. A lack of RBCs is a condition characterized by a lower than normal hemoglobin level; this condition is referred to as 'anemia'. In this study, software was developed to isolate RBCs and to classify anemic RBCs in microscopic images using a machine learning approach. Several features of RBCs were extracted using image processing algorithms, including principal component analysis (PCA). With the proposed method, RBCs were isolated in 34 seconds from an image containing 18 to 27 cells. We also proposed that PCA could be performed to increase the speed and efficiency of classification. Our classifiers yielded accuracy rates of 100%, 99.99%, and 96.50% for the K-nearest neighbor (K-NN) algorithm, support vector machine (SVM), and artificial neural network (ANN), respectively. Classification was evaluated in terms of sensitivity, specificity, and kappa statistical parameters. In conclusion, the classification results were obtained in a shorter time and more efficiently when PCA was used.
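The evaluation measures named above (sensitivity, specificity, and the kappa statistic) can all be derived from a binary confusion matrix, as in the following sketch. The counts are made up and do not come from the study; the function name and the two-class framing are assumptions.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, accuracy, and Cohen's kappa."""
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    observed = (tp + tn) / total  # observed agreement (accuracy)
    # Agreement expected by chance, from the row and column totals.
    expected = (
        (tp + fn) * (tp + fp) + (fn + tn) * (fp + tn)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": observed,
        "kappa": kappa,
    }


# Hypothetical counts: 50 anemic and 50 normal cells.
metrics = binary_metrics(tp=45, fn=5, fp=3, tn=47)
```

Kappa discounts the agreement that would occur by chance, so it is lower than raw accuracy whenever chance agreement is above zero.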

Keywords: red blood cells, pre-processing image algorithms, classification algorithms, principal component analysis PCA, confusion matrix, kappa statistical parameters, ROC

Procedia PDF Downloads 400
3684 Local Spectrum Feature Extraction for Face Recognition

Authors: Muhammad Imran Ahmad, Ruzelita Ngadiran, Mohd Nazrin Md Isa, Nor Ashidi Mat Isa, Mohd ZaizuIlyas, Raja Abdullah Raja Ahmad, Said Amirul Anwar Ab Hamid, Muzammil Jusoh

Abstract:

This paper presents two techniques, local feature extraction using the image spectrum and low-frequency spectrum modelling using a Gaussian mixture model (GMM), to capture the underlying statistical information and improve the performance of a face recognition system. Local spectrum features are extracted using overlapping sub-block windows that are mapped onto the face image. For each block, the spatial domain is transformed to the frequency domain using the discrete Fourier transform (DFT). The low-frequency coefficients are preserved, and the high-frequency coefficients discarded, by applying a rectangular mask to the spectrum of the facial image. Low-frequency information is non-Gaussian in the feature space, and by combining several Gaussian functions with different statistical properties, the best feature representation can be modelled using a probability density function. The recognition process is performed using the maximum likelihood value computed from pre-calculated GMM components. The method was tested on the FERET data set and achieved a 92% recognition rate.

Keywords: local features modelling, face recognition system, Gaussian mixture models, Feret

Procedia PDF Downloads 657
3683 Analysis of Organizational Factors Effect on Performing Electronic Commerce Strategy: A Case Study of the Namakin Food Industry

Authors: Seyed Hamidreza Hejazi Dehghani, Neda Khounsari

Abstract:

The rapid growth of electronic commerce in developed countries means that developing nations must fundamentally change their commerce strategies. Most organizations are aware of the impact of the Internet and e-commerce on the future of their firm, and thus they have to focus on the organizational factors that affect the deployment of an e-commerce strategy. In this situation, it is essential to identify organizational factors such as organizational culture, human resources, size, structure, and product/service that impact an e-commerce strategy. Accordingly, this research specifies the effects of organizational factors on applying an e-commerce strategy in the Namakin food industry. The statistical population of this research is 95 managers and employees. Cochran's formula is used to determine the sample size, which is 77 members of the statistical population. SPSS and Smart PLS software were used to analyze the collected data. The results of hypothesis testing show that organizational factors have positive and significant effects on applying an e-commerce strategy. However, the sub-hypotheses on the effectiveness of organizational culture and size were rejected, while the other sub-hypotheses were accepted.
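The sample size quoted above is consistent with Cochran's formula with a finite population correction, as the sketch below shows. The abstract does not state the parameters used, so the 95% confidence level (z = 1.96), the proportion p = 0.5, and the 5% margin of error are assumptions, as is the function name.

```python
import math


def cochran_sample_size(
    population: int,
    z: float = 1.96,       # assumed: 95% confidence level
    p: float = 0.5,        # assumed: maximum-variance proportion
    margin: float = 0.05,  # assumed: 5% margin of error
) -> int:
    """Cochran's sample size with finite population correction, rounded up."""
    n0 = (z ** 2) * p * (1.0 - p) / (margin ** 2)    # infinite-population size
    corrected = n0 / (1.0 + (n0 - 1.0) / population)  # finite correction
    return math.ceil(corrected)


namakin_sample = cochran_sample_size(95)
```

Under these assumed parameters, a population of 95 gives a corrected value of about 76.3, which rounds up to the 77 respondents reported.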

Keywords: electronic commerce, organizational factors, attitude of managers, organizational readiness

Procedia PDF Downloads 276
3682 Data and Spatial Analysis for Economy and Education of 28 E.U. Member-States for 2014

Authors: Alexiou Dimitra, Fragkaki Maria

Abstract:

The objective of the paper is the study of geographic, economic and educational variables and their contribution to determine the position of each member-state among the EU-28 countries based on the values of seven variables as given by Eurostat. The Data Analysis methods of Multiple Factorial Correspondence Analysis (MFCA) Principal Component Analysis and Factor Analysis have been used. The cross tabulation tables of data consist of the values of seven variables for the 28 countries for 2014. The data are manipulated using the CHIC Analysis V 1.1 software package. The results of this program using MFCA and Ascending Hierarchical Classification are given in arithmetic and graphical form. For comparison reasons with the same data the Factor procedure of Statistical package IBM SPSS 20 has been used. The numerical and graphical results presented with tables and graphs, demonstrate the agreement between the two methods. The most important result is the study of the relation between the 28 countries and the position of each country in groups or clouds, which are formed according to the values of the corresponding variables.

Keywords: Multiple Factorial Correspondence Analysis, Principal Component Analysis, Factor Analysis, E.U.-28 countries, Statistical package IBM SPSS 20, CHIC Analysis V 1.1 Software, Eurostat.eu Statistics

Procedia PDF Downloads 506
3681 Exploring Data Leakage in EEG Based Brain-Computer Interfaces: Overfitting Challenges

Authors: Khalida Douibi, Rodrigo Balp, Solène Le Bars

Abstract:

In the medical field, applications related to human experiments are frequently linked to reduced sample sizes, which makes the training of machine learning models quite sensitive and therefore neither very robust nor generalizable. This is notably the case in Brain-Computer Interface (BCI) studies, where the sample size rarely exceeds 20 subjects or a small number of trials. To address this problem, several resampling approaches are often used during the data preparation phase, which is a critical step in a data science analysis process. One of the naive approaches usually applied by data scientists consists of transforming the entire database before the resampling phase. However, this can cause the model's performance to be incorrectly estimated when making predictions on unseen data. In this paper, we explored the effect of data leakage observed during our BCI experiments for device control through the real-time classification of SSVEPs (Steady State Visually Evoked Potentials). We also studied potential ways to ensure optimal validation of the classifiers during the calibration phase to avoid overfitting. The results show that the scaling step is crucial for some algorithms, and it should be applied after the resampling phase to avoid data leakage and improve results.

Keywords: data leakage, data science, machine learning, SSVEP, BCI, overfitting
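
The leakage described above can be made concrete with a short sketch. It is not the authors' pipeline; it only assumes one common reading of "scaling after resampling", namely that scaling statistics are estimated on the training split alone and then reused on held-out data. All function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(rows):
    """Estimate per-feature mean and standard deviation from training rows only."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 for constant features to avoid division by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return means, stds


def transform(rows, scaler):
    """Standardize rows with statistics that were fitted elsewhere."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def scale_without_leakage(train_rows, test_rows):
    """Fit on the training split, then apply the same scaler to both splits."""
    scaler = fit_scaler(train_rows)
    return transform(train_rows, scaler), transform(test_rows, scaler)


# Toy data (hypothetical): the held-out row never influences the scaler.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[10.0, 100.0]]
train_scaled, test_scaled = scale_without_leakage(train, test)
```

The naive alternative would call `fit_scaler(train + test)` before splitting, which lets held-out values shape the features seen in training and can make the estimated performance look better than it is.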

Procedia PDF Downloads 149
3680 The Factors Affecting the Operations of the Industrial Enterprises of Cassava in the Northeast of Thailand

Authors: Thanasuwit Thabhiranrak

Abstract:

This research studies the factors that affect the operations of cassava industrial enterprises in the northeast of Thailand. The hypotheses were tested by regression analysis, and Pearson correlation was used to determine the relationships between variables. The study covered those involved in cassava processing, including business owners, executives and supervisors. The research sample comprised 400 people in the northeast region of Thailand. The results revealed that the success of entrepreneurs was positively related to transformational leadership and knowledge management at a statistical significance level of 0.01, and respondents also emphasized the importance of transformational leadership factors. The individual and the use of intelligence affect the success of entrepreneurs in the cassava industry at a statistical significance level of 0.05. Qualitative data were also collected by interviewing operational-level staff, supervisors, executives, and enterprise owners in the northeast of Thailand. The results showed that knowledge management was important in their business operations. Personnel in the organizations should learn from working experience, develop their skills, and increase their knowledge through education.

Keywords: transformational leadership, knowledge management (KM), cassava, northeast of Thailand, industrial

Procedia PDF Downloads 299
3679 Performance Analysis of Traffic Classification with Machine Learning

Authors: Htay Htay Yi, Zin May Aye

Abstract:

Network security plays a key role in the ICT environment because the number of malicious users is continually growing in education, business, and other ICT-related fields. Network security violations are typically described and examined centrally through a security event management system. Firewalls, Intrusion Detection Systems (IDS), and Intrusion Prevention Systems are becoming essential to monitor or prevent potential violations, attack incidents, and imminent threats. In this system, firewall rules are set only where the system policies require them. The datasets used in this system are derived from the testbed environment. DoS and PortScan traffic is applied in the testbed, which implements a firewall and an IDS. The network traffic is classified as normal or attack in the existing testbed environment using six machine learning classification methods. The testbed must be exercised to obtain the datasets, which are then applied to DoS and PortScan detection. The dataset is based on CICIDS2017, with some features added. The system tested 26 features from the applied dataset. The system aims to reduce false positive rates and improve accuracy in the implemented testbed design. It also shows good performance when important features are selected and when compared against an existing dataset using machine learning classifiers.
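
The evaluation criteria named above (accuracy and false positive rate, with false negative rate listed in the keywords) can be computed from predicted and true labels as in the following minimal sketch. The label strings and the function name are assumptions made for illustration; the abstract does not describe the authors' evaluation code or report these values.

```python
def detection_metrics(y_true, y_pred, positive="attack"):
    """Accuracy, false positive rate and false negative rate for binary labels.

    The label value "attack" is a hypothetical choice for the positive class.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # normal traffic flagged as an attack
        elif truth == positive:
            fn += 1  # attack traffic missed by the classifier
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Hypothetical labels, for illustration only.
y_true = ["normal", "normal", "attack", "attack", "normal"]
y_pred = ["normal", "attack", "attack", "normal", "normal"]
metrics = detection_metrics(y_true, y_pred)
```

Reporting the false positive rate alongside accuracy matters for intrusion detection because normal traffic usually dominates, so a classifier can score high accuracy while still raising many false alarms.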

Keywords: false negative rate, intrusion detection system, machine learning methods, performance

Procedia PDF Downloads 115
3678 Performance of the CMIP5 Models in Simulation of the Present and Future Precipitation over the Lake Victoria Basin

Authors: M. A. Wanzala, L. A. Ogallo, F. J. Opijah, J. N. Mutemi

Abstract:

The usefulness and limitations of climate information are due to the uncertainty inherent in the climate system. For any given region to achieve sustainable development, it is important to incorporate climate information into its socio-economic strategic plans. The overall objective of the study was to assess the performance of the Coupled Model Inter-comparison Project Phase 5 (CMIP5) models over the Lake Victoria Basin. The datasets used included observed point station data, gridded rainfall data from the Climate Research Unit (CRU) and hindcast data from eight CMIP5 models. The methodology included trend analysis, spatial analysis, correlation analysis, Principal Component Analysis (PCA), regression analysis, and categorical statistical skill scores. Analysis of the trends in the observed rainfall records indicated an increase in rainfall variability in both space and time for all the seasons. The spatial patterns of the output from the MPI, MIROC, EC-EARTH and CNRM models were closest to the observed rainfall patterns.

Keywords: categorical statistics, coupled model inter-comparison project, principal component analysis, statistical downscaling

Procedia PDF Downloads 364
3677 Effects of Knowledge on Fruit Diets by Integrating Posters and Actual-Sized Fruit Models in Health Education for Elderly Patients with Type 2 Diabetes Mellitus

Authors: Suchada Wongsawat

Abstract:

The objectives of this quasi-experiment were: 1) to compare the pretest and posttest scores of the experimental group, who were given health education on “Fruit Diets for Elderly Patients with Type 2 Diabetes Mellitus”; and 2) to compare the posttest scores between the experimental group and the control group. The sample of this study consisted of elderly patients with type 2 diabetes mellitus at Tambon Kanai Health Promoting Hospital, Thailand. The patients were randomly assigned to the experimental and control groups, with 30 patients in each group. Statistics used in the data analysis included frequency, percentage, average, standard deviation, paired t-test and independent t-test. The study revealed that the patients in the experimental group had significantly higher posttest scores than pretest scores in the health education at the .05 statistical level. The posttest scores of the experimental group were also significantly higher than those of the control group at the .05 statistical level.

Keywords: fruit, health education, elderly, diabetes

Procedia PDF Downloads 276
3676 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes and variations in stroke characteristics. Malayalam is a complex South Indian language spoken by about 35 million people, especially in Kerala and the Lakshadweep islands. In this paper, we consider significant feature extraction for the similar stroke styles of Malayalam. The extracted feature set is also suitable for the recognition of other handwritten South Indian languages such as Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy of classification and recognition of online Malayalam handwritten characters. SVM classifiers are well suited to real-world applications. The contribution of various features to the recognition accuracy is analysed. The performance of different SVM kernels is also studied. A graphical user interface has been developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after preprocessing of the input data samples. The highest recognition accuracy, 97%, is obtained experimentally with the best feature combination and a polynomial kernel in the SVM.

Keywords: SVM, MATLAB, Malayalam, South Indian scripts, online handwritten character recognition
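
For readers unfamiliar with the polynomial kernel mentioned above, the following minimal sketch shows its usual form, K(x, y) = (gamma * <x, y> + coef0) ** degree. The degree, gamma and coef0 values are assumptions chosen for illustration; the abstract does not report the kernel parameters or the feature vectors used.

```python
def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel between two equal-length feature vectors.

    degree, gamma and coef0 are hypothetical defaults, not values from the paper.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


# Toy stroke-feature vectors (hypothetical values).
a = [1.0, 2.0]
b = [3.0, 4.0]
similarity = polynomial_kernel(a, b, degree=2)
```

An SVM evaluates such a kernel between a new sample and its support vectors, which lets it separate classes with a curved decision boundary without building the polynomial feature expansion explicitly.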

Procedia PDF Downloads 568
3675 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as densely encoded representations of natural language in text form. However, it is unknown how much linguistic information is encoded, and how. In this paper, we construct several probing tasks for multiple kinds of linguistic information to clarify the encoding capabilities of different language models, and we visualize the results. We first obtain word representations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small number of parameters and unsupervised tasks are then applied to these word vectors to assess their capability to encode the corresponding linguistic information. The constructed probing tasks cover both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the syntactic aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare the encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 104
3674 Comprehensive Review of Adversarial Machine Learning in PDF Malware

Authors: Preston Nabors, Nasseh Tabrizi

Abstract:

Portable Document Format (PDF) files have gained significant popularity for sharing and distributing documents due to their universal compatibility. However, the widespread use of PDF files has made them attractive targets for cybercriminals, who exploit vulnerabilities to deliver malware and compromise the security of end-user systems. This paper reviews notable contributions in PDF malware detection, including static, dynamic, signature-based, and hybrid analysis. It presents a comprehensive examination of PDF malware detection techniques, focusing on the emerging threat of adversarial sampling and the need for robust defense mechanisms. The paper highlights the vulnerability of machine learning classifiers to evasion attacks. It explores adversarial sampling techniques in PDF malware detection to produce mimicry and reverse mimicry evasion attacks, which aim to bypass detection systems. Improvements for future research are identified, including accessible methods, applying adversarial sampling techniques to malicious payloads, evaluating other models, evaluating the importance of features to malware, implementing adversarial defense techniques, and conducting comprehensive examination across various scenarios. By addressing these opportunities, researchers can enhance PDF malware detection and develop more resilient defense mechanisms against adversarial attacks.

Keywords: adversarial attacks, adversarial defense, adversarial machine learning, intrusion detection, PDF malware, malware detection, malware detection evasion

Procedia PDF Downloads 35
3673 The Relationship Between Inspirational Leadership Style and Perceived Social Capital by Mediation of the Development of Organizational Knowledge Resources

Authors: Farhad Shafiepour Motlagh, Narges Salehi

Abstract:

The aim of the present study was to investigate the relationship between inspirational leadership style and perceived social capital through the mediation of organizational knowledge resource development. The research method was descriptive-correlational. The statistical population consisted of all 3537 secondary school teachers in Isfahan. A sample of 338 people was determined using Cochran's formula and selected by multi-stage random sampling. The research instruments included a researcher-made inspirational leadership style questionnaire, a perceived social capital questionnaire (Putnam, 1999), and a researcher-made questionnaire of perceived organizational knowledge resources. Kolmogorov statistical tests, Pearson correlation, stepwise multiple regression, and structural equation modeling were used to analyze the data. In general, the results showed that there is a significant relationship between inspirational leadership style and the use of perceived social capital at the level of P < 0.05. Also, the development of organizational knowledge resources mediates the relationship between inspirational leadership style and the use of perceived social capital at the level of P < 0.05.

Keywords: inspirational leadership style, perceived social capital, perceived organizational knowledge

Procedia PDF Downloads 200
3672 Biosorption of Heavy Metals by Low Cost Adsorbents

Authors: Azam Tabatabaee, Fereshteh Dastgoshadeh, Akram Tabatabaee

Abstract:

This paper describes the use of by-products as adsorbents for removing heavy metals from aqueous effluent solutions. Almond skin, walnut shell, sawdust, rice bran and egg shell were evaluated as metal ion adsorbents in aqueous solutions. A comparative study was also carried out with commercial adsorbents such as ion exchange resins and activated carbon. Batch experiments were conducted to determine the affinity of all the biomasses for Cd(II), Cr(III), Ni(II), and Pb(II) metal ions at pH 5. The rate of metal ion removal from the synthetic wastewater by the biomass was evaluated by measuring the final concentration of the synthetic wastewater. At a metal ion concentration of 50 mg/L, egg shell adsorbed high levels (98.6 – 99.7%) of Pb(II) and Cr(III), and walnut shell adsorbed high levels (35.3 – 65.4%) of Ni(II) and Cd(II). This study shows that by-products were excellent adsorbents for the removal of toxic ions from wastewater, with efficiency comparable to commercially available adsorbents but at a reduced cost. In addition, statistical comparison of the adsorption of the various elements using the independent-samples t-test and one-way ANOVA showed that, for some elements, there is no significant difference in adsorption percentage between by-products and commercial adsorbents.

Keywords: adsorbents, heavy metals, commercial adsorbents, wastewater, by-products

Procedia PDF Downloads 405
3671 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution

Authors: Haiyan Wu, Ying Liu, Shaoyun Shi

Abstract:

Authorship attribution aims to identify the authors of anonymous documents from extracted features. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length) and content features (e.g., frequent words, n-grams). Modeling these features by regression or other transparent machine learning methods gives a portrait of the authors' writing style. However, these methods do not capture syntactic (e.g., dependency relationships) or semantic (e.g., topics) information. In recent years, some researchers have modeled syntactic trees or latent semantic information with neural networks. However, few works combine them. Besides, predictions by neural networks are difficult to explain, and explainability is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Unlike an end-to-end neural model, our method treats feature selection and prediction as two separate steps. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give the prediction and an understandable representation of writing style. Experiments show that our extracted features can improve on state-of-the-art methods on three benchmark datasets.

Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction

Procedia PDF Downloads 132