Search results for: dataset quality
10659 Improved Classification Procedure for Imbalanced and Overlapped Situations
Authors: Hankyu Lee, Seoung Bum Kim
Abstract:
Class imbalance and class overlap are important issues in many data mining applications. An imbalanced dataset is a special case in classification problems in which the number of observations in one class (the major class) heavily exceeds the number of observations in the other class (the minor class). An overlapped dataset is one in which many observations are shared between the two classes. Imbalanced and overlapped data are frequently found in real applications, including patient fraud and abuse in healthcare, quality prediction in manufacturing, text classification, oil spill detection, and remote sensing. The class imbalance and overlap problem is challenging because it degrades the performance of most standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting the data space into three parts (nonoverlapping, lightly overlapping, and severely overlapping) and applying a classification algorithm in each part. The three parts are determined based on the Hausdorff distance and the margin of a modified support vector machine. An experimental study was conducted to examine the properties of the proposed method and to compare it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through an experiment with real data.
Keywords: classification, imbalanced data with class overlap, split data space, support vector machine
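The abstract above names the Hausdorff distance but does not say how it enters the partitioning, so the following is only a minimal sketch of the standard definition for two finite point sets. The function names and the sample points are hypothetical and are not taken from the paper.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical 2-D observations for a major and a minor class.
major = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
minor = [(3.0, 0.0), (4.0, 0.0)]
distance = hausdorff(major, minor)
```

A small distance between the two class point sets suggests that they lie close together, which is one plausible way to flag overlap; the criterion the authors actually use is not given in the abstract.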
Procedia PDF Downloads 308
10658 The Quality Health Services and Patient Satisfaction in Hospital
Authors: Nadia Fatima Zahra Malki
Abstract:
Quality is one of the most important modern management concepts that organizations seek to achieve in all areas and sectors, both to meet the needs and desires of customers and to ensure their own survival and continuity, since it constitutes a competitive advantage for the organization. Health organizations are among the most prominent organizations in which quality must be present, because they deal with the most valuable component of production: the person and his or her health. Any error threatens the patient's life and may lead to death, so health organizations must provide high-quality health services to achieve the highest degree of patient satisfaction. This research studies the quality of health services and the extent of its impact on patient satisfaction through an applied study that measured the level of quality of health services in the university hospital center of Algeria and its impact on patient satisfaction according to the dimensions of health service quality. We conclude that the determinants of health service quality affect patient satisfaction, which necessitates developing health services according to patients' requirements and improving their quality to obtain patient satisfaction.
Keywords: health service, health quality, quality determinants, patient satisfaction
Procedia PDF Downloads 61
10657 Comparison of Different Machine Learning Algorithms for Solubility Prediction
Authors: Muhammet Baldan, Emel Timuçin
Abstract:
Molecular solubility prediction plays a crucial role in fields such as drug discovery, environmental science, and materials science. In this study, we compare the performance of five machine learning algorithms for predicting molecular solubility on the AqSolDB dataset: linear regression, support vector machines (SVM), random forests, gradient boosting machines (GBM), and neural networks. The dataset consists of 9981 data points with their corresponding solubility values. MACCS keys (166 bits), RDKit properties (20 properties), and structural properties (3) are extracted from the SMILES representation of every molecule in the dataset, giving a total of 189 features per molecule for training and testing. Each algorithm is trained on a subset of the dataset and evaluated using accuracy scores. Additionally, the computational time for training and testing is recorded to assess the efficiency of each algorithm. Our results demonstrate that the random forest model outperformed the other algorithms in terms of predictive accuracy, achieving an accuracy score of 0.93. Gradient boosting machines and neural networks also exhibit strong performance, closely followed by support vector machines. Linear regression, while simpler in nature, demonstrates competitive performance but with slightly higher errors than the ensemble methods. Overall, this study provides valuable insights into the performance of machine learning algorithms for molecular solubility prediction, highlighting the importance of algorithm selection in achieving accurate and efficient predictions in practical applications.
Keywords: random forest, machine learning, comparison, feature extraction
Procedia PDF Downloads 40
10656 Implementation of Total Quality Management in Public Sector: Case of Tunisia
Authors: Rafla Hchaichi
Abstract:
Public administration is currently experiencing unprecedented activity in the field of quality. In an increasingly competitive globalized world, public services are confronted with the need to improve their performance, which pushes public companies to implement quality approaches. Quality approaches have taken diverse forms, such as service commitments, labels, certifications, and the Common Assessment Framework. This paper provides an overview of the strategy for administrative development in Tunisia from the Carthaginian civilization until today. It outlines the evolution of quality management in the Tunisian public context while focusing on the National Referential of Quality of Administrative Services.
Keywords: quality approach, the common assessment framework, service commitment, label, certification, quality of public service, performance of public service, Tunisian Public Service
Procedia PDF Downloads 554
10655 Comparing Two Unmanned Aerial Systems in Determining Elevation at the Field Scale
Authors: Brock Buckingham, Zhe Lin, Wenxuan Guo
Abstract:
Accurate elevation data is critical in deriving topographic attributes for the precision management of crop inputs, especially water and nutrients. Traditional ground-based elevation data acquisition is time consuming, labor intensive, and often inconvenient at the field scale. Various unmanned aerial systems (UAS) provide the capability of generating digital elevation data from high-resolution images. The objective of this study was to compare the performance of two UAS with different global positioning system (GPS) receivers in determining elevation at the field scale. A DJI Phantom 4 Pro and a DJI Phantom 4 RTK (real-time kinematic) were used to acquire images at three heights: 40 m, 80 m, and 120 m above ground. Forty ground control panels were placed in the field, and their geographic coordinates were determined using an RTK GPS survey unit. For each image acquisition using a UAS at a particular height, two elevation datasets were generated using the Pix4D stitching software: a calibrated dataset using the surveyed coordinates of the ground control panels and an uncalibrated dataset without them. Elevation values for each panel derived from the elevation model of each dataset were compared to the corresponding surveyed coordinates of the ground control panels. The coefficient of determination (R²) and the root mean squared error (RMSE) were used as evaluation metrics to assess the performance of each image acquisition scenario. RMSE values for the uncalibrated elevation dataset were 26.613 m, 31.141 m, and 25.135 m for images acquired at 120 m, 80 m, and 40 m, respectively, using the Phantom 4 Pro UAS. With calibration for the same UAS, the accuracies were significantly improved, with RMSE values of 0.161 m, 0.165 m, and 0.030 m, respectively. The best results showed an RMSE of 0.032 m and an R² of 0.998 for the calibrated dataset generated using the Phantom 4 RTK UAS at 40 m height. The accuracy of elevation determination decreased as the flight height increased for both UAS, with RMSE values greater than 0.160 m for the datasets acquired at 80 m and 120 m. The results of this study show that calibration with ground control panels improves the accuracy of elevation determination, especially for the UAS with a regular GPS receiver. With a substantial number of surveyed ground control panels, the Phantom 4 Pro provides accurate elevation data for the 40 m dataset. The Phantom 4 RTK UAS provides accurate elevation at 40 m without calibration for practical precision agriculture applications. This study provides valuable information on selecting appropriate UAS and flight heights in determining elevation for precision agriculture applications.
Keywords: unmanned aerial system, elevation, precision agriculture, real-time kinematic (RTK)
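The two evaluation metrics named in the abstract above can be sketched as follows. This is a minimal illustration that assumes R² is computed as 1 - SS_res/SS_tot (the abstract does not say whether this form or the squared correlation was used); the elevation values are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between surveyed and model elevations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical panel elevations in metres (surveyed vs. derived from the model).
surveyed = [100.0, 101.0, 102.0, 103.0]
modelled = [100.1, 100.9, 102.1, 102.9]

panel_rmse = rmse(surveyed, modelled)
panel_r2 = r_squared(surveyed, modelled)
```

With these made-up values every panel is off by 0.1 m, so the RMSE is 0.1 m and R² is about 0.992.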
Procedia PDF Downloads 164
10654 Camera Model Identification for Mi Pad 4, Oppo A37f, Samsung M20, and Oppo f9
Authors: Ulrich Wake, Eniman Syamsuddin
Abstract:
The camera model identification model is trained using the pretrained ResNet34 and ResNet50 models. The dataset consists of 500 photos from each phone and is divided into 1280 photos for training, 320 photos for validation, and 400 photos for testing. The model is trained using the One Cycle Policy method and tested using Test-Time Augmentation. Furthermore, the model is trained for 50 epochs using regularization such as dropout and early stopping. The result is 90% accuracy on the validation set and above 85% with Test-Time Augmentation using ResNet50. Every model is also trained by slightly updating the pretrained model's weights.
Keywords: One Cycle Policy, ResNet34, ResNet50, Test-Time Augmentation
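The split sizes in the abstract above are consistent with four phones at 500 photos each (2000 photos in total, split 1280/320/400). The sketch below reproduces that arithmetic with a per-phone split of 320/80/100; that the split was stratified by phone is an assumption, and the function name, seed, and photo identifiers are hypothetical.

```python
import random

PHONES = ["Mi Pad 4", "Oppo A37f", "Samsung M20", "Oppo f9"]
PHOTOS_PER_PHONE = 500
# Assumed per-phone counts; 4 phones x (320 + 80 + 100) = 1280 + 320 + 400.
PER_PHONE_SPLIT = {"train": 320, "valid": 80, "test": 100}


def split_dataset(seed=0):
    """Shuffle each phone's photos and cut them into train/valid/test parts."""
    rng = random.Random(seed)
    splits = {name: [] for name in PER_PHONE_SPLIT}
    for phone in PHONES:
        photo_ids = [(phone, index) for index in range(PHOTOS_PER_PHONE)]
        rng.shuffle(photo_ids)
        start = 0
        for name, count in PER_PHONE_SPLIT.items():
            splits[name].extend(photo_ids[start:start + count])
            start += count
    return splits


splits = split_dataset()
sizes = {name: len(items) for name, items in splits.items()}
```

Splitting per phone keeps every camera model equally represented in each subset, which avoids a validation or test set dominated by one device.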
Procedia PDF Downloads 208
10653 Evaluation of the Surface Water Quality Using the Water Quality Index and Discriminant Analysis Method
Authors: Lazhar Belkhiri, Ammar Tiri, Lotfi Mouni
Abstract:
Protecting and managing water quality is a very important problem worldwide, given the complexity of water quality data sets. In this study, the water quality index (WQI) and the irrigation water quality index (IWQI) were calculated from nine hydrochemical parameters in order to evaluate surface water quality for drinking and irrigation purposes. Discriminant analysis (DA) was applied to identify the variables most responsible for the spatial differentiation. The results show that, based on the WQI values, the surface water is of poor to very poor quality for drinking; however, the IWQI values indicate that this water is acceptable for irrigation, with a restriction for sensitive plants. The DA showed that pH, potassium, chloride, sulfate, and bicarbonate discriminate significantly between the different stations, reflecting the spatial variation of surface water quality. The results obtained in this study therefore provide very useful information to decision-makers.
Keywords: surface water quality, drinking and irrigation purposes, water quality index, discriminant analysis
Procedia PDF Downloads 86
10652 An Efficient Motion Recognition System Based on LMA Technique and a Discrete Hidden Markov Model
Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier
Abstract:
Interest in human motion recognition has increased considerably in recent years due to its importance in a wide range of applications, such as human-computer interaction, intelligent surveillance, augmented reality, and content-based video compression and retrieval. However, it is still regarded as a challenging task, especially in realistic scenarios. It can be seen as a general machine learning problem that requires an effective human motion representation and an efficient learning method. In this work, we introduce a descriptor based on the Laban Movement Analysis (LMA) technique, a formal and universal language for human movement, to capture both quantitative and qualitative aspects of movement. We use a Discrete Hidden Markov Model (DHMM) for training and classifying motions. We improve the classification algorithm by proposing two DHMMs for each motion class to process the motion sequence in two different directions, forward and backward. This modification avoids the misclassification that can happen when recognizing similar motions. Two experiments are conducted. In the first one, we evaluate our method on a public dataset, the Microsoft Research Cambridge-12 Kinect gesture data set (MSRC-12), which is widely used for evaluating action/gesture recognition methods. In the second experiment, we build a dataset composed of 10 gestures (introduce yourself, waving, dance, move, turn left, turn right, stop, sit down, increase velocity, decrease velocity) performed by 20 persons. The evaluation of the system includes testing the efficiency of our LMA-based descriptor vector with the basic DHMM method and comparing the recognition results of the modified DHMM with those of the original one. Experimental results demonstrate that our method outperforms most existing methods that used the MSRC-12 dataset and achieves a near-perfect classification rate on our dataset.
Keywords: human motion recognition, motion representation, Laban Movement Analysis, Discrete Hidden Markov Model
Procedia PDF Downloads 207
10651 Evaluating 8D Reports Using Text-Mining
Authors: Benjamin Kuester, Bjoern Eilert, Malte Stonis, Ludger Overmeyer
Abstract:
Increasing quality requirements make reliable and effective quality management indispensable. This includes complaint handling, in which the 8D method is widely used. The 8D report, as the written documentation of the 8D method, is one of the key quality documents, as it internally secures the quality standards and acts as a communication medium to the customer. In practice, however, the 8D report is mostly faulty and of poor quality. There is no quality control of 8D reports today. This paper describes the use of natural language processing for the automated evaluation of 8D reports. Based on semantic analysis and text-mining algorithms, the presented system is able to uncover content-related and formal quality deficiencies and thus increases the quality of complaint processing in the long term.
Keywords: 8D report, complaint management, evaluation system, text-mining
Procedia PDF Downloads 315
10650 Evaluating the Quality of Private University Websites in Malaysia
Authors: Rubijesmin Abdul Latif
Abstract:
This paper evaluates the quality components of university websites in Malaysia, especially those of private universities. Websites that prioritize quality are expected to serve their intended users satisfactorily. Quality components were identified from a compiled analysis of other studies and tested among 30 randomly selected respondents. Four Malaysian private university websites were compared, and the main outcome was a better understanding of what users want from a quality university website.
Keywords: website evaluation, criteria, quality, usability, user experience, university website
Procedia PDF Downloads 371
10649 The Advancements of Transformer Models in Part-of-Speech Tagging System for Low-Resource Tigrinya Language
Authors: Shamm Kidane, Ibrahim Abdella, Fitsum Gaim, Simon Mulugeta, Sirak Asmerom, Natnael Ambasager, Yoel Ghebrihiwot
Abstract:
The need for natural language processing (NLP) systems for low-resource languages has become more apparent than ever in the past few years, although preparing such systems remains arduous. This paper presents an improved version of the Nagaoka Tigrinya Corpus for a part-of-speech (POS) classification system for the Tigrinya language. The initial Nagaoka dataset was extended, bringing the new tagged corpus to 118K tokens annotated with the 12 basic POS tags used previously. The additional content was annotated manually in a stringent manner, following rules similar to those of the former dataset, and was formatted in CoNLL format. The system uses the monolingually pre-trained TiELECTRA, TiBERT, and TiRoBERTa transformer models. The highest score achieved is a weighted F1-score of 94.2%, which surpasses previous systems by a significant margin. The system will prove useful for progress in NLP-related tasks for Tigrinya and similar low-resource languages, with room for cross-referencing higher-resource languages.
Keywords: Tigrinya POS corpus, TiBERT, TiRoBERTa, conditional random fields
Procedia PDF Downloads 103
10648 Application of Machine Learning Techniques in Forest Cover-Type Prediction
Authors: Saba Ebrahimi, Hedieh Ashrafi
Abstract:
Predicting the cover type of forests is a challenge for natural resource managers. In this project, we aim to perform a comprehensive comparative study of two well-known classification methods, support vector machine (SVM) and decision tree (DT). The comparison is first performed among different types of each classifier, and then the best of each classifier will be compared by considering different evaluation metrics. The effect of boosting and bagging for decision trees is also explored. Furthermore, the effect of principal component analysis (PCA) and feature selection is also investigated. During the project, the forest cover-type dataset from the remote sensing and GIS program is used in all computations.
Keywords: classification methods, support vector machine, decision tree, forest cover-type dataset
Procedia PDF Downloads 217
10647 A Study of Agile Based Approaches to Improve Software Quality
Authors: Gurmeet Kaur
Abstract:
Agile software development methods are recognized as a popular and efficient approach to developing software systems that have a short delivery period, high quality, and zero defects while meeting customer requirements. In agile software development, quality means quality of code, which is maintained through approaches such as refactoring, test-driven development, behavior-driven development, acceptance test-driven development, and demand-driven development. Software quality is measured in terms of metrics such as the number of defects found during software development. Using the above-mentioned approaches reduces the likelihood of defects in the developed software and hence improves quality. This paper studies agile-based quality approaches for software development that ensure improved software quality as well as reduced cost and customer satisfaction.
Procedia PDF Downloads 172
10646 Using Geospatial Analysis to Reconstruct the Thunderstorm Climatology for the Washington DC Metropolitan Region
Authors: Mace Bentley, Zhuojun Duan, Tobias Gerken, Dudley Bonsal, Henry Way, Endre Szakal, Mia Pham, Hunter Donaldson, Chelsea Lang, Hayden Abbott, Leah Wilcynzski
Abstract:
Air pollution has the potential to modify the lifespan and intensity of thunderstorms and the properties of lightning. Using data mining and geovisualization, we investigate how background climate and weather conditions shape variability in urban air pollution and how this, in turn, shapes thunderstorms as measured by the intensity, distribution, and frequency of cloud-to-ground lightning. A spatiotemporal analysis was conducted in order to identify thunderstorms using high-resolution lightning detection network data. Over seven million lightning flashes were used to identify more than 196,000 thunderstorms that occurred between 2006 and 2020 in the Washington, DC Metropolitan Region. Each lightning flash in the dataset was grouped into thunderstorm events by means of a temporal and spatial clustering algorithm. Once the thunderstorm event database was constructed, hourly wind direction, wind speed, and atmospheric thermodynamic data were added to the initiation and dissipation times and locations for the 196,000 identified thunderstorms. Hourly aerosol and air quality data for the thunderstorm initiation times and locations were also incorporated into the dataset. Developing thunderstorm climatologies using a lightning tracking algorithm and lightning detection network data was found to be useful for visualizing the spatial and temporal distribution of urban-augmented thunderstorms in the region.
Keywords: lightning, urbanization, thunderstorms, climatology
Procedia PDF Downloads 75
10645 Deep Supervision Based-Unet to Detect Buildings Changes from VHR Aerial Imagery
Authors: Shimaa Holail, Tamer Saleh, Xiongwu Xiao
Abstract:
Building change detection (BCD) from satellite imagery is an essential topic in urbanization monitoring, agricultural land management, and updating geospatial databases. Recently, deep learning methods for detecting changes have made significant progress and achieved impressive results. However, these methods are insensitive to changes in buildings with complex spectral differences, and the extracted features are not discriminative enough, resulting in incomplete buildings and irregular boundaries. To overcome these problems, we propose a dual Siamese network based on the Unet model with the addition of a deep supervision (DS) strategy. This network consists of a backbone (encoder) based on ImageNet pre-training, a fusion block, and feature pyramid networks (FPN) to enhance the step-by-step information of the changing regions and obtain a more accurate BCD map. To train the proposed method, we created a new dataset (EGY-BCD) of high-resolution, multi-temporal aerial images captured over New Cairo in Egypt. The experimental results showed that the proposed method is effective and performs well on the EGY-BCD dataset in terms of overall accuracy, F1-score, and mIoU, which were 91.6%, 80.1%, and 73.5%, respectively.
Keywords: building change detection, deep supervision, semantic segmentation, EGY-BCD dataset
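The three metrics reported in the abstract above (overall accuracy, F1-score, and mIoU) can be derived from a pixel-level confusion matrix. The sketch below assumes a binary change/no-change map with mIoU averaged over those two classes; the pixel counts are hypothetical and are not results from the paper.

```python
def change_detection_metrics(tp, fp, fn, tn):
    """Overall accuracy, F1-score, and mean IoU for a binary change map.

    tp/fp/fn/tn are pixel counts with "change" as the positive class.
    Assumes at least one predicted and one actual change pixel.
    """
    total = tp + fp + fn + tn
    overall_accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou_change = tp / (tp + fp + fn)
    iou_no_change = tn / (tn + fp + fn)
    mean_iou = (iou_change + iou_no_change) / 2
    return overall_accuracy, f1, mean_iou


# Hypothetical pixel counts for illustration only.
accuracy, f1, mean_iou = change_detection_metrics(tp=80, fp=20, fn=20, tn=880)
```

Because unchanged pixels usually dominate, overall accuracy tends to be much higher than F1 or mIoU, which is the same ordering seen in the reported figures.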
Procedia PDF Downloads 120
10644 Relationship between Quality Education and Organizational Culture at College Level in Punjab
Authors: Anam Noshaba, Mahr Muhammad Saeed Akhtar
Abstract:
The aim of this study was to find out the relationship between quality education and organizational culture. The population of this study was all the teachers of Public Degree Colleges located in Punjab. A sample of 400 teachers was selected by using a simple random sampling technique. The Quality Education Assessment Questionnaire (QEAQ) and the Organizational Culture Assessment Instrument (OCAI) were used for data collection. Of the sampled teachers, 90% responded. Findings showed that quality education and organizational culture are positively correlated. Results indicated that there is no difference in quality education and organizational culture by demographic variables of teachers. Future research is needed to study the viewpoint of other stakeholders of education regarding quality education and organizational culture.
Keywords: quality education, minimum quality standards, organizational culture, college level
Procedia PDF Downloads 139
10643 Software Quality Assurance in Component Based Software Development – a Survey Analysis
Authors: Abeer Toheed Quadri, Maria Abubakar, Mehreen Sirshar
Abstract:
Component Based Software Development (CBSD) is a new trend in software development. Selecting quality components is not enough to ensure software quality in a Component Based Software System (CBSS). A software product is considered a quality product if it satisfies its customer's needs and has minimum defects. The authors survey different research papers and analyze various techniques that ensure software quality in component-based software development. This paper investigates how to improve the quality of a component-based software system without affecting its quality attributes. The reported information is drawn from a literature survey. The development of component-based systems is on the rise, as they reduce development time, effort, and cost by means of reuse. The analysis indicates that, to achieve quality in a CBSS, the components must be certified through software measurement, because the predictability of the system's software quality attributes depends on the quality attributes of the constituent components, the integration process, and the framework used.
Keywords: CBSD (component based software development), CBSS (component based software system), quality components, SQA (software quality assurance)
Procedia PDF Downloads 413
10642 Content-Aware Image Augmentation for Medical Imaging Applications
Authors: Filip Rusak, Yulia Arzhaeva, Dadong Wang
Abstract:
Machine learning based Computer-Aided Diagnosis (CAD) is gaining much popularity in medical imaging and diagnostic radiology. However, it requires a large number of high-quality, labeled training images. The training images may come from different sources and be acquired from different radiography machines produced by different manufacturers, as digital images or digitized copies of film radiographs, with various sizes as well as different pixel intensity distributions. In this paper, a content-aware image augmentation method is presented to deal with these variations. The results of the proposed method have been validated graphically by plotting the removed and added seams of pixels on the original images. Two different chest X-ray (CXR) datasets are used in the experiments. The CXRs in the datasets differ in size; some are digital CXRs, while the others are digitized from analog CXR films. With the proposed content-aware augmentation method, the Seam Carving algorithm is employed to resize the CXRs and the corresponding labels in the form of image masks, followed by histogram matching to normalize the pixel intensities of the digital radiographs based on the pixel intensity values of the digitized radiographs. We implemented the algorithms, resized the well-known Montgomery dataset to the size of the most frequently used Japanese Society of Radiological Technology (JSRT) dataset, and normalized our digital CXRs for testing. This work resulted in a unified off-the-shelf CXR dataset composed of the radiographs included in both the Montgomery and JSRT datasets. The experimental results show that even though the amount of augmentation is large, our algorithm can adequately preserve the important information in the lung fields, the local structures, and the global visual effect. The proposed method can be used to augment training and testing image data sets so that the trained machine learning model can process CXRs from various sources, and it can potentially be used broadly in other medical imaging applications.
Keywords: computer-aided diagnosis, image augmentation, lung segmentation, medical imaging, seam carving
Procedia PDF Downloads 222
10641 Nighttime Dehaze - Enhancement
Authors: Harshan Baskar, Anirudh S. Chakravarthy, Prateek Garg, Divyam Goel, Abhijith S. Raj, Kshitij Kumar, Lakshya, Ravichandra Parvatham, V. Sushant, Bijay Kumar Rout
Abstract:
In this paper, we introduce a new computer vision task called nighttime dehaze-enhancement. This task aims to jointly perform dehazing and lightness enhancement. Our task fundamentally differs from nighttime dehazing: our goal is to jointly dehaze and enhance scenes, while nighttime dehazing aims to dehaze scenes under a nighttime setting. In order to facilitate further research on this task, we release a new benchmark dataset called Reside-β Night dataset, consisting of 4122 nighttime hazed images from 2061 scenes and 2061 ground truth images. Moreover, we also propose a new network called NDENet (Nighttime Dehaze-Enhancement Network), which jointly performs dehazing and low-light enhancement in an end-to-end manner. We evaluate our method on the proposed benchmark and achieve SSIM of 0.8962 and PSNR of 26.25. We also compare our network with other baseline networks on our benchmark to demonstrate the effectiveness of our approach. We believe that nighttime dehaze-enhancement is an essential task, particularly for autonomous navigation applications, and we hope that our work will open up new frontiers in research. Our dataset and code will be made publicly available upon acceptance of our paper.
Keywords: dehazing, image enhancement, nighttime, computer vision
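Of the two image quality metrics reported in the abstract above, PSNR is simple enough to sketch directly. The example below assumes 8-bit grayscale images stored as nested lists and a peak value of 255; the pixel values are hypothetical, and the evaluation code of the paper itself is not shown in the abstract.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_errors = [
        (r - s) ** 2
        for ref_row, res_row in zip(reference, restored)
        for r, s in zip(ref_row, res_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 ground-truth and restored images.
ground_truth = [[100, 110], [120, 130]]
restored = [[105, 105], [125, 125]]
score = psnr(ground_truth, restored)
```

Every pixel here is off by 5 grey levels, so the mean squared error is 25 and the PSNR is about 34.15 dB; higher values mean the restored image is closer to the ground truth.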
Procedia PDF Downloads 157
10640 Automated Process Quality Monitoring and Diagnostics for Large-Scale Measurement Data
Authors: Hyun-Woo Cho
Abstract:
Continuous monitoring of industrial plants is one of the necessary tasks for ensuring high-quality final products. In monitoring and diagnosis, it is critical to detect incipient abnormal events in manufacturing processes in order to improve the safety and reliability of the operations involved and to reduce related losses. In this work, a new multivariate statistical online diagnostic method is presented using a case study. To build the reference models, an empirical discriminant model is constructed based on various past operation runs. When a fault is detected on-line, an on-line diagnostic module is initiated. Finally, the status of the current operating conditions is compared with the reference model to make a diagnostic decision. The performance of the presented framework is evaluated using a dataset from complex industrial processes. It has been shown that the proposed diagnostic method outperforms other techniques, especially in terms of the incipient detection of faults.
Keywords: data mining, empirical model, on-line diagnostics, process fault, process monitoring
Procedia PDF Downloads 401
10639 The Quality of Health Services and Patient Satisfaction in Hospital
Authors: Malki Nadia Fatima Zahra, Kellal Chaimaa, Brahimi Houria
Abstract:
Quality is one of the most important modern management concepts that organizations seek to achieve in all areas and sectors, both to meet the needs and desires of customers and to ensure their own survival and continuity, since it constitutes a competitive advantage for the organization. Health organizations are among the most prominent organizations in which quality must be present, because they deal with the most valuable component of production: the person and his or her health. Any error threatens the patient's life and may lead to death, so health organizations must provide high-quality health services to achieve the highest degree of patient satisfaction. This research studies the quality of health services and the extent of its impact on patient satisfaction through an applied study that measured the level of quality of health services in the university hospital center of Algeria and its impact on patient satisfaction according to the dimensions of health service quality. We conclude that the determinants of health service quality affect patient satisfaction, which necessitates developing health services according to patients' requirements and improving their quality to obtain patient satisfaction.
Keywords: health service, health quality, quality determinants, patient satisfaction
Procedia PDF Downloads 66
10638 Stock Market Prediction Using Convolutional Neural Network That Learns from a Graph
Authors: Mo-Se Lee, Cheol-Hwi Ahn, Kee-Young Kwahk, Hyunchul Ahn
Abstract:
Over the past decade, deep learning has been in the spotlight among various machine learning algorithms. In particular, the convolutional neural network (CNN), which is known as an effective solution for recognizing and classifying images, has been widely applied to classification and prediction problems in various fields. In this study, we apply a CNN to stock market prediction, one of the most challenging tasks in machine learning research. Specifically, we propose to apply a CNN as a binary classifier that predicts stock market direction (up or down) by using a graph as its input. That is, our proposal is to build a machine learning algorithm that mimics a person who looks at the graph and predicts whether the trend will go up or down. Our proposed model consists of four steps. In the first step, it divides the dataset into intervals of 5, 10, 15, and 20 days. In step 2, it creates graphs for each interval. In step 3, CNN classifiers are trained using the graphs generated in the previous step. In step 4, it optimizes the hyperparameters of the trained model by using the validation dataset. To validate our model, we will apply it to the prediction of KOSPI200 for 1,986 days in eight years (from 2009 to 2016). The experimental dataset will include 14 technical indicators such as CCI, Momentum, and ROC, as well as the daily closing price of the KOSPI200 of the Korean stock market.
Keywords: convolutional neural network, deep learning, Korean stock market, stock market prediction
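The first step of the four-step procedure above, cutting the price series into fixed-length intervals with an up/down label, can be sketched as follows. The abstract does not define the label or say whether the intervals overlap, so this sketch assumes overlapping windows labelled by whether the next day's close is above the last close in the window; the prices and the function name are hypothetical. Rendering each window as a chart image and training the CNN are not shown.

```python
def make_windows(closing_prices, window):
    """Cut a price series into overlapping windows with an up/down label.

    The label is 1 when the close on the day after the window is higher than
    the last close inside the window, otherwise 0 (an assumed definition).
    """
    samples = []
    for end in range(window, len(closing_prices)):
        history = closing_prices[end - window:end]
        label = 1 if closing_prices[end] > history[-1] else 0
        samples.append((history, label))
    return samples


# Hypothetical daily closing prices.
prices = [100.0, 101.0, 99.5, 102.0, 103.0, 102.5, 104.0]
samples = make_windows(prices, window=5)
```

Calling the same function with `window` set to 10, 15, or 20 on a longer series would give the other interval lengths listed in the abstract.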
Procedia PDF Downloads 425
10637 Measuring Housing Quality Using Geographic Information System (GIS)
Authors: Silvija Šiljeg, Ante Šiljeg, Ivan Marić
Abstract:
Housing quality is measured at objective and subjective levels using different indicators. This research used 5 urban and housing indicators formed from 58 variables from different housing domains. The aims of the research were to measure housing quality with a GIS approach and to detect critical points of housing, using the Croatian coastal town of Zadar as an example. GIS is used in the research to generate models of housing quality indexes by standardisation and aggregation of variables, and to examine the accuracy of the housing quality index model. The accuracy analysis was carried out on the variable describing the availability of educational facilities. By defining weighted coefficients and using different GIS methods, zones of high, middle, and low housing quality were determined. The results can be of use to town planners, spatial planners, and town authorities when preparing decisions, guidelines, and spatial interventions.
Keywords: housing quality, GIS, housing quality index, indicators, models of housing quality
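The standardisation and weighted aggregation described above can be sketched in a few lines. This is only an illustration under stated assumptions: min-max scaling, the variable names, the weights, and the one-third and two-thirds cut-offs for the low, middle, and high zones are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def min_max(values: List[float]) -> List[float]:
    """Standardise values to the range 0..1 (assumed scaling method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def quality_index(variables: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted mean of the standardised variables, one score per spatial unit."""
    scaled = {name: min_max(vals) for name, vals in variables.items()}
    total_weight = sum(weights[name] for name in variables)
    units = len(next(iter(variables.values())))
    return [
        sum(weights[name] * scaled[name][i] for name in variables) / total_weight
        for i in range(units)
    ]


def zone(score: float, low_cut: float = 1 / 3, high_cut: float = 2 / 3) -> str:
    """Assign a zone label using hypothetical cut-offs."""
    if score < low_cut:
        return "low"
    return "high" if score >= high_cut else "middle"


# Hypothetical values for three spatial units; higher means better.
variables = {"schools_within_1km": [1, 3, 5], "green_area_share": [10, 20, 30]}
weights = {"schools_within_1km": 0.6, "green_area_share": 0.4}

scores = quality_index(variables, weights)
zones = [zone(s) for s in scores]
```

In a GIS workflow the same arithmetic would run per raster cell or per polygon; the lists here stand in for those units.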
Procedia PDF Downloads 298
10636 Classification of Red, Green and Blue Values from Face Images Using k-NN Classifier to Predict the Skin or Non-Skin
Authors: Kemal Polat
Abstract:
This study estimates whether a pixel shows skin by using RGB values obtained from the camera and a k-nearest neighbor (k-NN) classifier. The dataset used in this study has an unbalanced distribution and a linearly non-separable structure. This problem can also be called a big data problem. The Skin dataset was taken from the UCI machine learning repository. We used the k-NN method as the classifier to handle this big data problem, with k set to 1. To train and test the k-NN classifier, a 50-50% training-testing partition was used. TP rate, FP rate, precision, recall, F-measure, and AUC were used to evaluate the performance of the k-NN classifier. The results obtained are 0.999, 0.001, 0.999, 0.999, 0.999, and 1.00, respectively. As these results show, the proposed method could be used to predict whether the image is skin or not.
Keywords: k-NN classifier, skin or non-skin classification, RGB values, classification
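The evaluation set-up above (k = 1, a 50-50 partition, and rate-based metrics) can be illustrated with a small sketch. The pixel values, the alternating split, and the function names are assumptions made for the example; it does not use the UCI Skin dataset, and AUC is omitted because it needs ranked scores rather than hard labels.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Tuple[int, int, int], int]  # ((R, G, B), label); 1 = skin, 0 = non-skin


def predict_1nn(train: List[Sample], pixel: Tuple[int, int, int]) -> int:
    """Return the label of the training pixel closest in squared Euclidean distance."""
    nearest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], pixel)))
    return nearest[1]


def evaluate(train: List[Sample], test: List[Sample]) -> Dict[str, float]:
    """Compute TP rate, FP rate, precision, recall, and F-measure on the test half."""
    tp = fp = tn = fn = 0
    for pixel, label in test:
        predicted = predict_1nn(train, pixel)
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    recall = tp / (tp + fn) if tp + fn else 0.0
    fp_rate = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp_rate": recall, "fp_rate": fp_rate, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical pixels, for illustration only.
samples: List[Sample] = [
    ((220, 180, 160), 1), ((225, 185, 165), 1), ((200, 150, 130), 1), ((205, 155, 135), 1),
    ((30, 60, 200), 0), ((35, 65, 205), 0), ((20, 200, 40), 0), ((25, 195, 45), 0),
]
train, test = samples[0::2], samples[1::2]  # alternating 50-50 split (assumed)
metrics = evaluate(train, test)
```

A brute-force nearest-neighbor search like this scans every training pixel per query, so a dataset of realistic size would normally call for an indexed search structure.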
Procedia PDF Downloads 248
10635 Quality Management in Construction Project
Authors: Harsh Panchal, Saurabh Amrutkar
Abstract:
Quality management is an essential part of any project and is directly related to project performance. It depends on multiple factors at different stages of a project, from time management to construction logistics. A project is a mixture of various components, including itinerary management, health and safety, crew productivity, and many more. From the survey conducted, we conclude that advancement in technology and an indigenous approach to a project result in maximum quality standards and better project performance. In this paper, we discuss the components of the above factors that compromise the quality of a project and how they can be controlled to achieve maximum quality assurance using quality planning and total quality management. The paper also covers the limitations and problems faced in each factor responsible for quality management, and how to tackle them using techniques and processes based on activities, sequence identification, the critical path, and duration. The project management concept that deals with the sequence of scope, cost, and time gives an overview of ongoing quality management and hints at how to regulate the current procedure for maximum achievable quality. It also deals with the problems faced by engineers that slow the routine work process and drastically reduce the quality outcome.
Keywords: management, performance, project, quality
Procedia PDF Downloads 165
10634 Similar Script Character Recognition on Kannada and Telugu
Authors: Gurukiran Veerapur, Nytik Birudavolu, Seetharam U. N., Chandravva Hebbi, R. Praneeth Reddy
Abstract:
This work presents a robust approach for the recognition of characters in Telugu and Kannada, two South Indian scripts with structurally similar characters. Character recognition requires exhaustive datasets, but only a few are publicly available. We therefore decided to create a dataset for one language (the source language), train the model with it, and then test it with the target language. Telugu is the target language in this work, whereas Kannada is the source language. The suggested method uses Canny edge features to increase character identification accuracy on pictures with noise and varying lighting. A dataset of 45,150 images containing printed Kannada characters was created. The Nudi software was used to automatically generate printed Kannada characters with different writing styles and variations. Manual labelling was employed to ensure the accuracy of the character labels. Deep learning models such as CNN (Convolutional Neural Network) and Visual Attention neural network (VAN) were used to experiment with the dataset. A VAN architecture incorporating additional channels for Canny edge features was adopted, as this approach gave good results. The model's accuracy on the combined Telugu and Kannada test dataset was 97.3%. Performance was better with Canny edge features applied than with a model that used only the original grayscale images. When tested on each language separately, the model reached an accuracy of 80.11% for Telugu characters and 98.01% for Kannada words. This model shows excellent accuracy when identifying and categorizing characters from these scripts.
Keywords: base characters, modifiers, guninthalu, aksharas, vattakshara, VAN
Procedia PDF Downloads 53
10633 Feature Location Restoration for Under-Sampled Photoplethysmogram Using Spline Interpolation
Authors: Hangsik Shin
Abstract:
The purpose of this research is to restore the feature locations of an under-sampled photoplethysmogram using spline interpolation and to investigate the feasibility of feature shape restoration. We obtained a photoplethysmogram sampled at 10 kHz and decimated it to generate an under-sampled dataset. The decimated dataset has sampling frequencies of 5 kHz, 2.5 kHz, 1 kHz, 500 Hz, 250 Hz, 25 Hz, and 10 Hz. To investigate the restoration performance, we interpolated the under-sampled signals to 10 kHz, then compared their feature locations with those of the 10 kHz-sampled photoplethysmogram. The features were the upper and lower peaks of the photoplethysmography waveform. The results showed that time differences were dramatically decreased by interpolation. Location error was less than 1 ms for both feature types. In the 10 Hz-sampled cases, location error also decreased substantially but still exceeded 10 ms.
Keywords: peak detection, photoplethysmography, sampling, signal reconstruction
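The decimate, interpolate, and compare procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's method: a synthetic 1 Hz sine stands in for the photoplethysmogram, decimation is plain sample dropping, and the spline is a hand-written natural cubic spline (the abstract does not say which spline variant was used).

```python
import bisect
import math
from typing import Callable, List


def natural_cubic_spline(x: List[float], y: List[float]) -> Callable[[float], float]:
    """Return a function evaluating the natural cubic spline through (x, y)."""
    n = len(x) - 1
    h = [x[i + 1] - x[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives at the knots; natural ends stay 0
    if n >= 2:
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
               for i in range(1, n)]
        for k in range(1, n - 1):  # forward elimination on the tridiagonal system
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        sol = [0.0] * (n - 1)
        sol[-1] = rhs[-1] / diag[-1]
        for k in range(n - 3, -1, -1):  # back substitution
            sol[k] = (rhs[k] - h[k + 1] * sol[k + 1]) / diag[k]
        m[1:n] = sol

    def evaluate(t: float) -> float:
        i = min(max(bisect.bisect_right(x, t) - 1, 0), n - 1)
        a, b = x[i + 1] - t, t - x[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (y[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate


dense_fs, factor = 10_000, 400  # 10 kHz reference decimated to 25 Hz
dense_t = [i / dense_fs for i in range(dense_fs + 1)]
dense_y = [math.sin(2 * math.pi * t) for t in dense_t]  # synthetic stand-in signal

coarse_t, coarse_y = dense_t[::factor], dense_y[::factor]  # under-sampled signal
spline = natural_cubic_spline(coarse_t, coarse_y)
restored_y = [spline(t) for t in dense_t]  # interpolated back to 10 kHz

# Compare upper-peak locations before and after interpolation.
true_peak = dense_t[dense_y.index(max(dense_y))]
coarse_peak = coarse_t[coarse_y.index(max(coarse_y))]
restored_peak = dense_t[restored_y.index(max(restored_y))]
err_before = abs(coarse_peak - true_peak)
err_after = abs(restored_peak - true_peak)
```

On this synthetic signal the peak of the 25 Hz version sits on the nearest coarse sample, and interpolation moves the estimate back toward the reference location. A real photoplethysmogram has sharper features and noise, so the errors reported in the abstract should not be expected from this toy signal.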
Procedia PDF Downloads 366
10632 Data Presentation of Lane-Changing Events Trajectories Using HighD Dataset
Authors: Basma Khelfa, Antoine Tordeux, Ibrahima Ba
Abstract:
We present a descriptive data analysis of lane-changing events on multi-lane roads. The data come from the Highway Drone Dataset (HighD), which contains microscopic vehicle trajectories on highways. This paper describes and analyses the role of the different parameters and their significance. Using the HighD data, we aim to find the most frequent reasons that motivate drivers to change lanes. We used the programming language R to process these data. We analyze the involvement of, and relationships between, the variables describing the ego vehicle and the four vehicles surrounding it, i.e., distance, speed difference, time gap, and acceleration. This was studied according to the class of the vehicle (car or truck) and according to the maneuver it undertook (overtaking or falling back).
Keywords: autonomous driving, physical traffic model, prediction model, statistical learning process
Procedia PDF Downloads 261
10631 Frequent Item Set Mining for Big Data Using MapReduce Framework
Authors: Tamanna Jethava, Rahul Joshi
Abstract:
Frequent item sets play an essential role in many data mining tasks that try to find interesting patterns in a database. A frequent item set is a set of items that frequently appear together in a transaction dataset. Several algorithms are used for frequent item set mining, yet most do not scale to the type of data we are presented with today, so-called "Big Data". Big Data is a collection of large data sets. Our approach is to mine frequent item sets over a large dataset in a scalable and fast way. MapReduce, together with HDFS, is used to find frequent item sets in Big Data on a large cluster. This paper focuses on using pre-processing and a mining algorithm as a hybrid approach for big data on the Hadoop platform.
Keywords: frequent item set mining, big data, Hadoop, MapReduce
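The map and reduce roles in frequent item set counting can be illustrated with an in-memory sketch. This is not Hadoop code and not the paper's hybrid algorithm: the function names, the brute-force enumeration of candidate item sets up to a fixed size, and the sample transactions are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Iterator, List, Tuple

ItemSet = Tuple[str, ...]


def map_phase(transaction: List[str], max_size: int) -> Iterator[Tuple[ItemSet, int]]:
    """Emit (item set, 1) for every item set of up to `max_size` items in a transaction."""
    items = sorted(set(transaction))
    for size in range(1, max_size + 1):
        for item_set in combinations(items, size):
            yield item_set, 1


def reduce_phase(pairs: Iterable[Tuple[ItemSet, int]]) -> Dict[ItemSet, int]:
    """Sum the emitted counts per item set (grouping and reducing in one step)."""
    counts: Dict[ItemSet, int] = defaultdict(int)
    for item_set, count in pairs:
        counts[item_set] += count
    return dict(counts)


def frequent_item_sets(transactions: List[List[str]], min_support: int,
                       max_size: int = 2) -> Dict[ItemSet, int]:
    """Keep the item sets whose support count reaches `min_support`."""
    pairs = (pair for t in transactions for pair in map_phase(t, max_size))
    counts = reduce_phase(pairs)
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical transactions, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
frequent = frequent_item_sets(transactions, min_support=3)
frequent_pairs = frequent_item_sets(transactions, min_support=2)
```

On a cluster, the map calls would run in parallel over blocks of transactions stored in HDFS and the framework would group the emitted keys before the reduce step. Enumerating every candidate this way grows quickly with transaction length, which is why pruning strategies matter at scale.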
Procedia PDF Downloads 434
10630 Developing a Grading System for Restaurants
Authors: Joseph Roberson, Carina Kleynhans, Willie Coetzee
Abstract:
The low entry barriers of the restaurant industry lead to an extremely competitive business environment. In this volatile business sector, it is of the utmost importance to implement a strategy of quality differentiation. Vital aspects of a quality differentiation strategy are total quality management, benchmarking, and service quality management. Ultimately, restaurant success depends on the continuous support of customers. Customers select restaurants based on their expectations of quality. If the customers' expectations are met, they perceive quality service and will re-patronize the restaurant. The restaurateur can manage perceptions of quality by influencing expectations while ensuring that those expectations are not inflated. Expectations can be managed by communicating service quality to customers. The aim of this research paper is to describe the development of a grading process for restaurants. The extensive body of literature on grading was assessed through content analysis. A standardized method for developing a grading system would assist in building successful grading systems that could inform both customers and restaurateurs of restaurant quality.
Keywords: benchmarking, restaurants, grading, service quality, total quality management
Procedia PDF Downloads 333