Search results for: Data Reduction
7371 Spatial Variability of Brahmaputra River Flow Characteristics
Authors: Hemant Kumar
Abstract:
According to Hindu mythology, the Brahmaputra River is the son of Lord Brahma. The river causes mass destruction during the monsoon season in Assam, a state in the north-eastern part of India. Assam is one of the seven states of north-eastern India, and almost the entire Brahmaputra flow passes through it; the other states carry its tributaries. In the present case study, a number of MODIS data sets were acquired for the spatial analysis. In the change detection method, the spray content was found during heavy rainfall and in the flooded monsoon season. With this method, the analysis over the Brahmaputra outflow determines the flooded season. The charged particles associated with the aerosol content verify the heavy water content below the ground surface, which is validated by trend analysis of rainfall spectrum data. This is confirmed by in-situ sampled view data from different positions along the Brahmaputra River. Further, Hyperion hyperspectral data at 30 m resolution were used to scan the sediment deposits, which is also confirmed by in-situ sampled view data from a different position.
Keywords: Spatial analysis, change detection, aerosol, trend analysis.
Downloads 540
7370 Discovering Complex Regularities by Adaptive Self Organizing Classification
Authors: A. Faro, D. Giordano, F. Maiorana
Abstract:
Data mining uses a variety of techniques, each of which is useful for some particular task. It is important to have a deep understanding of each technique and to be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find a strategy to optimize classification by adding, moving or deleting a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for optimizing the number of classes. The tool is used to classify macroeconomic data that report the most developed countries' imports and exports. It is possible to classify the countries based on their economic behaviour and to use an ad hoc tool to characterize the commercial behaviour of a country in a selected class from the analysis of positive and negative features that contribute to class formation.
Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, cluster interpretation.
Downloads 1562
7369 A New Evolutionary Algorithm for Cluster Analysis
Authors: B. Bahmani Firouzi, T. Niknam, M. Nayeripour
Abstract:
Clustering is a well-known technique in data mining. One of the most widely used clustering techniques is the K-means algorithm. Solutions obtained from this technique depend on the initialization of cluster centers, and the final solution converges to a local minimum. In order to overcome the shortcomings of the K-means algorithm, this paper proposes a hybrid evolutionary algorithm based on the combination of PSO, SA and K-means algorithms, called PSO-SA-K, which can find a better cluster partition. The performance is evaluated on several benchmark data sets. The simulation results show that the proposed algorithm outperforms previous approaches, such as PSO, SA and K-means, for the partitional clustering problem.
Keywords: Data clustering, Hybrid evolutionary optimization algorithm, K-means algorithm, Simulated Annealing (SA), Particle Swarm Optimization (PSO).
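The dependence of K-means on its initial centers, which the abstract above names as its motivation, can be illustrated with a minimal sketch. This is plain Lloyd-style K-means in Python, not the PSO-SA-K hybrid; the data points, the two initializations and the function names are hypothetical and chosen only for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centers, max_iter=100):
    """Plain K-means starting from the given centers.

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = [tuple(float(v) for v in c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centers:  # converged: no center moved
            break
        centers = updated
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Hypothetical 1-D data with three obvious groups.
points = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]
_, sse_good = kmeans(points, [(0.0,), (10.0,), (20.0,)])  # one center per group
_, sse_bad = kmeans(points, [(0.0,), (1.0,), (15.0,)])    # two centers in one group
print(sse_good, sse_bad)
```

With the second initialization the algorithm stops at a partition whose SSE is far higher than the first, which is the kind of local minimum that a global search such as PSO or SA is meant to escape.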
Downloads 2276
7368 Research and Application of Consultative Committee for Space Data Systems Wireless Communications Standards for Spacecraft
Authors: Cuitao Zhang, Xiongwen He
Abstract:
According to the new requirements of the future spacecraft, such as networking, modularization and non-cable, this paper studies the CCSDS wireless communications standards, and focuses on the low data-rate wireless communications for spacecraft monitoring and control. The application fields and advantages of wireless communications are analyzed. Wireless communications technology has significant advantages in reducing the weight of the spacecraft, saving time in spacecraft integration, etc. Based on this technology, a scheme for spacecraft data system is put forward. The corresponding block diagram and key wireless interface design of the spacecraft data system are given. The design proposal of the wireless node and information flow of the spacecraft are also analyzed. The results show that the wireless communications scheme is reasonable and feasible. The wireless communications technology can meet the future spacecraft demands in networking, modularization and non-cable.
Keywords: CCSDS standards, information flow, non-cable, spacecraft, wireless communications.
Downloads 939
7367 Workplace Monitoring During Interventional Cardiology Procedures
Authors: N. Todorovic, I. Bikit, J. Nikolov, S. Forkapic, D. Mrdja, S. Todorovic
Abstract:
Interventional cardiologists are at greater risk from radiation exposure as a result of the procedures they undertake than most other medical specialists. A study was performed to evaluate operator dose during interventional cardiology procedures and to establish methods of operator dose reduction with a radiation protective device. Differences in procedure technique and in the use of protective tools can explain the large differences in the annual equivalent dose received by the professionals. Strategies to prevent and monitor radiation exposure, advanced protective shielding and effective radiation monitoring methods should be applied.
Keywords: absorbed dose rate measurements, annual equivalent dose, protective device.
Downloads 1534
7366 The Use of Artificial Neural Network in Option Pricing: The Case of S and P 100 Index Options
Authors: Zeynep İltüzer Samur, Gül Tekin Temur
Abstract:
Due to the increasing and varying risks that economic units face, derivative instruments have gained substantial importance, and trading volumes of derivatives have reached a very significant level. In parallel with these high trading volumes, researchers have developed many different models. Some are parametric, some are nonparametric. In this study, the aim is to analyse the success of artificial neural networks in pricing options with S&P 100 index options data. Generally, previous studies cover data on European-type call options. This study includes not only European call options but also American call and put options and European put options. Three data sets are used to build three different ANN models. One only includes data that are directly observed from the economic environment, i.e. strike price, spot price, interest rate, maturity, type of the contract. The others include an extra input that is not an observable datum but a parameter, i.e. volatility. With these detailed data, the performance of ANN in the put/call dimension, the American/European dimension and the moneyness dimension is analyzed, and whether including volatility in the neural network analysis improves prediction performance is examined. The most striking result revealed by the study is that ANN shows better performance when pricing call options compared to put options, and the use of the volatility parameter as an input does not improve the performance.
Keywords: Option Pricing, Neural Network, S&P 100 Index, American/European options
Downloads 3083
7365 Harmonic Reduction In Three-Phase Parallel Connected Inverter
Authors: M.A.A. Younis, N. A. Rahim, S. Mekhilef
Abstract:
This paper presents the design and analysis of a parallel connected inverter configuration. The configuration consists of parallel connected three-phase dc/ac inverters. Series resistors were added to the inverter output to maintain the same current in each of the two parallel inverters, and to reduce the circulating current in the parallel inverters to the minimum. High frequency third harmonic injection PWM (THIPWM) was employed to reduce the total harmonic distortion and to make maximum use of the voltage source. A DSP was used to generate the THIPWM and the control algorithm for the converter. Selected experimental results are shown to validate the proposed system.
Keywords: Three-phase inverter, Third harmonic injection PWM, inverters parallel connection.
Downloads 3774
7364 Humans as Enrichment: Human-Animal Interactions and the Perceived Benefit to the Cheetah (Acinonyx jubatus), Human and Zoological Establishment
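How third harmonic injection helps "make maximum use of the voltage source" can be checked numerically. The sketch below assumes the common textbook choice of a third harmonic with one sixth of the fundamental amplitude; the abstract does not state the amplitude the authors used, and the function name and sampling grid are hypothetical.

```python
import math


def thipwm_reference(theta, modulation=1.0, third_ratio=1.0 / 6.0):
    """Phase reference: fundamental plus an injected third harmonic.

    third_ratio = 1/6 is an assumed textbook value, not taken from the paper.
    """
    return modulation * (math.sin(theta) + third_ratio * math.sin(3.0 * theta))


# Sample one full period and find the peak of the reference waveform.
samples = 3600
peak = max(abs(thipwm_reference(2.0 * math.pi * i / samples)) for i in range(samples))

# A lower peak leaves headroom to raise the fundamental before over-modulation.
headroom = 1.0 / peak
print(round(peak, 4), round(headroom, 4))
```

Under this assumption the peak of the reference drops to sqrt(3)/2 of the fundamental amplitude, so the fundamental can be raised by about 15% before the reference exceeds the carrier. In a balanced three-phase system the injected term is identical in all three phases and therefore cancels in the line-to-line voltages.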
Authors: S. J. Higgs, E. Van Eck, K. Heynis, S. H. Broadberry
Abstract:
Engagement with non-human animals is a rapidly-growing field of study within the animal science and social science sectors, with human-interactions occurring in many forms; interactions, encounters and animal-assisted therapy. To our knowledge, there has been a wide array of research published on domestic and livestock human-animal interactions, however, there appear to be fewer publications relating to zoo animals and the effect these interactions have on the animal, human and establishment. The aim of this study was to identify if there were any perceivable benefits from the human-animal interaction for the cheetah, the human and the establishment. Behaviour data were collected before, during and after the interaction on the behaviour of the cheetah and the human participants to highlight any trends with nine interactions conducted. All 35 participants were asked to fill in a questionnaire prior to the interaction and immediately after to ascertain if their perceptions changed following an interaction with the cheetah. An online questionnaire was also distributed for three months to gain an understanding of the perceptions of human-animal interactions from members of the public, gaining 229 responses. Both questionnaires contained qualitative and quantitative questions to allow for specific definitive answers to be analysed, but also expansion on the participants perceived perception of human-animal interactions. In conclusion, it was found that participants’ perceptions of human-animal interactions saw a positive change, with 64% of participants altering their opinion and viewing the interaction as beneficial for the cheetah (reduction in stress assumed behaviours) following participation in a 15-minute interaction. However, it was noted that many participants felt the interaction lacked educational values and therefore this is an area in which zoological establishments can work to further improve upon. 
The results highlighted many positive benefits for the human, animal and establishment, however, the study does indicate further areas for research in order to promote positive perceptions of human-animal interactions and to further increase the welfare of the animal during these interactions, with recommendations to create and regulate legislation.
Keywords: Acinonyx jubatus, encounters, human-animal interactions, perceptions, zoological establishments.
Downloads 1630
7363 Totally Integrated Smart Energy System through Data Acquisition via Remote Location
Authors: Muhammad Tahir Qadri, M. Irfan Anis, M. Nawaz Irshad Khan
Abstract:
This paper discusses an approach to real-time control of an energy management system using the data acquisition tool of LabVIEW. The main idea was to interface the Station (PC) with the system and publish the data on the internet using LabVIEW. In this work, controlling and switching of 3-phase AC loads are done effectively and efficiently. The phases are also sensed through devices. In case of any failure, the attached generator starts functioning automatically. The computer sends a command to the system and the system responds to the request. The modern feature is the ability to access and control the system worldwide using the World Wide Web (internet). This control can be exercised at any time from anywhere to use energy effectively, especially in developing countries where energy management is a big problem. In this system, totally integrated devices are used to operate via a remote location.
Keywords: VI-server, Remote Access, Telemetry, Data Acquisition, web server.
Downloads 1877
7362 A Low-Voltage Current-Mode Wheatstone Bridge using CMOS Transistors
Authors: Ebrahim Farshidi
Abstract:
This paper presents a new circuit arrangement for a current-mode Wheatstone bridge that is suitable for low-voltage integrated circuit implementation. Compared to other proposed circuits, this circuit features a substantial reduction in the number of elements, a low supply voltage (1 V) and low power consumption (<350 µW). In addition, the circuit has a favorable nonlinearity error (<0.35%), operates with multiple sensors and works from a single supply voltage. The circuit employs MOSFET transistors, so it can be used for standard CMOS fabrication. Simulation results from HSPICE show the high performance of the circuit and confirm the validity of the proposed design technique.
Keywords: Wheatstone bridge, current-mode, low-voltage, MOS.
Downloads 3021
7361 Preservation of Molecular Ozone in a Clathrate Hydrate: Three-Phase (Gas + Liquid + Hydrate) Equilibrium Measurements for O3 + O2 + CO2 + H2O Systems
Authors: Kazutoshi Shishido, Sanehiro Muromachi, Ryo Ohmura
Abstract:
This paper reports the three-phase (gas + liquid + hydrate) equilibrium pressure versus temperature data for a (O3 + O2 + CO2 + H2O) system for developing the hydrate-based technology to preserve ozone, a chemically unstable substance, for various industrial, medical and consumer uses. These data cover the temperature range from 272 K to 277 K, corresponding to pressures from 1.6 MPa to 3.1 MPa, for each of the three different (O3 + O2)-to-CO2 or O2-to-CO2 molar ratios in the gas phase, which are approximately 4 : 6, 5 : 5, respectively. The mole fraction of ozone in the gas phase was ~0.03, which is the densest ozone fraction used to artificially form an O3-containing hydrate ever reported in the literature. Based on these data, the formation of hydrate containing high-concentration ozone, as high as 1 mass %, can be expected.
Keywords: Clathrate hydrate, Ozone, Molecule storage, Sterilization.
Downloads 1506
7360 A Fuzzy TOPSIS Based Model for Safety Risk Assessment of Operational Flight Data
Authors: N. Borjalilu, P. Rabiei, A. Enjoo
Abstract:
A Flight Data Monitoring (FDM) program assists an operator in the aviation industry to identify, quantify, assess and address operational safety risks, in order to improve the safety of flight operations. FDM is a powerful tool for an aircraft operator, integrated into the operator’s Safety Management System (SMS), that allows the operator to detect, confirm, and assess safety issues and to check the effectiveness of corrective actions associated with human errors. This article proposes a model for assessing the safety risk level of flight data across different aspects of event focus, based on fuzzy set values. It permits evaluation of the operational safety level from the point of view of flight activities. The main advantage of this method is the proposed qualitative safety analysis of flight data. This research applies the opinions of aviation experts, gathered through a number of questionnaires related to flight data, in four categories of occurrence that can take place during an accident or an incident: Runway Excursions (RE), Controlled Flight Into Terrain (CFIT), Mid-Air Collision (MAC), and Loss of Control in Flight (LOC-I). By weighting each one (by F-TOPSIS) and applying it to the number of risks of the event, the safety risk of each related event can be obtained.
Keywords: F-TOPSIS, fuzzy set, FDM, flight safety.
Downloads 886
7359 Time Series Forecasting Using a Hybrid RBF Neural Network and AR Model Based On Binomial Smoothing
Authors: Fengxia Zheng, Shouming Zhong
Abstract:
ANNARIMA, which combines the autoregressive integrated moving average (ARIMA) model and the artificial neural network (ANN) model, is a valuable tool for modeling and forecasting nonlinear time series, yet the over-fitting problem is more likely to occur in neural network models. This paper provides a hybrid methodology, called BS-RBFAR, that combines a radial basis function (RBF) neural network and an autoregression (AR) model based on the binomial smoothing (BS) technique, which is efficient in data processing. The method is examined using the Canadian Lynx data. Empirical results indicate that the over-fitting problem can be eased using an RBF neural network based on binomial smoothing, called BS-RBF, and that the hybrid model, BS-RBFAR, can be an effective way to improve the forecasting accuracy achieved by BS-RBF used separately.
Keywords: Binomial smoothing (BS), hybrid, Canadian Lynx data, forecasting accuracy.
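As a rough illustration of the binomial smoothing step named in the abstract above, the sketch below applies the three-point binomial kernel [1, 2, 1]/4 repeatedly to a series. The abstract does not give the kernel order, the number of passes or the edge handling the authors used; replicating the end values and the function names here are assumptions.

```python
def binomial_smooth_once(series):
    """One pass of the [1, 2, 1] / 4 binomial kernel.

    The end values are replicated so the output keeps the input length
    (an assumed edge treatment).
    """
    padded = [series[0]] + list(series) + [series[-1]]
    return [
        (padded[i - 1] + 2.0 * padded[i] + padded[i + 1]) / 4.0
        for i in range(1, len(padded) - 1)
    ]


def binomial_smooth(series, passes=1):
    """Repeated passes approximate higher-order binomial filters."""
    smoothed = list(series)
    for _ in range(passes):
        smoothed = binomial_smooth_once(smoothed)
    return smoothed


spike = [0.0, 4.0, 0.0]
print(binomial_smooth(spike))  # the spike is spread over its neighbours
```

Smoothing the series before fitting reduces the high-frequency noise a network could otherwise memorize, which is one plausible reading of how the technique eases over-fitting.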
Downloads 3685
7358 Unsupervised Texture Classification and Segmentation
Authors: V. P. Subramanyam Rallabandi, S. K. Sett
Abstract:
An unsupervised classification algorithm is derived by modeling observed data as a mixture of several mutually exclusive classes that are each described by linear combinations of independent non-Gaussian densities. The algorithm estimates the data density in each class by using parametric nonlinear functions that fit to the non-Gaussian structure of the data. This improves classification accuracy compared with standard Gaussian mixture models. When applied to textures, the algorithm can learn basis functions for images that capture the statistically significant structure intrinsic in the images. We apply this technique to the problem of unsupervised texture classification and segmentation.
Keywords: Gaussian Mixture Model, Independent Component Analysis, Segmentation, Unsupervised Classification.
Downloads 1590
7357 Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning
Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan
Abstract:
We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments, and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from a real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor, on the ceiling to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared the performance in terms of accuracy and training progress. Although the collected dataset is relatively small, the proposed algorithm has good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.
Keywords: Daily activity recognition, healthcare, IoT sensors, transfer learning.
Downloads 891
7356 Visualization of Sediment Thickness Variation for Sea Bed Logging using Spline Interpolation
Authors: Hanita Daud, Noorhana Yahya, Vijanth Sagayan, Muizuddin Talib
Abstract:
This paper discusses the use of Spline Interpolation and Mean Square Error (MSE) as tools to process data acquired from a simulator developed to replicate the sea bed logging environment. Sea bed logging (SBL) is a new technique that uses the marine controlled source electromagnetic (CSEM) sounding technique and has proven to be very successful in detecting and characterizing hydrocarbon reservoirs in deep water areas by using resistivity contrasts. It uses very low frequencies of 0.1 Hz to 10 Hz to obtain a greater wavelength. In this work, the in-house built simulator was provided with predefined parameters, and the transmitted frequency was varied for sediment thicknesses of 1000 m to 4000 m for environments with and without hydrocarbon. From a series of simulations, synthetic data were generated. These data were interpolated using the Spline interpolation technique (degree three), and the mean square error (MSE) was calculated between the original data and the interpolated data. Comparisons were made by studying the trends and the relationship between frequency and sediment thickness based on the calculated MSE. It was found that the MSE showed an increasing trend in the setup with hydrocarbon present compared with the one without. The MSE also showed a decreasing trend as sediment thickness increased and with higher transmitted frequency.
Keywords: Spline Interpolation, Mean Square Error, Sea Bed Logging, Controlled Source Electromagnetic
Downloads 1655
7355 The Consumer Private Space: What is and How it can be Approached without Affecting the Consumer's Privacy
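The processing step described above (cubic spline interpolation followed by an MSE between original and interpolated data) can be sketched as follows. The abstract does not specify the boundary conditions or the sampling scheme, so this example assumes a natural cubic spline fitted through a coarse subset of samples; the function names and the test signal are hypothetical and stand in for the simulator output.

```python
import bisect
import math


def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys).

    xs must be strictly increasing; 'natural' means zero second derivative
    at both ends (an assumed boundary condition).
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system for the quadratic coefficients.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]
    # Back substitution for the per-interval coefficients b, c, d.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate


def mean_square_error(original, estimate):
    return sum((o - e) ** 2 for o, e in zip(original, estimate)) / len(original)


# Hypothetical "original" data on a fine grid, spline fitted on every 5th sample.
fine_x = [0.1 * i for i in range(51)]
fine_y = [math.sin(x) for x in fine_x]
spline = natural_cubic_spline(fine_x[::5], fine_y[::5])
mse = mean_square_error(fine_y, [spline(x) for x in fine_x])
print(mse)
```

A small MSE indicates that the coarse samples already capture the curve; comparing MSE values across setups is then a way to quantify how much the response shape differs between them.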
Authors: Calin Veghes
Abstract:
The concept of privacy, seen in connection to the consumer's private space and personalization, has recently gained a higher importance as a consequence of the increasing marketing efforts of organizations based on the capturing, processing and usage of the consumer's personal data. The paper intends to provide a definition of the consumer's private space based on the types of personal data the consumer is willing to disclose, to assess the attitude toward personalization and to identify the means preferred by consumers to control their personal data and defend their private space. Several implications generated through the definition of the consumer's private space are identified and weighted from both the consumers' and organizations' perspectives.
Keywords: Consumer private space, personalization, privacy.
Downloads 1565
7354 Changes in Fine PM Pollution Levels with Tightening of Regulations on Vehicle Emissions
Authors: Akihiro Iijima, Kimiyo Kumagai
Abstract:
A long-term campaign for monitoring the concentration of atmospheric Particulate Matter (PM) was conducted at multiple sites located in the center and suburbs of the Tokyo Metropolitan Area in Japan. The concentration of fine PM has shown a declining trend over the last two decades. A positive matrix factorization model elucidated that the contribution of combustion sources was drastically reduced. In Japan, the regulations on vehicle exhaust emissions were phased in and gradually tightened over the last two decades, which has triggered a notable reduction in PM emissions from automobiles and has contributed to the mitigation of the problem of fine PM pollution.
Keywords: Air pollution, Diesel-powered vehicle, Positive matrix factorization, Receptor modeling.
Downloads 1763
7353 Quantifying the Methods of Monitoring Timers in Electric Water Heater for Grid Balancing on Demand Side Management: A Systematic Mapping Review
Authors: Yamamah Abdulrazaq, Lahieb A. Abrahim, Samuel E. Davies, Iain Shewring
Abstract:
The electric water heater (EWH) is a powerful appliance that uses electricity in residential, commercial, and industrial settings, and the ability to control EWHs properly will result in cost savings and the prevention of blackouts on the national grid. This article discusses the usage of timers in EWH control strategies for demand-side management (DSM). To the authors' knowledge, no systematic mapping review focusing on the utilization of EWH control strategies in DSM has yet been conducted. Consequently, the purpose of this research is to identify and examine the main papers exploring EWH procedures in DSM by quantifying and categorizing information with regard to publication year and source, kind of methods, and source of data for monitoring control techniques. In order to answer the research questions, a total of 31 publications published between 1999 and 2023 were selected according to specific inclusion and exclusion criteria. The data indicate that direct load control (DLC) has been somewhat more prevalent than indirect load control (ILC). Additionally, the mixed method is much less common than the other techniques, and the proportion of real-time data (RTD) to non-real-time data (NRTD) is about equal.
Keywords: Demand side management, direct load control, electric water heater, indirect load control, non-real-time data, real time data.
Downloads 111
7352 Mechanical and Microstructural Properties of Rotary-Swaged Wire of Commercial-Purity Titanium
Authors: Michal Duchek, Jan Palán, Tomas Kubina
Abstract:
Bars made of titanium grade 2 and grade 4 were subjected to rotary forging with up to 2.2 true strain reduction in the cross-section from 10 to 3.81 mm. During progressive deformation, grain refinement in the transverse direction took place. In the longitudinal direction, ultrafine microstructure has not developed. It has been demonstrated that titanium grade 2 strengthens more than grade 4. The ultimate tensile strength increased from 650 MPa to 1040 MPa in titanium grade 4. Hardness profiles on the cross section in both materials show an increase in the centre of the wire.
Keywords: Commercial-purity titanium, wire, rotary swaging, tensile test, hardness, modulus of elasticity, microstructure.
Downloads 740
7351 VaR Forecasting in Times of Increased Volatility
Authors: Ivo Jánský, Milan Rippel
Abstract:
The paper evaluates several hundred one-day-ahead VaR forecasting models in the time period between the years 2004 and 2009 on data from six world stock indices - DJI, GSPC, IXIC, FTSE, GDAXI and N225. The models model mean using the ARMA processes with up to two lags and variance with one of GARCH, EGARCH or TARCH processes with up to two lags. The models are estimated on the data from the in-sample period and their forecasting accuracy is evaluated on the out-of-sample data, which are more volatile. The main aim of the paper is to test whether a model estimated on data with lower volatility can be used in periods with higher volatility. The evaluation is based on the conditional coverage test and is performed on each stock index separately. The primary result of the paper is that the volatility is best modelled using a GARCH process and that an ARMA process pattern cannot be found in analyzed time series.
Keywords: VaR, risk analysis, conditional volatility, GARCH, EGARCH, TARCH, moving average process, autoregressive process
Downloads 1427
7350 A Weighted Sum Technique for the Joint Optimization of Performance and Power Consumption in Data Centers
Authors: Samee Ullah Khan, C. Ardil
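A one-day-ahead VaR forecast of the kind evaluated above can be sketched with a GARCH(1,1) variance recursion and a constant mean. This is only one of the model families the abstract lists; the parameter values below are hypothetical rather than estimated, the normal quantile is an assumed distributional choice, starting the recursion from the sample variance is an assumption, and the conditional coverage test is not implemented.

```python
import math
from statistics import NormalDist


def garch11_next_variance(returns, omega, alpha, beta, mu=0.0):
    """Run sigma2[t+1] = omega + alpha * e[t]**2 + beta * sigma2[t] over the
    sample and return the one-step-ahead variance forecast.

    The recursion starts from the sample variance of the residuals
    (an assumed initialization).
    """
    residuals = [r - mu for r in returns]
    sigma2 = sum(e * e for e in residuals) / len(residuals)
    for e in residuals:
        sigma2 = omega + alpha * e * e + beta * sigma2
    return sigma2


def one_day_var(returns, omega, alpha, beta, mu=0.0, level=0.99):
    """One-day-ahead VaR as a positive loss figure, assuming normal shocks."""
    sigma = math.sqrt(garch11_next_variance(returns, omega, alpha, beta, mu))
    z = NormalDist().inv_cdf(1.0 - level)  # negative quantile, e.g. about -2.33
    return -(mu + z * sigma)


# Hypothetical daily returns and parameters, for illustration only.
calm = [0.001, -0.002, 0.0015, -0.001, 0.002]
stressed = calm[:-1] + [-0.06]  # same history, but a large final shock
params = dict(omega=1e-6, alpha=0.1, beta=0.85)
print(one_day_var(calm, **params), one_day_var(stressed, **params))
```

The second forecast is larger because the squared shock feeds directly into the next variance, which is how a model estimated in a calm period can still react when volatility rises.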
Abstract:
With data centers, end-users can realize the pervasiveness of services that will one day be the cornerstone of our lives. However, data centers are often classified as the computing systems that consume the largest amounts of power. To circumvent this problem, we propose a self-adaptive weighted sum methodology that jointly optimizes the performance and power consumption of any given data center. Compared to traditional methodologies for multi-objective optimization problems, the proposed self-adaptive weighted sum technique does not rely on a systematic change of weights during the optimization procedure. The proposed technique is compared with the greedy and LR heuristics for large-scale problems, and with the optimal solution for small-scale problems implemented in LINDO. The experimental results revealed that the proposed self-adaptive weighted sum technique outperforms both of the heuristics and shows competitive performance compared to the optimal solution.
Keywords: Meta-heuristics, distributed systems, adaptive methods, resource allocation.
Downloads 1834
7349 Image-Based (RBG) Technique for Estimating Phosphorus Levels of Crops
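The basic weighted sum idea behind the abstract above can be sketched as a scalarization of two objectives. This is the plain fixed-weight form, not the authors' self-adaptive variant, whose weight-update rule the abstract does not describe; the candidate configurations, field names and normalization by the maximum are hypothetical.

```python
def weighted_sum_choice(candidates, weight):
    """Pick the candidate minimizing weight * time + (1 - weight) * power.

    Both objectives are divided by their maximum so they are comparable
    (an assumed normalization). weight must lie in [0, 1].
    """
    max_time = max(c["time"] for c in candidates)
    max_power = max(c["power"] for c in candidates)

    def score(c):
        return weight * c["time"] / max_time + (1.0 - weight) * c["power"] / max_power

    return min(candidates, key=score)


# Hypothetical resource allocations: response time (ms) and power draw (W).
candidates = [
    {"name": "A", "time": 10.0, "power": 300.0},
    {"name": "B", "time": 20.0, "power": 150.0},
    {"name": "C", "time": 40.0, "power": 100.0},
]
for w in (1.0, 0.5, 0.0):
    print(w, weighted_sum_choice(candidates, w)["name"])
```

Sweeping the weight from 1 to 0 moves the choice from the fastest to the most power-efficient configuration; a self-adaptive scheme would set the weight from the problem state instead of sweeping it systematically.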
Authors: M. M. Ali, Ahmed Al-Ani, Derek Eamus, Daniel K. Y. Tan
Abstract:
In this glasshouse study, we developed a new image-based non-destructive technique for detecting leaf P status of different crops such as cotton, tomato and lettuce. The plants were grown on a nutrient solution containing different P concentrations, e.g. 0%, 50% and 100% of recommended P concentration (P0 = no P, L; P1 = 2.5 mL 10 L-1 of P and P2 = 5 mL 10 L-1 of P). After 7 weeks of treatment, the plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method, and at the same time leaf images were collected by a handheld crop image sensor. We calculated leaf area, leaf perimeter and RGB (red, green and blue) values of these images. These data were further used in linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified these plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using leaf image and morphological data. Our proposed non-destructive imaging method is precise in estimating P requirements of different crop species.
Keywords: Image-based techniques, leaf area, leaf P contents, linear discriminant analysis.
Downloads 1648
7348 An Improved K-Means Algorithm for Gene Expression Data Clustering
Authors: Billel Kenidra, Mohamed Benmohammed
Abstract:
Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.
Keywords: Microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization.
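The abstract does not say which initialization strategy the authors use, so the following is only a minimal sketch of the general idea: seeding K-Means with well-separated centers before running the usual iterative relocation. It uses farthest-first (maximin) seeding, a well-known heuristic that is swapped in here for illustration and is not the paper's method. The function names and the toy points are hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centers(points, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen center (assumed heuristic)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    return centers


def assign(points, centers):
    """Put every point into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist2(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centers, max_iter=100):
    """Standard iterative relocation: assign, recompute means, repeat."""
    for _ in range(max_iter):
        clusters = assign(points, centers)
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centers:  # no center moved: converged
            break
        centers = updated
    return centers, assign(points, centers)


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, farthest_first_centers(points, k=2))
```

With a poor seed (for example two centers drawn from the same group), plain K-Means can need more iterations or settle in a worse local optimum; seeding with spread-out points is one common way to reduce that risk.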
Downloads 1283
7347 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups
Authors: Lily Ingsrisawang, Tasanee Nacharoen
Abstract:
Problems arising from unbalanced data sets commonly appear in real-world applications. Because of the unequal class distribution, many researchers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is a method that has been proposed for classifying unbalanced classes with good performance. In this study, discriminant analysis methods are examined with respect to their misclassification error rates for class-imbalanced data of three diabetes risk groups. The purpose of this study was to compare the classification performance of parametric and nonparametric discriminant analysis in a three-class classification of class-imbalanced data of diabetes risk groups. Data from a project maintaining healthy conditions for 599 employees of a government hospital in Bangkok were obtained for the classification problem. The employees were divided into three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, including the variables diabetes risk group, age, gender, blood glucose, and BMI, were analyzed and bootstrapped for 50 and 100 samples, with 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for departure from multivariate normality and for equality of the covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed nonnormality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were applied to the 50 and 100 bootstrap samples and to the original data. In searching for the optimal classification rule, the prior probabilities were set both to equal proportions (0.33:0.33:0.33) and to unequal proportions of (0.90:0.05:0.05), (0.80:0.10:0.10) and (0.70:0.15:0.15).
The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k=3 or k=4 and prior probabilities for non-risk:risk:diabetic of 0.90:0.05:0.05 or 0.80:0.10:0.10 gave the smallest misclassification error rate. The k-nearest neighbors approach is therefore suggested for classifying three-class imbalanced data of diabetes risk groups.
Keywords: Bootstrap, diabetes risk groups, error rate, k-nearest neighbors.
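To make the role of the prior probabilities concrete, here is a rough sketch of a k-nearest-neighbors discriminant rule combined with bootstrap error estimation. The abstract does not state the exact formula or software used, so the posterior used here (prior times the fraction of a class's training points among the k neighbors, then normalized) is an assumption, as are the function names, the toy data, and the choice to fit on each bootstrap sample and score on the original data.

```python
import random
from collections import Counter


def knn_posteriors(x, train, priors, k):
    """Posterior per class, assumed proportional to prior * (k_c / n_c), where
    k_c is the number of the k nearest neighbors in class c and n_c is the
    number of training points in class c. Priors must be positive."""
    class_sizes = Counter(label for _, label in train)
    neighbors = sorted(
        train, key=lambda item: sum((a - b) ** 2 for a, b in zip(x, item[0]))
    )[:k]
    k_counts = Counter(label for _, label in neighbors)
    scores = {c: priors[c] * k_counts[c] / class_sizes[c] for c in class_sizes}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def classify(x, train, priors, k):
    """Assign x to the class with the largest posterior."""
    post = knn_posteriors(x, train, priors, k)
    return max(post, key=post.get)


def bootstrap_error_rates(data, priors, k, n_samples, seed=0):
    """Draw n_samples bootstrap samples (same size as data, with replacement),
    fit on each sample, and record the misclassification rate on data."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_samples):
        sample = [rng.choice(data) for _ in data]
        wrong = sum(classify(x, sample, priors, k) != y for x, y in data)
        rates.append(wrong / len(data))
    return rates


# Hypothetical imbalanced toy data: (features, label).
data = [((0.0,), "A"), ((0.5,), "A"), ((1.0,), "A"), ((1.5,), "A"),
        ((9.0,), "B"), ((10.0,), "B")]
priors = {"A": 0.5, "B": 0.5}

posteriors = knn_posteriors((9.5,), data, priors, k=3)
rates = bootstrap_error_rates(data, priors, k=3, n_samples=5)
```

Dividing by the class size keeps a large majority class from dominating simply because it supplies more neighbors; changing `priors` then shifts the decision boundary in the way the abstract's comparison of prior settings explores.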
Downloads 2007
7346 Localizing and Experiencing Electronic Questionnaires in an Educational Web Site
Authors: Theodore H. Kaskalis
Abstract:
One of the main research methods in humanistic studies is the collection and processing of data through questionnaires. This paper reports our experiences of localizing and adapting the phpESP package of electronic surveys, which led to a user-friendly online questionnaire environment offered through our department web site. After presenting the characteristics of this environment, we identify the expected benefits and present a questionnaire carried out in both the traditional and the electronic way. We present the respondents' feedback and then report the researchers' opinions. Finally, we propose ideas we intend to implement in order to further assist and enhance research based on this web-accessed electronic questionnaire environment.
Keywords: Electronic questionnaires, computer-assisted web interviewing, survey data collection, survey data visualization.
Downloads 1285
7345 Design and Implementation of Security Middleware for Data Warehouse Signature Framework
Authors: Mayada AlMeghari
Abstract:
Recently, grid middleware has enabled large-scale integrated use of network resources, such as shared data and CPUs, to form a virtual supercomputer. In this work, we present the design and implementation of the middleware for the Data Warehouse Signature (DWS) framework. The aim of using the middleware in the proposed DWS framework is to achieve high performance through parallel computing. This middleware is developed on the Alchemi.Net framework to increase security among the network nodes through an authentication and group-key distribution model. This model achieves key security and prevents intermediate attacks in the middleware. This paper presents the flow process structures of the middleware design. In addition, the paper describes the implementation of security for the DWS middleware, enhanced with the authentication and group-key distribution model. Finally, based on an analysis of other middleware approaches, the developed DWS middleware is presented as the optimal solution for complete coverage of security issues.
Keywords: Middleware, parallel computing, data warehouse, security, group-key, high performance.
Downloads 336
7344 Improving Fake News Detection Using K-means and Support Vector Machine Approaches
Authors: Kasra Majbouri Yazdi, Adel Majbouri Yazdi, Saeid Khodayi, Jingyu Hou, Wanlei Zhou, Saeed Saedy
Abstract:
Fake news and false information are major challenges for all types of media, especially social media. There is a great deal of false information, fake likes and views, and duplicated accounts, as large social networks such as Facebook and Twitter have admitted. Much of the information appearing on social media is doubtful and in some cases misleading. Such content needs to be detected as soon as possible to avoid a negative impact on society. The dimensions of fake news datasets are growing rapidly, so to detect false information more accurately with less computation time and complexity, the dimensions need to be reduced. One of the best techniques for reducing data size is feature selection. The aim of this technique is to choose a feature subset from the original set to improve classification performance. In this paper, a feature selection method is proposed that integrates K-means clustering and Support Vector Machine (SVM) approaches and works in four steps. First, the similarities between all features are calculated. Then, the features are divided into several clusters. Next, the final feature set is selected from all clusters. Finally, fake news is classified based on the final feature subset using the SVM method. The proposed method was evaluated by comparing its performance with other state-of-the-art methods on several benchmark datasets, and the outcome showed a better classification of false information for our work. The detection performance improved in two respects: the detection runtime decreased, and the classification accuracy increased, because redundant features were eliminated and the dataset dimensions were reduced.
Keywords: Fake news detection, feature selection, support vector machine, K-means clustering, machine learning, social media.
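The first three of the four steps described in the abstract can be sketched as follows. The abstract does not specify the similarity measure or how features are chosen from each cluster, so this sketch assumes absolute Pearson correlation as the similarity, runs K-means over each feature's row of the similarity matrix, and keeps the feature closest to each cluster centroid. All function names and the toy columns are hypothetical, features are assumed to be non-constant, and step 4 (SVM classification on the reduced feature set) is left out to keep the example dependency-free.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equally long, non-constant sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_features(profiles, k, max_iter=100):
    """K-means over feature profiles; returns (index groups, centroids)."""
    centers = [profiles[0]]  # farthest-first seeding (assumed choice)
    while len(centers) < k:
        centers.append(max(profiles, key=lambda p: min(dist2(p, c) for c in centers)))
    groups = []
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for j, p in enumerate(profiles):
            groups[min(range(k), key=lambda i: dist2(p, centers[i]))].append(j)
        updated = [
            [sum(col) / len(g) for col in zip(*(profiles[j] for j in g))] if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return groups, centers


def select_features(columns, k):
    # Step 1: similarity between all pairs of features.
    sim = [[abs(pearson(u, v)) for v in columns] for u in columns]
    # Step 2: divide the features into k clusters.
    groups, centers = cluster_features(sim, k)
    # Step 3: keep one representative feature per non-empty cluster.
    return sorted(
        min(g, key=lambda j: dist2(sim[j], centers[i]))
        for i, g in enumerate(groups) if g
    )


# Toy data: features 0 and 1 are redundant, as are features 2 and 3.
columns = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 12],
    [1, -1, 1, -1, 1, -1],
    [-3, 3, -3, 3, -3, 3],
]
selected = select_features(columns, k=2)
```

On this toy input the selection keeps one feature from each redundant pair, which is the kind of dimension reduction the abstract credits for the shorter runtime and higher accuracy.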
Downloads 4523
7343 Re-Optimization MVPP Using Common Subexpression for Materialized View Selection
Authors: Boontita Suchyukorn, Raweewan Auepanwiriyakul
Abstract:
A data warehouse is a repository of information integrated from source data. Information in a data warehouse is stored in the form of materialized views in order to provide better performance when answering queries. Deciding which views should be materialized is an important problem. To address it, constructing a search space that is close to optimal is a necessary task, since it yields effective results when selecting the views to be materialized. In this paper we propose an approach to reoptimize the Multiple View Processing Plan (MVPP) by using global common subexpressions. Merged queries whose query processing cost is not close to optimal are rewritten. The experiment shows that our approach can improve the total query processing cost of the MVPP, and that the sum of the query processing cost and the materialized view maintenance cost is also reduced after the views to be materialized are selected.
Keywords: Data Warehouse, materialized views, query rewriting, common subexpressions.
Downloads 1677
7342 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency
Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami
Abstract:
Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own specific clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. A main problem encountered in data mining applications is clustering the categorical data that are so prevalent in datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with such an approach, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The obtained results show that our proposed method outperforms the binary method in all cases.
Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.
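As an illustration of the transformation described in the abstract, the sketch below replaces each categorical value with the relative frequency of that modality within its attribute, producing numeric rows that a standard k-means routine could then cluster. This is one plausible reading of the abstract rather than the authors' exact procedure; the function name and the toy rows are hypothetical.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace every categorical value with the share of rows that have the
    same value in the same column (attribute)."""
    n = len(rows)
    # One frequency table per attribute (column).
    freq = [Counter(col) for col in zip(*rows)]
    return [[freq[j][value] / n for j, value in enumerate(row)] for row in rows]


# Toy categorical dataset with two attributes: colour and size.
rows = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("red", "small"),
]
encoded = relative_frequency_encode(rows)
```

One consequence of this encoding, under the stated reading, is that two different modalities with the same frequency map to the same number, whereas the binary (one-hot) alternative mentioned in the abstract keeps them distinct at the cost of one column per modality.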
Downloads 3544