Search results for: Data Reduction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8698

8038 Representing Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is widely regarded as an important issue in building a prediction model. The main objective of time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit a low-dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques for dealing with uncertain data, specifically those that address the uncertain data condition by minimizing the loss of compression properties.

Keywords: Compression properties, uncertainty, uncertain time series, mining technique, weather prediction.

Downloads: 1620
8037 Are XBRL-based Financial Reports Better than Non-XBRL Reports? A Quality Assessment

Authors: Zhenkun Wang, Simon S. Gao

Abstract:

Using a scoring system, this paper provides a comparative assessment of data quality between XBRL-formatted financial reports and non-XBRL financial reports. It shows a major improvement in the data quality of XBRL-formatted financial reports. Although XBRL-formatted financial reports did not show much advantage in quality at the beginning, recent XBRL financial reports display a large improvement in data quality in almost all aspects. With improved XBRL web data management, presentation and analysis applications, XBRL-formatted financial reports are much more accessible, more accurate and more timely.

Keywords: Data Quality; Financial Report; Information; XBRL

Downloads: 2569
8036 Modeling of Random Variable with Digital Probability Hyper Digraph: Data-Oriented Approach

Authors: A. Habibizad Navin, M. Naghian Fesharaki, M. Mirnia, M. Kargar

Abstract:

In this paper we introduce the Digital Probability Hyper Digraph for modeling a random variable as a hierarchical data-oriented model.

Keywords: Data-Oriented Models, Data Structure, Digital Probability Hyper Digraph, Random Variable, Statistic and Probability.

Downloads: 1274
8035 Higher Plants Ability to Assimilate Explosives

Authors: G. Khatisashvili, M. Gordeziani, G. Adamia, E. Kvesitadze, T. Sadunishvili, G. Kvesitadze

Abstract:

The ability of agricultural and decorative plants to absorb and detoxify TNT and RDX has been studied. All eight tested plants, grown hydroponically, were able to absorb these explosives from water solutions: alfalfa > soybean > chickpea > chickling vetch > ryegrass > mung bean > China bean > maize. Unlike TNT, RDX did not exhibit a negative influence on seed germination and plant growth. Moreover, some plants exposed to an RDX-containing solution increased their biomass by 20%. A study of the fate of absorbed [1-14C]-TNT revealed the distribution of the label in low- and high-molecular-mass compounds, both in roots and in above-ground parts of plants, prevailing in the latter. The content of 14C in low-molecular-mass compounds is much higher in plant roots than in above-ground parts. In contrast, high-molecular-mass compounds are more intensively labeled in the above-ground parts of soybean. Most (up to 70%) of the TNT metabolites, formed by either enzymatic reduction or oxidation, are found in high-molecular-mass insoluble conjugates. Activation of the enzymes responsible for the reduction, oxidation and conjugation of TNT, such as nitroreductase, peroxidase, phenoloxidase and glutathione S-transferase, has been demonstrated. Among these enzymes, only nitroreductase was shown to be induced in alfalfa exposed to RDX. The increase in malate dehydrogenase activity in plants exposed to both explosives indicates intensification of the tricarboxylic acid cycle, which generates the reduced equivalents of NAD(P)H necessary for the functioning of nitroreductase. A hypothetical scheme of TNT metabolism in plants is proposed.

Keywords: Higher plants, TNT, RDX, transformation.

Downloads: 1712
8034 Wireless Transmission of Big Data Using Novel Secure Algorithm

Authors: K. Thiagarajan, K. Saranya, A. Veeraiah, B. Sudha

Abstract:

This paper presents a novel algorithm for secure, reliable and flexible transmission of big data in two-hop wireless networks using a cooperative jamming scheme. Two-hop wireless networks consist of source, relay and destination nodes. Big data has to be transmitted from source to relay and from relay to destination while deploying security at the physical layer. The cooperative jamming scheme makes the transmission of big data more secure by protecting it from eavesdroppers and malicious nodes of unknown location. The novel algorithm, which ensures secure and energy-balanced transmission of big data, includes selecting the data transmitting region, segmenting the selected region, determining the probability ratio for each node (capture node, non-capture node and eavesdropper node) in every segment, and evaluating the probability using binary-based evaluation. If the transmission is secure, the two-hop transmission of big data resumes; otherwise, the attackers are blocked by the cooperative jamming scheme and the data is then transmitted over the two hops.

Keywords: Big data, cooperative jamming, energy balance, physical layer, two-hop transmission, wireless security.

Downloads: 2181
8033 Hexavalent Chromium Removal from Aqueous Solutions by Adsorption onto Synthetic Nano Size ZeroValent Iron (nZVI)

Authors: A.R. Rahmani, M.T. Samadi, R. Noroozi

Abstract:

The present work was conducted to synthesize nano-size zerovalent iron (nZVI) and to remove hexavalent chromium (Cr(VI)), a highly toxic pollutant, using these nanoparticles. Batch experiments were performed to investigate the effects of varying the Cr(VI) concentration, nZVI concentration, solution pH and contact time on the removal efficiency of Cr(VI). nZVI was synthesized by reduction of ferric chloride using sodium borohydride. SEM and XRD examinations were applied to determine the particle size and characterize the produced nanoparticles. The results showed that the removal efficiency decreased with increasing Cr(VI) concentration and solution pH and increased with adsorbent dosage and contact time. The Langmuir and Freundlich isotherm models were applied to the adsorption equilibrium data, and the Langmuir isotherm model fitted the data well. Nanoparticle ZVI presented an outstanding ability to remove Cr(VI) due to its high surface area, small particle size and high inherent activity.
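
As an illustration of the two isotherm models named in the abstract, the sketch below evaluates the standard Langmuir and Freundlich equations and recovers the Langmuir parameters with the usual linearized fit (C_e/q_e plotted against C_e). It is not the authors' procedure or data: the parameter values (q_max = 50, K_L = 0.2) and the concentration grid are hypothetical, and the "measurements" are generated from the model itself purely to show that the fit is self-consistent.

```python
def langmuir(c_e, q_max, k_l):
    """Langmuir isotherm: q_e = q_max * K_L * C_e / (1 + K_L * C_e)."""
    return q_max * k_l * c_e / (1.0 + k_l * c_e)


def freundlich(c_e, k_f, n):
    """Freundlich isotherm: q_e = K_F * C_e ** (1 / n)."""
    return k_f * c_e ** (1.0 / n)


def fit_langmuir_linearized(c_values, q_values):
    """Least-squares fit of C_e/q_e = 1/(q_max*K_L) + C_e/q_max.

    Returns (q_max, k_l) derived from the slope and intercept.
    """
    xs = list(c_values)
    ys = [c / q for c, q in zip(c_values, q_values)]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    q_max = 1.0 / slope          # slope = 1 / q_max
    k_l = slope / intercept      # intercept = 1 / (q_max * K_L)
    return q_max, k_l


# Hypothetical equilibrium concentrations (mg/L); synthetic, not measured.
concentrations = [1.0, 2.0, 5.0, 10.0, 20.0]
uptakes = [langmuir(c, q_max=50.0, k_l=0.2) for c in concentrations]
fitted_q_max, fitted_k_l = fit_langmuir_linearized(concentrations, uptakes)
```

With real batch data, comparing the quality of this linear fit against the corresponding log-log fit for the Freundlich model is one common way to decide which isotherm describes the equilibrium better.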

Keywords: Adsorption, aqueous solution, Chromium, nZVI, removal.

Downloads: 2566
8032 Dust Storm Prediction Using ANNs Technique (A Case Study: Zabol City)

Authors: Jamalizadeh, M.R., Moghaddamnia, A., Piri, J., Arbabi, V., Homayounifar, M., Shahryari, A.

Abstract:

Dust storms are among the most costly and destructive events in many desert regions. They can cause massive damage both to natural environments and to human lives. This paper presents a preliminary study on dust storms as a major natural hazard in arid and semi-arid regions. As a case study, dust storm events that occurred in Zabol city, located in the Sistan region of Iran, were analyzed to diagnose and predict dust storms. The identification and prediction of dust storm events could have a significant impact on damage reduction. Present models for this purpose are complicated and not appropriate for many areas with data-poor environments. The present study explores the Gamma test for identifying the inputs of an ANN model for dust storm prediction. Results indicate that more work is needed on identifying dust storms and distinguishing between the various dust storm types.

Keywords: Dust Storm, Gamma Test, Prediction, ANNs, Zabol.

Downloads: 2153
8031 An Experimental Study on the Effect of Operating Parameters during the Micro-Electro-Discharge Machining of Ni Based Alloy

Authors: Asma Perveen, M. P. Jahan

Abstract:

Ni alloys cover a wide range of applications in the automotive, oil and gas, and aerospace industries. However, these alloys pose challenges for conventional machining technologies. Micro-electro-discharge machining (micro-EDM), on the other hand, is a non-conventional machining method that uses controlled spark energy to remove material irrespective of the material's hardness. Industry has always shown great interest in developing optimum methodologies and parameters to enhance the productivity of micro-EDM by reducing machining time and tool wear for different alloys. Therefore, this study aims to investigate the effects of the micro-EDM process parameters in order to find their optimal values. The input process parameters include voltage, capacitance and electrode rotational speed, whereas the output parameters considered are machining time, entrance diameter of the hole, overcut, tool wear and crater size. The surface morphology and element characterization are also investigated using SEM and EDX analysis. The experimental results indicate that machining time decreases as discharge energy increases. Discharge energy also contributes to the enlargement of the entrance diameter as well as the overcut. In addition, tool wear decreases as discharge energy increases. Moreover, crater size is found to increase with discharge energy.

Keywords: Micro EDM, Ni alloy, discharge energy, micro-holes.

Downloads: 1336
8030 Study of Efficiency and Capability LZW++ Technique in Data Compression

Authors: Yusof. Mohd Kamir, Mat Deris. Mohd Sufian, Abidin. Ahmad Faisal Amri

Abstract:

The purpose of this paper is to show the efficiency and capability of LZW++ in data compression. The LZW++ technique is an enhancement of the existing LZW technique, obtained by modifying it. LZW reads one character at a time, whereas LZW++ reads three characters at a time. This paper focuses on data compression and tests the efficiency and capability of LZW++ on different data formats, such as doc, pdf and text files. Several experiments were carried out on these formats. The results show that LZW++ performs better than the existing LZW technique in terms of file size.
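
For orientation, the sketch below shows the classic one-character-at-a-time LZW scheme that the abstract uses as its baseline. It is a minimal textbook version, not the authors' code, and it does not attempt the three-characters-at-a-time variant, whose details the abstract does not give. It assumes input characters with code points below 256 and an unbounded dictionary; both are simplifications made for illustration.

```python
def lzw_compress(text):
    """Classic LZW: emit one dictionary code per longest known prefix."""
    # Assumption: every character has a code point below 256.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate          # keep extending the match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch                 # restart from the unmatched character
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the text by reconstructing the same dictionary on the fly."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Code refers to the entry being defined in this very step.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        pieces.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)
```

A round trip such as `lzw_decompress(lzw_compress(s)) == s` is a quick sanity check; the number of emitted codes, relative to the input length, gives a rough feel for how much repetition the dictionary captured.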

Keywords: Data Compression, Huffman Encoding, LZW, LZW++, RLL, Size.

Downloads: 2090
8029 Flow Visualization and Characterization of an Artery Model with Stenosis

Authors: Anis S. Shuib, Peter R. Hoskins, William J. Easson

Abstract:

Cardiovascular diseases, principally atherosclerosis, are responsible for 30% of world deaths. Atherosclerosis is due to the formation of plaque. The fatty plaque may be at risk of rupture, typically leading to stroke and heart attack. The plaque is usually associated with a high degree of lumen reduction, called a stenosis. It is increasingly recognized that the initiation and progression of disease and the occurrence of clinical events involve a complex interplay between the local biomechanical environment and the local vascular biology. The aim of this study is to investigate the flow behavior through a stenosed artery. A physical experiment was performed using an artery model and a blood analogue fluid. The axisymmetric model consists of contraction and expansion regions that follow a cosine function. A 30% diameter reduction was used in this study. The flow field was measured using particle image velocimetry (PIV). Spherical particles of 20 μm diameter were seeded in a water-glycerol-NaCl mixture. The steady-flow Reynolds number was 250. The area of interest is the region after the stenosis, where flow separation occurs. The velocity field was measured and the velocity gradient was investigated. There was a high particle concentration in the recirculation zone. The high velocity gradient formed immediately after the stenosis throat created a lift force that enhanced particle migration to the flow separation area.

Keywords: Stenosis artery, Biofluid mechanics, PIV

Downloads: 2004
8028 Impact of Stack Caches: Locality Awareness and Cost Effectiveness

Authors: Abdulrahman K. Alshegaifi, Chun-Hsi Huang

Abstract:

Treating data based on its location in memory has received much attention in recent years because data in different locations has different properties, which matter for cache utilization. Stack data and non-stack data may interfere with each other’s locality in the data cache. One important property of stack data is its high spatial and temporal locality. In this work, we simulate a non-unified cache design that splits the data cache into stack and non-stack caches in order to keep stack data and non-stack data in separate caches. We observe that the overall hit rate of the non-unified cache design is sensitive to the size of the non-stack cache. We then investigate the appropriate size and associativity for the stack cache to achieve a high hit ratio, especially when over 99% of accesses are directed to the stack cache. The results show that, on average, more than 99% stack cache accuracy is achieved with 2KB of capacity and 1-way associativity. Further, we analyze the improvement in hit rate when a small, fixed-size stack cache is added at level 1 to a unified cache architecture. The results show that adding a 1KB stack cache improves the overall hit rate of the unified cache design by approximately 3.9% on average for the Rijndael benchmark. The stack cache is simulated using the SimpleScalar toolset.
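
To make the split-cache idea concrete, here is a toy simulator of a 1-way (direct-mapped) cache and a router that sends each access to a stack or non-stack cache. It is only a sketch and is unrelated to the SimpleScalar setup used in the paper: the 32-byte line size, the address-threshold test for "stack" accesses, and the names `DirectMappedCache` and `simulate_split` are all assumptions made for illustration.

```python
class DirectMappedCache:
    """Toy 1-way set-associative cache that only tracks hits and misses."""

    def __init__(self, capacity_bytes, line_bytes=32):
        # line_bytes=32 is a hypothetical choice, not taken from the paper.
        self.line_bytes = line_bytes
        self.num_lines = capacity_bytes // line_bytes
        self.tags = [None] * self.num_lines
        self.hits = 0
        self.accesses = 0

    def access(self, address):
        """Record one access; return True on a hit, False on a miss."""
        block = address // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        self.accesses += 1
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag   # a miss replaces whatever was in the line
        return False

    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0


def simulate_split(trace, stack_base, stack_cache, data_cache):
    """Route each address to the stack cache or the non-stack cache.

    Assumption: an access counts as a stack access when its address is at
    or above stack_base; a real simulator would classify accesses using
    the stack pointer or segment information instead.
    """
    for address in trace:
        target = stack_cache if address >= stack_base else data_cache
        target.access(address)
    return stack_cache.hit_rate(), data_cache.hit_rate()
```

Feeding the same trace to a single unified cache and to the split pair, then comparing hit rates, mirrors the kind of comparison the abstract describes, though real conclusions need real benchmark traces.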

Keywords: Hit rate, Locality of program, Stack cache, and Stack data.

Downloads: 1509
8027 Genetic Polymorphism of the Acute Lymphoblastic Leukaemia and Hyperhomocysteinemia its Relation with the for a Group of Children in the East of Algeria

Authors: Yahia Massinissa, Kalla A, Yahia M, Benbia S

Abstract:

Much recent research has addressed the relation between elevated homocysteinemia and some kinds of cancer. Our study therefore investigated a possible relation between an increased plasma concentration of this amino acid and the onset of acute lymphoblastic leukaemia in a group of Algerian children of Berber origin in the east of Algeria. The study was carried out on 47 patients with an average age of 9 ± 6 years, in whom the disease had been diagnosed by blood and marrow examination in the hospital of blood diseases at the CHU of Batna, and on 194 healthy controls of the same age. Both groups were assayed for the concentrations of homocysteine, vitamin B9 and vitamin B12, and were studied for specific polymorphisms of enzymes essential to the metabolism of this amino acid, using real-time PCR (LightCycler) on the following enzymes: MS (C2756G), MSR (A66G), MTHFR1 (C677T) and MTHFR2 (A1298C).
The results revealed that the homozygous mutant genotype is the least frequent in both groups, and that, for each enzyme, at least one genotype occurs in the patient group at a markedly higher percentage than in the healthy group; this is especially true of the mutant genotype GG of methionine synthase and the TT form of methylenetetrahydrofolate reductase. A considerable number of genotypes in the patient group are associated with a characteristic increase of this amino acid, owing to the reduced biological activity of these enzymes, which become inefficient at converting homocysteine into methionine. This in turn reduces the percentage of methyl groups in the DNA of the studied genes, which increases transcriptional activity and capacity; this is probably one of the factors of the disease, especially since the vitamin assays are normal and similar in the two groups, which rules out the hypothesis of vitamin deficiency. We also note that the heterozygous genotype is less frequent in the patient group, except for MTHFR2, and that the wild-type genotype is more frequent in the control group, except for MSR. Although these results are partial, they open a new avenue for the genetic diagnosis of this malignant disease, allowing early diagnosis and the use of an effective and appropriate treatment at the same time.

Keywords: Genetic polymorphism, Acute Lymphoblastic Leukaemia, Biomarkers, Metabolism of homocystein

Downloads: 2262
8026 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created using the source code, processed metrics from the same or a previous version of the code, and related fault data. Some companies do not store and keep track of all the artifacts required for software fault prediction. To construct a fault prediction model for such a company, training data from other projects is one potential solution. The earlier a fault is predicted, the less it costs to correct. The training data consists of metrics data and related fault data at the function/module level. This paper investigates fault prediction at an early stage using cross-project data, focusing on design metrics. In this study, an empirical analysis is carried out to validate design metrics for cross-project fault prediction. The machine learning technique used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as an initial guideline for projects where no previous fault data is available. We analyze seven datasets from the NASA Metrics Data Program, which offer design as well as code metrics. Overall, the results of cross-project learning are comparable to those of within-company data learning.

Keywords: Software Metrics, Fault prediction, Cross project, Within project.

Downloads: 2547
8025 A Novel VLSI Architecture for Image Compression Model Using Low power Discrete Cosine Transform

Authors: Vijaya Prakash.A.M, K.S.Gurumurthy

Abstract:

In image processing, image compression can improve the performance of digital systems by reducing the cost and time of image storage and transmission without a significant reduction in image quality. This paper describes the hardware architecture of a low-complexity Discrete Cosine Transform (DCT) for image compression [6]. In this DCT architecture, common computations are identified and shared to remove redundant computations in the DCT matrix operation. Vector processing is the method used to implement the DCT. This reduction in the computational complexity of the 2D DCT reduces power consumption. The 2D DCT is performed on an 8x8 matrix using two 1-dimensional discrete cosine transform blocks and a transposition memory [7]. The inverse discrete cosine transform (IDCT) is performed to obtain the image matrix and reconstruct the original image. The proposed image compression algorithm is implemented in MATLAB code. The VLSI design of the architecture is implemented using Verilog HDL. The proposed hardware architecture for image compression employing the DCT was synthesized using RTL Compiler and mapped using 180nm standard cells. The simulation is done using Modelsim. The simulation results from MATLAB and Verilog HDL are compared. Detailed analysis of power and area was done using RTL Compiler from Cadence. The power consumption of the DCT core is reduced to 1.027mW with minimum area [1].
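
The row-column decomposition mentioned in the abstract (two 1-D DCT passes separated by a transposition) can be sketched in software as follows. This is a plain floating-point reference using the orthonormal DCT-II and its inverse, written only to illustrate the data flow; it does not model the shared-computation hardware, the Verilog design, or any power behaviour, and the function names are hypothetical.

```python
import math


def dct_1d(vector):
    """Orthonormal 1-D DCT-II of a sequence of numbers."""
    n = len(vector)
    result = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(vector[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        result.append(scale * total)
    return result


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    result = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        result.append(total)
    return result


def transpose(matrix):
    """Software stand-in for the transposition memory."""
    return [list(column) for column in zip(*matrix)]


def dct_2d(block):
    """2-D DCT as: 1-D DCT on rows, transpose, 1-D DCT on rows, transpose."""
    row_pass = [dct_1d(row) for row in block]
    column_pass = [dct_1d(row) for row in transpose(row_pass)]
    return transpose(column_pass)


def idct_2d(coeffs):
    """2-D inverse DCT using the same row-transpose-row structure."""
    row_pass = [idct_1d(row) for row in coeffs]
    column_pass = [idct_1d(row) for row in transpose(row_pass)]
    return transpose(column_pass)
```

Because the transform is separable, the same 1-D routine serves both passes, which is the property a two-block-plus-transposition-memory architecture relies on.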

Keywords: Discrete Cosine Transform (DCT), Inverse Discrete Cosine Transform (IDCT), Joint Photographic Expert Group (JPEG), Low Power Design, Very Large Scale Integration (VLSI).

Downloads: 3141
8024 Extreme Temperature Forecast in Mbonge, Cameroon through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution

Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph

Abstract:

In this paper, temperature extremes are forecast by applying the block maxima method of the generalized extreme value (GEV) distribution to temperature data from the Cameroon Development Corporation (C.D.C). Considering two sets of data (raw and simulated) and two models of the GEV distribution (stationary and non-stationary), return level analysis is carried out. In the stationary model, the return values are constant over time for the raw data, while for the simulated data they show an increasing trend with an upper bound. In the non-stationary model, the return levels of both the raw and simulated data show an increasing trend with an upper bound. This clearly shows that although temperatures in the tropics show signs of increasing in the future, there is a maximum temperature that is not exceeded. The results of this paper are vital for agricultural and environmental research.
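
The return level of a stationary GEV fit has a closed form, and the sketch below evaluates it to show why a bounded return level can arise. It uses the standard textbook expression for block maxima, with location `mu`, scale `sigma` and shape `xi`; the parameter values in the example are hypothetical and are not the fitted values from the paper, and no parameter estimation or non-stationary modelling is attempted.

```python
import math


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded on average once every `return_period` blocks.

    Standard stationary GEV formula with p = 1 / return_period:
      xi != 0:  z = mu - (sigma / xi) * (1 - (-ln(1 - p)) ** (-xi))
      xi == 0:  z = mu - sigma * ln(-ln(1 - p))   (Gumbel limit)
    """
    if return_period <= 1:
        raise ValueError("return_period must be greater than 1 block")
    p = 1.0 / return_period
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def gev_upper_bound(mu, sigma, xi):
    """Finite upper end point mu - sigma/xi when xi < 0, else infinity."""
    if xi >= 0:
        return math.inf
    return mu - sigma / xi


# Hypothetical parameters (degrees Celsius), chosen only for illustration.
example_levels = [gev_return_level(32.0, 1.5, -0.2, t) for t in (10, 100, 1000)]
example_bound = gev_upper_bound(32.0, 1.5, -0.2)
```

With a negative shape parameter the return levels keep rising with the return period but approach the finite end point, which is one way to read the "increasing trend with an upper bound" reported in the abstract.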

Keywords: Return level, Generalized extreme value (GEV), Meteorology, Forecasting.

Downloads: 2110
8023 Influence of Axial Magnetic Field on the Electrical Breakdown and Secondary Electron Emission in Plane-Parallel Plasma Discharge

Authors: Sabah I. Wais, Raghad Y. Mohammed, Sedki O. Yousif

Abstract:

The influence of an axial magnetic field (B = 0.48 T) on the variation of the ionization efficiency coefficient η and the secondary electron emission coefficient γ with respect to the reduced electric field E/P is studied over a new range of plane-parallel electrode spacing (0 < d < 20 cm) and at nitrogen working pressures between 0.5 and 20 Pa. The axial magnetic field is produced by an inductive copper coil of radius 5.6 cm. The experimental breakdown voltage data are used to estimate the mean Paschen curves under different working conditions. The secondary electron emission coefficient is calculated from the mean Paschen curve and used to determine the minimum breakdown voltage. A reduction in discharge voltage of about 25% is observed when the axial magnetic field is applied. At large interelectrode spacing, the effect of the axial magnetic field becomes more significant for the obtained values of η but less so for the values of γ.

Keywords: Paschen curve, Townsend coefficient, Secondary electron emission, Magnetic field, Minimum breakdown voltage.

Downloads: 2614
8022 Clarification of Synthetic Juice through Spiral Wound Ultrafiltration Module at Turbulent Flow Region and Cleaning Study

Authors: Vijay Singh, Chandan Das

Abstract:

Synthetic juice was clarified using a spiral wound ultrafiltration (UF) membrane module under two operating conditions, with and without permeate recycle, in the turbulent flow regime. The performance of the spiral wound ultrafiltration membrane was analyzed during clarification. The synthetic juice was a mixture of deionized water, sucrose and pectin. The operating conditions were a feed flow rate of 10 lpm, a pressure drop of 413.7 kPa and a Reynolds number of 5000. Permeate samples were analyzed in terms of volume reduction factor (VRF), viscosity (Pa.s), ⁰Brix, TDS (mg/l), electrical conductivity (μS) and turbidity (NTU). It was observed that the permeate flux declined with operating time under both conditions, with and without permeate recycle, due to increasing concentration polarization and growth of the gel layer on the membrane surface. Without permeate recycle, the membrane fouled faster than with permeate recycle. Without permeate recycle the VRF rose to 5, whereas with permeate recycle it was 1.9. The VRF is higher due to adsorption of solute (pectin) molecules on the membrane surface, and the resulting permeate flux declined with VRF. With permeate recycle, quality was within the acceptable limit. The fouled membrane was cleaned by applying different processes (e.g., deionized water, SDS and EDTA solution). Membrane cleaning was analyzed in terms of permeability recovery.

Keywords: Synthetic juice, Spiral wound, ultrafiltration, Reynolds No, Volume reduction factor.

Downloads: 1858
8021 Effects of Entomopathogenic Nematodes on Suppressing Hairy Rose Beetle, Tropinota squalida Scop. (Coleoptera: Scarabaeidae) Population in Cauliflower Field in Egypt

Authors: A. S. Abdel-Razek, M. M. M. Abd-Elgawad

Abstract:

The potential of entomopathogenic nematodes for suppressing the T. squalida population on cauliflower from transplanting to harvest was evaluated. Significant reductions in plant infestation percentage and population density (/m2) were recorded throughout the 2011 and 2012 plantation seasons, before and after spraying the plants. In the 2011 plantation season, the percent reduction in numbers/m2 was highest in March for the treatments with Heterorhabditis indica Behera and Heterorhabditis bacteriophora Giza, while in the 2012 plantation season the reduction in population density was highest in January for H. indica Behera and in February for H. bacteriophora Giza. In a comparison test with the conventional insecticides Hostathion and Lannate, there were no significant differences in control resulting from treatments with H. indica Behera, H. bacteriophora Giza and Lannate in the 2012 plantation season. In addition, the treatments reduced the economic threshold of T. squalida on cauliflower in this experiment, comparing before and after spraying with the two entomopathogenic nematodes in both the 2011 and 2012 seasons. This means an increase in the marketability of the heads harvested as a consequence of monthly treatments.

Keywords: Cruciferous plants, chemical insecticides, microbial control, Scarabaeid beetles, seasonal monitoring.

Downloads: 1863
8020 Mining Multicity Urban Data for Sustainable Population Relocation

Authors: Xu Du, Aparna S. Varde

Abstract:

In this research, we propose to conduct diagnostic and predictive analysis about the key factors and consequences of urban population relocation. To achieve this goal, urban simulation models extract the urban development trends as land use change patterns from a variety of data sources. The results are treated as part of urban big data with other information such as population change and economic conditions. Multiple data mining methods are deployed on this data to analyze nonlinear relationships between parameters. The result determines the driving force of population relocation with respect to urban sprawl and urban sustainability and their related parameters. This work sets the stage for developing a comprehensive urban simulation model for catering to specific questions by targeted users. It contributes towards achieving sustainability as a whole.

Keywords: Data Mining, Environmental Modeling, Sustainability, Urban Planning.

Downloads: 1784
8019 An Ant-based Clustering System for Knowledge Discovery in DNA Chip Analysis Data

Authors: Minsoo Lee, Yun-mi Kim, Yearn Jeong Kim, Yoon-kyung Lee, Hyejung Yoon

Abstract:

Biological data has several characteristics that strongly differentiate it from typical business data. It is much more complex, usually large in size, and continuously changes. Until recently, business data has been the main target for discovering trends, patterns or future expectations. However, with the recent rise in biotechnology, the powerful technology that was used for analyzing business data is now being applied to biological data. With the advanced technology at hand, the main trend in biological research is rapidly changing from structural DNA analysis to understanding cellular functions of the DNA sequences. DNA chips are now being used to perform experiments and DNA analysis processes are being used by researchers. Clustering is one of the important processes used for grouping together similar entities. There are many clustering algorithms such as hierarchical clustering, self-organizing maps, K-means clustering and so on. In this paper, we propose a clustering algorithm that imitates the ecosystem, taking into account the features of biological data. We implemented the system using an Ant-Colony clustering algorithm. The system decides the number of clusters automatically. The system processes the input biological data, runs the Ant-Colony algorithm, draws the Topic Map, assigns clusters to the genes and displays the output. We tested the algorithm with test data of 100 to 1000 genes and 24 samples and show promising results for applying this algorithm to clustering DNA chip data.

Keywords: Ant colony system, biological data, clustering, DNA chip.

Downloads: 1975
8018 The Resource Description Framework (RDF) as a Modern Structure for Medical Data

Authors: Gabriela Lindemann, Danilo Schmidt, Thomas Schrader, Dietmar Keune

Abstract:

The amount and heterogeneity of data in biomedical research, notably in interdisciplinary fields, require new methods for the collection, presentation and analysis of information. Important data from laboratory experiments as well as patient trials are available but come from distributed resources. The Charité - University Hospital Berlin has established, together with the German Research Foundation (DFG), a new information service centre for kidney diseases and transplantation (Open European Nephrology Science Centre - OpEN.SC). Besides the collaborative aspect of creating new research groups, every partner or institution of this science information centre that makes its own data available is allowed to search the whole data pool of the various centres involved. A core task is the implementation of a non-restricting open data structure for the various data sources. We decided to use a modern RDF model and, in a first phase, transformed original data coming from the web-based Electronic Patient Record database TBase©.

Keywords: Medical databases, Resource Description Framework (RDF), metadata repository.

Downloads: 2033
8017 XML Data Management in Compressed Relational Database

Authors: Hongzhi Wang, Jianzhong Li, Hong Gao

Abstract:

XML is an important standard for data exchange and representation. Because relational databases are mature systems, using a relational database to support XML data may bring some advantages. However, storing XML in a relational database introduces obvious redundancy that wastes disk space, bandwidth and disk I/O when querying XML data. For efficient storage and querying of XML, it is necessary to use compressed XML data in the relational database. In this paper, a compressed relational database technology supporting XML data is presented. The original relational storage structure is suited to XPath query processing, and the compression method keeps this feature. Besides traditional relational database techniques, additional query processing technologies for compressed relations and for the special structure of XML are presented, as are technologies for XQuery processing in a compressed relational database.

Keywords: XML, compression, query processing

Downloads 1806
8016 A System for Analyzing and Eliciting Public Grievances Using Cache Enabled Big Data

Authors: P. Kaladevi, N. Giridharan

Abstract:

The main purpose of the system for analyzing and eliciting public grievances is to receive and process all sorts of complaints from the public and respond to users. Because of the large number of complaints, the complaint data becomes big data, which is difficult to store and process. The proposed system uses HDFS to store the big data and MapReduce to process it. The concept of a cache was applied in the system to provide immediate response and timely action using big data analytics. Cache-enabled big data improves the response time of the system. The unstructured data provided by the users are handled efficiently through the MapReduce algorithm. Complaints are processed in the order of the hierarchy of authority. The drawbacks of the traditional database system used in the existing system are addressed by our system through a cache-enabled Hadoop Distributed File System. MapReduce framework code has the potential to leak sensitive data during computation. We propose a system that adds noise to the output of the reduce phase to avoid signaling the presence of sensitive data. If a complaint is not processed within the allotted time, it is automatically forwarded to the higher authority, which ensures that it is processed. A copy of the filed complaint is sent as a digitally signed PDF document to the user's mail id, which serves as proof. The system report serves as essential data when making important decisions based on legislation.
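
The idea of adding noise to the output of the reduce phase can be sketched in a few lines. The abstract does not say which noise distribution or which keys are used; the Laplace noise below is an assumption borrowed from differential privacy, and the `department` field and the `epsilon` parameter are hypothetical names chosen for illustration. The map and reduce steps are plain Python stand-ins for a Hadoop job.

```python
import random
from collections import defaultdict


def map_phase(complaints):
    """Emit (department, 1) for every complaint record."""
    for complaint in complaints:
        yield complaint["department"], 1


def reduce_phase(pairs):
    """Sum the counts per key, as a reducer would after the shuffle."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(counts, epsilon, rng):
    """Perturb each count; scale 1/epsilon assumes each record changes a count by at most 1."""
    scale = 1.0 / epsilon
    return {key: value + laplace_noise(scale, rng) for key, value in counts.items()}


# Made-up complaint records, for illustration only.
complaints = [
    {"department": "water"},
    {"department": "roads"},
    {"department": "water"},
]
exact = reduce_phase(map_phase(complaints))
released = noisy_counts(exact, epsilon=0.5, rng=random.Random(0))
print(exact)
print(released)
```

Only `released` would leave the computation in this sketch; a smaller `epsilon` means more noise and less chance that a single sensitive record is visible in the output.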

Keywords: Big Data, Hadoop, HDFS, Caching, MapReduce, web personalization, e-governance.

Downloads 1592
8015 Mining Image Features in an Automatic Two-Dimensional Shape Recognition System

Authors: R. A. Salam, M. A. Rodrigues

Abstract:

The number of features required to represent an image can be very large. Using all available features to recognize objects can suffer from the curse of dimensionality. Feature selection and extraction is the pre-processing step of image mining. The main issues in analyzing images are the effective identification of features and their extraction. The mining problem addressed here is the grouping of features for different shapes. Experiments have been conducted using the shape outline as the features. Shape outline readings are put through a normalization and dimensionality reduction process using an eigenvector-based method to produce a new set of readings. After this pre-processing step, the data are grouped by shape. Through statistical analysis of these readings together with peak measures, a robust classification and recognition process is achieved. Tests showed that the suggested methods are able to recognize objects automatically through their shapes. Finally, experiments also demonstrate the system's invariance to rotation, translation, scale, reflection and a small degree of distortion.

Keywords: Image mining, feature selection, shape recognition, peak measures.

Downloads 1460
8014 Flow Regime Characterization in a Diseased Artery Model

Authors: Anis S. Shuib, Peter R. Hoskins, William J. Easson

Abstract:

Cardiovascular disease, mostly in the form of atherosclerosis, is responsible for 30% of all deaths worldwide, amounting to 17 million people per year. Atherosclerosis is due to the formation of plaque. The fatty plaque may be at risk of rupture, leading typically to stroke and heart attack. The plaque is usually associated with a high degree of lumen reduction, called a stenosis. The initiation and progression of the disease are strongly linked to the hemodynamic environment near the vessel wall. The aim of this study is to validate the flow of a blood mimic through an arterial stenosis model with a computational fluid dynamics (CFD) package. In the experiment, the axisymmetric model consists of contraction and expansion regions that follow a cosine function. A 30% diameter reduction was used in this study. Particle image velocimetry (PIV) was used to characterize the flow. The fluid consists of rigid spherical particles suspended in a water-glycerol-NaCl mixture. Particles with a diameter of 20 μm were selected to follow the flow of the fluid. Flows at Re = 155, 270 and 390 were investigated. The experimental result is compared with flow simulated in FLUENT using a viscous laminar flow model. The results suggest that the laminar flow model was sufficient to predict flow velocity at the inlet, but the velocity at the stenosis throat at Re = 390 was overestimated. Hence, a transition to the turbulent regime might have developed at the throat region as the flow rate increased.

Keywords: Atherosclerosis, Particle-laden flow, Particle image velocimetry, Stenosis artery

Downloads 1725
8013 Improved K-Modes for Categorical Clustering Using Weighted Dissimilarity Measure

Authors: S. Aranganayagi, K. Thangavel

Abstract:

K-Modes is an extension of the K-Means clustering algorithm, developed to cluster categorical data, where the mean is replaced by the mode. The similarity measure proposed by Huang is the simple matching or mismatching measure. The weights of attribute values contribute greatly to clustering; thus, in this paper we propose a new weighted dissimilarity measure for K-Modes, based on the ratio of the frequency of attribute values in the cluster to that in the data set. The new weighted measure is tested on data sets obtained from the UCI data repository. The results are compared with K-Modes and K-representative and show that the new measure generates clusters with high purity.
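
The baseline that the paper builds on can be sketched as follows: the simple matching dissimilarity counts mismatching attributes, and a cluster's mode takes the most frequent value per attribute. This is standard K-Modes only; the abstract does not give the formula of the proposed weighted measure, so it is not reproduced here. The data and function names are made up for illustration.

```python
from collections import Counter


def simple_matching(a, b):
    """Simple matching dissimilarity: the number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(rows):
    """Attribute-wise most frequent value; ties go to the value seen first."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*rows))


def assign(rows, modes):
    """Assign each row to the index of its nearest mode (first mode wins ties)."""
    return [
        min(range(len(modes)), key=lambda k: simple_matching(row, modes[k]))
        for row in rows
    ]


# Made-up categorical data: (colour, size).
rows = [("red", "small"), ("red", "large"), ("blue", "large"), ("blue", "large")]
modes = [("red", "small"), ("blue", "large")]

labels = assign(rows, modes)
second_cluster = [row for row, label in zip(rows, labels) if label == 1]
print(labels)
print(cluster_mode(second_cluster))
```

In this sketch the second row is equally far from both modes, which is the kind of tie a frequency-based weighting of attribute values could help to break.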

Keywords: Clustering, categorical data, K-Modes, weighted dissimilarity measure

Downloads 3692
8012 Different Formula of Mixed Bacteria as a Bio-Treatment for Sewage Wastewater

Authors: E. Marei, A. Hammad, S. Ismail, A. El-Gindy

Abstract:

This study investigates the ability of different formulas of mixed bacteria to act as a biological treatment of wastewater after primary treatment, and as a bio-removal agent and bio-adsorbent for different heavy metals under natural conditions. The wastewater was collected from the Sarpium forest site, Ismailia Governorate, Egypt. The treatments were a mixture of free cells and a mixture of immobilized cells of different bacteria. These formulas of mixed bacteria were prepared under laboratory conditions. The data obtained indicated that, as a result of wastewater bio-treatment, the removal rate was 76.92 and 76.70% for biological oxygen demand, 79.78 and 71.07% for chemical oxygen demand, 32.45 and 36.84% for ammonia nitrogen, and 91.67 and 50.0% for phosphate after 24 and 28 hrs with mixed free cells and mixed immobilized cells, respectively. Moreover, the bio-removal of different heavy metals reached 90.0 and 50.0% for Cu ion, 98.0 and 98.5% for Fe ion, 97.0 and 99.3% for Mn ion, 90.0 and 90.0% for Pb ion, and 80.0 and 75.0% for Zn ion after 24 and 28 hrs with mixed free cells and mixed immobilized cells, respectively. The results indicated that a removal efficiency (reduction of total dissolved solids) of 13.86 and 17.43% was achieved after 24 and 28 hrs with mixed free cells and mixed immobilized cells, respectively.

Keywords: Biological desalination, bio-sorption heavy metals, free cell bacteria, immobilized bacteria, wastewater bio-treatment.

Downloads 864
8011 Mobile Phone as a Tool for Data Collection in Field Research

Authors: Sandro Mourão, Karla Okada

Abstract:

The need for accurate and timely field data is shared among organizations engaged in fundamentally different activities, whether public services or commercial operations. There are three major components in the process of qualitative research: data collection, interpretation and organization of data, and the analytic process. Significant technological advances have been made in mobile devices (mobile phones, PDAs, tablets, laptops, etc.), resources that can potentially be applied to data collection in field research in order to improve this process. This paper presents and discusses the main features of a mobile-phone-based solution for field data collection, composed of three modules: a survey editor, a server web application and a client mobile application. The data gathering process begins with the survey creation module, which enables the production of tailored questionnaires. The field workforce receives the questionnaire(s) on their mobile phones to collect interview responses and send them back to a server for immediate analysis.

Keywords: Data Gathering, Field Research, Mobile Phone, Survey.

Downloads 2059
8010 On Pooling Different Levels of Data in Estimating Parameters of Continuous Meta-Analysis

Authors: N. R. N. Idris, S. Baharom

Abstract:

A meta-analysis may be performed using aggregate data (AD) or individual patient data (IPD). In practice, studies may be available at both the IPD and AD levels. In this situation, both the IPD and AD should be utilised in order to maximize the available information. The statistical advantages of combining studies from different levels have not been fully explored. This study aims to quantify the statistical benefits of including available IPD when conducting a conventional summary-level meta-analysis. Simulated meta-analyses were used to assess the influence of the levels of data on overall meta-analysis estimates based on IPD only, AD only and the combination of IPD and AD (mixed data, MD), under different study scenarios. The percentage relative bias (PRB), root mean-square error (RMSE) and coverage probability were used to assess the efficiency of the overall estimates. The results demonstrate that available IPD should always be included in a conventional meta-analysis using summary-level data, as they significantly increase the accuracy of the estimates. On the other hand, if more than 80% of the available data are at the IPD level, including the AD does not provide significant differences in terms of the accuracy of the estimates. Additionally, combining the IPD and AD has a moderating effect on the bias of the estimates of the treatment effects, as the IPD tends to overestimate the treatment effects, while the AD tends to produce underestimated effect estimates. These results may provide some guidance in deciding whether significant benefit is gained by pooling the two levels of data when conducting a meta-analysis.
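
The three performance measures named above have common textbook definitions, sketched below for a simulation in which the true effect is known. The abstract does not state the exact formulas the authors used, so these definitions are an assumption, and the numbers are made up purely to exercise the functions.

```python
import math


def percentage_relative_bias(estimates, true_value):
    """PRB: 100 * (mean estimate - true value) / true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - true_value) / true_value


def rmse(estimates, true_value):
    """Root mean-square error of the estimates around the true value."""
    squared_errors = [(e - true_value) ** 2 for e in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def coverage_probability(intervals, true_value):
    """Share of confidence intervals (low, high) that contain the true value."""
    hits = sum(low <= true_value <= high for low, high in intervals)
    return hits / len(intervals)


# Made-up results of four simulated meta-analyses with a true effect of 1.0.
true_effect = 1.0
estimates = [0.9, 1.1, 1.2, 1.0]
intervals = [(0.7, 1.1), (0.9, 1.3), (1.05, 1.35), (0.8, 1.2)]

print(percentage_relative_bias(estimates, true_effect))
print(rmse(estimates, true_effect))
print(coverage_probability(intervals, true_effect))
```

A positive PRB indicates overestimation of the effect and a negative one underestimation, which is the direction of bias the abstract attributes to IPD and AD respectively.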

Keywords: Aggregate data, combined-level data, Individual patient data, meta analysis.

Downloads 1741
8009 Non-Revenue Water Management in Palestine

Authors: Samah Jawad Jabari

Abstract:

Water is the most important and valuable resource, not only for human life but also for all living things on the planet. Water supply utilities should fulfill the water requirement quantitatively and qualitatively. Drinking water systems are exposed to both natural hazards (hurricanes and floods) and manmade hazards (risks) that are common in Palestine. Non-Revenue Water (NRW) is a manmade risk which remains a major concern in Palestine, as NRW levels are estimated to be high. In this research, the Hebron city water distribution network was taken as a case study to estimate and audit the NRW levels. The research also investigated the state of the existing water distribution system in the study area by investigating the water losses and obtaining more information on NRW prevention and management practices. Data and information were collected from the Palestinian Water Authority (PWA) and Hebron Municipality (HM) archives. In addition, a questionnaire was designed and administered by the researcher in order to collect the necessary data for water auditing. The questionnaire also assessed the views of stakeholders (staff) in PWA and HM on the current status of NRW in the Hebron water distribution system. The main result of this research is that NRW in Hebron city was high, in excess of 30%. The main factors contributing to NRW were inaccuracies in billing volumes, unauthorized consumption, and the estimation of consumption through faulty meters. A policy for NRW reduction is available in Palestine; however, it is clear that the number of qualified staff available to carry out leak detection activities is low, and that there is a lack of appropriate technologies to reduce water losses and undertake sufficient system maintenance. Both need to be improved to enhance the performance of the network and decrease the level of NRW losses.
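
As a worked illustration of what an NRW level "in excess of 30%" means, the sketch below uses the usual water-balance definition: NRW is the system input volume minus billed authorized consumption, expressed as a share of the system input volume. The abstract does not describe the audit calculation itself, so this definition is an assumption here, and the volumes are made-up figures, not data from Hebron.

```python
def nrw_volume(system_input, billed_authorized):
    """Non-revenue water: volume put into the system that is never billed."""
    if system_input <= 0:
        raise ValueError("system_input must be positive")
    if not 0 <= billed_authorized <= system_input:
        raise ValueError("billed_authorized must lie between 0 and system_input")
    return system_input - billed_authorized


def nrw_percent(system_input, billed_authorized):
    """NRW as a percentage of the system input volume."""
    return 100.0 * nrw_volume(system_input, billed_authorized) / system_input


# Made-up annual volumes in thousand cubic metres, for illustration only.
system_input = 1000.0
billed_authorized = 680.0

print(nrw_volume(system_input, billed_authorized))
print(nrw_percent(system_input, billed_authorized))
```

The unbilled remainder covers both physical losses such as leaks and the apparent losses the abstract lists, such as billing inaccuracies, unauthorized consumption and faulty meters.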

Keywords: Non-revenue water, water auditing, leak detection, water meters.

Downloads 1107