Search results for: Data Reduction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8691

7881 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons

Authors: Said Boularouk, Didier Josselin, Eitan Altman

Abstract:

In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. The platform, being based on produsage, gives data producers the freedom to choose the descriptors of geocoded locations. Unfortunately, this freedom, also called folksonomy, complicates subsequent searches of the data. We try to solve this issue with a simple but usable method that extracts data from OSM databases and sends them to visually impaired people using Text To Speech technology. We focus on how to help people with visual disabilities plan their itinerary and comprehend a map by querying a computer and getting information about the surrounding environment in a mono-modal human-computer dialogue.

Keywords: Ontology, OpenStreetMap, visually impaired people, TTS, taxonomy.

Downloads: 887
7880 A Data Mining Model for Detecting Financial and Operational Risk Indicators of SMEs

Authors: Ali Serhan Koyuncugil, Nermin Ozgulbas

Abstract:

In this paper, a data mining model for detecting financial and operational risk indicators of SMEs is presented. Identifying the risk factors by clarifying the relationships between the variables amounts to discovering knowledge from the financial and operational variables. Such an automatic, estimation-oriented information discovery process matches the definition of data mining. In forming the model, the target was a utilitarian model that is easy to understand, interpret and apply and that does not require a theoretical background, obtained by discovering the implicit relationships in the data and identifying the effect level of every factor. This paper is based on a project funded by The Scientific and Technological Research Council of Turkey (TUBITAK).

Keywords: Risk Management, Financial Risk, Operational Risk, Financial Early Warning System, Data Mining, CHAID Decision Tree Algorithm, SMEs.

Downloads: 3122
7879 Satellite Data Classification Accuracy Assessment Based from Reference Dataset

Authors: Mohd Hasmadi Ismail, Kamaruzaman Jusoff

Abstract:

In order to develop forest management strategies for tropical forest in Malaysia, surveying the forest resources and monitoring the forest area affected by logging activities is essential. Tremendous effort has gone into classifying land cover for forest resource management in this country, as it is a priority in all aspects of forest mapping using remote sensing and related technology such as GIS. In fact, classification is a compulsory step in any remote sensing research. Therefore, the main objective of this paper is to assess the classification accuracy of a classified forest map on Landsat TM data using different numbers of reference data (200 and 388). The comparison was made through observation (200 reference data) and through combined interpretation and observation (388 reference data). Five land cover classes, namely primary forest, logged over forest, water bodies, bare land and agricultural crop/mixed horticultural, can be identified by differences in spectral wavelength. Results showed that the overall accuracy from 200 reference data was 83.5% (kappa value 0.7502459; kappa variance 0.002871), which was considered acceptable or good for optical data. When the reference data in the confusion matrix were increased from 200 to 388, the accuracy slightly improved from 83.5% to 89.17%, and the Kappa statistic increased from 0.7502459 to 0.8026135. The accuracy of this classification suggests that the strategy for selecting training areas, the interpretation approach and the number of reference data used were important for achieving a better classification result.
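The overall accuracy and kappa values quoted above are both derived from a confusion matrix. The following is a minimal sketch of that calculation using the standard Cohen's kappa formula; the function names and the small 2-class matrix are hypothetical illustrations, not the paper's data or code.

```python
def overall_accuracy(cm):
    """Fraction of reference samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: product of matching row and column marginals.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical 2-class matrix (rows: classified, columns: reference).
toy_matrix = [[45, 5],
              [10, 40]]
print(overall_accuracy(toy_matrix), cohen_kappa(toy_matrix))
```

For the toy matrix the overall accuracy is 0.85 and kappa is approximately 0.70, which shows how kappa discounts the agreement that would occur by chance.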

Keywords: Image Classification, Reference Data, Accuracy Assessment, Kappa Statistic, Forest Land Cover

Downloads: 3139
7878 Very Large Scale Integration Architecture of Finite Impulse Response Filter Implementation Using Retiming Technique

Authors: S. Jalaja, A. M. Vijaya Prakash

Abstract:

Recursive combination of an algorithm based on Karatsuba multiplication is exploited to design a generalized transpose and parallel Finite Impulse Response (FIR) filter. Mid-range Karatsuba multiplication and a Carry Save adder based on Karatsuba multiplication reduce time complexity for higher order multiplication implemented up to n bits. As a result, we design a modified N-tap transpose and parallel symmetric FIR filter structure using the Karatsuba algorithm. The mathematical formulation of the FFA filter is derived. The proposed architecture involves a significantly smaller area delay product (APD) than the existing block implementation. By adopting the retiming technique, hardware cost is reduced further. The filter architecture is designed using a 90 nm technology library and implemented using the Cadence EDA tool. The synthesized result shows better performance for different word lengths and block sizes. The design achieves switching activity reduction and low power consumption, evaluated with and without retiming for different combinations of the circuit. The proposed structure achieves a power reduction of more than half, with and without retiming, compared to the earlier design structure. As a proof of concept, for block size 16 and filter length 64, the CKA method achieves 51% as well as 70% less power by applying the retiming technique, and the CSA method achieves 57% as well as 77% less power by applying the retiming technique, compared to the previously proposed design.
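The Karatsuba recurrence that the architecture builds on replaces four half-size multiplications with three. The sketch below shows that recurrence in software for non-negative integers; it is only an illustration of the arithmetic idea, not the paper's hardware design, and the base-case threshold of 16 is an arbitrary assumption.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers with the Karatsuba recurrence."""
    # Small operands: fall back to the built-in multiplication.
    if x < 16 or y < 16:
        return x * y

    # Split both operands at half the bit length of the larger one.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive products instead of four.
    z2 = karatsuba(x_hi, y_hi)
    z0 = karatsuba(x_lo, y_lo)
    z1 = karatsuba(x_hi + x_lo, y_hi + y_lo) - z2 - z0

    # Recombine: x*y = z2 * 2^(2*half) + z1 * 2^half + z0.
    return (z2 << (2 * half)) + (z1 << half) + z0


print(karatsuba(12345678, 87654321) == 12345678 * 87654321)
```

The middle term `z1` is obtained from one extra product and two subtractions, which is the saving that makes the method attractive for wide multipliers.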

Keywords: Carry save adder Karatsuba multiplication, mid-range Karatsuba multiplication, modified FFA, transposed filter, retiming.

Downloads: 909
7877 Development of an Automated Quality Management System to Control District Heating

Authors: Nigina Toktasynova, Sholpan Sagyndykova, Zhanat Kenzhebayeva, Maksat Kalimoldayev, Mariya Ishimova, Irbulat Utepbergenov

Abstract:

To solve these problems, we investigated the management system of a heating enterprise, including strategic planning based on the balanced scorecard (BSC), quality management in accordance with the standards of the Quality Management System (QMS) ISO 9001, and analysis of the system based on expert judgment using fuzzy inference. To carry out our work we used the theory of fuzzy sets, the QMS in accordance with ISO 9001, BSC, the method of constructing business processes according to the IDEF0 notation, and the theory of modeling using Matlab simulation tools and LabVIEW graphical programming. The results of the work are as follows: we determined the possibilities of improving the management of a heat-supply plant based on QMS; after justification and adaptation, a software tool was used to automate a series of functions for the management and reduction of resources and for keeping the system up to date; and an application for the analysis of the QMS based on fuzzy inference was created, with a novel organization of communication between the software and the application that enables the analysis of relevant data of the enterprise management system.

Keywords: Balanced scorecard, heat supply, quality management system, the theory of fuzzy sets.

Downloads: 1779
7876 Analysis of Diverse Cluster Ensemble Techniques

Authors: S. Sarumathi, N. Shanthi, P. Ranjetha

Abstract:

Data mining is the procedure of determining interesting patterns from a huge amount of data. To access the data faster, the most important supporting process is clustering. Clustering is the process of identifying similarity between data according to the characteristics present in the data and grouping associated data objects into clusters. A cluster ensemble is a technique that combines various runs of different clustering algorithms to obtain a general partition of the original dataset, aiming to consolidate the outcomes of a collection of individual clusterings. The performance of clustering ensembles is mainly affected by two principal factors: diversity and quality. This paper presents an overview of the different cluster ensemble algorithms, along with the methods used in several cluster ensemble papers to improve diversity and quality, gives a comparative analysis of different cluster ensembles, and summarizes various cluster ensemble methods. This analysis should be useful to clustering experts and help in deciding the most appropriate method for the problem at hand.

Keywords: Cluster Ensemble, Consensus Function, CSPA, Diversity, HGPA, MCLA.

Downloads: 1840
7875 Air Conditioning Energy Saving by Rooftop Greenery System in Subtropical Climate in Australia

Authors: M. Anwar, M. G. Rasul, M. M. K. Khan

Abstract:

The benefits of rooftop greenery systems in buildings (such as energy savings, reduction of greenhouse gas emissions for mitigating climate change and maintaining sustainable development, and indoor temperature control) are well recognized; however, very little research has been conducted to quantify the benefits in subtropical climates such as in Australia. This study mainly focuses on measuring the temperature profile and air conditioning energy savings achieved by implementing rooftop greenery systems in subtropical Central Queensland in Australia. An experimental set-up was installed at the Rockhampton campus of Central Queensland University, where two standard shipping containers (6 m x 2.4 m x 2.4 m) were converted into small offices, one with a green roof and one without. These were used for temperature, humidity and energy consumption data collection. The study found that energy savings of up to 11.70% and a temperature difference of up to 4°C can be achieved in March in the subtropical Central Queensland climate. It is expected that more energy can be saved on peak summer days (December to February), as the temperature difference between the green roof and the non-green roof is higher in that period.

Keywords: Extensive green roof, Rooftop greenery system, Subtropical climate, Shipping container.

Downloads: 2044
7874 A Distributed Approach to Extract High Utility Itemsets from XML Data

Authors: S. Kannimuthu, K. Premalatha

Abstract:

This paper investigates a new data mining capability that entails mining of High Utility Itemsets (HUI) in a distributed environment. Existing research in data mining deals only with the presence or absence of items and does not consider semantic measures such as the weight or cost of the items. Thus, HUI mining algorithms have evolved. HUI mining is a kind of utility mining that aims to identify itemsets whose utility satisfies a given threshold. However, mining HUIs in a distributed environment and mining them from XML data have not yet been explored. In this work, a novel approach is proposed to mine HUIs from XML based data in a distributed environment. This work utilizes the Service Oriented Computing (SOC) paradigm, which provides Knowledge as a Service (KaaS). The interesting patterns are provided via web services, with the help of a knowledge server, to answer the queries of consumers. The performance of the approach is evaluated on various databases using execution time and memory consumption.
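To make the notion of "utility satisfies a given threshold" concrete, here is a minimal single-machine sketch that enumerates itemsets by brute force. It assumes the common definition of utility as purchased quantity times unit profit, summed over the transactions containing the whole itemset; the function names and the toy data are hypothetical, and the sketch covers neither the distributed setting nor the XML input described in the abstract.

```python
from itertools import combinations


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration; exponential in the number of distinct items."""
    items = sorted({item for transaction in transactions for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


# Hypothetical data: each transaction maps item -> purchased quantity.
transactions = [{"a": 1, "b": 2}, {"a": 2, "c": 1}, {"b": 1, "c": 3}]
unit_profit = {"a": 5, "b": 2, "c": 1}
print(high_utility_itemsets(transactions, unit_profit, min_utility=10))
```

Unlike support in frequent-itemset mining, utility is not anti-monotone (a superset can have higher utility than its subsets), which is why practical HUI algorithms need tighter pruning bounds than this exhaustive loop.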

Keywords: Data mining, Knowledge as a Service, service oriented computing, utility mining.

Downloads: 2453
7873 On the Network Packet Loss Tolerance of SVM Based Activity Recognition

Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir

Abstract:

In this study, the data loss tolerance of a Support Vector Machine (SVM) based activity recognition model and its multi-activity classification performance when data are received over a lossy wireless sensor network are examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi-hop wireless sensor network, which introduces high data loss. The effect of the number of nodes on reliability and multi-activity classification success is demonstrated in a simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on the activity detection success rate of an SVM based classification algorithm has not been studied before.

Keywords: Activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss.

Downloads: 2869
7872 Performance and Availability Analyses of PV Generation Systems in Taiwan

Authors: H. S. Huang, J. C. Jao, K. L. Yen, C. T. Tsai

Abstract:

This article applies the monthly final energy yield and failure data of 202 PV systems installed in Taiwan to analyze PV operational performance and system availability. The data are collected by the Industrial Technology Research Institute through manual records. Bad data detection and failure data estimation approaches are proposed to guarantee the quality of the received information. The performance ratio and system availability are then calculated and compared with those of other countries. The results indicate that the average performance ratio of Taiwan's PV systems is 0.74 and the availability is 95.7%. These results are similar to those of Germany, Switzerland, Italy and Japan.
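The abstract does not state how the two indicators are computed. As a rough sketch, the performance ratio is conventionally the final yield divided by the reference yield, and availability can be taken as the fraction of time a system is operating; both definitions, the function names and the sample numbers below are assumptions for illustration, not the article's method or data.

```python
def performance_ratio(energy_kwh, rated_power_kwp, irradiation_kwh_m2,
                      g_stc_kw_m2=1.0):
    """Performance ratio = final yield / reference yield (dimensionless)."""
    final_yield = energy_kwh / rated_power_kwp            # kWh per kWp
    reference_yield = irradiation_kwh_m2 / g_stc_kw_m2    # equivalent sun hours
    return final_yield / reference_yield


def availability(hours_operating, hours_total):
    """Fraction of the period during which the system was operating."""
    return hours_operating / hours_total


# Hypothetical month: 111 kWh from a 1 kWp array under 150 kWh/m^2 in-plane.
print(round(performance_ratio(111.0, 1.0, 150.0), 2))
print(round(availability(957, 1000), 3))
```

Normalizing by rated power and in-plane irradiation is what makes the ratio comparable across systems of different size and location.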

Keywords: availability, performance ratio, PV system, Taiwan

Downloads: 4435
7871 Stealthy Network Transfer of Data

Authors: N. Veerasamy, C. J. Cheyne

Abstract:

Users of computer systems may often require the private transfer of messages/communications between parties across a network. Information warfare and the protection and dominance of information in the military context is a prime example of an application area in which the confidentiality of data needs to be maintained. The safe transportation of critical data is therefore often a vital requirement for many private communications. However, unwanted interception/sniffing of communications is also a possibility. An elementary stealthy transfer scheme is therefore proposed by the authors. This scheme makes use of encoding, splitting of a message and the use of a hashing algorithm to verify the correctness of the reconstructed message. For this proof-of-concept purpose, the authors have experimented with the random sending of encoded parts of a message and the construction thereof to demonstrate how data can stealthily be transferred across a network so as to prevent the obvious retrieval of data.
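The encode, split, send in random order, and verify-by-hash steps described above can be sketched as follows. The abstract does not name the encoding or the hashing algorithm, so Base64 and SHA-256 are assumptions here, as are the function names and the idea of tagging each part with its index; the network transfer itself is replaced by an in-memory shuffle.

```python
import base64
import hashlib
import random


def split_message(message: bytes, part_size: int):
    """Encode the message, cut it into indexed parts, and hash the original."""
    encoded = base64.b64encode(message)
    digest = hashlib.sha256(message).hexdigest()
    parts = [(index, encoded[offset:offset + part_size])
             for index, offset in enumerate(range(0, len(encoded), part_size))]
    return parts, digest


def reconstruct(parts, expected_digest: str) -> bytes:
    """Reorder the parts, decode, and check the digest of the result."""
    encoded = b"".join(chunk for _, chunk in sorted(parts, key=lambda p: p[0]))
    message = base64.b64decode(encoded)
    if hashlib.sha256(message).hexdigest() != expected_digest:
        raise ValueError("digest mismatch: message was altered or incomplete")
    return message


parts, digest = split_message(b"attack at dawn", part_size=4)
random.shuffle(parts)  # stands in for sending the parts in random order
print(reconstruct(parts, digest))
```

Note that Base64 is an encoding, not encryption, so a scheme like this only obscures the message from casual inspection; the hash detects corruption or tampering but does not provide confidentiality by itself.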

Keywords: Construction, encode, interception, stealthy.

Downloads: 1196
7870 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has lately become one of the important business and research priorities. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends from these data would aid better understanding and decision-making. Multiple analysis techniques are deployed for English content. However, Arabic is one of the languages that produce a large amount of data over social networks and yet is among the least analyzed. This paper is a survey of the research efforts to analyze Arabic content on Twitter, focusing on the tools and methods used to extract its sentiments.

Keywords: Big Data, Social Networks, Sentiment Analysis.

Downloads: 4347
7869 Implementation of an On-Line PD Measurement System Using HFCT

Authors: F. Haghjoo, M. Sarlak, S.M. Shahrtash

Abstract:

In order to perform on-line measurement and detection of PD signals, a total solution composed of an HFCT, an A/D converter and a complete software package is proposed. The software package includes compensation of the HFCT contribution, filtering and noise reduction using the wavelet transform, and soft calibration routines. The results have shown good performance and high accuracy.

Keywords: Partial Discharge, Measurement, On-line, HFCT

Downloads: 1817
7868 Mean Shift-based Preprocessing Methodology for Improved 3D Buildings Reconstruction

Authors: Nikolaos Vassilas, Theocharis Tsenoglou, Djamchid Ghazanfarpour

Abstract:

In this work, we explore the capability of the mean shift algorithm as a powerful preprocessing tool for improving the quality of spatial data, acquired from airborne scanners, from densely built urban areas. On one hand, high resolution image data corrupted by noise caused by lossy compression techniques are appropriately smoothed while at the same time preserving the optical edges and, on the other, low resolution LiDAR data in the form of normalized Digital Surface Map (nDSM) is upsampled through the joint mean shift algorithm. Experiments on both the edge-preserving smoothing and upsampling capabilities using synthetic RGB-z data show that the mean shift algorithm is superior to bilateral filtering as well as to other classical smoothing and upsampling algorithms. Application of the proposed methodology for 3D reconstruction of buildings of a pilot region of Athens, Greece results in a significant visual improvement of the 3D building block model.

Keywords: 3D buildings reconstruction, data fusion, data upsampling, mean shift.

Downloads: 2004
7867 Study on Changes of Land Use impacting the Process of Urbanization, by Using Landsat Data in African Regions: A Case Study in Kigali, Rwanda

Authors: Delphine Mukaneza, Lin Qiao, Wang Pengxin, Li Yan, Chen Yingyi

Abstract:

Human activities on land use make the land cover gradually change or transition. In this study, we examined the use of Landsat TM data to detect the land use change of Kigali between 1987 and 2009 using remote sensing techniques, with the data analyzed in ENVI and the GIS software ArcGIS. Six categories of land use were distinguished: bare soil, built up land, wetland, water, vegetation, and others. With remote sensing techniques, we analyzed land use data for 1987, 1999 and 2009; changed areas were identified, and a dynamic land use situation in Kigali city was found over the 22 years studied. Drawing on the relevant Landsat data, the research focused on land use change and on the role of remote sensing in the process of urbanization. The results show a rapid increase in built up land between 1987 and 1999 and a large decrease in vegetation caused by the rebuilding of the city after the 1994 genocide, while from 1999 to 2009 there was a reduction in built up land and vegetation after the Kigali city authority established a Master Plan under which all constructions outside its scope were destroyed. Through the expansion of its urban area, Rwanda's capital, Kigali City, is increasing the internal employment rate and attracting business investors and the service sector to improve the economy, which will increase population growth and provide a better life. The overall planning of the city of Kigali considers the environment, land use, infrastructure, cultural and socio-economic factors, economic development and population forecasts, urban development, and the specification of constraints. To this end, the Government has set out, for the overall planning of Kigali, different stages of detailed design description, strategy and action plans that will guide Kigali planners and members of the public toward more detailed regional plans and practical measures.
Thus, land use change largely reflects human activity in Kigali, which plays an important role in the decisions the country takes. Another area to take into account is the natural situation of Kigali city. Agriculture in the region does not occupy a dominant position, and with population growth and socio-economic development, the construction area will gradually rise and speed up the process of urbanization. As a developing country, Rwanda has a population that continues to grow and a low rate of land utilization, and urbanization remains low. As mentioned earlier, the 1994 genocide, population growth and urbanization processes have been the factors driving the dramatic changes in land use. Further research would focus on analysis of Rwanda's natural resources and of the social and economic factors that could be the driving force of land use change.

Keywords: Land use change, urbanization, Kigali City, Landsat.

Downloads: 1057
7866 Comparative Analysis of the Third Generation of Research Data for Evaluation of Solar Energy Potential

Authors: Claudineia Brazil, Elison Eduardo Jardim Bierhals, Luciane Teresa Salvi, Rafael Haag

Abstract:

Renewable energy sources are dependent on climatic variability, so adequate energy planning requires observations of the meteorological variables, preferably as long-period series. Despite the scientific and technological advances that meteorological measurement systems have undergone in recent decades, there is still a considerable lack of meteorological observations that form long-period series. Reanalysis is a data assimilation system built on general atmospheric circulation models, combining data collected at surface stations, ocean buoys, satellites and radiosondes, which allows the production of long-period data for a wide range of variables. The third generation of reanalysis data emerged in 2010; among them is the Climate Forecast System Reanalysis (CFSR) developed by the National Centers for Environmental Prediction (NCEP), whose data have a spatial resolution of 0.50° x 0.50°. In order to overcome these difficulties, this study aims to evaluate the performance of solar radiation estimation from alternative databases, such as reanalysis data and meteorological satellite data, that can satisfactorily compensate for the absence of solar radiation observations at global and/or regional level. The analysis of the solar radiation data indicated that the CFSR reanalysis data performed well in relation to the observed data, with a determination coefficient of around 0.90. Therefore, it is concluded that these data have the potential to be used as an alternative source in locations with no stations or no long series of solar radiation, which is important for the evaluation of solar energy potential.

Keywords: Climate, reanalysis, renewable energy, solar radiation.

Downloads: 904
7865 Explorative Data Mining of Constructivist Learning Experiences and Activities with Multiple Dimensions

Authors: Patrick Wessa, Bart Baesens

Abstract:

This paper discusses the use of explorative data mining tools that allow the educator to explore new relationships between reported learning experiences and actual activities, even if there are multiple dimensions with a large number of measured items. The underlying technology is based on the so-called Compendium Platform for Reproducible Computing (http://www.freestatistics.org), which was built on top of the computational R Framework (http://www.wessa.net).

Keywords: Reproducible computing, data mining, explorative data analysis, compendium technology, computer assisted education

Downloads: 1252
7864 Role of Process Parameters on Pocket Milling with Abrasive Water Jet Machining Technique

Authors: T. V. K. Gupta, J. Ramkumar, Puneet Tandon, N. S. Vyas

Abstract:

Abrasive Water Jet Machining (AWJM) is an unconventional machining process well known for machining hard-to-cut materials. Research on the process has focused primarily on through cutting, and very limited literature is available on pocket milling using AWJM. The present work is an attempt to use this process for milling applications considering a set of process parameters. Four input parameters that researchers have considered for part separation are selected for this application: abrasive size, flow rate, standoff distance and traverse speed. Pockets of definite size are machined to investigate surface roughness, material removal rate and pocket depth. Based on the data available through experiments on SS304 material, it is observed that higher traverse speeds give a better finish because of the reduction in particle energy density, and a lower depth is also observed. An increase in the standoff distance and abrasive flow rate reduces the rate of material removal, as the jet loses its focus and collisions occur among the particles. ANOVA for each output parameter has been carried out to identify the significant process parameters.

Keywords: Abrasive flow rate, surface finish, abrasive size, standoff distance, traverse speed.

Downloads: 4231
7863 Analysis of Textual Data Based On Multiple 2-Class Classification Models

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described from various viewpoints. The method acquires 2-class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. Using the models, it infers whether or not each viewpoint is assigned to new items. The method extracts expressions from the new items classified into the viewpoints and extracts characteristic expressions corresponding to the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.

Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data

Downloads: 1289
7862 Using Automated Database Reverse Engineering for Database Integration

Authors: M. R. Abbasifard, M. Rahgozar, A. Bayati, P. Pournemati

Abstract:

One important problem in today's organizations is the existence of non-integrated information systems, with inconsistency and a lack of suitable correlations between legacy and modern systems. One main solution is to transfer the local databases into a global one. In this regard, we need to extract the data structures from the legacy systems and integrate them with the new technology systems. In legacy systems, huge amounts of data are stored in legacy databases. They require particular attention since they need more effort to be normalized, reformatted and moved to modern database environments. Designing the new integrated (global) database architecture and applying reverse engineering requires data normalization. This paper proposes the use of database reverse engineering in order to integrate legacy and modern databases in organizations. The suggested approach consists of methods and techniques for generating the data transformation rules needed for data structure normalization.

Keywords: Reverse Engineering, Database Integration, System Integration, Data Structure Normalization

Downloads: 1851
7861 Analysis of Cooperative Learning Behavior Based on the Data of Students' Movement

Authors: Wang Lin, Li Zhiqiang

Abstract:

The purpose of this paper is to analyze the cooperative learning behavior pattern based on the data of students' movement. The study firstly reviewed the cooperative learning theory and its research status, and briefly introduced the k-means clustering algorithm. Then, it used clustering algorithm and mathematical statistics theory to analyze the activity rhythm of individual student and groups in different functional areas, according to the movement data provided by 10 first-year graduate students. It also focused on the analysis of students' behavior in the learning area and explored the law of cooperative learning behavior. The research result showed that the cooperative learning behavior analysis method based on movement data proposed in this paper is feasible. From the results of data analysis, the characteristics of behavior of students and their cooperative learning behavior patterns could be found.
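The k-means clustering step mentioned above can be sketched in a few lines. This is a generic, dependency-free version of Lloyd's algorithm with a deterministic initialization (the first k points), which is an assumption made to keep the example reproducible; the function name and the toy 2-D positions are hypothetical and are not the study's movement data.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on tuples of coordinates; returns (centroids, labels)."""
    def nearest(point, centroids):
        # Index of the centroid with the smallest squared Euclidean distance.
        return min(range(len(centroids)),
                   key=lambda c: sum((a - b) ** 2
                                     for a, b in zip(point, centroids[c])))

    centroids = [tuple(p) for p in points[:k]]  # assumption: first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        updated = []
        for index, members in enumerate(clusters):
            if members:
                # New centroid: coordinate-wise mean of the cluster members.
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[index])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(point, centroids) for point in points]
    return centroids, labels


# Hypothetical positions: two students near one area, two near another.
positions = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(kmeans(positions, k=2))
```

In practice the result depends on the initialization, so library implementations usually run several random restarts and keep the partition with the lowest within-cluster variance.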

Keywords: Behavior pattern, cooperative learning, data analyze, K-means clustering algorithm.

Downloads: 813
7860 A Security Cloud Storage Scheme Based Accountable Key-Policy Attribute-Based Encryption without Key Escrow

Authors: Ming Lun Wang, Yan Wang, Ning Ruo Sun

Abstract:

With the development of cloud computing, more and more users are starting to utilize cloud storage services. However, some issues exist: 1) the cloud server steals the shared data, 2) sharers collude with the cloud server to steal the shared data, 3) the cloud server tampers with the shared data, 4) sharers and the key generation center (KGC) conspire to steal the shared data. In this paper, we use the advanced encryption standard (AES), hash algorithms, and accountable key-policy attribute-based encryption without key escrow (WOKE-AKP-ABE) to build a secure cloud storage scheme. The data are encrypted to protect privacy. We use hash algorithms to prevent the cloud server from tampering with the data uploaded to the cloud. Analysis results show that this scheme can resist conspired attacks.

Keywords: Cloud storage security, sharing storage, attributes, Hash algorithm.

Downloads: 1036
7859 Multimethod Approach to Research in Interlanguage Pragmatics

Authors: Saad Al-Gahtani, Ghassan H Al Shatter

Abstract:

Argument over the use of particular methods in interlanguage pragmatics has increased recently. Researchers have argued the advantages and disadvantages of each method, whether natural or elicited. Findings of different studies indicate that the use of one method may not provide enough data to answer all of a study's questions. The current study investigated the validity of using a multimethod approach in interlanguage pragmatics to understand the development of requests in Arabic as a second language (Arabic L2). To this end, the study adopted two methods belonging to two types of data sources: institutional discourse (natural data) and role play (elicited data). Participants were 117 learners of Arabic L2 at the university level, representing four levels (beginners, low-intermediate, high-intermediate, and advanced). Results showed that using two or more methods in interlanguage pragmatics affects the size and nature of the data.

Keywords: Arabic L2, Development of requests, Interlanguage Pragmatics, Multimethod approach.

Downloads 1829
7858 Index t-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings

Authors: G. Candel, D. Naccache

Abstract:

t-SNE is an embedding method that the data science community has widely used. It helps with two main tasks: displaying results by coloring items according to the item class or feature value, and forensics, giving a first overview of the dataset distribution. Two interesting characteristics of t-SNE are the structure preservation property and the answer to the crowding problem, where all neighbors in high-dimensional space cannot be represented correctly in low-dimensional space. t-SNE preserves the local neighborhood, and similar items are nicely spaced by adjusting to the local density. These two characteristics produce a meaningful representation, where the cluster area is proportional to its size in number, and relationships between clusters are materialized by closeness on the embedding. This algorithm is non-parametric. The transformation from a high- to a low-dimensional space is described but not learned. Two initializations of the algorithm would lead to two different embeddings. In a forensic approach, analysts would like to compare two or more datasets using their embeddings. A naive approach would be to embed all datasets together. However, this process is costly, as the complexity of t-SNE is quadratic, and would be infeasible for too many datasets. Another approach would be to learn a parametric model over an embedding built with a subset of data. While this approach is highly scalable, points could be mapped to exactly the same position, making them indistinguishable. This type of model would be unable to adapt to new outliers or concept drift. This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved. The optimization process minimizes two costs, one relative to the embedding shape and the second relative to the match with the support embedding. The embedding with the support process can be repeated more than once, with the newly obtained embedding.
The successive embeddings can be used to study the impact of one variable on the dataset distribution or to monitor changes over time. This method has the same complexity as t-SNE per embedding, and memory requirements are only doubled. For a dataset of n elements sorted and split into k subsets, the total embedding complexity would be reduced from O(n²) to O(n²/k), and the memory requirement from n² to 2(n/k)², which enables computation on recent laptops. The method showed promising results on a real-world dataset, making it possible to observe the birth, evolution and death of clusters. The proposed approach facilitates identifying significant trends and changes, which supports monitoring the dynamics of high-dimensional datasets.
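The complexity figures quoted above follow from simple counting, sketched below. The sketch assumes that cost and memory are proportional to the number of point pairs and that k divides n evenly. The function names are hypothetical, and the constants hidden by the O-notation are ignored.

```python
def full_cost(n: int) -> int:
    """Pairwise cost of embedding all n points at once: n^2."""
    return n * n


def split_cost(n: int, k: int) -> int:
    """Total cost of k successive embeddings of n/k points each: k * (n/k)^2."""
    subset = n // k  # assumes k divides n
    return k * subset * subset


def split_memory(n: int, k: int) -> int:
    """Memory per step: two (n/k)^2 matrices (new subset plus support)."""
    subset = n // k
    return 2 * subset * subset


n, k = 10_000, 10
```

With these example values, the total cost drops from 100,000,000 pair evaluations to 10,000,000, which is n²/k, and the per-step memory drops from 100,000,000 entries to 2,000,000.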

Keywords: Concept drift, data visualization, dimension reduction, embedding, monitoring, reusability, t-SNE, unsupervised learning.

Downloads 488
7857 Optimization of a Bioremediation Strategy for an Urban Stream of Matanza-Riachuelo Basin

Authors: María D. Groppa, Andrea Trentini, Myriam Zawoznik, Roxana Bigi, Carlos Nadra, Patricia L. Marconi

Abstract:

In the present work, a remediation bioprocess based on the use of a local isolate of the microalga Chlorella vulgaris immobilized in alginate beads is proposed. This process was shown to be effective for the reduction of several chemical and microbial contaminants present in Cildáñez stream, a watercourse that is part of the Matanza-Riachuelo Basin (Buenos Aires, Argentina). The bioprocess, involving the culture of the microalga in autotrophic conditions in a stirred-tank bioreactor supplied with a marine propeller for 6 days, allowed a significant reduction of Escherichia coli and total coliform numbers (over 95%), as well as of ammoniacal nitrogen (96%), nitrates (86%), nitrites (98%), and total phosphorus (53%) contents. Pb content was also significantly diminished after the bioprocess (95%). Standardized cytotoxicity tests using Allium cepa seeds and Cildáñez water pre- and post-remediation were also performed. The germination rate and mitotic index of onion seeds imbibed in Cildáñez water subjected to the bioprocess were similar to those observed in seeds imbibed in distilled water and significantly higher than those registered when untreated Cildáñez water was used for imbibition. Our results demonstrate the potential of this simple and cost-effective technology to remove urban-water contaminants, offering as an additional advantage the possibility of easy biomass recovery, which may become a source of alternative energy.

Keywords: Bioreactor, bioremediation, Chlorella vulgaris, Matanza-Riachuelo basin, microalgae.

Downloads 842
7856 Design of Integration Security System using XML Security

Authors: Juhan Kim, Soohyung Kim, Kiyoung Moon

Abstract:

In this paper, we design an integration security system that provides an authentication service, an authorization service, and a management service for security data, together with a unified interface for the management service. The interface originates from the XKMS protocol and is used to manage security data such as XACML policies, SAML assertions and other authentication security data, including public keys. The system includes security services such as authentication, authorization and delegation of authentication by employing SAML and XACML, based on security data such as authentication data, attribute information, assertions and policies managed with the interface in the system. It also has a SAML producer that issues assertions related to the results of the authentication and authorization services.

Keywords: XML, XML Security, XACML.

Downloads 1428
7855 An Evaluation Model for Semantic Enablement of Virtual Research Environments

Authors: Tristan O'Neill, Trina Myers, Jarrod Trevathan

Abstract:

The Tropical Data Hub (TDH) is a virtual research environment that provides researchers with an e-research infrastructure to congregate significant tropical data sets for data reuse, integration, searching, and correlation. However, researchers often require data and metadata synthesis across disciplines for cross-domain analyses and knowledge discovery. A triplestore offers a semantic layer to achieve a more intelligent method of search to support the synthesis requirements by automating latent linkages in the data and metadata. Presently, the benchmarks that aid the decision of which triplestore is best suited for use in an application environment like the TDH are limited to performance. This paper describes a new evaluation tool developed to analyze both features and performance. The tool comprises a weighted decision matrix to evaluate the interoperability, functionality, performance, and support availability of a range of integrated and native triplestores and to rank them according to the requirements of the TDH.
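A weighted decision matrix of the kind mentioned above can be sketched as follows. The four criteria come from the abstract, but the weights, the scores, and the candidate names `store_a` and `store_b` are invented for illustration and are not taken from the paper.

```python
CRITERIA = ("interoperability", "functionality", "performance", "support")


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted average of a candidate's per-criterion scores."""
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(scores[c] * weights[c] for c in CRITERIA) / total_weight


def rank(candidates: dict, weights: dict) -> list:
    """Return candidate names ordered from best to worst weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical weights and 1-5 scores (assumptions, not the paper's data).
weights = {"interoperability": 3, "functionality": 2, "performance": 4, "support": 1}
candidates = {
    "store_a": {"interoperability": 4, "functionality": 3, "performance": 5, "support": 2},
    "store_b": {"interoperability": 5, "functionality": 4, "performance": 2, "support": 4},
}
```

Changing the weights to reflect different requirements can change the ranking, which is the point of scoring features alongside raw performance.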

Keywords: Virtual research environment, Semantic Web, performance analysis, tropical data hub.

Downloads 1782
7854 Improvement of the Q-System Using the Rock Engineering System: A Case Study of Water Conveyor Tunnel of Azad Dam

Authors: S. Golmohammadi, M. Noorian Bidgoli

Abstract:

Because the status and mechanical parameters of discontinuities in the rock mass are included in the calculations, various methods of rock engineering classification are often used as a starting point for the design of different types of structures. The Q-system is one of the most frequently used methods for stability analysis and determination of support systems of underground structures in rock, including tunnels. In this method, six main parameters of the rock mass, namely, the Rock Quality Designation (RQD), joint set number (Jn), joint roughness number (Jr), joint alteration number (Ja), joint water parameter (Jw) and Stress Reduction Factor (SRF), are required. In this regard, in order to achieve a reasonable and optimal design, identifying the effective parameters for the stability of the mentioned structures is one of the most important goals and the most necessary actions in rock engineering. Therefore, it is necessary to study the relationships between the parameters of a system and how they interact with each other and, ultimately, the whole system. This research attempts to determine the most effective parameters (key parameters) among the six rock mass parameters in the Q-system using the Rock Engineering System (RES) method to improve the relationships between the parameters in the calculation of the Q value. The RES system is, in fact, a method by which one can determine the degree of cause and effect of a system's parameters by making an interaction matrix. In this research, the geomechanical data collected from the water conveyor tunnel of Azad Dam were used to make the interaction matrix of the Q-system. For this purpose, instead of using the conventional methods that are always accompanied by defects such as uncertainty, the Q-system interaction matrix is coded using a technique that is actually a statistical analysis of the data and determining the correlation coefficient between them.
So, the effect of each parameter on the system is evaluated with greater certainty. The results of this study show that the formed interaction matrix provides a reasonable estimate of the effective parameters in the Q-system. Among the six parameters of the Q-system, the SRF and Jr parameters have the maximum and minimum impact on the system, respectively, and also the RQD and Jw parameters have the maximum and minimum impact on the system, respectively. Therefore, by developing this method, we can obtain a more accurate relation to the rock mass classification by weighting the required parameters in the Q-system.
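For orientation, the sketch below shows the standard Q-system formula, Q = (RQD/Jn) × (Jr/Ja) × (Jw/SRF), and the usual RES reading of an interaction matrix, in which the row sum of a parameter is its cause value and the column sum is its effect value. The input values and the small matrix are made up, the function names are hypothetical, and the correlation-based coding used in the paper is not reproduced here.

```python
def q_value(rqd, jn, jr, ja, jw, srf):
    """Standard Q-system rock mass quality: (RQD/Jn) * (Jr/Ja) * (Jw/SRF)."""
    return (rqd / jn) * (jr / ja) * (jw / srf)


def cause_effect(matrix):
    """Cause (row sum) and effect (column sum) per parameter.

    Diagonal cells hold the parameters themselves in an RES matrix,
    so they are skipped when summing the interactions.
    """
    n = len(matrix)
    causes = [sum(matrix[i][j] for j in range(n) if j != i) for i in range(n)]
    effects = [sum(matrix[i][j] for i in range(n) if i != j) for j in range(n)]
    return causes, effects


# Illustrative values only (assumptions, not data from Azad Dam).
q = q_value(rqd=80, jn=9, jr=1.5, ja=2, jw=1, srf=1)
causes, effects = cause_effect([[0, 2, 1], [1, 0, 3], [4, 0, 0]])
```

A parameter with a high cause value strongly influences the others, while a high effect value means it is strongly influenced by them. Ranking parameters on these two sums is what lets the method suggest weights for the Q-system inputs.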

Keywords: Q-system, Rock Engineering System, statistical analysis, rock mass, tunnel.

Downloads 294
7853 A Mobile Agent-based Clustering Data Fusion Algorithm in WSN

Authors: Xiangbin Zhu, Wenjuan Zhang

Abstract:

In wireless sensor networks, mobile agent technology is used in data fusion. We design a node clustering algorithm according to the node residual energy and the results of partial integration. The routing of the mobile agent within the cluster is optimized to further reduce the amount of data transferred. Experiments show that using mobile agents in the integration process within the cluster can reduce the path loss to some extent.

Keywords: Wireless sensor networks, data fusion, mobile agent.

Downloads 1510
7852 Collision Detection Algorithm Based on Data Parallelism

Authors: Zhen Peng, Baifeng Wu

Abstract:

Modern computing technology has entered the era of parallel computing, with a trend toward sustainable and scalable parallelism. Single Instruction Multiple Data (SIMD) is an important way to follow this trend. It can gather more and more computing power by increasing the number of processor cores without the need to modify the program. Meanwhile, in the field of scientific computing and engineering design, many computation-intensive applications face the challenge of increasingly large amounts of data. Data parallel computing will be an important way to further improve the performance of these applications. In this paper, we take accurate collision detection in building information modeling as an example. We demonstrate a model for constructing a data parallel algorithm. According to the model, a complex object is decomposed into sets of simple objects, and collision detection among complex objects is converted into collision detection among simple objects. The resulting algorithm is a typical SIMD algorithm, and its advantages in parallelism and scalability are unmatched by traditional algorithms.
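The decomposition idea can be sketched in a few lines. The sketch assumes that the simple objects are axis-aligned boxes, which the abstract does not specify, and the names `Box`, `boxes_overlap` and `complex_collide` are hypothetical. The same overlap test is applied independently to every pair of simple objects, which is what makes the work data parallel. Here the pairs are evaluated sequentially, whereas a SIMD or GPU implementation would evaluate them in lockstep.

```python
from itertools import product
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box given by its minimum and maximum corners."""
    lo: tuple
    hi: tuple


def boxes_overlap(a: Box, b: Box) -> bool:
    """Two boxes overlap when their intervals overlap on all three axes."""
    return all(a.lo[i] <= b.hi[i] and b.lo[i] <= a.hi[i] for i in range(3))


def complex_collide(obj_a: list, obj_b: list) -> bool:
    """Complex objects collide if any pair of their simple parts overlaps."""
    # Every pair gets the same instruction on different data.
    return any(boxes_overlap(a, b) for a, b in product(obj_a, obj_b))


# Made-up building elements, each decomposed into boxes.
wall = [Box((0, 0, 0), (1, 1, 1)), Box((1, 0, 0), (2, 1, 1))]
pipe = [Box((1.5, 0.5, 0.5), (3, 0.6, 0.6))]
far_beam = [Box((5, 5, 5), (6, 6, 6))]
```

Because no pair test depends on another, adding processing units shortens the run without changing the program, which matches the scalability argument in the abstract.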

Keywords: Data parallelism, collision detection, single instruction multiple data, building information modeling, continuous scalability.

Downloads 1234