Search results for: indexing data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24126

24096 Video Summarization: Techniques and Applications

Authors: Zaynab El Khattabi, Youness Tabii, Abdelhamid Benkaddour

Abstract:

Nowadays, the huge number of multimedia repositories makes browsing, retrieval and delivery of video content slow and even difficult. Video summarization has been proposed to enable faster browsing of large video collections and more efficient content indexing and access. In this paper, we focus on approaches to video summarization. Video summaries can be generated in many different forms; however, the two fundamental ways to generate summaries are static and dynamic. We present different techniques for each mode in the literature and describe some features used for generating video summaries. We conclude with perspectives for further research.

Keywords: video summarization, static summarization, video skimming, semantic features

Procedia PDF Downloads 369
24095 A Two-Step Framework for Unsupervised Speaker Segmentation Using BIC and Artificial Neural Network

Authors: Ahmad Alwosheel, Ahmed Alqaraawi

Abstract:

This work proposes a new speaker segmentation approach for two speakers. It is an online approach that does not require prior information about speaker models. It has two phases: in the first phase, a conventional approach such as unsupervised BIC-based segmentation is used to detect speaker changes and train a neural network, while in the second phase, the trained parameters of the neural network are used to predict the next incoming audio stream. Using this approach, accuracy comparable to similar BIC-based approaches is achieved with a significant improvement in computation time.

Keywords: artificial neural network, diarization, speaker indexing, speaker segmentation

Procedia PDF Downloads 466
24094 Analysis of Patent Protection of Bone Tissue Engineering Scaffold Technology

Authors: Yunwei Zhang, Na Li, Yuhong Niu

Abstract:

Bone tissue engineering scaffolds are regarded as an important clinical technology for treating bone defects. Patent protection of bone tissue engineering scaffolds has received increasing attention and has been strengthened all over the world. This study analyzed the future development trends of international technologies in the field of bone tissue engineering scaffolds and their patent protection. The study used data classification and classification indexing to analyze 2718 patents retrieved from the patent database. Results showed that patents from the United States had a competitive advantage over other countries in the field of bone tissue engineering scaffolds. The number of patent applications by a single company in the U.S. was a quarter of the world total. However, the R&D capability in China was clearly weaker than the global level, with patents mainly coming from universities and scientific research institutions. Moreover, it is predicted that synthetic organic materials, as new materials, will gradually be replaced by composite materials. The patent protection of composite materials will be further strengthened in the future.

Keywords: bone tissue engineering, patent analysis, Scaffold material, patent protection

Procedia PDF Downloads 109
24093 Cerebral Pulsatility Mediates the Link Between Physical Activity and Executive Functions in Older Adults with Cardiovascular Risk Factors: A Longitudinal NIRS Study

Authors: Hanieh Mohammadi, Sarah Fraser, Anil Nigam, Frederic Lesage, Louis Bherer

Abstract:

A chronically higher cerebral pulsatility is thought to damage cerebral microcirculation, leading to cognitive decline in older adults. Although it is widely known that regular physical activity is linked to improvement in some cognitive domains, including executive functions, the mediating role of cerebral pulsatility on this link remains to be elucidated. This study assessed the impact of 6 months of regular physical activity upon changes in an optical index of cerebral pulsatility and the role of physical activity for the improvement of executive functions. Twenty-seven older adults (aged 57-79, 66.7% women) with cardiovascular risk factors (CVRF) were enrolled in the study. The participants completed the behavioral Stroop test, which was extracted from the Delis-Kaplan executive functions system battery, at baseline (T0) and after 6 months (T6) of physical activity. Near-infrared spectroscopy (NIRS) was applied for an innovative approach to indexing cerebral pulsatility in the brain microcirculation at T0 and T6. The participants were at standing rest while a NIRS device recorded hemodynamics data from frontal and motor cortex subregions at T0 and T6. The cerebral pulsatility index of interest was cerebral pulse amplitude, which was extracted from the pulsatile component of NIRS data. Our data indicated that 6 months of physical activity was associated with a reduction in the response time for the executive functions, including the inhibition (T0: 56.33 ± 18.2 to T6: 53.33 ± 15.7, p = 0.038) and switching (T0: 63.05 ± 5.68 to T6: 57.96 ± 7.19, p < 0.001) conditions of the Stroop test. Also, physical activity was associated with a reduction in cerebral pulse amplitude (T0: 0.62 ± 0.05 to T6: 0.55 ± 0.08, p < 0.001). Notably, cerebral pulse amplitude was a significant mediator of the link between physical activity and response to the Stroop test for both inhibition (β = 0.33 (0.61, 0.23), p < 0.05) and switching (β = 0.42 (0.69, 0.11), p < 0.01) conditions.
This study suggests that regular physical activity may support cognitive functions through the improvement of cerebral pulsatility in older adults with CVRF.

Keywords: near-infrared spectroscopy, cerebral pulsatility, physical activity, cardiovascular risk factors, executive functions

Procedia PDF Downloads 163
24092 A Blind Three-Dimensional Meshes Watermarking Using the Interquartile Range

Authors: Emad E. Abdallah, Alaa E. Abdallah, Bajes Y. Alskarnah

Abstract:

We introduce a robust three-dimensional watermarking algorithm for copyright protection and indexing. The basic idea behind our technique is to measure the interquartile range, or spread, of the 3D model vertices. The algorithm starts by converting all the vertices to spherical coordinates and then partitions them into small groups. The proposed algorithm slightly alters the interquartile range distribution of the small groups based on a predefined watermark. The experimental results on several 3D meshes demonstrate the perceptual invisibility and robustness of the proposed technique against the most common attacks, including compression, noise, smoothing, scaling and rotation, as well as combinations of these attacks.
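
To make the measured quantity concrete, the following is a minimal sketch of computing the interquartile range of vertex radii per group after conversion to spherical coordinates. It is not the authors' algorithm: the function names, the choice of the radius as the measured component, the fixed-size grouping, and the "inclusive" quartile method are all assumptions made for illustration, and the watermark embedding step is not shown.

```python
import math
import statistics


def to_spherical(x, y, z):
    """Convert a Cartesian vertex to (r, theta, phi)."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0  # polar angle
    phi = math.atan2(y, x)                      # azimuth
    return r, theta, phi


def interquartile_range(values):
    """IQR = Q3 - Q1 (assumed quartile method: 'inclusive')."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def group_iqrs(vertices, group_size):
    """Split vertex radii into consecutive groups and return each group's IQR.

    Grouping by consecutive index is a hypothetical choice; the abstract
    does not say how the groups are formed.
    """
    radii = [to_spherical(*v)[0] for v in vertices]
    groups = [radii[i:i + group_size] for i in range(0, len(radii), group_size)]
    return [interquartile_range(g) for g in groups if len(g) >= 2]
```

A watermarking scheme of this kind would then nudge vertex positions so that each group's IQR encodes one watermark bit; how that is done is specific to the paper and is not reproduced here.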

Keywords: watermarking, three-dimensional models, perceptual invisibility, interquartile range, 3D attacks

Procedia PDF Downloads 440
24091 Parallel Querying of Distributed Ontologies with Shared Vocabulary

Authors: Sharjeel Aslam, Vassil Vassilev, Karim Ouazzane

Abstract:

Ontologies and various semantic repositories have become a convenient approach for implementing model-driven architectures of distributed systems on the Web. SPARQL is the standard query language for querying such repositories. However, although SPARQL is a well-established standard for querying semantic repositories in RDF and OWL format, and there are commonly used APIs which support it, like Jena for Java, a parallel option is not incorporated in them. This article presents a complete framework consisting of an object algebra for parallel RDF and an index-based implementation of a parallel query engine capable of dealing with distributed RDF ontologies which share a common vocabulary. It has been implemented in Java and, to validate the algorithms, has been applied to the problem of organizing virtual exhibitions on the Web.

Keywords: distributed ontologies, parallel querying, semantic indexing, shared vocabulary, SPARQL

Procedia PDF Downloads 163
24090 Speeding-up Gray-Scale FIC by Moments

Authors: Eman A. Al-Hilo, Hawraa H. Al-Waelly

Abstract:

In this work, a fractal image compression (FIC) technique is introduced that uses moment features to index the zero-mean range-domain blocks. The moment features are used to speed up the IFS-matching stage. The moments ratio descriptor is used to filter the domain blocks and keep only the blocks that are suitable to be IFS-matched with the tested range block. The results of tests conducted on the Lena and Cat images (256 pixels, resolution 24 bits/pixel) showed a minimum encoding time (0.89 sec for the Lena image and 0.78 sec for the Cat image) with appropriate PSNR (30.01 dB for the Lena image and 29.8 dB for the Cat image). The reduction in encoding time (ET) is about 12% for the Lena image and 67% for the Cat image.

Keywords: fractal gray level image, fractal compression technique, iterated function system, moments feature, zero-mean range-domain block

Procedia PDF Downloads 466
24089 Content-Based Color Image Retrieval Based on the 2-D Histogram and Statistical Moments

Authors: El Asnaoui Khalid, Aksasse Brahim, Ouanan Mohammed

Abstract:

In this paper, we are interested in the problem of finding similar images in a large database. For this purpose, we propose a new algorithm based on a combination of the 2-D histogram intersection in the HSV space and statistical moments. The proposed histogram is based on a 3x3 window and not only on the intensity of the pixel. This approach can overcome the drawback of the conventional 1-D histogram, which ignores the spatial distribution of pixels in the image, while the statistical moments are used to escape the effects of the discretisation of the color space, which is intrinsic to the use of histograms. We compare the performance of our new algorithm to various state-of-the-art methods and show that it has several advantages: it is fast, consumes little memory and requires no learning. To validate our results, we apply this algorithm to search for similar images in different image databases.
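
As a brief illustration of the similarity measure named in this abstract, the sketch below computes the classic histogram intersection between two normalized histograms. It is a generic formulation under stated assumptions, not the authors' 2-D, window-based variant: histograms are passed as flat lists of bin counts, and the function names are hypothetical.

```python
def normalize(hist):
    """Scale bin counts so they sum to 1 (an all-zero histogram is returned unchanged)."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


def histogram_intersection(h1, h2):
    """Return the sum of bin-wise minima of two normalized histograms.

    The score is 1.0 for identical distributions and 0.0 for disjoint ones.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(normalize(h1), normalize(h2)))
```

A 2-D histogram can be compared the same way after flattening its bins into one list, which is why the measure needs no training and little memory.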

Keywords: 2-D histogram, statistical moments, indexing, similarity distance, histograms intersection

Procedia PDF Downloads 421
24088 A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction

Authors: Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Abstract:

Representing information as tables is a compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from an OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to the corresponding cell in a dataframe, which can be stored as comma-separated values, a database, Excel, and multiple other usable formats.

Keywords: table extraction, optical character recognition, image processing, text extraction, morphological transformation

Procedia PDF Downloads 116
24087 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technologies; it collects data from various sensors, and those data are used in many fields. Data retrieval is one of the major issues, since there is a need to extract exactly the data required. In this paper, a large data set is processed using feature selection. Feature selection helps to choose the data that are actually needed to process and execute the task. The key value is what helps to point out the exact data available in the storage space. Here the available data is streamed, and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 304
24086 Discursive Construction of Strike in the Media Coverage of Academic Staff Union of Universities vs Federal Government of Nigeria Industrial Conflict of 2013

Authors: Samuel Alaba Akinwotu

Abstract:

Over the years, Nigeria’s educational system has greatly suffered from the menace of industrial conflict. The smooth running of the nation’s public educational institutions has been hampered by incessant strikes embarked upon by workers of these institutions. Even though industrial conflicts in Nigeria have enjoyed wide reportage in the media, there has been a dearth of critical examination of the language use that indexes the conflict’s discourse in the media. This study, which is driven by a combination of Critical Discourse Analysis (CDA) and Conceptual Metaphor (CM), examines the discursive and ideological features of language indexing the industrial conflict between the Academic Staff Union of Universities (ASUU) and the Federal Government of Nigeria (FGN) in 2013. It aims to identify and assess the conceptual and cognitive motivations of the stances expressed by the parties and the public, and the role of the media in the management and resolution of the conflict. For data, media reports and readers’ comments were purposively sampled from six print and online news sources (The Punch, This Day, Vanguard, The Nation, Osun Defender and AITonline) published between July and December 2013. The study provides further insight into industrial conflict and proves to be useful for the management and resolution of industrial conflicts, especially in our public educational institutions.

Keywords: industrial conflict, critical discourse analysis, conceptual metaphor, federal government of Nigeria, academic staff union of universities

Procedia PDF Downloads 108
24085 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA), which may allow academic institutions to better understand learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitations before jumping on the Big Data bandwagon.

Keywords: big data, learning analytics, analytics, big data in education, Hadoop

Procedia PDF Downloads 380
24084 A Survey of Response Generation of Dialogue Systems

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

An essential task in the field of artificial intelligence is to allow computers to interact with people through natural language. Therefore, research areas such as virtual assistants and dialogue systems have received widespread attention from industry and academia. Response generation plays a crucial role in dialogue systems, so to push forward the research on this topic, this paper surveys various methods for response generation. We sort these methods into three categories. The first includes finite state machine methods, framework methods, and instance methods. The second contains full-text indexing methods, ontology methods, vast knowledge base methods, and some other methods. The third covers retrieval methods and generative methods. We also discuss some hybrid methods based on knowledge and deep learning. We compare their advantages and disadvantages and point out in which ways these studies can be improved further. Our discussion covers some studies published in leading conferences such as IJCAI and AAAI in recent years.

Keywords: deep learning, generative, knowledge, response generation, retrieval

Procedia PDF Downloads 103
24083 Analysis of Big Data

Authors: Sandeep Sharma, Sarabjit Singh

Abstract:

Given user demand and the growth of large volumes of free data, storage solutions are finding it increasingly challenging to protect, store and retrieve data. The day is not far off when storage companies and organizations will start saying 'no' to storing our valuable data, or will start charging a huge amount for its storage and protection. On the other hand, environmental concerns such as the threat of global warming make it challenging to establish and maintain new data warehouses and data centers. The challenge of small data is over; the big challenge now is how to manage the exponential growth of data. In this paper, we analyze the growth trend of big data and its future implications. We also focus on the impact of unstructured data on various concerns and suggest some possible remedies to streamline big data.

Keywords: big data, unstructured data, volume, variety, velocity

Procedia PDF Downloads 510
24082 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation data discovery algorithms expressed as formal formulas, which can find data that is inconsistent with the conditional attribute dependencies in all target columns, whether the data is structured (SQL) or unstructured (NoSQL). It then gives six data cleaning methods based on these algorithms.
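
The idea of discovering data that violates an attribute dependency can be sketched as follows. This is a minimal illustration, assuming a simple functional dependency of the form "these attributes determine that attribute" over rows stored as dictionaries; the function name and the data layout are hypothetical and do not come from the paper, whose algorithms are not reproduced here.

```python
from collections import defaultdict


def find_fd_violations(rows, lhs, rhs):
    """Find groups of rows that violate the dependency lhs -> rhs.

    rows: iterable of dicts (one per record)
    lhs:  list of determining attribute names
    rhs:  name of the dependent attribute
    Returns a dict mapping each offending lhs value tuple to the set of
    conflicting rhs values observed for it.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        seen[key].add(row[rhs])
    # A dependency holds only if each key maps to exactly one rhs value.
    return {key: values for key, values in seen.items() if len(values) > 1}
```

Because the check only needs attribute lookups on each record, the same approach applies to rows read from a SQL table or to documents from a NoSQL store once they are loaded as dictionaries.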

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 530
24081 Citation Analysis of New Zealand Court Decisions

Authors: Tobias Milz, L. Macpherson, Varvara Vetrova

Abstract:

The law is a fundamental pillar of human societies as it shapes, controls and governs how humans conduct business, behave and interact with each other. Recent advances in computer-assisted technologies such as NLP, data science and AI are creating opportunities to support the practice, research and study of this pervasive domain. It is therefore not surprising that there has been an increase in investments into supporting technologies for the legal industry (also known as “legal tech” or “law tech”) over the last decade. A sub-discipline of particular appeal is concerned with assisted legal research. Supporting law researchers and practitioners in retrieving information from the vast amount of ever-growing legal documentation is of natural interest to the legal research community. One tool that has been in use for this purpose since the early nineteenth century is legal citation indexing. Among other use cases, citation indexes provided an effective means to discover new precedent cases. Nowadays, computer-assisted network analysis tools can allow for new and more efficient ways to reveal the “hidden” information that is conveyed through citation behavior. Unfortunately, access to openly available legal data is still lacking in New Zealand, and access to such networks is only commercially available via providers such as LexisNexis. Consequently, there is a need to create, analyze and provide a legal citation network with sufficient data to support legal research tasks. This paper describes the development and analysis of a legal citation network for New Zealand containing over 300,000 decisions from 125 different courts of all areas of law and jurisdiction. Using Python, the authors assembled web crawlers, scrapers and an OCR pipeline to collect and convert court decisions from openly available sources such as NZLII into uniform and machine-readable text. This facilitated the use of regular expressions to identify references to other court decisions from within the decision text.
The data was then imported into a graph-based database (Neo4j) with the courts and their respective cases represented as nodes and the extracted citations as links. Furthermore, additional links between courts of connected cases were added to indicate an indirect citation between the courts. Neo4j, as a graph-based database, allows efficient querying and use of network algorithms such as PageRank to reveal the most influential/most cited courts and court decisions over time. This paper shows that the in-degree distribution of the New Zealand legal citation network resembles a power-law distribution, which indicates a possible scale-free behavior of the network. This is in line with findings of the respective citation networks of the U.S. Supreme Court, Austria and Germany. The authors of this paper provide the database as an openly available data source to support further legal research. The decision texts can be exported from the database to be used for NLP-related legal research, while the network can be used for in-depth analysis. For example, users of the database can specify the network algorithms and metrics to only include specific courts to filter the results to the area of law of interest.
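
The pipeline described above, regular expressions to find references followed by network analysis on the resulting links, can be sketched in a few lines. This is only an illustration under stated assumptions: the pattern below targets neutral citations of the form "[2015] NZSC 12" and is hypothetical, not the authors' expression; the in-memory dictionary stands in for the Neo4j database; and the function names are invented for this example.

```python
import re
from collections import Counter

# Assumed citation shape: "[year] COURT number". Hypothetical pattern,
# not the regular expression used by the authors.
CITATION_RE = re.compile(r"\[(\d{4})\]\s+([A-Z]{2,8})\s+(\d+)")


def extract_citations(text):
    """Return the normalized citations found in a decision text."""
    return [f"[{year}] {court} {number}"
            for year, court, number in CITATION_RE.findall(text)]


def in_degrees(decisions):
    """Count how many distinct decisions cite each decision.

    decisions: dict mapping a decision's own citation to its full text.
    Self-citations and repeated mentions within one text count once or not at all.
    """
    counts = Counter()
    for own_id, text in decisions.items():
        for cited in set(extract_citations(text)):
            if cited != own_id:
                counts[cited] += 1
    return counts
```

Sorting the resulting counts gives the in-degree distribution whose shape the paper compares to a power law; ranking algorithms such as PageRank would operate on the same set of links.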

Keywords: case citation network, citation analysis, network analysis, Neo4j

Procedia PDF Downloads 76
24080 Video Shot Detection and Key Frame Extraction Using Faber-Shauder DWT and SVD

Authors: Assma Azeroual, Karim Afdel, Mohamed El Hajji, Hassan Douzi

Abstract:

Key frame extraction methods select the most representative frames of a video, which can be used in different areas of video processing such as video retrieval, video summarization, and video indexing. In this paper, we present a novel approach for extracting key frames from video sequences. Each frame is characterized uniquely by its contours, which are represented by the dominant blocks. These dominant blocks are located on the contours and their nearby textures. When the video frames change noticeably, their dominant blocks change, and we can then extract a key frame. The dominant blocks of every frame are computed, and then feature vectors are extracted from the dominant-blocks image of each frame and arranged in a feature matrix. Singular Value Decomposition is used to calculate the ranks of sliding windows over those matrices. Finally, the computed ranks are traced, which allows us to extract the key frames of a video. Experimental results show that the proposed approach is robust against a large range of digital effects used during shot transition.

Keywords: FSDWT, key frame extraction, shot detection, singular value decomposition

Procedia PDF Downloads 355
24079 Assessment of Soil Quality Indicators in Rice Soils Under Rainfed Ecosystem

Authors: R. Kaleeswari

Abstract:

An investigation was carried out to assess the soil biological quality parameters in rice soils under rainfed conditions, to compare soil quality indexing methods, viz., principal component analysis, minimum data set and indicator scoring, and to develop soil quality indices for formulating soil and crop management strategies. Soil samples were collected and analyzed for soil biological properties by adopting standard procedures. Biological indicators were determined for soil quality assessment, viz., microbial biomass carbon and nitrogen (MBC and MBN), potentially mineralizable nitrogen (PMN), soil respiration and dehydrogenase activity. Among the methods of rice cultivation, organic nutrition, Integrated Nutrient Management (INM) and System of Rice Intensification (SRI) registered higher values of MBC, MBN and PMN. Mechanical and conventional rice cultivation registered lower values of biological quality indicators. Organic nutrient management and INM enhanced the soil respiration rate. SRI and aerobic rice cultivation methods increased the rate of soil respiration, while conventional and mechanical rice farming lowered the soil respiration rate. Dehydrogenase activity (DHA) was higher in soils under organic nutrition and INM. SRI and aerobic rice cultivation enhanced the DHA, while conventional and mechanical rice cultivation methods reduced DHA. The microbial biomass carbon (MBC) of the rice soils varied from 65 to 244 mg kg-1. Among the nutrient management practices, INM registered the highest available microbial biomass carbon of 285 mg kg-1. Potentially mineralizable N content of the rice soils varied from 20.3 to 56.8 mg kg-1. Aerobic rice farming registered the highest potentially mineralizable N of 78.9 mg kg-1. The soil respiration rate of the rice soils varied from 60 to 125 µg CO2 g-1. Among the nutrient management practices, INM registered the highest soil respiration rate of 129 µg CO2 g-1. The dehydrogenase activity of the rice soils varied from 38.3 to 135.3 µg TPF g-1 day-1. The SRI method of rice cultivation registered the highest dehydrogenase activity of 160.2 µg TPF g-1 day-1. Soil variables from each PC were considered for the minimum soil data set (MDS). Principal component analysis (PCA) was used to select the representative soil quality indicators. In intensive rice cultivating regions, soil quality indicators were selected based on factor loading value and contribution percentage value using PCA. Variables having significant differences within production systems were used for the preparation of the MDS.

Keywords: soil quality, rice, biological properties, PCA analysis

Procedia PDF Downloads 64
24078 Morphological Analysis of Manipuri Language: Wahei-Neinarol

Authors: Y. Bablu Singh, B. S. Purkayashtha, Chungkham Yashawanta Singh

Abstract:

Morphological analysis forms the basic foundation of NLP applications, including syntax parsing, Machine Translation (MT), Information Retrieval (IR) and automatic indexing, in all languages. It is a field of linguistics that can provide valuable information for computer-based linguistic tasks such as lemmatization and studies of the internal structure of words. Computational morphology is the application of morphological rules in the field of computational linguistics. It is an emerging area in AI that studies the structure of words, which are formed by combining smaller units of linguistic information called morphemes, the building blocks of words. Morphological analysis provides information about the semantic and syntactic roles of words in a sentence. It analyzes Manipuri word forms and produces the grammatical information associated with the words. The Morphological Analyzer for Manipuri has been tested on 3500 Manipuri words in Shakti Standard Format (SSF) using Meitei Mayek as the source; an accuracy of 80% was obtained on a manual check.

Keywords: morphological analysis, machine translation, computational morphology, information retrieval, SSF

Procedia PDF Downloads 299
24077 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from it.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 364
24076 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating the comparison and integration of JSON data with XML data and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.
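
One way such a check could look in practice is sketched below, using only the Python standard library. It is an illustrative approach, not the method from the paper: both documents are flattened into path-to-text mappings and compared, and it assumes the JSON has a single top-level key matching the XML root, only nested objects with scalar leaves, no arrays or repeated elements, and no XML attributes. All function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def flatten_json(value, prefix=""):
    """Map each scalar leaf of a nested JSON object to a '/a/b' style path."""
    if isinstance(value, dict):
        out = {}
        for key, child in value.items():
            out.update(flatten_json(child, f"{prefix}/{key}"))
        return out
    return {prefix: str(value)}


def flatten_xml(element, prefix=""):
    """Map each leaf element's text to the same '/a/b' style path."""
    path = f"{prefix}/{element.tag}"
    children = list(element)
    if not children:
        return {path: (element.text or "").strip()}
    out = {}
    for child in children:
        out.update(flatten_xml(child, path))
    return out


def diff_json_xml(json_text, xml_text):
    """Return {path: (json_value, xml_value)} for every path that differs."""
    left = flatten_json(json.loads(json_text))
    right = flatten_xml(ET.fromstring(xml_text))
    return {
        path: (left.get(path), right.get(path))
        for path in left.keys() | right.keys()
        if left.get(path) != right.get(path)
    }
```

An empty result means the two documents carry the same leaf values under the stated assumptions. Values are compared as strings, so type-sensitive cases such as JSON `true` versus XML `true` would need an explicit normalization step.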

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 83
24075 Context-Aware Point-Of-Interests Recommender Systems Using Integrated Sentiment and Network Analysis

Authors: Ho Yeon Park, Kyoung-Jae Kim

Abstract:

Recently, users’ interest in location-based social network services has increased with advances in social web and location-based technologies. It may be easier to recommend preferred items if we can use a user’s preference, context and social network information simultaneously. In this study, we propose a context-aware POI (point-of-interest) recommender system using location-based network analysis and sentiment analysis, which considers context, social network information and the user’s implicit preference score. The proposed system consists of three sub-modules and an integrated recommendation system that combines them. First, we will develop a recommendation module based on network analysis. This module combines social network analysis and cluster-indexing collaborative filtering. Next, this study develops a recommendation module using social singular value decomposition (SVD) and implicit SVD. This module estimates preference scores from the frequency of a user’s POI visits, using social and implicit SVD to reflect implicit feedback in collaborative filtering. Third, this study will propose a recommendation module using opinion mining and emotional analysis of data such as reviews of POIs extracted from location-based social networks. Finally, we will develop an integration algorithm that combines the results of the three recommendation modules proposed in this research. Experimental results show the usefulness of the proposed model in terms of recommendation performance.

Keywords: sentiment analysis, network analysis, recommender systems, point-of-interests, business analytics

Procedia PDF Downloads 219
24074 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflect different aspects of people, events and activities. Data generated by various systems differ in form, and their sources differ because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need to be fused together. Data from different sources, collected in different ways, raise several issues which need to be resolved. Problems of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data and space data. Metadata is needed so that it can be referred to and read when an application needs to access, manipulate and display the data. Uniform metadata management ensures the effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 347
24073 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays, given the human role in the rapid growth of data, methods such as data mining for extracting knowledge are unavoidable. One issue in data mining is the inherent distribution of the data: the databases that create or receive such data usually belong to corporate or non-corporate entities that do not give their information freely to others. Yet there is no guarantee that someone can mine particular data without intruding on the owner's privacy. Sending data and then gathering them with vertical or horizontal software depends on the type of privacy preservation used and is carried out to improve data privacy. This study attempts a comprehensive comparison of data-preserving methods; general methods such as random data and coding are also examined, along with the strengths and weaknesses of each.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 487
24072 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR), which becomes applicable on 25 May 2018, will create a new legal framework for the protection of personal data in the European Union. Article 20 of the GDPR introduces a right to data portability. This right allows data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating the transfer of personal data between IT environments (e.g. applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 439
24071 Enterprise Information Portal Features: Results of Content Analysis Literature Review

Authors: Michal Krčál

Abstract:

Since their introduction in the 1990s, Enterprise Information Portals (EIPs) have been investigated from different perspectives (e.g. project management, technology acceptance, IS success). However, no systematic literature review has been produced to systematize both the research efforts and the technology itself. This paper reports the first results of an extensive systematic literature review focused on research into EIPs and its categorization; specifically, it reports a conceptual model of EIP features. The previous attempt to categorize EIP features was published in 2002. For the purpose of the literature review, the content of 89 articles was analyzed in order to identify and categorize features of EIPs. The methodology of the literature review was as follows. First, search queries were run in major indexing databases (Web of Science and SCOPUS). The query results were then assessed for their usability for the goal of the study. Next, the full texts were coded in Atlas.ti according to a previously established coding scheme. Finally, the codes were categorized and the conceptual model of EIP features was created.

Keywords: enterprise information portal, content analysis, features, systematic literature review

Procedia PDF Downloads 268
24070 Finding Related Scientific Documents Using Formal Concept Analysis

Authors: Nadeem Akhtar, Hira Javed

Abstract:

An important aspect of research is the literature survey. The availability of a large amount of literature across different domains creates a need for optimized systems that provide relevant literature to researchers. We propose a keyword-based search system for text documents. This experimental approach gives the document corpus a hierarchical structure. The documents are labelled with keywords using KEA (Keyword Extraction Algorithm) and are automatically organized in a lattice structure using Formal Concept Analysis (FCA). This groups semantically related documents together. The hierarchical structure, based on keywords, returns only those documents that contain exactly the requested keywords. This approach opens doors for multi-domain research: documents from multiple domains that are indexed by similar keywords are grouped together, and a hierarchical relationship between keywords is obtained. To demonstrate the effectiveness of the approach, we carried out the experiment and evaluation on the SemEval-2010 dataset. The results indicate that the presented method is considerably successful in indexing scientific papers.
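As a rough illustration of the lattice idea described in this abstract, the sketch below enumerates the formal concepts of a tiny document-keyword context. It is not the authors' implementation: the documents, keywords and function names are hypothetical, the keywords are assigned by hand rather than extracted with KEA, and the brute-force enumeration over document subsets is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical toy context: documents (objects) labelled with keywords (attributes).
context = {
    "doc1": {"lattice", "indexing"},
    "doc2": {"lattice", "clustering"},
    "doc3": {"lattice", "indexing", "retrieval"},
}


def common_keywords(docs, context):
    """Keywords shared by every document in docs (all keywords if docs is empty)."""
    shared = set().union(*context.values())
    for doc in docs:
        shared &= context[doc]
    return shared


def matching_docs(keywords, context):
    """Documents whose keyword set contains every keyword in keywords."""
    return {doc for doc, kws in context.items() if keywords <= kws}


def formal_concepts(context):
    """Return all (extent, intent) pairs by closing every subset of documents."""
    concepts = set()
    docs = sorted(context)
    for size in range(len(docs) + 1):
        for subset in combinations(docs, size):
            intent = common_keywords(subset, context)
            extent = matching_docs(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts(context)
# Print concepts from the most specific (fewest documents) to the most general.
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each printed pair is one node of the concept lattice: a set of documents together with exactly the keywords they all share. Ordering the nodes by inclusion of their document sets yields the hierarchy, and a keyword query such as `matching_docs({"indexing"}, context)` returns only the documents that carry every requested keyword.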

Keywords: formal concept analysis, keyword extraction algorithm, scientific documents, lattice

Procedia PDF Downloads 302
24069 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 364
24068 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their adversaries. Among the many areas of Big Data usage, logistics has a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 608
24067 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more varied information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it easy to test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, collecting data with such a board is difficult in many ways, particularly for users who are not computer programmers or who are using the board for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 342