Search results for: Multidimensional Sequence Data
7090 Data Mining for Cancer Management in Egypt Case Study: Childhood Acute Lymphoblastic Leukemia
Authors: Nevine M. Labib, Michael N. Malek
Abstract:
Data Mining aims at discovering knowledge out of data and presenting it in a form that is easily comprehensible to humans. One of the useful applications in Egypt is cancer management, especially the management of Acute Lymphoblastic Leukemia (ALL), which is the most common type of cancer in children. This paper discusses the process of designing a prototype that can help in the management of childhood ALL, which has great significance in the health care field. Besides, it has a social impact on decreasing the rate of infection in children in Egypt. It also provides valuable information about the distribution and segmentation of ALL in Egypt, which may be linked to the possible risk factors. Undirected Knowledge Discovery is used since, in the case of this research project, there is no target field as the data provided is mainly subjective. This is done in order to quantify the subjective variables. Therefore, the computer will be asked to identify significant patterns in the provided medical data about ALL. This may be achieved through collecting the data necessary for the system, determining the data mining technique to be used for the system, and choosing the most suitable implementation tool for the domain. The research makes use of a data mining tool, Clementine, so as to apply the Decision Trees technique. We feed it with data extracted from real-life cases taken from specialized Cancer Institutes. Relevant medical case details such as patient medical history and diagnosis are analyzed, classified, and clustered in order to improve the disease management.
Keywords: Data Mining, Decision Trees, Knowledge Discovery, Leukemia.
Downloads 2215
7089 A Data Warehouse System to Help Assist Breast Cancer Screening in Diagnosis, Education and Research
Authors: Souâd Demigha
Abstract:
Early detection of breast cancer is considered a major public health issue. Breast cancer screening is not generalized to the entire population due to a lack of resources, staff and appropriate tools. Systematic screening can result in a volume of data which cannot be managed by present computer architecture, either in terms of storage capabilities or in terms of exploitation tools. We propose in this paper to design and develop a data warehouse system in radiology-senology (DWRS). The aim of such a system is, on the one hand, to support this important volume of information coming from multiple sources of data and images and, on the other hand, to help assist breast cancer screening in diagnosis, education and research.
Keywords: Breast cancer screening, data warehouse, diagnosis, education, research.
Downloads 1715
7088 Data Security in a DApp Twitter Alike on Web 3.0 With Blockchain Based Technology
Authors: Vishal Awasthi, Tanya Soni, Vigya Awasthi, Swati Singh, Shivali Verma
Abstract:
There is a growing demand for a network that grants a high level of data security and confidentiality. For this reason, the semantic web was introduced, which allows data to be shared and reused across applications while safeguarding users' privacy and giving users back control of their data. The earlier Web 1.0 and Web 2.0 versions were built on client-server architecture, in which there was the risk of data theft and unconsented sale of user data. A decentralized version, known as Web 3.0, that is mostly built on blockchain technology was introduced to resolve these issues. Recent research on blockchain technology deals with privacy, security, transparency, and innovation of decentralized applications (DApps), e.g. a Twitter clone or a WhatsApp clone. In this paper, the Twitter Alike built on the Ethereum blockchain will replace traditional techniques with improved latency, throughput, and data ownership. The central principle of this DApp is a smart contract implemented using Solidity, which is an object-oriented, high-level language. Consequently, this will provide better quality of service, high data security, and integrity for both present and future internet technologies.
Keywords: Blockchain, DApps, Ethereum, Semantic Web, Smart Contract, Solidity.
Downloads 332
7087 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study
Authors: Faisal Aburub, Wael Hadi
Abstract:
Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships, or to assign unseen objects to one of the predefined groups. In this paper, we aim to investigate four well-known data mining algorithms in order to predict groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed the other algorithms in terms of classification accuracy, precision and F1 evaluation measures using the datasets of groundwater areas that were collected from the Jordanian Ministry of Water and Irrigation.
Keywords: Classification, data mining, evaluation measures, groundwater.
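The evaluation measures named in this abstract (accuracy, precision, F1) can be illustrated with a minimal sketch. The function name `binary_metrics` and the sample labels are hypothetical and are not taken from the paper; the code only shows the standard definitions for a two-class problem.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = groundwater area, 0 = not a groundwater area.
metrics = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

Comparing classifiers on all three measures, as the abstract reports, guards against a model that scores well on accuracy alone because one class dominates the data.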
Downloads 2595
7086 Data Mining on the Router Logs for Statistical Application Classification
Authors: M. Rahmati, S.M. Mirzababaei
Abstract:
With the advance of information technology in the new era, the use of the Internet to access data resources has steadily increased and huge amounts of data have become accessible in various forms. Network providers and agencies seek to prevent electronic attacks that may be harmful or may be related to terrorist applications. This has led the authorities to undertake a variety of methods to protect special regions from harmful data. One of the most important approaches is to use a firewall in the network facilities. The main objective of a firewall is to stop the transfer of suspicious packets in several ways. However, because of its blind packet stopping, high processing power requirements and high price, some providers are reluctant to use a firewall. In this paper we propose a method to find a discriminant function that distinguishes between usual packets and harmful ones by statistical processing of the network router logs. By discriminating these data, an administrator may take appropriate action against the user. This method is very fast and can be used simply in conjunction with the Internet routers.
Keywords: Data Mining, Firewall, Optimization, Packet classification, Statistical Pattern Recognition.
Downloads 1655
7085 The Knowledge Representation of the Genetic Regulatory Networks Based on Ontology
Authors: Ines Hamdi, Mohamed Ben Ahmed
Abstract:
Understanding biological behavior and phenomena at the system level requires elements such as gene sequence, protein structure, gene functions and metabolic pathways. Challenging problems are representing, learning and reasoning about these biochemical reactions, gene and protein structure, the relation between genotype and phenotype, and the expression system underlying those interactions. The goal of our work is to understand the behaviors of the interaction networks and to model their evolution in time and in space. We propose in this study an ontological meta-model for the knowledge representation of the genetic regulatory networks. Ontology in artificial intelligence means the fundamental categories and relations that provide a framework for knowledge models. Domain ontologies are now commonly used to enable heterogeneous information resources, such as knowledge-based systems, to communicate with each other. The interest of our model is to represent the spatial, temporal and spatio-temporal knowledge. We validated our propositions in the genetic regulatory network of the Arabidopsis thaliana flower.
Keywords: Ontological model, spatio-temporal modeling, Genetic Regulatory Networks (GRNs), knowledge representation.
Downloads 1485
7084 Improvement of Data Transfer over Simple Object Access Protocol (SOAP)
Authors: Khaled Ahmed Kadouh, Kamal Ali Albashiri
Abstract:
This paper presents an algorithm designed to improve data transfer over the Simple Object Access Protocol (SOAP). The aim of this work is to establish whether using SOAP in exchanging XML messages has any added advantages or not. The results showed that XML messages without SOAP take longer and consume more memory, especially with binary data.
Keywords: JAX-WS, SMTP, SOAP, Web service, XML.
Downloads 2123
7083 Numerical Simulations of Flood and Inundation in Jobaru River Basin Using Laser Profiler Data
Authors: Hiroto Nakashima, Toshihiro Morita, Koichiro Ohgushi
Abstract:
Laser Profiler (LP) data from aerial laser surveys have been increasingly used as topographical inputs to numerical simulations of flooding and inundation in river basins. LP data has great potential for reproducing topography, but its effective usage has not yet been fully established. In this study, flooding and inundation are simulated numerically using LP data for the Jobaru River basin of Japan’s Saga Plain. The analysis shows that the topography is reproduced satisfactorily in the computational domain with urban and agricultural areas requiring different grid sizes. A 2-D numerical simulation shows that flood flow behavior changes as grid size is varied.
Keywords: LP data, numerical simulation, topological analysis, mesh size.
Downloads 1536
7082 Channels Splitting Strategy for Optical Local Area Networks of Passive Star Topology
Authors: Peristera Baziana
Abstract:
In this paper, we present a network configuration for WDM LANs of passive star topology that assumes that the set of data WDM channels is split into two separate sets of channels, with different access rights over them. In particular, a synchronous transmission WDMA access algorithm is adopted in order to increase the probability of successful transmission over the data channels and consequently to reduce the probability that data packet transmissions are cancelled to avoid collisions on the data channels. Thus, a control pre-transmission access scheme is followed over a separate control channel. An analytical Markovian model is studied and the average throughput is mathematically derived. The performance is studied for several numbers of data channels and various values of control phase duration.
Keywords: Access algorithm, channels division, collisions avoidance, wavelength division multiplexing.
Downloads 1014
7081 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines
Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma
Abstract:
Useful information has been extracted from the road accident data in the United Kingdom (UK), using a data analytics method, for avoiding possible accidents in rural and urban areas. This analysis makes use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of the UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since data is expected to grow continuously over a period of time, this work primarily proposes a new framework model which can be trained and adapt itself to new data and make accurate predictions. This work also throws some light on the use of the SVM methodology for text classifiers from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of the SVM methodology appropriate for this kind of research work.
Keywords: Road accident, machine learning, support vector machines.
Downloads 1129
7080 Waste to Biofuel by Torrefaction Technology
Authors: Jyh-Cherng Chen, Yu-Zen Lin, Wei-Zhi Chen
Abstract:
Torrefaction is one of the waste-to-energy (WTE) technologies recently being developed in Taiwan; it can effectively reduce the moisture and impurities and increase the energy density of biowaste. To understand the torrefaction characteristics of different biowaste and the influences of different torrefaction conditions, four typical biowaste were selected to carry out the torrefaction experiments. The physical and chemical properties of different biowaste prior to and after torrefaction were analyzed and compared. Experimental results show that the contents of elemental carbon and caloric value of the four biowaste were significantly increased after torrefaction. The increase of combustible and caloric value in bamboo was the greatest among the four biowaste. The caloric value of bamboo can be increased from 1526 kcal/kg to 6104 kcal/kg after torrefaction at 300 °C for 1 hour. The caloric value of torrefied bamboo was almost four times that of the original. The increase of elemental carbon content in wood was the greatest (from 41.03% to 75.24%), and the next was bamboo (from 47.07% to 74.63%). The major parameters which affected the caloric value of torrefied biowaste followed the sequence of biowaste kinds, torrefaction time, and torrefaction temperature. The optimal torrefaction conditions of the experiments were bamboo torrefied at 300 °C for 3 hours, and the corresponding caloric value of torrefied bamboo was 5953 kcal/kg. This caloric value is similar to that of brown coal or bituminous coal.
Keywords: Torrefaction, waste to energy, calorie, biofuel.
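The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Caloric value of bamboo before and after torrefaction (kcal/kg), as quoted.
before_kcal, after_kcal = 1526, 6104
ratio = after_kcal / before_kcal  # how many times the original value

# Elemental carbon content (%) before and after torrefaction, as quoted.
carbon = {"wood": (41.03, 75.24), "bamboo": (47.07, 74.63)}

# Gain in percentage points, rounded to the precision of the source figures.
gains = {name: round(after - before, 2)
         for name, (before, after) in carbon.items()}

print(f"bamboo caloric ratio: {ratio:.2f}x")
for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points of elemental carbon")
```

The gains confirm the ordering reported in the abstract: wood shows the larger increase in elemental carbon, followed by bamboo.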
Downloads 2038
7079 A Testbed for the Experiments Performed in Missing Value Treatments
Authors: Dias de J. C. Lilian, Lobato M. F. Fábio, de Santana L. Ádamo
Abstract:
The occurrence of missing values in databases is a serious problem for Data Mining tasks, responsible for degrading data quality and accuracy of analyses. In this context, the area has shown a lack of standardization for experiments to treat missing values, which makes evaluation across different studies difficult due to the absence of common parameters. This paper proposes a testbed intended to facilitate the implementation of experiments and provide unbiased parameters using available datasets and suitable performance metrics in order to optimize the evaluation and comparison of state-of-the-art missing value treatments.
Keywords: Data imputation, data mining, missing values treatment, testbed.
Downloads 1513
7078 Data-Reusing Adaptive Filtering Algorithms with Adaptive Error Constraint
Authors: Young-Seok Choi
Abstract:
We present a family of data-reusing and affine projection algorithms. For identification of a noisy linear finite impulse response channel, partial knowledge of the channel, especially of the noise, can be used to improve the performance of the adaptive filter. Motivated by this fact, the proposed scheme incorporates an estimate of the noise knowledge. A constraint, called the adaptive noise constraint, estimates the unknown information about the noise. By imposing this constraint on a cost function of data-reusing and affine projection algorithms, a cost function based on the adaptive noise constraint and Lagrange multiplier is defined. Minimizing the new cost function leads to the adaptive noise constrained (ANC) data-reusing and affine projection algorithms. Experimental results comparing the proposed schemes to standard data-reusing and affine projection algorithms clearly indicate their superior performance.
Keywords: Data-reusing, affine projection algorithm, error constraint, system identification.
Downloads 1619
7077 Handling Mobility using Virtual Grid in Static Wireless Sensor Networks
Authors: T.P. Sharma
Abstract:
Querying a data source and routing data towards the sink becomes a serious challenge in static wireless sensor networks if the sink and/or data source are mobile. Many times the event to be observed either moves or spreads across a wide area, making maintenance of a continuous path between source and sink a challenge. Also, the sink can move while a query is being issued or data is on its way towards the sink. In this paper, we extend our already proposed Grid Based Data Dissemination (GBDD) scheme, which is a virtual grid based topology management scheme restricting the impact of movement of sink(s) and event(s) to some specific cells of a grid. This obviates the need for frequent path modifications and hence maintains a continuous flow of data while minimizing the network energy consumption. Simulation experiments show significant improvements in network energy savings and in the average delay for a packet to reach the sink.
Keywords: Mobility in WSNs, virtual grid, GBDD, clustering.
Downloads 1550
7076 Molecular Characterization of Free Radicals Decomposing Genes on Plant Developmental Stages
Authors: R. Haddad, K. Morris, V. Buchanan-Wollaston
Abstract:
Biochemical and molecular analysis of some antioxidant enzyme genes revealed different levels of gene expression in oilseed (Brassica napus). For molecular and biochemical analysis, leaf tissues were harvested from plants at eight different developmental stages, from young to senescence. The levels of total protein and chlorophyll increased during the maturity stages of the plant, while they decreased during the last stages of plant growth. Structural analysis (nucleotide and deduced amino acid sequence, and phylogenetic tree) of a complementary DNA revealed a high level of similarity for a family of Catalase genes. The expression of the gene encoded by different Catalase isoforms was assessed during different plant growth phases. No significant difference between samples was observed when Catalase activity was statistically analyzed at different developmental stages. EST analysis exhibited different transcript levels for a number of other relevant antioxidant genes (different isoforms of SOD and glutathione). The high level of transcription of these genes at senescence stages indicated that these genes are senescence-induced genes.
Keywords: Biochemical analysis, Oilseed, Expression pattern, Growth phases
Downloads 1550
7075 Experimental Modal Analysis and Model Validation of Antenna Structures
Authors: B.R. Potgieter, G. Venter
Abstract:
Numerical design optimization is a powerful tool that can be used by engineers during any stage of the design process. There are many different applications for structural optimization. A specific application that will be discussed in the following paper is experimental data matching. Data obtained through tests on a physical structure will be matched with data from a numerical model of that same structure. The data of interest will be the dynamic characteristics of an antenna structure focusing on the mode shapes and modal frequencies. The structure used was a scaled and simplified model of the Karoo Array Telescope-7 (KAT-7) antenna structure. This kind of data matching is a complex and difficult task. This paper discusses how optimization can assist an engineer during the process of correlating a finite element model with vibration test data.
Keywords: Finite Element Model (FEM), Karoo Array Telescope (KAT-7), modal frequencies, mode shapes, optimization, shape optimization, size optimization, vibration tests
Downloads 1852
7074 Compressed Suffix Arrays to Self-Indexes Based on Partitioned Elias-Fano
Abstract:
A practical and simple self-indexing data structure, Partitioned Elias-Fano (PEF) - Compressed Suffix Arrays (CSA), is built in linear time for the CSA based on PEF indexes. Moreover, the PEF-CSA is compared with two classical compressed indexing methods, the Ferragina and Manzini implementation (FMI) and Sad-CSA, on files of different types and sizes from Pizza & Chili. The PEF-CSA performs better on the existing data in terms of compression ratio, count time, and locate time, except for evenly distributed data such as protein data. The experiments show that the distribution of φ matters more than the alphabet size for the compression ratio. An unevenly distributed φ yields better compression, and the larger the number of hits, the longer the count and locate times.
Keywords: Compressed suffix array, self-indexing, partitioned Elias-Fano, PEF-CSA.
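For readers unfamiliar with the underlying encoding, the following is a minimal sketch of plain (non-partitioned) Elias-Fano coding of a sorted integer sequence. It is not the PEF-CSA from the paper: the partitioned variant additionally splits the sequence into chunks, and the function names and sample values here are illustrative assumptions. Bits are kept in Python lists for readability rather than packed.

```python
import math


def elias_fano_encode(values, universe):
    """Encode sorted non-negative ints (< universe) as low bits + unary highs."""
    n = len(values)
    # Number of low bits per element: floor(log2(universe / n)), at least 0.
    low_bits = max(0, math.floor(math.log2(universe / n))) if n else 0
    mask = (1 << low_bits) - 1
    lows = [v & mask for v in values]

    # High parts in unary: a 1 per element, a 0 each time the bucket advances.
    highs = []
    bucket = 0
    for v in values:
        high = v >> low_bits
        while bucket < high:
            highs.append(0)
            bucket += 1
        highs.append(1)
    return low_bits, lows, highs


def elias_fano_decode(low_bits, lows, highs):
    """Rebuild the original sorted sequence from the two components."""
    values = []
    bucket = 0
    index = 0
    for bit in highs:
        if bit:
            values.append((bucket << low_bits) | lows[index])
            index += 1
        else:
            bucket += 1
    return values


sample = [2, 3, 5, 7, 11, 13, 24]
encoded = elias_fano_encode(sample, universe=32)
decoded = elias_fano_decode(*encoded)
```

Because the high parts are stored as gaps between buckets, the cost of the encoding depends on how the values are distributed, which is in line with the abstract's remark that the distribution of φ matters more than the alphabet size.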
Downloads 1083
7073 A Decision Matrix for the Evaluation of Triplestores for Use in a Virtual Research Environment
Authors: Tristan O’Neill, Trina Myers, Jarrod Trevathan
Abstract:
The Tropical Data Hub (TDH) is a virtual research environment that provides researchers with an e-research infrastructure to congregate significant tropical data sets for data reuse, integration, searching, and correlation. However, researchers often require data and metadata synthesis across disciplines for cross-domain analyses and knowledge discovery. A triplestore offers a semantic layer to achieve a more intelligent method of search to support the synthesis requirements by automating latent linkages in the data and metadata. Presently, the benchmarks to aid the decision of which triplestore is best suited for use in an application environment like the TDH are limited to performance. This paper describes a new evaluation tool developed to analyze both features and performance. The tool comprises a weighted decision matrix to evaluate the interoperability, functionality, performance, and support availability of a range of integrated and native triplestores to rank them according to requirements of the TDH.
Keywords: Virtual research environment, Semantic Web, performance analysis, tropical data hub.
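A weighted decision matrix of the kind described in this abstract can be sketched as follows. The four criteria come from the abstract; the weights, the candidate names `store_a` and `store_b`, and their scores are hypothetical and do not reproduce the paper's evaluation.

```python
def rank_candidates(scores, weights):
    """Rank candidates by the weighted average of their criterion scores."""
    total_weight = sum(weights.values())
    ranked = []
    for name, criteria in scores.items():
        weighted = sum(weights[c] * criteria[c] for c in weights) / total_weight
        ranked.append((name, weighted))
    # Highest weighted score first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical weights reflecting how much each criterion matters.
weights = {"interoperability": 3, "functionality": 3,
           "performance": 2, "support": 2}

# Hypothetical scores on a 1-5 scale for two made-up triplestores.
scores = {
    "store_a": {"interoperability": 4, "functionality": 3,
                "performance": 5, "support": 2},
    "store_b": {"interoperability": 3, "functionality": 5,
                "performance": 3, "support": 4},
}

ranking = rank_candidates(scores, weights)
```

In this made-up example the faster candidate ranks second, which illustrates why the abstract argues that benchmarks limited to performance are not enough to choose a triplestore.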
Downloads 1704
7072 Real-time Performance Study of EPA Periodic Data Transmission
Authors: Liu Ning, Zhong Chongquan, Teng Hongfei
Abstract:
EPA (Ethernet for Plant Automation) resolves the nondeterministic problem of standard Ethernet and accomplishes real-time communication by means of micro-segment topology and a deterministic scheduling mechanism. This paper studies the real-time performance of EPA periodic data transmission from theoretical and experimental perspectives. By analyzing information transmission characteristics and the EPA deterministic scheduling mechanism, five indicators that can be used to specify the real-time performance of EPA periodic data transmission are presented and investigated: delivery time, time synchronization accuracy, data-sending time offset accuracy, utilization percentage of configured timeslice, and non-RTE bandwidth. On this basis, the test principles and test methods of the indicators are studied and some formulas for the real-time performance of an EPA system are derived. Furthermore, an experiment platform is developed to test the indicators of EPA periodic data transmission in a micro-segment. Based on the analysis and the experiment, methods to improve the real-time performance of EPA periodic data transmission are proposed, including optimizing network structure, studying a self-adaptive adjustment method of timeslice, and providing data-sending time offset accuracy for configuration.
Keywords: EPA system, Industrial Ethernet, Periodic data, Real-time performance
Downloads 1469
7071 Data Quality Enhancement with String Length Distribution
Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda
Abstract:
Recently, the volume of collectable manufacturing data has been increasing rapidly. On the other hand, mega recalls are becoming a serious social problem. Under such circumstances, there is an increasing need to prevent mega recalls by defect analysis, such as root cause analysis and anomaly detection, utilizing manufacturing data. However, the time needed to classify strings in manufacturing data by traditional methods is too long to meet the requirement of quick defect analysis. Therefore, we present the String Length Distribution Classification method (SLDC) to correctly classify strings in a short time. This method learns character features, especially string length distribution, from Product ID and Machine ID in the BOM and asset list. By applying the proposal to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, it can be estimated that the requirement of quick defect analysis can be fulfilled.
Keywords: Data quality, feature selection, probability distribution, string classification, string length.
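The idea of classifying strings by their length distribution can be illustrated with a small sketch. This is not the SLDC method itself, whose details are not given in the abstract: the functions, the smoothing floor, and the sample IDs below are all hypothetical, and only string length is used as a feature.

```python
from collections import Counter


def learn_length_distribution(samples):
    """Estimate P(length) from example strings of one class."""
    counts = Counter(len(s) for s in samples)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


def classify(value, distributions, floor=1e-6):
    """Pick the class whose learned distribution best explains len(value).

    Lengths never seen for a class get a small floor probability.
    """
    return max(distributions,
               key=lambda label: distributions[label].get(len(value), floor))


# Hypothetical training strings for two ID types.
distributions = {
    "product_id": learn_length_distribution(["P-00012", "P-00345", "P-99871"]),
    "machine_id": learn_length_distribution(["M01", "M17", "M22"]),
}

label_a = classify("P-12345", distributions)
label_b = classify("M99", distributions)
```

Looking up a length in a small table is cheap, which is consistent with the abstract's emphasis on reducing classification time, although the reported 80% reduction refers to the authors' method and data, not to this sketch.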
Downloads 1328
7070 Deterministic Random Number Generator Algorithm for Cryptosystem Keys
Authors: Adi A. Maaita, Hamza A. A. Al_Sewadi
Abstract:
One of the crucial parameters of digital cryptographic systems is the selection of the keys used and their distribution. The randomness of the keys has a strong impact on the system's security strength, since random keys are difficult for a cryptanalyst to predict, guess, reproduce, or discover. Therefore, adequate key randomness generation is still sought for the benefit of stronger cryptosystems. This paper suggests an algorithm designed to generate and test pseudorandom number sequences intended for cryptographic applications. This algorithm is based on mathematically manipulating publicly agreed-upon information exchanged between sender and receiver over a public channel. This information is used as a seed for performing some mathematical functions in order to generate a sequence of pseudorandom numbers that will be used for encryption/decryption purposes. This manipulation involves permutations and substitutions that fulfill Shannon's principle of "confusion and diffusion". ASCII code characters were utilized in the generation process instead of using bit strings initially, which adds more flexibility in testing different seed values. Finally, the obtained results indicate that keys would be difficult for attackers to guess.
Keywords: Cryptosystems, Information Security agreement, Key distribution, Random numbers.
Downloads 3432
7069 Efficient and Extensible Data Processing Framework in Ubiquitous Sensor Networks
Authors: Junghoon Lee, Gyung-Leen Park, Ho-Young Kwak, Cheol Min Kim
Abstract:
This paper presents the design and implements the prototype of an intelligent data processing framework in ubiquitous sensor networks. Much focus is put on how to handle the sensor data stream as well as the interoperability between the low-level sensor data and application clients. Our framework first addresses systematic middleware which mediates the interaction between the application layer and low-level sensors, for the sake of analyzing a great volume of sensor data by filtering and integrating to create value-added context information. Then, an agent-based architecture is proposed for real-time data distribution to efficiently forward a specific event to the appropriate application registered in the directory service via the open interface. The prototype implementation demonstrates that our framework can host a sophisticated application on the ubiquitous sensor network and that it can autonomously evolve to new middleware, taking advantage of promising technologies such as software agents, XML, cloud computing, and the like.
Keywords: sensor network, intelligent farm, middleware, event detection
Downloads 1357
7068 Eliciting and Confirming Data, Information, Knowledge and Wisdom in a Specialist Health Care Setting: The WICKED Method
Authors: S. Impey, D. Berry, S. Furtado, M. Galvin, L. Grogan, O. Hardiman, L. Hederman, M. Heverin, V. Wade, L. Douris, D. O'Sullivan, G. Stephens
Abstract:
Healthcare is a knowledge-rich environment. This knowledge, while valuable, is not always accessible outside the borders of individual clinics. This research aims to address part of this problem (at a study site) by constructing a maximal data set (knowledge artefact) for motor neurone disease (MND). This data set is proposed as an initial knowledge base for a concurrent project to develop an MND patient data platform. It represents the domain knowledge at the study site for the duration of the research (12 months). A knowledge elicitation method was also developed from the lessons learned during this process - the WICKED method. WICKED is an anagram of the words: eliciting and confirming data, information, knowledge, wisdom. But it is also a reference to the concept of wicked problems, which are complex and challenging, as is eliciting expert knowledge. The method was evaluated at a second site, and benefits and limitations were noted. Benefits include that the method provided a systematic way to manage data, information, knowledge and wisdom (DIKW) from various sources, including healthcare specialists and existing data sets. Limitations surrounded the time required and how the data set produced only represents DIKW known during the research period. Future work is underway to address these limitations.
Keywords: Healthcare, knowledge acquisition, maximal data sets, action design science.
Downloads 544
7067 Application of Multi-Dimensional Principal Component Analysis to Medical Data
Authors: Naoki Yamamoto, Jun Murakami, Chiharu Okuma, Yutaro Shigeto, Satoko Saito, Takashi Izumi, Nozomi Hayashida
Abstract:
Multi-dimensional principal component analysis (PCA) is the extension of PCA, which is widely used as a dimensionality reduction technique in multivariate data analysis, to handle multi-dimensional data. To calculate the PCA, the singular value decomposition (SVD) is commonly employed because of its numerical stability. The multi-dimensional PCA can be calculated by using the higher-order SVD (HOSVD), which was proposed by De Lathauwer et al., as in the case of ordinary PCA. In this paper, we apply the multi-dimensional PCA to multi-dimensional medical data including the functional independence measure (FIM) score, and describe the results of experimental analysis.
Keywords: multi-dimensional principal component analysis, higher-order SVD (HOSVD), functional independence measure (FIM), medical data, tensor decomposition
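The HOSVD step mentioned in this abstract can be sketched with NumPy. This is a minimal illustration under stated assumptions: the helper names `unfold`, `mode_product` and `hosvd` are ours, the tensor is random rather than medical data, and the mean-centering that a PCA would normally apply first is omitted.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` (J x I_mode) along the given mode."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding act as that mode's basis.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors


# Hypothetical 3-way data, e.g. (subjects x items x time points).
rng = np.random.default_rng(0)
data = rng.standard_normal((4, 3, 2))
core, factors = hosvd(data)

# Multiplying the core back by every factor recovers the original tensor.
rebuilt = core
for mode, u in enumerate(factors):
    rebuilt = mode_product(rebuilt, u, mode)
```

Keeping only the leading columns of each factor matrix would give a reduced representation per mode, which is the multi-dimensional analogue of keeping the leading principal components.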
Downloads 2502
7066 Procedure Model for Data-Driven Decision Support Regarding the Integration of Renewable Energies into Industrial Energy Management
Authors: M. Graus, K. Westhoff, X. Xu
Abstract:
Climate change causes a change in all aspects of society. While the expansion of renewable energies proceeds, industry could not be convinced based on general studies about the potential of demand side management to reinforce smart grid considerations in their operational business. In this article, a procedure model for a case-specific data-driven decision support for industrial energy management based on a holistic data analytics approach is presented. The model is executed on the example of the strategic decision problem of integrating renewable energies into industrial energy management. This question arises from considerations of changing the electricity contract model from a standard rate to volatile energy prices corresponding to the energy spot market, which is increasingly affected by renewable energies. The procedure model corresponds to a data analytics process consisting of a data model, analysis, simulation and optimization step. This procedure will help to quantify the potentials of sustainable production concepts based on the data from a factory. The model is validated with data from a printer in analogy to a simple production machine. The overall goal is to establish smart grid principles for industry via the transformation from knowledge-driven to data-driven decisions within manufacturing companies.
Keywords: Data analytics, green production, industrial energy management, optimization, renewable energies, simulation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1736
7065 Dynamic Data Partition Algorithm for a Parallel H.264 Encoder
Authors: Juntae Kim, Jaeyoung Park, Kyoungkun Lee, Jong Tae Kim
Abstract:
The H.264/AVC standard is a highly efficient video codec providing high-quality video at low bit rates. Because it employs advanced techniques, its computational complexity is high. This complexity is the major problem in implementing a real-time encoder and decoder. Parallelism is one approach to this problem, and it can be implemented on a multi-core system. We analyze macroblock-level parallelism, which preserves the bit rate while keeping processor concurrency high. In order to reduce the encoding time, a dynamic data partition based on macroblock regions is proposed. The data partition has advantages in load balancing and data communication overhead. Using the data partition, the encoder obtains more than 3.59x speed-up on a four-processor system. This work can be applied to other multimedia processing applications.
Keywords: H.264/AVC, video coding, thread-level parallelism, OpenMP, multimedia
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1795
7064 Developing Structured Sizing Systems for Manufacturing Ready-Made Garments of Indian Females Using Decision Tree-Based Data Mining
Authors: Hina Kausher, Sangita Srivastava
Abstract:
In India, there is no standard, systematic sizing approach for producing ready-made garments. Garment manufacturing companies use their own size tables, created by modifying international sizing charts for ready-made garments. The purpose of this study is to tabulate anthropometric data that cover the variety of figure proportions in both height and girth. 3,000 records were collected in an anthropometric survey of females between the ages of 16 and 80 years from some states of India, in order to produce a sizing system suitable for clothing manufacture and retailing. The data are used for the statistical analysis of body measurements and the formulation of sizing systems and body measurement tables. Factor analysis is used to filter the control body dimensions from the large number of variables. Decision tree-based data mining is used to cluster the data. The standard and structured sizing system can facilitate pattern grading and garment production. Moreover, it can exceed buying ratios and upgrade size allocations to retail segments.
Keywords: Anthropometric data, data mining, decision tree, garments manufacturing, ready-made garments, sizing systems.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 961
7063 XML Schema Automatic Matching Solution
Authors: Huynh Quyet Thang, Vo Sy Nam
Abstract:
Schema matching plays a key role in many different applications, such as schema integration, data integration, data warehousing, data transformation, E-commerce, peer-to-peer data management, ontology matching and integration, semantic Web, semantic query processing, etc. Manual matching is expensive and error-prone, so it is important to develop techniques to automate the schema matching process. In this paper, we present a solution for the XML schema automated matching problem which produces semantic mappings between corresponding schema elements of given source and target schemas. This solution contributes to solving the XML schema automated matching problem more comprehensively and efficiently. Our solution is based on combining linguistic similarity, data type compatibility and structural similarity of XML schema elements. After describing our solution, we present experimental results that demonstrate the effectiveness of this approach.
Keywords: XML Schema, Schema Matching, Semantic Matching, Automatic XML Schema Matching.
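The idea of combining linguistic similarity, data type compatibility and structural similarity can be sketched as a weighted score. Everything specific below is an assumption for illustration, not the method of the paper: the element representation (a dict with `name`, `type` and `children`), the string-ratio name measure, the small numeric-type table, the Jaccard overlap of child names, and the weights.

```python
from difflib import SequenceMatcher

# Hypothetical group of mutually convertible XSD types.
NUMERIC_TYPES = {"xs:int", "xs:integer", "xs:decimal", "xs:float", "xs:double"}


def linguistic_similarity(name_a, name_b):
    """Case-insensitive string similarity of two element names, in [0, 1]."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def type_compatibility(type_a, type_b):
    """1.0 for identical types, 0.5 for two numeric types, else 0.0."""
    if type_a == type_b:
        return 1.0
    if type_a in NUMERIC_TYPES and type_b in NUMERIC_TYPES:
        return 0.5
    return 0.0


def structural_similarity(children_a, children_b):
    """Jaccard overlap of child element names; two leaves count as equal."""
    a, b = set(children_a), set(children_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def combined_similarity(elem_a, elem_b, weights=(0.5, 0.2, 0.3)):
    """Weighted sum of the three measures (weights are an assumption)."""
    w_ling, w_type, w_struct = weights
    return (
        w_ling * linguistic_similarity(elem_a["name"], elem_b["name"])
        + w_type * type_compatibility(elem_a["type"], elem_b["type"])
        + w_struct * structural_similarity(elem_a["children"], elem_b["children"])
    )


def best_match(source_elem, target_elems):
    """Return the target element with the highest combined score."""
    return max(target_elems, key=lambda t: combined_similarity(source_elem, t))


# Hypothetical schema elements.
source = {"name": "CustomerName", "type": "xs:string", "children": []}
targets = [
    {"name": "custName", "type": "xs:string", "children": []},
    {"name": "orderTotal", "type": "xs:decimal", "children": []},
]
print(best_match(source, targets)["name"])
```

A real matcher would typically use richer linguistic resources and a deeper structural comparison; the sketch only shows how three independent scores can be merged into one ranking.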
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1831
7062 Mapping the Digital Landscape: An Analysis of Party Differences between Conventional and Digital Policy Positions
Authors: Daniel Schwarz, Jan Fivaz, Alessia Neuroni
Abstract:
Although digitization is a buzzword in almost every election campaign, the political parties leave voters largely in the dark about their specific positions on digital issues. In the run-up to the 2019 elections in Switzerland, the ‘Digitization Monitor’ project (DMP) was launched in order to change this situation. Within the framework of the DMP, all 4,736 candidates were surveyed about their digital policy positions and values. The DMP is designed as a digital policy supplement to the existing ‘smartvote’ voting advice application. This enabled a direct comparison of the digital policy attitudes according to the DMP with the topics of the ‘smartvote’ questionnaire which are comprehensive in content but mainly related to conventional policy areas. This paper’s main research goal is to analyze and visualize possible differences between conventional and digital policy areas in terms of response patterns between and within political parties. The analysis is based on dimensionality reduction methods (multidimensional scaling and principal component analysis) for the visualization of inter-party differences, and on standard deviation as a measure of variation for the evaluation of intra-party unity. The results reveal that digital issues show a lower degree of inter-party polarization compared to conventional policy areas. Thus, the parties have more common ground in issues on digitization than in conventional policy areas. In contrast, the study reveals a mixed picture regarding intra-party unity. Homogeneous parties show a lower degree of unity in digitization issues whereas parties with heterogeneous positions in conventional areas have more united positions in digital areas. All things considered, the findings are encouraging as less polarized conditions apply to the debate on digital development compared to conventional politics. 
In the future, it would be desirable for projects similar to the DMP to emerge in other countries to broaden the basis for conclusions.
Keywords: Comparison of political issue dimensions, digital awareness of candidates, digital policy space, party positions on digital issues.
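The two analysis steps named in the abstract above, dimensionality reduction for inter-party differences and standard deviation for intra-party unity, can be sketched with NumPy. This is an illustration under stated assumptions, not the study's code: the answer matrix, the party labels, and the function names `pca_2d` and `intra_party_spread` are all hypothetical, and the survey data are not reproduced here.

```python
import numpy as np


def pca_2d(points):
    """Project rows of `points` onto their first two principal components."""
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def intra_party_spread(responses, parties):
    """Mean per-question standard deviation within each party.

    A lower value means more united answers within the party.
    """
    labels = np.array(parties)
    spread = {}
    for party in sorted(set(parties)):
        rows = responses[labels == party]
        spread[party] = float(rows.std(axis=0).mean())
    return spread


# Hypothetical toy data: 4 candidates answering 4 yes/no questions.
responses = np.array(
    [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 1, 1, 0],
    ],
    dtype=float,
)
parties = ["A", "A", "B", "B"]

coords = pca_2d(responses)            # one 2-D point per candidate
print(coords.shape)
print(intra_party_spread(responses, parties))
```

In this toy example party A answers identically on every question, so its spread is zero, while party B disagrees on two of the four questions.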
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 648
7061 Data Oriented Model of Image: as a Framework for Image Processing
Authors: A. Habibizad Navin, A. Sadighi, M. Naghian Fesharaki, M. Mirnia, M. Teshnelab, R. Keshmiri
Abstract:
This paper presents a new data oriented model of image. A representation of this model, ADBT, is then introduced. ADBT supports clustering, segmentation, measuring the similarity of images, and similar tasks, with a desired precision and corresponding speed.
Keywords: Data oriented modelling, image, clustering, segmentation, classification, ADBT and image processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1800