Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 25197

Search results for: data encoding

25107 Analysis of Big Data

Authors: Sandeep Sharma, Sarabjit Singh

Abstract:

With growing user demand and the rapid growth of large volumes of freely available data, storage solutions find it increasingly challenging to protect, store, and retrieve data. The day may not be far off when storage companies and organizations start saying 'no' to storing our valuable data, or start charging large amounts for its storage and protection. At the same time, environmental concerns, notably the threat of global warming, make it challenging to establish and maintain new data warehouses and data centers. The challenges of small data are over; the big challenge now is how to manage the exponential growth of data. In this paper, we analyze the growth trend of big data and its future implications. We also focus on the impact of unstructured data on various concerns and suggest some possible remedies to streamline big data.

Keywords: big data, unstructured data, volume, variety, velocity

Procedia PDF Downloads 545

25106 Tunable in Phase, out of Phase and T/4 Square-Wave Pulses in Delay-Coupled Optoelectronic Oscillators

Authors: Jade Martínez-Llinàs, Pere Colet

Abstract:

By exploring the possible dynamical regimes in a prototypical model for mutually delay-coupled OEOs, it is shown here that two mutually coupled non-identical OEOs, besides in- and out-of-phase square waves, can generate stable square-wave pulses synchronized at a quarter of the period (T/4) in a broad parameter region. The key point for obtaining T/4 solutions is that the two OEOs operate with mixed feedback, namely with negative feedback in one and positive feedback in the other. Furthermore, the coexistence of multiple solutions provides a large degree of flexibility for tuning the frequency in the GHz range without changing any parameter. As a result, the system of two coupled OEOs is a good candidate for implementing information encoding as a high-capacity memory device.

Keywords: nonlinear optics, optoelectronic oscillators, square waves, synchronization

Procedia PDF Downloads 366

25105 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and proposes the key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation-data discovery algorithms, expressed as formal formulas, which can find data that is inconsistent with conditional attribute dependencies across all target columns, whether the data is structured (SQL) or unstructured (NoSQL). It also gives six data cleaning methods based on these algorithms.
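
The idea of discovering data that violates an attribute dependency can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `find_fd_violations`, the sample records, and the rule "zip determines city" are all hypothetical, chosen only to show a simple functional-dependency check over dictionary-shaped records (which could come from SQL rows or NoSQL documents).

```python
from collections import defaultdict


def find_fd_violations(rows, lhs, rhs):
    """Return groups of rows that agree on the `lhs` columns but disagree on `rhs`.

    Hypothetical helper: checks a dependency rule of the form lhs -> rhs.
    """
    groups = defaultdict(list)
    for row in rows:
        # Rows are grouped by the values of the determining (left-hand) columns.
        key = tuple(row.get(col) for col in lhs)
        groups[key].append(row)

    violations = {}
    for key, members in groups.items():
        # More than one distinct dependent value means the rule is violated.
        distinct_values = {row.get(rhs) for row in members}
        if len(distinct_values) > 1:
            violations[key] = members
    return violations


# Illustrative records only; not data from the paper.
records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "94105", "city": "San Francisco"},
]
violations = find_fd_violations(records, lhs=["zip"], rhs="city")
```

Here `violations` holds the two records sharing zip `10001`, which a repair step would then have to reconcile.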

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 562

25104 Modification of Escherichia coli PtolT Expression Vector via Site-Directed Mutagenesis

Authors: Yakup Ulusu, Numan Eczacıoğlu, İsa Gökçe, Helen Waller, Jeremy H. Lakey

Abstract:

For a protein to perform its function, it needs not only the appropriate amino acid sequence but also the correct conformation. This conformation depends on the amino acid sequence of the primary structure, hydrophobic interactions, chaperones, the enzymes in charge of folding, and other factors. Misfolded proteins are not functional and tend to aggregate. Disulfide cross-links originating from cysteine residues stabilize the conformation of functional proteins. When two cysteine residues come side by side, a disulfide bond is established, forming a cystine bridge. Because of this feature, cysteine plays an important role in the formation of the three-dimensional structure of many proteins. There are two cysteine residues (C44, C69) in the Tol-A-III protein. Unlike a protein's own native disulfide bonds, any non-specific cystine bridge changes the three-dimensional structure of the protein. Proteins can be expressed in various host cells either directly or as fusion (chimeric) proteins. As a result of overproduction of recombinant proteins, insoluble proteins can aggregate in the host cell, forming a crystal structure called an inclusion body. In general, fusion proteins are produced to provide affinity tags, to make proteins more soluble, and to produce some toxic proteins via fusion protein expression systems such as pTolT. Proteins can be modified by site-directed mutagenesis. In this way, the creation of non-specific disulfide cross-links in a fusion protein expression system can be prevented by replacing the existing cysteine with another amino acid such as serine or glycine. This requires a DNA molecule containing the gene that encodes the target protein, and primers for the mutation designed according to the site-directed mutagenesis reaction. This study aimed to replace the cysteine-encoding codon TGT with the serine-encoding codon AGT.
For this purpose, sense and reverse primers were designed (given below) and used in the site-directed mutagenesis reaction. Several new copies of the template plasmid DNA were produced with these mutagenic primers via polymerase chain reaction (PCR). The PCR product consists of both the parental template DNA (wild type) and the new DNA sequences containing the mutation. The restriction endonuclease DpnI, which is specific for methylated DNA, was used to digest and thereby eliminate the parental template DNA. E. coli cells obtained after transformation were incubated in LB medium with antibiotic. After purification of plasmid DNA from E. coli, the presence of the mutation was confirmed by DNA sequence analysis. The newly developed plasmid is called PtolT-δ.
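
The codon change described above (TGT to AGT, cysteine to serine) can be sketched in a few lines. This is only an in-silico illustration: the helper `replace_codon` and the short `toy_cds` sequence are hypothetical and are not the pTolT sequence or the authors' primer design.

```python
def replace_codon(cds, residue_number, expected, replacement):
    """Swap one codon in a coding sequence, checking the original codon first.

    `residue_number` is 1-based, counted from the first codon of `cds`.
    """
    start = (residue_number - 1) * 3
    codon = cds[start:start + 3]
    if codon != expected:
        raise ValueError(f"expected {expected} at residue {residue_number}, found {codon}")
    return cds[:start] + replacement + cds[start + 3:]


# Toy coding sequence (assumption): Met-Ala-Cys-Leu-stop.
toy_cds = "ATGGCATGTCTGTAA"

# Replace the cysteine codon TGT at residue 3 with the serine codon AGT.
mutant_cds = replace_codon(toy_cds, residue_number=3, expected="TGT", replacement="AGT")

# The two codons differ at a single base, so only one position changes.
changed_positions = [i for i, (a, b) in enumerate(zip(toy_cds, mutant_cds)) if a != b]
```

Because TGT and AGT differ only in their first base, the mutation is a single-nucleotide substitution, which is the kind of change a pair of mutagenic primers can introduce.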

Keywords: site directed mutagenesis, Escherichia coli, pTolT, protein expression

Procedia PDF Downloads 372

25103 The Potential of Edaphic Algae for Bioremediation of the Diesel-Contaminated Soil

Authors: C. J. Tien, C. S. Chen, S. F. Huang, Z. X. Wang

Abstract:

Algae in soil ecosystems can produce organic matter and oxygen by photosynthesis. Heterocyst-forming cyanobacteria can fix nitrogen to increase soil nitrogen content. Secretion of mucilage by some algae increases the soil water content and soil aggregation. These actions improve soil quality and fertility, and further increase the abundance and diversity of soil microorganisms. In addition, some mixotrophic and heterotrophic algae are able to degrade petroleum hydrocarbons. Therefore, the objectives of this study were to analyze the effects of algal addition on the degradation of total petroleum hydrocarbons (TPH) and on the diversity and activity of bacteria and algae in diesel-contaminated soil under different nutrient contents and frequencies of plowing and irrigation, in order to assess the potential of a bioremediation technique using edaphic algae. A known amount of diesel was added to farmland soil. This diesel-contaminated soil was subjected to five settings: experiment-1 with algal addition by plowing and irrigation every two weeks, experiment-2 with algal addition by plowing and irrigation every four weeks, experiment-3 with algal and nutrient addition by plowing and irrigation every two weeks, experiment-4 with algal and nutrient addition by plowing and irrigation every four weeks, and the control without algal addition. Soil samples were taken every two weeks to analyze TPH concentrations, the diversity of bacteria and algae, and catabolic genes encoding functional degrading enzymes. The results show that the TPH removal rates of the five settings after the two-month experimental period were in the order: experiment-2 > experiment-4 > experiment-3 > experiment-1 > control. This indicates that algal addition, but not nutrient addition, enhanced the degradation of TPH in the diesel-contaminated soil. Plowing and irrigation every four weeks resulted in more TPH removal than every two weeks.
The banding patterns of denaturing gradient gel electrophoresis (DGGE) revealed an increase in the diversity of bacteria and algae after algal addition. Three petroleum hydrocarbon-degrading algae (Anabaena sp., Oscillatoria sp. and Nostoc sp.) and two added algal strains (Leptolyngbya sp. and Synechococcus sp.) were sequenced from prominent DGGE bands. The four hydrocarbon-degrading bacteria Gordonia sp., Mycobacterium sp., Rhodococcus sp. and Alcanivorax sp. were abundant in the treated soils. These results suggest that the growth of indigenous bacteria and algae was improved after adding edaphic algae. Real-time polymerase chain reaction results showed that four catabolic genes, encoding catechol 2,3-dioxygenase, toluene monooxygenase, xylene monooxygenase and phenol monooxygenase, were present and expressed in the treated soil. The addition of algae increased the expression of these genes at the end of the experiments to biodegrade petroleum hydrocarbons. This study demonstrated that edaphic algae are suitable biomaterials for bioremediating diesel-contaminated soils with plowing and irrigation every four weeks.

Keywords: catabolic gene, diesel, diversity, edaphic algae

Procedia PDF Downloads 278

25102 GeneNet: Temporal Graph Data Visualization for Gene Nomenclature and Relationships

Authors: Jake Gonzalez, Tommy Dang

Abstract:

This paper proposes a temporal graph approach to visualize and analyze the evolution of gene relationships and nomenclature over time. An interactive web-based tool implements this temporal graph, enabling researchers to traverse a timeline and observe coupled dynamics in network topology and naming conventions. Analysis of a real human genomic dataset reveals the emergence of densely interconnected functional modules over time, representing groups of genes involved in key biological processes. For example, the antimicrobial peptide DEFA1A3 shows increased connections to related alpha-defensins involved in infection response. Tracking shifts in degree and betweenness centrality over timeline iterations also quantitatively highlights the reprioritization of certain genes’ topological importance as knowledge advances. Examination of the CNR1 gene encoding the cannabinoid receptor CB1 demonstrates changing synonymous relationships and consolidating naming patterns over time, reflecting the discovery of its unique functional role. The integrated framework interconnecting these topological and nomenclature dynamics provides richer contextual insights compared to isolated analysis methods. Overall, this temporal graph approach enables a more holistic study of knowledge evolution to elucidate complex biology.

Keywords: temporal graph, gene relationships, nomenclature evolution, interactive visualization, biological insights

Procedia PDF Downloads 61

25101 Building an Opinion Dynamics Model from Experimental Data

Authors: Dino Carpentras, Paul J. Maher, Caoimhe O'Reilly, Michael Quayle

Abstract:

Opinion dynamics is a sub-field of agent-based modeling that focuses on people’s opinions and their evolution over time. Despite the rapid increase in the number of publications in this field, it is still not clear how to apply these models to real-world scenarios. Indeed, there is no agreement on how people update their opinion while interacting. Furthermore, it is not clear if different topics will show the same dynamics (e.g., more polarized topics may behave differently). These problems are mostly due to the lack of experimental validation of the models. Some previous studies started bridging this gap in the literature by directly measuring people’s opinions before and after the interaction. However, these experiments force people to express their opinion as a number instead of using natural language (and then, eventually, encoding it as numbers). This is not the way people normally interact, and it may strongly alter the measured dynamics. Another limitation of these studies is that they usually average all the topics together, without checking if different topics may show different dynamics. In our work, we collected data from 200 participants on 5 unpolarized topics. Participants expressed their opinions in natural language (“agree” or “disagree”). We also measured the certainty of their answer, expressed as a number between 1 and 10. However, this value was not shown to other participants to keep the interaction based on natural language. We then showed the opinion (and not the certainty) of another participant and, after a distraction task, we repeated the measurement. To make the data compatible with opinion dynamics models, we multiplied opinion and certainty to obtain a new parameter (here called “continuous opinion”) ranging from -10 to +10 (using agree=1 and disagree=-1). We first checked the 5 topics individually, finding that all of them behaved in a similar way despite having different initial opinion distributions.
This suggested that the same model could be applied to different unpolarized topics. We also observed that people tend to maintain similar levels of certainty, even when they changed their opinion. This is a strong violation of what is suggested by common models, where people starting at, for example, +8, will first move towards 0 instead of directly jumping to -8. We also observed social influence, meaning that people exposed to “agree” were more likely to move to higher levels of continuous opinion, while people exposed to “disagree” were more likely to move to lower levels. However, we also observed that the effect of influence was smaller than the effect of random fluctuations. This configuration also differs from standard models, where noise, when present, is usually much smaller than the effect of social influence. Starting from this, we built an opinion dynamics model that explains more than 80% of data variance. This model was also able to show the natural conversion of polarization from unpolarized states. This experimental approach offers a new way to build models grounded in experimental data. Furthermore, the model offers new insight into the fundamental terms of opinion dynamics models.
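
The "continuous opinion" described in the abstract is the product of the stated opinion (agree = 1, disagree = -1) and the certainty (1 to 10). A minimal sketch of that encoding follows; the function name `continuous_opinion` and the example values are assumptions for illustration, not the authors' code or data.

```python
def continuous_opinion(opinion, certainty):
    """Encode a natural-language opinion plus certainty as a value in [-10, +10].

    Uses agree = +1 and disagree = -1, multiplied by a certainty from 1 to 10.
    """
    signs = {"agree": 1, "disagree": -1}
    if opinion not in signs:
        raise ValueError("opinion must be 'agree' or 'disagree'")
    if not 1 <= certainty <= 10:
        raise ValueError("certainty must be between 1 and 10")
    return signs[opinion] * certainty


# Hypothetical participant who changes opinion but keeps the same certainty.
before = continuous_opinion("agree", 8)
after = continuous_opinion("disagree", 8)
```

With this encoding the value 0 is never produced, and a participant who flips opinion at unchanged certainty jumps directly from +8 to -8, which is the behaviour the abstract contrasts with common models.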

Keywords: experimental validation, micro-dynamics rule, opinion dynamics, update rule

Procedia PDF Downloads 108

25100 Cloud-Based Multiresolution Geodata Cube for Efficient Raster Data Visualization and Analysis

Authors: Lassi Lehto, Jaakko Kahkonen, Juha Oksanen, Tapani Sarjakoski

Abstract:

The use of raster-formatted data sets in geospatial analysis is increasing rapidly. At the same time, geographic data are being introduced into disciplines outside the traditional domain of geoinformatics, like climate change, intelligent transport, and immigration studies. These developments call for better methods to deliver raster geodata in an efficient and easy-to-use manner. Data cube technologies have traditionally been used in the geospatial domain for managing Earth Observation data sets that have strict requirements for effective handling of time series. The same approach and methodologies can also be applied in managing other types of geospatial data sets. A cloud service-based geodata cube, called GeoCubes Finland, has been developed to support online delivery and analysis of the most important geospatial data sets with national coverage. The main target group of the service is the academic research institutes in the country. The most significant aspects of the GeoCubes data repository include the use of multiple resolution levels, cloud-optimized file structure, and a customized, flexible content access API. Input data sets are pre-processed while being ingested into the repository to bring them into a harmonized form in aspects like georeferencing, sampling resolutions, spatial subdivision, and value encoding. All the resolution levels are created using an appropriate generalization method, selected depending on the nature of the source data set. Multiple pre-processed resolutions enable new kinds of online analysis approaches to be introduced. Analysis processes based on interactive visual exploration can be effectively carried out, as the level of resolution closest to the visual scale can always be used. In the same way, statistical analysis can be carried out on resolution levels that best reflect the scale of the phenomenon being studied. Access times remain close to constant, independent of the scale applied in the application.
The cloud service-based approach, applied in the GeoCubes Finland repository, enables analysis operations to be performed on the server platform, thus making high-performance computing facilities easily accessible. The developed GeoCubes API supports this kind of approach for online analysis. The use of cloud-optimized file structures in data storage enables the fast extraction of subareas. The access API allows for the use of vector-formatted administrative areas and user-defined polygons as definitions of subareas for data retrieval. Administrative areas of the country in four levels are readily available from the GeoCubes platform. In addition to direct delivery of raster data, the service also supports the so-called virtual file format, in which only a small text file is first downloaded. The text file contains links to the raster content on the service platform. The actual raster data is downloaded on demand, from the spatial area and resolution level required in each stage of the application. With the geodata cube approach, pre-harmonized geospatial data sets are made accessible to new categories of inexperienced users in an easy-to-use manner. At the same time, the multiresolution nature of the GeoCubes repository enables expert users to introduce new kinds of interactive online analysis operations.

Keywords: cloud service, geodata cube, multiresolution, raster geodata

Procedia PDF Downloads 133

25099 Constructions of Linear and Robust Codes Based on Wavelet Decompositions

Authors: Alla Levina, Sergey Taranov

Abstract:

The classical approach to providing noise immunity and integrity for information processed in computing devices and communication channels is to use linear codes. Linear codes have fast and efficient algorithms for encoding and decoding information, but these codes concentrate their detection and correction capabilities on certain error configurations. Robust codes can protect against any configuration of errors with a predetermined probability. This is accomplished by using perfect nonlinear and almost perfect nonlinear functions to calculate the code redundancy. The paper presents an error-correcting coding scheme using the biorthogonal wavelet transform. The wavelet transform is applied in various fields of science; its applications include removing noise from signals, data compression, and spectral analysis of signal components. The article suggests methods for constructing linear codes based on wavelet decomposition. For the developed constructions, we build generator and check matrices that contain the scaling function coefficients of the wavelet. Based on linear wavelet codes, we develop robust codes that provide uniform protection against all errors. We propose two constructions of robust codes. The first is based on the multiplicative inverse in a finite field. In the second, the redundancy part is the cube of the information part. The paper also investigates the characteristics of the proposed robust and linear codes.
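
The contrast between linear and robust redundancy can be made concrete with a small sketch of the second construction mentioned above, where the redundancy is the cube of the information part. Everything specific here is an assumption for illustration: the toy field GF(2^4), the reduction polynomial x^4 + x + 1, and the helper names. The wavelet-based linear code from the paper is not reproduced; a trivial linear redundancy (a copy of the information part) stands in for comparison.

```python
FIELD_SIZE = 16            # GF(2^4), an assumed toy field
REDUCTION_POLY = 0b10011   # x^4 + x + 1, irreducible over GF(2)


def gf_mul(a, b):
    """Multiply two elements of GF(2^4) (shift-and-add with modular reduction)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & FIELD_SIZE:
            a ^= REDUCTION_POLY
        b >>= 1
    return product


def cube(x):
    """Nonlinear redundancy: the cube of the information part."""
    return gf_mul(x, gf_mul(x, x))


def identity(x):
    """Linear redundancy used only as a contrast: a copy of the information part."""
    return x


def worst_case_masking(redundancy):
    """Largest number of messages for which a single nonzero error goes undetected.

    A codeword is (x, redundancy(x)); an error (e_info, e_red) is masked for x
    when the corrupted pair is again a valid codeword.
    """
    worst = 0
    for e_info in range(FIELD_SIZE):
        for e_red in range(FIELD_SIZE):
            if e_info == 0 and e_red == 0:
                continue
            masked = sum(
                1 for x in range(FIELD_SIZE)
                if redundancy(x ^ e_info) == redundancy(x) ^ e_red
            )
            worst = max(worst, masked)
    return worst


robust_worst = worst_case_masking(cube)
linear_worst = worst_case_masking(identity)
```

In this toy setting, some errors are masked for all 16 messages under the linear redundancy, while under the cube redundancy no nonzero error is masked for more than 2 of the 16 messages, which is the uniform protection property the abstract attributes to robust codes.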

Keywords: robust code, linear code, wavelet decomposition, scaling function, error masking probability

Procedia PDF Downloads 488

25098 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance, and privacy. In the telecommunications industry, mining big data is like mining for gold: it represents a big opportunity to maximize the industry's revenue streams. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunications, and the benefits and opportunities gained from it.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 406

25097 Heterogeneity of Genes Encoding the Structural Proteins of Avian Infectious Bronchitis Virus

Authors: Shahid Hussain Abro, Siamak Zohari, Lena H. M. Renström, Désirée S. Jansson, Faruk Otman, Karin Ullman, Claudia Baule

Abstract:

Infectious bronchitis is an acute, highly contagious respiratory, nephropathogenic and reproductive disease of poultry that is caused by infectious bronchitis virus (IBV). The present study used a large data set of structural gene sequences, including newly generated ones and sequences available in the GenBank database, to further analyze the diversity and to identify selective pressures and recombination spots. There were some deletions or insertions in the analyzed regions in isolates of the Italy-02 and D274 genotypes, whereas no insertions or deletions were observed in isolates of the Massachusetts and 4/91 genotypes. The hypervariable nucleotide sequence regions spanned positions 152–239, 554–582, 686–737 and 802–912 in the S1 subunit of all the analyzed genotypes. The nucleotide sequence data of the E gene showed that this gene was comparatively unstable and subject to a high frequency of mutations. The M gene showed substitutions consistently distributed except for a region between nucleotide positions 250–680 that remained conserved. The lowest variation in the nucleotide sequences of ORF5a was observed in the isolates of the D274 genotype, while ORF5b and N gene sequences showed highly conserved regions and were less subject to variation. Genes ORF3a, ORF3b, M, ORF5a, ORF5b and N presented negative selective pressure among the analyzed isolates. However, some regions of the ORFs showed favorable selective pressure(s). The S1 and E proteins were subject to a high rate of mutational substitutions and non-synonymous amino acid changes. Strong signals of recombination breakpoints and an ending breakpoint were observed in the S and N genes. Overall, the results of this study suggest that the strong selective pressures in E and M and the high frequency of substitutions in the S gene can probably be considered the main determinants in the evolution of IBV.

Keywords: IBV, avian infectious bronchitis, structural genes, genotypes, genetic diversity

Procedia PDF Downloads 431

25096 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating, comparing, and integrating JSON data with XML data and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.
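
One simple way to check JSON data against XML data is to normalize both into the same nested structure and compare the results. The sketch below uses only the Python standard library; the helper names, the sample payloads, and the convention of comparing all leaf values as strings are assumptions for illustration, not the paper's method. It deliberately ignores XML attributes, repeated sibling tags, and JSON arrays.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Convert an XML element into nested dicts with string leaf values."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    return {child.tag: xml_to_dict(child) for child in children}


def json_to_comparable(value):
    """Convert parsed JSON into nested dicts with string leaf values."""
    if isinstance(value, dict):
        return {key: json_to_comparable(item) for key, item in value.items()}
    return str(value)


# Hypothetical payloads describing the same record in both formats.
json_text = '{"id": 7, "name": "Ada", "address": {"city": "Oslo"}}'
xml_text = "<user><id>7</id><name>Ada</name><address><city>Oslo</city></address></user>"

json_view = json_to_comparable(json.loads(json_text))
xml_view = xml_to_dict(ET.fromstring(xml_text))

# True when both documents carry the same fields and values.
records_match = json_view == xml_view
```

Comparing leaves as strings sidesteps the fact that XML text carries no type information, at the cost of treating the JSON number `7` and the JSON string `"7"` as equal; a stricter test would need a schema or explicit type rules.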

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 137

25095 Using Machine Learning Techniques to Extract Useful Information from Dark Data

Authors: Nigar Hussain

Abstract:

Dark data is a subset of big data: it is data that we fail to use for future decisions. Existing work has many open issues, and some of them call for powerful tools for utilizing dark data. Adequate techniques are needed to deal with dark data, so that users can exploit its quality, adaptability, speed, low time consumption, performance, and accessibility. Another issue is how to use dark data to extract helpful information and reach better decisions. In this paper, we propose improved strategies to remove the dark side from dark data. Using a supervised model and machine learning techniques, we utilized dark data and achieved an F1 score of 89.48%.

Keywords: big data, dark data, machine learning, heatmap, random forest

Procedia PDF Downloads 27

25094 Characterization of Crustin from Litopenaeus vannamei

Authors: Suchao Donpudsa, Anchalee Tassanakajon, Vichien Rimphanitchayakit

Abstract:

A crustin gene, LV-SWD1, previously found in the hemocyte cDNA library of Litopenaeus vannamei, contains an open reading frame of 288 bp encoding a putative protein of 96 amino acid residues. The putative signal peptide of LV-SWD1 was identified using the online tool SignalP 3.0, with a predicted cleavage site between Ala24 and Val25, resulting in a 72-residue mature protein with a calculated molecular mass of 7.4 kDa and a predicted pI of 8.5. This crustin contains an Arg-Pro rich region at the amino terminus and a single whey acidic protein (WAP) domain at the carboxyl terminus. In order to characterize its properties and biological activities, the recombinant crustin protein was produced in the Escherichia coli expression system. Antimicrobial assays showed that the growth of Bacillus subtilis was inhibited by this recombinant crustin with an MIC of about 25-50 µM.

Keywords: crustin, single whey acidic protein, Litopenaeus vannamei, antimicrobial activity

Procedia PDF Downloads 241

25093 Preschool Story Retelling: Actions and Verb Use

Authors: Eva Nwokah, Casey Taliancich-Klinger, Lauren Luna, Sarah Rodriguez

Abstract:

Story-retelling is a technique frequently used to assess children’s language skills and support their development of narratives. Fourteen preschool children listened to one of two stories from the wordless, illustrated Frog book series and then retold the story using the pictures. A comparison of three verb types (action, mental and other) in the original story model, and children's verb use in their retold stories revealed the salience of action events. The children's stories contained a similar proportion of verb types to the original story. However, the action verbs they used were rarely those they had heard in the original. The implications for the process of lexical encoding and narrative recall are discussed, as well as suggestions for the use of wordless picture books and the language teaching of new verbs.

Keywords: story re-telling, verb use, preschool language, wordless picture books

Procedia PDF Downloads 269

25092 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

City governance involves various data, including city component data, demographic data, housing data and all kinds of business data. These data reflect different aspects of people, events and activities. Data generated by different systems differ in form, and their sources differ because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need to be fused. Data from different sources, collected in different ways, raise several issues that need to be resolved. The problems of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data and space data. Metadata is needed so that it can be referred to and read when an application needs to access, manipulate and display the data. Uniform metadata management ensures the effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 392

25091 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays, given the human role in the ever-increasing growth of data, methods such as data mining for extracting knowledge are unavoidable. One issue in data mining is the inherently distributed nature of data: the databases that create or receive such data usually belong to corporate or non-corporate persons who do not give their information freely to others. Yet there is no guarantee that someone can mine particular data without intruding on the owner’s privacy. How data are sent and then gathered, under vertical or horizontal partitioning, depends on the type of privacy preservation used and is carried out so as to improve data privacy. This study attempts a comprehensive comparison of privacy-preserving methods; general methods such as data randomization and encoding are examined, along with the strengths and weaknesses of each.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 523

25090 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR), which will come into force on 25 May 2018, will create a new legal framework for the protection of personal data in the European Union. Article 20 of the GDPR introduces a right to data portability. This right allows data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating the transfer of personal data between IT environments (e.g., applications), will also facilitate changing the provider of services (e.g., changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 471

25089 Time Efficient Color Coding for Structured-Light 3D Scanner

Authors: Po-Hao Huang, Pei-Ju Chiang

Abstract:

The structured light 3D scanner is commonly used for measuring the 3D shape of an object. By projecting designed light patterns onto the object, deformed patterns can be obtained and used for geometric shape reconstruction. At present, Gray code is the most reliable and commonly used light pattern in structured light 3D scanners. However, the trade-off between scanning efficiency and accuracy is a long-standing and challenging problem. The design of light patterns plays a significant role in scanning efficiency and accuracy. We therefore propose a novel encoding method that integrates color information with Gray code to improve scanning efficiency. We will demonstrate that with the proposed method, the scanning time can be reduced to approximately half of that needed by Gray code without reduction of precision.
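
For context, the baseline Gray-code pattern that the abstract builds on can be sketched as follows. This shows only the standard binary-reflected Gray code and how it maps projector columns to a sequence of black/white stripe patterns; the paper's color extension is not reproduced, and the function names and the 8-column example are assumptions.

```python
def gray_encode(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_decode(g):
    """Inverse of gray_encode: recover the integer from its Gray code."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def stripe_patterns(num_columns, num_bits):
    """Build the projected patterns, most significant bit first.

    patterns[b][c] is 1 if projector column c is lit in pattern b.
    """
    return [
        [(gray_encode(column) >> (num_bits - 1 - bit)) & 1 for column in range(num_columns)]
        for bit in range(num_bits)
    ]


# Toy projector with 8 columns, which needs 3 one-bit patterns.
patterns = stripe_patterns(num_columns=8, num_bits=3)
```

Each column is identified by the bits it receives across the pattern sequence, and neighbouring columns differ in exactly one pattern, which limits decoding errors at stripe boundaries. Resolving N columns this way takes about log2(N) projected patterns, which is the cost the abstract aims to cut by adding color.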

Keywords: gray-code, structured light scanner, 3D shape acquisition, 3D reconstruction

Procedia PDF Downloads 456

25088 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in the rapidly developing area of data storage and processing based on data warehouses and data mining techniques. These advances involve software, hardware, data mining algorithms and visualisation techniques that share common features across specific problems and implementation tasks.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 402

25087 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their rivals. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 639

25086 Autosomal Dominant Polycystic Kidney Patients May Be Predisposed to Various Cardiomyopathies

Authors: Fouad Chebib, Marie Hogan, Ziad El-Zoghby, Maria Irazabal, Sarah Senum, Christina Heyer, Charles Madsen, Emilie Cornec-Le Gall, Atta Behfar, Barbara Ehrlich, Peter Harris, Vicente Torres

Abstract:

Background: Mutations in PKD1 and PKD2, the genes encoding the proteins polycystin-1 (PC1) and polycystin-2 (PC2), cause autosomal dominant polycystic kidney disease (ADPKD). ADPKD is a systemic disease associated with several extrarenal manifestations. Animal models have suggested an important role for the polycystins in cardiovascular function. The aim of the current study is to evaluate the association of various cardiomyopathies in a large cohort of patients with ADPKD. Methods: Clinical data was retrieved from medical records for all patients with ADPKD and cardiomyopathies (n=159). Genetic analysis was performed on available DNA by direct sequencing. Results: Among the 58 patients included in this case series, 39 patients had idiopathic dilated cardiomyopathy (IDCM), 17 had hypertrophic obstructive cardiomyopathy (HOCM), and 2 had left ventricular noncompaction (LVNC). The mean age at cardiomyopathy diagnosis was 53.3, 59.9 and 53.5 years in IDCM, HOCM and LVNC patients, respectively. The median left ventricular ejection fraction at initial diagnosis of IDCM was 25%. Average basal septal thickness was 19.9 mm in patients with HOCM. Genetic data was available in 19, 8 and 2 cases of IDCM, HOCM, and LVNC, respectively. PKD1 mutations were detected in 47.4%, 62.5% and 100% of IDCM, HOCM and LVNC cases. PKD2 mutations were detected only in IDCM cases and were overrepresented (36.8%) relative to the expected frequency in ADPKD (~15%). The prevalence of IDCM, HOCM, and LVNC in our ADPKD clinical cohort was 1:17, 1:39 and 1:333, respectively. When compared to the general population, IDCM and HOCM were approximately 10-fold more prevalent in patients with ADPKD. Conclusions: In summary, we suggest that PKD1 or PKD2 mutations may predispose to idiopathic dilated or hypertrophic cardiomyopathy. There is a trend for patients with PKD2 mutations to develop the former and for patients with PKD1 mutations to develop the latter. Predisposition to various cardiomyopathies may be another extrarenal manifestation of ADPKD.

Keywords: autosomal dominant polycystic kidney disease (ADPKD), polycystic kidney disease, cardiovascular, cardiomyopathy, idiopathic dilated cardiomyopathy, hypertrophic cardiomyopathy, left ventricular noncompaction

Procedia PDF Downloads 311

25085 Impact of Population Size on Symmetric Travelling Salesman Problem Efficiency

Authors: Wafa' Alsharafat, Suhila Farhan Abu-Owida

Abstract:

Genetic algorithm (GA) is a powerful evolutionary search technique that has been used successfully to solve and optimize problems in different research areas. GA is considered one of the optimization methods used to solve the Travelling Salesman Problem (TSP). The feasibility of GA in finding a TSP solution depends on the GA operators and, more generally, on the encoding method, population size, and termination criteria. In particular, crossover and its probability play a significant role in finding possible solutions for the Symmetric TSP (STSP). In addition, the crossover should be selected and enhanced with a view to reaching an optimal or at least near-optimal solution. In this paper, we shed light on a modified crossover method called modified sequential constructive crossover and its impact on reaching an optimal solution. To justify the relevance of a parameter value in solving the TSP, a set of comparative analyses was conducted on different crossover methods and values.
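To make the crossover idea concrete, here is a minimal Python sketch of a basic sequential constructive crossover for a symmetric TSP with path (permutation) encoding. It is a simplified illustration under assumptions: the abstract does not specify the modification the authors apply, so this shows only the unmodified idea as commonly described, and the names `scx`, `next_unvisited_after` and `tour_length`, as well as the distance matrix, are hypothetical.

```python
from typing import List, Optional, Sequence, Set


def tour_length(tour: Sequence[int], dist: Sequence[Sequence[float]]) -> float:
    """Length of the closed tour that returns to its starting city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def next_unvisited_after(parent: Sequence[int], city: int,
                         visited: Set[int]) -> Optional[int]:
    """First not-yet-visited city that follows `city` in `parent`, if any."""
    idx = parent.index(city)
    for candidate in parent[idx + 1:]:
        if candidate not in visited:
            return candidate
    return None


def scx(parent_a: Sequence[int], parent_b: Sequence[int],
        dist: Sequence[Sequence[float]]) -> List[int]:
    """Build one child by repeatedly taking the cheaper of the two
    successors suggested by the parents for the current city."""
    n = len(parent_a)
    current = parent_a[0]
    child = [current]
    visited = {current}
    while len(child) < n:
        candidates = []
        for parent in (parent_a, parent_b):
            city = next_unvisited_after(parent, current, visited)
            if city is None:
                # Fallback (assumed): lowest-numbered unvisited city.
                city = next(c for c in range(n) if c not in visited)
            candidates.append(city)
        current = min(candidates, key=lambda c: dist[current][c])
        child.append(current)
        visited.add(current)
    return child


if __name__ == "__main__":
    # Hypothetical symmetric distance matrix for five cities.
    dist = [
        [0, 2, 9, 10, 7],
        [2, 0, 6, 4, 3],
        [9, 6, 0, 8, 5],
        [10, 4, 8, 0, 6],
        [7, 3, 5, 6, 0],
    ]
    a = [0, 1, 2, 3, 4]
    b = [0, 3, 1, 4, 2]
    child = scx(a, b, dist)
    print(child, tour_length(child, dist))
```

Because each step adds only an unvisited city, the child is always a valid permutation, so no repair step is needed after crossover. In a full GA this operator would be applied with some crossover probability across a population, and both that probability and the population size are among the parameters such a comparative study would vary.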

Keywords: genetic algorithm, crossover, mutation, TSP

Procedia PDF Downloads 224

25084 Closed-Form Sharma-Mittal Entropy Rate for Gaussian Processes

Authors: Septimia Sarbu

Abstract:

The entropy rate of a stochastic process is a fundamental concept in information theory. It provides a limit to the amount of information that can be transmitted reliably over a communication channel, as stated by Shannon's coding theorems. Recently, researchers have focused on developing new measures of information that generalize Shannon's classical theory. The aim is to design more efficient information encoding and transmission schemes. This paper continues the study of generalized entropy rates, by deriving a closed-form solution to the Sharma-Mittal entropy rate for Gaussian processes. Using the squeeze theorem, we solve the limit in the definition of the entropy rate, for different values of alpha and beta, which are the parameters of the Sharma-Mittal entropy. In the end, we compare it with Shannon and Rényi's entropy rates for Gaussian processes.
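For orientation, the standard definitions behind the quantities named in this abstract can be written as follows. This is a sketch in assumed notation (the paper's own symbols and normalisation may differ), and it states only textbook definitions, not the closed-form result the paper derives.

```latex
% Entropy rate of a stochastic process X = (X_1, X_2, \dots),
% with h denoting differential entropy:
\bar{h}(X) = \lim_{n \to \infty} \frac{1}{n}\, h(X_1, \dots, X_n)

% Shannon differential entropy of a Gaussian vector with covariance
% matrix \Sigma_n and eigenvalues \lambda_1, \dots, \lambda_n:
h(X_1, \dots, X_n) = \frac{1}{2} \log\!\big( (2\pi e)^n \det \Sigma_n \big)
                   = \frac{n}{2} \log(2\pi e) + \frac{1}{2} \sum_{i=1}^{n} \log \lambda_i

% Sharma-Mittal entropy of a density p, with parameters
% \alpha > 0, \alpha \neq 1, \beta \neq 1:
H_{\alpha,\beta}(p) = \frac{1}{1-\beta}
    \left[ \left( \int p(x)^{\alpha}\, dx \right)^{\frac{1-\beta}{1-\alpha}} - 1 \right]
```

In this form the Rényi entropy is recovered in the limit β → 1 and the Tsallis entropy in the limit β → α, which is why the Sharma-Mittal family is compared against the Shannon and Rényi entropy rates.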

Keywords: generalized entropies, Sharma-Mittal entropy rate, Gaussian processes, eigenvalues of the covariance matrix, squeeze theorem

Procedia PDF Downloads 517

25083 Harnessing Deep-Level Metagenomics to Explore the Three Dynamic One Health Areas: Healthcare, Domiciliary and Veterinary

Authors: Christina Killian, Katie Wall, Séamus Fanning, Guerrino Macori

Abstract:

Deep-level metagenomics offers a useful technical approach to explore the three dynamic One Health axes: healthcare, domiciliary and veterinary. There is currently limited understanding of the composition of complex biofilms, natural abundance of AMR genes and gene transfer occurrence in these ecological niches. By using a newly established small-scale complex biofilm model, COMBAT has the potential to provide new information on microbial diversity, antimicrobial resistance (AMR)-encoding gene abundance, and their transfer in complex biofilms of importance to these three One Health axes. Shotgun metagenomics has been used to sample the genomes of all microbes comprising the complex communities found in each biofilm source. A comparative analysis between untreated and biocide-treated biofilms is described. The basic steps include the purification of genomic DNA, followed by library preparation, sequencing, and finally, data analysis. The use of long-read sequencing facilitates the completion of metagenome-assembled genomes (MAGs). Samples were sequenced using a PromethION platform, and following quality checks, binning methods, and bespoke bioinformatics pipelines, we describe the recovery of individual MAGs to identify mobile genetic elements (MGEs) and the corresponding AMR genotypes that map to these structures. High-throughput sequencing strategies have been deployed to characterize these communities. Accurately defining the profiles of these niches is an essential step towards elucidating the impact of the microbiota on each niche biofilm environment and their evolution.

Keywords: COMBAT, biofilm, metagenomics, high-throughput sequencing

Procedia PDF Downloads 53

25082 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and sensor data can be collected directly by using database tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, using a board to collect data poses many difficulties, particularly when the user is not a computer programmer or is using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 376

25081 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver integrated (big) data-related services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 161

25080 Role of ABC Transporters in Non-Target Site Herbicide Resistance in Black Grass (Alopecurus myosuroides)

Authors: Alina Goldberg Cavalleri, Sara Franco Ortega, Nawaporn Onkokesung, Richard Dale, Melissa Brazier-Hicks, Robert Edwards

Abstract:

Non-target site based resistance (NTSR) to herbicides in weeds is a polygenic trait associated with the upregulation of proteins involved in xenobiotic detoxification and translocation, which we have termed the xenome. Among the xenome proteins, ABC transporters play a key role in enhancing herbicide metabolism by effluxing conjugated xenobiotics from the cytoplasm into the vacuole. The importance of ABC transporters is emphasized by the fact that they often contribute to multidrug resistance in human cells and antibiotic resistance in bacteria. They also play a key role in insecticide resistance in major vectors of human diseases and crop pests. By surveying available databases, transcripts encoding ABCs have been identified as being enhanced in populations exhibiting NTSR in several weed species. Based on transcriptomics data in black grass (Alopecurus myosuroides, Am), we have identified three proteins from the ABC-C subfamily that are upregulated in NTSR populations. ABC-C transporters are poorly characterized proteins in plants, but in Arabidopsis they localize to the vacuolar membrane and have functional roles in transporting glutathionylated (GSH)-xenobiotic conjugates. We found that the up-regulation of AmABCs strongly correlates with the up-regulation of a glutathione transferase termed AmGSTU2, which can conjugate GSH to herbicides. The expression of the ABC transcripts was profiled in populations of black grass showing different degrees of resistance to herbicides. This, together with a phylogenetic analysis, revealed that AmABCs cluster in different groups, which might indicate different substrates and roles in the herbicide resistance phenotype in the different populations.

Keywords: black grass, herbicide, resistance, transporters

Procedia PDF Downloads 151

25079 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, and veracity and comes from a variety of sources is generated in all sectors, including the government sector. Globally, public administrations are pursuing (big) data as a new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can lead to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of the government (big) data ecosystem. The essential aspects of the government (big) data ecosystem include definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on the government (big) data ecosystem and therefore focused our research on various relevant areas such as humanitarian data, open government data, scientific research data, and industry data.

Keywords: applications of big data, big data, big data types, big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 227

25078 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, only about 1% of generated data is ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which are a subset of data analytics. The developed framework is designed to assist data analysts with little experience in choosing the appropriate machine learning algorithm given the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 166