Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 25182

Search results for: whole exome sequencing data

24522 Haplotypes of the Human Leukocyte Antigen-G Different HIV-1 Groups from the Netherlands

Authors: A. Alyami, S. Christmas, K. Neeltje, G. Pollakis, B. Paxton, Z. Al-Bayati

Abstract:

The human leukocyte antigen-G (HLA-G) molecule plays an important role in immunomodulation. To date, 16 HLA-G untranslated region (UTR) haplotypes have been defined by sequenced SNPs in the coding region. Of these, UTR-1, UTR-2, UTR-3, UTR-4, UTR-5, UTR-6 and UTR-7 are the most frequent 3′UTR haplotypes at the global level. UTR-1 is associated with higher levels of soluble HLA-G and HLA-G expression, whereas UTR-5 and UTR-7 are linked with low levels of soluble HLA-G and HLA-G expression. Human immunodeficiency virus type 1 (HIV-1) infection results in the progressive loss of immune function in infected individuals. The virus typically escapes recognition and lysis by T lymphocytes and NK cells through down-regulation of classical HLA-A and HLA-B, which has been associated with up-regulation of the non-classical HLA-G molecule. We evaluated the frequencies of HLA-G 3′UTR haplotypes in three HIV-1 groups from the Netherlands and their association with susceptibility to infection. The three groups consist mainly of men who have sex with men (MSM), injection drug users (IDU) and a high-risk seronegative (HRSN) group. DNA samples were amplified with published primers prior to sequencing. According to our results, low-expresser haplotypes were more frequent in the HRSN group than in the other groups. This indicates that 3′UTR polymorphisms may serve as potential prognostic biomarkers of susceptibility to HIV.

Keywords: human leukocyte antigen-G (HLA-G), men who have sex with men (MSM), injection drug users (IDU), high-risk-seronegative (HRSN) group, untranslated region (UTR)


24521 Li-Fi Technology: Data Transmission through Visible Light

Authors: Shahzad Hassan, Kamran Saeed

Abstract:

People are always in search of Wi-Fi hotspots because Internet access is in high demand nowadays. Like all other technologies, however, Wi-Fi still has room for improvement with regard to the speed and quality of connectivity. To address these aspects, Harald Haas, a professor at the University of Edinburgh, proposed what we know as Li-Fi (Light Fidelity). Li-Fi is a new technology in the field of wireless communication that provides connectivity within a network environment. It is a two-way mode of wireless communication using light. The data is transmitted through Light Emitting Diodes (LEDs), which can vary the intensity of light very fast, even faster than the blink of an eye. From the research and experiments conducted so far, it can be said that Li-Fi can increase the speed and reliability of data transfer. This paper pays particular attention to the assessment of the performance of this technology. It is described as a 5G technology that uses LEDs as the medium of data transfer. For coverage within buildings, Wi-Fi is good, but Li-Fi can be considered favorable in situations where large amounts of data are to be transferred in areas with electromagnetic interference. It brings data-related qualities such as efficiency, security and large throughput to wireless communication. All in all, Li-Fi is expected to become a future phenomenon in which the presence of light will mean access to the Internet as well as speedy data transfer.

Keywords: communication, LED, Li-Fi, Wi-Fi


24520 An Analysis of Humanitarian Data Management of Polish Non-Governmental Organizations in Ukraine Since February 2022 and Its Relevance for Ukrainian Humanitarian Data Ecosystem

Authors: Renata Kurpiewska-Korbut

Abstract:

Assuming that the use and sharing of data generated in humanitarian action constitute a core function of humanitarian organizations, the paper analyzes the position of the largest Polish humanitarian non-governmental organizations in the humanitarian data ecosystem in Ukraine and their approach to non-personal and personal data management since February 2022. The study draws on expert interviews and document analysis of non-profit organizations providing a direct response in the Ukrainian crisis context, i.e., the Polish Humanitarian Action, Caritas, Polish Medical Mission, Polish Red Cross, and the Polish Center for International Aid. It applies the perspective of contingency theory, whose central point is that the context, or a specific set of conditions, determines behavior and the choice of methods of action. Together, these help to examine the significance of data complexity and of an adaptive approach to data management by relief organizations in the humanitarian supply chain network. The purpose of this study is to determine how the well-established and accurate internal procedures and good practices of using and sharing data (including safeguards for sensitive data) of the surveyed organizations, which have comparable human and technological capabilities, are implemented and adjusted to Ukrainian humanitarian settings and data infrastructure. The study also poses a fundamental question of whether this crisis experience will have a determining effect on their future performance. The findings indicate that Polish humanitarian organizations in Ukraine, which have their own unique codes of conduct and effective managerial data practices determined by contingencies, have limited influence on improving the situational awareness of other assistance providers in the data ecosystem despite their attempts to undertake interagency work in the area of data sharing.

Keywords: humanitarian data ecosystem, humanitarian data management, Polish NGOs, Ukraine


24519 An Approach for Estimation in Hierarchical Clustered Data Applicable to Rare Diseases

Authors: Daniel C. Bonzo

Abstract:

Practical considerations lead to the use of units of analysis within subjects, e.g., bleeding episodes or treatment-related adverse events, in rare disease settings. This is coupled with data augmentation techniques such as extrapolation to enlarge the subject base. In general, one can think of extrapolation of data as extending information and conclusions from one estimand to another estimand. This approach induces hierarchical clustered data with varying cluster sizes. Extrapolation of clinical trial data is increasingly accepted by regulatory agencies as a means of generating data in diverse situations during the drug development process. Under certain circumstances, data can be extrapolated to a different population, a different but related indication, or a different but similar product. We consider here the problem of estimation (point and interval) using a mixed-models approach under extrapolation. It is proposed that estimators (point and interval) be constructed using weighting schemes for the clusters, e.g., equal weights or weights proportional to cluster size. Simulated data generated under varying scenarios are then used to evaluate the performance of this approach. The evaluation showed that the approach is a useful means of improving statistical inference in rare disease settings and thus aids not only signal detection but also risk-benefit evaluation.
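The two weighting schemes named in the abstract above can be illustrated with a minimal sketch. It computes only a weighted mean of per-subject cluster means; it is not the mixed-models estimator of the paper, and the function name `cluster_weighted_mean` and the example data are assumptions made for illustration.

```python
from statistics import mean


def cluster_weighted_mean(clusters, scheme="equal"):
    """Point estimate of an overall mean from clustered observations.

    clusters: list of lists, one inner list of observations per subject.
    scheme:   "equal" gives every cluster the same weight;
              "size" weights each cluster by its number of observations.
    """
    clusters = [c for c in clusters if c]  # ignore empty clusters
    if not clusters:
        raise ValueError("need at least one non-empty cluster")
    cluster_means = [mean(c) for c in clusters]
    if scheme == "equal":
        weights = [1.0] * len(clusters)
    elif scheme == "size":
        weights = [float(len(c)) for c in clusters]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    total = sum(weights)
    return sum(w * m for w, m in zip(weights, cluster_means)) / total


# Hypothetical data: three subjects with 2, 1 and 4 episodes each.
episodes = [[2.0, 4.0], [10.0], [1.0, 1.0, 1.0, 1.0]]
equal = cluster_weighted_mean(episodes, "equal")    # (3 + 10 + 1) / 3
by_size = cluster_weighted_mean(episodes, "size")   # (6 + 10 + 4) / 7
```

With varying cluster sizes the two schemes can differ noticeably: equal weighting lets the single-episode subject count as much as the four-episode subject, while size weighting reduces to the pooled mean of all observations.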

Keywords: clustered data, estimand, extrapolation, mixed model


24518 De Novo Assembly and Characterization of the Transcriptome from the Fluoroacetate Producing Plant, Dichapetalum Cymosum

Authors: Selisha A. Sooklal, Phelelani Mpangase, Shaun Aron, Karl Rumbold

Abstract:

Organically bound fluorine (C-F bond) is extremely rare in nature. Despite this, the first fluorinated secondary metabolite, fluoroacetate, was isolated from the plant Dichapetalum cymosum (commonly known as Gifblaar). However, the enzyme responsible for fluorination (fluorinase) in Gifblaar was never isolated, and very little progress has been made in understanding this process in higher plants. Fluorinated compounds have vast applications in the pharmaceutical, agrochemical and fine chemicals industries. Consequently, an enzyme capable of catalysing C-F bond formation has great potential as an industrial biocatalyst, considering that the field of fluorination is almost entirely synthetic. As with any biocatalyst, a range of these enzymes is required. Therefore, it is imperative to expand the exploration for novel fluorinases. This study aimed to gain molecular insights into secondary metabolite biosynthesis in Gifblaar using a high-throughput sequencing-based approach. Mechanical wounding studies were performed using Gifblaar leaf tissue in order to induce expression of the fluorinase. The transcriptome of the wounded and unwounded plant was then sequenced on the Illumina HiSeq platform. A total of 26.4 million short sequence reads were assembled into 77 845 transcripts using Trinity. Overall, 68.6 % of transcripts were annotated with gene identities using public databases (SwissProt, TrEMBL, GO, COG, Pfam, EC) with an E-value threshold of 1E-05. Sequences exhibited the greatest homology to the model plant, Arabidopsis thaliana (27 %). A total of 244 annotated transcripts were found to be differentially expressed between the wounded and unwounded plant. In addition, secondary metabolic pathways present in Gifblaar were successfully reconstructed using Pathway Tools. Owing to the lack of genetic information for plant fluorinases, no transcript was annotated as a fluorinating enzyme. Thus, a local database containing the 5 existing bacterial fluorinases was created. Fifteen transcripts having homology to partial regions of existing fluorinases were found. In an effort to obtain the full coding sequence of the Gifblaar fluorinase, primers were designed targeting the regions of homology, and genome walking will be performed to amplify the unknown regions. This is the first genetic data available for Gifblaar. It has provided novel insights into the mechanisms of metabolite biosynthesis and will allow for the discovery of the first eukaryotic fluorinase.

Keywords: biocatalyst, fluorinase, Gifblaar, transcriptome


24517 Authorization of Commercial Communication Satellite Grounds for Promoting Turkish Data Relay System

Authors: Celal Dudak, Aslı Utku, Burak Yağlioğlu

Abstract:

Uninterrupted satellite communication throughout the whole orbit is becoming more indispensable every day. Data relay systems, such as TDRSS of the USA and EDRSS of Europe, are developed and built for various high and low data rate information exchanges. These missions rely on a couple of task-dedicated communication satellites. In this regard, we attempt to define a data relay system for Turkey that exchanges low data rate information (i.e., TTC) for Earth-observing LEO satellites by employing commercial GEO communication satellites around the world. First, the justification for this attempt is given by demonstrating the enhancement in link duration. The preference for RF communication over laser communication is also discussed. Then, the preferred communication GEOs (including TURKSAT4A, which already belongs to Turkey) are given, together with the coverage enhancements obtained through STK simulations and the corresponding link budget. A block diagram of the communication system on the LEO satellite is also given.

Keywords: communication, GEO satellite, data relay system, coverage


24516 The Development of Encrypted Near Field Communication Data Exchange Format Transmission in an NFC Passive Tag for Checking the Genuine Product

Authors: Tanawat Hongthai, Dusit Thanapatay

Abstract:

This paper presents the development of encrypted near field communication (NFC) data exchange format transmission in an NFC passive tag, to assess the feasibility of implementing genuine product authentication. We organize the research on encryption and genuine product checking into four major categories: concept, infrastructure, development and applications. The results show that a passive NFC Forum Type 2 tag can be configured to be compatible with the NFC data exchange format (NDEF), and that its data can be partially updated automatically when an NFC field is present.

Keywords: near field communication, NFC data exchange format, checking the genuine product, encrypted NFC


24515 The Isolation of Enterobacter Ludwigii Strain T976 from Nicotiana Tabacum L. Yunyan 97 and Its Application Study

Authors: Gao Qin, Hu Liwei, Dong Xiangzhou, Zhu Qifa, Cheng Tingming, Zhao Limei, Yang Mengmeng, Zhai Zhen, Dai Huaxin, Liang Taibo, Zhang Shixiang, Xue Chaoqun

Abstract:

The functional strain T976 for starch degradation was isolated from Nicotiana tabacum L. Yunyan 97 tobacco leaves. The ratio of the starch hydrolysis transparent circle diameter to the colony diameter of the strain was 4.14, and 16S rDNA sequencing identified the strain as Enterobacter ludwigii. Enterobacter ludwigii T976 was then fermented, and the fermentation broth was sprayed on Yunyan 97 plants at the vigorous growing stage. After a single spraying of the fermentation broth, the starch content of upper leaves decreased slightly, from 3.77% to 3.1%, the reducing sugar content increased from 4.39% to 5.53%, and the total sugar content increased from 5.82% to 7.39%. The chemical content was also checked after three sprayings. The starch content of middle leaves decreased from 5.63% to 3.74%, while the content of total sugar and reducing sugar decreased slightly. The starch content of upper leaves decreased from 7.62% to 4.78%, while the total sugar and reducing sugar decreased slightly; the starch content of middle leaves decreased from 6.27% to 3.62%, while the total sugar and reducing sugar did not change much, and other chemical components were in a suitable range.

Keywords: Nicotiana tabacum, Yunyan 97, leaf, starch, degradation, Enterobacter ludwigii


24514 Data Hiding by Vector Quantization in Color Image

Authors: Yung Gi Wu

Abstract:

With the growth of computers and networks, digital data can be spread anywhere in the world quickly. Digital data can also be copied or tampered with easily, so security has become an important topic in the protection of digital data. Digital watermarking is a method to protect the ownership of digital data. Embedding a watermark inevitably affects image quality. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to achieve data hiding. This kind of watermarking is invisible, which means that users will not be aware of the embedded watermark even though the embedded image differs slightly from the original image. Because VQ carries a heavy computational burden, we adopt a fast VQ encoding scheme based on partial distortion searching (PDS) and a mean approximation scheme to speed up the data hiding process. The watermarks we hide in the image can be gray, bi-level or color images. Text can also be embedded as a watermark. To test the robustness of the system, we use Photoshop to apply sharpening, cropping and altering, and check whether the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist these three kinds of tampering in general cases.
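Partial distortion searching, mentioned in the abstract above, can be sketched as a nearest-codeword search that abandons a candidate as soon as its running squared error reaches the best distortion found so far. This is a generic illustration of the PDS idea only; the function name `pds_nearest`, the toy codebook and the omission of the mean approximation step are all assumptions, not the paper's scheme.

```python
def pds_nearest(vector, codebook):
    """Return (index, distortion) of the nearest codeword using PDS.

    Distortion is the squared Euclidean distance. For each codeword the
    sum is accumulated component by component and the codeword is
    rejected early once the partial sum is no better than the current best.
    """
    best_index, best_dist = -1, float("inf")
    for index, codeword in enumerate(codebook):
        dist = 0.0
        for x, c in zip(vector, codeword):
            dist += (x - c) ** 2
            if dist >= best_dist:  # partial sum already too large
                break
        else:
            # Loop finished without a break: this codeword is the new best.
            best_index, best_dist = index, dist
    return best_index, best_dist


# Hypothetical 3-component (e.g. RGB) codebook and input vector.
codebook = [(0, 0, 0), (10, 10, 10), (255, 255, 255)]
index, distortion = pds_nearest((12, 9, 11), codebook)
```

The result is identical to a full search; the saving comes from skipping the remaining components of codewords that are clearly worse than the current best.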

Keywords: data hiding, vector quantization, watermark, color image


24513 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtime and the costs of periodic maintenance. However, there is little research on this topic and even less on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The autoencoders contain Long Short-Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. The performance of the model is assessed a posteriori through the F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.
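The step between reconstruction and classification described above can be sketched as follows: for each sensor, subtract the reconstruction from the input and summarize the difference signal with a few statistics that a classifier could consume. The autoencoders and the random forest are not reproduced here; the reconstructions are supplied as plain lists, and the function name `residual_features` and the three chosen statistics are assumptions for illustration, not the paper's feature set.

```python
from statistics import mean, pstdev


def residual_features(inputs, reconstructions):
    """Build one feature vector from per-sensor difference signals.

    inputs, reconstructions: dicts mapping a sensor name to a list of
    values for one window. Sensors are processed in sorted name order so
    the feature layout is stable. Per sensor, three features are emitted:
    mean absolute residual, maximum absolute residual, residual std. dev.
    """
    features = []
    for sensor in sorted(inputs):
        residual = [x - r for x, r in zip(inputs[sensor], reconstructions[sensor])]
        abs_residual = [abs(v) for v in residual]
        features.extend([mean(abs_residual), max(abs_residual), pstdev(residual)])
    return features


# Hypothetical window: the temperature reconstruction misses a spike.
window = {"temperature": [20.0, 21.0, 30.0], "power": [5.0, 5.0, 5.0]}
rebuilt = {"temperature": [20.0, 21.0, 22.0], "power": [5.0, 5.0, 5.0]}
features = residual_features(window, rebuilt)
```

Keeping one residual per sensor, rather than a single pooled reconstruction error, is what lets a downstream classifier see which sensor deviated.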

Keywords: anomaly detection, autoencoder, data centers, deep learning


24512 Production of Lignocellulosic Enzymes by Bacillus safensis LCX Using Agro-Food Wastes in Solid State Fermentation

Authors: Abeer A. Q. Ahmed, Tracey McKay

Abstract:

The increasing demand for renewable fuels and chemicals is pressuring the manufacturing industry toward finding more sustainable, cost-effective resources. Lignocellulose, such as agro-food wastes, is a suitable alternative to petroleum for the production of fine chemicals and fuels. The complex structure of lignocellulose, however, requires a variety of enzymes to degrade its components into their respective building blocks, which can then be used for the production of various value-added products. This study aimed to isolate a bacterial strain with the ability to produce a variety of lignocellulosic enzymes. One bacterial isolate was identified by 16S rRNA gene sequencing and phylogenetic analysis as Bacillus safensis LCX and was found to have CMCase, xylanase, manganese peroxidase, lignin peroxidase, and laccase activities. Enzyme production was induced by growing Bacillus safensis LCX in solid state fermentation using wheat straw, wheat bran, and corn stover. The enzyme activities were determined by specific colorimetric assays. This study presents Bacillus safensis LCX as a promising source of lignocellulosic enzymes. These findings can extend the knowledge of agro-food waste valorization strategies toward sustainable production of fuels and chemicals.

Keywords: Bacillus safensis LCX, high valued chemicals, lignocellulosic enzymes, solid state fermentation


24511 MicroRNA Drivers of Resistance to Androgen Deprivation Therapy in Prostate Cancer

Authors: Philippa Saunders, Claire Fletcher

Abstract:

INTRODUCTION: Prostate cancer is the most prevalent malignancy affecting Western males. It is initially an androgen-dependent disease: androgens bind to the androgen receptor and drive the expression of genes that promote proliferation and evasion of apoptosis. Despite reduced androgen dependence in advanced prostate cancer, androgen receptor signaling remains a key driver of growth. Androgen deprivation therapy (ADT) is, therefore, a first-line treatment approach and works well initially, but resistance inevitably develops. Abiraterone and Enzalutamide are drugs widely used in ADT and are androgen synthesis and androgen receptor signaling inhibitors, respectively. The shortage of other treatment options means acquired resistance to these drugs is a major clinical problem. MicroRNAs (miRs) are important mediators of post-transcriptional gene regulation and show altered expression in cancer. Several have been linked to the development of resistance to ADT. Manipulation of such miRs may be a pathway to breakthrough treatments for advanced prostate cancer. This study aimed to validate ADT resistance-implicated miRs and their clinically relevant targets. MATERIAL AND METHOD: Small RNA-sequencing of Abiraterone- and Enzalutamide-resistant C42 prostate cancer cells identified subsets of miRs dysregulated as compared to parental cells. Real-Time Quantitative Reverse Transcription PCR (qRT-PCR) was used to validate altered expression of candidate ADT resistance-implicated miRs 195-5p, 497-5p and 29a-5p in ADT-resistant and -responsive prostate cancer cell lines, patient-derived xenografts (PDXs) and primary prostate cancer explants. RESULTS AND DISCUSSION: This study suggests a possible role for miR-497-5p in the development of ADT resistance in prostate cancer. MiR-497-5p expression was increased in ADT-resistant versus ADT-responsive prostate cancer cells. 
Importantly, miR-497-5p expression was also increased in Enzalutamide-treated, castrated (ADT-mimicking) PDXs versus intact PDXs. MiR-195-5p was also elevated in ADT-resistant versus -responsive prostate cancer cells, while there was a drop in miR-29a-5p expression. Candidate clinically relevant targets of miR-497-5p in prostate cancer were identified by mining AGO-PAR-CLIP-seq data sets and may include AVL9 and FZD6. CONCLUSION: In summary, this study identified microRNAs that are implicated in prostate cancer resistance to androgen deprivation therapy and could represent novel therapeutic targets for advanced disease.

Keywords: microRNA, androgen deprivation therapy, Enzalutamide, abiraterone, patient-derived xenograft


24510 Integration Process and Analytic Interface of different Environmental Open Data Sets with Java/Oracle and R

Authors: Pavel H. Llamocca, Victoria Lopez

Abstract:

The main objective of our work is the comparative analysis of environmental data from Open Data bases belonging to different governments. This requires integrating data from various sources. Nowadays, many governments intend to publish thousands of data sets for people and organizations to use, and the number of applications based on Open Data is increasing accordingly. However, each government has its own procedures for publishing its data, which leads to a variety of data set formats because there are no international standards that specify the formats of data sets from Open Data bases. Because of this variety, we must build a data integration process that is able to bring together all kinds of formats. Some software tools have been developed to support the integration process, e.g., Data Tamer and Data Wrangler. The problem with these tools is that they need a data scientist to take part in the integration process as a final step. In our case, we do not want to depend on a data scientist, because environmental data are usually similar and these processes can be automated by programming. The main idea of our tool is to build Hadoop procedures adapted to the data sources of each government in order to achieve automated integration. Our work focuses on environmental data such as temperature, energy consumption, air quality, solar radiation and wind speed. For the past two years, the government of Madrid has been publishing its Open Data bases on environmental indicators in real time. In the same way, other governments (such as Andalucia or Bilbao) have published Open Data sets on the environment. All of those data sets have different formats, and our solution is able to integrate all of them; furthermore, it allows the user to perform and visualize analyses of the real-time data. Once the integration task is done, all the data from any government have the same format and the analysis can be carried out in a computationally more efficient way. The tool presented in this work therefore has two goals: 1. Integration process; and 2. Graphic and analytic interface. As a first approach, the integration process was developed using Java and Oracle, and the graphic and analytic interface with Java (jsp). However, in order to open our software tool, as a second approach we also developed an implementation in the R language, a mature open source technology. R is a powerful open source programming language that allows us to process and analyze large amounts of data with high performance. There are also R libraries, such as shiny, for building a graphic interface. A performance comparison between both implementations was made and no significant differences were found. In addition, our work provides an Official Real-Time Integrated Data Set about Environment Data in Spain to any developer so that they can build their own applications.
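The integration idea described above, one adapter per publishing source that maps its format onto a single common record layout, can be sketched briefly. The sketch is in Python rather than the Java/Oracle, R or Hadoop implementations named in the abstract, and both source formats, all field names and the functions `parse_source_a`, `parse_source_b` and `integrate` are hypothetical.

```python
import csv
import io
import json


def parse_source_a(text):
    """Hypothetical source A: semicolon-separated CSV, columns station;date;temp_c."""
    rows = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        {"station": r["station"], "date": r["date"],
         "indicator": "temperature", "value": float(r["temp_c"])}
        for r in rows
    ]


def parse_source_b(text):
    """Hypothetical source B: JSON list with keys id, day and measure {name, val}."""
    return [
        {"station": item["id"], "date": item["day"],
         "indicator": item["measure"]["name"], "value": float(item["measure"]["val"])}
        for item in json.loads(text)
    ]


# One adapter per source format; adding a source means adding one entry.
PARSERS = {"a": parse_source_a, "b": parse_source_b}


def integrate(sources):
    """Merge (kind, text) pairs into one list of records in the common layout."""
    records = []
    for kind, text in sources:
        records.extend(PARSERS[kind](text))
    return sorted(records, key=lambda r: (r["date"], r["station"]))


csv_text = "station;date;temp_c\nM01;2015-06-02;31.5\n"
json_text = '[{"id": "B07", "day": "2015-06-01", "measure": {"name": "temperature", "val": "24"}}]'
merged = integrate([("a", csv_text), ("b", json_text)])
```

Once every record shares the same keys, any downstream analysis or plotting code can be written once and applied to all sources.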

Keywords: open data, R language, data integration, environmental data


24509 Transforming Data into Knowledge: Mathematical and Statistical Innovations in Data Analytics

Authors: Zahid Ullah, Atlas Khan

Abstract:

The rapid growth of data in various domains has created a pressing need for effective methods to transform this data into meaningful knowledge. In this era of big data, mathematical and statistical innovations play a crucial role in unlocking insights and facilitating informed decision-making in data analytics. This abstract aims to explore the transformative potential of these innovations and their impact on converting raw data into actionable knowledge. Drawing upon a comprehensive review of existing literature, this research investigates the cutting-edge mathematical and statistical techniques that enable the conversion of data into knowledge. By evaluating their underlying principles, strengths, and limitations, we aim to identify the most promising innovations in data analytics. To demonstrate the practical applications of these innovations, real-world datasets will be utilized through case studies or simulations. This empirical approach will showcase how mathematical and statistical innovations can extract patterns, trends, and insights from complex data, enabling evidence-based decision-making across diverse domains. Furthermore, a comparative analysis will be conducted to assess the performance, scalability, interpretability, and adaptability of different innovations. By benchmarking against established techniques, we aim to validate the effectiveness and superiority of the proposed mathematical and statistical innovations in data analytics. Ethical considerations surrounding data analytics, such as privacy, security, bias, and fairness, will be addressed throughout the research. Guidelines and best practices will be developed to ensure the responsible and ethical use of mathematical and statistical innovations in data analytics. 
The expected contributions of this research include advancements in mathematical and statistical sciences, improved data analysis techniques, enhanced decision-making processes, and practical implications for industries and policymakers. The outcomes will guide the adoption and implementation of mathematical and statistical innovations, empowering stakeholders to transform data into actionable knowledge and drive meaningful outcomes.

Keywords: data analytics, mathematical innovations, knowledge extraction, decision-making


24508 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) techniques are used to reduce data size and thereby improve the performance of data mining methods. Recently, to process very large data sets, several proposed methods divide the training set into disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitations of these methods and give our viewpoint on how to divide and conquer in the IS procedure. Then, based on the fast condensed nearest neighbor (FCNN) rule, we propose an instance selection method for large data sets using the MapReduce framework. Besides ensuring prediction accuracy and reduction rate, it has two desirable properties: first, it reduces the workload in the aggregation node; second, and most important, it produces the same result as the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.
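As background for the condensed nearest neighbor family that the abstract above builds on, here is a minimal sketch of the classic sequential condensed nearest neighbor rule. It is not the FCNN rule and has no MapReduce component; it only shows what instance selection produces: a subset on which a 1-NN classifier still labels every training point correctly. The function name `condensed_nn` and the toy data are assumptions.

```python
def condensed_nn(points, labels):
    """Return indices of a training-set-consistent subset (classic CNN rule).

    Starting from the first point, repeatedly scan the training set and
    add every point that the current subset misclassifies with 1-NN,
    until a full pass adds nothing.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def predict(subset, p):
        nearest = min(subset, key=lambda i: sq_dist(points[i], p))
        return labels[nearest]

    if not points:
        return []
    subset = [0]
    changed = True
    while changed:
        changed = False
        for i, p in enumerate(points):
            if i not in subset and predict(subset, p) != labels[i]:
                subset.append(i)
                changed = True
    return subset


# Hypothetical two-class data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
selected = condensed_nn(points, labels)
```

On this toy set, two of six instances suffice. The selected subset depends on scan order, which is one reason why naively running such a rule independently on disjoint partitions does not generally reproduce the sequential result.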

Keywords: instance selection, data reduction, MapReduce, kNN


24507 Decision Tree Based Scheduling for Flexible Job Shops with Multiple Process Plans

Authors: H.-H. Doh, J.-M. Yu, Y.-J. Kwon, J.-H. Shin, H.-W. Kim, S.-H. Nam, D.-H. Lee

Abstract:

This paper suggests a decision tree based approach for flexible job shop scheduling with multiple process plans, i.e., each job can be processed through alternative operations, each of which can be processed on alternative machines. The main decisions are: (a) selecting an operation/machine pair; and (b) sequencing the jobs assigned to each machine. As an extension of the priority scheduling approach that selects the best priority rule combination after many simulation runs, this study suggests an approach in which a decision tree is used to select a priority rule combination adequate for a specific system state, so that the burden of developing simulation models and carrying out simulation runs can be eliminated. The decision tree based scheduling approach consists of construction and scheduling modules. In the construction module, a decision tree is constructed using a four-stage algorithm, and in the scheduling module, a priority rule combination is selected using the decision tree. To show the performance of the suggested approach, a case study was done on a flexible job shop with reconfigurable manufacturing cells and a conventional job shop, and the results are reported by comparing it with individual priority rule combinations for the objectives of minimizing total flow time and total tardiness.
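To make the two objectives above concrete, the sketch below sequences jobs on a single machine with two common priority rules, shortest processing time (SPT) and earliest due date (EDD), and reports total flow time and total tardiness. It illustrates only why the choice of rule matters; the decision tree, the flexible job shop with alternative machines and the paper's actual rule set are not modeled, and the function name `schedule` and the job data are assumptions.

```python
def schedule(jobs, rule):
    """Sequence jobs on one machine and evaluate two objectives.

    jobs: list of (name, processing_time, due_date), all released at time 0.
    rule: "SPT" (shortest processing time first) or
          "EDD" (earliest due date first).
    Returns (order, total_flow_time, total_tardiness).
    """
    key = {"SPT": lambda j: j[1], "EDD": lambda j: j[2]}[rule]
    clock = flow = tardiness = 0
    order = []
    for name, proc, due in sorted(jobs, key=key):
        clock += proc                      # completion time of this job
        flow += clock                      # flow time equals completion time here
        tardiness += max(0, clock - due)
        order.append(name)
    return order, flow, tardiness


# Hypothetical jobs: (name, processing time, due date).
jobs = [("A", 4, 4), ("B", 2, 20), ("C", 6, 10)]
spt = schedule(jobs, "SPT")
edd = schedule(jobs, "EDD")
```

On these jobs SPT gives the lower total flow time and EDD the lower total tardiness, so no single rule is best for both objectives, which is the situation a state-dependent rule selector is meant to handle.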

Keywords: flexible job shop scheduling, decision tree, priority rules, case study


24506 A Design Framework for an Open Market Platform of Enriched Card-Based Transactional Data for Big Data Analytics and Open Banking

Authors: Trevor Toy, Josef Langerman

Abstract:

Around a quarter of the world’s data is generated by financial with an estimated 708.5 billion global non-cash transactions reached between 2018 and. And with Open Banking still a rapidly developing concept within the financial industry, there is an opportunity to create a secure mechanism for connecting its stakeholders to openly, legitimately and consensually share the data required to enable it. Integration and data sharing of anonymised transactional data are still operated in silos and centralised between the large corporate entities in the ecosystem that have the resources to do so. Smaller fintechs generating data and businesses looking to consume data are largely excluded from the process. Therefore there is a growing demand for accessible transactional data for analytical purposes and also to support the rapid global adoption of Open Banking. The following research has provided a solution framework that aims to provide a secure decentralised marketplace for 1.) data providers to list their transactional data, 2.) data consumers to find and access that data, and 3.) data subjects (the individuals making the transactions that generate the data) to manage and sell the data that relates to themselves. The platform also provides an integrated system for downstream transactional-related data from merchants, enriching the data product available to build a comprehensive view of a data subject’s spending habits. A robust and sustainable data market can be developed by providing a more accessible mechanism for data producers to monetise their data investments and encouraging data subjects to share their data through the same financial incentives. At the centre of the platform is the market mechanism that connects the data providers and their data subjects to the data consumers. 
This core component of the platform is developed on a decentralised blockchain contract with a market layer that manages transaction, user, pricing, payment, tagging, contract, control, and lineage features that pertain to the user interactions on the platform. One of the platform’s key features is enabling the participation and management of personal data by the individuals from whom the data is being generated. A proof-of-concept of this framework was developed on the Ethereum blockchain, where an individual can securely manage access to their own personal data and to that individual’s identifiable relationship to the card-based transaction data provided by financial institutions. This gives data consumers access to a complete view of transactional spending behaviour in correlation to key demographic information. This platform solution can ultimately support the growth, prosperity, and development of economies, businesses, communities, and individuals by providing accessible and relevant transactional data for big data analytics and open banking.

Keywords: big data markets, open banking, blockchain, personal data management

Procedia PDF Downloads 70

24505 Rapid Detection and Differentiation of Camel Pox, Contagious Ecthyma and Papilloma Viruses in Clinical Samples of Camels Using a Multiplex PCR

Authors: A. I. Khalafalla, K. A. Al-Busada, I. M. El-Sabagh

Abstract:

Pox and pox-like diseases of camels are a group of exanthematous skin conditions that have become increasingly important economically. They may be caused by three distinct viruses: camelpox virus (CMPV), camel contagious ecthyma virus (CCEV) and camel papillomavirus (CAPV). These diseases are difficult to differentiate based on clinical presentation in disease outbreaks. Molecular methods such as PCR targeting species-specific genes have been developed and used to identify CMPV and CCEV, but not simultaneously in a single tube. Recently, multiplex PCR has gained a reputation as a convenient diagnostic method with cost- and time-saving benefits. In the present communication, we describe the development, optimization and validation of a multiplex PCR assay able to detect the genomes of the three viruses simultaneously in a single test, allowing for rapid and efficient molecular diagnosis. The assay was developed based on the evaluation and combination of published and new primer sets, and was applied to 110 tissue samples. The method showed high sensitivity, and the specificity was confirmed by PCR-product sequencing. In conclusion, this rapid, sensitive and specific assay is considered a useful method for identifying three important viruses in specimens from camels and as part of a molecular diagnostic regime.

Keywords: multiplex PCR, diagnosis, pox and pox-like diseases, camels

Procedia PDF Downloads 463

24504 Experimental Evaluation of Succinct Ternary Tree

Authors: Dmitriy Kuptsov

Abstract:

Tree data structures, such as binary or, more generally, k-ary trees, are essential in computer science. The applications of these data structures range from data search and retrieval to sorting and ranking algorithms. Naive implementations of these data structures can consume prohibitively large volumes of random access memory, limiting their applicability in certain solutions. In these cases, a more advanced representation of these data structures is essential. In this paper, we present the design of a compact version of the ternary tree data structure and report the results of an experimental evaluation using the static dictionary problem. We compare these results with the results for binary and regular ternary trees. The evaluation shows that our design, in the best case, consumes up to 12 times less memory (for the dictionary used in our experimental evaluation) than a regular ternary tree and, in certain configurations, shows performance comparable to regular ternary trees. We have evaluated the performance of the algorithms on both 32-bit and 64-bit operating systems.
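
For orientation, the static dictionary problem on a regular (pointer-based) ternary tree can be sketched as below. This is an illustrative sketch only: it assumes the "regular ternary tree" is a ternary search tree over characters, and the class and method names are hypothetical. The succinct encoding evaluated in the paper is not described in the abstract and is not reproduced here.

```python
class Node:
    """One node of a pointer-based ternary search tree (assumed baseline)."""
    __slots__ = ("ch", "lo", "eq", "hi", "is_end")

    def __init__(self, ch):
        self.ch = ch          # character stored at this node
        self.lo = None        # subtree for characters smaller than ch
        self.eq = None        # subtree for the next character of the word
        self.hi = None        # subtree for characters greater than ch
        self.is_end = False   # True if a dictionary word ends here


class TernarySearchTree:
    """Hypothetical static dictionary: build once, then query membership."""

    def __init__(self, words=()):
        self.root = None
        for word in words:
            self.insert(word)

    def insert(self, word):
        if word:
            self.root = self._insert(self.root, word, 0)

    def _insert(self, node, word, i):
        ch = word[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, word, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, word, i)
        elif i + 1 < len(word):
            node.eq = self._insert(node.eq, word, i + 1)
        else:
            node.is_end = True
        return node

    def contains(self, word):
        if not word:
            return False
        node, i = self.root, 0
        while node is not None:
            ch = word[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i + 1 < len(word):
                node, i = node.eq, i + 1
            else:
                return node.is_end
        return False


if __name__ == "__main__":
    tree = TernarySearchTree(["cat", "cap", "dog"])
    print(tree.contains("cat"), tree.contains("ca"), tree.contains("cow"))
```

Each node here carries three object references plus a flag, which is the kind of per-node overhead that a compact representation aims to reduce.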

Keywords: algorithms, data structures, succinct ternary tree, performance evaluation

Procedia PDF Downloads 156

24503 Predicting Data Center Resource Usage Using Quantile Regression to Conserve Energy While Fulfilling the Service Level Agreement

Authors: Ahmed I. Alutabi, Naghmeh Dezhabad, Sudhakar Ganti

Abstract:

Data centers have been growing continuously in size and demand over the last two decades. Planning for the deployment of resources has been shallow and has always resorted to over-provisioning. Data center operators try to maximize the availability of their services by allocating a multiple of the needed resources. One resource that has been wasted, with little thought, is energy. In recent years, programmable resource allocation has paved the way for more efficient and robust data centers. In this work, we examine the predictability of resource usage in a data center environment. We use a number of models that cover a wide spectrum of machine learning categories. Then we establish a framework to guarantee the client service level agreement (SLA). Our results show that using prediction can cut energy loss by up to 55%.
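
The title refers to quantile regression, but the abstract does not say which models or loss functions were used. As a hedged illustration only, the sketch below shows the pinball (quantile) loss that quantile regression minimizes, and why a high quantile suits SLA-driven provisioning: under-prediction is penalized more heavily than over-prediction. The function names and the sample usage values are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q (0 < q < 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff >= 0) costs q per unit,
        # over-prediction costs (1 - q) per unit.
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def best_constant_quantile(samples, q):
    """Constant prediction, chosen among the samples, with the lowest pinball loss."""
    candidates = sorted(set(samples))
    return min(
        candidates,
        key=lambda c: pinball_loss(samples, [c] * len(samples), q),
    )


if __name__ == "__main__":
    cpu_usage = [1, 2, 3, 4, 5]  # hypothetical usage samples
    print(best_constant_quantile(cpu_usage, 0.5))  # median-like estimate
    print(best_constant_quantile(cpu_usage, 0.9))  # conservative estimate
```

Provisioning at a high quantile rather than at a fixed multiple of the mean is one way to trade a small, controlled risk of SLA violation against lower energy use; whether the paper does exactly this is not stated in the abstract.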

Keywords: machine learning, artificial intelligence, prediction, data center, resource allocation, green computing

Procedia PDF Downloads 103

24502 Defective Autophagy Disturbs Neural Migration and Network Activity in hiPSC-Derived Cockayne Syndrome B Disease Models

Authors: Julia Kapr, Andrea Rossi, Haribaskar Ramachandran, Marius Pollet, Ilka Egger, Selina Dangeleit, Katharina Koch, Jean Krutmann, Ellen Fritsche

Abstract:

It is widely acknowledged that animal models do not always represent human disease. Human brain development in particular is difficult to model in animals due to a variety of structural and functional species-specificities. This causes significant discrepancies between predicted and apparent drug efficacies in clinical trials and their subsequent failure. Emerging alternatives based on 3D in vitro approaches, such as human brain spheres or organoids, may in the future reduce and ultimately replace animal models. Here, we present a human induced pluripotent stem cell (hiPSC)-based 3D neural in vitro disease model for Cockayne Syndrome B (CSB). CSB is a rare hereditary disease accompanied by severe neurologic defects, such as microcephaly, ataxia and intellectual disability, with currently no treatment options. Therefore, the aim of this study is to investigate the molecular and cellular defects found in neural hiPSC-derived CSB models. Understanding the underlying pathology of CSB enables the development of treatment options. The two CSB models used in this study comprise a patient-derived hiPSC line and its isogenic control, as well as a CSB-deficient cell line based on a healthy hiPSC line (IMR90-4) background, thereby excluding genetic background-related effects. Neurally induced and differentiated brain sphere cultures were characterized via RNA sequencing, western blot (WB), immunocytochemistry (ICC) and multielectrode arrays (MEAs). CSB deficiency leads to altered gene expression of markers for autophagy, focal adhesion and neural network formation. Cell migration was significantly reduced and electrical activity was significantly increased in the disease cell lines. These data hint at the cellular pathologies possibly underlying CSB. By induction of autophagy, the migration phenotype could be partially rescued, suggesting a crucial role of disturbed autophagy in defective neural migration of the disease lines. 
Altered autophagy may also lead to inefficient mitophagy. Accordingly, disease cell lines were shown to have a lower mitochondrial base activity and a higher susceptibility to mitochondrial stress induced by rotenone. Since mitochondria play an important role in neurotransmitter cycling, we suggest that defective mitochondria may lead to altered electrical activity in the disease cell lines. Failure to clear the defective mitochondria by mitophagy, and thus missing initiation cues for new mitochondrial production, could potentiate this problem. With our data, we aim to establish a disease adverse outcome pathway (AOP), thereby adding to the in-depth understanding of this multi-faceted disorder and subsequently contributing to alternative drug development.

Keywords: autophagy, disease modeling, in vitro, pluripotent stem cells

Procedia PDF Downloads 119

24501 Prosperous Digital Image Watermarking Approach by Using DCT-DWT

Authors: Prabhakar C. Dhavale, Meenakshi M. Pawar

Abstract:

Every day, tons of data are embedded in digital media or distributed over the internet. The data are distributed in such a way that they can easily be replicated without error, putting the rights of their owners at risk. Even when encrypted for distribution, data can easily be decrypted and copied. One way to discourage illegal duplication is to insert information, known as a watermark, into potentially valuable data in such a way that it is impossible to separate the watermark from the data. These challenges have motivated researchers to carry out intense research in the field of watermarking. A watermark is a form, image or text that is impressed onto paper and provides evidence of its authenticity. Digital watermarking is an extension of the same concept. There are two types of watermarks: visible and invisible. In this project, we have concentrated on implementing watermarks in images. The main consideration for any watermarking scheme is its robustness to various attacks.

Keywords: watermarking, digital, DCT-DWT, security

Procedia PDF Downloads 417

24500 Machine Learning Data Architecture

Authors: Neerav Kumar, Naumaan Nayyar, Sharath Kashyap

Abstract:

Most companies see an increase in the adoption of machine learning (ML) applications across internal and external-facing use cases. ML applications vend output in either batch or real-time patterns. A complete batch ML pipeline architecture comprises data sourcing, feature engineering, model training, model deployment, and model output vending into a data store for downstream applications. Due to unclear role expectations, we have observed that scientists specializing in building and optimizing models are investing significant effort into building the other components of the architecture, which we do not believe is the best use of scientists’ bandwidth. We propose a system architecture created using AWS services that brings industry best practices to managing the workflow and simplifies the process of model deployment and end-to-end data integration for an ML application. This narrows down the scope of scientists’ work to model building and refinement, while specialized data engineers take over the deployment, pipeline orchestration, data quality, data permission system, etc. The pipeline infrastructure is built and deployed as code (using Terraform, CDK, CloudFormation, etc.), which makes it easy to replicate and/or extend the architecture to other models that are used in an organization.

Keywords: data pipeline, machine learning, AWS, architecture, batch machine learning

Procedia PDF Downloads 59

24499 Breast Cancer and BRCA Gene: A Study on Genetic and Environmental Interaction

Authors: Abhishikta Ghosh Roy

Abstract:

Breast cancer is the most common malignancy among women globally, including in India. Human breast cancer results from genetic and environmental interaction. The present study attempts to understand the molecular heterogeneity of the BRCA1 and BRCA2 genes, as well as the association of various lifestyle and reproductive variables with breast cancer risk. The study was conducted amongst 110 patients and 128 controls, with total DNA sequencing of the flanking and coding regions of the BRCA1 and BRCA2 genes, which revealed ten Single Nucleotide Polymorphisms (SNPs) (6 novel). The controls selected for the study were matched for age, sex and ethnic group. After written informed consent was obtained, biological samples were collected from the subjects. After detailed molecular analysis, significant (p < 0.005) molecular heterogeneity was revealed in terms of SNPs in the BRCA1 (4 exonic and 1 intronic) and BRCA2 (2 exonic and 3 intronic) genes. The augmentation study found a significant (p < 0.05) association of breast cancer risk with positive family history, early age at menarche, irregular menstrual periods, menopause, prolonged contraceptive use, nulliparity, history of abortions, consumption of alcohol and smoking. To the best of the authors' knowledge, this study is the first of its kind, and it is envisaged that the identification of the SNPs and modification of the lifestyle factors might help to minimize the risk among Bengalee Hindu females.

Keywords: breast cancer, BRCA, lifestyle, India

Procedia PDF Downloads 109

24498 Characterization of a Lipolytic Enzyme of Pseudomonas nitroreducens Isolated from Mealworm's Gut

Authors: Jung-En Kuan, Whei-Fen Wu

Abstract:

In this study, a symbiotic bacterium was isolated from the mid-gut of the yellow mealworm (Tenebrio molitor) on the basis of its growth on minimal-tributyrin medium. After PCR amplification of its 16S rDNA, the resultant nucleotide sequences were analyzed using phylogenetic trees. Accordingly, it was designated Pseudomonas nitroreducens D-01. Next, by searching for lipolytic enzymes in its protein data bank, one potential lipolytic α/β hydrolase was identified, again using PCR amplification and nucleotide-sequencing methods. To construct an expression plasmid for this lipolytic gene, target-gene primers carrying the C-terminal his-tag sequences were designed. A recombinant lipolytic hydrolase D gene with his-tag nucleotides was successfully cloned into the vector pET21a, in which the lipolytic D gene is under the control of the T7 promoter. After transformation of the resultant plasmids into Escherichia coli BL21 (DE3), IPTG was used to induce expression of the recombinant proteins. The protein products were then purified by metal-ion affinity column, and the purified proteins were found capable of forming a clear zone on tributyrin agar plates. Briefly, enzyme activity was determined by degradation of p-nitrophenyl ester(s), and the yellow end-product, p-nitrophenol, was measured at O.D. 405 nm. Specifically, this lipolytic enzyme efficiently targets p-nitrophenyl butyrate. It shows its highest activity at 40°C and pH 8 in potassium phosphate buffer. In thermal stability assays, the activity of this enzyme drops dramatically when the temperature is above 50°C. In metal ion assays, MgCl₂ and NH₄Cl increase the enzyme activity, while MnSO₄, NiSO₄, CaCl₂, ZnSO₄, CoCl₂, CuSO₄, FeSO₄, and FeCl₃ reduce it. NaCl has no effect on the enzyme activity. 
Most organic solvents, such as hexane, methanol, ethanol, acetone, isopropanol, chloroform, and ethyl acetate, decrease the activity of this enzyme. However, its activity increases in the presence of DMSO. The surfactants Triton X-100, Tween 80, Tween 20, and Brij35 all decrease its lipolytic activity. Using the Lineweaver-Burk double reciprocal method, the kinetic parameters of the enzyme were determined as Km = 0.488 mM, Vmax = 0.0644 mM/min, and kcat = 3.01 × 10³ s⁻¹, giving a catalytic efficiency kcat/Km of 6.17 × 10³ mM⁻¹ s⁻¹. Finally, based on the phylogenetic analyses, this lipolytic protein is classified as a type IV lipase by its homologous conserved region in this lipase family.
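
The double reciprocal fit named above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the substrate concentrations are hypothetical, and the rates are generated noise-free from the Michaelis-Menten equation using the reported Km and Vmax, so the fit simply recovers those two values. The function name is an assumption.

```python
def lineweaver_burk(substrate, rate):
    """Estimate (Km, Vmax) by least squares on 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rate]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept   # y-intercept is 1/Vmax
    km = slope * vmax        # slope is Km/Vmax
    return km, vmax


if __name__ == "__main__":
    KM, VMAX, KCAT = 0.488, 0.0644, 3.01e3       # values reported in the abstract
    substrate_mM = [0.1, 0.25, 0.5, 1.0, 2.0]    # hypothetical concentrations
    rates = [VMAX * s / (KM + s) for s in substrate_mM]  # synthetic, noise-free

    km, vmax = lineweaver_burk(substrate_mM, rates)
    print(f"Km = {km:.3f} mM, Vmax = {vmax:.4f} mM/min")
    print(f"kcat/Km = {KCAT / km:.3g} per mM per s")
```

Dividing the reported kcat by the reported Km gives about 6.17 × 10³ mM⁻¹ s⁻¹, which matches the catalytic efficiency stated in the abstract.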

Keywords: enzyme, esterase, lipolytic hydrolase, type IV

Procedia PDF Downloads 129

24497 A Comparison of Image Data Representations for Local Stereo Matching

Authors: André Smith, Amr Abdel-Dayem

Abstract:

The stereo matching problem, while having been present for several decades, continues to be an active area of research. The goal of this research is to find correspondences between elements found in a set of stereoscopic images. With these pairings, it is possible to infer the distance of objects within a scene relative to the observer. Advancements in this field have led to experimentation with various techniques, from graph-cut energy minimization to artificial neural networks. At the basis of these techniques is a cost function, which is used to evaluate the likelihood of a particular match between points in each image. While the cost is, at its core, based on comparing image pixel data, there is a general lack of consistency as to which image data representation to use. This paper presents an experimental analysis comparing the effectiveness of the more common image data representations. The goal is to determine how effectively each representation reduces the cost of the correct correspondence relative to other possible matches.
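
To make the role of the cost function concrete, here is a minimal sketch of local matching along a single scanline. It assumes a sum-of-absolute-differences (SAD) cost over single-channel intensities, which is one common choice and not necessarily one of the costs or representations compared in the paper; the function names and the toy scanlines are hypothetical.

```python
def sad_cost(left, right, x, d, half_window):
    """SAD between a window centred at x in the left row and at x - d in the right row."""
    cost = 0
    for dx in range(-half_window, half_window + 1):
        cost += abs(left[x + dx] - right[x + dx - d])
    return cost


def best_disparity(left, right, x, max_disparity, half_window=1):
    """Winner-takes-all: the disparity with the lowest cost for pixel x.

    Assumes x + half_window is inside the row; disparities whose window
    would fall off the left edge of the right row are skipped.
    """
    candidates = [
        d for d in range(max_disparity + 1) if x - half_window - d >= 0
    ]
    return min(candidates, key=lambda d: sad_cost(left, right, x, d, half_window))


if __name__ == "__main__":
    # Toy scanlines: the left row is the right row shifted by two pixels.
    right_row = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
    left_row = [0, 0, 0, 0, 5, 9, 5, 0, 0, 0]
    print(best_disparity(left_row, right_row, x=5, max_disparity=4))
```

Changing the image data representation amounts to changing what the per-pixel values in `left` and `right` are, and therefore how sharply the cost separates the correct disparity from the alternatives.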

Keywords: colour data, local stereo matching, stereo correspondence, disparity map

Procedia PDF Downloads 366

24496 Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System

Authors: Karima Qayumi, Alex Norta

Abstract:

The rapid generation of a high volume and broad variety of data from the application of new technologies poses challenges for the generation of business intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods in order to develop their business. Consequently, the recently decentralized data management environment relies on a distributed computing paradigm. While data are stored in highly distributed systems, the implementation of distributed data-mining techniques is a challenge. The aim of these techniques is to gather knowledge from every domain and from all the datasets stemming from distributed resources. As agent technologies offer significant contributions to managing the complexity of distributed systems, we consider them for next-generation data-mining processes. To demonstrate agent-based business intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.

Keywords: agent-oriented modeling (AOM), business intelligence model (BIM), distributed data mining (DDM), multi-agent system (MAS)

Procedia PDF Downloads 424

24495 Timing and Noise Data Mining Algorithm and Software Tool in Very Large Scale Integration (VLSI) Design

Authors: Qing K. Zhu

Abstract:

Very Large Scale Integration (VLSI) design has become very complex due to the continuous integration of millions of gates in one chip, in line with Moore’s law. Designers encounter numerous report files during design iterations using timing and noise analysis tools. This paper presents our work using data mining techniques combined with HTML tables to extract and represent critical timing/noise data. When this data-mining tool is applied in real applications, running speed is important. Based on performance testing results, the software employs table look-up techniques to achieve a reasonable running speed. We added several advanced features for its application in one industry chip design.

Keywords: VLSI design, data mining, big data, HTML forms, web, VLSI, EDA, timing, noise

Procedia PDF Downloads 251

24494 Introduction of Electronic Health Records to Improve Data Quality in Emergency Department Operations

Authors: Anuruddha Jagoda, Samiddhi Samarakoon, Anil Jasinghe

Abstract:

In its simplest form, data quality can be defined as 'fitness for use', and it is a multi-dimensional concept. Emergency Departments (EDs) require information to treat patients, and they are also the primary source of information regarding accidents, injuries, emergencies, etc. In addition, they are the starting point of various patient registries, databases and surveillance systems. This interventional study was carried out to improve data quality at the ED of the National Hospital of Sri Lanka (NHSL) by introducing an e-health solution. The NHSL is the premier trauma care centre in Sri Lanka. The study consisted of three components. A research study was conducted to assess the quality of data in relation to five selected dimensions of data quality, namely accuracy, completeness, timeliness, legibility and reliability. The intervention was to develop and deploy an electronic emergency department information system (eEDIS). Post-intervention assessment confirmed that all five dimensions of data quality had improved. The most significant improvements were observed in the accuracy and timeliness dimensions.

Keywords: electronic health records, electronic emergency department information system, emergency department, data quality

Procedia PDF Downloads 267

24493 Data Presentation of Lane-Changing Events Trajectories Using HighD Dataset

Authors: Basma Khelfa, Antoine Tordeux, Ibrahima Ba

Abstract:

We present a descriptive data analysis of lane-changing events on multi-lane roads. The data come from the Highway Drone Dataset (HighD), which contains microscopic vehicle trajectories on highways. This paper describes and analyses the role of the different parameters and their significance. Using the HighD data, we aim to find the most frequent reasons that motivate drivers to change lanes. We used the programming language R to process these data. We analyze the involvement of, and relationships between, the variables describing the ego vehicle and the four vehicles surrounding it, i.e., distance, speed difference, time gap, and acceleration. This was studied according to the class of the vehicle (car or truck) and according to the maneuver it undertook (overtaking or falling back).

Keywords: autonomous driving, physical traffic model, prediction model, statistical learning process

Procedia PDF Downloads 250