Search results for: biological data mining
26039 Experiments on Weakly-Supervised Learning on Imperfect Data
Authors: Yan Cheng, Yijun Shao, James Rudolph, Charlene R. Weir, Beth Sahlmann, Qing Zeng-Treitler
Abstract:
Supervised predictive models require labeled data for training. Complete and accurate labeled data, i.e., a ‘gold standard’, is not always available, and imperfectly labeled data may need to serve as an alternative. An important question is whether the accuracy of the labeled data creates a performance ceiling for the trained model. In this study, we trained several models to recognize the presence of delirium in clinical documents using data with annotations that are not completely accurate (i.e., weakly-supervised learning). In the external evaluation, the support vector machine model with a linear kernel performed best, achieving an area under the curve of 89.3% and accuracy of 88%, surpassing the 80% accuracy of the training sample. We then generated a set of simulated data and carried out a series of experiments which demonstrated that models trained on imperfect data can (but do not always) exceed the accuracy of the training data, e.g., the area under the curve for some models is higher than 80% when trained on data with an error rate of 40%. Our experiments also showed that the error resistance of linear modeling is associated with larger sample size, error type, and linearity of the data (all p-values < 0.001). In conclusion, this study sheds light on the usefulness of imperfect data in clinical research via weakly-supervised learning.
Keywords: weakly-supervised learning, support vector machine, prediction, delirium, simulation
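The simulation idea in this abstract can be illustrated with a small sketch. This is not the authors' experiment: it assumes one-dimensional Gaussian features, symmetric label flipping at a 20% error rate, and a nearest-centroid classifier standing in for the support vector machine. All function names and parameters are hypothetical.

```python
import random

def make_data(rng, n, error_rate):
    """Draw n points from two Gaussian classes and flip labels at error_rate."""
    xs, true_labels, noisy_labels = [], [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        xs.append(rng.gauss(1.5 if y == 1 else -1.5, 1.0))
        true_labels.append(y)
        # Symmetric label noise: each label is flipped with the same probability.
        noisy_labels.append(1 - y if rng.random() < error_rate else y)
    return xs, true_labels, noisy_labels

def simulate(n=4000, error_rate=0.2, seed=0):
    rng = random.Random(seed)
    train_x, train_true, train_noisy = make_data(rng, n, error_rate)
    test_x, test_true, _ = make_data(rng, n, error_rate)

    # Train a nearest-centroid classifier using only the noisy labels.
    centroid = {}
    for label in (0, 1):
        members = [x for x, l in zip(train_x, train_noisy) if l == label]
        centroid[label] = sum(members) / len(members)

    preds = [0 if abs(x - centroid[0]) <= abs(x - centroid[1]) else 1
             for x in test_x]

    # Accuracy of the training labels themselves versus the trained model,
    # both measured against the true labels.
    label_acc = sum(a == b for a, b in zip(train_noisy, train_true)) / n
    model_acc = sum(a == b for a, b in zip(preds, test_true)) / n
    return label_acc, model_acc

label_acc, model_acc = simulate()
print(f"training-label accuracy: {label_acc:.3f}, model accuracy: {model_acc:.3f}")
```

Because the flips are symmetric and the classes are balanced in this toy setup, the two centroids shrink toward each other but their midpoint stays close to the true decision boundary, which is one intuition for why a model can exceed the accuracy of its training labels.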
Procedia PDF Downloads 199
26038 Synthesis, Characterization and Biological Evaluation of Some Pyrazole Derivatives
Authors: Afifa Hafidh, Hedia Chaabane
Abstract:
This work focuses mainly on the synthetic strategies and biological activities associated with pyrazoles. Pyrazole derivatives were successfully synthesized by a simple and facile method and studied for their antibacterial activity. These compounds were prepared from pyrazolic difunctional compounds as starting materials by reaction with salicylic acid, paracetamol, and thiosemicarbazide, respectively. The structures of all the prepared compounds were confirmed using FT-IR, 1H-NMR, and 13C-NMR spectra in addition to melting points. The pyrazolic derivatives were screened for antimicrobial activity against gram-positive bacteria, gram-negative bacteria, and Candida albicans, using the agar diffusion method and the filter paper disc-diffusion method. The synthesized compounds were found to exhibit high antibacterial and antifungal efficiency against several of the tested strains. Ampicillin was used as the positive control for all strains except Candida albicans, for which Nystatin was used. The results reveal that the antibacterial activity of some pyrazolic derivatives is comparable to that observed for the control samples (Ampicillin and Nystatin), suggesting a strong antibacterial activity. Analysis of these results indicates that the synthesized products act on the cell wall surfaces, which are disrupted. When these products are in contact with the bacteria, they damage the membrane, leading to the perturbation of different cellular processes and then leakage of cytoplasm, resulting in the death of the cells. The results will be presented in detail. The obtained products constitute effective antibacterial agents and important compounds for biological systems.
Keywords: salicylic acid, antimicrobial activities, antioxidant activity, paracetamol, pyrazole, thiosemicarbazide
Procedia PDF Downloads 173
26037 Water Quality Assessment Based on Operational Indicator in West Coastal Water of Malaysia
Authors: Seyedeh Belin Tavakoly Sany, H. Rosli, R. Majid, S. Aishah
Abstract:
In this study, water monitoring was performed from Nov. 2012 to Oct. 2013 to assess water quality and evaluate the spatial and temporal distribution of physicochemical and biological variables in water. Water samples were collected from 10 coastal water stations of West Port. In the case of water-quality assessment, multi-metric indices and operational indicators have been proposed to classify the trophic status at different stations. The trophic level of West Port coastal water ranges from eutrophic to hypertrophic. Chl-a concentration was used to estimate the biological response of phytoplankton biomass and indicated eutrophic conditions in West Port and mesotrophic conditions at the control site. During the study period, no eutrophication events or secondary symptoms occurred, which may be related to hydrodynamic turbulence and water exchange, which prevent the development of eutrophic conditions in the West Port.
Keywords: water quality, multi-metric indices, operational indicator, Malaysia, West Port
Procedia PDF Downloads 296
26036 Transforming Healthcare Data Privacy: Integrating Blockchain with Zero-Knowledge Proofs and Cryptographic Security
Authors: Kenneth Harper
Abstract:
Blockchain technology presents solutions for managing healthcare data, addressing critical challenges in privacy, integrity, and access. This paper explores how privacy-preserving technologies, such as zero-knowledge proofs (ZKPs) and homomorphic encryption (HE), enhance decentralized healthcare platforms by enabling secure computations and patient data protection. It examines the mathematical foundations of these methods, their practical applications, and how they meet the evolving demands of healthcare data security. Using real-world examples, this research highlights industry-leading implementations and offers a roadmap for future applications in secure, decentralized healthcare ecosystems.
Keywords: blockchain, cryptography, data privacy, decentralized data management, differential privacy, healthcare, healthcare data security, homomorphic encryption, privacy-preserving technologies, secure computations, zero-knowledge proofs
Procedia PDF Downloads 19
26035 Operating Speed Models on Tangent Sections of Two-Lane Rural Roads
Authors: Dražen Cvitanić, Biljana Maljković
Abstract:
This paper presents models for predicting operating speeds on tangent sections of two-lane rural roads developed on continuous speed data. The data correspond to 20 drivers of different ages and driving experience, driving their own cars along an 18 km long section of a state road. The data were first used to determine maximum operating speeds on tangents and to compare them with speeds in the middle of tangents, i.e., the speed data used in most operating speed studies. Analysis of continuous speed data indicated that spot speed data are not reliable indicators of relevant speeds. After that, operating speed models for tangent sections were developed. There was no significant difference between models developed using speed data in the middle of tangent sections and models developed using maximum operating speeds on tangent sections. All developed models have a higher coefficient of determination than models developed on spot speed data. Thus, it can be concluded that the method of measuring has a more significant impact on the quality of an operating speed model than the location of measurement.
Keywords: operating speed, continuous speed data, tangent sections, spot speed, consistency
Procedia PDF Downloads 452
26034 Comparative Study Using WEKA for Red Blood Cells Classification
Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy
Abstract:
Red blood cells (RBCs) are the most common type of blood cell and are the most intensively studied in cell biology. A lack of RBCs is a condition in which the hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs affect the exchange of oxygen. This paper presents a comparative study of various techniques for classifying RBCs as normal or abnormal (anemic) using WEKA. WEKA is an open-source tool comprising different machine learning algorithms for data mining applications. The algorithms tested are the Radial Basis Function neural network, the Support Vector Machine, and the K-Nearest Neighbors algorithm. Two sets of combined features were utilized for classification of blood cell images. The first set, consisting exclusively of geometrical features, was used to identify whether the tested blood cell has a spherical or non-spherical shape, while the second set, consisting mainly of textural features, was used to recognize the types of the spherical cells. We provide an evaluation based on applying these classification methods to our RBC image dataset, which was obtained from Serdang Hospital, Malaysia, and measuring the accuracy of the test results. The best classification rates achieved are 97%, 98%, and 79% for the Support Vector Machine, the Radial Basis Function neural network, and the K-Nearest Neighbors algorithm, respectively.
Keywords: K-nearest neighbors algorithm, radial basis function neural network, red blood cells, support vector machine
Procedia PDF Downloads 410
26033 The Simultaneous Application of Chemical and Biological Markers to Identify Reliable Indicators of Untreated Human Waste and Fecal Pollution in Urban Philadelphia Source Waters
Authors: Stafford Stewart, Hui Yu, Rominder Suri
Abstract:
This paper reports the results of the first known study conducted in urban Philadelphia waterways that simultaneously utilized anthropogenic chemical and biological markers to identify suitable indicators of untreated human waste and fecal pollution. A total of 13 outfall samples, 30 surface water samples, and 2 groundwater samples were analyzed for fecal contamination and untreated human waste using a suite of 25 chemical markers and 5 bio-markers. Pearson rank correlation tests were conducted to establish associations between the abundances of bio-markers and the concentrations of chemical markers. Results show that the 16S rRNA gene of human-associated Bacteroidales (BacH) was very strongly correlated (0.76 – 0.97, p < 0.05) with the labile chemical markers acetaminophen, cotinine, estriol, and urobilin. Likewise, human-specific F-RNA coliphages (F-RNA-II) and the labile chemical markers urobilin, ibuprofen, cotinine, and estriol were significantly correlated (0.77 – 0.95, p < 0.05). Similarly, a strong positive correlation (0.67 – 0.91, p < 0.05) was evident between the abundances of the bio-markers BacH and F-RNA-II and the concentrations of the conservative markers trimethoprim, meprobamate, diltiazem, triclocarban, metformin, sucralose, gemfibrozil, sulfamethoxazole, and carbamazepine. Human mitochondrial DNA (MitoH) correlated moderately with the labile markers nicotine and salicylic acid as well as with the conservative markers metformin and triclocarban (0.31 – 0.47, p < 0.05). This study showed that by associating chemical and biological markers, a robust technique was developed for fingerprinting source-specific untreated waste and fecal contamination in source waters.
Keywords: anthropogenic markers, bacteroidales, fecal pollution, source waters, wastewater
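The marker-to-marker associations reported in this abstract are correlation coefficients. As a minimal sketch of how such a coefficient is computed, the following implements the standard Pearson product-moment formula. The sample values are invented for illustration and are not data from the study, and the variable names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)

# Hypothetical paired measurements: bio-marker abundance vs. chemical concentration.
biomarker_abundance = [2.1, 3.4, 4.0, 5.2, 6.8]
chemical_concentration = [10.0, 18.0, 19.5, 27.0, 33.0]

r = pearson_r(biomarker_abundance, chemical_concentration)
print(f"r = {r:.3f}")
```

If a rank-based test is intended, applying the same function to the ranks of the values (assuming no ties) yields Spearman's coefficient instead.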
Procedia PDF Downloads 16
26032 A Neural Network Based Clustering Approach for Imputing Multivariate Values in Big Data
Authors: S. Nickolas, Shobha K.
Abstract:
The treatment of incomplete data is an important step in data pre-processing. Missing values create a noisy environment in all applications and are an unavoidable problem in big data management and analysis. Numerous techniques, such as discarding rows with missing values, mean imputation, expectation maximization, neural networks with evolutionary algorithms or optimized techniques, and hot deck imputation, have been introduced by researchers for handling missing data. Among these, imputation techniques play a positive role in filling missing values when it is necessary to use all records in the data rather than discard records with missing values. In this paper, we propose a novel artificial neural network based clustering algorithm, Adaptive Resonance Theory-2 (ART2), for imputation of missing values in mixed attribute data sets. ART2 can recognize learned models quickly and adapt to new objects rapidly. It carries out model-based clustering by using competitive learning and a self-steady mechanism in a dynamic environment without supervision. The proposed approach not only imputes the missing values but also provides information about handling the outliers.
Keywords: ART2, data imputation, clustering, missing data, neural network, pre-processing
Procedia PDF Downloads 274
26031 The Effect That the Data Assimilation of Qinghai-Tibet Plateau Has on a Precipitation Forecast
Authors: Ruixia Liu
Abstract:
The Qinghai-Tibet Plateau has an important influence on precipitation in its lower reaches. Remote sensing (RS) data has its own advantages, and a numerical prediction model that assimilates RS data is expected to perform better than one that does not. We obtained assimilated MHS, terrestrial, and sounding data from GSI and introduced the result into WRF to produce RH and precipitation forecasts. By comparing the results at 1 h, 6 h, 12 h, and 24 h, we found that assimilating MHS, terrestrial, and sounding data made the forecast of the precipitation, its area, and its center more accurate. Analysis of the differences in the initial field showed that data assimilation over the Qinghai-Tibet Plateau influences the forecast for its lower reaches by affecting the initial temperature and RH.
Keywords: Qinghai-Tibet Plateau, precipitation, data assimilation, GSI
Procedia PDF Downloads 234
26030 The Artificial Intelligence Driven Social Work
Authors: Avi Shrivastava
Abstract:
Our world continues to grapple with a lot of social issues. Economic growth and scientific advancements have not completely eradicated poverty, homelessness, discrimination and bias, gender inequality, health issues, mental illness, addiction, and other social issues. So, how do we improve the human condition in a world driven by advanced technology? The answer is simple: we will have to leverage technology to address some of the most important social challenges of the day. AI, or artificial intelligence, has emerged as a critical tool in the battle against issues that deprive marginalized and disadvantaged groups of the right to enjoy benefits that a society offers. Social work professionals can transform their lives by harnessing it. The lack of reliable data is one of the reasons why a lot of social work projects fail. Social work professionals continue to rely on expensive and time-consuming primary data collection methods, such as observation, surveys, questionnaires, and interviews, instead of tapping into AI-based technology to generate useful, real-time data and necessary insights. By leveraging AI’s data-mining ability, we can gain a deeper understanding of how to solve complex social problems and change lives of people. We can do the right work for the right people and at the right time. For example, AI can enable social work professionals to focus their humanitarian efforts on some of the world’s poorest regions, where there is extreme poverty. An interdisciplinary team of Stanford scientists, Marshall Burke, Stefano Ermon, David Lobell, Michael Xie, and Neal Jean, used AI to spot global poverty zones – identifying such zones is a key step in the fight against poverty. The scientists combined daytime and nighttime satellite imagery with machine learning algorithms to predict poverty in Nigeria, Uganda, Tanzania, Rwanda, and Malawi. 
In an article published by Stanford News, “Stanford researchers use dark of night and machine learning”, Ermon explained that they provided the machine-learning system, an application of AI, with the high-resolution satellite images and asked it to predict poverty in the African region. “The system essentially learned how to solve the problem by comparing those two sets of images [daytime and nighttime].” This is one example of how AI can be used by social work professionals to reach regions that need their aid the most. It can also help identify sources of inequality and conflict, which could reduce inequalities, according to Nature’s study, titled “The role of artificial intelligence in achieving the Sustainable Development Goals”, published in 2020. The report also notes that AI can help achieve 79 percent of the United Nations’ (UN) Sustainable Development Goals (SDGs). AI is affecting our everyday lives in many ways, yet some people do not know much about it. If someone is not familiar with this technology, they may be reluctant to use it to solve social issues. So, before we talk more about the use of AI to accomplish social work objectives, let’s put the spotlight on how AI and social work can complement each other.
Keywords: social work, artificial intelligence, AI based social work, machine learning, technology
Procedia PDF Downloads 102
26029 Positive Affect, Negative Affect, Organizational and Motivational Factor on the Acceptance of Big Data Technologies
Authors: Sook Ching Yee, Angela Siew Hoong Lee
Abstract:
Big data technologies have become a trend for exploiting business opportunities and providing valuable business insights through the analysis of big data. However, many organizations have yet to adopt big data technologies, especially small and medium-sized organizations (SMEs). This study uses the technology acceptance model (TAM) to examine several TAM constructs together with four additional constructs: positive affect, negative affect, organizational factor, and motivational factor. The conceptual model proposed in the study will be used to test the relationship between these four constructs and the intention to use big data technologies, and their influence on it. The study is empirical, with data collected through a survey.
Keywords: big data technologies, motivational factor, negative affect, organizational factor, positive affect, technology acceptance model (TAM)
Procedia PDF Downloads 362
26028 Big Data Analysis with Rhipe
Authors: Byung Ho Jung, Ji Eun Shin, Dong Hoon Lim
Abstract:
Rhipe, which integrates R with the Hadoop environment, makes it possible to process and analyze massive amounts of data using a distributed processing environment. In this paper, we implemented multiple regression analysis using Rhipe on actual data of various sizes. Experimental results comparing the performance of Rhipe with the stats and biglm packages available on bigmemory showed that Rhipe was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data increases. We also compared the computing speeds of the pseudo-distributed and fully-distributed modes for configuring a Hadoop cluster. The results showed that the fully-distributed mode was faster than the pseudo-distributed mode, and that computing speeds in the fully-distributed mode improved as the number of data nodes increased.
Keywords: big data, Hadoop, Parallel regression analysis, R, Rhipe
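One common way to parallelize multiple regression over map tasks is to have each task compute partial sums of X'X and X'y for its chunk, then add the partial sums and solve the normal equations once. The sketch below illustrates that general idea in plain Python. It is an assumption about how such an analysis can be structured, not the Rhipe implementation described in the abstract, and all function names and data are hypothetical.

```python
def map_chunk(rows):
    """Map step: partial X'X and X'y for one chunk of (x1, x2, y) rows."""
    p = 3  # intercept plus two predictors
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for x1, x2, y in rows:
        x = (1.0, x1, x2)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty

def reduce_partials(partials):
    """Reduce step: element-wise sum of the partial matrices and vectors."""
    p = 3
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for part_xtx, part_xty in partials:
        for i in range(p):
            xty[i] += part_xty[i]
            for j in range(p):
                xtx[i][j] += part_xtx[i][j]
    return xtx, xty

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

# Synthetic, noise-free data generated from y = 1 + 2*x1 + 3*x2.
data = [(float(i % 7), float(i % 5), 1.0 + 2.0 * (i % 7) + 3.0 * (i % 5))
        for i in range(70)]
chunks = [data[i:i + 10] for i in range(0, len(data), 10)]  # one chunk per map task

xtx, xty = reduce_partials([map_chunk(chunk) for chunk in chunks])
beta = solve(xtx, xty)
print(beta)
```

Because the partial sums are additive, the result does not depend on how the rows are split across chunks, which is what makes this formulation suitable for a growing number of map tasks.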
Procedia PDF Downloads 497
26027 Security in Resource Constraints Network Light Weight Encryption for Z-MAC
Authors: Mona Almansoori, Ahmed Mustafa, Ahmad Elshamy
Abstract:
A wireless sensor network is formed by a combination of nodes that systematically transmit data to their base stations. The transmitted data can be easily compromised given the limited processing power of these nodes and the need for data consistency, so how to secure data transfer in real time remains an open discussion. This paper presents a mechanism to securely transmit data over a chain of sensor nodes without compromising the throughput of the network, using the battery resources available in the sensor node. Our methodology takes advantage of the Z-MAC protocol for its efficiency and provides a unique key through a key-sharing mechanism based on the neighbor node MAC address. We present a lightweight data integrity layer embedded in the Z-MAC protocol and show that our protocol performs better than Z-MAC under different attack scenarios.
Keywords: hybrid MAC protocol, data integrity, lightweight encryption, neighbor based key sharing, sensor node data processing, Z-MAC
Procedia PDF Downloads 144
26026 Comparison of Quality Indices for Sediment Assessment in Ireland
Authors: Tayyaba Bibi, Jenny Ronan, Robert Hernan, Kathleen O’Rourke, Brendan McHugh, Evin McGovern, Michelle Giltrap, Gordon Chambers, James Wilson
Abstract:
Sediment contamination is a major source of ecosystem stress and has received significant attention from the scientific community. Both the Water Framework Directive (WFD) and the Marine Strategy Framework Directive (MSFD) require a robust set of tools for biological and chemical monitoring. For the MSFD in particular, causal links between contaminants and effects need to be assessed. Appropriate assessment tools are required in order to make an accurate evaluation. In this study, a range of recommended sediment bioassays and chemical measurements are assessed in a number of potentially impacted and less impacted locations around Ireland. Previously, assessment indices have been developed on individual compartments, i.e., contaminant levels or biomarker/bioassay responses. A number of assessment indices are applied to chemical and ecotoxicological data from the Seachange project (Project code) and compared, including the metal pollution index (MPI), pollution load index (PLI), and Chapman index for chemistry, as well as the integrated biomarker response (IBR). The benefits and drawbacks of the use of indices and aggregation techniques are discussed. In addition, modelling of raw data is investigated to analyse links between contaminants and effects.
Keywords: bioassays, contamination indices, ecotoxicity, marine environment, sediments
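The abstract names the indices without giving formulas. As commonly defined in the literature, both the MPI and the PLI are geometric means: the MPI of the metal concentrations themselves, and the PLI of contamination factors (measured concentration divided by a background value). The sketch below follows those common definitions. The metals, concentrations, and background values are hypothetical and are not taken from the Seachange data.

```python
import math

def geometric_mean(values):
    """Geometric mean of a non-empty list of positive numbers."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))

def metal_pollution_index(concentrations):
    """MPI: geometric mean of the measured metal concentrations."""
    return geometric_mean(list(concentrations.values()))

def pollution_load_index(concentrations, background):
    """PLI: geometric mean of contamination factors (measured / background)."""
    factors = [concentrations[metal] / background[metal] for metal in concentrations]
    return geometric_mean(factors)

# Hypothetical sediment concentrations and background values, in mg/kg.
sample = {"Cu": 40.0, "Zn": 180.0, "Pb": 50.0}
background = {"Cu": 20.0, "Zn": 90.0, "Pb": 25.0}

mpi = metal_pollution_index(sample)
pli = pollution_load_index(sample, background)
print(f"MPI = {mpi:.2f}, PLI = {pli:.2f}")
```

Under the usual reading, a PLI above 1 suggests concentrations elevated relative to background. The choice of background values strongly affects the result, which is one of the drawbacks of aggregated indices.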
Procedia PDF Downloads 228
26025 Survival Data with Incomplete Missing Categorical Covariates
Authors: Madaki Umar Yusuf, Mohd Rizam B. Abubakar
Abstract:
Censored survival data with incomplete covariate data are a common occurrence in many studies in which the outcome is survival time. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights. The survival outcome is modeled within the class of generalized linear models, and this method requires the estimation of the parameters of the distribution of the covariates. In this paper, we consider clinical trials with five covariates, four of which have some missing values, which clearly show that they were fully censored data.
Keywords: EM algorithm, incomplete categorical covariates, ignorable missing data, missing at random (MAR), Weibull Distribution
Procedia PDF Downloads 406
26024 A Study of Blockchain Oracles
Authors: Abdeljalil Beniiche
Abstract:
A limitation of smart contracts is that they cannot access external data that might be required to control the execution of business logic. Oracles can be used to provide external data to smart contracts. An oracle is an interface that delivers data from external sources outside the blockchain for a smart contract to consume. Oracles can deliver different types of data depending on the industry and requirements. In this paper, we study and describe the widely used blockchain oracles. Then, we elaborate on their potential role, technical architecture, and design patterns. Finally, we discuss the human oracle and its key role in solving the truth problem by reaching a consensus about a certain inquiry and tasks.
Keywords: blockchain, oracles, oracles design, human oracles
Procedia PDF Downloads 136
26023 Heat Capacity of a Soluble in Water Protein: Equilibrium Molecular Dynamics Simulation
Authors: A. Rajabpour, A. Hadizadeh Kheirkhah
Abstract:
Heat transfer is of great importance for biological systems to function properly. In the present study, specific heat capacity, one of the most important heat transfer properties, is calculated for a water-soluble lysozyme protein. Using equilibrium molecular dynamics (MD) simulation, the specific heat capacities of pure water, dry lysozyme, and lysozyme-water solution are calculated at 300 K for different weight fractions. It is found that the MD results are in good agreement with the ideal binary mixing rule at small weight fractions. The results of all simulations have been validated with experimental data.
Keywords: specific heat capacity, molecular dynamics simulation, lysozyme protein, equilibrium
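The ideal binary mixing rule mentioned in this abstract is usually written as a weight-fraction average of the pure-component specific heats. The sketch below assumes that form. The water value is an approximate textbook figure near 300 K, and the dry-protein value is a placeholder chosen only for illustration, not a result from the study.

```python
def mixture_specific_heat(w_protein, c_protein, c_water):
    """Ideal binary mixing rule: weight-fraction average of specific heats."""
    if not 0.0 <= w_protein <= 1.0:
        raise ValueError("weight fraction must lie between 0 and 1")
    return w_protein * c_protein + (1.0 - w_protein) * c_water

C_WATER = 4.18    # J/(g K), approximate value for liquid water near 300 K
C_PROTEIN = 1.5   # J/(g K), hypothetical placeholder for dry lysozyme

for w in (0.0, 0.05, 0.1, 0.2):
    c_mix = mixture_specific_heat(w, C_PROTEIN, C_WATER)
    print(f"w = {w:.2f}: c_mix = {c_mix:.3f} J/(g K)")
```

The rule treats the two components as non-interacting, which is consistent with the abstract's observation that agreement holds at small weight fractions.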
Procedia PDF Downloads 308
26022 A Fast Community Detection Algorithm
Authors: Chung-Yuan Huang, Yu-Hsiang Fu, Chuen-Tsai Sun
Abstract:
Community detection represents an important data-mining tool for analyzing and understanding real-world complex network structures and functions. We believe that at least four criteria determine the appropriateness of a community detection algorithm: (a) it produces useable normalized mutual information (NMI) and modularity results for social networks, (b) it overcomes resolution limitation problems associated with synthetic networks, (c) it produces good NMI results and performance efficiency for Lancichinetti-Fortunato-Radicchi (LFR) benchmark networks, and (d) it produces good modularity and performance efficiency for large-scale real-world complex networks. To our knowledge, no existing community detection algorithm meets all four criteria. In this paper, we describe a simple hierarchical arc-merging (HAM) algorithm that uses network topologies and rule-based arc-merging strategies to identify community structures that satisfy the criteria. We used five well-studied social network datasets and eight sets of LFR benchmark networks to validate the ground-truth community correctness of HAM, eight large-scale real-world complex networks to measure its performance efficiency, and two synthetic networks to determine its susceptibility to resolution limitation problems. Our results indicate that the proposed HAM algorithm is capable of providing satisfactory performance efficiency and that HAM-identified communities were close to ground-truth communities in social and LFR benchmark networks while overcoming resolution limitation problems.
Keywords: complex network, social network, community detection, network hierarchy
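Modularity, one of the quality measures named in the criteria above, can be computed directly from an edge list and a community assignment. The sketch below uses the standard Newman-Girvan definition for an undirected simple graph without self-loops. It illustrates the metric only and is not the HAM algorithm; the example graph is hypothetical.

```python
def modularity(edges, community):
    """Newman-Girvan modularity of a partition of an undirected simple graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    degree = {}
    internal_edges = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            label = community[u]
            internal_edges[label] = internal_edges.get(label, 0) + 1
    degree_sum = {}
    for node, k in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + k
    # Q = sum over communities of (internal edge share - expected share).
    return sum(internal_edges.get(label, 0) / m - (d / (2 * m)) ** 2
               for label, d in degree_sum.items())

# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(q_split, q_single)
```

Splitting the graph at the bridge gives a positive modularity, while placing every node in one community gives zero, which matches the intuition that the metric rewards partitions with more internal edges than expected by chance.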
Procedia PDF Downloads 228
26021 Multi Data Management Systems in a Cluster Randomized Trial in Poor Resource Setting: The Pneumococcal Vaccine Schedules Trial
Authors: Abdoullah Nyassi, Golam Sarwar, Sarra Baldeh, Mamadou S. K. Jallow, Bai Lamin Dondeh, Isaac Osei, Grant A. Mackenzie
Abstract:
A randomized controlled trial is the "gold standard" for evaluating the efficacy of an intervention. Large-scale, cluster-randomized trials, however, are expensive and difficult to conduct. To guarantee the validity and generalizability of findings, high-quality, dependable, and accurate data management systems are necessary. Robust data management systems are crucial for optimizing and validating the quality, accuracy, and dependability of trial data. There is a scarcity of literature on the difficulties of data gathering in clinical trials in low-resource areas, which may raise concerns. Effective data management systems and implementation goals should be part of trial procedures. Publicizing the creative clinical data management techniques used in clinical trials should boost public confidence in the study's conclusions and encourage further replication. This report details the development and deployment of multi-data management systems and methodologies in the ongoing pneumococcal vaccine schedule study in rural Gambia. We implemented six different data management, synchronization, and reporting systems using Microsoft Access, REDCap, SQL, Visual Basic, Ruby, and ASP.NET. Additionally, data synchronization tools were developed to integrate data from these systems into the central server for reporting systems. Clinician, lab, and field data validation systems and methodologies are the main topics of this report. Our process development efforts across all domains were driven by the complexity of the research project's data, including real-time data collection, online reporting, data synchronization, and ways of cleaning and verifying data. Consequently, we effectively used multi-data management systems, demonstrating the value of creative approaches in enhancing the consistency, accuracy, and reporting of trial data in a poor resource setting.
Keywords: data management, data collection, data cleaning, cluster-randomized trial
Procedia PDF Downloads 27
26020 Ants of the Genus Trichomyrmex Mayr, 1865 (Hymenoptera: Formicidae) in the Arabian Peninsula, with Description of Two New Species
Authors: Mostafa R. Sharaf, Shehzad Salman, Hathal M. Al Dhafer, Shahid A. Akbar, Abdulrahman S. Aldawood
Abstract:
The ant genus Trichomyrmex Mayr is revised for the Arabian Peninsula based on the worker caste. Nine species are recognized and descriptions of two new species, T. almosayari sp. n. and T. shakeri sp. n. from Riyadh Province, the Kingdom of Saudi Arabia (KSA) are given. A key to species and diagnostic characters of the treated species are presented. New country records are presented, T. abyssinicus (Forel) for the KSA and T. destructor (Jerdon) and T. mayri (Forel) for the State of Qatar. New distribution records for T. destructor (Jerdon) and T. mayri (Forel) in the KSA are provided. Regional and world distributions, and distribution maps for the treated species are included. Ecological and biological data are given where known.
Keywords: ants, Trichomyrmex, Arabian Peninsula, T. almosayari, T. shakeri
Procedia PDF Downloads 347
26019 Finding Bicluster on Gene Expression Data of Lymphoma Based on Singular Value Decomposition and Hierarchical Clustering
Authors: Alhadi Bustaman, Soeganda Formalidin, Titin Siswantining
Abstract:
DNA microarray technology is used to analyze thousands of gene expression measurements simultaneously and is very important for drug development and testing, function annotation, and cancer diagnosis. Various clustering methods have been used for analyzing gene expression data. However, when analyzing very large and heterogeneous collections of gene expression data, conventional clustering methods often cannot produce a satisfactory solution. Biclustering algorithms have been used as an alternative approach to identifying structures in gene expression data. In this paper, we introduce a transform technique based on singular value decomposition to obtain a normalized matrix of gene expression data, followed by the Mixed-Clustering algorithm and the Lift algorithm, which are inspired by the node-deletion and node-addition phases proposed by Cheng and Church and are based on Agglomerative Hierarchical Clustering (AHC). An experimental study on standard datasets demonstrated the effectiveness of the algorithm on gene expression data.
Keywords: agglomerative hierarchical clustering (AHC), biclustering, gene expression data, lymphoma, singular value decomposition (SVD)
Procedia PDF Downloads 278
26018 An Efficient Traceability Mechanism in the Audited Cloud Data Storage
Authors: Ramya P, Lino Abraham Varghese, S. Bose
Abstract:
With cloud storage services, data can be stored in the cloud and shared across multiple users. Unexpected hardware or software failures and human errors can easily cause data stored in the cloud to be lost or corrupted, which affects the integrity of cloud data. Some mechanisms have been designed to allow both data owners and public verifiers to efficiently audit cloud data integrity without retrieving the entire data from the cloud server. However, public auditing of the integrity of shared data with the existing mechanisms unavoidably reveals confidential information, such as the identity of the person, to public verifiers. Here, a privacy-preserving mechanism is proposed to support public auditing of shared data stored in the cloud. It uses group signatures to compute the verification metadata needed to audit the correctness of shared data. The identity of the signer of each block in the shared data is kept confidential from public verifiers, who can still easily verify shared data integrity without retrieving the entire file. On demand, however, the signer of each block is revealed to the owner alone. The group private key is generated once by the owner in a static group, whereas in a dynamic group the group private key is changed when users are revoked from the group. When users leave the group, the blocks they have already signed are re-signed by the cloud service provider instead of the owner, which is handled efficiently by a proxy re-signature scheme.
Keywords: data integrity, dynamic group, group signature, public auditing
Procedia PDF Downloads 392
26017 Study on the Treatment of Waste Water Containing Nitrogen Heterocyclic Aromatic Hydrocarbons by Phenol-Induced Microbial Communities
Authors: Zhichao Li
Abstract:
This project treated waste-water containing nitrogen heterocyclic aromatic hydrocarbons using phenol-induced microbial communities. The treatment of nitrogen heterocyclic aromatic hydrocarbons is a difficult problem in coking waste-water treatment. Pyridine, quinoline and indole are the three most common nitrogen heterocyclic compounds in the f, and treating these refractory organics biologically has always been a research focus. Phenol-degrading bacteria can be used effectively in enhanced biological treatment and have a good treatment effect. Therefore, using phenol-induced microbial communities to treat coking waste-water can remove multiple pollutants concurrently and improve the treatment efficiency. Experiments showed that the phenol-induced microbial communities can degrade nitrogen heterocyclic aromatic hydrocarbons efficiently.
Keywords: phenol, nitrogen heterocyclic aromatic hydrocarbons, phenol-degrading bacteria, microbial communities, biological treatment technology
Procedia PDF Downloads 209
26016 Securing Health Monitoring in Internet of Things with Blockchain-Based Proxy Re-Encryption
Authors: Jerlin George, R. Chitra
Abstract:
Devices with sensors that can monitor temperature, heart rate, and other vital signs and connect to the internet, known as the Internet of Things (IoT), have transformed the way we manage health. By providing real-time health data, these sensors improve diagnostics and treatment outcomes. Security and privacy matter when IoT comes into play in healthcare, and cyberattacks on centralized database systems are a further problem. To address these challenges, the study uses blockchain technology coupled with proxy re-encryption to secure health data. The ThingSpeak IoT cloud analyzes the collected data and turns them into blockchain transactions, which are safely kept on the DriveHQ cloud. Transparency and data integrity are ensured by blockchain, and secure data sharing among authorized users is made possible by proxy re-encryption. The result is a health monitoring system that preserves the accuracy and confidentiality of data while reducing the safety risks of IoT-driven healthcare applications.
Keywords: internet of things, healthcare, sensors, electronic health records, blockchain, proxy re-encryption, data privacy, data security
Procedia PDF Downloads 17
26015 Biodiversity of Platyhelminthes Parasites on Batoids (Elasmobranchii) Fishes from the Algerian Coasts: First Annotated Inventory
Authors: Fadila Tazerouti, Affaf Boukadoum, Kamilia Gharbi, Karima Benmeslem
Abstract:
Parasites are recognized as an important component of biodiversity because of their crucial role in providing valuable information on host populations and in the functioning and balance of natural ecosystems. Although knowledge of the diversity of these pathogenic organisms has increased in recent years, many species still need to be identified and more investigations should be performed. Batoid fishes represent a significant biological resource, especially among populations of the Mediterranean basin. However, the data on their parasitic fauna, particularly in Algeria, remain incomplete. Therefore, the aim of this study is to survey and provide data on the biodiversity of Platyhelminthes parasites of elasmobranch fishes from the Algerian coasts. A total of 3217 specimens of batoids belonging to 4 families (Torpedinidae, Rajidae, Dasyatidae and Myliobatidae), caught at several sites on the Algerian coasts, were examined for parasites. 47 taxa, including 7 new for science and belonging to 2 classes, Monogenea and Cestoda, have been identified. Monogeneans presented the highest richness with 24 taxa and 5 new species for science: 4 Amphibdelloides species and one Calicotyle species. Cestodes are represented by 23 taxa and 3 new species: 2 Acanthobothrium species and 1 Echinobothrium species. This study allowed us to establish, for the first time in Algeria, an inventory of the Platyhelminthes parasites of this group of chondrichthyan fishes, and it makes an invaluable contribution to the knowledge of the parasitic fauna of Algerian and Mediterranean elasmobranch fishes.
Keywords: parasitic platyhelminthes, biodiversity, elasmobranches, Algerian coasts, inventory
Procedia PDF Downloads 81
26014 Rodriguez Diego, Del Valle Martin, Hargreaves Matias, Riveros Jose Luis
Authors: Nathainail Bashir, Neil Anderson
Abstract:
The objective of this study was to investigate the current state of practice with regard to karst detection methods and to recommend the best method and array pattern for acquiring the desired results. Proper site investigation in karst-prone regions is extremely valuable in determining the location of possible voids. Two geophysical techniques were employed: multichannel analysis of surface waves (MASW) and electrical resistivity tomography (ERT). The MASW data were acquired at each test location using different array lengths and different array orientations (to increase the probability of getting interpretable data in karst terrain). The ERT data were acquired using a dipole-dipole array consisting of 168 electrodes. The MASW data were interpreted (re: estimated depth to physical top of rock) and used to constrain and verify the interpretation of the ERT data. The ERT data indicate that poorer quality MASW data were acquired in areas where there was significant local variation in the depth to top of rock.
Keywords: dipole-dipole, ERT, karst terrains, MASW
Procedia PDF Downloads 315
26013 Data Science in Military Decision-Making: A Semi-Systematic Literature Review
Authors: H. W. Meerveld, R. H. A. Lindelauf
Abstract:
In contemporary warfare, data science is crucial for the military in achieving information superiority. Yet, to the authors’ knowledge, no extensive literature survey on data science in military decision-making has been conducted so far. In this study, 156 peer-reviewed articles were analysed through an integrative, semi-systematic literature review to gain an overview of the topic. The study examined to what extent the literature focusses on the opportunities or risks of data science in military decision-making, differentiated per level of war (i.e. strategic, operational, and tactical level). A relatively large focus on the risks of data science was observed in social science literature, implying that political and military policymakers are disproportionately influenced by a pessimistic view on the application of data science in the military domain. The perceived risks of data science are, however, hardly addressed in formal science literature. This means that the concerns about the military application of data science are not addressed to the audience that can actually develop and enhance data science models and algorithms. Cross-disciplinary research on both the opportunities and risks of military data science can address the observed research gaps. Considering the levels of war, relatively little attention to the operational level compared to the other two levels was observed, suggesting a research gap with reference to military operational data science. Opportunities for military data science mostly arise at the tactical level. In contrast, studies examining strategic issues mostly emphasise the risks of military data science. Consequently, domain-specific requirements for military strategic data science applications are hardly expressed. Lacking such applications may ultimately lead to suboptimal strategic decisions in today’s warfare.
Keywords: data science, decision-making, information superiority, literature review, military
Procedia PDF Downloads 167
26012 Legal Regulation of Personal Information Data Transmission Risk Assessment: A Case Study of the EU’s DPIA
Authors: Cai Qianyi
Abstract:
In the midst of the global digital revolution, the flow of data poses security threats that call China's existing legislative framework for protecting personal information into question. As a preliminary procedure for risk analysis and prevention, the risk assessment of personal data transmission lacks detailed supporting guidelines. Existing provisions reveal unclear responsibilities for network operators and weakened rights for data subjects. Furthermore, the regulatory system's weak operability and a lack of industry self-regulation heighten data transmission hazards. This paper compares the regulatory pathways for data transmission risks in China and Europe from the perspectives of legal framework and content. It draws on the “Data Protection Impact Assessment Guidelines” to empower multiple stakeholders, including data processors, controllers, and subjects, while also defining their obligations. In conclusion, this paper intends to address China's digital security shortcomings by developing a more mature regulatory framework and industry self-regulation mechanisms, resulting in a win-win situation for personal data protection and the development of the digital economy.
Keywords: personal information data transmission, risk assessment, DPIA, internet service provider
Procedia PDF Downloads 61
26011 Estimation of Natural Pozzolan Reserves in the Volcanic Province of the Moroccan Middle Atlas Using a Geographic Information System in Order to Valorize Them
Authors: Brahim Balizi, Ayoub Aziz, Abdelilah Bellil, Abdellali El Khadiri, Jamal Mabrouki
Abstract:
Mio-Plio-Quaternary volcanism of the Tabular Middle Atlas, which corresponds to prospective levels of exploitable raw minerals, is a feature of Morocco's Middle Atlas, especially the Azrou-Timahdite region. Because these levels matter to national policy on human development through their social and economic contribution, the area has been the focus of various research and prospecting efforts aimed at developing these reserves. The outcome of this work is a massive amount of data that needs to be managed appropriately because it comes from multiple sources and formats, including side points, contour lines, geology, hydrogeology, hydrology, geological and topographical maps, satellite photos, and more. In this regard, putting in place a Geographic Information System (GIS) is essential to offer a side plan that makes it possible to see the most recent topography of the area being exploited, to compute the volume of exploitation that occurs every day, and to make decisions with the fewest possible restrictions in order to use the reserves for the production of ecological light mortars. Mining at the three sites will follow the contour lines in five descending steps, each six meters high. Each quarry is anticipated to produce about 90,000 m3/year. For a single quarry, this translates to a daily production of about 450 m3 (200 days/year). The possible net exploitable volume in place is about 3,540,240 m3 for a single quarry and 10,620,720 m3 for the three exploitable zones.
Keywords: GIS, topography, exploitation, quarrying, lightweight mortar
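The production figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below only re-derives the stated numbers. The final quantity, the number of years a single quarry could operate, is not given in the abstract and assumes a constant extraction rate at the stated annual output. All variable names are hypothetical.

```python
# Figures stated in the abstract.
annual_output_m3 = 90_000        # anticipated production per quarry per year
working_days_per_year = 200      # working days assumed in the abstract
single_quarry_m3 = 3_540_240     # net exploitable volume, one quarry
number_of_quarries = 3

# Daily production for one quarry.
daily_output_m3 = annual_output_m3 / working_days_per_year

# Total net exploitable volume for the three zones
# (the abstract's total equals three times the single-quarry volume).
three_sites_m3 = number_of_quarries * single_quarry_m3

# Assumption, not stated in the abstract: constant extraction at the
# anticipated annual rate until the single-quarry volume is exhausted.
years_single = single_quarry_m3 / annual_output_m3

print(daily_output_m3)   # 450.0 m3 per working day
print(three_sites_m3)    # 10620720 m3
print(round(years_single, 1))
```

The daily rate and the three-zone total agree with the values given in the abstract.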
Procedia PDF Downloads 26
26010 Wavelets Contribution on Textual Data Analysis
Authors: Habiba Ben Abdessalem
Abstract:
The emergence of giant sets of textual data has encouraged researchers to invest in this field. The purpose of textual data analysis methods is to facilitate access to this type of data by providing various graphic visualizations. Applying these methods requires a corpus pretreatment step, whose standards are set according to the objective of the problem studied. This step determines the list of forms contained in the contingency table by keeping only those that carry information. This step may, however, lead to noisy contingency tables, hence the use of a wavelet denoising function. The validity of the proposed approach is tested on a text database covering economic and political events in Tunisia over a well-defined period.
Keywords: textual data, wavelet, denoising, contingency table
Procedia PDF Downloads 277