Search results for: biological data mining
24863 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest
Authors: Bharatendra Rai
Abstract:
Predictive data analysis and modeling involving machine learning techniques becomes challenging in the presence of too many explanatory variables or features. Too many features are known not only to slow algorithms down but also to reduce model prediction accuracy. This study uses a housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. The Boruta algorithm, which supports feature selection using a wrapper approach built around random forest, is used in this study. This feature selection process leads to 49 confirmed features, which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios, and their impact on model accuracy is captured using the coefficient of determination (R-square) and root mean square error (RMSE).
Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error
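A minimal sketch of the wrapper-style selection idea behind Boruta, assuming scikit-learn; the data, feature count, split ratio and tree settings are illustrative placeholders, not the study's actual values. Each real feature competes against randomly permuted "shadow" copies, and only features whose random forest importance beats the best shadow importance are kept before the final model is scored with R-square and RMSE.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error

def boruta_like_selection(X, y, n_trees=500, random_state=0):
    """Keep features whose RF importance beats the best 'shadow' (permuted) feature."""
    rng = np.random.default_rng(random_state)
    X_shadow = rng.permuted(X, axis=0)              # permute each column independently
    rf = RandomForestRegressor(n_estimators=n_trees, random_state=random_state)
    rf.fit(np.hstack([X, X_shadow]), y)
    n = X.shape[1]
    real_imp, shadow_imp = rf.feature_importances_[:n], rf.feature_importances_[n:]
    return np.where(real_imp > shadow_imp.max())[0]  # indices of confirmed features

# illustrative usage with a synthetic housing-like matrix (79 features)
X = np.random.rand(1000, 79)
y = X[:, :10].sum(axis=1) + 0.1 * np.random.randn(1000)
keep = boruta_like_selection(X, y)

X_tr, X_te, y_tr, y_te = train_test_split(X[:, keep], y, test_size=0.3, random_state=0)
model = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("R2:", r2_score(y_te, pred), "RMSE:", mean_squared_error(y_te, pred) ** 0.5)
```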
Procedia PDF Downloads 323
24862 Potential Impacts of Maternal Nutrition and Selection for Residual Feed Intake on Metabolism and Fertility Parameters in Angus Bulls
Authors: Aidin Foroutan, David S. Wishart, Leluo L. Guan, Carolyn Fitzsimmons
Abstract:
Maximizing the efficiency and growth potential of beef cattle requires not only genetic selection (i.e., residual feed intake (RFI)) but also adequate nutrition throughout all stages of growth and development. Nutrient restriction during gestation has been shown to negatively affect post-natal growth and development as well as fertility of the offspring. This, when combined with RFI, may affect progeny traits. This study aims to investigate the impact of selection for divergent genetic potential for RFI, and of maternal nutrition during early- to mid-gestation, on bull calf traits such as fertility and muscle development, using multiple 'omics' approaches. Comparisons were made between High-diet vs. Low-diet and between High-RFI vs. Low-RFI animals. An epigenetics experiment on semen samples identified 891 biomarkers associated with growth and development. A gene expression study on Longissimus thoracis muscle, semimembranosus muscle, liver, and testis identified 4 genes associated with muscle development and immunity, of which Myocyte enhancer factor 2A [MEF2A; induces myogenesis and controls muscle differentiation] was the only differentially expressed gene identified in all four tissues. An initial metabolomics experiment on serum samples using nuclear magnetic resonance (NMR) identified 4 metabolite biomarkers related to energy and protein metabolism. Once all the biomarkers are identified, bioinformatics approaches will be used to create a database covering all the 'omics' data collected from this project. This database will be broadened by adding other information obtained from relevant literature reviews. Association analyses with these data sets will be performed to reveal key biological pathways affected by RFI and maternal nutrition. Through these association studies between the genome and metabolome, it is expected that candidate biomarker genes and metabolites for feed efficiency, fertility, and/or muscle development will be identified. If these gene/metabolite biomarkers are validated in a larger animal population, they could potentially be used in breeding programs to select superior animals. It is also expected that this work will lead to the development of an online tool that could be used to predict future traits of interest in an animal given its measurable 'omics' traits.
Keywords: biomarker, maternal nutrition, omics, residual feed intake
Procedia PDF Downloads 191
24861 Image-Based (RGB) Technique for Estimating Phosphorus Levels of Different Crops
Authors: M. M. Ali, Ahmed Al-Ani, Derek Eamus, Daniel K. Y. Tan
Abstract:
In this glasshouse study, we developed a new image-based, non-destructive technique for detecting the leaf P status of different crops such as cotton, tomato and lettuce. Plants were allowed to grow on nutrient media containing different P concentrations, i.e., 0%, 50% and 100% of the recommended P concentration (P0 = no P; P1 = 2.5 mL 10 L-1 of P; P2 = 5 mL 10 L-1 of P as NaH2PO4). After 10 weeks of growth, plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method, and at the same time leaf images were collected by a handheld crop image sensor. We calculated the leaf area, leaf perimeter and RGB (red, green and blue) values of these images. These data were further used in linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified these plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using the image and morphological data. Our proposed non-destructive imaging method is precise in estimating the P requirements of different crop species.
Keywords: image-based techniques, leaf area, leaf P contents, linear discriminant analysis
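A minimal sketch of the classification step, assuming scikit-learn; the feature columns (leaf area, perimeter, mean R, G, B) follow the abstract, while the values and class sizes are synthetic placeholders rather than the glasshouse measurements.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# columns: leaf area, leaf perimeter, mean R, mean G, mean B (one row per leaf image)
X = np.random.rand(90, 5)
y = np.repeat(["P0", "P1", "P2"], 30)      # P treatment classes from the destructive assay

lda = LinearDiscriminantAnalysis()
scores = cross_val_score(lda, X, y, cv=5)  # how well RGB/morphology separates the P classes
print("Mean CV classification accuracy:", scores.mean())

lda.fit(X, y)
print("Predicted P class for a new leaf:", lda.predict(X[:1]))
```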
Procedia PDF Downloads 382
24860 Management of Diabetics on Hemodialysis
Authors: Souheila Zemmouchi
Abstract:
Introduction: Diabetes is currently the leading cause of end-stage chronic kidney disease and dialysis, so it adds additional complexity to the management of chronic hemodialysis patients. These patients are extremely fragile because of their multiple cardiovascular and metabolic comorbidities. Clear and complete description of the experience: the management of a diabetic on hemodialysis is particularly difficult due to frequent hypoglycaemia and significant inter- and perdialytic glycemic variability that is difficult to predict. The aim of our study is to describe the clinical-biological profile and to assess the cardiovascular risk of diabetics undergoing chronic hemodialysis, and to compare them with non-diabetic hemodialysis patients. Methods: This cross-sectional, descriptive and analytical study was carried out between January 01 and December 31, 2018, involving 309 hemodialysis patients spread over 4 centers. The data were collected prospectively, then compiled and analyzed with the SPSS Version 10 software. The Framingham risk score was used to assess cardiovascular risk in all hemodialysis patients. Results: The survey involved 309 hemodialysis patients, including 83 diabetics, for a prevalence of 27%. The average age was 53 ± 10.2 years. The sex ratio was 1.5. 50% of diabetic hemodialysis patients retained residual diuresis against 32% of non-diabetics. In the group of diabetics, we noted more hypertension (70% versus 38% in non-diabetics, P = 0.004) and more intradialytic hypoglycemia (15% versus 3% in non-diabetics, P = 0.007); initially, vascular exhaustion was found in 4 diabetics versus 2 non-diabetics. 70% of diabetics with anuria had postdialytic hyperglycemia. The study found a statistically significant difference between the different levels of cardiovascular risk according to diabetic status. Conclusion: There are many challenges in the management of diabetics on hemodialysis, both to optimize glycemic control according to an individualized target and to coordinate comprehensive and effective care.
Keywords: hemodialysis, diabetes, chronic renal failure, glycemic control
Procedia PDF Downloads 160
24859 Bulk Modification of Poly(Dimethylsiloxane) for Biomedical Applications
Authors: A. Aslihan Gokaltun, Martin L. Yarmush, Ayse Asatekin, O. Berk Usta
Abstract:
In the last decade, microfabrication processes including rapid prototyping techniques have advanced rapidly and reached a fairly mature stage. These advances encouraged and enabled the use of microfluidic devices by a wider range of users, with applications in biological separations and cell and organoid cultures. Accordingly, a significant current challenge in the field is controlling biomolecular interactions at interfaces and developing novel biomaterials to satisfy the unique needs of biomedical applications. Poly(dimethylsiloxane) (PDMS) is by far the most preferred material in the fabrication of microfluidic devices. This can be attributed to its favorable properties, including: (1) simple fabrication by replica molding, (2) good mechanical properties, (3) excellent optical transparency from 240 to 1100 nm, (4) biocompatibility and non-toxicity, and (5) high gas permeability. However, the high hydrophobicity (water contact angle ~108°±7°) of PDMS often limits its applications where solutions containing biological samples are concerned. In our study, we created a simple, easy method for modifying the surface chemistry of PDMS microfluidic devices through the addition of surface-segregating additives during manufacture. In this method, a surface-segregating copolymer is added to the precursors for silicone and the desired device is manufactured following the usual methods. When the device surface is in contact with an aqueous solution, the copolymer self-organizes to expose its hydrophilic segments at the surface, making the surface of the silicone device more hydrophilic. This can lead to several improved performance criteria, including lower fouling, lower non-specific adsorption, and better wettability. Specifically, this approach is expected to be useful for the manufacture of microfluidic devices. It is also likely to be useful for manufacturing silicone tubing and other materials, biomaterial applications, and surface coatings.
Keywords: microfluidics, non-specific protein adsorption, PDMS, PEG, copolymer
Procedia PDF Downloads 267
24858 Design of Visual Repository, Constraint and Process Modeling Tool Based on Eclipse Plug-Ins
Authors: Rushiraj Heshi, Smriti Bhandari
Abstract:
Master Data Management requires the creation of a central repository, applying constraints on the repository, and designing processes to manage data. Designing the repository, the constraints on it and the business processes is a very tedious and time-consuming task for a large enterprise. Hence, visual repository, constraint and process (workflow) modeling is the most critical step in Master Data Management. In this paper, we realize a visual modeling tool for implementing repositories, constraints and processes based on Eclipse plug-ins using GMF/EMF, which follows the principles of Model Driven Engineering (MDE).
Keywords: EMF, GMF, GEF, repository, constraint, process
Procedia PDF Downloads 497
24857 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups
Authors: Lily Ingsrisawang, Tasanee Nacharoen
Abstract:
Introduction: The problem of unbalanced data sets generally appears in real-world applications. Due to unequal class distribution, many research papers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors' nonparametric discriminant analysis is one method that has been proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest to us in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification application to class-imbalanced data of diabetes risk groups. Methods: Data from a health project for 599 staff in a government hospital in Bangkok were obtained for the classification problem. The staff were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, with the variables diabetes risk group, age, gender, cholesterol, and BMI, were analyzed and bootstrapped up to 50 and 100 samples, 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for departure from multivariate normality and for equality of the covariance matrices of the three risk groups. Both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors' discriminant function were applied over the 50 and 100 bootstrap samples and to the original data. In finding the optimal classification rule, the prior probabilities were set both as equal proportions (0.33:0.33:0.33) and as unequal proportions with the three choices (0.90:0.05:0.05), (0.80:0.10:0.10) or (0.70:0.15:0.15). Results: The results from the 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k = 3 or k = 4 and prior probabilities of {non-risk:risk:diabetic} as {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest misclassification error rate. Conclusion: The k-nearest neighbors approach is suggested for classifying three-class-imbalanced data of diabetes risk groups.
Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors
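A minimal sketch of the model comparison, assuming scikit-learn; the 599 records, class mix and bootstrap loop mirror the abstract, but the predictor values are synthetic placeholders and the prior vector shown is just one of the study's choices (note that k-NN itself takes no prior argument).

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.utils import resample

# synthetic stand-in for the 599-staff data: 4 predictors, 3 imbalanced classes
X = np.random.randn(599, 4)
y = np.random.choice(["non-risk", "risk", "diabetic"], 599, p=[0.90, 0.05, 0.05])

# priors follow sklearn's sorted class order: diabetic, non-risk, risk
priors = [0.05, 0.90, 0.05]
models = {
    "LDA": LinearDiscriminantAnalysis(priors=priors),
    "QDA": QuadraticDiscriminantAnalysis(priors=priors),
    "3-NN": KNeighborsClassifier(n_neighbors=3),   # nonparametric; no prior argument
}

for name, model in models.items():
    errors = []
    for b in range(50):                            # 50 bootstrap samples of size 599
        Xb, yb = resample(X, y, n_samples=599, random_state=b)
        errors.append(1 - model.fit(Xb, yb).score(X, y))
    print(name, "mean misclassification error:", np.mean(errors))
```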
Procedia PDF Downloads 435
24856 BFDD-S: Big Data Framework to Detect and Mitigate DDoS Attack in SDN Network
Authors: Amirreza Fazely Hamedani, Muzzamil Aziz, Philipp Wieder, Ramin Yahyapour
Abstract:
In recent years, software-defined networking has come to the attention of many network designers as a successor to traditional networking. Unlike traditional networks, where the control and data planes reside together within a single device in the network infrastructure such as switches and routers, the two planes are kept separate in software-defined networks (SDNs). All critical decisions about packet routing are made on the network controller, and the data-plane devices forward packets based on these decisions. This type of network is vulnerable to DDoS attacks, which degrade the overall functioning and performance of the network by continuously injecting fake flows into it. This places a substantial burden on the controller side and ultimately leads to the inaccessibility of the controller and the lack of network service to legitimate users. Thus, the protection of this novel network architecture against denial-of-service attacks is essential. In the world of cybersecurity, attacks and new threats emerge every day. It is essential to have tools capable of managing and analyzing all this new information to detect possible attacks in real time. These tools should provide a comprehensive solution to automatically detect, predict and prevent abnormalities in the network. Big data encompasses a wide range of studies, but it mainly refers to the massive amounts of structured and unstructured data that organizations deal with on a regular basis. It concerns not only the volume of the data but also how data-driven information can be used to enhance decision-making processes, security, and the overall efficiency of a business. This paper presents an intelligent big data framework as a solution to handle the illegitimate traffic burden placed on the SDN network by numerous DDoS attacks. The framework entails an efficient defence and monitoring mechanism against DDoS attacks by employing state-of-the-art machine learning techniques.
Keywords: apache spark, apache kafka, big data, DDoS attack, machine learning, SDN network
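A minimal sketch of the detection component only, assuming scikit-learn and leaving out the Spark/Kafka plumbing the framework describes; the per-flow feature names and labels are illustrative assumptions, not the paper's actual feature set.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# per-flow features a controller or collector might export:
# packet count, byte count, flow duration, packets/s, distinct destination ports
X = np.random.rand(5000, 5)
y = np.random.randint(0, 2, 5000)              # 1 = DDoS flow, 0 = legitimate flow

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
# in the full framework this model would score flow records streamed via Kafka/Spark
```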
Procedia PDF Downloads 169
24855 Welding Process Selection for Storage Tank by Integrated Data Envelopment Analysis and Fuzzy Credibility Constrained Programming Approach
Authors: Rahmad Wisnu Wardana, Eakachai Warinsiriruk, Sutep Joy-A-Ka
Abstract:
Selecting the most suitable welding process usually depends on experience or on common practice in similar companies. However, this approach generally ignores many criteria that can affect the selection of a suitable welding process. Therefore, knowledge automation through knowledge-based systems will significantly improve the decision-making process. This research proposes an integrated data envelopment analysis (DEA) and fuzzy credibility constrained programming approach for identifying the best welding process for a stainless steel storage tank in the food and beverage industry. The proposed approach uses the fuzzy concept and a credibility measure to deal with uncertain data from experts' judgment. Furthermore, 12 parameters are used to determine the most appropriate welding process among six competitive welding processes.
Keywords: welding process selection, data envelopment analysis, fuzzy credibility constrained programming, storage tank
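A minimal sketch of the DEA component only (the fuzzy credibility constraints are not shown), assuming SciPy and the input-oriented CCR multiplier formulation; the six "welding processes" and their input/output values below are invented placeholders, not the 12 parameters used in the study.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """Input-oriented CCR efficiency of DMU o (multiplier form).
    X: inputs (n_dmu x n_in), Y: outputs (n_dmu x n_out)."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.concatenate([-Y[o], np.zeros(m)])              # maximise u . y_o
    A_eq = np.concatenate([np.zeros(s), X[o]])[None, :]   # v . x_o = 1
    A_ub = np.hstack([Y, -X])                             # u . y_j - v . x_j <= 0 for all j
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (s + m), method="highs")
    return -res.fun

# illustrative data for six candidate welding processes (rows):
# inputs = [cost, welding time], outputs = [quality score]
X = np.array([[5, 3], [4, 4], [6, 2], [3, 5], [5, 4], [4, 3]], dtype=float)
Y = np.array([[8], [7], [9], [6], [8], [7]], dtype=float)
for o in range(len(X)):
    print(f"process {o + 1}: efficiency = {ccr_efficiency(X, Y, o):.3f}")
```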
Procedia PDF Downloads 167
24854 A Combinatorial Approach of Treatment for Landfill Leachate
Authors: Anusha Atmakuri, R. D. Tyagi, Patrick Drogui
Abstract:
Landfilling is the most familiar and easy way to dispose of solid waste. A landfill generally receives waste from the municipalities near it; the waste collected comes from commercial, industrial, and residential areas, among others. Landfill leachate (LFL) is formed when rainwater passes through the waste placed in landfills and consists of several dissolved organic materials, for instance, aquatic humic substances (AHS), volatile fatty acids (VFAs), heavy metals, inorganic macro-components, and xenobiotic organic matter, which are highly toxic to the environment. These components give LFL a high pollutant load, which necessitates its treatment prior to discharge into the environment. Various methods have been used to treat LFL over the years, such as physical, chemical, biological, physicochemical, electrical, and advanced oxidation methods. This study focuses on the combination of biological and electrochemical methods: extracellular polymeric substances (EPS) and electrocoagulation (EC). Coupling the electrocoagulation process with extracellular polymeric substances (as flocculant) as a pre- and/or post-treatment strategy provides an efficient and economical process for the decontamination of landfill leachate contaminated with suspended matter, metals (e.g., Fe, Mn) and ammoniacal nitrogen. An electrocoagulation and EPS-mediated coagulation approach could be economically viable for the treatment of landfill leachate, and it possesses several other advantages over other methods. This study utilised waste substrates such as activated sludge, crude glycerol and waste cooking oil for the production of EPS using fermentation technology. A comparison of different scenarios for the treatment of landfill leachate is presented: using EPS alone as bioflocculant, EPS and EC with EPS as the first stage, and EPS and EC with EC as the first stage. The work establishes the use of crude EPS as a bioflocculant for the treatment of landfill leachate and of wastewater from a site near a landfill, with EC also being successful in removing major pollutants such as COD, turbidity, and total suspended solids. A combination of these two methods is to be explored further for the complete removal of all pollutants from landfill leachate.
Keywords: landfill leachate, extracellular polymeric substances, electrocoagulation, bioflocculant
Procedia PDF Downloads 86
24853 On the Estimation of Crime Rate in the Southwest of Nigeria: Principal Component Analysis Approach
Authors: Kayode Balogun, Femi Ayoola
Abstract:
Crime is at an alarming rate in this part of the world, and there are many factors contributing to this anti-societal behaviour among both the young and the old. In this work, principal component analysis (PCA) was used as a tool to reduce the dimensionality of the data, while retaining as much of the information as possible, and to identify the variables that are most crime-prone in the study region. Data were collected on twenty-eight crime variables from the National Bureau of Statistics (NBS) databank for a period of fifteen years. We use PCA in this study to determine the number of major variables and contributors to crime in Southwest Nigeria. The results of our analysis revealed that eight principal components were retained, using the scree plot and the loading plot, which implies that an eight-equation solution is appropriate for the data. The eight components explained 93.81% of the total variation in the data set. We also found that the most common and frequently committed crimes in Southwestern Nigeria were: assault, grievous harm and wounding, theft/stealing, burglary, house breaking, false pretence, unlawful arms possession and breach of public peace.
Keywords: crime rates, data, Southwest Nigeria, principal component analysis, variables
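A minimal sketch of the dimensionality-reduction step, assuming scikit-learn; the crime matrix below is a random stand-in for the twenty-eight NBS variables, and the 93.81% figure is simply used as a cut-off to pick how many components to retain.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# rows = observations (e.g., state-years), columns = 28 crime variables from the NBS databank
X = np.random.rand(90, 28)                           # synthetic stand-in

Z = StandardScaler().fit_transform(X)                # standardise before PCA
pca = PCA().fit(Z)

cum_var = np.cumsum(pca.explained_variance_ratio_)   # scree/variance-explained information
n_keep = np.argmax(cum_var >= 0.9381) + 1            # retain enough components for ~93.81%
print("components retained:", n_keep)

# loading-plot data: correlation of each crime variable with the retained components
loadings = pca.components_[:n_keep].T * np.sqrt(pca.explained_variance_[:n_keep])
print("variables loading most heavily on PC1:", np.argsort(-np.abs(loadings[:, 0]))[:5])
```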
Procedia PDF Downloads 444
24852 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring
Authors: Hyun-Woo Cho
Abstract:
Detection of incipient abnormal events in production processes is important to improve the safety and reliability of manufacturing operations and to reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods is evaluated using real data. The results show that the calibration model based on a supervised probabilistic model yielded the best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from the model-building steps.
Keywords: calibration model, monitoring, quality improvement, feature selection
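A minimal sketch of the variable-selection-plus-calibration idea, assuming scikit-learn; the abstract does not name its supervised probabilistic model, so Bayesian ridge regression is used here purely as a stand-in, and the process data are synthetic.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import BayesianRidge
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# X: historical process measurements, y: fault-related quality indicator (synthetic here)
X = np.random.rand(400, 30)
y = X[:, :5].sum(axis=1) + 0.05 * np.random.randn(400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
calib = make_pipeline(SelectKBest(f_regression, k=10),  # drop non-informative variables
                      BayesianRidge())                   # stand-in probabilistic model
calib.fit(X_tr, y_tr)
rmse = mean_squared_error(y_te, calib.predict(X_te)) ** 0.5
print("calibration RMSE on held-out reference data:", rmse)
```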
Procedia PDF Downloads 356
24851 Radioprotective Effects of Selenium and Vitamin-E against 6 MV X-Rays in Human Volunteers' Blood Lymphocytes by Micronuclei Assay
Authors: Vahid Changizi, Aram Rostami, Akbar Mosavi
Abstract:
Purpose of study: Critical macromolecules of cells, such as DNA, are exposed to damage from free radicals induced by the interaction of ionizing radiation with biological systems. Selenium and vitamin E are natural compounds that have been shown to be direct free radical scavengers. The aim of this study was to investigate the in vivo/in vitro radioprotective effect of selenium and vitamin E, separately and synergistically, against genotoxicity induced by 6 MV x-ray irradiation in cultured blood lymphocytes from 15 human volunteers. Methods: Fifteen volunteers were divided into three groups, A, B and C. These groups were given selenium (800 IU), vitamin E (100 mg) and selenium (400 IU) + vitamin E (50 mg), respectively. Peripheral blood samples were collected from each group before (0 hr) and 1, 2 and 3 hr after selenium and vitamin E administration (separately and synergistically). The blood samples were then irradiated with 200 cGy of 6 MV x-rays. After that, lymphocyte samples were cultured with mitogenic stimulation to determine the chromosomal aberrations with the micronucleus assay in cytokinesis-blocked binucleated cells. Results: The lymphocytes in the blood samples collected at 1 hr after ingestion of selenium and vitamin E and exposed in vitro to x-rays exhibited a significant decrease in the incidence of micronuclei compared with the control group at 0 hr. The maximum protection and decrease in the frequency of micronuclei (50%) was observed at 1 hr after administration of selenium and vitamin E synergistically. Conclusion: The data suggest that ingestion of selenium and vitamin E as radioprotective substances before exposure may reduce the genetic damage caused by x-ray irradiation.
Keywords: x-rays, selenium, vitamin-e, lymphocyte, micronuclei
Procedia PDF Downloads 267
24850 Multilevel Gray Scale Image Encryption through 2D Cellular Automata
Authors: Rupali Bhardwaj
Abstract:
Cryptography is the science of using mathematics to encrypt and decrypt data; the data are converted into some other, unintelligible form, and then the encrypted data are transmitted. The primary purpose of this paper is to provide two levels of security through a two-step process: rather than transmitting the message bits directly, the image is first encrypted using 2D cellular automata and then scrambled with the Arnold cat map transformation; this provides an additional layer of protection and reduces the chance of the transmitted message being detected. A comparative analysis of the effectiveness of the scrambling technique is provided using scrambling-degree measurement parameters, i.e., the gray difference degree (GDD) and the correlation coefficient.
Keywords: scrambling, cellular automata, Arnold cat map, game of life, gray difference degree, correlation coefficient
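A minimal sketch of the scrambling stage only, assuming NumPy; the Arnold cat map is applied to a square gray-scale image and the correlation coefficient between the original and scrambled pixels is computed as one of the scrambling-degree measures (the GDD computation and the cellular-automata encryption step are not shown, and the image is a random stand-in).

```python
import numpy as np

def arnold_cat_map(img, iterations=1):
    """Scramble a square image with the Arnold cat map: (x, y) -> (x + y, x + 2y) mod N."""
    n = img.shape[0]
    assert img.shape[0] == img.shape[1], "cat map needs a square image"
    out = img.copy()
    for _ in range(iterations):
        nxt = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                nxt[(x + y) % n, (x + 2 * y) % n] = out[x, y]
        out = nxt
    return out

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # stand-in gray-scale image
scrambled = arnold_cat_map(img, iterations=5)

# correlation coefficient between original and scrambled pixels (closer to 0 = better scrambling)
r = np.corrcoef(img.ravel().astype(float), scrambled.ravel().astype(float))[0, 1]
print("correlation coefficient:", r)
```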
Procedia PDF Downloads 378
24849 Survey Based Data Security Evaluation in Pakistan Financial Institutions against Malicious Attacks
Authors: Naveed Ghani, Samreen Javed
Abstract:
In today's heterogeneous network environment, there is a growing demand for mutually distrustful clients to jointly operate a secure network and protect against malicious attacks, since the defining task of propagating malicious code is to locate new targets to attack. Residual risk is always present, no matter what solutions are implemented or whatever security methodology or standards are adopted. Security is a first and crucial concern in the field of computer science, and a main aim of computer security is the safe gathering and transfer of information over a secure network. No one need wonder what all that malware is trying to do: it is trying to steal money through data theft, bank transfers, stolen passwords, or swiped identities. With the help of our survey, we learn about the importance of whitelisting, antimalware programs, security patches, log files, honeypots, and more, as used in banks for financial data protection; but there is also a need to implement IPv6 tunneling with cryptographic data transformation, in line with the requirements of new technology, to protect the organization from new malware attacks that craft their own messages and send them to the target. In this paper, the writer proposes implementing IPv6 tunneling sessions for private data transmission from financial organizations whose secrecy needs to be safeguarded.
Keywords: network worms, malware infection propagating malicious code, virus, security, VPN
Procedia PDF Downloads 358
24848 Interactive IoT-Blockchain System for Big Data Processing
Authors: Abdallah Al-ZoubI, Mamoun Dmour
Abstract:
The spectrum of IoT devices is becoming widely diversified, entering almost all possible fields and finding applications in industry, health, finance, logistics, and education, to name a few. The number of active IoT endpoint sensors and devices exceeded the 12 billion mark in 2021 and is expected to reach 27 billion in 2025, with over $34 billion in total market value. This sheer rise in the number and use of IoT devices brings with it considerable concerns regarding data storage, analysis, manipulation and protection. IoT blockchain-based systems have recently been proposed as a decentralized solution for large-scale data storage and protection. COVID-19 has actually accelerated the desire to utilize IoT devices, as it impacted both demand and supply and significantly affected several regions due to logistic problems such as supply chain interruptions, shortages of shipping containers and port congestion. An IoT-blockchain system is proposed to handle big data generated by a distributed network of sensors and controllers in an interactive manner. The system is designed using the Ethereum platform, which utilizes smart contracts, programmed in Solidity, to execute and manage data generated by IoT sensors and devices such as the Raspberry Pi 4 running Raspbian, with add-on hardware security modules. The proposed system runs a number of applications hosted by a local machine used to validate transactions. It then sends data to the rest of the network through the InterPlanetary File System (IPFS) and Ethereum Swarm, forming a closed IoT ecosystem run by blockchain, where a number of distributed IoT devices can communicate and interact, thus forming a closed, controlled environment. A prototype has been deployed with three IoT handling units distributed over a wide geographical space in order to examine its feasibility, performance and costs. Initial results indicated that big IoT data retrieval and storage is feasible and interactivity is possible, provided that certain conditions of cost, speed and throughput are met.
Keywords: IoT devices, blockchain, Ethereum, big data
Procedia PDF Downloads 150
24847 Keynote Talk: The Role of Internet of Things in the Smart Cities Power System
Authors: Abdul-Rahman Al-Ali
Abstract:
As the number of mobile devices grows exponentially, it is estimated that about 50 billion devices will be connected to the Internet by the year 2020, which corresponds to an average of eight connected devices per person worldwide by the end of this decade. These 50 billion devices are not only mobile phones and data-browsing gadgets, but also machine-to-machine and man-to-machine devices. With such growing numbers of devices, the Internet of Things (IoT) concept has recently become one of the emerging technologies. Within smart grid technologies, smart home appliances, Intelligent Electronic Devices (IED) and Distributed Energy Resources (DER) are major IoT objects that can be addressed using IPv6. These objects are called the smart grid Internet of Things (SG-IoT). The SG-IoT generates big data that requires high-speed computing infrastructure, widespread computer networks, big data storage, software, and platform services. A utility company's control and data centers cannot handle such a large number of devices, such high-speed processing, and such massive data storage. Building a large data center's infrastructure takes a long time; it also requires widespread communication networks and huge capital investment. Maintaining and upgrading the control and data centers' infrastructure and communication networks, as well as updating and renewing software licenses, collectively requires additional cost. This can be overcome by utilizing emerging computing paradigms such as cloud computing, which can serve as a smart grid enabler to replace the legacy utility data centers. The talk will highlight the role of IoT and cloud computing services and their development models within smart grid technologies.
Keywords: intelligent electronic devices (IED), distributed energy resources (DER), internet, smart home appliances
Procedia PDF Downloads 324
24846 Statistical Analysis of Interferon-γ for the Effectiveness of an Anti-Tuberculous Treatment
Authors: Shishen Xie, Yingda L. Xie
Abstract:
Tuberculosis (TB) is a potentially serious infectious disease that remains a health concern. The Interferon Gamma Release Assay (IGRA) is a blood test to find out whether an individual is tuberculosis positive or negative. This study applies statistical analysis to the clinical data on interferon-gamma levels of seventy-three subjects diagnosed with pulmonary TB and undergoing anti-tuberculous treatment. Data analysis is performed to determine whether there is a significant decline in interferon-gamma levels for the subjects over a period of six months, and to infer whether the anti-tuberculous treatment is effective.
Keywords: data analysis, interferon gamma release assay, statistical methods, tuberculosis infection
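A minimal sketch of one way such a before/after comparison could be run (the abstract does not name the specific test), assuming SciPy and a paired design; the baseline and six-month interferon-gamma values below are simulated, not the clinical measurements.

```python
import numpy as np
from scipy import stats

# interferon-gamma levels for the same 73 subjects at baseline and after six months
# of anti-tuberculous treatment -- synthetic values for illustration only
baseline = np.random.lognormal(mean=1.0, sigma=0.5, size=73)
month6 = baseline * np.random.uniform(0.4, 0.9, size=73)     # simulated decline

t_stat, p_value = stats.ttest_rel(baseline, month6)           # paired comparison
print(f"paired t = {t_stat:.2f}, p = {p_value:.4g}")
# p < 0.05 would indicate a significant decline in interferon-gamma under treatment
```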
Procedia PDF Downloads 306
24845 Fast Fourier Transform-Based Steganalysis of Covert Communications over Streaming Media
Authors: Jinghui Peng, Shanyu Tang, Jia Li
Abstract:
Steganalysis seeks to detect the presence of secret data embedded in cover objects, and there is an imminent demand to detect hidden messages in streaming media. This paper shows how a steganalysis algorithm based on the Fast Fourier Transform (FFT) can be used to detect the existence of secret data embedded in streaming media. The proposed algorithm uses machine parameter characteristics and a network sniffer to determine whether the Internet traffic contains streaming channels. The detected streaming data are then transferred from the time domain to the frequency domain through the FFT. The distributions of power spectra in the frequency domain between original VoIP streams and stego VoIP streams are compared in turn using a t-test, achieving a p-value of 7.5686E-176, which is below the threshold. The results indicate that the proposed FFT-based steganalysis algorithm is effective in detecting secret data embedded in VoIP streaming media.
Keywords: steganalysis, security, Fast Fourier Transform, streaming media
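A minimal sketch of the spectral comparison, assuming NumPy/SciPy; the "cover" and "stego" frames are synthetic stand-ins for captured VoIP payloads, and the frame length and perturbation are illustrative assumptions rather than the paper's capture settings.

```python
import numpy as np
from scipy import stats

def power_spectrum(frames):
    """Average power spectrum of a batch of fixed-length payload frames."""
    spectra = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return spectra.mean(axis=0)

# synthetic stand-ins for captured VoIP payloads (rows = frames, columns = samples)
cover = np.random.randn(200, 160)
stego = cover + 0.01 * np.random.choice([-1, 1], cover.shape)   # toy embedding perturbation

# compare the two power-spectrum distributions with a t-test, as in the paper
t_stat, p_value = stats.ttest_ind(power_spectrum(cover), power_spectrum(stego))
print("p-value:", p_value)     # the paper reports ~7.6E-176, far below any usual threshold
```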
Procedia PDF Downloads 147
24844 Privacy-Preserving Model for Social Network Sites to Prevent Unwanted Information Diffusion
Authors: Sanaz Kavianpour, Zuraini Ismail, Bharanidharan Shanmugam
Abstract:
Social Network Sites (SNSs) can serve as an invaluable platform to transfer information across a large number of individuals. A substantial component of communicating and managing information is identifying which individuals will influence others in propagating information, and also whether dissemination of information will occur in the absence of social signals about that information. Classifying the final audience of social data is difficult, as controlling the social contexts that are transferred among individuals is not completely possible. Hence, undesirable information diffusion to unauthorized individuals on SNSs can threaten individuals' privacy. This paper highlights information diffusion in SNSs and, moreover, emphasizes the most significant privacy issues for individuals using SNSs. The goal of this paper is to propose a privacy-preserving model that treats individuals' data with due care in order to control the availability of data and improve privacy by providing access to the data for appropriate third parties without compromising the advantages of information sharing through SNSs.
Keywords: anonymization algorithm, classification algorithm, information diffusion, privacy, social network sites
Procedia PDF Downloads 321
24843 Soil and the Gut Microbiome: Supporting the 'Hygiene Hypothesis'
Authors: Chris George, Adam Hamlin, Lily Pereg, Richard Charlesworth, Gal Winter
Abstract:
Background: According to the 'hygiene hypothesis', the current rise in allergies and autoimmune diseases stems mainly from reduced microbial exposure due, amongst other factors, to urbanisation and distance from soil. However, this hypothesis is based on epidemiological rather than biological data. Useful insights into the underlying mechanisms of this hypothesis can be gained by studying our interaction with soil. Soil microbiota may be directly ingested or inhaled by humans, enter the body through skin-soil contact, or use plants as vectors. This study aims to examine the ability of soil microbiota to colonise the gut, to study the interaction of soil microbes with the immune system, and to assess their potential protective activity. Method: The nutrition of the rats was supplemented daily with fresh or autoclaved soil for 21 days, followed by 14 days of no supplementation. Faecal samples were collected throughout and analysed using 16S sequencing. At the end of the experiment, rats were sacrificed and tissues and digesta were collected. Results/Conclusion: Results showed significantly higher richness and diversity following soil supplementation, even after recovery. Specific soil microbial groups were identified as able to colonise the gut. Of particular interest was the mucosal layer, which emerged as a receptive host for soil microorganisms. Histological examination revealed innate and adaptive immune activation. The findings of this study reinforce the 'hygiene hypothesis' by demonstrating the ability of soil microbes to colonise the gut and activate the immune system. This paves the way for further studies aimed at examining the interaction of soil microorganisms with the immune system.
Keywords: gut microbiota, hygiene hypothesis, microbiome, soil
Procedia PDF Downloads 256
24842 Application Difference between Cox and Logistic Regression Models
Authors: Idrissa Kayijuka
Abstract:
The logistic regression model and the Cox regression model (proportional hazards model) are at present being employed in the analysis of prospective epidemiologic research investigating risk factors for chronic diseases, and a theoretical relationship between the two models has been studied. By definition, the Cox regression model, also called the Cox proportional hazards model, is a procedure used for modelling data on the time leading up to an event where censored cases exist, whereas the logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the outcome variable is binary (dichotomous). Arguments and findings of many researchers have focused on the overview of the Cox and logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is the SPSS exercise data on BREAST CANCER, with a sample size of 1121 women, where the main objective is to show the application difference between the Cox regression model and the logistic regression model based on factors that cause women to die of breast cancer. Some analysis was done manually, i.e., on lymph node status, and SPSS software was used to analyse the data. This study found that there is an application difference between the Cox and logistic regression models: the Cox regression model is used if one wishes to analyse data that also include the follow-up time, whereas the logistic regression model analyses data without follow-up time. They also have different measures of association: the hazard ratio for the Cox model and the odds ratio for the logistic regression model. A similarity between the two models is that they are both applicable in the prediction of the outcome of a categorical variable, i.e., a variable that can accommodate only a restricted number of categories. In conclusion, the Cox regression model differs from logistic regression by assessing a rate instead of a proportion. The two models can be applied in many other studies since they are suitable methods for analysing data, but the Cox regression model is the more recommended of the two.
Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio
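A minimal sketch of fitting the two models side by side, assuming the lifelines and statsmodels packages; the data frame below is a synthetic stand-in (one lymph-node predictor, a follow-up time and an event indicator), not the SPSS breast cancer data set used in the paper.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from lifelines import CoxPHFitter

# synthetic stand-in: positive lymph nodes, follow-up time (months), death indicator
n = 1121
df = pd.DataFrame({
    "ln_pos": np.random.poisson(3, n),
    "time":   np.random.exponential(60, n),
    "event":  np.random.binomial(1, 0.3, n),
})

# Cox model: uses the follow-up time, association measured as a hazard ratio
cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
print("hazard ratio (ln_pos):", np.exp(cph.params_["ln_pos"]))

# logistic model: ignores the follow-up time, association measured as an odds ratio
logit = sm.Logit(df["event"], sm.add_constant(df[["ln_pos"]])).fit(disp=0)
print("odds ratio (ln_pos):", np.exp(logit.params["ln_pos"]))
```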
Procedia PDF Downloads 455
24841 Developing Research Involving Different Species: Opportunities and Empirical Foundations
Authors: A. V. Varfolomeeva, N. S. Tkachenko, A. G. Tishchenko
Abstract:
The problem of violations of internal validity in studies of psychological structures is considered. The role of researchers' epistemological attitudes in the planning of research within the methodology of the system-evolutionary approach is assessed. Alternative programs of psychological research involving representatives of different biological species are presented. Using the results of two series of studies as an example, possible solutions to the problem are discussed.
Keywords: epistemological attitudes, experimental design, validity, psychological structure, learning
Procedia PDF Downloads 115
24840 Navigating Government Finance Statistics: Effortless Retrieval and Comparative Analysis through Data Science and Machine Learning
Authors: Kwaku Damoah
Abstract:
This paper presents a methodology and software application (App) designed to empower users in accessing, retrieving, and comparatively exploring data within the hierarchical network framework of the Government Finance Statistics (GFS) system. It explores the ease of navigating the GFS system and identifies the gaps filled by the new methodology and App. The GFS embodies a complex Hierarchical Network Classification (HNC) structure, encapsulating institutional units, revenues, expenses, assets, liabilities, and economic activities. Navigating this structure demands specialized knowledge, experience, and skill, posing a significant challenge for effective analytics and fiscal policy decision-making. Many professionals encounter difficulties deciphering these classifications, hindering confident utilization of the system. This accessibility barrier prevents a vast number of professionals, students, policymakers, and the public from leveraging the abundant data and information within the GFS. Leveraging the R programming language, data science analytics and machine learning, an efficient methodology enabling users to access, navigate, and conduct exploratory comparisons was developed. The machine learning Fiscal Analytics App (FLOWZZ) democratizes access to advanced analytics through its user-friendly interface, breaking down expertise barriers.
Keywords: data science, data wrangling, drilldown analytics, government finance statistics, hierarchical network classification, machine learning, web application
Procedia PDF Downloads 70
24839 Value Chain Based New Business Opportunity
Authors: Seonjae Lee, Sungjoo Lee
Abstract:
Excavating new business opportunities is necessary to remain competitive in the current business environment: companies survive rapidly changing industry conditions by adopting new business strategies and overcoming technology challenges. Traditionally, two methods have been used to excavate new businesses. In the first, opportunities are gathered through qualitative analysis of expert opinion; in the second, new technologies are discovered through quantitative analysis of patent data. The second method increases time and cost, and patent data are restricted in their use for the purpose of discovering business opportunities. This study presents new business opportunities in a form customized to a company's characteristics (sector, size, etc.) by reviewing them from the value chain perspective, thereby contributing to the creation of new business opportunities through the proposed model. It utilizes the trademark database of the Korean Intellectual Property Office (KIPO) and the proprietary company information database of Korea Enterprise Data (KED). These data are key to discovering new business opportunities through the analysis of competitors and advanced business trademarks (Module 1) and the trading analysis of competitors found in the KED (Module 2).
Keywords: value chain, trademark, trading analysis, new business opportunity
Procedia PDF Downloads 373
24838 Towards Addressing the Cultural Snapshot Phenomenon in Cultural Mapping Libraries
Authors: Mousouris Spiridon, Kavakli Evangelia
Abstract:
This paper focuses on Digital Libraries (DLs) that contain and geovisualise cultural data, highlighting the need to define them as a separate category, termed Cultural Mapping Libraries, based on their inherent connection of culture with geographic location and on their design requirements in support of the visual representation of cultural data on the map. An exploratory analysis of DLs that conform to the above definition brought forward the observation that existing Cultural Mapping Libraries fail to geovisualise the entirety of cultural data per point of interest, thus resulting in a Cultural Snapshot phenomenon. The existence of this phenomenon was reinforced by the results of systematic bibliographic research. In order to address the Cultural Snapshot, this paper proposes the use of Semantic Web principles to efficiently interconnect spatial cultural data through time, per geographic location. In this way, points of interest are transformed into scenery where culture evolves over time. This evolution is expressed as occurrences taking place chronologically, in an event-oriented approach, a conceptualization also endorsed by the CIDOC Conceptual Reference Model (CIDOC CRM). In particular, we posit the use of CIDOC CRM as the baseline for defining the logic of Cultural Mapping Libraries as part of the Culture Domain, in accordance with the Digital Library Reference Model, in order to define the rules of cultural data management by the system. Our future goal is to transform this conceptual definition into inference rules that resolve the Cultural Snapshot and lead to a more complete geovisualisation of cultural data.
Keywords: digital libraries, semantic web, geovisualization, CIDOC-CRM
Procedia PDF Downloads 109
24837 An Evaluation of the Impact of E-Banking on Operational Efficiency of Banks in Nigeria
Authors: Ibrahim Rabiu Darazo
Abstract:
This research was conducted on the impact of e-banking on the operational efficiency of banks in Nigeria, taking as a case study some selected banks (Diamond Bank Plc, GTBank Plc, and Fidelity Bank Plc). It is quantitative research using both primary and secondary sources of data. Questionnaires were used to obtain accurate data: 150 questionnaires were distributed among staff and customers of the three banks, and the data collected were analysed using the chi-square test, whereas the secondary data were obtained from relevant textbooks, journals and web sites. It is clear from the findings that the banks' use of e-banking has improved their efficiency in terms of providing efficient services to customers electronically, using Internet banking, telephone banking and ATMs; reducing the time taken to serve customers; allowing new customers to open an account online; and giving customers access to their accounts at all times (24/7). E-banking provides access to customer information from the database, and the cost of cheques and postage has been eliminated. The recommendations at the end of the research include: the banks should try to update their electronic gadgets, e-fraud (internal and external) should be controlled, banks should employ qualified manpower, and biometric ATMs should be introduced to reduce fraud with ATM cards, as is done in other countries such as the USA.
Keywords: banks, electronic banking, operational efficiency of banks, biometric ATMs
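A minimal sketch of the chi-square step, assuming SciPy; the contingency table below is an invented example of how the 150 questionnaire responses could be tabulated, not the study's actual counts.

```python
import numpy as np
from scipy.stats import chi2_contingency

# illustrative tabulation of the 150 questionnaires:
# rows = respondent group (staff, customers), columns = "e-banking improved efficiency?" (yes/no)
observed = np.array([[55, 20],
                     [60, 15]])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
# p < 0.05 would indicate a statistically significant association in the table
```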
Procedia PDF Downloads 333
24836 Optimize Data Evaluation Metrics for Fraud Detection Using Machine Learning
Authors: Jennifer Leach, Umashanger Thayasivam
Abstract:
The use of technology has benefited society in more ways than one ever thought possible. Unfortunately, as society's knowledge of technology has advanced, so has its knowledge of ways to use technology to manipulate people. This has led to a simultaneous advancement in the world of fraud. Machine learning techniques can offer a possible solution to help decrease this advancement. This research explores how the use of various machine learning techniques can aid in detecting fraudulent activity across two different types of fraudulent data, and the accuracy, precision, recall, and F1 score were recorded for each method. Each machine learning model was also tested across five different training and testing splits in order to discover which testing split and technique would lead to the optimal results.
Keywords: data science, fraud detection, machine learning, supervised learning
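A minimal sketch of evaluating several classifiers across different train/test splits, assuming scikit-learn; the transaction matrix, fraud rate, model list and split fractions are illustrative placeholders, not the data sets or settings used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

X = np.random.rand(10000, 20)                      # stand-in transaction features
y = (np.random.rand(10000) < 0.03).astype(int)     # ~3% fraudulent records

models = {"logistic": LogisticRegression(max_iter=1000),
          "random forest": RandomForestClassifier(n_estimators=200)}
splits = [0.1, 0.2, 0.3, 0.4, 0.5]                 # five different test-set fractions

for test_size in splits:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_size,
                                              stratify=y, random_state=0)
    for name, model in models.items():
        pred = model.fit(X_tr, y_tr).predict(X_te)
        print(test_size, name,
              "acc", round(accuracy_score(y_te, pred), 3),
              "prec", round(precision_score(y_te, pred, zero_division=0), 3),
              "rec", round(recall_score(y_te, pred), 3),
              "F1", round(f1_score(y_te, pred, zero_division=0), 3))
```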
Procedia PDF Downloads 196
24835 Suitability of Satellite-Based Data for Groundwater Modelling in Southwest Nigeria
Authors: O. O. Aiyelokun, O. A. Agbede
Abstract:
Numerical modelling of groundwater flow can be susceptible to calibration errors due to the lack of adequate ground-based hydro-meteorological stations in river basins. Groundwater resources management in Southwest Nigeria is currently challenged by overexploitation, lack of planning and monitoring, urbanization and climate change; hence, if models are to be adopted as decision support tools for the sustainable management of groundwater, they must be adequately calibrated. Since river basins in Southwest Nigeria are characterized by missing data and a lack of adequate ground-based hydro-meteorological stations, adopting satellite-based data for constructing distributed models is crucial. This study seeks to evaluate the suitability of satellite-based data as a substitute for ground-based data for computing boundary conditions, by determining whether ground- and satellite-based meteorological data fit well in the Ogun and Oshun River basins. The Climate Forecast System Reanalysis (CFSR) global meteorological dataset was first obtained in daily form and converted to monthly form for a period of 432 months (January 1979 to June 2014). Afterwards, ground-based meteorological data for Ikeja (1981-2010), Abeokuta (1983-2010), and Oshogbo (1981-2010) were compared with the CFSR data using goodness-of-fit (GOF) statistics. The study revealed that, based on the mean absolute error (MAE), coefficient of correlation (r) and coefficient of determination (R²), all meteorological variables except wind speed fit well. It was further revealed that maximum and minimum temperature, relative humidity and rainfall had high values of the index of agreement (d) and the ratio of standard deviations (rSD), implying that the CFSR dataset could be used to compute boundary conditions such as groundwater recharge and potential evapotranspiration. The study concluded that satellite-based data such as the CFSR should be used as input when constructing groundwater flow models in river basins in Southwest Nigeria, where the majority of the river basins are partially gauged and characterized by long gaps in hydro-meteorological records.
Keywords: boundary condition, goodness of fit, groundwater, satellite-based data
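A minimal sketch of the goodness-of-fit comparison, assuming NumPy; the formulas for MAE, r, R², Willmott's index of agreement (d) and rSD follow their standard definitions, while the "station" and "CFSR" series below are simulated stand-ins for the monthly records.

```python
import numpy as np

def goodness_of_fit(obs, sim):
    """MAE, r, R^2, Willmott's index of agreement (d), and rSD for station vs. CFSR series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    mae = np.mean(np.abs(sim - obs))
    r = np.corrcoef(obs, sim)[0, 1]
    d = 1 - np.sum((sim - obs) ** 2) / np.sum(
        (np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    rsd = sim.std() / obs.std()
    return {"MAE": mae, "r": r, "R2": r ** 2, "d": d, "rSD": rsd}

# monthly ground-station rainfall vs. the matching CFSR grid cell (synthetic example)
station = np.random.gamma(2.0, 60.0, 360)              # 30 years of monthly totals
cfsr = station * np.random.normal(1.0, 0.15, 360)      # correlated reanalysis estimate
print(goodness_of_fit(station, cfsr))
```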
Procedia PDF Downloads 130
24834 Power Generation and Treatment Potential of Microbial Fuel Cell (MFC) from Landfill Leachate
Authors: Beenish Saba, Ann D. Christy
Abstract:
Modern-day municipal solid waste landfills are operated and controlled to protect the environment from contaminants during the biological stabilization and degradation of the solid waste. They are equipped with liners, caps, and gas and leachate collection systems. Landfill gas is passively or actively collected and can be used as a biofuel after the necessary purification, but leachate treatment is the more difficult challenge. Leachate, if not recirculated in a bioreactor landfill system, is typically transported to a local wastewater treatment plant for treatment. These plants are designed for sewage treatment and often charge additional fees for higher-strength wastewaters such as leachate, if they accept them at all. Different biological, chemical, physical and integrated techniques can be used to treat the leachate. Treating that leachate with simultaneous power production using microbial fuel cell (MFC) technology is a recent innovation whose reported applications are still at an early stage. High chemical oxygen demand (COD), ionic strength and salt concentration are some of the characteristics that make leachate an excellent substrate for power production in MFCs. Different electrode materials, microbial communities, carbon co-substrates and temperature conditions are some factors that can be optimized to achieve simultaneous power production and treatment. The advantage of the MFC is its dual functionality, but low power production and high costs are the hurdles to its commercialization and more widespread application. The studies so far suggest that landfill leachate MFCs can produce 1.8 mW/m2 with 79% COD removal, while amendment with food leachate or domestic wastewater can increase performance up to 18 W/m3 with 90% COD removal. The coulombic efficiency is reported to vary between 2-60%. However, efforts towards biofilm optimization, studies of efficient electron transport systems and the use of genetic tools can increase the efficiency of the MFC and will determine its future potential for treating landfill leachate.
Keywords: microbial fuel cell, landfill leachate, power generation, MFC
Procedia PDF Downloads 317