Search results for: missing data estimation
24991 Development of Fault Diagnosis Technology for Power System Based on Smart Meter
Authors: Chih-Chieh Yang, Chung-Neng Huang
Abstract:
In power systems, improving the fault diagnosis technology for transmission lines has always been a primary goal of grid operators. In recent years, owing to the rise of green energy, the addition of various kinds of distributed generation has also affected the stability of the power system. Smart meters provide data recording and bidirectional transmission, while the adaptive neuro-fuzzy inference system (ANFIS) is an artificial intelligence technique with learning and estimation capabilities. To avoid misjudging the fault type and location in the transmission network due to the input of these unstable power sources, this study combines the above advantages of smart meters and ANFIS and proposes a method for identifying fault types and fault locations. In ANFIS training, the bus voltage and current information collected by smart meters is trained through the ANFIS tool in MATLAB to generate fault codes that identify different fault types and fault locations. In addition, because of the uncertainty of distributed generation, a wind power system is added to the transmission network to verify the diagnostic correctness of the method. Simulation results show that the proposed method can correctly and efficiently identify the fault type and fault location, and can cope with the interference caused by the addition of unstable power sources.
Keywords: ANFIS, fault diagnosis, power system, smart meter
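Since the MATLAB ANFIS tool described above has no standard Python counterpart, the sketch below only illustrates the general shape of the classification step: mapping smart-meter bus voltage and current features to fault codes with a generic neural classifier standing in for ANFIS. The feature layout, fault labels, and synthetic data are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' MATLAB/ANFIS workflow): a generic classifier
# mapping smart-meter bus voltage/current features to fault codes.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 6))          # e.g. three-phase bus voltages and currents (illustrative)
y = np.argmax(X[:, :4], axis=1)      # stand-in fault codes (0-3) derived from the features

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```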
Procedia PDF Downloads 140
24990 The Diffusion of Membrane Nanodomains with Specific Ganglioside Composition
Authors: Barbora Chmelova, Radek Sachl
Abstract:
Gangliosides are amphipathic membrane lipids. Because they combine bulky oligosaccharide chains containing one or more sialic acids with a hydrophobic ceramide base, gangliosides are classified among the glycosphingolipids. This unique structure gives gangliosides a strong tendency to self-aggregate and therefore to form nanoscopic clusters called nanodomains. Gangliosides are preferentially present in the extracellular membrane leaflet of all human tissues and thus influence a large number of biological processes, such as intercellular communication, cell signalling, membrane trafficking, and regulation of receptor activity. Defects in their metabolism, impairment of proper ganglioside function, or changes in their organization lead to serious health conditions such as Alzheimer's and Parkinson's diseases, autoimmune diseases, tumour growth, etc. This work focuses mainly on the organization of gangliosides into nanodomains and on their dynamics within the plasma membrane. Current research characterizes static ganglioside nanodomains; however, information about their diffusion is missing. In our study, fluorescence correlation spectroscopy is combined with stimulated emission depletion (STED-FCS), which couples diffraction-unlimited spatial resolution with high temporal resolution. By comparing experiments performed on model vesicles containing 4% of either GM1, GM2, or GM3 with Monte Carlo simulations of diffusion in the plasma membrane, we describe ganglioside clustering, the diffusion of the nanodomains, and even the diffusion of ganglioside molecules inside the investigated nanodomains.
Keywords: gangliosides, nanodomains, STED-FCS, fluorescence microscopy, membrane diffusion
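As a hedged sketch of the Monte Carlo side only, the snippet below simulates free 2D random walkers and recovers the diffusion coefficient from the mean squared displacement via MSD = 4Dt. The parameter values are invented, and the study's actual simulations additionally include nanodomain partitioning, which is not modelled here.

```python
# Minimal 2D Monte Carlo diffusion sketch (not the study's membrane model):
# free random walkers whose mean squared displacement recovers D via MSD = 4*D*t.
import numpy as np

D = 0.5                      # um^2/s, illustrative
dt = 1e-3                    # s
n_steps, n_walkers = 2000, 500
sigma = np.sqrt(2 * D * dt)  # per-axis step standard deviation

steps = np.random.normal(0.0, sigma, size=(n_steps, n_walkers, 2))
paths = np.cumsum(steps, axis=0)
msd = np.mean(np.sum(paths**2, axis=2), axis=1)   # average over walkers
t = dt * np.arange(1, n_steps + 1)
D_est = np.polyfit(t, msd, 1)[0] / 4.0            # slope / 4 for 2D diffusion
print(f"input D = {D}, recovered D = {D_est:.3f} um^2/s")
```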
Procedia PDF Downloads 81
24989 The Effect of Measurement Distribution on System Identification and Detection of Behavior of Nonlinearities of Data
Authors: Mohammad Javad Mollakazemi, Farhad Asadi, Aref Ghafouri
Abstract:
In this paper, we apply parametric modeling to experimental data from dynamical systems. We investigate different distributions of the output measurements of several dynamical systems. By processing the variance of the experimental data, we locate the region of nonlinearity in the data, and identification of the output section is then applied under different situations and data distributions. Finally, the effect of the spread of the measurements, such as the variance, on identification, and the limitations of this approach, are explained.
Keywords: Gaussian process, nonlinearity distribution, particle filter, system identification
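One possible reading of the variance-processing step is a sliding-window variance used to flag a candidate nonlinear region of the output record. The sketch below assumes that interpretation; the signal, window length, and threshold are illustrative, not the paper's experimental data.

```python
# Hedged sketch: flag a candidate "nonlinear" region by a sliding-window
# variance exceeding a simple threshold. Signal and threshold are illustrative.
import numpy as np

t = np.linspace(0, 10, 2000)
y = np.sin(t) + np.where(t > 6, 0.3 * np.sin(5 * t) ** 2, 0) + 0.02 * np.random.randn(t.size)

win = 100
var = np.array([y[i:i + win].var() for i in range(y.size - win)])
threshold = 3 * np.median(var)
flagged = np.where(var > threshold)[0]
if flagged.size:
    print(f"candidate region: t in [{t[flagged[0]]:.2f}, {t[flagged[-1] + win]:.2f}]")
```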
Procedia PDF Downloads 516
24988 Building a Scalable Telemetry Based Multiclass Predictive Maintenance Model in R
Authors: Jaya Mathew
Abstract:
Many organizations are faced with the challenge of how to analyze their sensitive telemetry data and build machine learning models on it. In this paper, we discuss how users can leverage the power of R without having to move their big data around, as well as a cloud-based solution for organizations willing to host their data in the cloud. By using ScaleR technology to benefit from parallelization and remote computing, or R Services on premises or in the cloud, users can leverage the power of R at scale without having to move their data.
Keywords: predictive maintenance, machine learning, big data, cloud based, on premise solution, R
Procedia PDF Downloads 379
24987 Analyzing the Impact of Migration on HIV and AIDS Incidence Cases in Malaysia
Authors: Ofosuhene O. Apenteng, Noor Azina Ismail
Abstract:
The human immunodeficiency virus (HIV) that causes acquired immune deficiency syndrome (AIDS) remains a global cause of morbidity and mortality and has caused panic since its emergence. The relationships between migration and HIV/AIDS have become complex. In the absence of prospectively designed studies, dynamic mathematical models that take migration movements into account can give very useful information. We explore the utility of mathematical models in understanding the transmission dynamics of HIV and AIDS and in assessing the magnitude of the impact that migration has had on the disease. The model was calibrated to HIV and AIDS incidence data from the Malaysian Ministry of Health for the period 1986 to 2011 using Bayesian analysis combined with a Markov chain Monte Carlo (MCMC) approach to estimate the model parameters. From the estimated parameters, the basic reproduction number was estimated to be 22.5812. The rate at which susceptible individuals move to the HIV compartment has the highest sensitivity value and is more significant than the remaining parameters; thus, the epidemic is unstable. This is a big concern and not a good indicator from the public health point of view, since the aim is to stabilize the epidemic at the disease-free equilibrium. These results suggest that the government, as a policy maker, should make further efforts to curb illegal activities performed by migrants. It is shown that our models reflect the dynamic behavior of the HIV/AIDS epidemic in Malaysia considerably well and could eventually be used strategically for other countries.
Keywords: epidemic model, reproduction number, HIV, MCMC, parameter estimation
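To make the compartmental idea concrete, here is a minimal susceptible-infected model with a constant migrant inflow, integrated with SciPy. The equations, parameter values, and the R0 expression apply only to this toy model; they are not the paper's fitted model or its estimate of 22.58.

```python
# Toy S-I compartment model with a constant migration inflow, loosely in the
# spirit of the paper's transmission model. All parameters are illustrative.
import numpy as np
from scipy.integrate import odeint

beta, gamma, mu, m = 0.4, 0.1, 0.02, 0.005   # transmission, progression, exit, migrant inflow

def deriv(y, t):
    S, I = y
    N = S + I
    dS = mu * N + m - beta * S * I / N - mu * S
    dI = beta * S * I / N - (gamma + mu) * I
    return [dS, dI]

t = np.linspace(0, 50, 501)
sol = odeint(deriv, [0.99, 0.01], t)
R0 = beta / (gamma + mu)                      # basic reproduction number of this toy model
print("toy-model R0 =", round(R0, 3), "; final infected fraction =", round(sol[-1, 1], 3))
```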
Procedia PDF Downloads 367
24986 Comparison of Rainfall Trends in the Western Ghats and Coastal Region of Karnataka, India
Authors: Vinay C. Doranalu, Amba Shetty
Abstract:
In recent times, due to climate change, there is large variation in the spatial distribution of daily rainfall even within a small region. Rainfall is one of the key climatic variables that affect the spatio-temporal patterns of water availability. The real task posed by climate change is the identification, estimation, and understanding of the uncertainty of rainfall. This study analyzes the spatial variations and temporal trends of daily precipitation using high-resolution (0.25° x 0.25°) gridded data from the Indian Meteorological Department (IMD). For the study, 38 grid points were selected in the study area, and the daily precipitation time series (113 years) was analyzed over the period 1901-2013. Grid points were divided into two zones based on elevation and location: Low Land (exposed to the sea and low-elevation area, i.e., the coastal region) and High Land (inland, high-elevation area, i.e., the Western Ghats). Each grid point was examined for spatial patterns and temporal trends using the non-parametric Mann-Kendall test and the Theil-Sen estimator to determine the nature of the trend and the magnitude of its slope. The Pettitt-Mann-Whitney test was applied to detect the most probable change point in the trends over the period. The results reveal a remarkable monotonic trend in daily precipitation at each grid point. In general, the regional cluster analysis found an increasing precipitation trend in the shoreline region and a decreasing trend in the Western Ghats in recent years. The spatial distribution of rainfall can be partly explained by heterogeneity in the temporal trends of rainfall identified by the change point analysis. The Mann-Kendall test shows significant variation, with weaker rainfall in the rainfall distribution over the eastern parts of the Western Ghats region of Karnataka.
Keywords: change point analysis, coastal region India, gridded rainfall data, non-parametric
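A per-grid version of this trend analysis can be sketched with SciPy: Kendall's tau against time as a Mann-Kendall-type monotonic trend test, the Theil-Sen slope for trend magnitude, and a simple Pettitt statistic for the change point. The annual rainfall series below is synthetic, not the IMD gridded data.

```python
# Sketch of per-grid trend tests on a synthetic annual rainfall series.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
years = np.arange(1901, 2014)
rain = 3000 + 2.0 * (years - 1901) + rng.normal(0, 150, years.size)  # mm, illustrative

tau, p = stats.kendalltau(years, rain)          # monotonic trend (Mann-Kendall-type)
slope, intercept, lo, hi = stats.theilslopes(rain, years)  # Sen slope, mm/year

# Pettitt statistic: U_t = sum_{i<=t} sum_{j>t} sign(x_j - x_i)
sgn = np.sign(rain[None, :] - rain[:, None])
U = np.array([sgn[:t + 1, t + 1:].sum() for t in range(rain.size - 1)])
change_year = years[np.argmax(np.abs(U))]

print(f"tau = {tau:.2f} (p = {p:.3f}), Sen slope = {slope:.2f} mm/yr, change point ~ {change_year}")
```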
Procedia PDF Downloads 295
24985 Trusting the Big Data Analytics Process from the Perspective of Different Stakeholders
Authors: Sven Gehrke, Johannes Ruhland
Abstract:
Data is the oil of our time; without it, progress would come to a halt [1]. On the other hand, mistrust of data mining is increasing [2]. The paper at hand shows different aspects of the concept of trust and describes the information asymmetry between the typical stakeholders of a data mining project using the CRISP-DM phase model. Based on the identified influencing factors related to trust, problematic aspects of the current approach are examined through interviews with the stakeholders. The results of the interviews confirm the theoretically identified weak points of the phase model with regard to trust and point to potential research areas.
Keywords: trust, data mining, CRISP DM, stakeholder management
Procedia PDF Downloads 94
24984 Wireless Transmission of Big Data Using Novel Secure Algorithm
Authors: K. Thiagarajan, K. Saranya, A. Veeraiah, B. Sudha
Abstract:
This paper presents a novel algorithm for the secure, reliable, and flexible transmission of big data in two-hop wireless networks using a cooperative jamming scheme. Two-hop wireless networks consist of source, relay, and destination nodes. Big data has to be transmitted from source to relay and from relay to destination, with security deployed at the physical layer. The cooperative jamming scheme enables big data to be transmitted more securely by protecting it from eavesdroppers and malicious nodes of unknown location. The novel algorithm, which ensures secure and energy-balanced transmission of big data, includes selecting the data transmission region, segmenting the selected region, determining the probability ratio for each node (capture node, non-capture node, and eavesdropper node) in every segment, and evaluating the probability using a binary-based evaluation. If the transmission is secure, the two-hop transmission of big data resumes; otherwise, the attackers are prevented by the cooperative jamming scheme and the data is transmitted in two-hop transmission.
Keywords: big data, two-hop transmission, physical layer wireless security, cooperative jamming, energy balance
Procedia PDF Downloads 491
24983 Roughness Discrimination Using Bioinspired Tactile Sensors
Authors: Zhengkun Yi
Abstract:
Surface texture discrimination using artificial tactile sensors has attracted increasing attention in the past decade, as it can endow technical and robotic systems with a key missing ability. However, roughness, a major component of texture, has rarely been explored. This paper presents an approach for tactile surface roughness discrimination, which includes two parts: (1) design and fabrication of a bioinspired artificial fingertip, and (2) tactile signal processing for surface roughness discrimination. The bioinspired fingertip comprises two polydimethylsiloxane (PDMS) layers, a polymethyl methacrylate (PMMA) bar, and two perpendicular polyvinylidene difluoride (PVDF) film sensors. This artificial fingertip mimics human fingertips in three respects: (1) the elastic properties of the epidermis and dermis of human skin are replicated by the two PDMS layers with different stiffness, (2) the PMMA bar serves a role analogous to that of a bone, and (3) the PVDF film sensors emulate Meissner's corpuscles in both their location and their response to vibratory stimuli. Various extracted features and classification algorithms, including support vector machines (SVM) and k-nearest neighbors (kNN), are examined for tactile surface roughness discrimination. Eight standard rough surfaces with roughness values (Ra) of 50 μm, 25 μm, 12.5 μm, 6.3 μm, 3.2 μm, 1.6 μm, 0.8 μm, and 0.4 μm are explored. The highest classification accuracy of (82.6 ± 10.8)% is achieved using solely one PVDF film sensor with the kNN (k = 9) classifier and the standard deviation feature.
Keywords: bioinspired fingertip, classifier, feature extraction, roughness discrimination
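The classification step described above (a single standard-deviation feature per PVDF signal fed to kNN with k = 9) can be sketched as follows. The signals are simulated stand-ins whose variance loosely scales with Ra; they are not the recorded sensor data, and the resulting accuracy is not the paper's figure.

```python
# Sketch of the kNN classification step on simulated PVDF-like signals.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
ra_values = [50, 25, 12.5, 6.3, 3.2, 1.6, 0.8, 0.4]     # um, the eight surfaces
X, y = [], []
for label, ra in enumerate(ra_values):
    for _ in range(30):                                  # 30 simulated trials per surface
        signal = rng.normal(0, 0.05 + 0.01 * ra, size=2000)
        X.append([signal.std()])                         # single standard-deviation feature
        y.append(label)

clf = KNeighborsClassifier(n_neighbors=9)
scores = cross_val_score(clf, np.array(X), np.array(y), cv=5)
print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```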
Procedia PDF Downloads 313
24982 Detection and Classification Strabismus Using Convolutional Neural Network and Spatial Image Processing
Authors: Anoop T. R., Otman Basir, Robert F. Hess, Eileen E. Birch, Brooke A. Koritala, Reed M. Jost, Becky Luu, David Stager, Ben Thompson
Abstract:
Strabismus refers to a misalignment of the eyes. Early detection and treatment of strabismus in childhood can prevent the development of permanent vision loss due to abnormal development of visual brain areas. We developed a two-stage method for strabismus detection and classification based on photographs of the face. The first stage detects the presence or absence of strabismus, and the second stage classifies the type of strabismus. The first stage comprises face detection using a Haar cascade, facial landmark estimation, face alignment, aligned-face landmark detection, segmentation of the eye region, and detection of strabismus using a VGG16 convolutional neural network. Face alignment transforms the face to a canonical pose to ensure consistency in subsequent analysis. Using facial landmarks, the eye region is segmented from the aligned face and fed into the VGG16 CNN model, which has been trained to classify strabismus. The CNN determines whether strabismus is present and classifies the type of strabismus (exotropia, esotropia, and vertical deviation). If stage 1 detects strabismus, the eye region image is fed into stage 2, which starts with the estimation of pupil center coordinates using a Mask R-CNN deep neural network. Then, the distance between the pupil coordinates and the eye landmarks is calculated, along with the angle that the pupil coordinates make with the horizontal and vertical axes. The distance and angle information is used to characterize the degree and direction of the strabismic eye misalignment. This model was tested on 100 clinically labeled images of children with (n = 50) and without (n = 50) strabismus. The True Positive Rate (TPR) and False Positive Rate (FPR) of the first stage were 94% and 6%, respectively. The classification stage produced a TPR of 94.73%, 94.44%, and 100% for esotropia, exotropia, and vertical deviation, respectively. The method also had an FPR of 5.26%, 5.55%, and 0% for esotropia, exotropia, and vertical deviation, respectively. The addition of one more feature related to the location of corneal light reflections may reduce the FPR, which was primarily due to children with pseudo-strabismus (the appearance of strabismus due to a wide nasal bridge or skin folds on the nasal side of the eyes).
Keywords: strabismus, deep neural networks, face detection, facial landmarks, face alignment, segmentation, VGG 16, mask R-CNN, pupil coordinates, angle deviation, horizontal and vertical deviation
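The stage-2 geometry (distance between pupil centre and an eye landmark, and the angles of that offset with the horizontal and vertical axes) is straightforward to sketch. The pixel coordinates below are invented; in the actual pipeline the pupil centre would come from the Mask R-CNN output and the landmark from the facial-landmark step.

```python
# Sketch of the stage-2 geometric measurements; coordinates are illustrative.
import numpy as np

pupil = np.array([412.0, 268.0])       # pupil centre in image pixels (assumed)
landmark = np.array([398.0, 262.0])    # e.g. inner eye-corner landmark (assumed)

offset = pupil - landmark
distance = np.linalg.norm(offset)
angle_horizontal = np.degrees(np.arctan2(offset[1], offset[0]))
angle_vertical = np.degrees(np.arctan2(offset[0], offset[1]))

print(f"distance = {distance:.1f} px, "
      f"angle to horizontal = {angle_horizontal:.1f} deg, "
      f"angle to vertical = {angle_vertical:.1f} deg")
```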
Procedia PDF Downloads 96
24981 One Step Further: Pull-Process-Push Data Processing
Authors: Romeo Botes, Imelda Smit
Abstract:
In today's age of technology, vast amounts of data need to be processed in real time to keep users satisfied. This data comes from various sources and in many formats, including electronic and mobile devices such as GPRS modems and GPS devices, which use different protocols, including TCP, UDP, and HTTP/S, to communicate data to web servers and eventually to users. The data obtained from these devices may provide valuable information to users, but it is mostly in an unreadable format that needs to be processed to provide information and business intelligence. This data is not always current; it is mostly historical data, and it is not subject to the consistency and redundancy measures that most other data usually is. Most important to the users is that the data be pre-processed into a readable format when it is entered into the database. To accomplish this, programmers build processing programs and scripts to decode and process the information stored in databases. Programmers use various techniques in such programs, but sometimes neglect the effect some of these techniques may have on database performance. One technique generally used is to pull data from the database server, process it, and push it back to the database server in one single step. Since processing the data usually takes some time, this keeps the database busy and locked for the period that the processing takes place, which decreases the overall performance of the database server and therefore of the system. This paper follows on a paper discussing the performance increase that may be achieved by utilizing array lists along with a pull-process-push data processing technique split into three steps. The purpose of this paper is to expand the number of clients when comparing the two techniques, to establish the impact this may have on the performance of the CPU, storage, and processing time.
Keywords: performance measures, algorithm techniques, data processing, push data, process data, array list
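The three-step pull-process-push idea can be sketched with SQLite as a stand-in database: pull the raw rows into an in-memory list, process them while no database work is pending, then push the results back in one bulk update. The table and column names and the decoding logic are illustrative, not the authors' system.

```python
# Sketch of pull-process-push in three steps, using sqlite3 as a stand-in.
import sqlite3

def decode(raw):                      # placeholder for the real decoding logic
    return raw.strip().upper()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE telemetry (id INTEGER PRIMARY KEY, raw TEXT, decoded TEXT)")
conn.executemany("INSERT INTO telemetry (raw) VALUES (?)",
                 [(" gps,12.3,45.6 ",), (" gprs,7.8,9.0 ",)])

# Step 1: pull the raw rows into an in-memory list
rows = conn.execute("SELECT id, raw FROM telemetry").fetchall()

# Step 2: process in memory; the database holds no work during this step
processed = [(decode(raw), row_id) for row_id, raw in rows]

# Step 3: push the results back in a single batch
conn.executemany("UPDATE telemetry SET decoded = ? WHERE id = ?", processed)
conn.commit()
print(conn.execute("SELECT * FROM telemetry").fetchall())
```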
Procedia PDF Downloads 245
24980 Estimating Poverty Levels from Satellite Imagery: A Comparison of Human Readers and an Artificial Intelligence Model
Authors: Ola Hall, Ibrahim Wahab, Thorsteinn Rognvaldsson, Mattias Ohlsson
Abstract:
The subfield of poverty and welfare estimation that applies machine learning tools and methods to satellite imagery is a nascent but rapidly growing one. This is in part driven by the Sustainable Development Goals, whose overarching principle is that no region is left behind. Among other things, this requires that welfare levels can be accurately and rapidly estimated at different spatial scales and resolutions. Conventional tools of household surveys and interviews do not suffice in this regard. While they are useful for gaining a longitudinal understanding of the welfare levels of populations, they do not offer adequate spatial coverage for the accuracy that is needed, nor is their implementation sufficiently swift to gain an accurate insight into people and places. It is this void that satellite imagery fills. Previously, this was near-impossible to implement due to the sheer volume of data that needed processing. Recent advances in machine learning, especially the deep learning subtype, such as deep neural networks, have made this a rapidly growing area of scholarship. Despite their unprecedented levels of performance, such models lack transparency and explainability and thus have seen limited downstream applications, as humans are generally apprehensive of techniques that are not inherently interpretable and trustworthy. While several studies have demonstrated the superhuman performance of AI models, none has directly compared the performance of such models and human readers in the domain of poverty studies. In the present study, we directly compare the performance of human readers and a DL model using different resolutions of satellite imagery to estimate the welfare levels of Demographic and Health Survey clusters in Tanzania, using the wealth quintile ratings from the same survey as the ground truth data. The cluster-level imagery covers all 608 cluster locations, of which 428 were classified as rural. The imagery for the human readers was sourced from the Google Maps Platform at an ultra-high resolution of 0.6 m per pixel at zoom level 18, while that for the machine learning model was sourced from the comparatively lower resolution Sentinel-2 10 m per pixel data for the same cluster locations. Rank correlation coefficients of between 0.31 and 0.32 achieved by the human readers were much lower than those attained by the machine learning model (0.69-0.79). This superhuman performance by the model is even more significant given that it was trained on the relatively lower 10-meter resolution satellite data, while the human readers estimated welfare levels from the higher 0.6 m spatial resolution data in which key markers of poverty and slums, such as roofing and road quality, are discernible. It is important to note, however, that the human readers did not receive any training before the ratings, and had this been done, their performance might have improved. The stellar performance of the model also comes with the inevitable shortfall of limited transparency and explainability. The findings have significant implications for attaining the objective of the current frontier of deep learning models in this domain of scholarship, namely explainable artificial intelligence, through a collaborative rather than a comparative framework.
Keywords: poverty prediction, satellite imagery, human readers, machine learning, Tanzania
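The comparison metric above is a rank correlation between estimated wealth and the survey-based ground truth, which can be sketched with SciPy's Spearman correlation. The ratings below are invented for illustration; they are not the study's human or model outputs.

```python
# Sketch of the rank-correlation comparison on invented ratings.
from scipy.stats import spearmanr

ground_truth = [1, 2, 2, 3, 4, 5, 3, 1, 4, 5]   # survey wealth quintiles (illustrative)
estimated    = [1, 3, 2, 3, 3, 5, 4, 2, 4, 4]   # reader or model ratings (illustrative)

rho, p = spearmanr(ground_truth, estimated)
print(f"rank correlation = {rho:.2f} (p = {p:.3f})")
```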
Procedia PDF Downloads 107
24979 Estimating Algae Concentration Based on Deep Learning from Satellite Observation in Korea
Authors: Heewon Jeong, Seongpyo Kim, Joon Ha Kim
Abstract:
Over the last few decades, the coastal regions of Korea have experienced red tide algal blooms, which are harmful and toxic to both humans and marine organisms. This has been accelerated by eutrophication from human activities, certain oceanic processes, and climate change. Previous studies have tried to monitor and predict ocean algae concentrations with bio-optical algorithms applied to satellite colour images. However, accurate estimation of algal blooms remains challenging because of the complexity of coastal waters. Therefore, this study suggests a new method to identify the concentration of red tide algal blooms from Geostationary Ocean Color Imager (GOCI) images, which represent the marine environment of Korea. The method employs GOCI images of the water-leaving radiances centered at 443 nm, 490 nm, and 660 nm, as well as observed weather data (i.e., humidity, temperature, and atmospheric pressure), as the database for applying the optical characteristics of algae and training the deep learning algorithm. A convolutional neural network (CNN) is used to extract significant features from the images, and an artificial neural network (ANN) is then used to estimate the algae concentration from the extracted features. A backpropagation learning strategy is used to train the deep learning model. The established method was tested and compared with the performance of the GOCI data processing system (GDPS), which is based on standard image processing and optical algorithms. The model estimated algae concentration better than GDPS, which cannot estimate concentrations greater than 5 mg/m³. Thus, the deep learning model was trained successfully to assess algae concentration despite the complexity of the water environment. Furthermore, the results of this system and methodology can be used to improve the performance of remote sensing. Acknowledgement: This work was supported by the 'Climate Technology Development and Application' research project (#K07731) through a grant provided by GIST in 2017.
Keywords: deep learning, algae concentration, remote sensing, satellite
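A minimal architectural sketch of the CNN-plus-ANN idea is shown below: convolutional feature extraction on three-band radiance patches, concatenated with the weather inputs, followed by dense layers regressing a concentration. All layer sizes, patch dimensions, and hyperparameters are assumptions for illustration, not the trained model described above.

```python
# Minimal Keras sketch of a CNN feature extractor + dense regressor (illustrative).
import tensorflow as tf
from tensorflow.keras import layers

patch_in = tf.keras.Input(shape=(32, 32, 3))          # water-leaving radiance patch (443/490/660 nm)
weather_in = tf.keras.Input(shape=(3,))               # humidity, temperature, pressure

x = layers.Conv2D(16, 3, activation="relu")(patch_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)                # CNN-extracted features

h = layers.Concatenate()([x, weather_in])             # join image features with weather data
h = layers.Dense(64, activation="relu")(h)
out = layers.Dense(1, activation="relu")(h)           # algae concentration (mg/m^3)

model = tf.keras.Model([patch_in, weather_in], out)
model.compile(optimizer="adam", loss="mse")
model.summary()
```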
Procedia PDF Downloads 184
24978 Extreme Temperature Forecast in Mbonge, Cameroon Through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution
Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph
Abstract:
In this paper, temperature extremes are forecast by employing the block maxima method of the generalized extreme value (GEV) distribution to analyse temperature data from the Cameroon Development Corporation (CDC). Considering two sets of data (raw data and simulated data) and two models of the GEV distribution (stationary and non-stationary), a return level analysis is carried out. In the stationary model, the return values are constant over time for the raw data, while for the simulated data the return values show an increasing trend with an upper bound. In the non-stationary model, the return levels of both the raw data and the simulated data show an increasing trend with an upper bound. This clearly shows that although temperatures in the tropics show signs of increasing in the future, there is a maximum temperature that is not exceeded. The results of this paper are very important for agricultural and environmental research.
Keywords: forecasting, generalized extreme value (GEV), meteorology, return level
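For the stationary case, a return-level calculation can be sketched with SciPy: fit the GEV to the block (annual) maxima and take the (1 - 1/T) quantile as the T-year return level. The annual maxima below are synthetic, not the CDC data, and the shape, location, and scale values are illustrative.

```python
# Stationary GEV return-level sketch on synthetic annual temperature maxima.
from scipy.stats import genextreme

annual_maxima = 33 + genextreme.rvs(c=0.2, scale=1.2, size=60, random_state=0)  # deg C, synthetic

c, loc, scale = genextreme.fit(annual_maxima)
for T in (10, 50, 100):
    level = genextreme.ppf(1 - 1 / T, c, loc=loc, scale=scale)   # T-year return level
    print(f"{T}-year return level: {level:.2f} deg C")
```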
Procedia PDF Downloads 482
24977 Fine Characterization of Glucose Modified Human Serum Albumin by Different Biophysical and Biochemical Techniques at a Range
Authors: Neelofar, Khursheed Alam, Jamal Ahmad
Abstract:
Protein modification in diabetes mellitus may lead to early glycation products (EGPs), or Amadori products, as well as advanced glycation end products (AGEs). Early glycation involves the reaction of glucose with N-terminal and lysyl side chain amino groups to form a Schiff's base, which undergoes rearrangements to form the more stable early glycation product known as the Amadori product. After the Amadori stage, the reactions become more complicated, leading to the formation of advanced glycation end products (AGEs) that interact with various AGE receptors, thereby playing an important role in the long-term complications of diabetes. The Maillard reaction, or nonenzymatic glycation, accelerates in diabetes due to hyperglycaemia and alters the structure and normal functions of serum proteins, leading to micro- and macrovascular complications in diabetic patients. In this study, Human Serum Albumin (HSA) at a constant concentration was incubated with different concentrations of glucose at 37 °C for a week. By day 4, the Amadori product had formed, which was confirmed by the colorimetric NBT and TBA assays, both of which authenticate the early glycation product. Conformational changes in native HSA, as well as in all samples of Amadori albumin formed with different concentrations of glucose, were investigated by various biophysical and biochemical techniques. The main biophysical techniques used were hyperchromicity, quenching of fluorescence intensity, FTIR, CD, and SDS-PAGE. Further conformational changes were observed by biochemical assays, mainly HMF formation, fructosamine, reduction of fructosamine with NaBH4, carbonyl content estimation, lysine and arginine residue estimation, ANS binding, and thiol group estimation. This study finds structural and biochemical changes in Amadori-modified HSA over the normal to chronic range of glucose with respect to native HSA. As the glucose concentration was increased from the normal to the chronic range, the biochemical and structural changes also increased. The greatest alteration in the secondary and tertiary structure and conformation of glycated HSA was observed at the highest chronic concentration of glucose (75 mM). Although it has been found that Amadori-modified proteins, like AGEs, are also involved in the secondary complications of diabetes, very few studies have analyzed the conformational changes in Amadori-modified proteins due to early glycation. Most studies have examined the structural changes in Amadori protein at a particular glucose concentration, but no study has compared the biophysical and biochemical changes in HSA due to early glycation across a range of glucose concentrations at a constant incubation time. This study therefore provides information about the biochemical and biophysical changes occurring in Amadori-modified albumin over the normal to chronic glucose range in diabetes. Many interventions are currently in use, i.e., glycaemic control, insulin treatment, and other chemical therapies, that can control many aspects of diabetes. However, even with intensive use of current antidiabetic agents, more than 50% of type 2 diabetic patients suffer poor glycaemic control and 18% develop serious complications within six years of diagnosis. Experimental evidence related to diabetes suggests that preventing the nonenzymatic glycation of relevant proteins, or blocking their biological effects, might beneficially influence the evolution of vascular complications in diabetic patients. Alternatively, quantitation of the Amadori adduct of HSA by authentic antibodies against HSA-EGPs can be used as a marker for early detection of the initiation/progression of secondary complications of diabetes. This research work may therefore be helpful for the same.
Keywords: diabetes mellitus, glycation, albumin, amadori, biophysical and biochemical techniques
Procedia PDF Downloads 273
24976 Impact of Stack Caches: Locality Awareness and Cost Effectiveness
Authors: Abdulrahman K. Alshegaifi, Chun-Hsi Huang
Abstract:
Treating data according to its location in memory has received much attention in recent years because of the differing properties of such data, which offer important opportunities for cache utilization. Stack data and non-stack data may interfere with each other's locality in the data cache. One of the important characteristics of stack data is that it has high spatial and temporal locality. In this work, we simulate a non-unified cache design that splits the data cache into stack and non-stack caches in order to keep stack data and non-stack data separate in different caches. We observe that the overall hit rate of the non-unified cache design is sensitive to the size of the non-stack cache. We then investigate the appropriate size and associativity of the stack cache for achieving a high hit ratio, especially when over 99% of accesses are directed to the stack cache. The results show that, on average, a stack cache hit rate of more than 99% is achieved with 2 KB of capacity and 1-way associativity. Further, we analyze the improvement in hit rate when a small, fixed-size stack cache is added at level 1 to a unified cache architecture. The results show that the overall hit rate of the unified cache design with the addition of a 1 KB stack cache is improved by approximately 3.9% on average for the Rijndael benchmark. The stack cache is simulated using the SimpleScalar toolset.
Keywords: hit rate, locality of program, stack cache, stack data
Procedia PDF Downloads 304
24975 Higher Freshwater Fish and Sea Fish Intake Is Inversely Associated with Liver Cancer in Patients with Hepatitis B
Authors: Maomao Cao
Abstract:
Background and aims: While the association between higher fish consumption and lower liver cancer risk has been confirmed, the association between specific fish intake and liver cancer risk remains unknown. We aimed to identify the association between the consumption of specific fish and the risk of liver cancer. Methods: Based on a community-based seropositive hepatitis B cohort involving 18,404 individuals, face-to-face interviews were conducted with a standardized questionnaire to acquire baseline information. Three common fish types were analyzed in this study: freshwater fish, sea fish, and small fish (shrimp, crab, conch, and shellfish). All participants received liver cancer screening, and possible cases were identified by CT or MRI. Multivariable logistic models were applied to estimate odds ratios (OR) and 95% confidence intervals (CI). Multivariate multiple imputation was utilized to impute observations with missing values. Results: 179 liver cancer cases were identified. Consumption of freshwater fish and sea fish at least once a week had a strong inverse association with liver cancer risk compared with the lowest intake level, with adjusted ORs of 0.53 (95% CI, 0.38-0.75) and 0.38 (95% CI, 0.19-0.73), respectively. This inverse association was also observed after the imputation. There was no statistically significant association between intake of small fish and liver cancer risk (OR = 0.58, 95% CI 0.32-1.08). Conclusions: Our findings suggest that consumption of freshwater fish and sea fish at least once a week could reduce liver cancer risk.
Keywords: cross-sectional study, fish intake, liver cancer, risk factor
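The adjusted ORs and confidence intervals reported above come from exponentiating the coefficients of a multivariable logistic model. The sketch below shows only that calculation on an invented data frame with a reduced covariate set; it is not the study's data or full adjustment.

```python
# Sketch of OR estimation: fit a logistic model and exponentiate coefficients.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "freshwater_fish_weekly": rng.integers(0, 2, n),
    "sea_fish_weekly": rng.integers(0, 2, n),
    "age": rng.normal(50, 10, n),
})
# Synthetic outcome with an assumed protective effect of freshwater fish intake
logit = -3 + 0.03 * (df["age"] - 50) - 0.6 * df["freshwater_fish_weekly"]
df["liver_cancer"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = sm.add_constant(df[["freshwater_fish_weekly", "sea_fish_weekly", "age"]])
fit = sm.Logit(df["liver_cancer"], X).fit(disp=0)
print(np.exp(fit.params))          # adjusted odds ratios
print(np.exp(fit.conf_int()))      # 95% confidence intervals
```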
Procedia PDF Downloads 275
24974 Autonomic Threat Avoidance and Self-Healing in Database Management System
Authors: Wajahat Munir, Muhammad Haseeb, Adeel Anjum, Basit Raza, Ahmad Kamran Malik
Abstract:
Databases are key components of software systems. Due to the exponential growth of data, there is a concern that data should be accurate and available. Data in databases is vulnerable to internal and external threats, especially when it contains sensitive information, as in medical or military applications. Whenever data is changed with malicious intent, the results of data analysis may lead to disastrous decisions. Autonomic self-healing in computer systems is modeled on the autonomic system of the human body. In order to guarantee the accuracy and availability of data, we propose a technique which, on a priority basis, tries to prevent any malicious transaction from executing and, in case a malicious transaction does affect the system, heals the system in an isolated mode in such a way that the availability of the system is not compromised. Using this autonomic system, the management cost and time of DBAs can be minimized. In the end, we test our model and present the findings.
Keywords: autonomic computing, self-healing, threat avoidance, security
Procedia PDF Downloads 505
24973 Information Extraction Based on Search Engine Results
Authors: Mohammed R. Elkobaisi, Abdelsalam Maatuk
Abstract:
Search engines are large-scale information retrieval tools for the Web that are currently freely available to all. This paper explains how to convert the raw results returned by search engines into useful information, which represents a new method of data gathering compared with traditional methods. When queries are submitted for multiple keywords, this takes a long time and much effort; hence, we develop a user interface program that searches automatically by taking multiple keywords at the same time, leaving the program to collect the wanted data automatically. The collected raw data is processed using mathematical and statistical theories to eliminate unwanted data and convert it into usable data.
Keywords: search engines, information extraction, agent system
Procedia PDF Downloads 430
24972 Implementation and Performance Analysis of Data Encryption Standard and RSA Algorithm with Image Steganography and Audio Steganography
Authors: S. C. Sharma, Ankit Gambhir, Rajeev Arya
Abstract:
In today's era, data security is an important concern and one of the most demanding issues, because it is essential for people using online banking, e-shopping, reservations, etc. The two major techniques used for secure communication are cryptography and steganography. Cryptographic algorithms scramble the data so that an intruder will not be able to retrieve it, whereas steganography hides the data in some cover file so that the presence of communication is concealed. This paper presents the implementation of the Ron Rivest, Adi Shamir, and Leonard Adleman (RSA) algorithm with image and audio steganography, and of the Data Encryption Standard (DES) algorithm with image and audio steganography. The coding for both algorithms has been done using MATLAB, and it is observed that the combined techniques performed better than the individual techniques. The risk of unauthorized access is alleviated to a certain extent by using these techniques. These techniques could be used in banks, RAW agencies, etc., where highly confidential data is transferred. Finally, a comparison of the two techniques is also given in tabular form.
Keywords: audio steganography, data security, DES, image steganography, intruder, RSA, steganography
Procedia PDF Downloads 291
24971 Data Monetisation by E-commerce Companies: A Need for a Regulatory Framework in India
Authors: Anushtha Saxena
Abstract:
This paper examines the process of data monetisation by e-commerce companies operating in India. Data monetisation is the collection, storage, and analysis of consumers' data in order to use the data generated for profits, revenue, etc. Data monetisation enables e-commerce companies to obtain better business opportunities, innovative products and services, and a competitive edge over others with consumers, and to generate millions in revenue. This paper analyses the issues and challenges that arise from the process of data monetisation. Some of the issues highlighted in the paper pertain to the right to privacy and the protection of the data of e-commerce consumers. At the same time, data monetisation cannot be prohibited, but it can be regulated and monitored by stringent laws and regulations. The right to privacy is a fundamental right guaranteed to the citizens of India through Article 21 of The Constitution of India. The Supreme Court of India recognized the Right to Privacy as a fundamental right in the landmark judgment of Justice K.S. Puttaswamy (Retd) and Another v. Union of India. This paper highlights the legal issue of how e-commerce businesses violate individuals' right to privacy by using the data they collect and store for economic gain and monetisation, and the issue of data protection. The researcher has mainly focused on e-commerce companies, such as online shopping websites, to analyse the legal issue of data monetisation. In the age of the Internet of Things and the digital era, people have shifted to online shopping as it is convenient, easy, flexible, comfortable, time-saving, etc. But at the same time, e-commerce companies store the data of their consumers and use it by selling it to third parties or generating more data from the data stored with them. This violates individuals' right to privacy, because consumers know nothing about what happens to their data when they provide it online. Many times, data is also collected without the consent of individuals. Data can be structured, unstructured, etc., and is used by analytics for monetisation. Indian legislation such as The Information Technology Act, 2000 does not effectively protect e-consumers with regard to their data and how it is used by e-commerce businesses to monetise and generate revenue. The paper also examines the draft Data Protection Bill, 2021, pending in the Parliament of India, and how this Bill could have a huge impact on data monetisation. This paper also aims to study the European Union General Data Protection Regulation and how this legislation can be helpful in the Indian scenario concerning e-commerce businesses with respect to data monetisation.
Keywords: data monetization, e-commerce companies, regulatory framework, GDPR
Procedia PDF Downloads 120
24970 Experiments on Weakly-Supervised Learning on Imperfect Data
Authors: Yan Cheng, Yijun Shao, James Rudolph, Charlene R. Weir, Beth Sahlmann, Qing Zeng-Treitler
Abstract:
Supervised predictive models require labeled data for training purposes. Complete and accurate labeled data, i.e., a 'gold standard', is not always available, and imperfectly labeled data may need to serve as an alternative. An important question is whether the accuracy of the labeled data creates a performance ceiling for the trained model. In this study, we trained several models to recognize the presence of delirium in clinical documents using data with annotations that are not completely accurate (i.e., weakly-supervised learning). In the external evaluation, the support vector machine model with a linear kernel performed best, achieving an area under the curve of 89.3% and accuracy of 88%, surpassing the 80% accuracy of the training sample. We then generated a set of simulated data and carried out a series of experiments which demonstrated that models trained on imperfect data can (but do not always) outperform the accuracy of the training data; e.g., the area under the curve for some models is higher than 80% when trained on data with an error rate of 40%. Our experiments also showed that the error resistance of linear modeling is associated with larger sample size, error type, and linearity of the data (all p-values < 0.001). In conclusion, this study sheds light on the usefulness of imperfect data in clinical research via weakly-supervised learning.
Keywords: weakly-supervised learning, support vector machine, prediction, delirium, simulation
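The simulation idea can be sketched as follows: flip a fraction of the training labels, fit a linear SVM on the noisy labels, and evaluate AUC against clean test labels, which can exceed the accuracy of the imperfect training labels. The data, noise rate, and hyperparameters below are synthetic and illustrative, not the study's simulation design.

```python
# Sketch of a label-noise experiment with a linear SVM on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=4000, n_features=20, n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

error_rate = 0.40
rng = np.random.default_rng(0)
flip = rng.random(y_tr.size) < error_rate
y_noisy = np.where(flip, 1 - y_tr, y_tr)              # weak (imperfect) training labels

clf = LinearSVC(C=1.0, max_iter=5000).fit(X_tr, y_noisy)
auc = roc_auc_score(y_te, clf.decision_function(X_te))
print(f"training-label accuracy: {1 - error_rate:.0%}, test AUC: {auc:.3f}")
```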
Procedia PDF Downloads 200
24969 Estimation of Bio-Kinetic Coefficients for Treatment of Brewery Wastewater
Authors: Abimbola M. Enitan, J. Adeyemo
Abstract:
Anaerobic modeling is a useful tool for describing and simulating the condition and behaviour of anaerobic treatment units to achieve better effluent quality and biogas generation. The present investigation deals with the anaerobic treatment of brewery wastewater at varying organic loads. The chemical oxygen demand (COD) and total suspended solids (TSS) of the influent and effluent of the bioreactor were determined at various retention times to generate data for the kinetic coefficients. The bio-kinetic coefficients of the modified Stover-Kincannon kinetic model and the methane generation model were determined to study the performance of the anaerobic digestion process. At steady state, the kinetic coefficient (K), the endogenous decay coefficient (Kd), the maximum growth rate of microorganisms (µmax), the growth yield coefficient (Y), the ultimate methane yield (Bo), the maximum utilization rate constant (Umax), and the saturation constant (KB) in the models were calculated to be 0.046 g/g COD, 0.083 (d⁻¹), 0.117 (d⁻¹), 0.357 g/g, 0.516 (L CH4/g CODadded), 18.51 (g/L/day), and 13.64 (g/L/day), respectively. The outcome of this study will help in the simulation of anaerobic models to predict usable methane and good effluent quality during the treatment of industrial wastewater. This will protect the environment, conserve natural resources, save time, and reduce the costs incurred by industries for the discharge of untreated or partially treated wastewater. It will also contribute to a sustainable long-term clean development mechanism for the optimization of the methane produced from the anaerobic degradation of waste in a closed system.
Keywords: brewery wastewater, methane generation model, environment, anaerobic modeling
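A common linearisation of the modified Stover-Kincannon model, V/(Q(S0 - S)) = (KB/Umax)(V/(Q S0)) + 1/Umax, lets Umax and KB be read off a straight-line fit (slope = KB/Umax, intercept = 1/Umax). The sketch below assumes that form; the influent/effluent COD values are invented, not the study's measurements.

```python
# Hedged sketch of estimating Umax and KB from the linearised Stover-Kincannon form.
import numpy as np

Q = np.array([2.0, 4.0, 6.0, 8.0, 10.0])        # flow, L/day (illustrative)
S0 = np.full(5, 2.5)                             # influent COD, g/L (illustrative)
S = np.array([0.35, 0.55, 0.80, 1.05, 1.30])    # effluent COD, g/L (illustrative)
V = 5.0                                          # reactor volume, L (illustrative)

x = V / (Q * S0)                                 # inverse organic loading rate
y = V / (Q * (S0 - S))                           # inverse substrate removal rate
slope, intercept = np.polyfit(x, y, 1)
Umax = 1.0 / intercept
KB = slope * Umax
print(f"Umax ~ {Umax:.2f} g/L/day, KB ~ {KB:.2f} g/L/day")
```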
Procedia PDF Downloads 272
24968 Transforming Healthcare Data Privacy: Integrating Blockchain with Zero-Knowledge Proofs and Cryptographic Security
Authors: Kenneth Harper
Abstract:
Blockchain technology presents solutions for managing healthcare data, addressing critical challenges in privacy, integrity, and access. This paper explores how privacy-preserving technologies, such as zero-knowledge proofs (ZKPs) and homomorphic encryption (HE), enhance decentralized healthcare platforms by enabling secure computations and patient data protection. It examines the mathematical foundations of these methods, their practical applications, and how they meet the evolving demands of healthcare data security. Using real-world examples, this research highlights industry-leading implementations and offers a roadmap for future applications in secure, decentralized healthcare ecosystems.
Keywords: blockchain, cryptography, data privacy, decentralized data management, differential privacy, healthcare, healthcare data security, homomorphic encryption, privacy-preserving technologies, secure computations, zero-knowledge proofs
Procedia PDF Downloads 20
24967 Operating Speed Models on Tangent Sections of Two-Lane Rural Roads
Authors: Dražen Cvitanić, Biljana Maljković
Abstract:
This paper presents models for predicting operating speeds on tangent sections of two-lane rural roads developed from continuous speed data. The data correspond to 20 drivers of different ages and driving experience, driving their own cars along an 18 km long section of a state road. The data were first used to determine the maximum operating speeds on tangents and to compare them with the speeds in the middle of the tangents, i.e., the speed data used in most operating speed studies. Analysis of the continuous speed data indicated that spot speed data are not reliable indicators of the relevant speeds. Operating speed models for tangent sections were then developed. There was no significant difference between the models developed using speed data from the middle of tangent sections and the models developed using maximum operating speeds on tangent sections. All of the developed models have a higher coefficient of determination than models developed from spot speed data. Thus, it can be concluded that the method of measurement has a more significant impact on the quality of an operating speed model than the location of the measurement.
Keywords: operating speed, continuous speed data, tangent sections, spot speed, consistency
Procedia PDF Downloads 452
24966 Estimation of the Length and Location of Ground Surface Deformation Caused by the Reverse Faulting
Authors: Nader Khalafian, Mohsen Ghaderi
Abstract:
Field observations have revealed many examples of structures damaged by ground surface deformation caused by faulting. In this paper, an effort is made to estimate the length and location of the ground surface over which large displacements are created by reverse faulting. The research was conducted in two steps. (1) In the first step, a 2D explicit finite element model was developed using the ABAQUS software. A subroutine for the Mohr-Coulomb failure criterion with a strain-softening model was developed by the authors in order to properly model the stress-strain behavior of the soil in the fault rupture zone. The results of the numerical analysis were verified against available centrifuge experiments, and reasonable agreement was found between the numerical and experimental data. (2) In the second step, the effects of the fault dip angle (δ), the depth of the soil layer (H), the dilation and friction angles of the sand (ψ and φ), and the amount of fault offset (d) on the soil surface displacement and fault rupture path were investigated. An artificial neural network (ANN) based model, as a powerful prediction tool, was developed to generate a general model for predicting the faulting characteristics. A properly sized database was created to train and test the network. It was found that the length and location of the zone of displaced ground surface can be accurately estimated using the proposed model.
Keywords: reverse faulting, surface deformation, numerical, neural network
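The surrogate-modelling step can be sketched as a small neural regressor mapping (δ, H, ψ, φ, d) to a surface-deformation length. The training table below is synthetic with an invented target relationship; in the paper it would come from the verified ABAQUS finite-element runs, which are not reproduced here.

```python
# Sketch of an ANN surrogate for fault-rupture surface deformation (synthetic data).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
X = np.column_stack([
    rng.uniform(30, 90, n),    # fault dip angle delta (deg)
    rng.uniform(5, 40, n),     # soil depth H (m)
    rng.uniform(0, 20, n),     # dilation angle psi (deg)
    rng.uniform(25, 45, n),    # friction angle phi (deg)
    rng.uniform(0.5, 3, n),    # fault offset d (m)
])
length = 0.4 * X[:, 1] + 2.0 * X[:, 4] - 0.05 * X[:, 0] + rng.normal(0, 0.5, n)  # toy target

X_tr, X_te, y_tr, y_te = train_test_split(X, length, random_state=0)
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=3000, random_state=0))
model.fit(X_tr, y_tr)
print("R^2 on held-out samples:", round(model.score(X_te, y_te), 3))
```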
Procedia PDF Downloads 421
24965 The Effect That the Data Assimilation of Qinghai-Tibet Plateau Has on a Precipitation Forecast
Authors: Ruixia Liu
Abstract:
The Qinghai-Tibet Plateau has an important influence on the precipitation of its lower reaches. Remote sensing data have their own advantages, and a numerical prediction model that assimilates RS data performs better than one that does not. We obtained assimilation data from MHS, terrestrial, and sounding observations through GSI, introduced the result into WRF, and then obtained forecasts of relative humidity (RH) and precipitation. By comparing the 1 h, 6 h, 12 h, and 24 h results, we found that assimilating the MHS, terrestrial, and sounding data made the forecasts of precipitation amount, area, and center more accurate. By analyzing the differences in the initial field, we found that data assimilation over the Qinghai-Tibet Plateau influences the forecast for its lower reaches by affecting the initial temperature and RH.
Keywords: Qinghai-Tibet Plateau, precipitation, data assimilation, GSI
Procedia PDF Downloads 235
24964 Modeling and Estimating Reserve of the Ali Javad Porphyry Copper-Gold Deposit, East Azerbaijan, Iran
Authors: Behzad Hajalilou, Nasim Hajalilou, Saeid Ansari
Abstract:
The study area is located in East Azerbaijan province, north of the city of Ahar, on the 1:100000 geological map of Varzgan, within the Central Iran zone. The Ali Javad porphyry copper-gold deposit formed in a magmatic complex of intrusive bodies, comprising granodiorite and quartz monzonite, that penetrate the Eocene volcanic assemblage. The most important mineralization includes primary oxide minerals (magnetite), sulfides (pyrite, chalcopyrite, molybdenite, bornite, chalcocite, covellite), secondary oxide or hydroxide minerals (hematite, goethite, limonite), and carbonates (malachite and azurite). The mineralization occurs as a vein-veinlet and disseminated system. The 3D model of the Ali Javad mineralization was produced with the DATAMINE software, based on the study of 700 polished sections from 32 drilled boreholes in the region. This model is fully compatible with the model provided by Lowell and Gilbert for the mineralization of quartz monzonite-type porphyry copper deposits. The estimated reserve of the Ali Javad deposit is 81.5 million tons at 0.75 percent copper, and for gold 8.37 million tons at 1.8 ppm.
Keywords: porphyry copper, mineralization, Ali Javad, modeling, reserve estimation
Procedia PDF Downloads 220
24963 Positive Affect, Negative Affect, Organizational and Motivational Factor on the Acceptance of Big Data Technologies
Authors: Sook Ching Yee, Angela Siew Hoong Lee
Abstract:
Big data technologies have become a trend for exploiting business opportunities and providing valuable business insights through the analysis of big data. However, there are still many organizations that have yet to adopt big data technologies, especially small and medium enterprises (SMEs). This study uses the technology acceptance model (TAM) to examine several constructs in the TAM together with additional constructs, namely positive affect, negative affect, organizational factor, and motivational factor. The conceptual model proposed in the study will be tested for the relationship and influence of positive affect, negative affect, organizational factor, and motivational factor on the intention to use big data technologies. Empirical research is used in this study by conducting a survey to collect data.
Keywords: big data technologies, motivational factor, negative affect, organizational factor, positive affect, technology acceptance model (TAM)
Procedia PDF Downloads 363
24962 Big Data Analysis with Rhipe
Authors: Byung Ho Jung, Ji Eun Shin, Dong Hoon Lim
Abstract:
Rhipe, which integrates R with the Hadoop environment, makes it possible to process and analyze massive amounts of data in a distributed processing environment. In this paper, we implemented multiple regression analysis using Rhipe on actual data of various sizes. Experimental results comparing the performance of Rhipe with the stats and biglm packages running on bigmemory showed that Rhipe was faster than the other packages, owing to parallel processing in which the number of map tasks increases with the size of the data. We also compared the computing speeds of the pseudo-distributed and fully-distributed modes of the Hadoop cluster configuration. The results showed that the fully-distributed mode was faster than the pseudo-distributed mode, and that the computing speed of the fully-distributed mode increased with the number of data nodes.
Keywords: big data, Hadoop, Parallel regression analysis, R, Rhipe
Procedia PDF Downloads 498