Search results for: data sensitivity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26507

25997 Frequent Item Set Mining for Big Data Using MapReduce Framework

Authors: Tamanna Jethava, Rahul Joshi

Abstract:

Frequent item sets play an essential role in many data mining tasks that try to find interesting patterns in a database. A frequent item set is a set of items that frequently appear together in a transaction dataset. Several algorithms are used for frequent item set mining, yet most do not scale to the type of data we are presented with today, so-called “Big Data”. Big Data is a collection of large data sets. Our approach is to mine frequent item sets over a large dataset in a scalable and speedy way. MapReduce, together with HDFS, is used to find frequent item sets in Big Data on a large cluster. This paper focuses on combining pre-processing and a mining algorithm as a hybrid approach for big data on the Hadoop platform.
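The map and reduce steps described above can be sketched in a few lines of plain Python. This is a single-process illustration only, not the authors' Hadoop implementation: the function names, the `max_size` limit, and the toy transactions are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, max_size=2):
    """Emit (itemset, 1) for every itemset of up to max_size items in one transaction."""
    items = sorted(set(transaction))
    for k in range(1, max_size + 1):
        for itemset in combinations(items, k):
            yield itemset, 1


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets whose support count is at least min_support."""
    pairs = (pair for t in transactions for pair in map_phase(t, max_size))
    counts = reduce_phase(pairs)
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical toy transactions, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=3)
```

On a real cluster the map calls would run in parallel over HDFS blocks and the framework would group the emitted keys before the reduce step; here both phases run in one process.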

Keywords: frequent item set mining, big data, Hadoop, MapReduce

Procedia PDF Downloads 435
25996 The Role Of Data Gathering In NGOs

Authors: Hussaini Garba Mohammed

Abstract:

Background/Significance: The lack of data gathering prevents NGOs worldwide from having good data on educational and health-related issues among communities in any country and around the world. For example, HIV/AIDS, smoking (tuberculosis), and COVID-19 virus carriers are becoming a serious public health problem, especially among old men and women. However, there is no fully detailed survey assessment from communities, villages, and rural areas in some countries to show the percentage of victims and patients, especially for the worldwide COVID-19 virus. These data are essential to inform programming targets, strategies, and priorities for obtaining good information through data gathering in any society.

Keywords: reliable information, data assessment, data mining, data communication

Procedia PDF Downloads 179
25995 The Value of Dynamic Magnetic Resonance Defecography in Assessing the Severity of Defecation Disorders

Authors: Ge Sun, Monika Trzpis, Robbert J. de Haas, Paul M. A. Broens

Abstract:

Introduction: Dynamic magnetic resonance defecography is frequently used to assess defecation disorders. We aimed to investigate the usefulness of dynamic magnetic resonance defecography for assessing the severity of defecation disorder. Methods: We included patients retrospectively from our tertiary referral hospital who had undergone dynamic magnetic resonance defecography, anorectal manometry, and anal electrical sensitivity tests to assess defecation disorders between 2014 and 2020. The primary outcome was the association between the dynamic magnetic resonance defecography variables and the severity of defecation disorders. We assessed the severity of fecal incontinence and constipation with the Wexner incontinence and Agachan constipation scores. Results: Out of the 32 patients included, 24 completed the defecation questionnaire. During defecation, the M line length at magnetic resonance correlated with the Agachan score (r = 0.45, p = 0.03) and was associated with anal sphincter pressure (r=0.39, p=0.03) just before defecation. During rest and squeezing, the H line length at imaging correlated with the Wexner incontinence score (r=0.49, p=0.01 and r=0.69, p< 0.001, respectively). H line length also correlated positively with the anal electrical sensation threshold during squeezing (r=0.50, p=0.004) and during rest (r= 0.42, p=0.02). Conclusions: The M and H line lengths at dynamic magnetic resonance defecography can be used to assess the severity of constipation and fecal incontinence respectively and reflect anatomic changes of the pelvic floor. However, as these anatomic changes are generally late-stage and irreversible, anal manometry seems a better diagnostic approach to assess early and potentially reversible changes in patients with defecation disorders.

Keywords: defecation disorders, dynamic magnetic resonance defecography, anorectal manometry, anal electrical sensitivity tests, H line, M line

Procedia PDF Downloads 105
25994 Magnetic Resonance Imaging for Assessment of the Quadriceps Tendon Cross-Sectional Area as an Adjunctive Diagnostic Parameter in Patients with Patellofemoral Pain Syndrome

Authors: Jae Ni Jang, SoYoon Park, Sukhee Park, Yumin Song, Jae Won Kim, Keum Nae Kang, Young Uk Kim

Abstract:

Objectives: Patellofemoral pain syndrome (PFPS) is a common clinical condition characterized by anterior knee pain. Here, we investigated the quadriceps tendon cross-sectional area (QTCSA) as a novel predictor for the diagnosis of PFPS. By examining the association between the QTCSA and PFPS, we aimed to provide a more valuable diagnostic parameter and more equivocal assessment of the diagnostic potential of PFPS by comparing the QTCSA with the quadriceps tendon thickness (QTT), a traditional measure of quadriceps tendon hypertrophy. Patients and Methods: This retrospective study included 30 patients with PFPS and 30 healthy participants who underwent knee magnetic resonance imaging. T1-weighted turbo spin echo transverse magnetic resonance images were obtained. The QTCSA was measured on the axial-angled phases of the images by drawing outlines, and the QTT was measured at the most hypertrophied quadriceps tendon. Results: The average QTT and QTCSA for patients with PFPS (6.33±0.80 mm and 155.77±36.60 mm², respectively) were significantly greater than those for healthy participants (5.77±0.36 mm and 111.90±24.10 mm², respectively; both P<0.001). We used a receiver operating characteristic curve to confirm the sensitivities and specificities for both the QTT and QTCSA as predictors of PFPS. The optimal diagnostic cutoff value for QTT was 5.98 mm, with a sensitivity of 66.7%, a specificity of 70.0%, and an area under the curve of 0.75 (0.62–0.88). The optimal diagnostic cutoff value for QTCSA was 121.04 mm², with a sensitivity of 73.3%, a specificity of 70.0%, and an area under the curve of 0.83 (0.74–0.93). Conclusion: The QTCSA was found to be a more reliable diagnostic indicator for PFPS than QTT.

Keywords: patellofemoral pain syndrome, quadriceps muscle, hypertrophy, magnetic resonance imaging

Procedia PDF Downloads 50
25993 Automatic Reporting System for Transcriptome Indel Identification and Annotation Based on Snapshot of Next-Generation Sequencing Reads Alignment

Authors: Shuo Mu, Guangzhi Jiang, Jinsa Chen

Abstract:

The analysis of Indels in RNA sequencing of clinical samples is easily affected by sequencing experiment errors and software selection. In order to improve the efficiency and accuracy of analysis, we developed an automatic reporting system for Indel recognition and annotation based on image snapshots of transcriptome read alignments. This system includes sequence local assembly and realignment, target point snapshot, and image-based recognition processes. We integrated a high-confidence Indel dataset from several known databases as a training set to improve the accuracy of image processing and added a bioinformatics processing module to annotate and filter Indel artifacts. Subsequently, the system automatically generates a report that includes data quality levels and image results. Sanger sequencing verification of the reference Indel mutations of cell line NA12878 showed that the process can achieve 83% sensitivity and 96% specificity. Analysis of the collected clinical samples showed that the interpretation accuracy of the process was equivalent to that of manual inspection, and the processing efficiency showed a significant improvement. This work shows the feasibility of accurate Indel analysis of clinical next-generation sequencing (NGS) transcriptome data. This result may be useful for future RNA studies of clinical samples with microsatellite instability in immunotherapy.

Keywords: automatic reporting, indel, next-generation sequencing, NGS, transcriptome

Procedia PDF Downloads 191
25992 Credit Risk Assessment Using Rule Based Classifiers: A Comparative Study

Authors: Salima Smiti, Ines Gasmi, Makram Soui

Abstract:

Credit risk is the most important issue for financial institutions. Its assessment is an important task used to predict defaulting customers and classify customers as good or bad payers. To this end, numerous techniques have been applied for credit risk assessment. However, to our knowledge, several evaluation techniques are black-box models, such as neural networks and SVM. They generate applicants’ classes without any explanation. In this paper, we propose to assess credit risk using a rule-based classification method. Our output is a set of rules which describe and explain the decision. We compare seven classification algorithms (JRip, Decision Table, OneR, ZeroR, Fuzzy Rule, PART and Genetic programming (GP)), where the goal is to find the best rules satisfying several criteria: accuracy, sensitivity, and specificity. The obtained results confirm the efficiency of the GP algorithm on the German and Australian datasets compared to the other rule-based techniques for predicting credit risk.

Keywords: credit risk assessment, classification algorithms, data mining, rule extraction

Procedia PDF Downloads 181
25991 Regulation on Macrophage and Insulin Resistance after Aerobic Exercise in High-Fat Diet Mice

Authors: Qiaofeng Guo

Abstract:

Aims: Obesity is often accompanied by insulin resistance (IR) and whole-body inflammation. Aerobic exercise is an effective treatment to improve insulin resistance and inflammation. However, the anti-inflammatory mechanisms of exercise on epididymal and subcutaneous adipose remain to be elucidated. Here, we compared the macrophage polarization between epididymal and subcutaneous adipose after aerobic exercise. Methods: Male C57BL/6 mice were fed a normal diet or a high-fat diet for 12 weeks and performed aerobic training on a treadmill at 55%~65% VO₂ max for eight weeks. Food intake, body weight, and fasting blood glucose levels were monitored weekly. The intraperitoneal glucose tolerance test was used to evaluate the insulin resistance model. Fat mass, blood lipid profile, serum IL-1β, TNF-α levels, and CD31/CD206 rates were analysed after the intervention. Results: FBG (P<0.01), AUCIPGTT (P<0.01), and HOMA-IR (P<0.01) increased significantly with a high-fat diet and decreased significantly after the exercise. Eight weeks of aerobic exercise attenuated HFD-induced weight gain and glucose intolerance and improved insulin sensitivity. Serum IL-1β, TNF-α, and CD11C/CD206 expression in subcutaneous adipose tissue were unchanged after exercise, whereas they changed in epididymal adipose tissue (P<0.01). Conclusion: Insulin resistance is not accompanied by chronic inflammation and M1 polarization of subcutaneous adipose tissue macrophages in high-fat diet mice. Aerobic exercise effectively improved lipid metabolism and insulin sensitivity, which may be closely associated with reduced M1 polarization of epididymal adipose macrophages.

Keywords: aerobic exercise, insulin resistance, chronic inflammation, adipose, macrophage polarization

Procedia PDF Downloads 78
25990 The Application of Data Mining Technology in Building Energy Consumption Data Analysis

Authors: Liang Zhao, Jili Zhang, Chongquan Zhong

Abstract:

Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper examines data mining technology to determine its application in the analysis of building energy consumption data, including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature is reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.

Keywords: data mining, data analysis, prediction, optimization, building operational performance

Procedia PDF Downloads 852
25989 A Human Factors Approach to Workload Optimization for On-Screen Review Tasks

Authors: Christina Kirsch, Adam Hatzigiannis

Abstract:

Rail operators and maintainers worldwide are increasingly replacing walking patrols in the rail corridor with mechanized track patrols -essentially data capture on trains- and on-screen reviews of track infrastructure in centralized review facilities. The benefit is that infrastructure workers are less exposed to the dangers of the rail corridor. The impact is a significant change in work design from walking track sections and direct observation in the real world to sedentary jobs in the review facility reviewing captured data on screens. Defects in rail infrastructure can have catastrophic consequences. Reviewer performance regarding accuracy and efficiency of reviews within the available time frame is essential to ensure safety and operational performance. Rail operators must optimize workload and resource loading to transition to on-screen reviews successfully. Therefore, they need to know what workload assessment methodologies will provide reliable and valid data to optimize resourcing for on-screen reviews. This paper compares objective workload measures, including track difficulty ratings and review distance covered per hour, and subjective workload assessments (NASA TLX) and analyses the link between workload and reviewer performance, including sensitivity, precision, and overall accuracy. An experimental study was completed with eight on-screen reviewers, including infrastructure workers and engineers, reviewing track sections with different levels of track difficulty over nine days. Each day the reviewers completed four 90-minute sessions of on-screen inspection of the track infrastructure. Data regarding the speed of review (km/ hour), detected defects, false negatives, and false positives were collected. Additionally, all reviewers completed a subjective workload assessment (NASA TLX) after each 90-minute session and a short employee engagement survey at the end of the study period that captured impacts on job satisfaction and motivation. 
The results showed that objective measures for track difficulty align with subjective mental demand, temporal demand, effort, and frustration in the NASA TLX. Interestingly, review speed correlated with subjective assessments of physical and temporal demand, but not with mental demand. Subjective performance ratings correlated with all accuracy measures and review speed. The results showed that subjective NASA TLX workload assessments accurately reflect objective workload. The analysis of the impact of workload on performance showed that subjective mental demand correlated with high precision (accurately detected defects, not false positives). Conversely, high temporal demand was negatively correlated with sensitivity and the percentage of detected existing defects. Review speed was significantly correlated with false negatives. With an increase in review speed, accuracy declined. On the other hand, review speed correlated with subjective performance assessments. Reviewers thought their performance was higher when they reviewed the track sections faster, despite the decline in accuracy. The study results were used to optimize resourcing and ensure that reviewers had enough time to review the allocated track sections to improve defect detection rates in accordance with the efficiency-thoroughness trade-off. Overall, the study showed the importance of a multi-method approach to workload assessment and optimization, combining subjective workload assessments with objective workload and performance measures to ensure that recommendations for work system optimization are evidence-based and reliable.
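The performance measures named in this abstract (sensitivity, precision, overall accuracy) follow from counts of detected defects, false negatives, and false positives. The sketch below uses the standard textbook definitions; the abstract does not say how true negatives were counted, so the `true_neg` argument and all numbers are assumptions for illustration.

```python
def review_metrics(true_pos, false_pos, false_neg, true_neg):
    """Standard detection metrics from confusion-matrix counts.

    true_pos:  existing defects the reviewer detected
    false_pos: items flagged that were not defects
    false_neg: existing defects the reviewer missed
    true_neg:  non-defect items correctly left unflagged (assumed countable)
    """
    total = true_pos + false_pos + false_neg + true_neg
    return {
        # Share of existing defects that were detected.
        "sensitivity": true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0,
        # Share of flagged items that were real defects.
        "precision": true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0,
        # Share of all decisions that were correct.
        "accuracy": (true_pos + true_neg) / total if total else 0.0,
    }


# Hypothetical counts for one review session.
metrics = review_metrics(true_pos=40, false_pos=10, false_neg=10, true_neg=140)
```

A faster review that misses more defects raises `false_neg`, which lowers sensitivity even when precision is unchanged.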

Keywords: automation, efficiency-thoroughness trade-off, human factors, job design, NASA TLX, performance optimization, subjective workload assessment, workload analysis

Procedia PDF Downloads 121
25988 To Handle Data-Driven Software Development Projects Effectively

Authors: Shahnewaz Khan

Abstract:

Machine learning (ML) techniques are often used in projects for creating data-driven applications. These tasks typically demand additional research and analysis. The proper technique and strategy must be chosen to ensure the success of data-driven projects. Otherwise, even with a lot of effort, the necessary development might not always be possible. This work examines the workflow of data-driven software development projects and its implementation process in order to describe how to manage such a project successfully, which will assist in minimizing the added workload.

Keywords: data, data-driven projects, data science, NLP, software project

Procedia PDF Downloads 83
25987 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison

Authors: Xiangtuo Chen, Paul-Henry Cournéde

Abstract:

Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find an efficient way to predict corn yield based on meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with its environment as a dynamical system. However, calibrating the dynamical system is difficult because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of dataset (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process, but it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods: methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression, or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network, and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-fold cross-validation process and two accuracy metrics, root mean square error of prediction (RMSEP) and mean absolute error of prediction (MAEP), were used to evaluate the crop prediction capacity.
The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from easily accessible datasets offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.
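The evaluation protocol mentioned in this abstract (k-fold cross-validation scored with RMSEP and MAEP) can be sketched as follows. The metric definitions are the usual absolute forms; the abstract reports MAEP as a percentage without stating the normalization, so that step is deliberately left out. The fold layout, function names, and the mean-predictor placeholder model are assumptions, not the authors' setup.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def maep(observed, predicted):
    """Mean absolute error of prediction."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(x, y, fit, predict, k=5):
    """Collect out-of-fold predictions, then score them with both metrics."""
    observed, predicted = [], []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            observed.append(y[i])
            predicted.append(predict(model, x[i]))
    return rmsep(observed, predicted), maep(observed, predicted)


# Placeholder model (assumption): always predicts the training mean.
def fit_mean(x_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x_row):
    return model
```

Any regressor with a fit/predict pair can be passed in place of the mean predictor; shuffling the records before splitting is a common refinement that is omitted here.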

Keywords: crop yield prediction, crop model, sensitivity analysis, parameter estimation, particle swarm optimization, random forest

Procedia PDF Downloads 231
25986 The Impacts of Cultural Event on Networking: Liverpool's Cultural Sector in the Aftermath of 2008

Authors: Yi-De Liu

Abstract:

The aim of this paper is to discuss how the constructs of networking and social capital can be used to understand the effect events can have on the cultural sector. Based on a case study, this research sought the views of those working in the cultural sector on Liverpool’s year as the European Capital of Culture (ECOC). Methodologically, this study involves a literature review to prompt theoretical sensitivity, the collection of primary data via an online survey (n = 42), and follow-up telephone interviews (n = 8) to explore the emerging findings in more detail. The findings point to a number of ways in which the ECOC constitutes a boost for networking and to its effects on the city’s cultural sector, including organisational learning, aspiration, and leadership. The contributions of this study are two-fold: (1) evaluating the long-term effects on network formation in the cultural sector following a major event; (2) conceptualising the impact assessment of organisational social capital for future ECOC or similar events.

Keywords: network, social capital, cultural impact, European Capital of Culture

Procedia PDF Downloads 204
25985 Security of Database Using Chaotic Systems

Authors: Eman W. Boghdady, A. R. Shehata, M. A. Azem

Abstract:

Database (DB) security demands permitting authorized users and prohibiting the actions of non-authorized users and intruders on the DB and the objects inside it. Organizations that are running successfully demand the confidentiality of their DBs. They do not allow unauthorized access to their data/information. They also demand the assurance that their data is protected against any malicious or accidental modification. DB protection and confidentiality are the security concerns. There are four types of controls for DB protection: access control, information flow control, inference control, and cryptographic control. Cryptographic control is considered the backbone of DB security; it secures the DB by encryption during storage and communications. Current cryptographic techniques are classified into two types: traditional classical cryptography using standard algorithms (DES, AES, IDEA, etc.) and chaos cryptography using continuous (Chua, Rossler, Lorenz, etc.) or discrete (Logistic, Henon, etc.) algorithms. An important characteristic of chaos is its extreme sensitivity to the initial conditions of the system. In this paper, DB-security systems based on chaotic algorithms are described. The Pseudo Random Number Generators (PRNGs) from the different chaotic algorithms are implemented using Matlab, and their statistical properties are evaluated using NIST and other statistical test suites. Then, these algorithms are used to secure a conventional DB (plaintext), where the statistical properties of the ciphertext are also tested. To increase the complexity of the PRNGs and to pass all the NIST statistical tests, we propose two hybrid PRNGs: one based on two chaotic Logistic maps and another based on two chaotic Henon maps, where each chaotic algorithm runs side-by-side and starts from random independent initial conditions and parameters (encryption keys). The resulting hybrid PRNGs passed the NIST statistical test suite.
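To make the hybrid idea concrete, here is a minimal Python sketch of a keystream built from two Logistic maps running side by side. The abstract does not say how the two streams are combined or quantized, so the XOR combination, the byte extraction, the burn-in length, and the example key are all assumptions. It illustrates the structure only and says nothing about passing the NIST tests.

```python
def logistic_stream(x0, r, n_bytes, burn_in=1000):
    """Iterate x -> r*x*(1-x) and quantize each state to one byte.

    Assumes 0 < x0 < 1 and r in the chaotic range (about 3.57 to 4).
    """
    x = x0
    for _ in range(burn_in):  # discard the transient
        x = r * x * (1.0 - x)
    out = bytearray()
    for _ in range(n_bytes):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return bytes(out)


def hybrid_keystream(key, n_bytes):
    """Combine two independent Logistic maps; key = (x0_a, r_a, x0_b, r_b)."""
    x0_a, r_a, x0_b, r_b = key
    a = logistic_stream(x0_a, r_a, n_bytes)
    b = logistic_stream(x0_b, r_b, n_bytes)
    return bytes(p ^ q for p, q in zip(a, b))  # XOR mixing is an assumption


def xor_cipher(data, key):
    """Encrypt or decrypt by XOR with the keystream (the operation is its own inverse)."""
    stream = hybrid_keystream(key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


# Hypothetical key: two initial conditions and two map parameters.
key = (0.123456789, 3.99, 0.987654321, 3.97)
ciphertext = xor_cipher(b"customer_id=42", key)
plaintext = xor_cipher(ciphertext, key)
```

Because the maps are sensitive to initial conditions, a tiny change in any key component is expected to produce a different keystream after the burn-in, which is the property the abstract relies on.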

Keywords: algorithms and data structure, DB security, encryption, chaotic algorithms, Matlab, NIST

Procedia PDF Downloads 265
25984 Landslide Susceptibility Analysis in the St. Lawrence Lowlands Using High Resolution Data and Failure Plane Analysis

Authors: Kevin Potoczny, Katsuichiro Goda

Abstract:

The St. Lawrence lowlands extend from Ottawa to Quebec City and are known for large deposits of sensitive Leda clay. Leda clay deposits are responsible for many large landslides, such as the 1993 Lemieux and 2010 St. Jude (4 fatalities) landslides. Due to the large extent and sensitivity of Leda clay, regional hazard analysis for landslides is an important tool in risk management. A 2018 regional study by Farzam et al. on the susceptibility of Leda clay slopes to landslide hazard uses 1 arc second topographical data. A qualitative method known as Hazus is used to estimate susceptibility by checking various criteria at a location and determining a susceptibility rating on a scale of 0 (no susceptibility) to 10 (very high susceptibility). These criteria are slope angle, geological group, soil wetness, and distance from waterbodies. Given the flat nature of the St. Lawrence lowlands, the current assessment fails to capture local slopes, such as the St. Jude site. Additionally, the data did not allow failure planes to be analyzed accurately. This study substantially improves the analysis performed by Farzam et al. in two aspects. First, regional assessment with high resolution data allows for identification of localized sites that may have been previously identified as low susceptibility. This then provides the opportunity to conduct a more refined analysis of the failure plane of the slope. Slopes derived from 1 arc second data are relatively gentle (0-10 degrees) across the region; however, the 1- and 2-meter resolution 2022 HRDEM provided by NRCAN shows that short, steep slopes are present. At a regional level, 1 arc second data can underestimate the susceptibility of short, steep slopes, which can be dangerous as Leda clay landslides behave retrogressively and travel upwards into flatter terrain. At the location of the St. Jude landslide, slope differences are significant.
The 1 arc second data shows a maximum slope of 12.80 degrees and a mean slope of 4.72 degrees, while the HRDEM data shows a maximum slope of 56.67 degrees and a mean slope of 10.72 degrees. This equates to a difference of three susceptibility levels when the soil is dry and one susceptibility level when wet. GIS software is used to create a regional susceptibility map across the St. Lawrence lowlands at 1- and 2-meter resolutions. Failure planes are necessary to differentiate between small and large landslides, which have so far been ignored in regional analysis. Leda clay failures can only retrogress as far as their failure planes, so the regional analysis must be able to transition smoothly into a more robust local analysis. It is expected that slopes within the region previously assessed at low susceptibility scores contain local areas of high susceptibility. The goal is to create opportunities for local failure plane analysis to be undertaken, which has not been possible before. Due to the low resolution of previous regional analyses, any slope near a waterbody could be considered hazardous. However, high-resolution regional analysis would allow for more precise determination of hazard sites.
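The dependence of slope on grid resolution can be illustrated with a small Python sketch that derives slope angles from an elevation grid by central differences. This is a generic finite-difference calculation, not the GIS workflow used in the study; the grid values are hypothetical, and treating 1 arc second as roughly 30 m on the ground is an approximation introduced here.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope angle in degrees at each interior cell of a regular elevation grid.

    dem:       list of rows of elevations (same unit as cell_size)
    cell_size: grid spacing
    Uses central differences, so border cells are skipped.
    """
    rows, cols = len(dem), len(dem[0])
    slopes = []
    for r in range(1, rows - 1):
        row = []
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            row.append(math.degrees(math.atan(math.hypot(dz_dx, dz_dy))))
        slopes.append(row)
    return slopes


# Hypothetical grid: elevation rises 1 unit per cell in the x direction.
dem = [
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
]
fine = slope_degrees(dem, cell_size=1.0)     # same rise over 1 m cells
coarse = slope_degrees(dem, cell_size=30.0)  # same rise spread over ~30 m cells
```

The same elevation difference reads as a steep slope at 1 m spacing and as nearly flat at about 30 m spacing, which is the averaging effect the abstract describes for short, steep banks.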

Keywords: hazus, high-resolution DEM, leda clay, regional analysis, susceptibility

Procedia PDF Downloads 76
25983 Validation of SWAT Model for Prediction of Water Yield and Water Balance: Case Study of Upstream Catchment of Jebba Dam in Nigeria

Authors: Adeniyi G. Adeogun, Bolaji F. Sule, Adebayo W. Salami, Michael O. Daramola

Abstract:

Estimation of water yield and water balance in a river catchment is critical to the sustainable management of water resources at watershed level in any country. Therefore, in the present study, the Soil and Water Assessment Tool (SWAT) interfaced with a Geographical Information System (GIS) was applied as a tool to predict the water balance and water yield of a catchment area in Nigeria. The catchment area, which covers 12,992 km², is located upstream of the Jebba hydropower dam in the north central part of Nigeria. In this study, data on the observed flow were collected and compared with the flow simulated using SWAT. The agreement between the two data sets was evaluated using statistical measures such as the Nash-Sutcliffe Efficiency (NSE) and the coefficient of determination (R²). The model output shows good agreement between the observed flow and simulated flow, as indicated by NSE and R², which were greater than 0.7 for both the calibration and validation periods. A total of 42,733 mm of water was predicted by the calibrated model as the water yield potential of the basin for the simulation period 1985 to 2010. This performance suggests that the SWAT model could be a promising tool to predict water balance and water yield in the sustainable management of water resources. In addition, SWAT could be applied to other basins in Nigeria as a decision support tool for sustainable water management.
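The two goodness-of-fit measures used here have standard definitions: NSE = 1 − Σ(Oᵢ − Sᵢ)² / Σ(Oᵢ − Ō)², and R² taken as the squared Pearson correlation between observed and simulated flow. A minimal Python sketch, with hypothetical flow values (not data from the study), is:

```python
import math


def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((O - S)^2) / sum((O - mean(O))^2); 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)


# Hypothetical flows: the simulation is biased high by a constant offset.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
nse = nash_sutcliffe(observed, simulated)
r2 = r_squared(observed, simulated)
```

In this toy case R² stays at 1 while NSE drops, because R² ignores a constant bias and NSE penalizes it; that difference is one reason the two measures are commonly reported together.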

Keywords: GIS, modeling, sensitivity analysis, SWAT, water yield, watershed level

Procedia PDF Downloads 439
25982 Forest Soil Greenhouse Gas Real-Time Analysis Using Quadrupole Mass Spectrometry

Authors: Timothy L. Porter, T. Randy Dillingham

Abstract:

Vegetation growth and decomposition, along with soil microbial activity, play a complex role in the production of greenhouse gases originating in forest soils. The absorption or emission (respiration) of these gases is a function of many factors relating to the soils themselves, the plants, and the environment in which the plants are growing. For this study, we have constructed a battery-powered, portable field mass spectrometer for use in analyzing gases in the soils surrounding trees, plants, and other areas. We have used the instrument to sample in real time the greenhouse gases carbon dioxide and methane in soils where plant life may be contributing to the production of gases such as methane. Gases such as isoprene, which may help correlate gas respiration to microbial activity, have also been measured. The instrument is composed of a quadrupole mass spectrometer with part-per-billion or better sensitivity, coupled to battery-powered turbo and diaphragm pumps. A unique ambient air pressure differentially pumped intake apparatus allows for the real-time sampling of gases in the soils from the surface to several inches below the surface. Results show that this instrument is capable of instant, part-per-billion sensitivity measurement of carbon dioxide and methane in the near-surface region of various forest soils. We have measured differences in soil respiration resulting from forest thinning, forest burning, and forest logging as compared to pristine, untouched forests. Further studies will include measurements of greenhouse gas respiration as a function of temperature, microbial activity as measured by isoprene production, and forest restoration after fire.

Keywords: forest, soil, greenhouse, quadrupole

Procedia PDF Downloads 116
25981 The Relationship Between Artificial Intelligence, Data Science, and Privacy

Authors: M. Naidoo

Abstract:

Artificial intelligence often requires large amounts of good quality data. Within important fields, such as healthcare, the training of AI systems predominately relies on health and personal data; however, the usage of this data is complicated by various layers of law and ethics that seek to protect individuals’ privacy rights. This research seeks to establish the challenges AI and data sciences pose to (i) informational rights, (ii) privacy rights, and (iii) data protection. To solve some of the issues presented, various methods are suggested, such as embedding values in technological development, proper balancing of rights and interests, and others.

Keywords: artificial intelligence, data science, law, policy

Procedia PDF Downloads 106
25980 Hybrid Robust Estimation via Median Filter and Wavelet Thresholding with Automatic Boundary Correction

Authors: Alsaidi M. Altaher, Mohd Tahir Ismail

Abstract:

Wavelet thresholding has been a powerful tool in curve estimation and data analysis. In the presence of outliers, this non-parametric estimator cannot suppress the outliers involved. This study proposes a new two-stage combined method that uses the median filter as a primary step before applying wavelet thresholding. After the outliers in a signal are suppressed by the median filter, classical wavelet thresholding is applied to remove the remaining noise. We use automatic boundary corrections, with a low-order polynomial model or local polynomial model as a more realistic rule to correct the bias at the boundary region, instead of the classical assumptions such as periodicity or symmetry. A simulation experiment has been conducted to evaluate the numerical performance of the proposed method. Results show strong evidence that the proposed method is extremely effective in correcting the boundary bias and eliminating sensitivity to outliers.
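The two-stage idea (median filter first, wavelet thresholding second) can be sketched in plain Python. This is a simplified illustration under stated assumptions: a one-level Haar transform stands in for whatever wavelet the authors used, soft thresholding with a caller-supplied threshold replaces their threshold rule, the signal length must be even, and the polynomial boundary correction from the abstract is not implemented (the window is simply truncated at the edges).

```python
import math
from statistics import median


def median_filter(signal, window=3):
    """Stage 1: replace each sample by the median of its neighborhood."""
    half = window // 2
    n = len(signal)
    return [median(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length signal."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def robust_denoise(signal, threshold, window=3):
    """Stage 1 (median filter) followed by stage 2 (Haar soft thresholding)."""
    cleaned = median_filter(signal, window)
    approx, detail = haar_forward(cleaned)
    return haar_inverse(approx, soft_threshold(detail, threshold))


# Hypothetical flat signal with one gross outlier.
noisy = [1.0, 1.0, 100.0, 1.0, 1.0, 1.0]
smoothed = robust_denoise(noisy, threshold=0.5)
```

Running the median filter first matters because a single large outlier would otherwise produce a large detail coefficient that survives thresholding.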

Keywords: boundary correction, median filter, simulation, wavelet thresholding

Procedia PDF Downloads 428
25979 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.
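
For readers unfamiliar with V-Optimal histograms, the following is a minimal sketch of the classic one-dimensional case, which chooses bucket boundaries that minimize the total within-bucket squared error by dynamic programming. It is an assumption-laden illustration only: the paper defines a spatial variant built on hierarchical and l-grid partitioning, which this sketch does not implement, and the function name is hypothetical.

```python
def v_optimal_histogram(values, num_buckets):
    """Return (total_sse, buckets) for a 1-D V-Optimal histogram.

    buckets is a list of (start, end) index pairs, end exclusive.
    Assumes 1 <= num_buckets <= len(values).
    """
    n = len(values)
    # Prefix sums of values and squared values give O(1) bucket error.
    ps = [0.0] * (n + 1)
    pss = [0.0] * (n + 1)
    for i, v in enumerate(values):
        ps[i + 1] = ps[i] + v
        pss[i + 1] = pss[i] + v * v

    def sse(i, j):
        """Squared error of values[i:j] around their mean (j > i)."""
        s = ps[j] - ps[i]
        return (pss[j] - pss[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[b][j]: best error for the first j values using b buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0
    for b in range(1, num_buckets + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                c = cost[b - 1][i] + sse(i, j)
                if c < cost[b][j]:
                    cost[b][j] = c
                    cut[b][j] = i

    # Walk the stored cut points backwards to recover the buckets.
    buckets = []
    j = n
    for b in range(num_buckets, 0, -1):
        i = cut[b][j]
        buckets.append((i, j))
        j = i
    buckets.reverse()
    return cost[num_buckets][n], buckets
```

This exact dynamic program costs O(B·n²) time for n values and B buckets, which is one reason a heuristic alternative, as the abstract mentions, is attractive for large simulation data.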

Keywords: simulation data, data summarization, spatial histograms, exploration, visualization

Procedia PDF Downloads 176
25978 Charging-Vacuum Helium Mass Spectrometer Leak Detection Technology in the Application of Space Products Leak Testing and Error Control

Authors: Jijun Shi, Lichen Sun, Jianchao Zhao, Lizhi Sun, Enjun Liu, Chongwu Guo

Abstract:

Because of its consistent pressure direction, shorter cycle, and high sensitivity, charging-vacuum helium mass spectrometer leak testing is the most popular technology for seal testing of spacecraft parts, especially small and medium-sized ones. An auxiliary pump is usually used, and the minimum detectable leak rate can reach 5E-9 Pa·m³/s, or even better on certain occasions. Relative error is more important when evaluating the results. The choice of reference leak, the background level of helium, and the record format all affect the measured leak rate. Within the linearity range of the leak testing system, using a reference leak with a larger leak rate reduces the relative error by 10%, and the relative error decreases markedly when the helium background is sufficiently low, the decimal record format is used, and more stable data are recorded.

Keywords: leak testing, spacecraft parts, relative error, error control

Procedia PDF Downloads 456
25977 New Neuroplasmonic Sensor Based on Soft Nanolithography

Authors: Seyedeh Mehri Hamidi, Nasrin Asgari, Foozieh Sohrabi, Mohammad Ali Ansari

Abstract:

A new neuroplasmonic sensor based on a one-dimensional plasmonic nanograting has been prepared. To record neural activity, the sample was exposed to different infrared lasers, and its response was then evaluated through ellipsometry parameters. Our results show efficient sensitivity to the different laser excitations.

Keywords: neural activity, plasmonic sensor, nanograting, gold thin film

Procedia PDF Downloads 398
25976 The Perception and Integration of Lexical Tone and Vowel in Mandarin-speaking Children with Autism: An Event-Related Potential Study

Authors: Rui Wang, Luodi Yu, Dan Huang, Hsuan-Chih Chen, Yang Zhang, Suiping Wang

Abstract:

Enhanced discrimination of pure tones but diminished discrimination of speech pitch (i.e., lexical tone) were found in children with autism who speak a tonal language (Mandarin), suggesting a speech-specific impairment of pitch perception in these children. However, in tonal languages, both lexical tone and vowel are phonemic cues and integrally dependent on each other. Therefore, it is unclear whether the presence of phonemic vowel dimension contributes to the observed lexical tone deficits in Mandarin-speaking children with autism. The current study employed a multi-feature oddball paradigm to examine how vowel and tone dimensions contribute to the neural responses for syllable change detection and involuntary attentional orienting in school-age Mandarin-speaking children with autism. In the oddball sequence, syllable /da1/ served as the standard stimulus. There were three deviant stimulus conditions, representing tone-only change (TO, /da4/), vowel-only change (VO, /du1/), and change of tone and vowel simultaneously (TV, /du4/). EEG data were collected from 25 children with autism and 20 age-matched normal controls during passive listening to the stimulation. For each deviant condition, difference waveform measuring mismatch negativity (MMN) was derived from subtracting the ERP waveform to the standard sound from that to the deviant sound for each participant. Additionally, the linear summation of TO and VO difference waveforms was compared to the TV difference waveform, to examine whether neural sensitivity for TV change detection reflects simple summation or nonlinear integration of the two individual dimensions. The MMN results showed that the autism group had smaller amplitude compared with the control group in the TO and VO conditions, suggesting impaired discriminative sensitivity for both dimensions. 
In the control group, amplitude of the TV difference waveform approximated the linear summation of the TO and VO waveforms only in the early time window but not in the late window, suggesting a time course from dimensional summation to nonlinear integration. In the autism group, however, the nonlinear TV integration was already present in the early window. These findings suggest that speech perception atypicality in children with autism rests not only in the processing of single phonemic dimensions, but also in the dimensional integration process.

Keywords: autism, event-related potentials, mismatch negativity, speech perception

Procedia PDF Downloads 218
25975 Effect of a Single Injection of hCG on Testosterone Concentration in Male Alpacas

Authors: A. ElZawam, D. McLean, A. Tibary

Abstract:

In alpacas, age at puberty is variable, and the factors regulating the pattern of puberty and sexual maturation are a subject of controversy. Plasma testosterone level is often used as an indicator of sexual maturity. Our hypothesis is that hCG treatment will cause an increase in testosterone level that is correlated with animal age. The specific aim was to investigate the testicular tissue response to a single hCG injection by monitoring the serum testosterone concentration. Eighty-four (n=84) males ranging in age from 6 to 60 months were used. Alpacas were grouped by age into 15 groups, each with three to five male animals. Blood samples were collected from the jugular vein before treatment with hCG and 2 hours after intravenous administration of 3000 IU of hCG (Chorulon®). The serum was harvested and stored at -20°C until analysis. The effect of age on basal testosterone level and response to hCG treatment was evaluated by analysis of variance. Basal serum testosterone concentrations were very low (<0.1 ng/ml) until 9 months of age. Although basal serum testosterone concentrations increased steadily with age, there was significant variation among males within the same age group. Administration of 3000 IU of hCG resulted in an average increase of 50% (P<0.05) in serum testosterone concentration after 2 hours. The percentage increase in serum testosterone in response to hCG stimulation varied from 51 to 81%. There was no correlation between the degree of response and age. However, the response to hCG injection presented two modes of increase depending on the age of the animals. The first mode occurred at ages 9 to 14 months, and the second mode was observed between 22 and 36 months. 
In conclusion, our results suggest that testicular growth and sensitivity to LH stimulation may be bimodal in the male alpaca, with a rapid increase in growth and sensitivity between 9 and 14 months of age and a second phase of increased responsiveness after 21 months of age.

Keywords: alpaca, testosterone, hCG, animal science

Procedia PDF Downloads 570
25974 Detecting Heartbeat Architectural Tactic in Source Code Using Program Analysis

Authors: Ananta Kumar Das, Sujit Kumar Chakrabarti

Abstract:

Architectural tactics such as heartbeat, ping-echo, encapsulate, and encrypt data are techniques used to achieve the quality attributes of a system. Detecting architectural tactics has several benefits: it can aid system comprehension (e.g., for legacy systems) and the estimation of quality attributes such as safety, security, and maintainability. Architectural tactics are typically spread over the source code and are implicit. For large codebases, manual detection is often not feasible. Therefore, there is a need for automated methods of detecting architectural tactics. This paper presents a formalization of the heartbeat architectural tactic and a program-analytic approach to detecting this tactic in source code. The proposed method is evaluated on a set of Java applications. The outcome of the experiment strongly suggests that the method compares well with a manual approach in terms of sensitivity and specificity, and far surpasses a manual exercise in terms of scalability.
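
To make the heartbeat tactic itself concrete, here is a minimal sketch of the pattern a detector would look for: a sender periodically reports that it is alive, and a monitor flags any sender whose last report is older than a timeout. This illustrates the tactic only, not the paper's detection method, and it is written in Python rather than Java. The class and method names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per sender and reports stale ones."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds allowed between beats
        self.last_beat = {}     # sender name -> time of last beat

    def beat(self, sender, now):
        """Record a heartbeat from sender at time now."""
        self.last_beat[sender] = now

    def failed(self, now):
        """Return senders whose last beat is older than the timeout."""
        return sorted(
            sender
            for sender, seen in self.last_beat.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat("service-a", now=0.0)
monitor.beat("service-b", now=0.0)
monitor.beat("service-a", now=2.0)   # service-b stays silent
stale = monitor.failed(now=4.0)      # only service-b has timed out
```

In real systems the two halves (periodic send, timeout check) usually live in different classes or threads, which is part of why the tactic is implicit and hard to spot by hand.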

Keywords: software architecture, architectural tactics, detecting architectural tactics, program analysis, AST, alias analysis

Procedia PDF Downloads 159
25973 Algorithms used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data such as GIS data is important for reducing the data and extracting information. Therefore, the development of new techniques and tools that support humans in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area 'knowledge discovery in databases'. We introduce a set of database primitives, or basic operations, for spatial data mining that are sufficient to express most of the spatial data mining algorithms in the literature. This approach has several advantages. As with the relational standard language SQL, the use of standard primitives will speed up the development of new data mining algorithms and will also make them more portable. We introduce a database-oriented framework for spatial data mining based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths is defined as database primitives for spatial data mining. Furthermore, techniques for efficiently supporting the database primitives in a commercial DBMS are presented.
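
The notions of neighborhood graphs and paths can be sketched as follows. This is a hypothetical in-memory illustration, not the paper's database primitives: it assumes that "neighbor" means "within a given Euclidean distance", whereas other spatial relations (topological, directional) are equally possible, and the function names are invented for the example.

```python
import math


def neighborhood_graph(points, max_dist):
    """Build an adjacency map: two objects are neighbors when their
    Euclidean distance is at most max_dist (an assumed relation)."""
    graph = {name: [] for name in points}
    for a, (ax, ay) in points.items():
        for b, (bx, by) in points.items():
            if a != b and math.hypot(ax - bx, ay - by) <= max_dist:
                graph[a].append(b)
    return {name: sorted(nbrs) for name, nbrs in graph.items()}


def neighborhood_paths(graph, start, length):
    """Return all paths of exactly `length` edges from start that
    never revisit an object."""
    paths = [[start]]
    for _ in range(length):
        paths = [
            path + [nxt]
            for path in paths
            for nxt in graph[path[-1]]
            if nxt not in path
        ]
    return paths


points = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (10, 10)}
graph = neighborhood_graph(points, max_dist=1.5)
two_step = neighborhood_paths(graph, "a", 2)
```

A spatial mining algorithm expressed on top of such operations only needs "get the neighbors" and "extend the paths", which is the portability argument the abstract makes.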

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 460
25972 Advancing the Analysis of Physical Activity Behaviour in Diverse, Rapidly Evolving Populations: Using Unsupervised Machine Learning to Segment and Cluster Accelerometer Data

Authors: Christopher Thornton, Niina Kolehmainen, Kianoush Nazarpour

Abstract:

Background: Accelerometers are widely used to measure physical activity behavior, including in children. The traditional method for processing acceleration data uses cut points, relying on calibration studies that relate the quantity of acceleration to energy expenditure. As these relationships do not generalise across diverse populations, they must be parametrised for each subpopulation, including different age groups, which is costly and makes studies across diverse populations difficult. A data-driven approach that allows physical activity intensity states to emerge from the data under study without relying on parameters derived from external populations offers a new perspective on this problem and potentially improved results. We evaluated the data-driven approach in a diverse population with a range of rapidly evolving physical and mental capabilities, namely very young children (9-38 months old), where this new approach may be particularly appropriate. Methods: We applied an unsupervised machine learning approach (a hidden semi-Markov model - HSMM) to segment and cluster the accelerometer data recorded from 275 children with a diverse range of physical and cognitive abilities. The HSMM was configured to identify a maximum of six physical activity intensity states and the output of the model was the time spent by each child in each of the states. For comparison, we also processed the accelerometer data using published cut points with available thresholds for the population. This provided us with time estimates for each child’s sedentary (SED), light physical activity (LPA), and moderate-to-vigorous physical activity (MVPA). Data on the children’s physical and cognitive abilities were collected using the Paediatric Evaluation of Disability Inventory (PEDI-CAT). 
Results: The HSMM identified two inactive states (INS, comparable to SED), two lightly active long duration states (LAS, comparable to LPA), and two short-duration high-intensity states (HIS, comparable to MVPA). Overall, the children spent on average 237/392 minutes per day in INS/SED, 211/129 minutes per day in LAS/LPA, and 178/168 minutes in HIS/MVPA. We found that INS overlapped with 53% of SED, LAS overlapped with 37% of LPA and HIS overlapped with 60% of MVPA. We also looked at the correlation between the time spent by a child in either HIS or MVPA and their physical and cognitive abilities. We found that HIS was more strongly correlated with physical mobility (R²HIS =0.5, R²MVPA= 0.28), cognitive ability (R²HIS =0.31, R²MVPA= 0.15), and age (R²HIS =0.15, R²MVPA= 0.09), indicating increased sensitivity to key attributes associated with a child’s mobility. Conclusion: An unsupervised machine learning technique can segment and cluster accelerometer data according to the intensity of movement at a given time. It provides a potentially more sensitive, appropriate, and cost-effective approach to analysing physical activity behavior in diverse populations, compared to the current cut points approach. This, in turn, supports research that is more inclusive across diverse populations.

Keywords: physical activity, machine learning, under 5s, disability, accelerometer

Procedia PDF Downloads 210
25971 Surface-Enhanced Raman Spectroscopy on Gold Nanoparticles in the Kidney Disease

Authors: Leonardo C. Pacheco-Londoño, Nataly J Galan-Freyle, Lisandro Pacheco-Lugo, Antonio Acosta-Hoyos, Elkin Navarro, Gustavo Aroca-Martinez, Karin Rondón-Payares, Alberto C. Espinosa-Garavito, Samuel P. Hernández-Rivera

Abstract:

At the Life Science Research Center at Simon Bolivar University, a primary focus is the diagnosis of various diseases, and the use of gold nanoparticles (Au-NPs) in diverse biomedical applications is continually expanding. In the present study, Au-NPs were employed as substrates for Surface-Enhanced Raman Spectroscopy (SERS) aimed at diagnosing kidney diseases arising from Lupus Nephritis (LN), preeclampsia (PC), and Hypertension (H). Discrimination models were developed to distinguish patients with and without kidney diseases based on the SERS signals from urine samples, using partial least squares-discriminant analysis (PLS-DA). A comparative study of the Raman signals across the three conditions was conducted, leading to the identification of potential metabolite signals. Model performance was assessed through cross-validation and external validation, determining parameters such as sensitivity and specificity. Additionally, a secondary analysis was performed using machine learning (ML) models, in which different ML algorithms were evaluated for their efficiency. Model validation was again carried out using cross-validation and external validation, and parameters such as sensitivity and specificity were determined; the models showed average values of 0.9 for both parameters. It is also worth highlighting that this collaborative effort involved two university research centers and two healthcare institutions, ensuring the ethical treatment of patient samples and informed consent.
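
Sensitivity and specificity, the two parameters reported here, follow directly from the counts of a binary confusion matrix. The sketch below shows the standard definitions. The labels in the usage lines are made-up illustration data, not results from the study, and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels,
    where 1 = disease present and 0 = disease absent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of patients correctly flagged
    specificity = tn / (tn + fp)  # share of controls correctly cleared
    return sensitivity, specificity


# Illustrative labels only (not data from the study).
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, predicted)
```

In an external validation, `y_true` and `y_pred` would come from samples the model never saw during training, which is what makes the reported values meaningful.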

Keywords: SERS, Raman, PLS-DA, kidney diseases

Procedia PDF Downloads 45
25970 Data Stream Association Rule Mining with Cloud Computing

Authors: B. Suraj Aravind, M. H. M. Krishna Prasad

Abstract:

There are emerging applications of data streams that require association rule mining, such as network traffic monitoring, web click stream analysis, sensor data, and satellite data. Data streams typically arrive continuously, at high speed and in huge volumes, with changing data distributions. This raises new issues that need to be considered when developing association rule mining techniques for stream data. This paper proposes an improved data stream association rule mining algorithm that eliminates resource limitations by using cloud computing. Including cloud computing may lead to additional unknown problems, which need further research.
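
The resource problem mentioned above comes from the fact that a stream cannot be stored in full. A common workaround, sketched below, is to keep only a bounded window of recent transactions and compute support and confidence over that window. This is a single-machine illustration under that assumption, not the cloud-based algorithm the paper proposes, and the class name is hypothetical.

```python
from collections import deque


class SlidingWindowRules:
    """Support and confidence over the most recent transactions."""

    def __init__(self, window_size):
        # deque with maxlen drops the oldest transaction automatically.
        self.window = deque(maxlen=window_size)

    def add(self, transaction):
        """Append one transaction (any iterable of items)."""
        self.window.append(frozenset(transaction))

    def support(self, itemset):
        """Fraction of windowed transactions containing itemset."""
        if not self.window:
            return 0.0
        items = frozenset(itemset)
        hits = sum(1 for t in self.window if items <= t)
        return hits / len(self.window)

    def confidence(self, antecedent, consequent):
        """Confidence of the rule antecedent -> consequent."""
        base = self.support(antecedent)
        if base == 0.0:
            return 0.0
        both = set(antecedent) | set(consequent)
        return self.support(both) / base


miner = SlidingWindowRules(window_size=4)
for tx in [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}, {"a", "b"}]:
    miner.add(tx)
# The first transaction has been evicted; four remain in the window.
```

Because old transactions fall out of the window, the measures track a changing data distribution, at the cost of forgetting patterns that were frequent only in the past.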

Keywords: data stream, association rule mining, cloud computing, frequent itemsets

Procedia PDF Downloads 501
25969 Technology, Ethics and Experience: Understanding Interactions as Ethical Practice

Authors: Joan Casas-Roma

Abstract:

Technology has become one of the main channels through which people engage in most of their everyday activities; from working to learning, or even when socializing, technology often acts as both an enabler and a mediator of such activities. Moreover, the affordances and interactions created by those technological tools determine the way in which the users interact with one another, as well as how they relate to the relevant environment, thus favoring certain kinds of actions and behaviors while discouraging others. In this regard, virtue ethics theories place a strong focus on a person's daily practice (understood as their decisions, actions, and behaviors) as the means to develop and enhance their habits and ethical competences, such as their awareness and sensitivity towards certain ethically desirable principles. Under this understanding of ethics, this set of technologically enabled affordances and interactions can be seen as the possibility space where the daily practice of their users takes place in a wide range of contexts and situations. At this point, the following question comes to mind: could these affordances and interactions be shaped in a way that would promote behaviors and habits based on ethically desirable principles in their users? In the field of game design, the MDA framework (which stands for Mechanics, Dynamics, Aesthetics) explores how the interactions enabled within the possibility space of a game can create certain experiences and provoke specific reactions in the players. In this sense, these interactions can be shaped in ways that create experiences to raise the players' awareness and sensitivity towards certain topics or principles. 
This research brings together the notion of technological affordances, the notions of practice and practical wisdom from virtue ethics, and the MDA framework from game design in order to explore how the possibility space created by technological interactions can be shaped in ways that enable and promote actions and behaviors supporting certain ethically desirable principles. When shaped accordingly, interactions supporting certain ethically desirable principles could allow their users to carry out the kind of practice that, according to virtue ethics theories, provides the grounds to develop and enhance their awareness, sensitivity, and ethical reasoning capabilities. Moreover, because ethical practice can happen collaterally in almost every context, decision, and action, this additional layer could potentially be applied in a wide variety of technological tools, contexts, and functionalities. This work explores the theoretical background, as well as the initial considerations and steps that would be needed in order to harness the potential ethically desirable benefits that technology can bring, once it is understood as the space where most of its users' daily practice takes place.

Keywords: ethics, design methodology, human-computer interaction, philosophy of technology

Procedia PDF Downloads 158
25968 Short-Path Near-Infrared Laser Detection of Environmental Gases by Wavelength-Modulation Spectroscopy

Authors: Isao Tomita

Abstract:

The detection of the environmental gases ¹²CO₂, ¹³CO₂, and CH₄ using near-infrared semiconductor lasers with a short laser path length is studied by means of wavelength-modulation spectroscopy. The developed system is compact and sensitive enough to detect the absorption peaks of isotopic ¹³CO₂ in a 3% CO₂ gas at 2 µm with a path length of 2.4 m, where the peak size is two orders of magnitude smaller than that of the ordinary ¹²CO₂ peaks. In addition, ¹²CO₂ peaks of a 385-ppm (0.0385%) CO₂ gas in air are detected at 2 µm with a path length of 1.4 m. Furthermore, in pursuit of detecting ancient environmental CH₄ gas confined to bubbles in polar ice, measurements of the absorption spectrum of a trace CH₄ gas in a small area are attempted. For a 100% CH₄ gas trapped in a 1 mm³ glass container, the absorption peaks of CH₄ are obtained at 1.65 µm with a path length of 3 mm, and the gas pressure is also extrapolated from the measured data.

Keywords: environmental gases, near-infrared laser detection, wavelength-modulation spectroscopy, gas pressure

Procedia PDF Downloads 423