Search results for: data mining techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29145

Search results for: data mining techniques

28095 An Adaptive Oversampling Technique for Imbalanced Datasets

Authors: Shaukat Ali Shahee, Usha Ananthakumar

Abstract:

A data set exhibits class imbalance problem when one class has very few examples compared to the other class, and this is also referred to as between class imbalance. The traditional classifiers fail to classify the minority class examples correctly due to its bias towards the majority class. Apart from between-class imbalance, imbalance within classes where classes are composed of a different number of sub-clusters with these sub-clusters containing different number of examples also deteriorates the performance of the classifier. Previously, many methods have been proposed for handling imbalanced dataset problem. These methods can be classified into four categories: data preprocessing, algorithmic based, cost-based methods and ensemble of classifier. Data preprocessing techniques have shown great potential as they attempt to improve data distribution rather than the classifier. Data preprocessing technique handles class imbalance either by increasing the minority class examples or by decreasing the majority class examples. Decreasing the majority class examples lead to loss of information and also when minority class has an absolute rarity, removing the majority class examples is generally not recommended. Existing methods available for handling class imbalance do not address both between-class imbalance and within-class imbalance simultaneously. In this paper, we propose a method that handles between class imbalance and within class imbalance simultaneously for binary classification problem. Removing between class imbalance and within class imbalance simultaneously eliminates the biases of the classifier towards bigger sub-clusters by minimizing the error domination of bigger sub-clusters in total error. The proposed method uses model-based clustering to find the presence of sub-clusters or sub-concepts in the dataset. The number of examples oversampled among the sub-clusters is determined based on the complexity of sub-clusters. The method also takes into consideration the scatter of the data in the feature space and also adaptively copes up with unseen test data using Lowner-John ellipsoid for increasing the accuracy of the classifier. In this study, neural network is being used as this is one such classifier where the total error is minimized and removing the between-class imbalance and within class imbalance simultaneously help the classifier in giving equal weight to all the sub-clusters irrespective of the classes. The proposed method is validated on 9 publicly available data sets and compared with three existing oversampling techniques that rely on the spatial location of minority class examples in the euclidean feature space. The experimental results show the proposed method to be statistically significantly superior to other methods in terms of various accuracy measures. Thus the proposed method can serve as a good alternative to handle various problem domains like credit scoring, customer churn prediction, financial distress, etc., that typically involve imbalanced data sets.

Keywords: classification, imbalanced dataset, Lowner-John ellipsoid, model based clustering, oversampling

Procedia PDF Downloads 405
28094 Application of Artificial Neural Network for Prediction of High Tensile Steel Strands in Post-Tensioned Slabs

Authors: Gaurav Sancheti

Abstract:

This study presents an impacting approach of Artificial Neural Networks (ANNs) in determining the quantity of High Tensile Steel (HTS) strands required in post-tensioned (PT) slabs. Various PT slab configurations were generated by varying the span and depth of the slab. For each of these slab configurations, quantity of required HTS strands were recorded. ANNs with backpropagation algorithm and varying architectures were developed and their performance was evaluated in terms of Mean Square Error (MSE). The recorded data for the quantity of HTS strands was used as a feeder database for training the developed ANNs. The networks were validated using various validation techniques. The results show that the proposed ANNs have a great potential with good prediction and generalization capability.

Keywords: artificial neural networks, back propagation, conceptual design, high tensile steel strands, post tensioned slabs, validation techniques

Procedia PDF Downloads 210
28093 A Machine Learning Based Framework for Education Levelling in Multicultural Countries: UAE as a Case Study

Authors: Shatha Ghareeb, Rawaa Al-Jumeily, Thar Baker

Abstract:

In Abu Dhabi, there are many different education curriculums where sector of private schools and quality assurance is supervising many private schools in Abu Dhabi for many nationalities. As there are many different education curriculums in Abu Dhabi to meet expats’ needs, there are different requirements for registration and success. In addition, there are different age groups for starting education in each curriculum. In fact, each curriculum has a different number of years, assessment techniques, reassessment rules, and exam boards. Currently, students that transfer curriculums are not being placed in the right year group due to different start and end dates of each academic year and their date of birth for each year group is different for each curriculum and as a result, we find students that are either younger or older for that year group which therefore creates gaps in their learning and performance. In addition, there is not a way of storing student data throughout their academic journey so that schools can track the student learning process. In this paper, we propose to develop a computational framework applicable in multicultural countries such as UAE in which multi-education systems are implemented. The ultimate goal is to use cloud and fog computing technology integrated with Artificial Intelligence techniques of Machine Learning to aid in a smooth transition when assigning students to their year groups, and provide leveling and differentiation information of students who relocate from a particular education curriculum to another, whilst also having the ability to store and access student data from anywhere throughout their academic journey.

Keywords: admissions, algorithms, cloud computing, differentiation, fog computing, levelling, machine learning

Procedia PDF Downloads 132
28092 Approach Based on Fuzzy C-Means for Band Selection in Hyperspectral Images

Authors: Diego Saqui, José H. Saito, José R. Campos, Lúcio A. de C. Jorge

Abstract:

Hyperspectral images and remote sensing are important for many applications. A problem in the use of these images is the high volume of data to be processed, stored and transferred. Dimensionality reduction techniques can be used to reduce the volume of data. In this paper, an approach to band selection based on clustering algorithms is presented. This approach allows to reduce the volume of data. The proposed structure is based on Fuzzy C-Means (or K-Means) and NWHFC algorithms. New attributes in relation to other studies in the literature, such as kurtosis and low correlation, are also considered. A comparison of the results of the approach using the Fuzzy C-Means and K-Means with different attributes is performed. The use of both algorithms show similar good results but, particularly when used attributes variance and kurtosis in the clustering process, however applicable in hyperspectral images.

Keywords: band selection, fuzzy c-means, k-means, hyperspectral image

Procedia PDF Downloads 388
28091 Automatic Thresholding for Data Gap Detection for a Set of Sensors in Instrumented Buildings

Authors: Houda Najeh, Stéphane Ploix, Mahendra Pratap Singh, Karim Chabir, Mohamed Naceur Abdelkrim

Abstract:

Building systems are highly vulnerable to different kinds of faults and failures. In fact, various faults, failures and human behaviors could affect the building performance. This paper tackles the detection of unreliable sensors in buildings. Different literature surveys on diagnosis techniques for sensor grids in buildings have been published but all of them treat only bias and outliers. Occurences of data gaps have also not been given an adequate span of attention in the academia. The proposed methodology comprises the automatic thresholding for data gap detection for a set of heterogeneous sensors in instrumented buildings. Sensor measurements are considered to be regular time series. However, in reality, sensor values are not uniformly sampled. So, the issue to solve is from which delay each sensor become faulty? The use of time series is required for detection of abnormalities on the delays. The efficiency of the method is evaluated on measurements obtained from a real power plant: an office at Grenoble Institute of technology equipped by 30 sensors.

Keywords: building system, time series, diagnosis, outliers, delay, data gap

Procedia PDF Downloads 235
28090 Parallel Self Organizing Neural Network Based Estimation of Archie’s Parameters and Water Saturation in Sandstone Reservoir

Authors: G. M. Hamada, A. A. Al-Gathe, A. M. Al-Khudafi

Abstract:

Determination of water saturation in sandstone is a vital question to determine the initial oil or gas in place in reservoir rocks. Water saturation determination using electrical measurements is mainly on Archie’s formula. Consequently accuracy of Archie’s formula parameters affects water saturation values rigorously. Determination of Archie’s parameters a, m, and n is proceeded by three conventional techniques, Core Archie-Parameter Estimation (CAPE) and 3-D. This work introduces the hybrid system of parallel self-organizing neural network (PSONN) targeting accepted values of Archie’s parameters and, consequently, reliable water saturation values. This work focuses on Archie’s parameters determination techniques; conventional technique, CAPE technique, and 3-D technique, and then the calculation of water saturation using current. Using the same data, a hybrid parallel self-organizing neural network (PSONN) algorithm is used to estimate Archie’s parameters and predict water saturation. Results have shown that estimated Arche’s parameters m, a, and n are highly accepted with statistical analysis, indicating that the PSONN model has a lower statistical error and higher correlation coefficient. This study was conducted using a high number of measurement points for 144 core plugs from a sandstone reservoir. PSONN algorithm can provide reliable water saturation values, and it can supplement or even replace the conventional techniques to determine Archie’s parameters and thereby calculate water saturation profiles.

Keywords: water saturation, Archie’s parameters, artificial intelligence, PSONN, sandstone reservoir

Procedia PDF Downloads 120
28089 Impact of Urbanization on Natural Drainage Pattern in District of Larkana, Sindh Pakistan

Authors: Sumaira Zafar, Arjumand Zaidi

Abstract:

During past few years, several floods have adversely affected the areas along lower Indus River. Besides other climate related anomalies, rapidly increasing urbanization and blockage of natural drains due to siltation or encroachments are two other critical causes that may be responsible for these disasters. Due to flat topography of river Indus plains and blockage of natural waterways, drainage of storm water takes time adversely affecting the crop health and soil properties of the area. Government of Sindh is taking a keen interest in revival of natural drainage network in the province and has initiated this work under Sindh Irrigation and Drainage Authority. In this paper, geospatial techniques are used to analyze landuse/land-cover changes of Larkana district over the past three decades (1980-present) and their impact on natural drainage system. Satellite derived Digital Elevation Model (DEM) and topographic sheets (recent and 1950) are used to delineate natural drainage pattern of the district. The urban landuse map developed in this study is further overlaid on drainage line layer to identify the critical areas where the natural floodwater flows are being inhibited by urbanization. Rainfall and flow data are utilized to identify areas of heavy flow, whereas, satellite data including Landsat 7 and Google Earth are used to map previous floods extent and landuse/cover of the study area. Alternatives to natural drainage systems are also suggested wherever possible. The output maps of natural drainage pattern can be used to develop a decision support system for urban planners, Sindh development authorities and flood mitigation and management agencies.

Keywords: geospatial techniques, satellite data, natural drainage, flood, urbanization

Procedia PDF Downloads 495
28088 GPU-Based Back-Projection of Synthetic Aperture Radar (SAR) Data onto 3D Reference Voxels

Authors: Joshua Buli, David Pietrowski, Samuel Britton

Abstract:

Processing SAR data usually requires constraints in extent in the Fourier domain as well as approximations and interpolations onto a planar surface to form an exploitable image. This results in a potential loss of data requires several interpolative techniques, and restricts visualization to two-dimensional plane imagery. The data can be interpolated into a ground plane projection, with or without terrain as a component, all to better view SAR data in an image domain comparable to what a human would view, to ease interpretation. An alternate but computationally heavy method to make use of more of the data is the basis of this research. Pre-processing of the SAR data is completed first (matched-filtering, motion compensation, etc.), the data is then range compressed, and lastly, the contribution from each pulse is determined for each specific point in space by searching the time history data for the reflectivity values for each pulse summed over the entire collection. This results in a per-3D-point reflectivity using the entire collection domain. New advances in GPU processing have finally allowed this rapid projection of acquired SAR data onto any desired reference surface (called backprojection). Mathematically, the computations are fast and easy to implement, despite limitations in SAR phase history data size and 3D-point cloud size. Backprojection processing algorithms are embarrassingly parallel since each 3D point in the scene has the same reflectivity calculation applied for all pulses, independent of all other 3D points and pulse data under consideration. Therefore, given the simplicity of the single backprojection calculation, the work can be spread across thousands of GPU threads allowing for accurate reflectivity representation of a scene. Furthermore, because reflectivity values are associated with individual three-dimensional points, a plane is no longer the sole permissible mapping base; a digital elevation model or even a cloud of points (collected from any sensor capable of measuring ground topography) can be used as a basis for the backprojection technique. This technique minimizes any interpolations and modifications of the raw data, maintaining maximum data integrity. This innovative processing will allow for SAR data to be rapidly brought into a common reference frame for immediate exploitation and data fusion with other three-dimensional data and representations.

Keywords: backprojection, data fusion, exploitation, three-dimensional, visualization

Procedia PDF Downloads 63
28087 Analysis of Scholarly Communication Patterns in Korean Studies

Authors: Erin Hea-Jin Kim

Abstract:

This study aims to investigate scholarly communication patterns in Korean studies, which focuses on all aspects of Korea, including history, culture, literature, politics, society, economics, religion, and so on. It is called ‘national study or home study’ as the subject of the study is itself, whereas it is called ‘area study’ as the subject of the study is others, i.e., outside of Korea. Understanding of the structure of scholarly communication in Korean studies is important since the motivations, procedures, results, or outcomes of individual studies may be affected by the cooperative relationships that appear in the communication structure. To this end, we collected 1,798 articles with the (author or index) keyword ‘Korean’ published in 2018 from the Scopus database and extracted the institution and country of the authors using a text mining technique. A total of 96 countries, including South Korea, was identified. Then we constructed a co-authorship network based on the countries identified. The indicators of social network analysis (SNA), co-occurrences, and cluster analysis were used to measure the activity and connectivity of participation in collaboration in Korean studies. As a result, the highest frequency of collaboration appears in the following order: S. Korea with the United States (603), S. Korea with Japan (146), S. Korea with China (131), S. Korea with the United Kingdom (83), and China with the United States (65). This means that the most active participants are S. Korea as well as the USA. The highest rank in the role of mediator measured by betweenness centrality appears in the following order: United States (0.165), United Kingdom (0.045), China (0.043), Japan (0.037), Australia (0.026), and South Africa (0.023). These results show that these countries contribute to connecting in Korean studies. We found two major communities among the co-authorship network. Asian countries and America belong to the same community, and the United Kingdom and European countries belong to the other community. Korean studies have a long history, and the study has emerged since Japanese colonization. However, Korean studies have never been investigated by digital content analysis. The contributions of this study are an analysis of co-authorship in Korean studies with a global perspective based on digital content, which has not attempted so far to our knowledge, and to suggest ideas on how to analyze the humanities disciplines such as history, literature, or Korean studies by text mining. The limitation of this study is that the scholarly data we collected did not cover all domestic journals because we only gathered scholarly data from Scopus. There are thousands of domestic journals not indexed in Scopus that we can consider in terms of national studies, but are not possible to collect.

Keywords: co-authorship network, Korean studies, Koreanology, scholarly communication

Procedia PDF Downloads 143
28086 Geometallurgy of Niobium Deposits: An Integrated Multi-Disciplined Approach

Authors: Mohamed Nasraoui

Abstract:

Spatial ore distribution, ore heterogeneity and their links with geological processes involved in Niobium concentration are all factors for consideration when bridging field observations to extraction scheme. Indeed, mineralogy changes of Nb-hosting phases, their textural relationships with hydrothermal or secondary minerals, play a key control over mineral processing. This study based both on filed work and ore characterization presents data from several Nb-deposits related to carbonatite complexes. The results obtained by a wide range of analytical techniques, including, XRD, XRF, ICP-MS, SEM, Microprobe, Spectro-CL, FTIR-DTA and Mössbauer spectroscopy, demonstrate how geometallurgical assessment, at all stage of mine development, can greatly assist in the design of a suitable extraction flowsheet and data reconciliation.

Keywords: carbonatites, Nb-geometallurgy, Nb-mineralogy, mineral processing.

Procedia PDF Downloads 151
28085 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does Data. Utilizing massive data sets enable companies to get competitive advantages over their adversaries. Out of many area of Big Data usage, logistics has significance role in both commercial sector and military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 631
28084 Adaptive Beamforming with Steering Error and Mutual Coupling between Antenna Sensors

Authors: Ju-Hong Lee, Ching-Wei Liao

Abstract:

Owing to close antenna spacing between antenna sensors within a compact space, a part of data in one antenna sensor would outflow to other antenna sensors when the antenna sensors in an antenna array operate simultaneously. This phenomenon is called mutual coupling effect (MCE). It has been shown that the performance of antenna array systems can be degraded when the antenna sensors are in close proximity. Especially, in a systems equipped with massive antenna sensors, the degradation of beamforming performance due to the MCE is significantly inevitable. Moreover, it has been shown that even a small angle error between the true direction angle of the desired signal and the steering angle deteriorates the effectiveness of an array beamforming system. However, the true direction vector of the desired signal may not be exactly known in some applications, e.g., the application in land mobile-cellular wireless systems. Therefore, it is worth developing robust techniques to deal with the problem due to the MCE and steering angle error for array beamforming systems. In this paper, we present an efficient technique for performing adaptive beamforming with robust capabilities against the MCE and the steering angle error. Only the data vector received by an antenna array is required by the proposed technique. By using the received array data vector, a correlation matrix is constructed to replace the original correlation matrix associated with the received array data vector. Then, the mutual coupling matrix due to the MCE on the antenna array is estimated through a recursive algorithm. An appropriate estimate of the direction angle of the desired signal can also be obtained during the recursive process. Based on the estimated mutual coupling matrix, the estimated direction angle, and the reconstructed correlation matrix, the proposed technique can effectively cure the performance degradation due to steering angle error and MCE. The novelty of the proposed technique is that the implementation procedure is very simple and the resulting adaptive beamforming performance is satisfactory. Simulation results show that the proposed technique provides much better beamforming performance without requiring complicated complexity as compared with the existing robust techniques.

Keywords: adaptive beamforming, mutual coupling effect, recursive algorithm, steering angle error

Procedia PDF Downloads 313
28083 Multi-Scaled Non-Local Means Filter for Medical Images Denoising: Empirical Mode Decomposition vs. Wavelet Transform

Authors: Hana Rabbouch

Abstract:

In recent years, there has been considerable growth of denoising techniques mainly devoted to medical imaging. This important evolution is not only due to the progress of computing techniques, but also to the emergence of multi-resolution analysis (MRA) on both mathematical and algorithmic bases. In this paper, a comparative study is conducted between the two best-known MRA-based decomposition techniques: the Empirical Mode Decomposition (EMD) and the Discrete Wavelet Transform (DWT). The comparison is carried out in a framework of multi-scale denoising, where a Non-Local Means (NLM) filter is performed scale-by-scale to a sample of benchmark medical images. The results prove the effectiveness of the multiscaled denoising, especially when the NLM filtering is coupled with the EMD.

Keywords: medical imaging, non local means, denoising, multiscaled analysis, empirical mode decomposition, wavelets

Procedia PDF Downloads 130
28082 The Application of Lean-Kaizen in Course Plan and Delivery in Malaysian Higher Education Sector

Authors: Nur Aishah Binti Awi, Zulfiqar Khan

Abstract:

Lean-kaizen has always been applied in manufacturing sector since many years ago. What about education sector? This paper discuss on how lean-kaizen can also be applied in education sector, specifically in academic area of Malaysian’s higher education sector. The purpose of this paper is to describe the application of lean kaizen in course plan and delivery. Lean-kaizen techniques have been used to identify waste in the course plan and delivery. A field study has been conducted to obtain the data. This study used both quantitative and qualitative data. The researcher had interviewed the chosen lecturers regarding to the problems of course plan and delivery that they encountered. Secondary data of students’ feedback at the end of semester also has been used to improve course plan and delivery. The result empirically shows that lean-kaizen helps to improve the course plan and delivery by reducing the wastes. Thus, this study demonstrates that lean-kaizen can also help education sector to improve their services as achieved by manufacturing sector.

Keywords: course delivery, education, Kaizen, lean

Procedia PDF Downloads 358
28081 In situ Stabilization of Arsenic in Soils with Birnessite and Goethite

Authors: Saeed Bagherifam, Trevor Brown, Chris Fellows, Ravi Naidu

Abstract:

Over the last century, rapid urbanization, industrial emissions, and mining activities have resulted in widespread contamination of the environment by heavy metal(loid)s. Arsenic (As) is a toxic metalloid belonging to group 15 of the periodic table, which occurs naturally at low concentrations in soils and the earth’s crust, although concentrations can be significantly elevated in natural systems as a result of dispersion from anthropogenic sources, e.g., mining activities. Bioavailability is the fraction of a contaminant in soils that is available for uptake by plants, food chains, and humans and therefore presents the greatest risk to terrestrial ecosystems. Numerous attempts have been made to establish in situ and ex-situ technologies of remedial action for remediation of arsenic-contaminated soils. In situ stabilization techniques are based on deactivation or chemical immobilization of metalloid(s) in soil by means of soil amendments, which consequently reduce the bioavailability (for biota) and bioaccessibility (for humans) of metalloids due to the formation of low-solubility products or precipitates. This study investigated the effectiveness of two different types of synthetic manganese and iron oxides (birnessite and goethite) for stabilization of As in a soil spiked with 1000 mg kg⁻¹ of As and treated with 10% dosages of soil amendments. Birnessite was made using HCl and KMnO₄, and goethite was synthesized by the dropwise addition of KOH into Fe(NO₃) solution. The resulting contaminated soils were subjected to a series of chemical extraction studies including sequential extraction (BCR method), single-step extraction with distilled (DI) water, 2M HNO₃ and simplified bioaccessibility extraction tests (SBET) for estimation of bioaccessible fractions of As in two different soil fractions ( < 250 µm and < 2 mm). Concentrations of As in samples were measured using inductively coupled plasma mass spectrometry (ICP-MS). The results showed that soil with birnessite reduced bioaccessibility of As by up to 92% in both soil fractions. Furthermore, the results of single-step extractions revealed that the application of both birnessite and Goethite reduced DI water and HNO₃ extractable amounts of arsenic by 75, 75, 91, and 57%, respectively. Moreover, the results of the sequential extraction studies showed that both birnessite and goethite dramatically reduced the exchangeable fraction of As in soils. However, the amounts of recalcitrant fractions were higher in birnessite, and Goethite amended soils. The results revealed that the application of both birnessite and goethite significantly reduced bioavailability and the exchangeable fraction of As in contaminated soils, and therefore birnessite and Goethite amendments might be considered as promising adsorbents for stabilization and remediation of As contaminated soils.

Keywords: arsenic, bioavailability, in situ stabilisation, metalloid(s) contaminated soils

Procedia PDF Downloads 125
28080 The Type II Immune Response in Acute and Chronic Pancreatitis Mediated by STAT6 in Murine

Authors: Hager Elsheikh

Abstract:

Context: Pancreatitis is a condition characterized by inflammation in the pancreas, which can lead to serious complications if untreated. Both acute and chronic pancreatitis are associated with immune reactions and fibrosis, which further damage the pancreas. The type 2 immune response, primarily driven by alternative activated macrophages (AAMs), plays a significant role in the development of fibrosis. The IL-4/STAT6 pathway is a crucial signaling pathway for the activation of M2 macrophages. Pancreatic fibrosis is induced by dysregulated inflammatory responses and can result in the autodigestion and necrosis of pancreatic acinar cells. Research Aim: The aim of this study is to investigate the impact of STAT6, a crucial molecule in the IL-4/STAT6 pathway, on the severity and development of fibrosis during acute and chronic pancreatitis. The research also aims to understand the influence of the JAK/STAT6 signaling pathway on the balance between fibrosis and regeneration in the presence of different macrophage populations. Methodology: The research utilizes murine models of acute and chronic pancreatitis induced by cerulean injection. Animal models will be employed to study the effect of STAT6 knockout on disease severity and fibrosis. Isolation of acinar cells and cell culture techniques will be used to assess the impact of different macrophage populations on wound healing and regeneration. Various techniques such as PCR, histology, immunofluorescence, and transcriptomics will be employed to analyze the tissues and cells. Findings: The research aims to provide insights into the mechanisms underlying tissue fibrosis and wound healing during acute and chronic pancreatitis. By investigating the influence of the JAK/STAT6 signaling pathway and different macrophage populations, the study aims to understand their impact on tissue fibrosis, disease severity, and pancreatic regeneration. Theoretical Importance: This research contributes to our understanding of the role of specific signaling pathways, macrophage polarization, and the type 2 immune response in pancreatitis. It provides insights into the molecular mechanisms underlying tissue fibrosis and the potential for targeted therapies. Data Collection and Analysis Procedures: Data will be collected through the use of murine models, isolation and culture of acinar cells, and various experimental techniques such as PCR, histology, immunofluorescence, and transcriptomics. Data will be analyzed using appropriate statistical methods and techniques, and the findings will be interpreted in the context of the research objectives. Conclusion: By investigating the mechanisms of tissue fibrosis and wound healing during acute and chronic pancreatitis, this research aims to enhance our understanding of the disease progression and potential therapeutic targets. The findings have theoretical importance in expanding our knowledge of pancreatic fibrosis and the role of macrophage polarization in the context of the type 2 immune response.

Keywords: immunity in chronic diseases, pancreatitis, macrophages, immune response

Procedia PDF Downloads 27
28079 Quantifying User-Related, System-Related, and Context-Related Patterns of Smartphone Use

Authors: Andrew T. Hendrickson, Liven De Marez, Marijn Martens, Gytha Muller, Tudor Paisa, Koen Ponnet, Catherine Schweizer, Megan Van Meer, Mariek Vanden Abeele

Abstract:

Quantifying and understanding the myriad ways people use their phones and how that impacts their relationships, cognitive abilities, mental health, and well-being is increasingly important in our phone-centric society. However, most studies on the patterns of phone use have focused on theory-driven tests of specific usage hypotheses using self-report questionnaires or analyses of smaller datasets. In this work we present a series of analyses from a large corpus of over 3000 users that combine data-driven and theory-driven analyses to identify reliable smartphone usage patterns and clusters of similar users. Furthermore, we compare the stability of user clusters across user- and system-initiated sessions, as well as during the hypothesized ritualized behavior times directly before and after sleeping. Our results indicate support for some hypothesized usage patterns but present a more complete and nuanced view of how people use smartphones.

Keywords: data mining, experience sampling, smartphone usage, health and well being

Procedia PDF Downloads 151
28078 Bandwidth Efficient Cluster Based Collision Avoidance Multicasting Protocol in VANETs

Authors: Navneet Kaur, Amarpreet Singh

Abstract:

In Vehicular Adhoc Networks, Data Dissemination is a challenging task. There are number of techniques, types and protocols available for disseminating the data but in order to preserve limited bandwidth and to disseminate maximum data over networks makes it more challenging. There are broadcasting, multicasting and geocasting based protocols. Multicasting based protocols are found to be best for conserving the bandwidth. One such protocol named BEAM exists that improves the performance of Vehicular Adhoc Networks by reducing the number of in-network message transactions and thereby efficiently utilizing the bandwidth during an emergency situation. But this protocol may result in multicar chain collision as there was no V2V communication. So, this paper proposes a new protocol named Enhanced Bandwidth Efficient Cluster Based Multicasting Protocol (EBECM) that will overcome the limitations of existing BEAM protocol. And Simulation results will show the improved performance of EBECM in terms of Routing overhead, throughput and PDR when compared with BEAM protocol.

Keywords: BEAM, data dissemination, emergency situation, vehicular adhoc network

Procedia PDF Downloads 334
28077 Magnetic Activated Carbon: Preparation, Characterization, and Application for Vanadium Removal

Authors: Hakimeh Sharififard, Mansooreh Soleimani

Abstract:

In this work, the magnetic activated carbon nanocomposite (Fe-CAC) has been synthesized by anchorage iron hydr(oxide) nanoparticles onto commercial activated carbon (CAC) surface and characterized using BET, XRF, SEM techniques. The influence of various removal parameters such as pH, contact time and initial concentration of vanadium on vanadium removal was evaluated using CAC and Fe-CAC in batch method. The sorption isotherms were studied using Langmuir, Freundlich and Dubinin–Radushkevich (D–R) isotherm models. These equilibrium data were well described by the Freundlich model. Results showed that CAC had the vanadium adsorption capacity of 37.87 mg/g, while the Fe-AC was able to adsorb 119.01 mg/g of vanadium. Kinetic data was found to confirm pseudo-second-order kinetic model for both adsorbents.

Keywords: magnetic activated carbon, remove, vanadium, nanocomposite, freundlich

Procedia PDF Downloads 446
28076 Review of Malaria Diagnosis Techniques

Authors: Lubabatu Sada Sodangu

Abstract:

Malaria is a major cause of death in tropical and subtropical nations. Malaria cases are continually rising as a result of a number of factors, despite the fact that the condition is now treatable using effective methods. In this situation, quick and effective diagnostic methods are essential for the management and control of malaria. Malaria diagnosis using conventional methods is still troublesome, hence new technologies have been created and implemented to get around the drawbacks. The review describes the currently known malaria diagnostic techniques, their strengths and shortcomings.

Keywords: malaria, technique, diagnosis, Africa

Procedia PDF Downloads 42
28075 Review of Malaria Diagnosis Techniques

Authors: Lubabatu Sada Sodangi

Abstract:

Malaria is a major cause of death in tropical and subtropical nations. Malaria cases are continually rising as a result of a number of factors, despite the fact that the condition is now treatable using effective methods. In this situation, quick and effective diagnostic methods are essential for the management and control of malaria. Malaria diagnosis using conventional methods is still troublesome; hence, new technologies have been created and implemented to get around the drawbacks. The review describes the currently known malaria diagnostic techniques, their strengths, and shortcomings.

Keywords: malaria, technique, diagnosis, Africa

Procedia PDF Downloads 49
28074 Copyright Clearance for Artificial Intelligence Training Data: Challenges and Solutions

Authors: Erva Akin

Abstract:

– The use of copyrighted material for machine learning purposes is a challenging issue in the field of artificial intelligence (AI). While machine learning algorithms require large amounts of data to train and improve their accuracy and creativity, the use of copyrighted material without permission from the authors may infringe on their intellectual property rights. In order to overcome copyright legal hurdle against the data sharing, access and re-use of data, the use of copyrighted material for machine learning purposes may be considered permissible under certain circumstances. For example, if the copyright holder has given permission to use the data through a licensing agreement, then the use for machine learning purposes may be lawful. It is also argued that copying for non-expressive purposes that do not involve conveying expressive elements to the public, such as automated data extraction, should not be seen as infringing. The focus of such ‘copy-reliant technologies’ is on understanding language rules, styles, and syntax and no creative ideas are being used. However, the non-expressive use defense is within the framework of the fair use doctrine, which allows the use of copyrighted material for research or educational purposes. The questions arise because the fair use doctrine is not available in EU law, instead, the InfoSoc Directive provides for a rigid system of exclusive rights with a list of exceptions and limitations. One could only argue that non-expressive uses of copyrighted material for machine learning purposes do not constitute a ‘reproduction’ in the first place. Nevertheless, the use of machine learning with copyrighted material is difficult because EU copyright law applies to the mere use of the works. Two solutions can be proposed to address the problem of copyright clearance for AI training data. The first is to introduce a broad exception for text and data mining, either mandatorily or for commercial and scientific purposes, or to permit the reproduction of works for non-expressive purposes. The second is that copyright laws should permit the reproduction of works for non-expressive purposes, which opens the door to discussions regarding the transposition of the fair use principle from the US into EU law. Both solutions aim to provide more space for AI developers to operate and encourage greater freedom, which could lead to more rapid innovation in the field. The Data Governance Act presents a significant opportunity to advance these debates. Finally, issues concerning the balance of general public interests and legitimate private interests in machine learning training data must be addressed. In my opinion, it is crucial that robot-creation output should fall into the public domain. Machines depend on human creativity, innovation, and expression. To encourage technological advancement and innovation, freedom of expression and business operation must be prioritised.

Keywords: artificial intelligence, copyright, data governance, machine learning

Procedia PDF Downloads 75
28073 Graph Neural Network-Based Classification for Disease Prediction in Health Care Heterogeneous Data Structures of Electronic Health Record

Authors: Raghavi C. Janaswamy

Abstract:

In the healthcare sector, heterogenous data elements such as patients, diagnosis, symptoms, conditions, observation text from physician notes, and prescriptions form the essentials of the Electronic Health Record (EHR). The data in the form of clear text and images are stored or processed in a relational format in most systems. However, the intrinsic structure restrictions and complex joins of relational databases limit the widespread utility. In this regard, the design and development of realistic mapping and deep connections as real-time objects offer unparallel advantages. Herein, a graph neural network-based classification of EHR data has been developed. The patient conditions have been predicted as a node classification task using a graph-based open source EHR data, Synthea Database, stored in Tigergraph. The Synthea DB dataset is leveraged due to its closer representation of the real-time data and being voluminous. The graph model is built from the EHR heterogeneous data using python modules, namely, pyTigerGraph to get nodes and edges from the Tigergraph database, PyTorch to tensorize the nodes and edges, PyTorch-Geometric (PyG) to train the Graph Neural Network (GNN) and adopt the self-supervised learning techniques with the AutoEncoders to generate the node embeddings and eventually perform the node classifications using the node embeddings. The model predicts patient conditions ranging from common to rare situations. The outcome is deemed to open up opportunities for data querying toward better predictions and accuracy.

Keywords: electronic health record, graph neural network, heterogeneous data, prediction

Procedia PDF Downloads 78
28072 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: classifier ensemble, breast cancer survivability, data mining, SEER

Procedia PDF Downloads 312
28071 Advances in Fiber Optic Technology for High-Speed Data Transmission

Authors: Salim Yusif

Abstract:

Fiber optic technology has revolutionized telecommunications and data transmission, providing unmatched speed, bandwidth, and reliability. This paper presents the latest advancements in fiber optic technology, focusing on innovations in fiber materials, transmission techniques, and network architectures that enhance the performance of high-speed data transmission systems. Key advancements include the development of ultra-low-loss optical fibers, multi-core fibers, advanced modulation formats, and the integration of fiber optics into next-generation network architectures such as Software-Defined Networking (SDN) and Network Function Virtualization (NFV). Additionally, recent developments in fiber optic sensors are discussed, extending the utility of optical fibers beyond data transmission. Through comprehensive analysis and experimental validation, this research offers valuable insights into the future directions of fiber optic technology, highlighting its potential to drive innovation across various industries.

Keywords: fiber optics, high-speed data transmission, ultra-low-loss optical fibers, multi-core fibers, modulation formats, coherent detection, software-defined networking, network function virtualization, fiber optic sensors

Procedia PDF Downloads 41
28070 Handling Missing Data by Using Expectation-Maximization and Expectation-Maximization with Bootstrapping for Linear Functional Relationship Model

Authors: Adilah Abdul Ghapor, Yong Zulina Zubairi, A. H. M. R. Imon

Abstract:

Missing value problem is common in statistics and has been of interest for years. This article considers two modern techniques in handling missing data for linear functional relationship model (LFRM) namely the Expectation-Maximization (EM) algorithm and Expectation-Maximization with Bootstrapping (EMB) algorithm using three performance indicators; namely the mean absolute error (MAE), root mean square error (RMSE) and estimated biased (EB). In this study, we applied the methods of imputing missing values in two types of LFRM namely the full model of LFRM and in LFRM when the slope is estimated using a nonparametric method. Results of the simulation study suggest that EMB algorithm performs much better than EM algorithm in both models. We also illustrate the applicability of the approach in a real data set.

Keywords: expectation-maximization, expectation-maximization with bootstrapping, linear functional relationship model, performance indicators

Procedia PDF Downloads 445
28069 A Generalized Framework for Adaptive Machine Learning Deployments in Algorithmic Trading

Authors: Robert Caulk

Abstract:

A generalized framework for adaptive machine learning deployments in algorithmic trading is introduced, tested, and released as open-source code. The presented software aims to test the hypothesis that recent data contains enough information to form a probabilistically favorable short-term price prediction. Further, the framework contains various adaptive machine learning techniques that are geared toward generating profit during strong trends and minimizing losses during trend changes. Results demonstrate that this adaptive machine learning approach is capable of capturing trends and generating profit. The presentation also discusses the importance of defining the parameter space associated with the dynamic training data-set and using the parameter space to identify and remove outliers from prediction data points. Meanwhile, the generalized architecture enables common users to exploit the powerful machinery while focusing on high-level feature engineering and model testing. The presentation also highlights common strengths and weaknesses associated with the presented technique and presents a broad range of well-tested starting points for feature set construction, target setting, and statistical methods for enforcing risk management and maintaining probabilistically favorable entry and exit points. The presentation also describes the end-to-end data processing tools associated with FreqAI, including automatic data fetching, data aggregation, feature engineering, safe and robust data pre-processing, outlier detection, custom machine learning and statistical tools, data post-processing, and adaptive training backtest emulation, and deployment of adaptive training in live environments. Finally, the generalized user interface is also discussed in the presentation. Feature engineering is simplified so that users can seed their feature sets with common indicator libraries (e.g. TA-lib, pandas-ta). The user also feeds data expansion parameters to fill out a large feature set for the model, which can contain as many as 10,000+ features. The presentation describes the various object-oriented programming techniques employed to make FreqAI agnostic to third-party libraries and external data sources. In other words, the back-end is constructed in such a way that users can leverage a broad range of common regression libraries (Catboost, LightGBM, Sklearn, etc) as well as common Neural Network libraries (TensorFlow, PyTorch) without worrying about the logistical complexities associated with data handling and API interactions. The presentation finishes by drawing conclusions about the most important parameters associated with a live deployment of the adaptive learning framework and provides the road map for future development in FreqAI.

Keywords: machine learning, market trend detection, open-source, adaptive learning, parameter space exploration

Procedia PDF Downloads 78
28068 Auditory Brainstem Response in Wave VI for the Detection of Learning Disabilities

Authors: Maria Isabel Garcia-Planas, Maria Victoria Garcia-Camba

Abstract:

The use of brain stem auditory evoked potential (BAEP) is a common way to study the auditory function of people, a way to learn the functionality of a part of the brain neuronal groups that intervene in the learning process by studying the behaviour of wave VI. The latest advances in neuroscience have revealed the existence of different brain activity in the learning process that can be highlighted through the use of innocuous, low-cost, and easy-access techniques such as, among others, the BAEP that can help us to detect early possible neurodevelopmental difficulties for their subsequent assessment and cure. To date and to the authors' best knowledge, only the latency data obtained, observing the first to V waves and mainly in the left ear, were taken into account. This work shows that it is essential to take into account both ears; with these latest data, it has been possible had diagnosed more precise some cases than with the previous data had been diagnosed as 'normal' despite showing signs of some alteration that motivated the new consultation to the specialist.

Keywords: ear, neurodevelopment, auditory evoked potentials, intervals of normality, learning disabilities

Procedia PDF Downloads 151
28067 Cost Reduction Techniques for Provision of Shelter to Homeless

Authors: Mukul Anand

Abstract:

Quality oriented affordable shelter for all has always been the key issue in the housing sector of our country. Homelessness is the acute form of housing need. It is a paradox that in spite of innumerable government initiated programmes for affordable housing, certain section of society is still devoid of shelter. About nineteen million (18.78 million) households grapple with housing shortage in Urban India in 2012. In Indian scenario there is major mismatch between the people for whom the houses are being built and those who need them. The prime force faced by public authorities in facilitation of quality housing for all is high cost of construction. The present paper will comprehend executable techniques for dilution of cost factor in housing the homeless. The key actors responsible for delivery of cheap housing stock such as capacity building, resource optimization, innovative low cost building material and indigenous skeleton housing system will also be incorporated in developing these techniques. Time performance, which is an important angle of above actors, will also be explored so as to increase the effectiveness of low cost housing. Along with this best practices will be taken up as case studies where both conventional techniques of housing and innovative low cost housing techniques would be cited. Transportation consists of approximately 30% of total construction budget. Thus use of alternative local solutions depending upon the region would be covered so as to highlight major components of low cost housing. Government is laid back regarding base line information on use of innovative low cost method and technique of resource optimization. Therefore, the paper would be an attempt to bring to light simpler solutions for achieving low cost housing.

Keywords: construction, cost, housing, optimization, shelter

Procedia PDF Downloads 434
28066 Hydrological Analysis for Urban Water Management

Authors: Ranjit Kumar Sahu, Ramakar Jha

Abstract:

Urban Water Management is the practice of managing freshwater, waste water, and storm water as components of a basin-wide management plan. It builds on existing water supply and sanitation considerations within an urban settlement by incorporating urban water management within the scope of the entire river basin. The pervasive problems generated by urban development have prompted, in the present work, to study the spatial extent of urbanization in Golden Triangle of Odisha connecting the cities Bhubaneswar (20.2700° N, 85.8400° E), Puri (19.8106° N, 85.8314° E) and Konark (19.9000° N, 86.1200° E)., and patterns of periodic changes in urban development (systematic/random) in order to develop future plans for (i) urbanization promotion areas, and (ii) urbanization control areas. Remote Sensing, using USGS (U.S. Geological Survey) Landsat8 maps, supervised classification of the Urban Sprawl has been done for during 1980 - 2014, specifically after 2000. This Work presents the following: (i) Time series analysis of Hydrological data (ground water and rainfall), (ii) Application of SWMM (Storm Water Management Model) and other soft computing techniques for Urban Water Management, and (iii) Uncertainty analysis of model parameters (Urban Sprawl and correlation analysis). The outcome of the study shows drastic growth results in urbanization and depletion of ground water levels in the area that has been discussed briefly. Other relative outcomes like declining trend of rainfall and rise of sand mining in local vicinity has been also discussed. Research on this kind of work will (i) improve water supply and consumption efficiency (ii) Upgrade drinking water quality and waste water treatment (iii) Increase economic efficiency of services to sustain operations and investments for water, waste water, and storm water management, and (iv) engage communities to reflect their needs and knowledge for water management.

Keywords: Storm Water Management Model (SWMM), uncertainty analysis, urban sprawl, land use change

Procedia PDF Downloads 419