Search results for: LiDAR datasets
216 Self-Supervised Attributed Graph Clustering with Dual Contrastive Loss Constraints
Authors: Lijuan Zhou, Mengqi Wu, Changyong Niu
Abstract:
Attributed graph clustering can utilize the graph topology and node attributes to uncover hidden community structures and patterns in complex networks, aiding in the understanding and analysis of complex systems. Utilizing contrastive learning for attributed graph clustering can effectively exploit meaningful implicit relationships between data. However, existing attributed graph clustering methods based on contrastive learning suffer from the following drawbacks: 1) Complex data augmentation increases computational cost, and inappropriate data augmentation may lead to semantic drift. 2) The selection of positive and negative samples neglects the intrinsic cluster structure learned from graph topology and node attributes. Therefore, this paper proposes a method called self-supervised Attributed Graph Clustering with Dual Contrastive Loss constraints (AGC-DCL). Firstly, Siamese Multilayer Perceptron (MLP) encoders are employed to generate two views separately to avoid complex data augmentation. Secondly, the neighborhood contrastive loss is introduced to constrain node representation using local topological structure while effectively embedding attribute information through attribute reconstruction. Additionally, clustering-oriented contrastive loss is applied to fully utilize clustering information in global semantics for discriminative node representations, regarding the cluster centers from two views as negative samples to fully leverage effective clustering information from different views. Comparative clustering results with existing attributed graph clustering algorithms on six datasets demonstrate the superiority of the proposed method.Keywords: attributed graph clustering, contrastive learning, clustering-oriented, self-supervised learning
Procedia PDF Downloads 53
215 Reducing the Imbalance Penalty Through Artificial Intelligence Methods Geothermal Production Forecasting: A Case Study for Turkey
Authors: Hayriye Anıl, Görkem Kar
Abstract:
In addition to being rich in renewable energy resources, Turkey is one of the countries with promising potential in geothermal energy production, owing to its high installed power, low cost, and sustainability. Increasing imbalance penalties become an economic burden for organizations, since geothermal generation plants cannot maintain the balance of supply and demand due to the inadequacy of the production forecasts given in the day-ahead market. A better production forecast reduces the imbalance penalties of market participants and improves the balance in the day-ahead market. In this study, using machine learning, deep learning, and time series methods, the total generation of the power plants belonging to Zorlu Natural Electricity Generation, which has a high installed geothermal capacity, was estimated for the first one and two weeks of March; the imbalance penalties were then calculated with these estimates and compared with the real values. These modeling operations were carried out on two datasets: the basic dataset and a dataset created by extracting new features from it through feature engineering. According to the results, Support Vector Regression, one of the traditional machine learning models, outperformed the other models. In addition, the estimates on the feature-engineered dataset showed lower error rates than those on the basic dataset. It was concluded that the estimated imbalance penalty calculated for the selected organization is lower than the actual imbalance penalty, which is the more favorable and profitable outcome. Keywords: machine learning, deep learning, time series models, feature engineering, geothermal energy production forecasting
Procedia PDF Downloads 110
214 C-eXpress: A Web-Based Analysis Platform for Comparative Functional Genomics and Proteomics in Human Cancer Cell Line, NCI-60 as an Example
Authors: Chi-Ching Lee, Po-Jung Huang, Kuo-Yang Huang, Petrus Tang
Abstract:
Background: Recent advances in high-throughput research technologies such as new-generation sequencing and multi-dimensional liquid chromatography make it possible to dissect the complete transcriptome and proteome in a single run for the first time. However, it is almost impossible for many laboratories to handle and analyze these “BIG” data without the support of a bioinformatics team. We aimed to provide a web-based analysis platform for users with only limited knowledge of bio-computing to study functional genomics and proteomics. Method: We use NCI-60 as an example dataset to demonstrate the power of the web-based analysis platform and data delivery system. C-eXpress takes as input a simple text file that contains the standard NCBI gene or protein ID and expression levels (RPKM or fold) and generates a distribution map of gene/protein expression levels as a heatmap diagram organized by color gradients. The diagram is hyperlinked to a dynamic HTML table that allows users to filter the datasets based on various gene features. A dynamic summary chart is generated automatically after each filtering step. Results: We implemented an integrated database that contains pre-defined annotations such as gene/protein properties (ID, name, length, MW, pI); pathways based on KEGG and GO biological process; subcellular localization based on GO cellular component; and functional classification based on GO molecular function, kinase, peptidase and transporter. Multiple ways of sorting columns and rows are also provided for comparative analysis and visualization of multiple samples. Keywords: cancer, visualization, database, functional annotation
Procedia PDF Downloads 618
213 Verification of Satellite and Observation Measurements to Build Solar Energy Projects in North Africa
Authors: Samy A. Khalil, U. Ali Rahoma
Abstract:
For measurements of solar radiation, satellite data have been routinely utilized to estimate solar energy. However, the temporal coverage of satellite data has some limits. The reanalysis, also known as "retrospective analysis", of the atmosphere's parameters is produced by fusing the output of NWP (Numerical Weather Prediction) models with observation data from a variety of sources, including ground, satellite, ship, and aircraft observations. The result is a comprehensive record of the parameters affecting weather and climate. The effectiveness of the reanalysis dataset (ERA-5) for North Africa was evaluated against high-quality surface measurements using statistical analysis. The distribution of global solar radiation (GSR) was estimated over five chosen areas in North Africa for the ten-year period from 2011 to 2020. To investigate seasonal change in dataset performance, a seasonal statistical analysis was conducted, which showed a considerable difference in errors throughout the year. Altering the temporal resolution of the data used for comparison alters the performance of the dataset. Better performance is indicated by the data's monthly mean values, but data accuracy is degraded. Solar resource assessment and power estimation are discussed using the ERA-5 solar radiation data. The average values of mean bias error (MBE), root mean square error (RMSE) and mean absolute error (MAE) of the reanalysis solar radiation data vary from 0.079 to 0.222, 0.055 to 0.178, and 0.0145 to 0.198, respectively, over the study period. The correlation coefficient (R²) varies from 0.93 to 0.99 over the same period. This research's objective is to provide a reliable representation of the world's solar radiation to aid in the use of solar energy in all sectors. Keywords: solar energy, ERA-5 analysis data, global solar radiation, North Africa
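The three error statistics named in this abstract have standard definitions, sketched below in Python. This is only an illustration: the abstract does not say whether its reported values are normalized, so the sketch uses the plain (unnormalized) forms, and the function name `error_metrics` and the sample values are made up.

```python
import math

def error_metrics(estimated, observed):
    """Return (MBE, RMSE, MAE) for paired estimated and observed values."""
    if len(estimated) != len(observed) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    # Signed differences: positive means the estimate is too high.
    diffs = [e - o for e, o in zip(estimated, observed)]
    n = len(diffs)
    mbe = sum(diffs) / n                             # mean bias error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)  # root mean square error
    mae = sum(abs(d) for d in diffs) / n             # mean absolute error
    return mbe, rmse, mae

# Made-up illustrative values, not data from the study.
reanalysis = [5.0, 6.0, 7.0, 8.0]
ground = [4.0, 6.0, 8.0, 8.0]
mbe, rmse, mae = error_metrics(reanalysis, ground)
print(mbe, rmse, mae)
```

MBE keeps the sign of each difference, so over- and under-estimates can cancel; RMSE and MAE do not cancel, which is why the three are usually reported together.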
Procedia PDF Downloads 98
212 Prompt Design for Code Generation in Data Analysis Using Large Language Models
Authors: Lu Song Ma Li Zhi
Abstract:
With the rapid advancement of artificial intelligence technology, large language models (LLMs) have become a milestone in the field of natural language processing, demonstrating remarkable capabilities in semantic understanding, intelligent question answering, and text generation. These models are gradually penetrating various industries, particularly showcasing significant application potential in the data analysis domain. However, retraining or fine-tuning these models requires substantial computational resources and ample downstream task datasets, which poses a significant challenge for many enterprises and research institutions. Without modifying the internal parameters of the large models, prompt engineering techniques can rapidly adapt these models to new domains. This paper proposes a prompt design strategy aimed at leveraging the capabilities of large language models to automate the generation of data analysis code. By carefully designing prompts, data analysis requirements can be described in natural language, which the large language model can then understand and convert into executable data analysis code, thereby greatly enhancing the efficiency and convenience of data analysis. This strategy not only lowers the threshold for using large models but also significantly improves the accuracy and efficiency of data analysis. Our approach includes requirements for the precision of natural language descriptions, coverage of diverse data analysis needs, and mechanisms for immediate feedback and adjustment. Experimental results show that with this prompt design strategy, large language models perform exceptionally well in multiple data analysis tasks, generating high-quality code and significantly shortening the data analysis cycle. 
This method provides an efficient and convenient tool for the data analysis field and demonstrates the enormous potential of large language models in practical applications. Keywords: large language models, prompt design, data analysis, code generation
Procedia PDF Downloads 39
211 Non-Invasive Data Extraction from Machine Display Units Using Video Analytics
Authors: Ravneet Kaur, Joydeep Acharya, Sudhanshu Gaur
Abstract:
Artificial Intelligence (AI) has the potential to transform manufacturing by improving shop floor processes such as production, maintenance and quality. However, industrial datasets are notoriously difficult to extract in a real-time, streaming fashion, thus negating potential AI benefits. The main example is specialized industrial controllers operated by custom software, which complicates the process of connecting them to an Information Technology (IT) based data acquisition network. Security concerns may also limit direct physical access to these controllers for data acquisition. To connect the Operational Technology (OT) data stored in these controllers to an AI application in a secure, reliable and available way, we propose a novel Industrial IoT (IIoT) solution in this paper. In this solution, we demonstrate how video cameras can be installed on a factory shop floor to continuously obtain images of the controller HMIs. We propose image pre-processing to segment the HMI into regions of streaming data and regions of fixed meta-data. We then evaluate the performance of multiple Optical Character Recognition (OCR) technologies, such as Tesseract and Google Vision, in recognizing the streaming data, and test them on typical factory HMIs under realistic lighting conditions. Finally, we use the meta-data to match the OCR output with the temporal, domain-dependent context of the data to improve the accuracy of the output. Our IIoT solution enables reliable and efficient data extraction, which will improve the performance of subsequent AI applications. Keywords: human machine interface, industrial internet of things, internet of things, optical character recognition, video analytics
Procedia PDF Downloads 109
210 The Application of Participatory Social Media in Collaborative Planning: A Systematic Review
Authors: Yujie Chen, Zhen Li
Abstract:
In the context of planning transformation, how to promote public participation in the formulation and implementation of collaborative planning has been a central issue of discussion. However, existing studies have often been case-specific or focused on a specific design field, leaving the role of participatory social media (PSM) in urban collaborative planning generally in question. A systematic database search was conducted in December 2019. Articles and projects were eligible if they reported a quantitative empirical study applying participatory social media in the collaborative planning process (prospective, retrospective, experimental, or longitudinal research, or collective actions in planning practices). Twenty studies and seven projects were included in the review. Findings showed that social media are generally applied in the fields of public spatial behavior, transportation behavior, and community planning, with new technologies and new datasets. PSM has provided a new platform for participatory design, decision analysis, and collaborative negotiation, and is most widely used in participatory design. The findings identified several existing forms of PSM. PSM mainly plays three roles: the language of decision-making for communication, a study mode for spatial evaluation, and a decision agenda for interactive decision support. Three areas of optimization for PSM were recognized: improving the scale of participation, improving grassroots organization, and promoting politics. However, participants can essentially only provide information and comments through PSM in future collaborative planning processes; therefore, the issues of low data response rate, poor spatial data quality, and participation sustainability deserve more attention and solutions. Keywords: participatory social media, collaborative planning, planning workshop, application mode
Procedia PDF Downloads 133
209 Meanings and Concepts of Standardization in Systems Medicine
Authors: Imme Petersen, Wiebke Sick, Regine Kollek
Abstract:
In systems medicine, high-throughput technologies produce large amounts of data on different biological and pathological processes, including (disturbed) gene expressions, metabolic pathways and signaling. The large volume of data of different types, stored in separate databases and often located at different geographical sites have posed new challenges regarding data handling and processing. Tools based on bioinformatics have been developed to resolve the upcoming problems of systematizing, standardizing and integrating the various data. However, the heterogeneity of data gathered at different levels of biological complexity is still a major challenge in data analysis. To build multilayer disease modules, large and heterogeneous data of disease-related information (e.g., genotype, phenotype, environmental factors) are correlated. Therefore, a great deal of attention in systems medicine has been put on data standardization, primarily to retrieve and combine large, heterogeneous datasets into standardized and incorporated forms and structures. However, this data-centred concept of standardization in systems medicine is contrary to the debate in science and technology studies (STS) on standardization that rather emphasizes the dynamics, contexts and negotiations of standard operating procedures. Based on empirical work on research consortia that explore the molecular profile of diseases to establish systems medical approaches in the clinic in Germany, we trace how standardized data are processed and shaped by bioinformatics tools, how scientists using such data in research perceive such standard operating procedures and which consequences for knowledge production (e.g. modeling) arise from it. Hence, different concepts and meanings of standardization are explored to get a deeper insight into standard operating procedures not only in systems medicine, but also beyond.Keywords: data, science and technology studies (STS), standardization, systems medicine
Procedia PDF Downloads 341
208 Pregnant Women in Substance Abuse: Transition of Characteristics and Mining of Association from Teds-a 2011 to 2018
Authors: Md Tareq Ferdous Khan, Shrabanti Mazumder, MB Rao
Abstract:
Background: Substance use during pregnancy is a longstanding public health problem that results in severe consequences for pregnant women and fetuses. Methods: Eight (2011-2018) datasets on pregnant women’s admissions are extracted from TEDS-A. Distributions of sociodemographic, substance abuse behaviors, and clinical characteristics are constructed and compared over the years for trends by the Cochran-Armitage test. Market basket analysis is used in mining the association among polysubstance abuse. Results: Over the years, pregnant woman admissions as the percentage of total and female admissions remain stable, where total annual admissions range from 1.54 to about 2 million with the female share of 33.30% to 35.61%. Pregnant women aged 21-29, 12 or more years of education, white race, unemployed, holding independent living status are among the most vulnerable. Concerns prevail on a significant number of polysubstance users, young age at first use, frequency of daily users, and records of prior admissions (60%). Trends of abused primary substances show a significant rise in heroin (66%) and methamphetamine (46%) over the years, although the latest year shows a considerable downturn. On the other hand, significant decreasing patterns are evident for alcohol (43%), marijuana or hashish (24%), cocaine or crack (23%), other opiates or synthetics (36%), and benzodiazepines (29%). Basket analysis reveals some patterns of co-occurrence of substances consistent over the years. Conclusions: This comprehensive study can work as a reference to identify the most vulnerable groups based on their characteristics and deal with the most hazardous substances from their evidence of co-occurrence.Keywords: basket analysis, pregnant women, substance abuse, trend analysis
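Market basket analysis, as mentioned in the Methods, scores co-occurring items by support, confidence, and lift. The Python sketch below shows those three measures for item pairs only. It is a simplified illustration, not the study's procedure: the function name `pair_rules`, the threshold, and the toy records are all made up, and full analyses typically use an algorithm such as Apriori over larger itemsets.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5):
    """Return {(antecedent, consequent): (support, confidence, lift)} for item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))                  # de-duplicate within one record
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n                     # share of records containing both
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]         # P(y | x)
            lift = confidence / (item_counts[y] / n)    # P(y | x) / P(y)
            rules[(x, y)] = (support, confidence, lift)
    return rules

# Made-up toy records for illustration only; not TEDS-A data.
records = [
    {"heroin", "cocaine"},
    {"heroin", "cocaine"},
    {"alcohol", "marijuana"},
    {"heroin"},
]
rules = pair_rules(records, min_support=0.5)
for rule, scores in sorted(rules.items()):
    print(rule, scores)
```

A lift above 1 indicates that two items appear together more often than independence would predict, which is the kind of co-occurrence pattern the abstract reports tracking across years.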
Procedia PDF Downloads 195
207 Calpoly Autonomous Transportation Experience: Software for Driverless Vehicle Operating on Campus
Authors: F. Tang, S. Boskovich, A. Raheja, Z. Aliyazicioglu, S. Bhandari, N. Tsuchiya
Abstract:
Calpoly Autonomous Transportation Experience (CATE) is a driverless vehicle that we are developing to provide safe, accessible, and efficient transportation of passengers throughout the Cal Poly Pomona campus for events such as orientation tours. Unlike the other self-driving vehicles that are usually developed to operate with other vehicles and reside only on the road networks, CATE will operate exclusively on walk-paths of the campus (potentially narrow passages) with pedestrians traveling from multiple locations. Safety becomes paramount as CATE operates within the same environment as pedestrians. As driverless vehicles assume greater roles in today’s transportation, this project will contribute to autonomous driving with pedestrian traffic in a highly dynamic environment. The CATE project requires significant interdisciplinary work. Researchers from mechanical engineering, electrical engineering and computer science are working together to attack the problem from different perspectives (hardware, software and system). In this abstract, we describe the software aspects of the project, with a focus on the requirements and the major components. CATE shall provide a GUI interface for the average user to interact with the car and access its available functionalities, such as selecting a destination from any origin on campus. We have developed an interface that provides an aerial view of the campus map, the current car location, routes, and the goal location. Users can interact with CATE through audio or manual inputs. CATE shall plan routes from the origin to the selected destination for the vehicle to travel. We will use an existing aerial map for the campus and convert it to a spatial graph configuration where the vertices represent the landmarks and edges represent paths that the car should follow with some designated behaviors (such as stay on the right side of the lane or follow an edge). 
Graph search algorithms such as A* will be implemented as the default path planning algorithm. D* Lite will be explored to efficiently recompute the path when there are any changes to the map. CATE shall avoid any static obstacles and walking pedestrians within some safe distance. Unlike traveling along traditional roadways, CATE’s route directly coexists with pedestrians. To ensure the safety of the pedestrians, we will use sensor fusion techniques that combine data from both lidar and stereo vision for obstacle avoidance while also allowing CATE to operate along its intended route. We will also build prediction models for pedestrian traffic patterns. CATE shall improve its location and work under a GPS-denied situation. CATE relies on its GPS to give its current location, which has a precision of a few meters. We have implemented an Unscented Kalman Filter (UKF) that allows the fusion of data from multiple sensors (such as GPS, IMU, odometry) in order to increase the confidence of localization. We also noticed that GPS signals can easily get degraded or blocked on campus due to high-rise buildings or trees. UKF can also help here to generate a better state estimate. In summary, CATE will provide on-campus transportation experience that coexists with dynamic pedestrian traffic. In future work, we will extend it to multi-vehicle scenarios.Keywords: driverless vehicle, path planning, sensor fusion, state estimate
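The route-planning idea described above (landmarks as vertices, paths as edges, A* as the default planner) can be sketched in a few lines of Python. This is a generic A* with straight-line distance as both edge cost and heuristic, not the CATE implementation; the landmark names, coordinates, and the `a_star` function are hypothetical.

```python
import heapq
import math

def a_star(coords, edges, start, goal):
    """Shortest path on a spatial graph.

    coords: {node: (x, y)}; edges: {node: [neighbor, ...]}.
    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """
    def dist(a, b):
        (x1, y1), (x2, y2) = coords[a], coords[b]
        return math.hypot(x2 - x1, y2 - y1)

    open_heap = [(dist(start, goal), start)]   # entries are (f = g + h, node)
    g = {start: 0.0}                           # best known cost from start
    parent = {start: None}
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node in closed:
            continue                           # stale heap entry
        if node == goal:
            path = []
            while node is not None:            # walk parents back to start
                path.append(node)
                node = parent[node]
            return path[::-1], g[goal]
        closed.add(node)
        for nb in edges.get(node, []):
            new_g = g[node] + dist(node, nb)
            if new_g < g.get(nb, math.inf):
                g[nb] = new_g
                parent[nb] = node
                heapq.heappush(open_heap, (new_g + dist(nb, goal), nb))
    return None, math.inf

# Hypothetical campus landmarks (coordinates in arbitrary units).
coords = {"library": (0, 0), "quad": (3, 0), "gym": (3, 4), "cafe": (0, 5)}
edges = {
    "library": ["quad", "cafe"],
    "quad": ["library", "gym"],
    "cafe": ["library", "gym"],
    "gym": ["quad", "cafe"],
}
path, cost = a_star(coords, edges, "library", "gym")
print(path, cost)
```

Because edge costs are Euclidean lengths, the straight-line heuristic never overestimates, so the first time the goal is popped its path is optimal. Replanning after map changes, which the abstract assigns to D* Lite, is outside this sketch.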
Procedia PDF Downloads 144
206 A Comprehensive Study and Evaluation on Image Fashion Features Extraction
Authors: Yuanchao Sang, Zhihao Gong, Longsheng Chen, Long Chen
Abstract:
Clothing fashion represents a human’s aesthetic appreciation towards everyday outfits and appetite for fashion, and it reflects the development of status in society, humanity, and economics. However, modelling fashion by machine is extremely challenging because fashion is too abstract to be efficiently described by machines. Even human beings can hardly reach a consensus about fashion. In this paper, we are dedicated to answering a fundamental fashion-related problem: what image feature best describes clothing fashion? To address this issue, we have designed and evaluated various image features, ranging from traditional low-level hand-crafted features to mid-level style awareness features to various current popular deep neural network-based features, which have shown state-of-the-art performance in various vision tasks. In summary, we tested the following 9 feature representations: color, texture, shape, style, convolutional neural networks (CNNs), CNNs with distance metric learning (CNNs&DML), AutoEncoder, CNNs with multiple layer combination (CNNs&MLC) and CNNs with dynamic feature clustering (CNNs&DFC). Finally, we validated the performance of these features on two publicly available datasets. Quantitative and qualitative experimental results on both intra-domain and inter-domain fashion clothing image retrieval showed that deep learning based feature representations far outweigh traditional hand-crafted feature representation. Additionally, among all deep learning based methods, CNNs with explicit feature clustering performs best, which shows feature clustering is essential for discriminative fashion feature representation.Keywords: convolutional neural network, feature representation, image processing, machine modelling
Procedia PDF Downloads 139
205 A Framework for Auditing Multilevel Models Using Explainability Methods
Authors: Debarati Bhaumik, Diptish Dey
Abstract:
Multilevel models, increasingly deployed in industries such as insurance, food production, and entertainment within functions such as marketing and supply chain management, need to be transparent and ethical. Applications usually result in binary classification within groups or hierarchies based on a set of input features. Using open-source datasets, we demonstrate that popular explainability methods, such as SHAP and LIME, consistently underperform in accuracy when interpreting these models. They fail to predict the order of feature importance, the magnitudes, and occasionally even the nature of the feature contribution (negative versus positive contribution to the outcome). Besides accuracy, the computational intractability of SHAP for binomial classification is a cause of concern. For transparent and ethical applications of these hierarchical statistical models, sound audit frameworks need to be developed. In this paper, we propose an audit framework for technical assessment of multilevel regression models focusing on three aspects: (i) model assumptions & statistical properties, (ii) model transparency using different explainability methods, and (iii) discrimination assessment. To this end, we undertake a quantitative approach and compare intrinsic model methods with SHAP and LIME. The framework comprises a shortlist of KPIs, such as PoCE (Percentage of Correct Explanations) and MDG (Mean Discriminatory Gap) per feature, for each of these three aspects. A traffic light risk assessment method is furthermore coupled to these KPIs. The audit framework will assist regulatory bodies in performing conformity assessments of AI systems using multilevel binomial classification models at businesses. It will also help businesses deploying multilevel models to be future-proof and aligned with the European Commission’s proposed Regulation on Artificial Intelligence. Keywords: audit, multilevel model, model transparency, model explainability, discrimination, ethics
Procedia PDF Downloads 94
204 Urban Road Network Connectivity and Accessibility Analysis Using RS and GIS: A Case Study of Chandannagar City
Authors: Joy Ghosh, Debasmita Biswas
Abstract:
The road network of any area is the most important indicator of regional planning. For proper utilization of urban road networks, structural parameters such as connectivity and accessibility should be analyzed and evaluated. This paper aims to explain the application of GIS to urban road network connectivity and accessibility analysis with a case study of Chandannagar City. The paper analyzes road network connectivity through various connectivity measurements, such as the total number of nodes and links, Cyclomatic Number, Alpha Index, Beta Index, Gamma Index, Eta Index, Pi Index, Theta Index, Aggregated Transport Score, and Road Density, based on the existing road network in Chandannagar city in India. Accessibility is measured through the Shortest Path Matrix, Associated Number, and Shimbel Index. Various urban services, such as schools, banks, hospitals, petrol pumps, ATMs, police stations, theatres, and parks, are considered in the accessibility analysis for each ward. This paper also highlights the relationship between urban land use/land cover (LULC), the urban road network, and population density using various spatial and statistical measurements. The datasets were collected through a field survey of 33 wards of the Chandannagar Municipal Corporation area, and the secondary data were collected from OpenStreetMap and Landsat 8 OLI & TIRS satellite imagery from USGS. Chandannagar was once a French colony, and various kinds of planning were applied at that time, but the city now continues to grow haphazardly and faces a number of problems; the knowledge gained from this paper helps to create a more efficient and accessible road network. It is therefore suggested that some wards need to improve their connectivity and accessibility for the future growth and development of Chandannagar. Keywords: accessibility, connectivity, transport, road network
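Several of the connectivity measures listed in this abstract depend only on the counts of nodes and links. The Python sketch below uses the commonly cited planar-graph definitions of the Cyclomatic Number and the Alpha, Beta, and Gamma indices; the abstract does not state its formulas, so these definitions, the function name `connectivity_indices`, and the sample counts are assumptions.

```python
def connectivity_indices(nodes, links, subgraphs=1):
    """Planar-graph connectivity measures from node and link counts.

    nodes: number of vertices (v); links: number of edges (e);
    subgraphs: number of disconnected components (p).
    """
    if nodes < 3:
        raise ValueError("the planar-graph formulas need at least 3 nodes")
    cyclomatic = links - nodes + subgraphs   # mu = e - v + p (independent circuits)
    alpha = cyclomatic / (2 * nodes - 5)     # observed / maximum possible circuits
    beta = links / nodes                     # average links per node
    gamma = links / (3 * (nodes - 2))        # observed / maximum possible links
    return {"cyclomatic": cyclomatic, "alpha": alpha, "beta": beta, "gamma": gamma}

# Made-up counts for illustration; not figures from the Chandannagar survey.
result = connectivity_indices(nodes=10, links=12)
print(result)
```

Alpha and Gamma both range from 0 to 1 for a planar network, so they can be compared across wards of different size, whereas Beta grows with the number of links per node.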
Procedia PDF Downloads 73
203 Recurrent Neural Networks for Complex Survival Models
Authors: Pius Marthin, Nihal Ata Tutkun
Abstract:
Survival analysis has become one of the paramount procedures in the modeling of time-to-event data. When we encounter complex survival problems, the traditional approach remains limited in accounting for the complex correlational structure between the covariates and the outcome due to the strong assumptions that limit the inference and prediction ability of the resulting models. Several studies exist on the deep learning approach to survival modeling; moreover, the application for the case of complex survival problems still needs to be improved. In addition, the existing models need to address the data structure's complexity fully and are subject to noise and redundant information. In this study, we design a deep learning technique (CmpXRnnSurv_AE) that obliterates the limitations imposed by traditional approaches and addresses the above issues to jointly predict the risk-specific probabilities and survival function for recurrent events with competing risks. We introduce the component termed Risks Information Weights (RIW) as an attention mechanism to compute the weighted cumulative incidence function (WCIF) and an external auto-encoder (ExternalAE) as a feature selector to extract complex characteristics among the set of covariates responsible for the cause-specific events. We train our model using synthetic and real data sets and employ the appropriate metrics for complex survival models for evaluation. As benchmarks, we selected both traditional and machine learning models and our model demonstrates better performance across all datasets.Keywords: cumulative incidence function (CIF), risk information weight (RIW), autoencoders (AE), survival analysis, recurrent events with competing risks, recurrent neural networks (RNN), long short-term memory (LSTM), self-attention, multilayers perceptrons (MLPs)
Procedia PDF Downloads 90
202 Geospatial Curve Fitting Methods for Disease Mapping of Tuberculosis in Eastern Cape Province, South Africa
Authors: Davies Obaromi, Qin Yongsong, James Ndege
Abstract:
To interpolate scattered or regularly distributed data, there are approximate and exact methods. Some of these methods are suited to interpolating data on a regular grid and others on an irregular grid. In spatial epidemiology, it is important to examine how disease prevalence rates are distributed in space and how they relate to each other within a defined distance and direction. In this study, for the geographic and graphic representation of disease prevalence, linear and biharmonic spline methods were implemented in MATLAB and used to identify, localize, and compare smoothing in the distribution patterns of tuberculosis (TB) in Eastern Cape Province. The aim of this study is to produce a smoother graphical disease map of TB prevalence patterns using 3-D curve fitting techniques, especially biharmonic splines, which can suppress noise easily by seeking a least-squares fit rather than exact interpolation. The datasets are represented generally as 3D or XYZ triplets, where X and Y are the spatial coordinates and Z is the variable of interest, in this case TB counts in the province. The smoothing spline is a method of fitting a smooth curve to a set of noisy observations using a spline function, and it has become a conventional method because of its high precision, simplicity, and flexibility. Surface and contour plots are produced for TB prevalence at the provincial level for 2012 to 2015. The fittings showed a systematic pattern in the distribution of TB cases in the province, which is consistent with some spatial statistical analyses carried out in the province. This method is rarely used in disease mapping applications, but it has the advantage that it can be evaluated at arbitrary locations rather than only on a rectangular grid, as in most traditional GIS methods of geospatial analysis. Keywords: linear, biharmonic splines, tuberculosis, South Africa
Procedia PDF Downloads 238
201 Damage Identification in Reinforced Concrete Beams Using Modal Parameters and Their Formulation
Authors: Ali Al-Ghalib, Fouad Mohammad
Abstract:
The identification of damage in reinforced concrete structures subjected to incremental cracking, using vibration data, is recognized as a challenging topic in the published and heavily cited literature. This paper therefore attempts to shed light on the scope of dynamic methods when applied to reinforced concrete beams with various simulated defect scenarios. For this purpose, three reinforced concrete beams are tested over the course of the study. The three beams are loaded statically to failure in successive incremental load cycles and later rehabilitated. After each static load stage, the beams are tested under free-free support conditions using experimental modal analysis. The beams all had the same dimensions (2.0 x 0.14 x 0.09 m) but differed in concrete compressive strength and in the type of damage introduced. Using experimental modal parameters as damage identification parameters was shown to be computationally expensive and time-consuming, and to require substantial inputs and considerable expertise. Nonetheless, they proved plausible for condition monitoring in the current case study and for tracking structural changes under progressive loading. It was found that satisfactory localization and quantification of structural changes (Level 2 and Level 3 of the damage identification problem) can reasonably be achieved only by considering the frequencies and mode shapes of a system in a proper analytical model. A convenient post-analysis process is applied to the various vibration measurement datasets for the three beams in order to extract, check and correlate the basic modal parameters, namely natural frequency, modal damping and mode shapes.
The extracted modal parameters and their combinations are used and discussed in this research as quantification parameters.
Keywords: experimental modal analysis, damage identification, structural health monitoring, reinforced concrete beam
Procedia PDF Downloads 263
200 Functional Feeding Groups and Trophic Levels of Benthic Macroinvertebrates Assemblages in Albertine Rift Rivers and Streams in South Western Uganda
Authors: Peace Liz Sasha Musonge
Abstract:
Behavioral aspects of species nutrition, such as feeding methods and food type, are archetypal biological traits signifying how species have adapted to their environment. The concept of functional feeding group (FFG) analysis is currently used to ascertain the trophic levels of the aquatic food web in a specific microhabitat. However, in Eastern Africa, information about the FFG classification of benthic macroinvertebrates in highland rivers and streams is almost absent, and existing studies have fragmented datasets. For this reason, we carried out a robust study to determine the feed type, trophic level and FFGs of 56 macroinvertebrate taxa (identified to family level) from Albertine Rift valley streams. Our findings showed that all five major functional feeding groups were represented: Gatherer Collectors (GC), Predators (PR), Shredders (SH), Scrapers (SC) and Filterer Collectors. The most dominant functional feeding group was the Gatherer Collectors (GC), which accounted for 53.5% of the total population. The most abundant GC families were Baetidae (7813 individuals), Chironomidae NTP (5628) and Caenidae (1848). The majority of the macroinvertebrate population feeds on fine particulate organic matter (FPOM) from the stream bottom. In terms of taxa richness, the Predators (PR) had the highest value, with 24 taxa, and the Filterer Collectors group had the fewest taxa (3). The predator (PR) families with the most individuals were Corixidae (1024 individuals), Coenagrionidae (445) and Libellulidae (283). However, Predators accounted for only 7.4% of the population. The findings highlighted the functional feeding groups and habitat type of macroinvertebrate communities along an altitudinal gradient.
Keywords: trophic levels, functional feeding groups, macroinvertebrates, Albertine rift
Procedia PDF Downloads 235
199 Thick Data Techniques for Identifying Abnormality in Video Frames for Wireless Capsule Endoscopy
Authors: Jinan Fiaidhi, Sabah Mohammed, Petros Zezos
Abstract:
Capsule endoscopy (CE) is an established noninvasive diagnostic modality for investigating small bowel disease. CE has a pivotal role in assessing patients with suspected bleeding or identifying evidence of active Crohn's disease in the small bowel. However, CE produces lengthy videos of at least eighty thousand frames, captured at a rate of 2 frames per second. Gastroenterologists cannot dedicate 8 to 15 hours to reading the CE video frames to arrive at a diagnosis. Analyzing CE videos with modern artificial intelligence techniques has therefore become a necessity. However, machine learning, including deep learning, has failed to report robust results because of the lack of large samples to train its neural nets. In this paper, we describe a thick data approach that learns from a few anchor images. We use sound datasets such as KVASIR and CrohnIPI to filter candidate frames that include interesting anomalies in any CE video. We identify candidate frames based on feature extraction that provides representative measures of the anomaly, such as the size of the anomaly and its color contrast with the image background, and then feed these features to a decision tree that can classify the candidate frames as showing a condition such as Crohn's disease. The accuracy of our thick data approach in detecting Crohn's disease, based on the presence of ulcer areas in the candidate frames, was 89.9% for KVASIR and 83.3% for CrohnIPI. We are continuing our research to fine-tune our approach by adding more thick data methods to enhance diagnostic accuracy.
Keywords: thick data analytics, capsule endoscopy, Crohn’s disease, siamese neural network, decision tree
Procedia PDF Downloads 156
198 Spatio-Temporal Variability and Trends in Frost-Free Season Parameters in Finland: Influence of Climate Teleconnections
Authors: Masoud Irannezhad, Sirpa Rasmus, Saghar Ahmadian, Deliang Chen, Bjorn Klove
Abstract:
Variability and changes in thermal conditions play a crucial role in the functioning of human society, particularly in cold-climate regions like Finland. Accordingly, the frost-free season (FFS) parameters, in terms of start (FFSS), end (FFSE) and length (FFSL), have substantial effects not only on the natural environment (e.g. flora and fauna) but also on human requirements (e.g. agriculture, forestry and energy generation). Applying the 0°C threshold of minimum temperature (Tmin), the FFS was defined as the period between the last spring frost (FFSS) and the first fall frost (FFSE). For this study, gridded (10 km x 10 km) daily minimum temperature datasets covering Finland during 1961-2011 were used to investigate recent spatio-temporal variations and trends in FFS parameters and their relationships with the well-known large-scale climate teleconnections (CTs). The FFS in Finland naturally increases from north (~60 days) to south (~190 days), in association with earlier FFSS (~24 April) and later FFSE (~30 October). Statistically significant (p<0.05) trends in FFSL were all positive (increasing), ranged between 0 and 13.5 days/decade, and were mainly observed in the east, upper west, centre and upper north of Finland. These lengthening trends in FFS were attributable to both earlier FFSS and later FFSE, mostly over central and upper northern Finland, but only to later FFSE in the eastern and upper western parts. Variations in both FFSL and FFSS were significantly associated with the Polar (POL) pattern over northern Finland, and with the East Atlantic (EA) pattern over eastern and upper western areas. However, the POL and Scandinavia (SCA) patterns were the most influential CTs for FFSE variability over northern Finland.
Keywords: climate teleconnections, Finland, frost-free season, trend analysis
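The FFS definition in the abstract above (last spring frost as FFSS, first fall frost as FFSE, 0°C threshold on Tmin) can be illustrated with a minimal sketch for one grid cell and one year. Several details are assumptions made here for illustration and are not stated in the abstract: a frost day is taken as Tmin strictly below 0°C, spring and fall are separated at 1 July, and FFSL is counted as the number of days from FFSS to FFSE. The function name `frost_free_season` is hypothetical.

```python
from datetime import date, timedelta

FROST_THRESHOLD_C = 0.0  # 0°C threshold on daily minimum temperature (from the abstract)


def frost_free_season(tmin, year, split_month_day=(7, 1)):
    """Return (FFSS date, FFSE date, FFSL in days) for one year of daily Tmin.

    tmin: daily minimum temperatures in °C, starting on 1 January of `year`.
    split_month_day: assumed boundary between "spring" and "fall" frosts.
    Returns None if no spring frost or no fall frost is found.
    """
    start = date(year, 1, 1)
    split = (date(year, *split_month_day) - start).days

    # Assumption: a frost day is a day with Tmin strictly below the threshold.
    spring_frosts = [i for i in range(split) if tmin[i] < FROST_THRESHOLD_C]
    fall_frosts = [i for i in range(split, len(tmin)) if tmin[i] < FROST_THRESHOLD_C]
    if not spring_frosts or not fall_frosts:
        return None

    last_spring = spring_frosts[-1]   # FFSS: last spring frost
    first_fall = fall_frosts[0]       # FFSE: first fall frost
    ffss = start + timedelta(days=last_spring)
    ffse = start + timedelta(days=first_fall)
    # Assumption: FFSL is the day difference between FFSE and FFSS.
    return ffss, ffse, first_fall - last_spring


# Synthetic series (not real data): frost until 24 April, mild until 29 October.
synthetic_tmin = [-5.0] * 114 + [5.0] * 188 + [-5.0] * 63
result = frost_free_season(synthetic_tmin, 2011)
```

Applied to every grid cell and year, the three returned values would give the series on which trend and teleconnection analyses of the kind described could be run.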
Procedia PDF Downloads 203
197 The Role of Oceanic Environmental Conditions on Catch of Sardinella spp. in Ghana
Authors: Emmanuel Okine Neokye Serge Dossou Martin Iniga Bortey Nketia Alabi-Doku
Abstract:
Fish stock distribution is greatly influenced by oceanographic environmental conditions. Temporal variations of temperature and other oceanic properties resulting from climate change have been documented to have a strong impact on fisheries and aquaculture. In Ghana, Sardinella species are one of the most important fisheries resources; they constitute about 60% of the total catch of coastal fisheries and are more predominant during the upwelling season. The present study investigated the role of physical oceanographic environmental conditions in the catches of the Sardinella species S. aurita and S. maderensis landed in Ghana. Furthermore, we examined the relationship between environmental conditions and catches of Sardinella species for seasonal and interannual variations between 2005 and 2015. For oceanographic environmental factors, we used comprehensive datasets consisting of: (1) daily in situ SST data obtained at two coastal stations in Ghana, (i) Cape 3 Points (4.7° N, -2.09° W) and (ii) Tema (5° N, 0° E), for the period 2005–2015; (2) monthly SST data (MOAA GPV) from JAMSTEC; and (3) gridded 10-metre wind data from the CCMP reanalysis. The analysis showed that higher (lower) wind velocity forms stronger (weaker) coastal upwelling, which is detected as lower (higher) SST and results in a higher (lower) catch of Sardinella spp., in both seasonal and interannual variations. It was also observed that the catchability of small pelagic fish species such as Sardinella spp. depends on the intensity of the coastal upwelling. Moreover, the Atlantic Meridional Mode index (a climatic index) was identified as a possible factor in the interannual variation in the catch of small pelagic fish species.
Keywords: Sardinella spp., fish, climate change, Ghana
Procedia PDF Downloads 12
196 Sea Level Characteristics Referenced to Specific Geodetic Datum in Alexandria, Egypt
Authors: Ahmed M. Khedr, Saad M. Abdelrahman, Kareem M. Tonbol
Abstract:
Two geo-referenced sea level datasets (September 2008 – November 2010 and April 2012 – January 2014) were recorded at Alexandria Western Harbour (AWH). An accurate re-definition of the tidal datum, referred to the latest International Terrestrial Reference Frame (ITRF-2014), was discussed and updated to improve our understanding of the old predefined tidal datum at Alexandria. Tidal and non-tidal components of sea level were separated using the tide suite of the Delft-3D hydrodynamic model (Delft-3D, 2015). Tidal characteristics at AWH were investigated, and harmonic analysis yielded the 34 most significant constituents with their amplitudes and phases. The tide was identified as semi-diurnal, as indicated by a “Form Factor” of 0.24 and 0.25 for the two datasets, respectively. The principal tidal datums related to major tidal phenomena were recalculated with reference to a meaningful geodetic height datum. The portion of residual (surge) energy in the total sea level energy was computed for each dataset and found to be 77% and 72%, respectively. Power spectral density (PSD) showed accurate resolvability in the high band (1–6 cycles/day) for the nominated independent constituents, except for some neighbouring constituents that are too close in frequency. Wind and atmospheric pressure data covering the sea level recording periods were analysed and cross-correlated with the surge signals. A moderate association between the surge and the wind and atmospheric pressure data was obtained. In addition, the long-term sea level rise trend at AWH was computed and showed good agreement with earlier estimated rates.
Keywords: Alexandria, Delft-3D, Egypt, geodetic reference, harmonic analysis, sea level
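The “Form Factor” cited in the abstract above is commonly computed from the amplitudes of four harmonic constituents as F = (K1 + O1) / (M2 + S2). The sketch below shows that calculation and a conventional four-class interpretation. The class boundaries (0.25, 1.5, 3.0) and their inclusive or exclusive treatment vary between textbooks and are an assumption here; the boundary at 0.25 is treated as inclusive so that it agrees with the abstract's reading of 0.25 as semi-diurnal. The amplitudes in the usage lines are made-up illustrative values, not the AWH results.

```python
def form_factor(k1, o1, m2, s2):
    """Tidal form factor F = (K1 + O1) / (M2 + S2), from constituent amplitudes."""
    return (k1 + o1) / (m2 + s2)


def classify_tide(f):
    """Map a form factor to a conventional tidal regime.

    Boundaries are assumed conventions and may differ between references.
    """
    if f <= 0.25:
        return "semidiurnal"
    if f <= 1.5:
        return "mixed, mainly semidiurnal"
    if f <= 3.0:
        return "mixed, mainly diurnal"
    return "diurnal"


# Hypothetical amplitudes in metres, chosen only to illustrate the arithmetic.
f = form_factor(k1=0.02, o1=0.01, m2=0.08, s2=0.045)
regime = classify_tide(f)
```

With these illustrative amplitudes, the diurnal pair sums to 0.03 m and the semidiurnal pair to 0.125 m, so F is about 0.24.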
Procedia PDF Downloads 165
195 Quality Analysis of Vegetables Through Image Processing
Authors: Abdul Khalique Baloch, Ali Okatan
Abstract:
Quality analysis of food and vegetables from images is currently a hot topic, with researchers improving on previous findings through different techniques and methods. In this research, we reviewed the literature, identified gaps in it, proposed an improved approach, designed the algorithm, and developed software to measure quality from images; the accuracy obtained compares favourably with previous work. The application uses an open-source dataset and the Python language with the TensorFlow Lite framework. The research focuses on sorting food and vegetables from images: the application sorts and grades the items after processing the images, and it can make fewer errors than manual, human-based grading. Digital image datasets were created, and the collected images were arranged by class. The classification accuracy of the system was about 94%. Because fruits and vegetables play a major role in day-to-day life, their quality is essential in evaluating agricultural produce, and customers always want to buy good-quality fruits and vegetables. This document is about quality detection of fruits and vegetables using images. Many customers suffer from unhealthy food and vegetables provided by suppliers, and no proper quality measurement is followed by hotel managements. We have developed software that measures the quality of fruits and vegetables from images and reports whether they are fresh or rotten. The techniques reviewed in this thesis include digital images, ResNet, VGG16, CNN, and transfer learning for grading and feature extraction. A framework for the system is also designed.
Keywords: deep learning, computer vision, image processing, rotten fruit detection, fruits quality criteria, vegetables quality criteria
Procedia PDF Downloads 70
194 Local Directional Encoded Derivative Binary Pattern Based Coral Image Classification Using Weighted Distance Gray Wolf Optimization Algorithm
Authors: Annalakshmi G., Sakthivel Murugan S.
Abstract:
This paper presents a local directional encoded derivative binary pattern (LDEDBP) feature extraction method that can be applied to the classification of submarine coral reef images. The classification of coral reef images using texture features is difficult due to the dissimilarities in class samples. In coral reef image classification, texture features are extracted using the proposed LDEDBP method. The proposed approach extracts the complete structural arrangement of the local region using the local binary pattern (LBP) and also extracts edge information using the local directional pattern (LDP) from the edge responses available in a particular region, thereby achieving an extra discriminative feature value. Typically, the LDP extracts the edge details in all eight directions. Integrating edge responses with the local binary pattern yields a more robust texture descriptor than the other descriptors used in texture feature extraction methods. Finally, the proposed technique is applied to an extreme learning machine (ELM) combined with a meta-heuristic algorithm known as the weighted distance grey wolf optimizer (WDGWO) to optimize the input weights and biases of single-hidden-layer feed-forward neural networks (SLFN). In the empirical results, ELM-WDGWO demonstrated better accuracy on all coral datasets, namely RSMAS, EILAT, EILAT2, and MLC, than other state-of-the-art algorithms. The proposed method achieves the highest overall classification accuracy, 94%, among the state-of-the-art methods compared.
Keywords: feature extraction, local directional pattern, ELM classifier, GWO optimization
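As background for the abstract above, the basic local binary pattern (LBP) that LDEDBP builds on can be sketched in a few lines. This is the plain 3x3 LBP only, not the authors' LDEDBP descriptor, and it omits the LDP edge responses entirely. The neighbour ordering (clockwise from the top-left pixel, least significant bit first) and the `>=` comparison are conventions assumed here; other orderings are equally valid as long as they are applied consistently.

```python
# Neighbour offsets (row, col) in a 3x3 patch, clockwise from the top-left pixel.
NEIGHBOURS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 patch of grey levels."""
    center = patch[1][1]
    code = 0
    for bit, (r, c) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_image(img):
    """Return LBP codes for every interior pixel of a 2-D grey-level image."""
    height, width = len(img), len(img[0])
    return [
        [lbp_code([row[c - 1:c + 2] for row in img[r - 1:r + 2]])
         for c in range(1, width - 1)]
        for r in range(1, height - 1)
    ]


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch)
```

In a texture pipeline, a histogram of these codes over an image region would typically serve as the feature vector passed to a classifier.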
Procedia PDF Downloads 163
193 Developing a DNN Model for the Production of Biogas From a Hybrid BO-TPE System in an Anaerobic Wastewater Treatment Plant
Authors: Hadjer Sadoune, Liza Lamini, Scherazade Krim, Amel Djouadi, Rachida Rihani
Abstract:
Deep neural networks are highly regarded for their accuracy in predicting intricate fermentation processes. Their ability to learn from large amounts of data makes them particularly effective models. The primary obstacle in improving the performance of these models is the careful choice of suitable hyperparameters, including the neural network architecture (number of hidden layers and hidden units), activation function, optimizer, learning rate, and other relevant factors. This study predicts biogas production from real wastewater treatment plant data using a hybrid approach: Bayesian optimization with a tree-structured Parzen estimator (BO-TPE) to tune a deep neural network (DNN) model. The plant uses an Upflow Anaerobic Sludge Blanket (UASB) digester that treats industrial wastewater from soft drink plants and breweries. The digester has a working volume of 1574 m³ and a total volume of 1914 m³. Its internal diameter and height are 19 m and 7.14 m, respectively. Data preprocessing was conducted with careful attention to preserving data quality while avoiding data reduction. Three normalization techniques (MinMaxScaler, RobustScaler and StandardScaler) were applied to the pre-processed data and compared with the non-normalized data. The RobustScaler approach showed strong predictive ability for estimating the volume of biogas produced. The highest predicted biogas volume was 2236.105 Nm³/d, with coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE) values of 0.712, 164.610, and 223.429, respectively.
Keywords: anaerobic digestion, biogas production, deep neural network, hybrid bo-tpe, hyperparameters tuning
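The three evaluation metrics reported in the abstract above (R², MAE and RMSE) have standard definitions, sketched below in plain Python. The function names and the toy values in the usage lines are illustrative assumptions; they are not the plant data or the authors' code.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (SS_tot > 0).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values chosen only to make the arithmetic easy to follow.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
scores = (mae(observed, predicted), rmse(observed, predicted), r2(observed, predicted))
```

MAE and RMSE carry the units of the target (Nm³/d for the biogas volumes in the abstract), whereas R² is dimensionless, which is why the three are usually reported together.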
Procedia PDF Downloads 38
192 Alternating Expectation-Maximization Algorithm for a Bilinear Model in Isoform Quantification from RNA-Seq Data
Authors: Wenjiang Deng, Tian Mou, Yudi Pawitan, Trung Nghia Vu
Abstract:
Estimation of isoform-level gene expression from RNA-seq data depends on simplifying assumptions, such as a uniform read distribution, that are easily violated in real data. Such violations typically lead to biased estimates. Most existing methods provide one or more bias correction steps that are based on biological considerations, such as GC content, and are applied to single samples separately. The main problem is that not all biases are known. For example, new technologies such as single-cell RNA-seq (scRNA-seq) may introduce new sources of bias not seen in bulk-cell data. This study introduces a method called XAEM, based on a more flexible and robust statistical model. Existing methods are essentially based on a linear model Xβ, where the design matrix X is known and derived from the simplifying assumptions. In contrast, XAEM treats Xβ as a bilinear model with both X and β unknown. Joint estimation of X and β is made possible by simultaneous analysis of multi-sample RNA-seq data. Compared to existing methods, XAEM automatically performs empirical correction of potentially unknown biases. XAEM implements an alternating expectation-maximization (AEM) algorithm, alternating between estimation of X and β. For speed, XAEM uses quasi-mapping for read alignment, which leads to a fast algorithm. Overall, XAEM performs favorably compared to other recent advanced methods. For simulated datasets, XAEM obtains higher accuracy for multiple-isoform genes, particularly for paralogs. In a differential-expression analysis of a real scRNA-seq dataset, XAEM achieves substantially greater rediscovery rates in an independent validation set.
Keywords: alternating EM algorithm, bias correction, bilinear model, gene expression, RNA-seq
Procedia PDF Downloads 142
191 Hybridization of Manually Extracted and Convolutional Features for Classification of Chest X-Ray of COVID-19
Authors: M. Bilal Ishfaq, Adnan N. Qureshi
Abstract:
COVID-19 is a highly infectious disease. It was first reported in Wuhan, the capital city of Hubei in China, and then spread rapidly throughout the whole world. On 11 March 2020, the World Health Organisation (WHO) declared it a pandemic. Because COVID-19 is highly contagious, it has affected approximately 219M people worldwide and caused 4.55M deaths. It has brought the importance of accurate diagnosis of respiratory diseases such as pneumonia and COVID-19 to the forefront. In this paper, we propose a hybrid approach for the automated detection of COVID-19 using medical imaging, based on the hybridization of manually extracted and convolutional features. Our approach combines Haralick texture features and convolutional features extracted from chest X-rays and CT scans. We also employ a minimum redundancy maximum relevance (MRMR) feature selection algorithm to reduce computational complexity and enhance classification performance. The proposed model is evaluated on four publicly available datasets: Chest X-ray Pneumonia, COVID-19 Pneumonia, COVID-19 CTMaster, and VinBig data. The results demonstrate high accuracy and effectiveness, with 0.9925 on the Chest X-ray Pneumonia dataset, 0.9895 on the COVID-19, Pneumonia and Normal Chest X-ray dataset, 0.9806 on the Covid CTMaster dataset, and 0.9398 on the VinBig dataset. We further evaluate the effectiveness of the proposed model using ROC curves, where the AUC for the best-performing model reaches 0.96. Our proposed model provides a promising tool for the early detection and accurate diagnosis of COVID-19, which can assist healthcare professionals in making informed treatment decisions and improving patient outcomes. The results of the proposed model are quite plausible, and the system can be deployed in a clinical or research setting to assist in the diagnosis of COVID-19.
Keywords: COVID-19, feature engineering, artificial neural networks, radiology images
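The Haralick texture features mentioned in the abstract above are statistics of a grey-level co-occurrence matrix (GLCM). The sketch below builds a normalized GLCM for one pixel offset and computes three of the commonly used statistics. It is a generic illustration only: the choice of offset, the symmetric counting, the small number of grey levels and the function names are all assumptions, and neither the convolutional features nor the MRMR selection step of the described model is reproduced.

```python
def glcm(img, levels, d_row=0, d_col=1, symmetric=True):
    """Normalized grey-level co-occurrence matrix for one pixel offset.

    img: 2-D list of integer grey levels in range(levels).
    (d_row, d_col): offset from a pixel to its paired neighbour.
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(img), len(img[0])
    for r in range(height):
        for c in range(width):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < height and 0 <= c2 < width:
                i, j = img[r][c], img[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * p(i, j): large when neighbouring grey levels differ."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Angular second moment: sum of squared co-occurrence probabilities."""
    return sum(v * v for row in p for v in row)


def homogeneity(p):
    """Inverse difference moment: sum of p(i, j) / (1 + (i - j)^2)."""
    return sum(v / (1 + (i - j) ** 2) for i, row in enumerate(p) for j, v in enumerate(row))


# Small 4-level test image, used only to exercise the functions.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = glcm(image, levels=4)
features = (contrast(p), energy(p), homogeneity(p))
```

In practice such statistics are usually computed for several offsets and angles and concatenated into a single hand-crafted feature vector.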
Procedia PDF Downloads 75
190 Transforming Data into Knowledge: Mathematical and Statistical Innovations in Data Analytics
Authors: Zahid Ullah, Atlas Khan
Abstract:
The rapid growth of data in various domains has created a pressing need for effective methods to transform this data into meaningful knowledge. In this era of big data, mathematical and statistical innovations play a crucial role in unlocking insights and facilitating informed decision-making in data analytics. This abstract aims to explore the transformative potential of these innovations and their impact on converting raw data into actionable knowledge. Drawing upon a comprehensive review of existing literature, this research investigates the cutting-edge mathematical and statistical techniques that enable the conversion of data into knowledge. By evaluating their underlying principles, strengths, and limitations, we aim to identify the most promising innovations in data analytics. To demonstrate the practical applications of these innovations, real-world datasets will be utilized through case studies or simulations. This empirical approach will showcase how mathematical and statistical innovations can extract patterns, trends, and insights from complex data, enabling evidence-based decision-making across diverse domains. Furthermore, a comparative analysis will be conducted to assess the performance, scalability, interpretability, and adaptability of different innovations. By benchmarking against established techniques, we aim to validate the effectiveness and superiority of the proposed mathematical and statistical innovations in data analytics. Ethical considerations surrounding data analytics, such as privacy, security, bias, and fairness, will be addressed throughout the research. Guidelines and best practices will be developed to ensure the responsible and ethical use of mathematical and statistical innovations in data analytics. 
The expected contributions of this research include advancements in mathematical and statistical sciences, improved data analysis techniques, enhanced decision-making processes, and practical implications for industries and policymakers. The outcomes will guide the adoption and implementation of mathematical and statistical innovations, empowering stakeholders to transform data into actionable knowledge and drive meaningful outcomes.
Keywords: data analytics, mathematical innovations, knowledge extraction, decision-making
Procedia PDF Downloads 75
189 The Rapid Industrialization Model
Authors: Fredrick Etyang
Abstract:
This paper presents a Rapid Industrialization Model (RIM) designed to support existing industrialization policies, strategies and industrial development plans at national, regional and constituent levels in Africa. The model will reinforce efforts by state and non-state actors towards the attainment of inclusive and sustainable industrialization of Africa. The overall objective of this model is to serve as a framework for rapid industrialization in developing economies. Its specific objectives range from supporting rapid industrial development to promoting structural change in the economy, balanced regional industrial growth, and local, regional and international competitiveness in areas of clear comparative advantage in industrial exports. Ultimately, the RIM will serve as a step-by-step guideline for the industrialization of African economies. The model is the product of a scientific research process underpinned by desk research, through a review of African countries' development plans, strategies, datasets and industrialization efforts, and by consultation with key informants. The rigorous research process unearthed multi-directional and renewed efforts towards the industrialization of Africa, premised on the collective commitment of individual states, regional economic communities and the African Union Commission, among other strategic stakeholders. It was further established that the inputs into the industrialization of Africa outstrip the levels of industrial development on the continent. The RIM serves as a step-by-step framework for African countries to follow in their industrial development efforts, transforming inputs into tangible outputs and outcomes in the short, intermediate and long run.
The model postulates three stages of industrialization and three phases toward rapid industrialization of African economies. It is simple to understand, easy to implement and contextualize, and offers a high return for each unit invested in industrialization supported by the model. Effective implementation of the model will therefore result in inclusive and sustainable rapid industrialization of Africa.
Keywords: economic development, industrialization, economic efficiency, exports and imports
Procedia PDF Downloads 84
188 Use Cloud-Based Watson Deep Learning Platform to Train Models Faster and More Accurate
Authors: Susan Diamond
Abstract:
Machine learning workloads have traditionally been run in high-performance computing (HPC) environments, where users log in to dedicated machines and use the attached GPUs to run training jobs on huge datasets. Training large neural network models is very resource intensive, and even after exploiting parallelism and accelerators such as GPUs, a single training job can still take days. Consequently, the cost of hardware is a barrier to entry. Even when upfront cost is not a concern, setting up such an HPC environment takes months, from acquiring the hardware to installing and configuring the right firmware and software. Furthermore, scalability is hard to achieve in a rigid traditional lab environment, which makes it slow to react to the dynamic changes in the artificial intelligence industry. Watson Deep Learning as a Service is a cloud-based deep learning platform that mitigates the long lead time and high upfront investment in hardware. It enables robust and scalable sharing of resources among the teams in an organization, and it is designed for on-demand cloud environments. Providing a similar user experience in a multi-tenant cloud environment comes with its own unique challenges regarding fault tolerance, performance, and security. Watson Deep Learning as a Service tackles these challenges and presents a deep learning stack for cloud environments in a secure, scalable and fault-tolerant manner. It supports a wide range of deep learning frameworks, such as TensorFlow, PyTorch, Caffe, Torch, Theano, and MXNet. These frameworks reduce the effort and skillset required to design, train, and use deep learning models. Deep Learning as a Service is used at IBM by AI researchers in areas including machine translation, computer vision, and healthcare.
Keywords: deep learning, machine learning, cognitive computing, model training
Procedia PDF Downloads 209
187 Digitally Mapping Aboriginal Journey Ways
Authors: Paul Longley Arthur
Abstract:
This paper reports on an Australian Research Council-funded project utilising the Australian digital research infrastructure the ‘Time-Layered Cultural Map of Australia’ (TLCMap) (https://www.tlcmap.org/) [1]. This resource has been developed to help researchers create digital maps from cultural, textual, and historical data, layered with datasets registered on the platform. TLCMap is a set of online tools that allows humanities researchers to compile humanities data using spatio-temporal coordinates: to upload, gather, analyse and visualise data. It is the only purpose-designed, Australian-developed research tool for humanities and social science researchers to identify geographical clusters and parallel journeys by sight. This presentation discusses a series of Aboriginal mapping and visualisation experiments using TLCMap to show how Indigenous knowledge can reconfigure contemporary understandings of space, including the urbanised landscape [2, 3]. The research data being generated, which investigates the historical movements of Aboriginal people, the distribution of networks, and their relation to land, lends itself to mapping and geo-spatial visualisation and analysis. TLCMap allows researchers to create layers on a 3D map that pinpoint locations with accompanying information, and this has enabled our research team to plot traditional historical journeys undertaken by Aboriginal people as well as to compile a gazetteer of Aboriginal place names, many of which have been largely undocumented until now [4]. The documented journeys intersect with and overlay many of today’s urban formations, including main roads, municipal boundaries, and state borders. The paper questions how such data can be incorporated into a more culturally and ethically responsive understanding of contemporary urban spaces as well as natural environments [5].
Keywords: spatio-temporal mapping, visualisation, Indigenous knowledge, mobility and migration, research infrastructure
Procedia PDF Downloads 18