Search results for: data stream mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25762

Search results for: data stream mining

25162 Brainbow Image Segmentation Using Bayesian Sequential Partitioning

Authors: Yayun Hsu, Henry Horng-Shing Lu

Abstract:

This paper proposes a data-driven, biology-inspired neural segmentation method of 3D drosophila Brainbow images. We use Bayesian Sequential Partitioning algorithm for probabilistic modeling, which can be used to detect somas and to eliminate cross talk effects. This work attempts to develop an automatic methodology for neuron image segmentation, which nowadays still lacks a complete solution due to the complexity of the image. The proposed method does not need any predetermined, risk-prone thresholds since biological information is inherently included in the image processing procedure. Therefore, it is less sensitive to variations in neuron morphology; meanwhile, its flexibility would be beneficial for tracing the intertwining structure of neurons.

Keywords: brainbow, 3D imaging, image segmentation, neuron morphology, biological data mining, non-parametric learning

Procedia PDF Downloads 482
25161 Occupational Safety and Health in the Wake of Drones

Authors: Hoda Rahmani, Gary Weckman

Abstract:

The body of research examining the integration of drones into various industries is expanding rapidly. Despite progress made in addressing the cybersecurity concerns for commercial drones, knowledge deficits remain in determining potential occupational hazards and risks of drone use to employees’ well-being and health in the workplace. This creates difficulty in identifying key approaches to risk mitigation strategies and thus reflects the need for raising awareness among employers, safety professionals, and policymakers about workplace drone-related accidents. The purpose of this study is to investigate the prevalence of and possible risk factors for drone-related mishaps by comparing the application of drones in construction with manufacturing industries. The chief reason for considering these specific sectors is to ascertain whether there exists any significant difference between indoor and outdoor flights since most construction sites use drones outside and vice versa. Therefore, the current research seeks to examine the causes and patterns of workplace drone-related mishaps and suggest possible ergonomic interventions through data collection. Potential ergonomic practices to mitigate hazards associated with flying drones could include providing operators with professional pieces of training, conducting a risk analysis, and promoting the use of personal protective equipment. For the purpose of data analysis, two data mining techniques, the random forest and association rule mining algorithms, will be performed to find meaningful associations and trends in data as well as influential features that have an impact on the occurrence of drone-related accidents in construction and manufacturing sectors. In addition, Spearman’s correlation and chi-square tests will be used to measure the possible correlation between different variables. Indeed, by recognizing risks and hazards, occupational safety stakeholders will be able to pursue data-driven and evidence-based policy change with the aim of reducing drone mishaps, increasing productivity, creating a safer work environment, and extending human performance in safe and fulfilling ways. This research study was supported by the National Institute for Occupational Safety and Health through the Pilot Research Project Training Program of the University of Cincinnati Education and Research Center Grant #T42OH008432.

Keywords: commercial drones, ergonomic interventions, occupational safety, pattern recognition

Procedia PDF Downloads 205
25160 Combination of Artificial Neural Network Model and Geographic Information System for Prediction Water Quality

Authors: Sirilak Areerachakul

Abstract:

Water quality has initiated serious management efforts in many countries. Artificial Neural Network (ANN) models are developed as forecasting tools in predicting water quality trend based on historical data. This study endeavors to automatically classify water quality. The water quality classes are evaluated using 6 factor indices. These factors are pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (T-Coliform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data consisted of 11 sites of Saen Saep canal in Bangkok, Thailand. The data is obtained from the Department of Drainage and Sewerage Bangkok Metropolitan Administration during 2007-2011. The results of multilayer perceptron neural network exhibit a high accuracy multilayer perception rate at 94.23% in classifying the water quality of Saen Saep canal in Bangkok. Subsequently, this encouraging result could be combined with GIS data improves the classification accuracy significantly.

Keywords: artificial neural network, geographic information system, water quality, computer science

Procedia PDF Downloads 339
25159 A Topological Approach for Motion Track Discrimination

Authors: Tegan H. Emerson, Colin C. Olson, George Stantchev, Jason A. Edelberg, Michael Wilson

Abstract:

Detecting small targets at range is difficult because there is not enough spatial information present in an image sub-region containing the target to use correlation-based methods to differentiate it from dynamic confusers present in the scene. Moreover, this lack of spatial information also disqualifies the use of most state-of-the-art deep learning image-based classifiers. Here, we use characteristics of target tracks extracted from video sequences as data from which to derive distinguishing topological features that help robustly differentiate targets of interest from confusers. In particular, we calculate persistent homology from time-delayed embeddings of dynamic statistics calculated from motion tracks extracted from a wide field-of-view video stream. In short, we use topological methods to extract features related to target motion dynamics that are useful for classification and disambiguation and show that small targets can be detected at range with high probability.

Keywords: motion tracks, persistence images, time-delay embedding, topological data analysis

Procedia PDF Downloads 111
25158 Unsupervised Domain Adaptive Text Retrieval with Query Generation

Authors: Rui Yin, Haojie Wang, Xun Li

Abstract:

Recently, mainstream dense retrieval methods have obtained state-of-the-art results on some datasets and tasks. However, they require large amounts of training data, which is not available in most domains. The severe performance degradation of dense retrievers on new data domains has limited the use of dense retrieval methods to only a few domains with large training datasets. In this paper, we propose an unsupervised domain-adaptive approach based on query generation. First, a generative model is used to generate relevant queries for each passage in the target corpus, and then the generated queries are used for mining negative passages. Finally, the query-passage pairs are labeled with a cross-encoder and used to train a domain-adapted dense retriever. Experiments show that our approach is more robust than previous methods in target domains that require less unlabeled data.

Keywords: dense retrieval, query generation, unsupervised training, text retrieval

Procedia PDF Downloads 66
25157 Artificial Reproduction System and Imbalanced Dataset: A Mendelian Classification

Authors: Anita Kushwaha

Abstract:

We propose a new evolutionary computational model called Artificial Reproduction System which is based on the complex process of meiotic reproduction occurring between male and female cells of the living organisms. Artificial Reproduction System is an attempt towards a new computational intelligence approach inspired by the theoretical reproduction mechanism, observed reproduction functions, principles and mechanisms. A reproductive organism is programmed by genes and can be viewed as an automaton, mapping and reducing so as to create copies of those genes in its off springs. In Artificial Reproduction System, the binding mechanism between male and female cells is studied, parameters are chosen and a network is constructed also a feedback system for self regularization is established. The model then applies Mendel’s law of inheritance, allele-allele associations and can be used to perform data analysis of imbalanced data, multivariate, multiclass and big data. In the experimental study Artificial Reproduction System is compared with other state of the art classifiers like SVM, Radial Basis Function, neural networks, K-Nearest Neighbor for some benchmark datasets and comparison results indicates a good performance.

Keywords: bio-inspired computation, nature- inspired computation, natural computing, data mining

Procedia PDF Downloads 270
25156 Design and Development of a Computerized Medical Record System for Hospitals in Remote Areas

Authors: Grace Omowunmi Soyebi

Abstract:

A computerized medical record system is a collection of medical information about a person that is stored on a computer. One principal problem of most hospitals in rural areas is using the file management system for keeping records. A lot of time is wasted when a patient visits the hospital, probably in an emergency, and the nurse or attendant has to search through voluminous files before the patient's file can be retrieved; this may cause an unexpected to happen to the patient. This data mining application is to be designed using a structured system analysis and design method which will help in a well-articulated analysis of the existing file management system, feasibility study, and proper documentation of the design and implementation of a computerized medical record system. This computerized system will replace the file management system and help to quickly retrieve a patient's record with increased data security, access clinical records for decision-making, and reduce the time range at which a patient gets attended to.

Keywords: programming, data, software development, innovation

Procedia PDF Downloads 81
25155 Estimation of Small Hydropower Potential Using Remote Sensing and GIS Techniques in Pakistan

Authors: Malik Abid Hussain Khokhar, Muhammad Naveed Tahir, Muhammad Amin

Abstract:

Energy demand has been increased manifold due to increasing population, urban sprawl and rapid socio-economic improvements. Low water capacity in dams for continuation of hydrological power, land cover and land use are the key parameters which are creating problems for more energy production. Overall installed hydropower capacity of Pakistan is more than 35000 MW whereas Pakistan is producing up to 17000 MW and the requirement is more than 22000 that is resulting shortfall of 5000 - 7000 MW. Therefore, there is a dire need to develop small hydropower to fulfill the up-coming requirements. In this regards, excessive rainfall, snow nurtured fast flowing perennial tributaries and streams in northern mountain regions of Pakistan offer a gigantic scope of hydropower potential throughout the year. Rivers flowing in KP (Khyber Pakhtunkhwa) province, GB (Gilgit Baltistan) and AJK (Azad Jammu & Kashmir) possess sufficient water availability for rapid energy growth. In the backdrop of such scenario, small hydropower plants are believed very suitable measures for more green environment and power sustainable option for the development of such regions. Aim of this study is to estimate hydropower potential sites for small hydropower plants and stream distribution as per steam network available in the available basins in the study area. The proposed methodology will focus on features to meet the objectives i.e. site selection of maximum hydropower potential for hydroelectric generation using well emerging GIS tool SWAT as hydrological run-off model on the Neelum, Kunhar and the Dor Rivers’ basins. For validation of the results, NDWI will be computed to show water concentration in the study area while overlaying on geospatial enhanced DEM. This study will represent analysis of basins, watershed, stream links, and flow directions with slope elevation for hydropower potential to produce increasing demand of electricity by installing small hydropower stations. Later on, this study will be benefitted for other adjacent regions for further estimation of site selection for installation of such small power plants as well.

Keywords: energy, stream network, basins, SWAT, evapotranspiration

Procedia PDF Downloads 217
25154 The Reduction of Post-Blast Fumes to Improve Productivity and Safety: A Review Paper

Authors: Nhleko Monique Chiloane

Abstract:

The gold mining industry has predominantly used ammonium nitrate fuel oil (ANFO) explosives for decades, although these are known to be “gassier” and their detonation results in toxic fumes, for example, carbon monoxide (CO), nitrogen oxides (NOx) and ammonia. Re-entry into underground workings too soon after blasting can lead to fatal exposure to toxic fumes. It is, therefore, required that the polluted air be removed from the affected areas within a reasonable period before employees' re-entry into the working area. Post-blast re-entry times have therefore been described as a productivity bottleneck. The known causes of post-blast fumes are water ingress, incorrect fuel to oxygen ratio, confinement, explosive additives etc. To prevent or minimize post-blast fumes, some researchers have used neutralization, re-burning technique and non-explosive products or different oxidizing agents. The use of commercial explosives without nitrate oxidizing agents can also minimize the production of blasting fumes and thereby reduce the time needed for the clearance of these fumes to allow workers to re-enter the underground workings safely. The reduction in non-production time directly contributes to an increase in the available time per shift for productive work, thus leading to continuous mining. However, owing to its low cost and ease of use, ANFO is still widely used in South African underground blasting operations.

Keywords: post-blast fumes, continuous mining, ammonium nitrate explosive, non-explosive blasting, re-entry period

Procedia PDF Downloads 178
25153 A Methodology for Developing New Technology Ideas to Avoid Patent Infringement: F-Term Based Patent Analysis

Authors: Kisik Song, Sungjoo Lee

Abstract:

With the growing importance of intangible assets recently, the impact of patent infringement on the business of a company has become more evident. Accordingly, it is essential for firms to estimate the risk of patent infringement risk before developing a technology and create new technology ideas to avoid the risk. Recognizing the needs, several attempts have been made to help develop new technology opportunities and most of them have focused on identifying emerging vacant technologies from patent analysis. In these studies, the IPC (International Patent Classification) system or keywords from text-mining application to patent documents was generally used to define vacant technologies. Unlike those studies, this study adopted F-term, which classifies patent documents according to the technical features of the inventions described in them. Since the technical features are analyzed by various perspectives by F-term, F-term provides more detailed information about technologies compared to IPC while more systematic information compared to keywords. Therefore, if well utilized, it can be a useful guideline to create a new technology idea. Recognizing the potential of F-term, this paper aims to suggest a novel approach to developing new technology ideas to avoid patent infringement based on F-term. For this purpose, we firstly collected data about F-term and then applied text-mining to the descriptions about classification criteria and attributes. From the text-mining results, we could identify other technologies with similar technical features of the existing one, the patented technology. Finally, we compare the technologies and extract the technical features that are commonly used in other technologies but have not been used in the existing one. These features are presented in terms of “purpose”, “function”, “structure”, “material”, “method”, “processing and operation procedure” and “control means” and so are useful for creating new technology ideas that help avoid infringing patent rights of other companies. Theoretically, this is one of the earliest attempts to adopt F-term to patent analysis; the proposed methodology can show how to best take advantage of F-term with the wealth of technical information. In practice, the proposed methodology can be valuable in the ideation process for successful product and service innovation without infringing the patents of other companies.

Keywords: patent infringement, new technology ideas, patent analysis, F-term

Procedia PDF Downloads 263
25152 Surveillance Video Summarization Based on Histogram Differencing and Sum Conditional Variance

Authors: Nada Jasim Habeeb, Rana Saad Mohammed, Muntaha Khudair Abbass

Abstract:

For more efficient and fast video summarization, this paper presents a surveillance video summarization method. The presented method works to improve video summarization technique. This method depends on temporal differencing to extract most important data from large video stream. This method uses histogram differencing and Sum Conditional Variance which is robust against to illumination variations in order to extract motion objects. The experimental results showed that the presented method gives better output compared with temporal differencing based summarization techniques.

Keywords: temporal differencing, video summarization, histogram differencing, sum conditional variance

Procedia PDF Downloads 345
25151 Improvement of Microstructure, Wear and Mechanical Properties of Modified G38NiCrMo8-4-4 Steel Used in Mining Industry

Authors: Mustafa Col, Funda Gul Koc, Merve Yangaz, Eylem Subasi, Can Akbasoglu

Abstract:

G38NiCrMo8-4-4 steel is widely used in mining industries, machine parts, gears due to its high strength and toughness properties. In this study, microstructure, wear and mechanical properties of G38NiCrMo8-4-4 steel modified with boron used in the mining industry were investigated. For this purpose, cast materials were alloyed by melting in an induction furnace to include boron with the rates of 0 ppm, 15 ppm, and 50 ppm (wt.) and were formed in the dimensions of 150x200x150 mm by casting into the sand mould. Homogenization heat treatment was applied to the specimens at 1150˚C for 7 hours. Then all specimens were austenitized at 930˚C for 1 hour, quenched in the polymer solution and tempered at 650˚C for 1 hour. Microstructures of the specimens were investigated by using light microscope and SEM to determine the effect of boron and heat treatment conditions. Changes in microstructure properties and material hardness were obtained due to increasing boron content and heat treatment conditions after microstructure investigations and hardness tests. Wear tests were carried out using a pin-on-disc tribometer under dry sliding conditions. Charpy V notch impact test was performed to determine the toughness properties of the specimens. Fracture and worn surfaces were investigated with scanning electron microscope (SEM). The results show that boron element has a positive effect on the hardness and wear properties of G38NiCrMo8-4-4 steel.

Keywords: G38NiCrMo8-4-4 steel, boron, heat treatment, microstructure, wear, mechanical properties

Procedia PDF Downloads 194
25150 Impact of Coal Mining on River Sediment Quality in the Sydney Basin, Australia

Authors: A. Ali, V. Strezov, P. Davies, I. Wright, T. Kan

Abstract:

The environmental impacts arising from mining activities affect the air, water, and soil quality. Impacts may result in unexpected and adverse environmental outcomes. This study reports on the impact of coal production on sediment in Sydney region of Australia. The sediment samples upstream and downstream from the discharge points from three mines were taken, and 80 parameters were tested. The results were assessed against sediment quality based on presence of metals. The study revealed the increment of metal content in the sediment downstream of the reference locations. In many cases, the sediment was above the Australia and New Zealand Environment Conservation Council and international sediment quality guidelines value (SQGV). The major outliers to the guidelines were nickel (Ni) and zinc (Zn).

Keywords: coal mine, environmental impact, produced water, sediment quality guidelines value (SQGV)

Procedia PDF Downloads 300
25149 Cultural Dynamics in Online Consumer Behavior: Exploring Cross-Country Variances in Review Influence

Authors: Eunjung Lee

Abstract:

This research investigates the intricate connection between cultural differences and online consumer behaviors by integrating Hofstede's Cultural Dimensions theory with analysis methodologies such as text mining, data mining, and topic analysis. Our aim is to provide a comprehensive understanding of how national cultural differences influence individuals' behaviors when engaging with online reviews. To ensure the relevance of our investigation, we systematically analyze and interpret the cultural nuances influencing online consumer behaviors, especially in the context of online reviews. By anchoring our research in Hofstede's Cultural Dimensions theory, we seek to offer valuable insights for marketers to tailor their strategies based on the cultural preferences of diverse global consumer bases. In our methodology, we employ advanced text mining techniques to extract insights from a diverse range of online reviews gathered globally for a specific product or service like Netflix. This approach allows us to reveal hidden cultural cues in the language used by consumers from various backgrounds. Complementing text mining, data mining techniques are applied to extract meaningful patterns from online review datasets collected from different countries, aiming to unveil underlying structures and gain a deeper understanding of the impact of cultural differences on online consumer behaviors. The study also integrates topic analysis to identify recurring subjects, sentiments, and opinions within online reviews. Marketers can leverage these insights to inform the development of culturally sensitive strategies, enhance target audience segmentation, and refine messaging approaches aligned with cultural preferences. Anchored in Hofstede's Cultural Dimensions theory, our research employs sophisticated methodologies to delve into the intricate relationship between cultural differences and online consumer behaviors. Applied to specific cultural dimensions, such as individualism vs. collectivism, masculinity vs. femininity, uncertainty avoidance, and long-term vs. short-term orientation, the study uncovers nuanced insights. For example, in exploring individualism vs. collectivism, we examine how reviewers from individualistic cultures prioritize personal experiences while those from collectivistic cultures emphasize communal opinions. Similarly, within masculinity vs. femininity, we investigate whether distinct topics align with cultural notions, such as robust features in masculine cultures and user-friendliness in feminine cultures. Examining information-seeking behaviors under uncertainty avoidance reveals how cultures differ in seeking detailed information or providing succinct reviews based on their comfort with ambiguity. Additionally, in assessing long-term vs. short-term orientation, the research explores how cultural focus on enduring benefits or immediate gratification influences reviews. These concrete examples contribute to the theoretical enhancement of Hofstede's Cultural Dimensions theory, providing a detailed understanding of cultural impacts on online consumer behaviors. As online reviews become increasingly crucial in decision-making, this research not only contributes to the academic understanding of cultural influences but also proposes practical recommendations for enhancing online review systems. Marketers can leverage these findings to design targeted and culturally relevant strategies, ultimately enhancing their global marketing effectiveness and optimizing online review systems for maximum impact.

Keywords: comparative analysis, cultural dimensions, marketing intelligence, national culture, online consumer behavior, text mining

Procedia PDF Downloads 43
25148 Catchment Yield Prediction in an Ungauged Basin Using PyTOPKAPI

Authors: B. S. Fatoyinbo, D. Stretch, O. T. Amoo, D. Allopi

Abstract:

This study extends the use of the Drainage Area Regionalization (DAR) method in generating synthetic data and calibrating PyTOPKAPI stream yield for an ungauged basin at a daily time scale. The generation of runoff in determining a river yield has been subjected to various topographic and spatial meteorological variables, which integers form the Catchment Characteristics Model (CCM). Many of the conventional CCM models adapted in Africa have been challenged with a paucity of adequate, relevance and accurate data to parameterize and validate the potential. The purpose of generating synthetic flow is to test a hydrological model, which will not suffer from the impact of very low flows or very high flows, thus allowing to check whether the model is structurally sound enough or not. The employed physically-based, watershed-scale hydrologic model (PyTOPKAPI) was parameterized with GIS-pre-processing parameters and remote sensing hydro-meteorological variables. The validation with mean annual runoff ratio proposes a decent graphical understanding between observed and the simulated discharge. The Nash-Sutcliffe efficiency and coefficient of determination (R²) values of 0.704 and 0.739 proves strong model efficiency. Given the current climate variability impact, water planner can now assert a tool for flow quantification and sustainable planning purposes.

Keywords: catchment characteristics model, GIS, synthetic data, ungauged basin

Procedia PDF Downloads 319
25147 E-Waste Generation in Bangladesh: Present and Future Estimation by Material Flow Analysis Method

Authors: Rowshan Mamtaz, Shuvo Ahmed, Imran Noor, Sumaiya Rahman, Prithvi Shams, Fahmida Gulshan

Abstract:

Last few decades have witnessed a phenomenal rise in the use of electrical and electronic equipment globally in our everyday life. As these items reach the end of their lifecycle, they turn into e-wastes and contribute to the waste stream. Bangladesh, in conformity with the global trend and due to its ongoing rapid growth, is also using electronics-based appliances and equipment at an increasing rate. This has caused a corresponding increase in the generation of e-wastes. Bangladesh is a developing country; its overall waste management system, is not yet efficient, nor is it environmentally sustainable. Most of its solid wastes are disposed of in a crude way at dumping sites. Addition of e-wastes, which often contain toxic heavy metals, into its waste stream has made the situation more difficult and challenging. Assessment of generation of e-wastes is an important step towards addressing the challenges posed by e-wastes, setting targets, and identifying the best practices for their management. Understanding and proper management of e-wastes is a stated item of the Sustainable Development Goals (SDG) campaign, and Bangladesh is committed to fulfilling it. A better understanding and availability of reliable baseline data on e-wastes will help in preventing illegal dumping, promote recycling, and create jobs in the recycling sectors and thus facilitate sustainable e-waste management. With this objective in mind, the present study has attempted to estimate the amount of e-wastes and its future generation trend in Bangladesh. To achieve this, sales data on eight selected electrical and electronic products (TV, Refrigerator, Fan, Mobile phone, Computer, IT equipment, CFL (Compact Fluorescent Lamp) bulbs, and Air Conditioner) have been collected from different sources. Primary and secondary data on the collection, recycling, and disposal of the e-wastes have also been gathered by questionnaire survey, field visits, interviews, and formal and informal meetings with the stakeholders. Material Flow Analysis (MFA) method has been applied, and mathematical models have been developed in the present study to estimate e-waste amounts and their future trends up to the year 2035 for the eight selected electrical and electronic equipment. End of life (EOL) method is adopted in the estimation. Model inputs are products’ annual sale/import data, past and future sales data, and average life span. From the model outputs, it is estimated that the generation of e-wastes in Bangladesh in 2018 is 0.40 million tons and by 2035 the amount will be 4.62 million tons with an average annual growth rate of 20%. Among the eight selected products, the number of e-wastes generated from seven products are increasing whereas only one product, CFL bulb, showed a decreasing trend of waste generation. The average growth rate of e-waste from TV sets is the highest (28%) while those from Fans and IT equipment are the lowest (11%). Field surveys conducted in the e-waste recycling sector also revealed that every year around 0.0133 million tons of e-wastes enter into the recycling business in Bangladesh which may increase in the near future.

Keywords: Bangladesh, end of life, e-waste, material flow analysis

Procedia PDF Downloads 191
25146 Statistical Analysis to Select Evacuation Route

Authors: Zaky Musyarof, Dwi Yono Sutarto, Dwima Rindy Atika, R. B. Fajriya Hakim

Abstract:

Each country should be responsible for the safety of people, especially responsible for the safety of people living in disaster-prone areas. One of those services is provides evacuation route for them. But all this time, the selection of evacuation route is seem doesn’t well organized, it could be seen that when a disaster happen, there will be many accumulation of people on the steps of evacuation route. That condition is dangerous to people because hampers evacuation process. By some methods in Statistical analysis, author tries to give a suggestion how to prepare evacuation route which is organized and based on people habit. Those methods are association rules, sequential pattern mining, hierarchical cluster analysis and fuzzy logic.

Keywords: association rules, sequential pattern mining, cluster analysis, fuzzy logic, evacuation route

Procedia PDF Downloads 498
25145 Heart Ailment Prediction Using Machine Learning Methods

Authors: Abhigyan Hedau, Priya Shelke, Riddhi Mirajkar, Shreyash Chaple, Mrunali Gadekar, Himanshu Akula

Abstract:

The heart is the coordinating centre of the major endocrine glandular structure of the body, which produces hormones that profoundly affect the operations of the body, and diagnosing cardiovascular disease is a difficult but critical task. By extracting knowledge and information about the disease from patient data, data mining is a more practical technique to help doctors detect disorders. We use a variety of machine learning methods here, including logistic regression and support vector classifiers (SVC), K-nearest neighbours Classifiers (KNN), Decision Tree Classifiers, Random Forest classifiers and Gradient Boosting classifiers. These algorithms are applied to patient data containing 13 different factors to build a system that predicts heart disease in less time with more accuracy.

Keywords: logistic regression, support vector classifier, k-nearest neighbour, decision tree, random forest and gradient boosting

Procedia PDF Downloads 44
25144 Convergence and Stability in Federated Learning with Adaptive Differential Privacy Preservation

Authors: Rizwan Rizwan

Abstract:

This paper provides an overview of Federated Learning (FL) and its application in enhancing data security, privacy, and efficiency. FL utilizes three distinct architectures to ensure privacy is never compromised. It involves training individual edge devices and aggregating their models on a server without sharing raw data. This approach not only provides secure models without data sharing but also offers a highly efficient privacy--preserving solution with improved security and data access. Also we discusses various frameworks used in FL and its integration with machine learning, deep learning, and data mining. In order to address the challenges of multi--party collaborative modeling scenarios, a brief review FL scheme combined with an adaptive gradient descent strategy and differential privacy mechanism. The adaptive learning rate algorithm adjusts the gradient descent process to avoid issues such as model overfitting and fluctuations, thereby enhancing modeling efficiency and performance in multi-party computation scenarios. Additionally, to cater to ultra-large-scale distributed secure computing, the research introduces a differential privacy mechanism that defends against various background knowledge attacks.

Keywords: federated learning, differential privacy, gradient descent strategy, convergence, stability, threats

Procedia PDF Downloads 25
25143 Advanced Technology for Natural Gas Liquids (NGL) Recovery Using Residue Gas Split

Authors: Riddhiman Sherlekar, Umang Paladia, Rachit Desai, Yash Patel

Abstract:

The competitive scenario of the oil and gas market is a challenge for today’s plant designers to achieve designs that meet client expectations with shrinking budgets, safety requirements, and operating flexibility. Natural Gas Liquids have three main industrial uses. They can be used as fuels, or as petrochemical feedstock or as refinery blends that can be further processed and sold as straight run cuts, such as naphtha, kerosene and gas oil. NGL extraction is not a chemical reaction. It involves the separation of heavier hydrocarbons from the main gas stream through pressure as temperature reduction, which depending upon the degree of NGL extraction may involve cryogenic process. Previous technologies i.e. short cycle dry desiccant absorption, Joule-Thompson or Low temperature refrigeration, lean oil absorption have been giving results of only 40 to 45% ethane recoveries, which were unsatisfying depending upon the current scenario of down turn market. Here new technology has been suggested for boosting up the recoveries of ethane+ up to 95% and up to 99% for propane+ components. Cryogenic plants provide reboiling to demethanizers by using part of inlet feed gas, or inlet feed split. If the two stream temperatures are not similar, there is lost work in the mixing operation unless the designer has access to some proprietary design. The concept introduced in this process consists of reboiling the demethanizer with the residue gas, or residue gas split. The innovation of this process is that it does not use the typical inlet gas feed split type of flow arrangement to reboil the demethanizer or deethanizer column, but instead uses an open heat pump scheme to that effect. The residue gas compressor provides the heat pump effect. The heat pump stream is then further cooled and entered in the top section of the column as a cold reflux. Because of the nature of this design, this process offers the opportunity to operate at full ethane rejection or recovery. The scheme is also very adaptable to revamp existing facilities. This advancement can be proven not only in enhancing the results but also provides operational flexibility, optimize heat exchange, introduces equipment cost reduction, opens a future for the innovative designs while keeping execution costs low.

Keywords: deethanizer, demethanizer, residue gas, NGL

Procedia PDF Downloads 264
25142 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa

Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana

Abstract:

Water is a very rare natural resource in South Africa. Ga-Selati River is used for both domestic and industrial purposes. This study was carried out in order to assess the quality of Ga-Selati River in a mining area of Limpopo Province-Phalaborwa. The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in wet and dry season respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in dry (929.29 mg/L) than in the wet season (640.72 mg/L) season. These values exceeded the recommended guideline of South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mS/m), respectively. Turbidity varied between 1.78-5.20 and 0.95-2.37 NTU in both wet and dry seasons. Total hardness of 312.50 mg/L and 297.75 mg/L as the concentration of CaCO3 was computed for the river in both the wet and the dry seasons and the river water was categorised as very hard. Mean concentration of the metals studied in both the wet and the dry seasons are: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L). Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.

Keywords: contamination, mining activities, surface water, trace metals

Procedia PDF Downloads 312
25141 Information Needs and Information Usage of the Older Person Club’s Members in Bangkok

Authors: Siriporn Poolsuwan

Abstract:

This research aims to explore the information needs, information usages, and problems of information usage of the older people club’s members in Dusit District, Bangkok. There are 12 clubs and 746 club’s members in this district. The research results use for older person service in this district. Data is gathered from 252 club’s members by using questionnaires. The quantitative approach uses in research by percentage, means and standard deviation. The results are as follows (1) The older people need Information for entertainment, occupation and academic in the field of short story, computer work, and religion and morality. (2) The participants use Information from various sources. (3) The Problem of information usage is their language skills because of the older people’s literacy problem.

Keywords: information behavior, older person, information seeking, knowledge discovery and data mining

Procedia PDF Downloads 266
25140 Fischer Tropsch Synthesis in Compressed Carbon Dioxide with Integrated Recycle

Authors: Kanchan Mondal, Adam Sims, Madhav Soti, Jitendra Gautam, David Carron

Abstract:

Fischer-Tropsch (FT) synthesis is a complex series of heterogeneous reactions between CO and H2 molecules (present in the syngas) on the surface of an active catalyst (Co, Fe, Ru, Ni, etc.) to produce gaseous, liquid, and waxy hydrocarbons. This product is composed of paraffins, olefins, and oxygenated compounds. The key challenge in applying the Fischer-Tropsch process to produce transportation fuels is to make the capital and production costs economically feasible relative to the comparative cost of existing petroleum resources. To meet this challenge, it is imperative to enhance the CO conversion while maximizing carbon selectivity towards the desired liquid hydrocarbon ranges (i.e. reduction in CH4 and CO2 selectivities) at high throughputs. At the same time, it is equally essential to increase the catalyst robustness and longevity without sacrificing catalyst activity. This paper focuses on process development to achieve the above. The paper describes the influence of operating parameters on Fischer Tropsch synthesis (FTS) from coal derived syngas in supercritical carbon dioxide (ScCO2). In addition, the unreacted gas and solvent recycle was incorporated and the effect of unreacted feed recycle was evaluated. It was expected that with the recycle, the feed rate can be increased. The increase in conversion and liquid selectivity accompanied by the production of narrower carbon number distribution in the product suggest that higher flow rates can and should be used when incorporating exit gas recycle. It was observed that this process was capable of enhancing the hydrocarbon selectivity (nearly 98 % CO conversion), reducing improving the carbon efficiency from 17 % to 51 % in a once through process and further converting 16 % CO2 to liquid with integrated recycle of the product gas stream and increasing the life of the catalyst. Catalyst robustness enhancement has been attributed to the absorption of heat of reaction by the compressed CO2 which reduced the formation of hotspots and the dissolution of waxes by the CO2 solvent which reduced the blinding of active sites. In addition, the recycling the product gas stream reduced the reactor footprint to one-fourth of the once through size and product fractionation utilizing the solvent effects of supercritical CO2 were realized. In addition to the negative CO2 selectivities, methane production was also inhibited and was limited to less than 1.5%. The effect of the process conditions on the life of the catalysts will also be presented. Fe based catalysts are known to have a high proclivity for producing CO2 during FTS. The data of the product spectrum and selectivity on Co and Fe-Co based catalysts as well as those obtained from commercial sources will also be presented. The measurable decision criteria were the increase in CO conversion at H2:CO ratio of 1:1 (as commonly found in coal gasification product stream) in supercritical phase as compared to gas phase reaction, decrease in CO2 and CH4 selectivity, overall liquid product distribution, and finally an increase in the life of the catalysts.

Keywords: carbon efficiency, Fischer Tropsch synthesis, low GHG, pressure tunable fractionation

Procedia PDF Downloads 236
25139 Geometrical Fluid Model for Blood Rheology and Pulsatile Flow in Stenosed Arteries

Authors: Karan Kamboj, Vikramjeet Singh, Vinod Kumar

Abstract:

Considering blood to be a non-Newtonian Carreau liquid, this indirect numerical model investigates the pulsatile blood flow in a constricted restricted conduit that has numerous gentle stenosis inside the view of an increasing body speed. Asymptotic answers are obtained for the flow rate, pressure inclination, speed profile, sheer divider pressure, and longitudinal impedance to stream after the use of the twofold irritation approach to the problem of the succeeding non-straight limit esteem. It has been observed that the speed of the blood increases when there is an increase in the point of tightening of the conduit, the body speed increase, and the power regulation file. However, this rheological manner of behaving changes to one of longitudinal impedance to stream and divider sheer pressure when each of the previously mentioned boundaries increases. It has also been seen that the sheer divider pressure in the bloodstream greatly increases when there is an increase in the maximum depth of the stenosis but that it significantly decreases when there is an increase in the pulsatile Reynolds number. This is an interesting phenomenon. The assessments of the amount of growth in the longitudinal resistance to flow increase overall with the increment of the maximum depth of the stenosis and the Weissenberg number. Additionally, it is noted that the average speed of blood increases noticeably with the growth of the point of tightening of the corridor, and body speed increases border. This is something that can be observed.

Keywords: geometry of artery, pulsatile blood flow, numerous stenosis

Procedia PDF Downloads 94
25138 Destination Port Detection For Vessels: An Analytic Tool For Optimizing Port Authorities Resources

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

Port authorities have many challenges in congested ports to allocate their resources to provide a safe and secure loading/ unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master based on many factors such as weather, wavelength and changes of priorities. Having access to a tool which leverages AIS messages to monitor vessel’s movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely, Reference Route of Trajectory (RRoT) to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring Automatic Identification System (AIS) messages. Our RRoT method creates a reference route based on historical AIS messages. It utilizes some of the best trajectory similarity measure to identify the destination of a vessel using their recent movement. We evaluated five different similarity measures such as Discrete Fr´echet Distance (DFD), Dynamic Time Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an fmeasure of 99.08% using Dynamic Time Warping (DTW) similarity measure.

Keywords: spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization

Procedia PDF Downloads 115
25137 Electrophysiological Correlates of Statistical Learning in Children with and without Developmental Language Disorder

Authors: Ana Paula Soares, Alexandrina Lages, Helena Oliveira, Francisco-Javier Gutiérrez-Domínguez, Marisa Lousada

Abstract:

From an early age, exposure to a spoken language allows us to implicitly capture the structure underlying the succession of the speech sounds in that language and to segment it into meaningful units (words). Statistical learning (SL), i.e., the ability to pick up patterns in the sensory environment even without intention or consciousness of doing it, is thus assumed to play a central role in the acquisition of the rule-governed aspects of language and possibly to lie behind the language difficulties exhibited by children with development language disorder (DLD). The research conducted so far has, however, led to inconsistent results, which might stem from the behavioral tasks used to test SL. In a classic SL experiment, participants are first exposed to a continuous stream (e.g., syllables) in which, unbeknownst to the participants, stimuli are grouped into triplets that always appear together in the stream (e.g., ‘tokibu’, ‘tipolu’), with no pauses between each other (e.g., ‘tokibutipolugopilatokibu’) and without any information regarding the task or the stimuli. Following exposure, SL is assessed by asking participants to discriminate between triplets previously presented (‘tokibu’) from new sequences never presented together during exposure (‘kipopi’), i.e., to perform a two-alternative-forced-choice (2-AFC) task. Despite the widespread use of the 2-AFC to test SL, it has come under increasing criticism as it is an offline post-learning task that only assesses the result of the learning that had occurred during the previous exposure phase and that might be affected by other factors beyond the computation of regularities embedded in the input, typically the likelihood two syllables occurring together, a statistic known as transitional probability (TP). One solution to overcome these limitations is to assess SL as exposure to the stream unfolds using online techniques such as event-related potentials (ERP) that is highly sensitive to the time-course of the learning in the brain. Here we collected ERPs to examine the neurofunctional correlates of SL in preschool children with DLD, and chronological-age typical language development (TLD) controls who were exposed to an auditory stream in which eight three-syllable nonsense words, four of which presenting high-TPs and the other four low-TPs, to further analyze whether the ability of DLD and TLD children to extract-word-like units from the steam was modulated by words’ predictability. Moreover, to ascertain if the previous knowledge of the to-be-learned-regularities affected the neural responses to high- and low-TP words, children performed the auditory SL task, firstly, under implicit, and, subsequently, under explicit conditions. Although behavioral evidence of SL was not obtained in either group, the neural responses elicited during the exposure phases of the SL tasks differentiated children with DLD from children with TLD. Specifically, the results indicated that only children from the TDL group showed neural evidence of SL, particularly in the SL task performed under explicit conditions, firstly, for the low-TP, and, subsequently, for the high-TP ‘words’. Taken together, these findings support the view that children with DLD showed deficits in the extraction of the regularities embedded in the auditory input which might underlie the language difficulties.

Keywords: development language disorder, statistical learning, transitional probabilities, word segmentation

Procedia PDF Downloads 186
25136 Study of the Transport of ²²⁶Ra Colloidal in Mining Context Using a Multi-Disciplinary Approach

Authors: Marine Reymond, Michael Descostes, Marie Muguet, Clemence Besancon, Martine Leermakers, Catherine Beaucaire, Sophie Billon, Patricia Patrier

Abstract:

²²⁶Ra is one of the radionuclides resulting from the disintegration of ²³⁸U. Due to its half-life (1600 y) and its high specific activity (3.7 x 1010 Bq/g), ²²⁶Ra is found at the ultra-trace level in the natural environment (usually below 1 Bq/L, i.e. 10-13 mol/L). Because of its decay in ²²²Rn, a radioactive gas with a shorter half-life (3.8 days) which is difficult to control and dangerous for humans when inhaled, ²²⁶Ra is subject to a dedicated monitoring in surface waters especially in the context of uranium mining. In natural waters, radionuclides occur in dissolved, colloidal or particular forms. Due to the size of colloids, generally ranging between 1 nm and 1 µm and their high specific surface areas, the colloidal fraction could be involved in the transport of trace elements, including radionuclides in the environment. The colloidal fraction is not always easy to determine and few existing studies focus on ²²⁶Ra. In the present study, a complete multidisciplinary approach is proposed to assess the colloidal transport of ²²⁶Ra. It includes water sampling by conventional filtration (0.2µm) and the innovative Diffusive Gradient in Thin Films technique to measure the dissolved fraction (<10nm), from which the colloidal fraction could be estimated. Suspended matter in these waters were also sampled and characterized mineralogically by X-Ray Diffraction, infrared spectroscopy and scanning electron microscopy. All of these data, which were acquired on a rehabilitated former uranium mine, allowed to build a geochemical model using the geochemical calculation code PhreeqC to describe, as accurately as possible, the colloidal transport of ²²⁶Ra. Colloidal transport of ²²⁶Ra was found, for some of the sampling points, to account for up to 95% of the total ²²⁶Ra measured in water. Mineralogical characterization and associated geochemical modelling highlight the role of barite, a barium sulfate mineral well known to trap ²²⁶Ra into its structure. Barite was shown to be responsible for the colloidal ²²⁶Ra fraction despite the presence of kaolinite and ferrihydrite, which are also known to retain ²²⁶Ra by sorption.

Keywords: colloids, mining context, radium, transport

Procedia PDF Downloads 154
25135 Hybridized Approach for Distance Estimation Using K-Means Clustering

Authors: Ritu Vashistha, Jitender Kumar

Abstract:

Clustering using the K-means algorithm is a very common way to understand and analyze the obtained output data. When a similar object is grouped, this is called the basis of Clustering. There is K number of objects and C number of cluster in to single cluster in which k is always supposed to be less than C having each cluster to be its own centroid but the major problem is how is identify the cluster is correct based on the data. Formulation of the cluster is not a regular task for every tuple of row record or entity but it is done by an iterative process. Each and every record, tuple, entity is checked and examined and similarity dissimilarity is examined. So this iterative process seems to be very lengthy and unable to give optimal output for the cluster and time taken to find the cluster. To overcome the drawback challenge, we are proposing a formula to find the clusters at the run time, so this approach can give us optimal results. The proposed approach uses the Euclidian distance formula as well melanosis to find the minimum distance between slots as technically we called clusters and the same approach we have also applied to Ant Colony Optimization(ACO) algorithm, which results in the production of two and multi-dimensional matrix.

Keywords: ant colony optimization, data clustering, centroids, data mining, k-means

Procedia PDF Downloads 125
25134 A Survey on Compression Methods for Table Constraints

Authors: N. Gharbi

Abstract:

Constraint Satisfaction problems are mathematical problems that are often used to model many real-world problems for which we look if there exists a solution satisfying all its constraints. Table constraints are important for modeling parts of many problems since they list all combinations of allowed or forbidden values. However, they admit practical limitations because they are sometimes too large to be represented in a direct way. In this paper, we present a survey of the different categories of the proposed approaches to compress table constraints in order to reduce both space and time complexities.

Keywords: constraint programming, compression, data mining, table constraints

Procedia PDF Downloads 322
25133 A Method to Evaluate and Compare Web Information Extractors

Authors: Patricia Jiménez, Rafael Corchuelo, Hassan A. Sleiman

Abstract:

Web mining is gaining importance at an increasing pace. Currently, there are many complementary research topics under this umbrella. Their common theme is that they all focus on applying knowledge discovery techniques to data that is gathered from the Web. Sometimes, these data are relatively easy to gather, chiefly when it comes from server logs. Unfortunately, there are cases in which the data to be mined is the data that is displayed on a web document. In such cases, it is necessary to apply a pre-processing step to first extract the information of interest from the web documents. Such pre-processing steps are performed using so-called information extractors, which are software components that are typically configured by means of rules that are tailored to extracting the information of interest from a web page and structuring it according to a pre-defined schema. Paramount to getting good mining results is that the technique used to extract the source information is exact, which requires to evaluate and compare the different proposals in the literature from an empirical point of view. According to Google Scholar, about 4 200 papers on information extraction have been published during the last decade. Unfortunately, they were not evaluated within a homogeneous framework, which leads to difficulties to compare them empirically. In this paper, we report on an original information extraction evaluation method. Our contribution is three-fold: a) this is the first attempt to provide an evaluation method for proposals that work on semi-structured documents; the little existing work on this topic focuses on proposals that work on free text, which has little to do with extracting information from semi-structured documents. b) It provides a method that relies on statistically sound tests to support the conclusions drawn; the previous work does not provide clear guidelines or recommend statistically sound tests, but rather a survey that collects many features to take into account as well as related work; c) We provide a novel method to compute the performance measures regarding unsupervised proposals; otherwise they would require the intervention of a user to compute them by using the annotations on the evaluation sets and the information extracted. Our contributions will definitely help researchers in this area make sure that they have advanced the state of the art not only conceptually, but from an empirical point of view; it will also help practitioners make informed decisions on which proposal is the most adequate for a particular problem. This conference is a good forum to discuss on our ideas so that we can spread them to help improve the evaluation of information extraction proposals and gather valuable feedback from other researchers.

Keywords: web information extractors, information extraction evaluation method, Google scholar, web

Procedia PDF Downloads 245