Search results for: graph mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1522


262 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis

Authors: C. B. Le, V. N. Pham

Abstract:

In modern data analysis, multi-source data appears more and more in real applications. Multi-source data clustering has emerged as an important issue in the data mining and machine learning community. Different data sources provide information about different data. Therefore, multi-source data linking is essential to improve clustering performance. However, in practice multi-source data is often heterogeneous, uncertain, and large. This is considered a major challenge of multi-source data. Ensemble is a versatile machine learning model in which learning techniques can work in parallel on big data. Clustering ensemble has been shown to outperform any standard clustering algorithm in terms of accuracy and robustness. However, most of the traditional clustering ensemble approaches are based on a single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis. The fuzzy optimized multi-objective clustering ensemble method is called FOMOCE. Firstly, a clustering ensemble mathematical model based on the structure of multi-objective clustering function, multi-source data, and dark knowledge is introduced. Then, rules for extracting dark knowledge from the input data, clustering algorithms, and base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. The experiments were performed on the standard sample data set. The experimental results demonstrate the superior performance of the FOMOCE method compared to the existing clustering ensemble methods and multi-source clustering methods.
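The abstract does not spell out the FOMOCE algorithm, so the following is only a minimal sketch of the general clustering-ensemble idea, using the classic co-association (evidence accumulation) approach rather than the authors' method. The function names, the 0.5 threshold, and the toy base clusterings are all assumptions made for illustration.

```python
def co_association(base_clusterings):
    """Fraction of base clusterings that put each pair of points together."""
    n = len(base_clusterings[0])
    m = len(base_clusterings)
    matrix = [[0.0] * n for _ in range(n)]
    for labels in base_clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0 / m
    return matrix


def consensus_clusters(base_clusterings, threshold=0.5):
    """Link points whose co-association exceeds the (assumed) threshold,
    then return the connected components as consensus cluster labels."""
    ca = co_association(base_clusterings)
    n = len(ca)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and ca[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base clusterings of five points.
base = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0],  # same partition as the first, different label names
    [0, 0, 0, 1, 1],
]
consensus = consensus_clusters(base)
```

Working on pairwise agreement rather than raw labels makes the consensus independent of how each base algorithm happens to name its clusters.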

Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering

Procedia PDF Downloads 189
261 Analyzing the Commentator Network Within the French YouTube Environment

Authors: Kurt Maxwell Kusterer, Sylvain Mignot, Annick Vignes

Abstract:

To the best of our knowledge, YouTube is the largest video hosting platform in the world. A high number of creators, viewers, subscribers and commentators act in this specific ecosystem, which generates huge sums of money. Views, subscribers, and comments help to increase the popularity of content creators. The most popular creators are sponsored by brands and participate in marketing campaigns. For a few of them, this becomes a financially rewarding profession. This is made possible through the YouTube Partner Program, which shares revenue among creators based on their popularity. We believe that the role of comments in increasing popularity is to be emphasized. In what follows, YouTube is considered as a bilateral network between the videos and the commentators. Analyzing a detailed data set focused on French YouTubers, we consider each comment as a link between a commentator and a video. Our research question asks which features of a video give it the highest probability of being commented on. Following on from this question, how can we use these features to predict the action of the agent in commenting on one video instead of another, considering the characteristics of the commentators, videos, topics, channels, and recommendations? We expect to see that the videos of more popular channels generate higher viewer engagement and thus are more frequently commented on. The interest lies in discovering features which have not classically been considered as markers for popularity on the platform. A quick view of our data set shows that 96% of the commentators comment only once on a certain video. Thus, we study a non-weighted bipartite network between commentators and videos built on the sub-sample of 96% of unique comments. A link exists between two nodes when a commentator makes a comment on a video. We run an Exponential Random Graph Model (ERGM) approach to evaluate which characteristics influence the probability of commenting on a video. 
The creation of a link will be explained in terms of common video features, such as duration, quality, number of likes, number of views, etc. Our data is relevant for the period of 2020-2021 and focuses on the French YouTube environment. From this set of 391 588 videos, we extract the channels which can be monetized according to YouTube regulations (channels with at least 1000 subscribers and more than 4000 hours of viewing time during the last twelve months). In the end, we have a data set of 128 462 videos from 4093 channels. Based on these videos, we have a data set of 1 032 771 unique commentators, with a mean of 2 comments per commentator, a minimum of 1 comment each, and a maximum of 584 comments.
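The data preparation described above can be sketched roughly as follows. This is not the authors' pipeline: the record layout, field names, and toy data are hypothetical, and repeated comments are simply collapsed into one link here, whereas the abstract describes restricting the sample to unique comments. Only the monetization thresholds (at least 1000 subscribers, more than 4000 viewing hours) come from the text.

```python
from collections import defaultdict


def monetizable(channel):
    """Thresholds quoted in the abstract: at least 1000 subscribers and
    more than 4000 hours of viewing time over the last twelve months."""
    return channel["subscribers"] >= 1000 and channel["watch_hours_12m"] > 4000


def build_bipartite_edges(comments, video_channel, channels):
    """Return unweighted (commentator, video) links for monetizable channels.
    A set is used, so several comments on one video yield a single link."""
    eligible = {cid for cid, ch in channels.items() if monetizable(ch)}
    return {
        (commentator, video)
        for commentator, video in comments
        if video_channel[video] in eligible
    }


def degrees(edges):
    """Degree of every commentator node and every video node."""
    by_commentator = defaultdict(int)
    by_video = defaultdict(int)
    for commentator, video in edges:
        by_commentator[commentator] += 1
        by_video[video] += 1
    return dict(by_commentator), dict(by_video)


# Hypothetical toy data.
channels = {
    "c1": {"subscribers": 5000, "watch_hours_12m": 9000},
    "c2": {"subscribers": 800, "watch_hours_12m": 12000},  # too few subscribers
}
video_channel = {"v1": "c1", "v2": "c1", "v3": "c2"}
comments = [("u1", "v1"), ("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u3", "v3")]

edges = build_bipartite_edges(comments, video_channel, channels)
commentator_degree, video_degree = degrees(edges)
```

An edge list of this shape is the usual input to a bipartite ERGM fit, which would then be carried out in a dedicated statistical package.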

Keywords: YouTube, social networks, economics, consumer behaviour

Procedia PDF Downloads 68
260 Atomic Town: History and Vernacular Heritage at the Mary Kathleen Uranium Mine in Australia

Authors: Erik Eklund

Abstract:

Mary Kathleen was a purpose-built company town located in northwest Queensland in Australia. It was created to work on a rich uranium deposit discovered in the area in July 1954. The town was complete by 1958, possessing curved streets, modern materials, and a progressive urban planning scheme. Formed in the minds of corporate executives and architects and made manifest in arid zone country between Cloncurry and Mount Isa, Mary Kathleen was a modern marvel in the outback, a town that tamed the wild country of northwest Queensland, or so it seemed. The town was also a product of the Cold War. In the context of a nuclear arms race between the Soviet Union and her allies, and the United States of America (USA) and her Allies, a rapid rush to locate, mine, and process uranium after 1944 led to the creation of uranium towns in Czechoslovakia, Canada, the Soviet Union, USA and Australia of which Mary Kathleen was one such example. Mary Kathleen closed in 1981, and most of the town’s infrastructure was removed. Since then, the town’s ghostly remains have attracted travellers and tourists. Never an officially-sanctioned tourist site, the area has nevertheless become a regular stop for campers and day trippers who have engaged with the site often without formal interpretation. This paper explores the status of this vernacular heritage and asks why it has not gained any official status and what visitors might see in the place despite its uncertain status.

Keywords: uranium mining, planned communities, official heritage, vernacular heritage, Australian history

Procedia PDF Downloads 89
259 Lipidomic Response to Neoadjuvant Chemoradiotherapy in Rectal Cancer

Authors: Patricia O. Carvalho, Marcia C. F. Messias, Salvador Sanchez Vinces, Caroline F. A. Gatinoni, Vitor P. Iordanu, Carlos A. R. Martinez

Abstract:

Lipidomics methods are widely used in the identification and validation of disease-specific biomarkers and therapy response evaluation. The present study aimed to identify a panel of potential lipid biomarkers to evaluate response to neoadjuvant chemoradiotherapy in rectal adenocarcinoma (RAC). Liquid chromatography–mass spectrometry (LC-MS)-based untargeted lipidomics was used to profile human serum samples from patients with clinical stage T2 or T3 resectable RAC, before and after chemoradiotherapy treatment. A total of 28 blood plasma samples were collected from 14 patients with RAC who were recruited at the São Francisco University Hospital (HUSF/USF). The study was approved by the ethics committee (CAAE 14958819.8.0000.5514). Univariate and multivariate statistical analyses were applied to explore dysregulated metabolic pathways using untargeted lipid profiling and data mining approaches. A total of 36 statistically significant altered lipids were identified, and the subsequent partial least-squares discriminant analysis model was both cross-validated (R2, Q2) and permutation-tested. Lysophosphatidylcholine (LPC) plasmalogens containing palmitoleic and oleic acids, with high variable importance in projection scores, showed a tendency to be lower after completion of chemoradiotherapy. Chemoradiotherapy seems to change plasmanyl-phospholipid levels, indicating that these lipids play an important role in RAC pathogenesis.

Keywords: lipidomics, neoadjuvant chemoradiotherapy, plasmalogens, rectal adenocarcinoma

Procedia PDF Downloads 131
258 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Su-Hyeon Jeon, ByeoungKug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, with manual categorization, not only can the accuracy of the categorization not be guaranteed, but the categorization also requires a large amount of time and huge costs. Many studies have been conducted towards the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorizing complex documents with multiple topics because the methods work by assuming that one document can be categorized into one category only. In order to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process involves training using a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To overcome the requirement of a multi-categorized training set in traditional multi-categorization algorithms, we previously proposed a new methodology that can extend the category of a single-categorized document to multiple categories by analyzing relationships among categories, topics, and documents. In this paper, we design a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: big data analysis, document classification, multi-category, text mining, topic analysis

Procedia PDF Downloads 272
257 Children and Migration in Ghana: Unveiling the Realities of Vulnerability and Social Exclusion

Authors: Thomas Yeboah

Abstract:

In contemporary times, the incessant movement of northern children, especially girls, to southern Ghana to the detriment of their education is worrisome. Due to the misplaced mindset of the migrants concerning southern Ghana, the majority of them move without an idea of where to stay and what to do, exposing them to harsh living conditions. The majority find menial work in cocoa farms, illegal mining and the head porterage business. This study was conducted in the Kumasi Metropolis to ascertain the major causes of child migration from the northern part of Ghana to the south and the migrants' living conditions. Both qualitative and quantitative tools of data collection and analysis were employed. The purposive sampling technique was used to select 90 migrants below 18 years. Specifically, interviews, focus group discussions and questionnaires were used to elicit responses from the units of analysis. The study revealed that the major cause of child migration from northern Ghana to the south is poverty. It was evident that respondents were vulnerable to the new environment in which they lived. They are exposed to harsh environmental conditions; sexual, verbal and physical assault; and harassment from armed robbers. The paper recommends that policy decisions should be able to create an enabling environment for the labour force in the north to ameliorate the compelling effects poverty has on child migration. Efforts should also be made to create a proper psychological climate in the minds of the children regarding their destination areas through sensitization and education.

Keywords: child migration, vulnerability, social exclusion, child labour, Ghana

Procedia PDF Downloads 443
256 Petrology Investigation of Apatite Minerals in the Esfordi Mine

Authors: Haleh Rezaei Zanjirabadi, Fatemeh Saberi, Bahman Rahimzadeh, Fariborz Masoudi, Mohammad Rahgosha

Abstract:

In this study, apatite minerals from the iron-phosphate deposit of Yazd have been investigated within the microcontinent zone of Iran in the Zagros structural zone. The geological units in the Esfordi area belong to the pre-Cambrian to lower-Cambrian age, consisting of a succession of carbonate rocks (dolomite), shale, tuff, sandstone, and volcanic rocks. In addition to the mentioned sedimentary and volcanic rocks, the granitoid mass of Bahabad, which is the largest intrusive mass in the region, has intruded into the eastern part of this series and has caused its metamorphism and alteration. After collecting the available data, various samples of Esfordi’s apatite were prepared, and their mineralogy and crystallography were investigated using laboratory methods such as petrographic microscopy, Raman spectroscopy, EDS, and SEM. In non-destructive Raman spectroscopy, the molecular structure of apatite minerals was revealed in four distinct spectral ranges. Initially, the spectra of phosphate and aluminum bonds with O2HO, OH, were observed, followed by the identification of Cl, OH, Al, Na, Ca and hydroxyl units depending on the type of apatite mineral family. In SEM analysis, based on various shapes and different phases of apatites, their constituent major elements were identified through EDS, indicating that the samples from the Esfordi mining area exhibit a dense and coherent texture with smooth surfaces. Based on the elemental analysis results by EDS, the apatites in the Esfordi area are classified into the calcic apatite group.

Keywords: petrology, apatite, Esfordi, EDS, SEM, Raman spectroscopy

Procedia PDF Downloads 61
255 Mitigation Measures for the Acid Mine Drainage Emanating from the Sabie Goldfield: Case Study of the Nestor Mine

Authors: Rudzani Lusunzi, Frans Waanders, Elvis Fosso-Kankeu, Robert Khashane Netshitungulwana

Abstract:

The Sabie Goldfield has a history of gold mining dating back more than a century. Acid mine drainage (AMD) from the Nestor mine tailings storage facility (MTSF) poses a serious threat to the nearby ecosystem, specifically the Sabie River system. This study aims at developing mitigation measures for the AMD emanating from the Nestor MTSF using materials from the Glynns Lydenburg MTSF. The Nestor MTSF (NM) and the Glynns Lydenburg MTSF (GM) each provided about 20 kg of bulk composite samples. Using samples from the Nestor MTSF and the Glynns Lydenburg MTSF, two mixtures were created. MIX-A is a mixture that contains 25 wt% GM and 75 wt% NM. MIX-B is the name given to the second mixture, which contains 50% AN and 50% AG. The same static tests, i.e., acid–base accounting (ABA), net acid generation (NAG), and acid buffering characteristics curve (ABCC), were used to estimate the acid-generating probabilities of samples NM and GM for MIX-A and MIX-B. Furthermore, the mineralogy of the Nestor MTSF samples consists of the primary acid-producing mineral pyrite as well as the secondary minerals ferricopiapite and jarosite, which are common in acidic conditions. The Glynns Lydenburg MTSF samples, on the other hand, contain the primary acid-neutralizing minerals calcite and dolomite. Based on the assessment conducted, materials from the Glynns Lydenburg MTSF are capable of neutralizing AMD from the Nestor MTSF. Therefore, the alkaline tailings materials from the Glynns Lydenburg MTSF can be used to rehabilitate the acidic Nestor MTSF.
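As a rough illustration of the acid-base accounting arithmetic behind such blends, the sketch below uses the standard ABA relations: acid potential AP = 31.25 × wt% sulphide sulphur (in kg CaCO3 per tonne), net neutralization potential NNP = NP − AP, and neutralization potential ratio NPR = NP / AP. The sulphur and NP values are invented for the example and are not measurements from the study, the second blend is assumed to be a 50/50 mix of NM and GM, and linear mass-weighted mixing is itself a simplifying assumption.

```python
SULPHUR_TO_AP = 31.25  # kg CaCO3/t per 1 wt% sulphide sulphur (standard ABA factor)


def sample(sulphide_s_wt_pct, neutralization_potential):
    """Build an ABA record; AP and NP are both in kg CaCO3 per tonne."""
    return {"AP": SULPHUR_TO_AP * sulphide_s_wt_pct, "NP": neutralization_potential}


def blend(fraction_gm, nm, gm):
    """Mass-weighted AP and NP of a blend (assumes simple linear mixing)."""
    f = fraction_gm
    return {
        "AP": (1 - f) * nm["AP"] + f * gm["AP"],
        "NP": (1 - f) * nm["NP"] + f * gm["NP"],
    }


def nnp(s):
    """Net neutralization potential: positive means net acid-consuming."""
    return s["NP"] - s["AP"]


def npr(s):
    """Neutralization potential ratio."""
    return s["NP"] / s["AP"]


# Hypothetical inputs: acidic Nestor tailings, alkaline Glynns Lydenburg tailings.
nm = sample(sulphide_s_wt_pct=2.0, neutralization_potential=5.0)
gm = sample(sulphide_s_wt_pct=0.2, neutralization_potential=250.0)

mix_a = blend(0.25, nm, gm)  # 25 wt% GM, 75 wt% NM
mix_b = blend(0.50, nm, gm)  # assumed 50/50 blend
```

Calling `nnp(mix_a)` and `npr(mix_a)` then gives the screening quantities that static tests such as ABA report for each blend.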

Keywords: Nestor Mine, acid mine drainage, mitigation, Sabie River system

Procedia PDF Downloads 85
254 Artificial Neural Networks Application on Nusselt Number and Pressure Drop Prediction in Triangular Corrugated Plate Heat Exchanger

Authors: Hany Elsaid Fawaz Abdallah

Abstract:

This study presents a new artificial neural network (ANN) model to predict the Nusselt number and pressure drop for the turbulent flow in a triangular corrugated plate heat exchanger for forced air and turbulent water flow. An experimental investigation was performed to create a new dataset for the Nusselt number and pressure drop values in the following range of dimensionless parameters: the plate corrugation angle (from 0° to 60°), the Reynolds number (from 10000 to 40000), the pitch-to-height ratio (from 1 to 4), and the Prandtl number (from 0.7 to 200). Based on the ANN performance graph, the three-layer structure with {12-8-6} hidden neurons has been chosen. The training procedure includes back-propagation with the biases and weight adjustment, the evaluation of the loss function for the training and validation dataset and feed-forward propagation of the input parameters. The linear function was used at the output layer as the activation function, while for the hidden layers, the rectified linear unit activation function was utilized. In order to accelerate the ANN training, the loss function was minimized with the adaptive moment estimation algorithm (ADAM). The "MinMax" normalization approach was utilized to avoid the increase in the training time due to drastic differences in the loss function gradients with respect to the values of weights. Since the test dataset is not used for the ANN training, a cross-validation technique is applied to the ANN network using the new data. This procedure was repeated until loss function convergence was achieved or for 4000 epochs with a batch size of 200 points. The program code was written in Python 3.0 using open-source ANN libraries such as scikit-learn, TensorFlow and Keras. The mean average percent error values of 9.4% for the Nusselt number and 8.2% for pressure drop for the ANN model have been achieved. Therefore, higher accuracy compared to the generalized correlations was achieved. 
The performance validation of the obtained model was based on a comparison of predicted data with the experimental results yielding excellent accuracy.
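Two of the numerical steps mentioned above can be sketched in plain Python. The min-max scaling follows the usual definition; the error metric is written as the common mean absolute percentage error, on the assumption that this is what the reported percent error refers to. The sample values are hypothetical and the study's own code is not reproduced here.

```python
def min_max_scale(values):
    """Rescale a feature to [0, 1] so inputs with very different magnitudes
    (e.g. Reynolds number vs. pitch-to-height ratio) become comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent; assumes no actual value is zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical Reynolds numbers within the range quoted in the abstract.
reynolds = [10000, 20000, 40000]
scaled = min_max_scale(reynolds)

# Hypothetical measured vs. predicted Nusselt numbers.
measured = [100.0, 200.0]
predicted = [90.0, 220.0]
error_pct = mean_absolute_percentage_error(measured, predicted)
```

In practice the minimum and maximum would be taken from the training set only and reused to scale validation and test inputs.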

Keywords: artificial neural networks, corrugated channel, heat transfer enhancement, Nusselt number, pressure drop, generalized correlations

Procedia PDF Downloads 87
253 Light-Weight Network for Real-Time Pose Estimation

Authors: Jianghao Hu, Hongyu Wang

Abstract:

An effective and efficient human pose estimation algorithm is important for real-time human pose estimation on mobile devices. This paper proposes a light-weight human key point detection algorithm, Light-Weight Network for Real-Time Pose Estimation (LWPE). LWPE uses a light-weight backbone network and depthwise separable convolutions to reduce parameters and lower latency. LWPE uses the feature pyramid network (FPN) to fuse the high-resolution, semantically weak features with the low-resolution, semantically strong features. In the meantime, with multi-scale prediction, the result predicted from the low-resolution feature map is stacked onto the adjacent higher-resolution feature map to provide intermediate supervision of the network and continuously refine the results. At the last step, the key point coordinates predicted at the highest resolution are used as the final output of the network. For the key points that are difficult to predict, LWPE adopts the online hard key points mining strategy to focus on the key points that are hard to predict. The proposed algorithm achieves excellent performance on the single-person dataset selected from the AI (artificial intelligence) challenge dataset. The algorithm maintains high-precision performance even though the model only contains 3.9M parameters, and it can run at 225 frames per second (FPS) on a generic graphics processing unit (GPU).
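Two ideas from this abstract lend themselves to a short sketch: the parameter saving of a depthwise separable convolution, and online hard key point mining expressed as averaging only the largest per-key-point losses. The layer sizes, the loss values, and the choice of `top_k` are assumptions for illustration; the abstract does not give LWPE's actual settings.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights of a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """One k x k depthwise filter per input channel, followed by a
    1 x 1 pointwise convolution that mixes channels (biases ignored)."""
    return kernel * kernel * c_in + c_in * c_out


def ohkm_loss(per_keypoint_losses, top_k):
    """Online hard key point mining: average only the top_k largest
    per-key-point losses so training focuses on the hardest joints."""
    hardest = sorted(per_keypoint_losses, reverse=True)[:top_k]
    return sum(hardest) / len(hardest)


# Hypothetical 3x3 layer with 64 input and 128 output channels.
full = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)

# Hypothetical per-key-point losses for one training sample.
losses = [0.1, 0.9, 0.3, 0.7]
hard_loss = ohkm_loss(losses, top_k=2)
```

For this hypothetical layer the separable form needs roughly an eighth of the weights, which is the kind of saving that makes a model of a few million parameters feasible.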

Keywords: depthwise separable convolutions, feature pyramid network, human pose estimation, light-weight backbone

Procedia PDF Downloads 154
252 Assessment of the Effect of Cu and Zn on the Growth of Two Chlorophytic Microalgae

Authors: Medina O. Kadiri, John E. Gabriel

Abstract:

Heavy metals are metallic elements with a relatively high density, at least five times that of water. The sources of heavy metal pollution in the environment include industrial, medical, agricultural, pharmaceutical, and domestic effluents, atmospheric sources, mining, foundries, smelting, and any heavy metal-based operation. Although some heavy metals in trace quantities are required for biological metabolism, their higher concentrations elicit toxicities. Others are distinctly toxic and have no biological function. Microalgae are the primary producers of aquatic ecosystems and, therefore, the foundation of the aquatic food chain. A study investigating the effects of copper and zinc on two chlorophytes, Chlorella vulgaris and Dictyosphaerium pulchellum, was done in the laboratory at concentrations of 0 mg/l, 2 mg/l, 4 mg/l, 6 mg/l, 8 mg/l, 10 mg/l, and 20 mg/l. The growth of the test microalgae was determined every two days for 14 days. The results showed that the effects of the test heavy metals were concentration-dependent. Of the two microalgae species tested, Chlorella vulgaris showed appreciable growth up to 8 mg/l concentration of zinc. Dictyosphaerium pulchellum had only minimal growth at different copper concentrations except for 2 mg/l, which seemed to have relatively higher growth. The growth of the control was remarkably higher than in other concentrations. Generally, the growth of both test algae was consistently inhibited by heavy metals. Comparatively, copper generally inhibited the growth of both algae more than zinc. Chlorella vulgaris can be used for bioremediation of high concentrations of zinc. The potential of many microalgae in heavy metal bioremediation can be explored.

Keywords: heavy metals, green algae, microalgae, pollution

Procedia PDF Downloads 195
251 An Automated Approach to the Nozzle Configuration of Polycrystalline Diamond Compact Drill Bits for Effective Cuttings Removal

Authors: R. Suresh, Pavan Kumar Nimmagadda, Ming Zo Tan, Shane Hart, Sharp Ugwuocha

Abstract:

Polycrystalline diamond compact (PDC) drill bits are extensively used in the oil and gas industry as well as the mining industry. Industry engineers continually improve upon PDC drill bit designs and hydraulic conditions. Optimized injection nozzles play a key role in improving the drilling performance and efficiency of these ever-changing PDC drill bits. In the first part of this study, computational fluid dynamics (CFD) modelling is performed to investigate the hydrodynamic characteristics of drilling fluid flow around the PDC drill bit. The open-source CFD software OpenFOAM simulates the flow around the drill bit, based on the field input data. A specifically developed console application integrates the entire CFD process, including domain extraction, meshing, solving the governing equations, and post-processing. The results from the OpenFOAM solver are then compared with those of the ANSYS Fluent software. The data from both software programs agree. The second part of the paper describes the parametric study of the PDC drill bit nozzle to determine the effect of parameters such as number of nozzles, nozzle velocity, nozzle radial position and orientations on the flow field characteristics and bit washing patterns. After analyzing a series of nozzle configurations, the best configuration is identified and recommendations are made for modifying the PDC bit design.

Keywords: ANSYS Fluent, computational fluid dynamics, nozzle configuration, OpenFOAM, PDC drill bit

Procedia PDF Downloads 420
250 Cultivation of High-value Patent from the Perspective of Knowledge Diffusion: A Case Study of the Power Semiconductor Field

Authors: Lin Qing

Abstract:

[Objective/Significance] The cultivation of high-value patents is both the focus and the main difficulty of patent work, and it is of great significance to building a country that is strong in intellectual property rights. This work should not only pay attention to existing patent applications, but also start before the application stage to explore the high-value technical solutions that form the core of high-value patents. [Methods/Processes] Following the principle of scientific and technological knowledge diffusion, this study examines top academic conference papers and the patent applications citing them, taking the power semiconductor field as an example. It uses factual data to show the feasibility and rationality of mining technical solutions from high-quality research results to foster high-value patents, states the actual benefits of these achievements to the industry, and gives patent protection suggestions for Chinese applicants in light of the situation in the field. [Results/Conclusion] The research shows that the quality of patent applications citing ISPSD papers is significantly higher than the field average, and that the ability of Chinese applicants to use patents to protect related achievements needs to be improved. This study provides a practical and highly targeted reference idea for patent administrators and researchers, and also makes a positive exploration for the practice of the spirit of breaking the five rules.

Keywords: high-value patents cultivation, technical solutions, knowledge diffusion, top academic conference papers, intellectual property information analysis

Procedia PDF Downloads 128
249 Big Data in Construction Project Management: The Colombian Northeast Case

Authors: Sergio Zabala-Vargas, Miguel Jiménez-Barrera, Luz Vargas-Sánchez

Abstract:

In recent years, information related to project management in organizations has been increasing exponentially. Performance data, management statistics, and indicator results have made the collection, analysis, traceability, and dissemination of information essential for project managers. In this sense, there are current trends to facilitate efficient decision-making in projects through emerging technologies, such as Machine Learning, Data Analytics, Data Mining, and Big Data. The latter is the most interesting in this project. This research is part of the thematic line Construction methods and project management. Many authors present the relevance that the use of emerging technologies, such as Big Data, has taken in recent years in project management in the construction sector. The main focus is the optimization of time, scope, and budget, and in general mitigating risks. This research was developed in the northeastern region of Colombia-South America. The first phase was aimed at diagnosing the use of emerging technologies (Big Data) in the construction sector. In Colombia, the construction sector represents more than 50% of the productive system, and more than 2 million people participate in this economic segment. The quantitative approach was used. A survey was applied to a sample of 91 companies in the construction sector. Preliminary results indicate that the use of Big Data and other emerging technologies is very low and also that there is interest in modernizing project management. There is evidence of a correlation between the interest in using new data management technologies and the incorporation of Building Information Modeling (BIM). The next phase of the research will allow the generation of guidelines and strategies for the incorporation of technological tools in the construction sector in Colombia.

Keywords: big data, building information modeling, technology, project management

Procedia PDF Downloads 128
248 Valence and Arousal-Based Sentiment Analysis: A Comparative Study

Authors: Usama Shahid, Muhammad Zunnurain Hussain

Abstract:

This research paper presents a comprehensive analysis of a sentiment analysis approach that employs valence and arousal as its foundational pillars, in comparison to traditional techniques. Sentiment analysis is an indispensable task in natural language processing that involves the extraction of opinions and emotions from textual data. The valence and arousal dimensions, representing the positivity/negativity and the intensity of emotions, respectively, enable the creation of four quadrants, each representing a specific emotional state. The study seeks to determine the impact of utilizing these quadrants to identify distinct emotional states on the accuracy and efficiency of sentiment analysis, in comparison to traditional techniques. The results reveal that the valence and arousal-based approach outperforms other approaches, particularly in identifying nuanced emotions that may be missed by conventional methods. The study's findings are crucial for applications such as social media monitoring and market research, where the accurate classification of emotions and opinions is paramount. Overall, this research highlights the potential of using valence and arousal as a framework for sentiment analysis and offers invaluable insights into the benefits of incorporating specific types of emotions into the analysis. These findings have significant implications for researchers and practitioners in the field of natural language processing, as they provide a basis for the development of more accurate and effective sentiment analysis tools.
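The four-quadrant idea can be sketched as a simple mapping from a (valence, arousal) pair to a quadrant. The score range of [-1, 1], the zero cut-off, and the example emotion names attached to each quadrant are assumptions made for illustration; the abstract does not state how the study defines or labels its quadrants.

```python
def quadrant(valence, arousal):
    """Map a (valence, arousal) pair to one of four quadrants.
    Both scores are assumed to lie in [-1, 1] with 0 as the neutral point."""
    if valence >= 0 and arousal >= 0:
        return "positive / high arousal"   # e.g. excited (illustrative)
    if valence < 0 and arousal >= 0:
        return "negative / high arousal"   # e.g. angry (illustrative)
    if valence < 0 and arousal < 0:
        return "negative / low arousal"    # e.g. sad (illustrative)
    return "positive / low arousal"        # e.g. calm (illustrative)


# Hypothetical scores for four short texts.
scores = {
    "We won the final!": (0.9, 0.8),
    "This is outrageous.": (-0.8, 0.7),
    "I feel empty today.": (-0.6, -0.5),
    "A quiet evening at home.": (0.5, -0.6),
}
labels = {text: quadrant(v, a) for text, (v, a) in scores.items()}
```

Compared with a single positive/negative polarity score, the second axis lets two negative texts such as the angry and the sad example land in different classes.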

Keywords: sentiment analysis, valence and arousal, emotional states, natural language processing, machine learning, text analysis, sentiment classification, opinion mining

Procedia PDF Downloads 100
247 Smart in Performance: More to Practical Life than Hardware and Software

Authors: Faten Hatem

Abstract:

This paper promotes the importance of focusing on spatial aspects and affective factors that impact smart urbanism. This helps to better inform city governance, spatial planning, and policymaking to focus on what Smart does and what it can achieve for cities in terms of performance rather than on using the notion for prestige in a worldwide trend towards becoming a smart city. By illustrating how this style of practice compromises the social aspects and related elements of space making through an interdisciplinary comparative approach, the paper clarifies the impact of this compromise on the overall smart city performance. In response, this paper recognizes the importance of establishing a new meaning for urban progress by moving beyond improving basic services of the city to enhance the actual human experience which is essential for the development of authentic smart cities. The topic is presented under five overlooked areas that discuss the relation between smart cities’ potential and efficiency paradox, the social aspect, connectedness with nature, the human factor, and untapped resources. However, these themes are not meant to be discussed in silos, instead, they are presented to collectively examine smart cities in performance, arguing there is more to the practical life of smart cities than software and hardware inventions. The study is based on a case study approach, presenting Milton Keynes as a living example to learn from while engaging with various methods for data collection including multi-disciplinary semi-structured interviews, field observations, and data mining.

Keywords: smart design, the human in the city, human needs and urban planning, sustainability, smart cities, smart

Procedia PDF Downloads 99
246 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores

Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay

Abstract:

Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction, and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function following the strategy of online hard negative mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using the Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.
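A minimal sketch of triplet loss with online hard negative mining is given below, using plain Euclidean distance on tiny hand-written embeddings. The margin of 0.2, the two-dimensional vectors, and the product labels are assumptions; in the paper the embeddings would come from the ResNet-18 encoder, whose training details are not given in the abstract.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hardest_negative(anchor, anchor_label, embeddings, labels):
    """Online hard negative mining: among batch items with a different
    label, pick the one closest to the anchor in embedding space."""
    negatives = [e for e, l in zip(embeddings, labels) if l != anchor_label]
    return min(negatives, key=lambda e: euclidean(anchor, e))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)


# Hypothetical mini-batch of 2-D embeddings with product labels.
embeddings = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (0.0, 1.1)]
labels = ["cola", "cola", "chips", "soap"]

anchor, positive = embeddings[0], embeddings[1]
hard = hardest_negative(anchor, "cola", embeddings, labels)
loss_hard = triplet_loss(anchor, positive, hard)
loss_easy = triplet_loss(anchor, positive, embeddings[2])
```

The easy negative already sits beyond the margin and contributes zero loss, which is why mining the hardest negative in each batch gives a more useful training signal.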

Keywords: retail stores, faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition

Procedia PDF Downloads 156
245 Consequential Effects of Coal Utilization on Urban Water Supply Sources – a Study of Ajali River in Enugu State Nigeria

Authors: Enebe Christian Chukwudi

Abstract:

Water bodies around the world, notably underground water, ground water, rivers, streams, and seas, face degradation of their water quality as a result of activities associated with coal utilization, including coal mining, coal processing, coal burning, waste storage, and thermal pollution from coal plants, which tend to contaminate these water bodies. This contamination results from heavy metals, sulphate and iron, dissolved solids, mercury, and other toxins contained in coal ash, sludge, and coal waste. These wastes sometimes find their way to sources of urban water supply and contaminate them. A major problem encountered in the supply of potable water to Enugu municipality is the contamination of Ajali River, the source of water supply to the municipality, by coal waste. Hydrogeochemical analysis of Ajali water samples indicates high sulphate and iron content, high total dissolved solids (TDS), low pH (acidity values), and significant hardness, in addition to the presence of heavy metals, mercury, and other toxins. These findings call for the following remedial measures: I. Proper disposal of mine wastes at designated disposal sites that are suitably prepared. II. Proper water treatment. III. Reduction of coal-related contaminants by taking advantage of clean coal technology.

Keywords: effects, coal, utilization, water quality, sources, waste, contamination, treatment

Procedia PDF Downloads 423
244 Comparative Study Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBCs) are the most common type of blood cell and are the most intensively studied in cell biology. A lack of RBCs is a condition in which the hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs affect the exchange of oxygen. This paper presents a comparative study of various techniques for classifying RBCs as normal or abnormal (anemic) using WEKA. WEKA is open-source software consisting of different machine learning algorithms for data mining applications. The algorithms tested are the Radial Basis Function neural network, the Support Vector Machine, and the K-Nearest Neighbors algorithm. Two sets of combined features were utilized for the classification of blood cell images. The first set, consisting exclusively of geometrical features, was used to identify whether the tested blood cell has a spherical or non-spherical shape, while the second set, consisting mainly of textural features, was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBC image dataset, which was obtained from Serdang Hospital-Malaysia, and measuring the accuracy of the test results. The best achieved classification rates are 97%, 98%, and 79% for the Support Vector Machine, Radial Basis Function neural network, and K-Nearest Neighbors algorithm, respectively.

Keywords: K-nearest neighbors algorithm, radial basis function neural network, red blood cells, support vector machine

Procedia PDF Downloads 409
243 Gas Phase Extraction: An Environmentally Sustainable and Effective Method for The Extraction and Recovery of Metal from Ores

Authors: Kolela J Nyembwe, Darlington C. Ashiegbu, Herman J. Potgieter

Abstract:

Over the past few decades, the demand for metals has increased significantly. This has led to a decline of high-grade ore over time and an increase in mineral complexity and matrix heterogeneity. In addition, there are rising concerns about greener processes and a sustainable environment. Due to these challenges, the mining and metal industry has been forced to develop new technologies that are able to economically process and recover metallic values from low-grade ores, materials having a metal content locked up in industrially processed residues (tailings and slag), and complex matrix mineral deposits. Several methods to address these issues have been developed, among which are ionic liquids (IL), heap leaching, and bioleaching. Recently, the gas phase extraction technique has been gaining interest because it eliminates many of the problems encountered in conventional mineral processing methods. The technique relies on the formation of volatile metal complexes, which can be removed from the residual solids by a carrier gas. The complexes can then be reduced using the appropriate method to obtain the metal and to regenerate and recover the organic extractant. Laboratory work on gas phase extraction has been conducted for the extraction and recovery of aluminium (Al), iron (Fe), copper (Cu), chromium (Cr), nickel (Ni), lead (Pb), and vanadium (V). In all cases, the extraction was found to depend on temperature and mineral surface area. The process technology appears very promising, offers the feasibility of recirculation and organic reagent regeneration, and has the potential to deliver on all promises of a “greener” process.

Keywords: gas-phase extraction, hydrometallurgy, low-grade ore, sustainable environment

Procedia PDF Downloads 132
242 Network Analysis of Genes Involved in the Biosynthesis of Medicinally Important Naphthodianthrone Derivatives of Hypericum perforatum

Authors: Nafiseh Noormohammadi, Ahmad Sobhani Najafabadi

Abstract:

Hypericins (hypericin and pseudohypericin) are natural naphthodianthrone derivatives produced by Hypericum perforatum (St. John’s Wort), which have many medicinal properties such as antitumor, antineoplastic, antiviral, and antidepressant activities. Production and accumulation of hypericin in the plant are influenced by both genetic and environmental conditions. Despite the existence of different high-throughput data on the plant, the genetic dimensions of hypericin biosynthesis have not yet been completely understood. In this research, 21 high-quality RNA-seq datasets on different parts of the plant were integrated with metabolic data to reconstruct a coexpression network. Results showed that a cluster of 30 transcripts was correlated with total hypericin. The identified transcripts were divided into three main groups based on their functions, including hypericin biosynthesis genes, transporters, detoxification genes, and transcription factors (TFs). In the biosynthetic group, different isoforms of polyketide synthases (PKSs) and phenolic oxidative coupling proteins (POCPs) were identified. Phylogenetic analysis of protein sequences integrated with gene expression analysis showed that some of the POCPs seem to be very important in the biosynthetic pathway of hypericin. In the TFs group, six TFs were correlated with total hypericin. qPCR analysis of these six TFs confirmed that three of them were highly correlated. The identified genes in this research are a rich resource for further studies on the molecular breeding of H. perforatum in order to obtain varieties with high hypericin production.

Keywords: hypericin, St. John’s Wort, data mining, transcription factors, secondary metabolites

Procedia PDF Downloads 92
241 A Fast Community Detection Algorithm

Authors: Chung-Yuan Huang, Yu-Hsiang Fu, Chuen-Tsai Sun

Abstract:

Community detection represents an important data-mining tool for analyzing and understanding real-world complex network structures and functions. We believe that at least four criteria determine the appropriateness of a community detection algorithm: (a) it produces useable normalized mutual information (NMI) and modularity results for social networks, (b) it overcomes resolution limitation problems associated with synthetic networks, (c) it produces good NMI results and performance efficiency for Lancichinetti-Fortunato-Radicchi (LFR) benchmark networks, and (d) it produces good modularity and performance efficiency for large-scale real-world complex networks. To our knowledge, no existing community detection algorithm meets all four criteria. In this paper, we describe a simple hierarchical arc-merging (HAM) algorithm that uses network topologies and rule-based arc-merging strategies to identify community structures that satisfy the criteria. We used five well-studied social network datasets and eight sets of LFR benchmark networks to validate the ground-truth community correctness of HAM, eight large-scale real-world complex networks to measure its performance efficiency, and two synthetic networks to determine its susceptibility to resolution limitation problems. Our results indicate that the proposed HAM algorithm is capable of providing satisfactory performance efficiency and that HAM-identified communities were close to ground-truth communities in social and LFR benchmark networks while overcoming resolution limitation problems.
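The two quality measures named in the criteria above, modularity and normalized mutual information (NMI), can be sketched as follows. This is an illustrative implementation of the standard definitions, not the HAM algorithm itself; the function names are assumptions, the graph is assumed to be undirected and unweighted, and NMI is normalized by the arithmetic mean of the two entropies, which is one of several normalizations in common use.

```python
import math
from collections import Counter


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> community id.
    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2.
    """
    m = len(edges)
    internal = Counter()
    degree = Counter()
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def nmi(labels_a, labels_b):
    """Normalized mutual information, 2 * I(A;B) / (H(A) + H(B))."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    mutual = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    total_entropy = entropy(count_a) + entropy(count_b)
    if total_entropy == 0:
        return 1.0  # both partitions place every node in a single group
    return 2 * mutual / total_entropy
```

For two triangles joined by a single edge and split into their natural communities, `modularity` evaluates to 5/14; comparing a detected partition against ground-truth labels with `nmi` gives 1.0 for a perfect match and 0.0 for statistically independent labelings.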

Keywords: complex network, social network, community detection, network hierarchy

Procedia PDF Downloads 227
240 Estimation of Natural Pozzolan Reserves in the Volcanic Province of the Moroccan Middle Atlas Using a Geographic Information System in Order to Valorize Them

Authors: Brahim Balizi, Ayoub Aziz, Abdelilah Bellil, Abdellali El Khadiri, Jamal Mabrouki

Abstract:

Mio-Plio-Quaternary volcanism of the Tabular Middle Atlas, which corresponds to prospective levels of exploitable raw minerals, is a feature of Morocco's Middle Atlas, especially the Azrou-Timahdite region. Given their importance in national policy in terms of human development by supporting the sociological and economic component, this area has consequently been the focus of various research and prospecting of these levels in order to develop these reserves. The outcome of this labor is a massive amount of data that needs to be managed appropriately because it comes from multiple sources and formats, including side points, contour lines, geology, hydrogeology, hydrology, geological and topographical maps, satellite photos, and more. In this regard, putting in place a Geographic Information System (GIS) is essential to be able to offer a side plan that makes it possible to see the most recent topography of the area being exploited, to compute the volume of exploitation that occurs every day, and to make decisions with the fewest possible restrictions in order to use the reserves for the realization of ecological light mortars. The three sites' mining will follow the contour lines in five steps that are six meters high and decline. It is anticipated that each quarry produces about 90,000 m3/year. For a single quarry, this translates to a daily production of about 450 m3 (200 days/year). About 3,540,240 m3 and 10,620,720 m3, respectively, represent the possible net exploitable volume in place for a single quarry and the three exploitable zones.
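The production figures quoted in the abstract are mutually consistent, as the short check below shows. The function names are hypothetical, and the quarry lifetime is a derived figure that assumes the annual production rate stays constant; it is not stated in the abstract.

```python
def daily_production(annual_m3, working_days):
    """Average daily production in m3 for a given annual volume."""
    return annual_m3 / working_days


def total_reserve(single_quarry_m3, quarry_count):
    """Combined reserve, assuming every quarry holds the same net volume."""
    return single_quarry_m3 * quarry_count


def lifetime_years(reserve_m3, annual_m3):
    """Years of operation at a constant annual production rate (assumption)."""
    return reserve_m3 / annual_m3


ANNUAL_M3 = 90_000          # per quarry, from the abstract
WORKING_DAYS = 200          # from the abstract
SINGLE_QUARRY_M3 = 3_540_240  # net exploitable volume, from the abstract

daily = daily_production(ANNUAL_M3, WORKING_DAYS)      # 450.0 m3/day
combined = total_reserve(SINGLE_QUARRY_M3, 3)          # 10,620,720 m3
lifetime = lifetime_years(SINGLE_QUARRY_M3, ANNUAL_M3)  # about 39.3 years
```

Under the constant-rate assumption, each quarry would therefore be exploitable for roughly 39 years.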

Keywords: GIS, topography, exploitation, quarrying, lightweight mortar

Procedia PDF Downloads 26
239 Predictive Analytics Algorithms: Mitigating Elementary School Drop Out Rates

Authors: Bongs Lainjo

Abstract:

Educational institutions and authorities that are mandated to run education systems in various countries need to implement a curriculum that considers the possibility and existence of elementary school dropouts. This research focuses on elementary school dropout rates and the ability to replicate various predictive models carried out globally on selected elementary schools. The study was carried out by comparing classical case studies in Africa, North America, South America, Asia, and Europe. Some of the reasons put forward for children dropping out include the notion of being successful in life without necessarily going through the education process. Such a mentality is coupled with a tough curriculum that does not cater to all students. The system has led to poor school attendance (truancy), which continuously leads to dropouts. In this study, the focus is on developing a model that can systematically be implemented by school administrations to prevent possible dropout scenarios. At the elementary level, especially in the lower grades, a child's perception of education can be easily changed so that they focus on the better future that their parents desire. To deal effectively with the elementary school dropout problem, strategies that are put in place need to be studied, and predictive models need to be installed in every educational system with a view to helping prevent an imminent school dropout just before it happens. In a competency-based curriculum that most advanced nations are trying to implement, the education systems have wholesome ideas of learning that reduce the rate of dropout.

Keywords: elementary school, predictive models, machine learning, risk factors, data mining, classifiers, dropout rates, education system, competency-based curriculum

Procedia PDF Downloads 175
238 Using Data Mining in Automotive Safety

Authors: Carine Cridelich, Pablo Juesas Cano, Emmanuel Ramasso, Noureddine Zerhouni, Bernd Weiler

Abstract:

Safety is one of the most important considerations when buying a new car. While active safety aims at avoiding accidents, passive safety systems such as airbags and seat belts protect the occupant in case of an accident. In addition to legal regulations, organizations like Euro NCAP provide consumers with an independent assessment of the safety performance of cars and drive the development of safety systems in the automobile industry. Those ratings are mainly based on injury assessment reference values derived from physical parameters measured in dummies during a car crash test. The components and sub-systems of a safety system are designed to achieve the required restraint performance. Sled tests and other types of tests are then carried out by car makers and their suppliers to confirm the protection level of the safety system. A Knowledge Discovery in Databases (KDD) process is proposed in order to minimize the number of tests. The KDD process is based on the data emerging from sled tests according to Euro NCAP specifications. About 30 parameters of the passive safety systems from different data sources (crash data, dummy protocol) are first analysed together with experts' opinions. A procedure is proposed to manage missing data and is validated on real data sets. Finally, a procedure is developed to estimate a set of rough initial parameters of the passive system before testing, with the aim of reducing the number of tests.

Keywords: KDD process, passive safety systems, sled test, dummy injury assessment reference values, frontal impact

Procedia PDF Downloads 382
237 A Comparative Study for Various Techniques Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBCs) are the most common type of blood cell and are the most intensively studied in cell biology. A lack of RBCs is a condition in which the hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs affect the exchange of oxygen. This paper presents a comparative study of various techniques for classifying red blood cells as normal or abnormal (anemic) using WEKA. WEKA is open-source software consisting of different machine learning algorithms for data mining applications. The algorithms tested are the Radial Basis Function neural network, the Support Vector Machine, and the K-Nearest Neighbors algorithm. Two sets of combined features were utilized for the classification of blood cell images. The first set, consisting exclusively of geometrical features, was used to identify whether the tested blood cell has a spherical or non-spherical shape, while the second set, consisting mainly of textural features, was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBC image dataset, which was obtained from Serdang Hospital-Malaysia, and measuring the accuracy of the test results. The best achieved classification rates are 97%, 98%, and 79% for the Support Vector Machine, Radial Basis Function neural network, and K-Nearest Neighbors algorithm, respectively.

Keywords: red blood cells, classification, radial basis function neural networks, support vector machine, k-nearest neighbors algorithm

Procedia PDF Downloads 480
236 Simscape Library for Large-Signal Physical Network Modeling of Inertial Microelectromechanical Devices

Authors: S. Srinivasan, E. Cretu

Abstract:

The information flow (e.g. block-diagram or signal flow graph) paradigm for the design and simulation of Microelectromechanical (MEMS)-based systems allows MEMS devices to be modeled easily using causal transfer functions and interfaced with electronic subsystems for fast system-level exploration of design alternatives and optimization. Nevertheless, the physical bi-directional coupling between different energy domains is not easily captured in causal signal flow modeling. Moreover, models of fundamental components acting as building blocks (e.g. gap-varying MEMS capacitor structures) depend not only on the component, but also on the specific excitation mode (e.g. voltage or charge actuation). In contrast, the energy flow modeling paradigm in terms of generalized across-through variables offers an acausal perspective, clearly separating the physical model from the boundary conditions. This promotes reusability and the use of primitive physical models for assembling MEMS devices from primitive structures, based on the interconnection topology in generalized circuits. The physical modeling capabilities of Simscape have been used in the present work in order to develop a MEMS library containing parameterized fundamental building blocks (area and gap-varying MEMS capacitors, nonlinear springs, displacement stoppers, etc.) for the design, simulation and optimization of MEMS inertial sensors. The models capture both the nonlinear electromechanical interactions and geometrical nonlinearities and can be used for both small and large signal analyses, including the numerical computation of pull-in voltages (stability loss). The Simscape behavioral modeling language was used for the implementation of reduced-order macro models, which offer the advantage of a seamless interface with Simulink blocks for creating hybrid information/energy flow system models.
Test bench simulations of the library models compare favorably with both analytical results and with more in-depth finite element simulations performed in ANSYS. Separate MEMS-electronic integration tests were done on closed-loop MEMS accelerometers, where Simscape was used for modeling the MEMS device and Simulink for the electronic subsystem.
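As a point of reference for the pull-in computation mentioned above, the textbook lumped model of a gap-varying parallel-plate capacitor on a linear spring has a closed-form pull-in voltage, V_pi = sqrt(8 k g0^3 / (27 eps0 A)), reached at a displacement of one third of the initial gap. The sketch below implements only this idealized case (no fringing fields, linear spring, vacuum permittivity); it is not the Simscape library described in the abstract, and the function and parameter names are assumptions.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def pull_in_voltage(stiffness, gap, area):
    """Pull-in voltage (V) of an ideal parallel-plate actuator.

    stiffness: linear spring constant in N/m
    gap: initial electrode gap in m
    area: electrode overlap area in m^2
    """
    return math.sqrt(8 * stiffness * gap ** 3 / (27 * EPSILON_0 * area))


def electrostatic_force(voltage, gap, displacement, area):
    """Attractive force (N) between the plates at a given displacement."""
    return EPSILON_0 * area * voltage ** 2 / (2 * (gap - displacement) ** 2)


def spring_force(stiffness, displacement):
    """Restoring force (N) of the linear spring."""
    return stiffness * displacement
```

At the pull-in voltage and a displacement of `gap / 3`, the electrostatic and spring forces balance exactly, which gives a quick sanity check for a numerically computed pull-in point.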

Keywords: across-through variables, electromechanical coupling, energy flow, information flow, Matlab/Simulink, MEMS, nonlinear, pull-in instability, reduced order macro models, Simscape

Procedia PDF Downloads 134
235 Efficient Fuzzy Classified Cryptographic Model for Intelligent Encryption Technique towards E-Banking XML Transactions

Authors: Maher Aburrous, Adel Khelifi, Manar Abu Talib

Abstract:

Transactions performed by financial institutions on a daily basis require XML encryption on a large scale. Fully encrypting a large volume of messages results in both performance and resource issues. In this paper, a novel approach is presented for securing financial XML transactions using classification data mining (DM) algorithms. Our strategy defines the complete process of classifying XML transactions using a set of classification algorithms; the classified XML documents are processed at a later stage using element-wise encryption. Classification algorithms were used to identify the XML transaction rules and factors in order to classify the message content and fetch the important elements within it. We have implemented four classification algorithms to fetch the importance level value within each XML document. Classified content is processed using element-wise encryption for selected parts with "High", "Medium" or "Low" importance level values. Element-wise encryption is performed using the AES symmetric encryption algorithm and a proposed modified AES algorithm that overcomes the problem of computational overhead, in which the substitute byte and shift row operations remain as in the original AES while the mix column operation is replaced by a 128 permutation operation followed by the add round key operation. An implementation has been conducted using a data set fetched from an e-banking service to present system functionality and efficiency. Results from our implementation showed a clear improvement in processing time when encrypting XML documents.

Keywords: XML transaction, encryption, Advanced Encryption Standard (AES), XML classification, e-banking security, fuzzy classification, cryptography, intelligent encryption

Procedia PDF Downloads 410
234 Analyzing Factors Impacting COVID-19 Vaccination Rates

Authors: Dongseok Cho, Mitchell Driedger, Sera Han, Noman Khan, Mohammed Elmorsy, Mohamad El-Hajj

Abstract:

Since the approval of the COVID-19 vaccine in late 2020, vaccination rates have varied around the globe. Access to a vaccine supply, mandated vaccination policy, and vaccine hesitancy contribute to these rates. This study used COVID-19 vaccination data from Our World in Data and the Multilateral Leaders Task Force on COVID-19 to create two COVID-19 vaccination indices. The first index is the Vaccine Utilization Index (VUI), which measures how effectively each country has utilized its vaccine supply to doubly vaccinate its population. The second index is the Vaccination Acceleration Index (VAI), which evaluates how efficiently each country vaccinated its population within its first 150 days. Pearson correlations were computed between these indices and country indicators obtained from the World Bank. The results of these correlations show that countries with stronger health indicators, such as lower mortality rates, lower age dependency ratios, and higher rates of immunization to other diseases, display higher VUI and VAI scores than countries with lesser values. VAI scores are also positively correlated with Governance and Economic indicators, such as regulatory quality, control of corruption, and GDP per capita. As represented by the VUI, proper utilization of the COVID-19 vaccine supply by country is observed in countries that display excellence in health practices. A country’s motivation to accelerate its vaccination rates within the first 150 days of vaccinating, as represented by the VAI, was largely a product of the governing body’s effectiveness and economic status, as well as overall excellence in health practices.
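The Pearson correlation used to relate the indices to country indicators can be sketched in a few lines. The function name is an assumption, and any values passed to it in practice (index scores, indicator values) would come from the study's data, which is not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)
```

Called with one index score per country and the matching indicator values, the function returns a value between -1 and 1; a positive value corresponds to the "positively correlated" findings reported above.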

Keywords: data mining, Pearson correlation, COVID-19, vaccination rates and hesitancy

Procedia PDF Downloads 114
233 Phytoextraction of Copper and Zinc by Willow Varieties in a Pot Experiment

Authors: Muhammad Mohsin, Mir Md Abdus Salam, Pertti Pulkkinen, Ari Pappinen

Abstract:

Soil and water contamination by heavy metals is a major environmental challenge. Phytoextraction is an emerging, environmentally friendly and cost-efficient technology in which plants are used to eliminate pollutants from soil and water. We aimed to assess the copper (Cu) and zinc (Zn) removal efficiency of two willow varieties, Klara (S. viminalis x S. schwerinii x S. dasyclados) and Karin ((S.schwerinii x S. viminalis) x (S. viminalis x S.burjatica)), under different soil treatments (control/unpolluted, polluted, lime with polluted, wood ash with polluted). In a 180-day pot experiment, these willow varieties were grown in highly polluted soil collected from the Pyhasalmi mining area in Finland. Lime and wood ash were added to the polluted soil to improve the soil pH and to observe their effects on metal accumulation in plant biomass. The Inductively Coupled Plasma Optical Emission Spectrometer (ELAN 6000 ICP-EOS, Perkin-Elmer Corporation) was used in this study to assess the heavy metal concentrations in the plant biomass. The results show that both willow varieties can accumulate considerable amounts of Cu and Zn, ranging from 36.95 to 314.80 mg kg⁻¹ and 260.66 to 858.70 mg kg⁻¹, respectively. The application of lime and wood ash substantially stimulated plant height, dry biomass, and the deposition of Cu and Zn in total plant biomass. In addition, lime application appeared to increase Cu and Zn concentrations in the shoots and leaves of both willow varieties when planted in polluted soil. However, wood ash application was found to be more efficient in mobilizing the metals in the roots of both varieties. The study recommends willow plantations to rehabilitate Cu- and Zn-polluted soils.

Keywords: heavy metals, lime, phytoextraction, wood ash, willow

Procedia PDF Downloads 236