Search results for: mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1051

631 Formal Innovations vs. Informal Innovations: The Case of the Mining Sector in Nigeria

Authors: Jegede Oluseye Oladayo

Abstract:

The study mapped innovation activities in the formal and informal mining sectors in Nigeria. Data were collected from primary and secondary sources. Primary data were collected through guided questionnaire administration, guided interviews and personal observation. A purposive sampling method was adopted to select firms that are micro, small and medium enterprises. The study covered 100 purposively selected companies (50 in the formal sector and 50 in the informal sector) in south-western Nigeria. Secondary data were collected from different published sources. Data were analysed using descriptive and inferential statistics. Of the four types of technological innovation sampled, organisational innovation was the most prevalent in both the formal (100%) and informal (100%) sectors, followed by process innovation (60% in the formal sector and 28% in the informal sector). Marketing innovation and diffusion-based innovation were implemented by 64% and 4% of firms, respectively, in the formal sector. There were no R&D activities (intramural or extramural) in either sector; however, innovation activities occurred at moderate levels in the formal sector. These were characterised by acquisition of machinery, equipment and hardware (100%), software (56%), training (82%) and acquisition of external knowledge (60%). In the informal sector, innovation activities were characterised by acquisition of external knowledge (100%), training/learning by experience (100%) and acquisition of tools (68%). The impact of innovation on firm performance in the formal sector was expressed mainly as increased production capacity (100%), reduced production cost per unit of labour (88%), compliance with governmental regulatory requirements (72%) and entry into new markets (60%). In the informal sector, the impact of innovation was mainly expressed in improved flexibility of production (70%) and machinery/energy efficiency (70%).
The most important technological driver of process innovation in the mining sector was the acquisition of machinery, with a prevalence of 100% in both the formal and informal sectors. Next was training and re-training of technical staff (74% in both the formal and informal sectors). Another factor influencing organisational innovation was workforce skill, with a prevalence of 80% in both the formal and informal sectors. Other important drivers in the formal sector include the educational background of the manager or head of the technical department (54% for organisational innovation and 50% for process innovation). The study concluded that the firms' innovation competence lay mostly in organisational change.

Keywords: innovation prevalence, innovation activities, innovation performance, innovation drivers

Procedia PDF Downloads 354
630 Aviation versus Aerospace: A Differential Analysis of Workforce Jobs via Text Mining

Authors: Sarah Werner, Michael J. Pritchard

Abstract:

From pilots to engineers, the skills development within the aerospace industry is exceptionally broad. Employers often struggle to find the right mixture of qualified skills to fill their organizational demands. This effort to find qualified talent is further complicated by the industrial delineation between two key areas: aviation and aerospace. In a broad sense, the aerospace industry overlaps with the aviation industry. In turn, the aviation industry is a smaller sector segment within the context of the broader definition of the aerospace industry. Furthermore, it could be conceptually argued that, in practice, there is little distinction between these two sectors (i.e., aviation and aerospace). However, through our unstructured text analysis of over 6,000 captured job listings, our team found a clear delineation between aviation-related jobs and aerospace-related jobs. Using techniques in natural language processing, our research identifies an integrated workforce skill pattern that clearly breaks between these two sectors. While the aviation sector has largely maintained its need for pilots, mechanics, and associated support personnel, the staffing needs of the aerospace industry are being progressively driven by integrative engineering needs. Increasingly, this is leading many aerospace-based organizations towards the acquisition of 'system level' staffing requirements. This research helps to better align higher educational institutions with the current industrial staffing complexities within the broader aerospace sector.

Keywords: aerospace industry, job demand, text mining, workforce development

Procedia PDF Downloads 234
629 A Case Study of Ontology-Based Sentiment Analysis for Fan Pages

Authors: C. -L. Huang, J. -H. Ho

Abstract:

Social media has become increasingly important in our lives. Many enterprises promote their services and products to fans via social media. The positive or negative sentiment of feedback from fans is very important for enterprises seeking to improve their products, services, and promotion activities. The purpose of this paper is to understand the sentiment of fans' responses by analyzing the responses posted by fans on Facebook. The entities and aspects of fans' responses were analyzed based on a predefined ontology. The ontology for cell phone sentiment analysis consists of the following top-level aspect categories: overall, shape, hardware, brand, price, and service. Each category consists of several sub-categories. All aspects of a fan's response were found based on the ontology, and their corresponding sentiment terms were found using a lexicon-based approach. The sentiment scores for aspects of fan responses were obtained by aggregating the sentiment terms in the responses. The frequency of 'like' was also weighted in the sentiment score calculation. Three famous cell phone fan pages on Facebook were selected as demonstration cases to evaluate the performance of the proposed methodology. Human judgment by several domain experts was also collected for performance comparison. The performance of the proposed approach was as good as that of human judgment in terms of precision, recall and F1-measure.
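
The scoring idea described in this abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the tiny `ONTOLOGY` and `LEXICON` tables, the function name `score_response`, and the logarithmic 'like' weighting are hypothetical and are not taken from the paper, which does not state its weighting formula.

```python
import math

# Hypothetical mini-ontology: aspect category -> trigger terms (illustrative only).
ONTOLOGY = {
    "price": {"price", "cost", "expensive", "cheap"},
    "hardware": {"battery", "camera", "screen"},
    "service": {"service", "support", "warranty"},
}

# Hypothetical sentiment lexicon: term -> polarity (+1 positive, -1 negative).
LEXICON = {"great": 1, "good": 1, "love": 1, "bad": -1, "poor": -1, "expensive": -1}


def score_response(text, likes=0):
    """Return {aspect: score} for one fan response.

    The polarity of the response is the sum of lexicon polarities of its tokens.
    The 'like' count scales the score by 1 + ln(1 + likes), an assumed weighting.
    """
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    polarity = sum(LEXICON.get(token, 0) for token in tokens)
    weight = 1.0 + math.log1p(likes)
    token_set = set(tokens)
    # An aspect is detected when any of its trigger terms appears in the response.
    aspects = [name for name, terms in ONTOLOGY.items() if terms & token_set]
    return {name: polarity * weight for name in aspects}


if __name__ == "__main__":
    print(score_response("The camera is great", likes=0))
    print(score_response("Too expensive and the support is poor", likes=12))
```

A real system would attach each sentiment term to the nearest aspect rather than sharing one polarity across all detected aspects; the sketch keeps the simpler rule for brevity.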

Keywords: opinion mining, ontology, sentiment analysis, text mining

Procedia PDF Downloads 214
628 Effect of Heavy Metals on the Life History Trait of Heterocephalobellus sp. and Cephalobus sp. (Nematode: Cephalobidae) Collected from a Small-Scale Mining Site, Davao de Oro, Philippines

Authors: Alissa Jane S. Mondejar, Florifern C. Paglinawan, Nanette Hope N. Sumaya, Joey Genevieve T. Martinez, Mylah Villacorte-Tabelin

Abstract:

Mining is associated with increased heavy metals in the environment, and heavy metal contamination disrupts the activities of soil fauna, such as nematodes, causing changes in the function of the soil ecosystem. Previous studies found that nematode community composition and diversity indices were strongly affected by heavy metals (e.g., Pb, Cu, and Zn). In this study, the influence of heavy metals on nematode survivability and reproduction was investigated. The life histories of the free-living nematodes Heterocephalobellus sp. and Cephalobus sp. (Rhabditida: Cephalobidae) were assessed using the hanging drop technique, which is often used in life history trait experiments. The nematodes were exposed to different temperatures, i.e., 20°C, 25°C, and 30°C, in different groups (control and heavy metal exposed) and fed with the same bacterial density of 1×10⁹ Escherichia coli cells ml⁻¹ for 30 days. Results showed that increasing temperature and exposure to heavy metals had a significant influence on the survivability and egg production of both species. When exposed to 20°C, Heterocephalobellus sp. and Cephalobus sp. survived longer and produced few eggs, without subsequent hatching. For Heterocephalobellus sp., the control group showed higher values of net production rate (R0), fecundity (mx), which has the same value as the total fertility rate (TFR), generation times (G0, G1, and Gh) and population doubling time (PDT). However, a lower rate of natural increase (rm) was observed since generation times were higher. Meanwhile, for Cephalobus sp., the net production rate (R0) was higher in the exposed group, whereas fecundity (mx), which has the same value as the TFR, G0, G1, Gh, and PDT were higher in the control group. However, a lower rate of natural increase (rm) was observed since generation times were higher.
In conclusion, temperature and exposure to heavy metals had a negative influence on the life history of the nematodes; however, further experiments should be conducted.

Keywords: artisanal and small-scale gold mining (ASGM), hanging drop method, heavy metals, life history trait.

Procedia PDF Downloads 57
627 Alternative Approaches to Community Involvement in Resettlement Schemes to Prevent Potential Conflicts: Case Study in Chibuto District, Mozambique

Authors: Constâncio Augusto Machanguana

Abstract:

The world over, resettling communities, for whatever purpose (mining, dams, forestry and wildlife management, roads, or facilitating services delivery), often leads to tensions between those resettled, the investors, and the local and national governments involved in the process. Causes include unclear government legislation and regulations, confusing Corporate Social Responsibility policies and guidelines, and other socio-economic policies that create unrealistic expectations among those being resettled; the resulting frustration within the community can turn into conflict with the investors (the company). The exploitation of heavy mineral sands along Mozambique’s long coastline and hinterland has not benefited the affected communities. A case in point is the exploration, since 2018, of heavy sands in Chibuto District in the southern province of Gaza. A likely contributing factor is the standard type of socio-economic surveys and community involvement processes that are meant to smooth the relationship among the parties. This research aims to investigate alternative processes to plan, initiate and guide resettlement in such a way that tensions and conflicts are avoided. Based on the resettlement process already completed, and compared with similar cases across the country, mixed methods were adopted to collect primary data: three focus groups of 125 people, representing 324 resettled householders, and five semi-structured interviews with relevant stakeholders such as the local government, NGOs and local leaders, to understand their role in all stages of the process. The preliminary results show that the community has limited or no understanding of the potential impacts of these large-scale explorations, and the apparent harmony between the parties (community and company) may hide the dissatisfaction of those resettled.
So, rather than focusing on negative mining impacts, the research contributes to science by identifying the best resettlement approach, one that can be replicated elsewhere in the country in the current context of new mineral resource discoveries.

Keywords: conflict mitigation, resettlement, mining, Mozambique

Procedia PDF Downloads 88
626 Toxic Metal and Radiological Risk Assessment of Soil, Water and Vegetables around a Gold Mine Turned Residential Area in Mokuro Area of Ile-Ife, Osun State Nigeria: An Implications for Human Health

Authors: Grace O. Akinlade, Danjuma D. Maza, Oluwakemi O. Olawolu, Delight O. Babalola, John A. O. Oyekunle, Joshua O. Ojo

Abstract:

The Mokuro area of Ile-Ife, South West Nigeria, was well known for gold mining in the past (about twenty years ago). However, the place has since been reclaimed and converted to a residential area without any environmental risk assessment of the impact of the mining tailings on the environment. Soil, water, and plant samples were collected from 4 different locations around the mine-turned-residential area. Soil samples were pulverized and sieved into finer particles, while the plant samples were dried and pulverized. All the samples were digested and analyzed for As, Pb, Cd, and Zn using atomic absorption spectroscopy (AAS). From the analysis results, the hazard index (HI) was then calculated for the metals. For the radiological analysis, the soil and plant samples were air dried, pulverized and weighed, after which they were packed into special, properly sealed containers to prevent radon gas leakage. After sealing, the samples were kept for 28 days to attain secular equilibrium. The concentrations of ⁴⁰K, ²³⁸U, and ²³²Th in the samples were measured using a cesium iodide (CsI) spectrometer and URSA software. The AAS analysis showed that As, Pb, Cd (toxic metals), and Zn (an essential trace metal) are in concentrations lower than permissible limits in plant and soil samples, while the water samples had concentrations higher than permissible limits. The calculated hazard indices (HI) show that HI for water is >1 and that of plants and soil is <1. The gamma spectrometry results show activity concentrations above the recommended limits for all the soil and plant samples collected from the area. Only the water samples have activity concentrations below the recommended limit. Consequently, the absorbed dose, annual effective dose, and excess lifetime cancer risk are all above the recommended safe limit for all the samples except the water samples.
In conclusion, all the samples collected from the area are either contaminated with toxic metals or they pose radiological hazards to the consumers. Further detailed study is therefore recommended in order to be able to advise the residents appropriately.
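
The abstract does not say how the hazard index was computed. A common formulation, assumed here, sums per-metal hazard quotients, each being an estimated daily intake divided by a reference dose, and flags HI > 1 as a potential concern. The Python sketch below uses that assumed formulation; the function names and every numeric value in the usage example are hypothetical placeholders, not data or reference doses from the study.

```python
def hazard_quotient(daily_intake, reference_dose):
    """Hazard quotient for one metal: intake / reference dose (same units, e.g. mg/kg/day)."""
    if reference_dose <= 0:
        raise ValueError("reference_dose must be positive")
    return daily_intake / reference_dose


def hazard_index(daily_intakes, reference_doses):
    """Sum hazard quotients over all metals present in daily_intakes.

    Both arguments are dicts keyed by metal symbol. HI > 1 is conventionally
    read as a potential non-carcinogenic risk; HI < 1 as below concern.
    """
    return sum(
        hazard_quotient(intake, reference_doses[metal])
        for metal, intake in daily_intakes.items()
    )


if __name__ == "__main__":
    # Placeholder numbers only, chosen to make the arithmetic easy to follow.
    intakes = {"As": 0.0003, "Pb": 0.0035}
    reference = {"As": 0.0003, "Pb": 0.0035}
    hi = hazard_index(intakes, reference)
    print(hi, "potential concern" if hi > 1 else "below concern")
```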

Keywords: toxic metals, gamma spectrometry, Ile-Ife, radiological hazards, gold mining

Procedia PDF Downloads 23
625 A Data-Mining Model for Protection of FACTS-Based Transmission Line

Authors: Ashok Kalagura

Abstract:

This paper presents a data-mining model for fault-zone identification of flexible AC transmission systems (FACTS)-based transmission lines, including a thyristor-controlled series compensator (TCSC) and unified power-flow controller (UPFC), using ensemble decision trees. Given the randomness in the ensemble of decision trees stacked inside the random forests (RF) model, it provides an effective decision on the fault-zone identification. Half-cycle post-fault current and voltage samples from the fault inception are used as an input vector against target output ‘1’ for the fault after TCSC/UPFC and ‘-1’ for the fault before TCSC/UPFC for fault-zone identification. The algorithm is tested on simulated fault data with wide variations in operating parameters of the power system network, including a noisy environment, providing a reliability measure of 99% with a faster response time (3/4 of a cycle from fault inception). The results of the presented approach using the RF model indicate the reliable identification of the fault zone in FACTS-based transmission lines.

Keywords: distance relaying, fault-zone identification, random forests, RFs, support vector machine, SVM, thyristor-controlled series compensator, TCSC, unified power-flow controller, UPFC

Procedia PDF Downloads 403
624 Filtering Intrusion Detection Alarms Using Ant Clustering Approach

Authors: Ghodhbani Salah, Jemili Farah

Abstract:

With the growth of cyber attacks, information safety has become an important issue all over the world. Many firms rely on security technologies such as intrusion detection systems (IDSs) to manage information technology security risks. IDSs are considered to be the last line of defense to secure a network and play a very important role in detecting a large number of attacks. However, the main problem with today’s most popular commercial IDSs is that they generate a high volume of alerts and a huge number of false positives. This drawback has become the main motivation for many research papers in the IDS area. Hence, in this paper we present a data mining technique to help network administrators analyze and reduce the false positive alarms produced by an IDS and increase detection accuracy. Our data mining technique is an unsupervised clustering method based on a hybrid ANT algorithm. This algorithm discovers clusters of intruders’ behavior without prior knowledge of the possible number of classes; we then apply the K-means algorithm to improve the convergence of the ANT clustering. Experimental results on a real dataset show that our proposed approach is efficient, with a high detection rate and a low false alarm rate.
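
The second stage described above, refining an initial clustering with K-means, can be sketched in plain Python. This is only an illustration under stated assumptions: the ant-based stage is not implemented, the starting centroids are simply passed in as if they came from it, and the function names `nearest` and `kmeans_refine` are hypothetical.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans_refine(points, initial_centroids, max_iterations=100):
    """Run Lloyd's K-means iterations starting from the given centroids.

    initial_centroids is assumed to come from a prior clustering stage
    (for example an ant-based one); it fixes the number of clusters.
    Returns the refined centroids as a list of tuples.
    """
    centroids = [tuple(float(x) for x in c) for c in initial_centroids]
    for _ in range(max_iterations):
        # Assignment step: group points by nearest centroid.
        members = [[] for _ in centroids]
        for point in points:
            members[nearest(point, centroids)].append(point)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for centroid, group in zip(centroids, members):
            if group:
                dims = range(len(centroid))
                updated.append(tuple(sum(p[d] for p in group) / len(group) for d in dims))
            else:
                updated.append(centroid)  # keep an empty cluster's centroid unchanged
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids


if __name__ == "__main__":
    alerts = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans_refine(alerts, [(1, 1), (9, 9)]))
```

In an alarm-filtering setting, each point would be a feature vector derived from an alert, and the refined clusters would then be inspected or labelled to separate likely false positives from true attacks.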

Keywords: intrusion detection system, alarm filtering, ANT class, ant clustering, intruders’ behaviors, false alarms

Procedia PDF Downloads 380
623 Application of Remote Sensing Technique on the Monitoring of Mine Eco-Environment

Authors: Haidong Li, Weishou Shen, Guoping Lv, Tao Wang

Abstract:

To overcome the limitations of traditional remote sensing (RS) techniques in mine eco-environmental monitoring, in this paper we first classified the eco-environmental damage caused by mining activities and then introduced the principle, classification and characteristics of the Light Detection and Ranging (LiDAR) technique. The potential of LiDAR in mine eco-environmental monitoring, particularly in extracting vertical structure parameters of vegetation, was analyzed by comparing the feasibility and applicability of traditional RS methods and LiDAR for monitoring different types of indicators. The application of LiDAR to extracting typical mine indicators, such as land destruction in mining areas, damage to ecological integrity and natural soil erosion, was also examined. The results showed that LiDAR can monitor most mine eco-environmental indicators with higher accuracy than traditional RS techniques; its applicability to each indicator depends on the accuracy requirements of the monitoring. For large mines, LiDAR three-dimensional point cloud data can serve as a complementary data source to optical RS, and airborne or satellite LiDAR can also meet the demand for extracting vertical structure parameters of vegetation over large areas.

Keywords: LiDAR, mine, ecological damage, monitoring, traditional remote sensing technique

Procedia PDF Downloads 370
622 Integrating of Multi-Criteria Decision Making and Spatial Data Warehouse in Geographic Information System

Authors: Zohra Mekranfar, Ahmed Saidi, Abdellah Mebrek

Abstract:

This work aims to develop multi-criteria decision making (MCDM) and spatial data warehouse (SDW) methods, which will be integrated into a GIS according to a ‘GIS dominant’ approach. The GIS tools will be used to operate the SDW. MCDM methods can provide many solutions to a set of problems with various and multiple criteria. When the problem is complex and includes a spatial dimension, it makes sense to combine the MCDM process with other approaches such as data mining and ascending analyses. In this paper, we present an experiment demonstrating a geo-decisional methodology for SDW construction. On-line analytical processing (OLAP) technology, which combines basic multidimensional analysis with data mining concepts, provides powerful tools for highlighting inductions and information that traditional tools do not make obvious. However, these OLAP tools become more complex in the presence of the spatial dimension. The integration of OLAP with a GIS is the future geographic and spatial information solution. GIS offers advanced functions for the acquisition, storage, analysis, and display of geographic information. However, their effectiveness for complex spatial analysis is questionable due to their determinism and their decisional rigor. Any analysis or exploration of spatial data first requires the construction and structuring of a spatial data warehouse (SDW). This SDW must be easily usable by the GIS and by the tools offered by an OLAP system.

Keywords: data warehouse, GIS, MCDM, SOLAP

Procedia PDF Downloads 149
621 High-Throughput Artificial Guide RNA Sequence Design for Type I, II and III CRISPR/Cas-Mediated Genome Editing

Authors: Farahnaz Sadat Golestan Hashemi, Mohd Razi Ismail, Mohd Y. Rafii

Abstract:

A revolution in genome engineering has emerged from the discovery of CRISPR (clustered regularly interspaced short palindromic repeats) and CRISPR-associated (Cas) genes in bacteria. The function of the type II Streptococcus pyogenes (Sp) CRISPR/Cas9 system has been confirmed in various species. Two S. thermophilus (St) CRISPR-Cas systems, CRISPR1-Cas and CRISPR3-Cas, have also been reported to prevent phage infection. The CRISPR1-Cas system interferes by cleaving foreign dsDNA entering the cell in a length-specific and orientation-dependent manner. The S. thermophilus CRISPR3-Cas system also acts by cleaving phage dsDNA genomes at the same specific position inside the targeted protospacer as observed in the CRISPR1-Cas system. Notably, for effective DNA cleavage activity, RNA-guided Cas9 orthologs require their own specific PAM (protospacer adjacent motif) sequences. Activity levels depend on the sequence of the protospacer and specific combinations of favorable PAM bases. Therefore, given the specific length and sequence of the PAM and the constant length of the target site for the three Cas9 orthologs, a well-organized procedure is required for high-throughput and accurate mining of possible target sites in a large genomic dataset. Consequently, we created a reliable procedure to explore potential gRNA sequences for type I (Streptococcus thermophilus), II (Streptococcus pyogenes), and III (Streptococcus thermophilus) CRISPR/Cas systems. To mine CRISPR target sites, four different search modes for sgRNA binding to the target DNA strand were applied: i) coding strand searching, ii) anti-coding strand searching, iii) both strand searching, and iv) paired-gRNA searching. The output of this procedure highlights the power of comparative genome mining for different CRISPR/Cas systems.
This could yield a repertoire of Cas9 variants with expanded capabilities for gRNA design, and will pave the way for further advances in genome and epigenome engineering.
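
As a rough illustration of PAM-based target-site mining, the Python sketch below scans a DNA string for a fixed-length protospacer immediately followed by a PAM, on the coding strand, the anti-coding strand, or both. It is not the authors' procedure. The default `NGG` PAM is the well-known S. pyogenes Cas9 motif; the 20-nt protospacer length, the function names and the example sequence are assumptions, and PAMs for other orthologs would be passed in through the `pam` parameter. Paired-gRNA searching is not covered.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq, protospacer_len=20, pam="NGG"):
    """List (start, protospacer, pam) hits on the given strand.

    'N' in the PAM matches any base. A lookahead is used so that
    overlapping candidate sites are all reported.
    """
    pam_regex = pam.upper().replace("N", "[ACGT]")
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]


def search(seq, mode="both", **kwargs):
    """Search 'coding', 'anticoding', or 'both' strands; returns {strand: hits}.

    Positions for the anti-coding strand are relative to the reverse complement.
    """
    results = {}
    if mode in ("coding", "both"):
        results["coding"] = find_targets(seq, **kwargs)
    if mode in ("anticoding", "both"):
        results["anticoding"] = find_targets(reverse_complement(seq.upper()), **kwargs)
    return results


if __name__ == "__main__":
    example = "A" * 20 + "TGG" + "CCATTT"  # made-up sequence for demonstration
    for strand, hits in search(example).items():
        print(strand, hits)
```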

Keywords: CRISPR/Cas systems, gRNA mining, Streptococcus pyogenes, Streptococcus thermophilus

Procedia PDF Downloads 224
620 The Analysis Fleet Operational Performance as an Indicator of Load and Haul Productivity

Authors: Linet Melisa Daubanes, Nhleko Monique Chiloane

Abstract:

The shovel-truck system is the most prevalent material handling system used in surface mining operations. Material handling entails the loading and hauling of material from production areas to dumping areas. The material handling process has operational delays that have a negative impact on the productivity of the load and haul fleet. Factors that may contribute to operational delays include shovel-truck mismatch, haul routes, machine breakdowns, and extreme weather conditions. The aim of this paper is to investigate factors that contribute to operational delays affecting the productivity of the load and haul fleet at the mine. Productivity is a measure of the effectiveness of producing products from a given quantity of units, that is, the ratio of outputs to inputs. Productivity can be improved by producing more outputs with the same or fewer units and/or by introducing better working methods. Several key performance indicators (KPIs) for the evaluation of productivity will be discussed in this study. These KPIs include but are not limited to hauling conditions, bucket fill factor, cycle time, and utilization. The research methodology of this study is a combination of on-site time studies and observations. Productivity can be optimized by managing the factors that affect the operational performance of the haulage fleet.
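
To make the relationship between these KPIs and productivity concrete, here is a small Python sketch. The output-to-input ratio follows the definition given in the abstract; the hourly production formula (rated payload × fill factor × cycles per hour × utilization) is a commonly used rule of thumb assumed here, not a formula stated by the authors, and all function names and numbers are hypothetical.

```python
def productivity(output, inputs):
    """Productivity as the ratio of output to inputs (e.g. tonnes per truck-hour)."""
    if inputs <= 0:
        raise ValueError("inputs must be positive")
    return output / inputs


def hourly_production(payload_t, fill_factor, cycle_time_min, utilization):
    """Estimated tonnes hauled per hour by one truck (assumed rule of thumb).

    payload_t       rated payload in tonnes
    fill_factor     fraction of the rated payload actually loaded (0..1)
    cycle_time_min  load-haul-dump-return cycle time in minutes
    utilization     fraction of scheduled time spent working (0..1)
    """
    if cycle_time_min <= 0:
        raise ValueError("cycle_time_min must be positive")
    cycles_per_hour = 60.0 / cycle_time_min
    return payload_t * fill_factor * cycles_per_hour * utilization


if __name__ == "__main__":
    base = hourly_production(100, 0.9, 30, 0.8)
    delayed = hourly_production(100, 0.9, 36, 0.8)  # 6 extra minutes of delay per cycle
    print(round(base, 1), round(delayed, 1))
```

The second call shows how an operational delay that lengthens the cycle time lowers the estimated production for the same truck.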

Keywords: cycle time, fleet performance, load and haul, surface mining

Procedia PDF Downloads 163
619 The Environmental Concerns in Coal Mining, and Utilization in Pakistan

Authors: S. R. H. Baqri, T. Shahina, M. T. Hasan

Abstract:

Pakistan is facing an acute shortage of energy and is looking for indigenous resources in its energy mix to meet the shortfall. After the discovery of huge coal resources in the Thar Desert of Sindh province, focus has shifted to coal power generation. The government of Pakistan has planned power generation of 20,000 MW from coal by the year 2025. This target will be achieved by mining and power generation in the Thar Coal Field and with imported coal in different parts of Pakistan. Total indigenous coal production of around 3.0 million tons is being utilized in brick kilns and the cement and sugar industries. Coal-based power generation is limited to three units of 50 MW near Hyderabad, supplied from the nearby Lakhra Coal Field. The purpose of this presentation is to identify and redress issues of coal mining and utilization with reference to environmental hazards. The Thar coal resource is estimated at 175 billion tons out of a total resource estimate of 184 billion tons in Pakistan. The coal of Pakistan is of Tertiary age (Palaeocene/Eocene) and is classified from lignite to sub-bituminous. Coal characterization has established three main pollutants, sulphur, carbon dioxide and methane, besides some others associated with coal and rock types. Sulphur occurs in organic as well as inorganic forms associated with coals, as free sulphur and as pyrite and gypsum, respectively. Carbon dioxide, methane and minerals are mostly associated with fractures, joints, local faults, seatearth and roof rocks. Abandoned and working coal mines give off a kerosene odour due to the escape of methane into the atmosphere. Frozen methane (methane ice) in organic-matter-rich sediments has also been reported from the Makran coastal and offshore areas. Sulphur escapes into the atmosphere during mining and utilization of coal in industry. Natural erosion by rivers, streams, lakes and coastal waves removes overlying sediments, allowing pollutants to escape into air and water.
Power plant emissions should be controlled through the application of appropriate clean coal technology and need to be regularly monitored. Therefore, systematic and scientific studies will be required to estimate the quantity of methane, carbon dioxide and sulphur at various sites such as abandoned and working coal mines and exploratory wells for coal, oil and gas. Pressure gauges on gas pipes connecting the coal-bearing horizons will be installed at the surface to determine the quantity of gas. The quality and quantity of the gases will be examined at defined time intervals. This will help to design and recommend methods and procedures to stop the escape of gases into the atmosphere. Sulphur can be partially removed by gravity and chemical methods after grinding and before industrial utilization of the coal.

Keywords: atmosphere, coal production, energy, pollutants

Procedia PDF Downloads 408
618 Cultural Dynamics in Online Consumer Behavior: Exploring Cross-Country Variances in Review Influence

Authors: Eunjung Lee

Abstract:

This research investigates the intricate connection between cultural differences and online consumer behaviors by integrating Hofstede's Cultural Dimensions theory with analysis methodologies such as text mining, data mining, and topic analysis. Our aim is to provide a comprehensive understanding of how national cultural differences influence individuals' behaviors when engaging with online reviews. To ensure the relevance of our investigation, we systematically analyze and interpret the cultural nuances influencing online consumer behaviors, especially in the context of online reviews. By anchoring our research in Hofstede's Cultural Dimensions theory, we seek to offer valuable insights for marketers to tailor their strategies based on the cultural preferences of diverse global consumer bases. In our methodology, we employ advanced text mining techniques to extract insights from a diverse range of online reviews gathered globally for a specific product or service like Netflix. This approach allows us to reveal hidden cultural cues in the language used by consumers from various backgrounds. Complementing text mining, data mining techniques are applied to extract meaningful patterns from online review datasets collected from different countries, aiming to unveil underlying structures and gain a deeper understanding of the impact of cultural differences on online consumer behaviors. The study also integrates topic analysis to identify recurring subjects, sentiments, and opinions within online reviews. Marketers can leverage these insights to inform the development of culturally sensitive strategies, enhance target audience segmentation, and refine messaging approaches aligned with cultural preferences. Anchored in Hofstede's Cultural Dimensions theory, our research employs sophisticated methodologies to delve into the intricate relationship between cultural differences and online consumer behaviors.
Applied to specific cultural dimensions, such as individualism vs. collectivism, masculinity vs. femininity, uncertainty avoidance, and long-term vs. short-term orientation, the study uncovers nuanced insights. For example, in exploring individualism vs. collectivism, we examine how reviewers from individualistic cultures prioritize personal experiences while those from collectivistic cultures emphasize communal opinions. Similarly, within masculinity vs. femininity, we investigate whether distinct topics align with cultural notions, such as robust features in masculine cultures and user-friendliness in feminine cultures. Examining information-seeking behaviors under uncertainty avoidance reveals how cultures differ in seeking detailed information or providing succinct reviews based on their comfort with ambiguity. Additionally, in assessing long-term vs. short-term orientation, the research explores how cultural focus on enduring benefits or immediate gratification influences reviews. These concrete examples contribute to the theoretical enhancement of Hofstede's Cultural Dimensions theory, providing a detailed understanding of cultural impacts on online consumer behaviors. As online reviews become increasingly crucial in decision-making, this research not only contributes to the academic understanding of cultural influences but also proposes practical recommendations for enhancing online review systems. Marketers can leverage these findings to design targeted and culturally relevant strategies, ultimately enhancing their global marketing effectiveness and optimizing online review systems for maximum impact.

Keywords: comparative analysis, cultural dimensions, marketing intelligence, national culture, online consumer behavior, text mining

Procedia PDF Downloads 20
617 GIS-Based Spatial Distribution and Evaluation of Selected Heavy Metals Contamination in Topsoil around Ecton Mining Area, Derbyshire, UK

Authors: Zahid O. Alibrahim, Craig D. Williams, Clive L. Roberts

Abstract:

The study area (Ecton mining area) is located in the southern part of the Peak District in Derbyshire, England. It is bounded to the west by the River Manifold. This area has been mined for a long period. As a result, huge amounts of potentially toxic metals were released into the surrounding area and are most likely a significant source of heavy metal contamination of the local soil, water and vegetation. In order to appraise the potential heavy metal pollution in this area, 37 topsoil samples (5-20 cm depth) were collected and analysed for their total content of Cu, Pb, Zn, Mn, Cr, Ni and V using ICP (Inductively Coupled Plasma) optical emission spectroscopy. Multivariate geospatial analyses using GIS were utilised to draw geochemical maps of the metals of interest over the study area. A few hotspots, areas of elevated metal concentrations, were identified, which are presumed to be the result of anthropogenic activities. In addition, the soil’s environmental quality was evaluated by calculating Müller's geoaccumulation index (Igeo), which suggests that the degree of contamination by the investigated heavy metals has the following trend: Pb > Zn > Cu > Mn > Ni = Cr = V. Furthermore, the potential ecological risk was assessed using the enrichment factor (EF). On the basis of the calculated EF values, the levels of pollution for the studied metals in the study area have the following order: Pb > Zn > Cu > Cr > V > Ni > Mn.
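
The two indices named in this abstract have standard textbook definitions, sketched below in Python: Igeo = log2(C / (1.5 × B)), where C is the measured concentration and B the geochemical background, and EF = (C_metal / C_ref) in the sample divided by (C_metal / C_ref) in the background, where the reference is a conservative element. The abstract does not state which background values or reference element the authors used, so the function names and all numbers in the example are hypothetical.

```python
import math


def geoaccumulation_index(concentration, background):
    """Mueller's geoaccumulation index: log2(C / (1.5 * B)).

    The factor 1.5 allows for natural variation in the background value.
    """
    if concentration <= 0 or background <= 0:
        raise ValueError("concentration and background must be positive")
    return math.log2(concentration / (1.5 * background))


def enrichment_factor(metal_sample, ref_sample, metal_background, ref_background):
    """Enrichment factor: (metal/ref) in the sample over (metal/ref) in the background."""
    if min(ref_sample, metal_background, ref_background) <= 0:
        raise ValueError("reference and background values must be positive")
    return (metal_sample / ref_sample) / (metal_background / ref_background)


if __name__ == "__main__":
    # Placeholder concentrations in mg/kg, chosen for easy arithmetic.
    print(geoaccumulation_index(300.0, 100.0))
    print(enrichment_factor(200.0, 50.0, 20.0, 10.0))
```

Ranking metals by either index, as the abstract does, amounts to computing the value per metal and sorting in descending order.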

Keywords: enrichment factor, geoaccumulation index, GIS, heavy metals, multivariate analysis

Procedia PDF Downloads 329
616 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases where entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets to be analyzed become ever larger. One type of symbolic data is the histogram, which makes it possible to store huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, histograms are complicated to analyze. This paper proposes a method for comparing two histogram-valued variables and thereby finding a dissimilarity between two histograms. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows histograms with different numbers of bins and bin widths (so-called general histograms) to be compared. The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, with the concept of cluster compactness used to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.
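
For orientation, the base measure mentioned in this abstract can be sketched for the simplest case of two interval-valued features. This is only the classical Ichino-Yaguchi form, join minus meet plus a gamma-weighted correction, and not the histogram extension the paper proposes; the function name and the example intervals are illustrative.

```python
def ichino_yaguchi_interval(a, b, gamma=0.5):
    """Ichino-Yaguchi dissimilarity between closed intervals a=(lo, hi) and b=(lo, hi).

    phi = |join| - |meet| + gamma * (2*|meet| - |a| - |b|), with 0 <= gamma <= 0.5.
    The join is the smallest interval covering both; the meet is their overlap.
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    join = max(a_hi, b_hi) - min(a_lo, b_lo)
    meet = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    return join - meet + gamma * (2 * meet - (a_hi - a_lo) - (b_hi - b_lo))

print(ichino_yaguchi_interval((0, 4), (2, 6)))  # partially overlapping intervals
print(ichino_yaguchi_interval((0, 4), (0, 4)))  # identical intervals
```

A histogram bin is itself an interval carrying a frequency, which is one reason an interval measure is a natural starting point for a histogram measure.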

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 133
615 From Electroencephalogram to Epileptic Seizures Detection by Using Artificial Neural Networks

Authors: Gaetano Zazzaro, Angelo Martone, Roberto V. Montaquila, Luigi Pavone

Abstract:

Seizures are the main factor affecting the quality of life of epileptic patients. The diagnosis of epilepsy, and hence the identification of the epileptogenic zone, is commonly made by using continuous electroencephalogram (EEG) signal monitoring. Seizure identification on EEG signals is performed manually by epileptologists, and this process is usually very long and error prone. The aim of this paper is to describe an automated method able to detect seizures in EEG signals, using the knowledge discovery in databases process and data mining methods and algorithms, which can support physicians during the seizure detection process. Our detection method is based on an Artificial Neural Network classifier, trained by applying the multilayer perceptron algorithm, and on a software application, called Training Builder, that has been developed for the massive extraction of features from EEG signals. This tool is able to cover all the data preparation steps ranging from signal processing to data analysis techniques, including the sliding window paradigm, dimensionality reduction algorithms, information theory, and feature selection measures. The final model shows excellent performance, reaching an accuracy of over 99% during tests on data of a single patient retrieved from a publicly available EEG dataset.
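
The sliding window paradigm mentioned in this abstract can be illustrated with a short sketch. The three features computed here (mean, variance, line length) are assumptions chosen for illustration; the abstract does not list the features that Training Builder extracts, and the function names are hypothetical.

```python
from statistics import mean, pvariance

def sliding_windows(signal, size, step):
    """Yield consecutive windows of `size` samples, advancing by `step` samples."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]

def window_features(window):
    """Compute a few simple per-window features (illustrative choice)."""
    line_length = sum(abs(b - a) for a, b in zip(window, window[1:]))
    return {"mean": mean(window), "variance": pvariance(window), "line_length": line_length}

def extract_features(signal, size, step):
    """Turn a one-dimensional signal into one feature row per window."""
    return [window_features(w) for w in sliding_windows(signal, size, step)]

rows = extract_features([0, 1, 0, 1, 0, 1, 0, 1], size=4, step=2)
print(len(rows), rows[0])
```

Each row would then be labelled (seizure or non-seizure) and passed to a classifier such as a multilayer perceptron.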

Keywords: artificial neural network, data mining, electroencephalogram, epilepsy, feature extraction, seizure detection, signal processing

Procedia PDF Downloads 159
614 Destination Port Detection For Vessels: An Analytic Tool For Optimizing Port Authorities Resources

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

Port authorities in congested ports face many challenges in allocating their resources to provide a safe and secure loading/unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master based on many factors such as weather, wavelength and changes of priorities. Having access to a tool which leverages AIS messages to monitor vessels’ movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely Reference Route of Trajectory (RRoT), to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring Automatic Identification System (AIS) messages. Our RRoT method creates a reference route based on historical AIS messages. It utilizes some of the best trajectory similarity measures to identify the destination of a vessel using its recent movement. We evaluated five different similarity measures: Discrete Fréchet Distance (DFD), Dynamic Time Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an F-measure of 99.08% using the Dynamic Time Warping (DTW) similarity measure.
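
The idea of matching a recent track against reference routes with DTW can be sketched as below. This is a textbook DTW on planar points, not the authors' RRoT implementation; the route names and coordinates are hypothetical, and real AIS positions in latitude/longitude would need a geodesic rather than Euclidean point distance.

```python
import math

def dtw_distance(traj_a, traj_b):
    """Classic dynamic time warping cost between two point sequences."""
    n, m = len(traj_a), len(traj_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(traj_a[i - 1], traj_b[j - 1])  # Euclidean distance (assumption)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def nearest_route(track, reference_routes):
    """Return the name of the reference route with the smallest DTW cost to the track."""
    return min(reference_routes, key=lambda name: dtw_distance(track, reference_routes[name]))

# Hypothetical reference routes keyed by destination port.
routes = {
    "port_a": [(0, 0), (1, 1), (2, 2)],
    "port_b": [(0, 0), (1, -1), (2, -2)],
}
print(nearest_route([(0, 0), (1, 0.9), (2, 2.1)], routes))
```

Because DTW tolerates sequences of different lengths and sampling rates, a short recent track can be compared with a longer historical route without resampling.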

Keywords: spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization

Procedia PDF Downloads 91
613 Application of Acid Base Accounting to Predict Post-Mining Drainage Quality in Coalfields of the Main Karoo Basin and Selected Sub-Basins, South Africa

Authors: Lindani Ncube, Baojin Zhao, Ken Liu, Helen Johanna Van Niekerk

Abstract:

Acid Base Accounting (ABA) is a tool used to assess the total amount of acidity or alkalinity contained in a specific rock sample, and is based on the total S concentration and the carbonate content of a sample. A preliminary ABA test was conducted on 14 sandstone and 5 coal samples taken from coalfields representing the Main Karoo Basin (Highveld, Vryheid and Molteno/Indwe Coalfields) and the Sub-basins (Witbank and Waterberg Coalfields). The results indicate that sandstone and coal from the Main Karoo Basin have the potential to generate Acid Mine Drainage (AMD), as they contain sufficient pyrite to generate acid, with the final pH of samples relatively low upon complete oxidation of pyrite. Sandstone from collieries representing the Main Karoo Basin is characterised by elevated contents of reactive S%. All the studied samples except two were characterised by an Acid Potential (AP) that is less than the Neutralizing Potential (NP). The results further indicate that the sandstone from the Main Karoo Basin is more prone to acid generation than the sandstone from the Sub-basins. However, the coal has a relatively low potential to generate any acid. The application of ABA in this study contributes to an understanding of the complexities governing water-rock interactions. In general, the coalfields from the Main Karoo Basin have a much higher potential to produce AMD during mining processes than the coalfields in the Sub-basins.
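
The AP/NP comparison described in this abstract can be illustrated with the conventional ABA arithmetic. The sketch assumes the widely used conversion AP = 31.25 x %S (in kg CaCO3 per tonne); the abstract does not confirm the exact procedure the authors followed, and the sample values are hypothetical.

```python
import math

def acid_potential(sulfur_pct):
    """AP in kg CaCO3/t from weight-percent sulfur, using the conventional 31.25 factor."""
    return 31.25 * sulfur_pct

def net_neutralizing_potential(np_value, ap_value):
    """NNP = NP - AP; negative values indicate a net acid-generating sample."""
    return np_value - ap_value

def neutralizing_potential_ratio(np_value, ap_value):
    """NPR = NP / AP; returned as infinity when the sample has no acid potential."""
    return math.inf if ap_value == 0 else np_value / ap_value

# Hypothetical sample: 2.0 % S and a measured NP of 40 kg CaCO3/t.
ap = acid_potential(2.0)
print(ap, net_neutralizing_potential(40.0, ap), neutralizing_potential_ratio(40.0, ap))
```

A sample with AP below NP, the case reported for most samples here, gives a positive NNP and an NPR above 1.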

Keywords: Main Karoo Basin, sub-basin, coal, sandstone, acid base accounting (ABA)

Procedia PDF Downloads 404
612 Text Mining Past Medical History in Electrophysiological Studies

Authors: Roni Ramon-Gonen, Amir Dori, Shahar Shelly

Abstract:

Background and objectives: Healthcare professionals produce abundant textual information in their daily clinical practice. The extraction of insights from all the gathered information, which is mainly unstructured and lacking in normalization, is one of the major challenges in computational medicine. In this respect, text mining brings together different techniques to derive valuable insights from unstructured textual data, which has made it especially relevant in medicine. A neurological patient’s history allows the clinician to define the patient’s symptoms and, along with the results of the nerve conduction study (NCS) and electromyography (EMG) test, assists in formulating a differential diagnosis. Past medical history (PMH) helps to direct the latter. In this study, we aimed to identify relevant PMH, understand which PMHs are common among patients in the referral cohort and documented by the medical staff, and examine the differences by sex and age in a large cohort based on textual format notes. Methods: We retrospectively identified all patients with abnormal NCS between May 2016 and February 2022. Age, gender, and all NCS report attributes were recorded, including the summary text. All patients’ histories were extracted from the text report by a query. Basic text cleansing and data preparation were performed, as well as lemmatization. Very popular words (like ‘left’ and ‘right’) were deleted. Several words were replaced with their abbreviations. A bag-of-words approach was used to perform the analyses. Different visualizations that are common in text analysis were created to make the results easy to grasp. Results: We identified 5282 unique patients. Three thousand and five (57%) patients had documented PMH, of whom 60.4% (n=1817) were males. The total median age was 62 years (range 0.12 – 97.2 years), and the majority of patients (83%) presented after the age of forty years. The top two documented medical histories were diabetes mellitus (DM) and surgery. DM was observed in 16.3% of the patients, and surgery in 15.4%. Other frequent patient histories (among the top 20) were fracture, cancer (ca), motor vehicle accident (MVA), leg, lumbar, discopathy, back and carpal tunnel release (CTR). When separating the data by sex, we can see that DM and MVA are more frequent among males, while cancer and CTR are less frequent. On the other hand, the top medical history in females was surgery and, after that, DM. Other frequent histories among females are breast cancer, fractures, and CTR. In the younger population (ages 18 to 26), the frequent PMH were surgery, fractures, trauma, and MVA. Discussion: By applying text mining approaches to unstructured data, we were able to better understand which medical histories are more relevant in these circumstances and, in addition, gain insights regarding sex and age differences. These insights might help to collect epidemiological and demographic data as well as raise new hypotheses. One limitation of this work is that each clinician might use different words or abbreviations to describe the same condition, and therefore using a coding system can be beneficial.
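
The preprocessing steps listed in the Methods (cleansing, removal of very popular words, abbreviation replacement, bag of words) can be sketched as follows. The stop-word list, the abbreviation map and the example notes are hypothetical, and lemmatization is omitted to keep the sketch dependency-free.

```python
import re
from collections import Counter

# Hypothetical lists; the abstract only names 'left' and 'right' as removed words.
STOPWORDS = {"left", "right", "of", "the", "and", "with", "a", "in"}
ABBREVIATIONS = {
    "diabetes mellitus": "dm",
    "motor vehicle accident": "mva",
    "carpal tunnel release": "ctr",
}

def normalize(note):
    """Lowercase, replace known phrases by abbreviations, tokenize, drop stop words."""
    text = note.lower()
    for phrase, abbreviation in ABBREVIATIONS.items():
        text = text.replace(phrase, abbreviation)
    return [token for token in re.findall(r"[a-z]+", text) if token not in STOPWORDS]

def document_frequency(notes):
    """Count, for each term, the number of notes (patients) that mention it."""
    counts = Counter()
    for note in notes:
        counts.update(set(normalize(note)))
    return counts

notes = [
    "Diabetes mellitus, surgery of the left knee",
    "Motor vehicle accident with fracture; DM",
    "Surgery in right hand",
]
frequency = document_frequency(notes)
print(frequency["dm"], frequency["surgery"], frequency["mva"])
```

Counting each term once per note, rather than once per occurrence, matches the way the Results report the share of patients with a given history.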

Keywords: abnormal studies, healthcare analytics, medical history, nerve conduction studies, text mining, textual analysis

Procedia PDF Downloads 64
611 Information Management Approach in the Prediction of Acute Appendicitis

Authors: Ahmad Shahin, Walid Moudani, Ali Bekraki

Abstract:

This research aims at presenting a predictive data mining model for the accurate diagnosis of acute appendicitis in patients, for the purpose of maximizing health service quality, minimizing morbidity/mortality, and reducing cost. Acute appendicitis is the most common disease which requires timely, accurate diagnosis and surgical intervention. Although the treatment of acute appendicitis is simple and straightforward, its diagnosis is still difficult because no single sign, symptom, laboratory or imaging examination accurately confirms the diagnosis of acute appendicitis in all cases. This contributes to increased morbidity and negative appendectomies. In this study, the authors propose to generate an accurate model for the prediction of patients with acute appendicitis which is based, firstly, on a segmentation technique associated with the ABC algorithm to segment the patients; secondly, on applying fuzzy logic to process the massive volume of heterogeneous and noisy data (age, sex, fever, white blood cell, neutrophilia, CRP, urine, ultrasound, CT, appendectomy, etc.) in order to express knowledge and analyze the relationships among data in a comprehensive manner; and thirdly, on applying a dynamic programming technique to reduce the number of data attributes. The proposed model is evaluated against a set of benchmark techniques and also on a set of benchmark classification problems of osteoporosis, diabetes and heart disease obtained from UCI data and other data sources.

Keywords: healthcare management, acute appendicitis, data mining, classification, decision tree

Procedia PDF Downloads 324
610 Embodying the Ecological Validity in Creating the Sustainable Public Policy: A Study in Strengthening the Green Economy in Indonesia

Authors: Gatot Dwi Hendro, Hayyan ul Haq

Abstract:

This work aims to explore the strategy for embodying ecological validity in creating sustainable public policy, particularly in strengthening the green economy in Indonesia. The green economy plays an important role in supporting national development in Indonesia, as it is part of the national policy that sets the primary priority in Indonesian governance. The green economy refers to national development covering strategic natural resources, such as mining, gold, oil, coal, forest, water, and marine resources, and the supporting infrastructure for products and distribution, such as fabrics, roads, bridges, and so forth. Thus, all activities in national development should take sustainability into account. This sustainability requires a strong commitment from the national, regional, and local governments to make ecology the main requirement for issuing any policy, such as licences for mining production and for developing and building new production and supporting infrastructure to optimise national resources. For that reason, this work focuses on the strategy for embodying ecological values and norms in public policy. In detail, this work offers a method, i.e. legal techniques, for visualising and embodying norms and public policy that are ecologically valid. This ecological validity is required in order to maintain and sustain our collective life.

Keywords: ecological validity, sustainable development, coherence, Indonesian Pancasila values, environment, marine

Procedia PDF Downloads 455
609 Using Rainfall Simulators to Design and Assess the Post-Mining Erosional Stability

Authors: Ashraf M. Khalifa, Hwat Bing So, Greg Maddocks

Abstract:

Changes to the mining environmental approvals process in Queensland have been rolled out under the MERFP Act (2018). These include requirements for a Progressive Rehabilitation and Closure Plan (PRC Plan). Key considerations of the landform design report within the PRC Plan must include: (i) identification of materials available for landform rehabilitation, including their ability to achieve the required landform design outcomes, (ii) erosion assessments to determine landform heights, gradients, profiles, and material placement, (iii) slope profile design considering the interactions between soil erodibility, rainfall erosivity, landform height, gradient, and vegetation cover to identify acceptable erosion rates over a long-term average, (iv) an analysis of future stability based on the factors described above, e.g., erosion and/or landform evolution modelling. ACARP funded an extensive and thorough erosion assessment program using rainfall simulators from 1998 to 2010. The ACARP program included laboratory assessment of 35 soil and spoil samples from 16 coal mines and samples from a gold mine in Queensland using a 3 x 0.8 m laboratory rainfall simulator. The reliability of the laboratory rainfall simulator was verified through field measurements using larger flumes (20 x 5 m) and catchment-scale measurements at three sites (3 different catchments, average area of 2.5 ha each). Soil cover systems are a primary component of a constructed mine landform. The primary functions of a soil cover system are to sustain vegetation and limit the infiltration of water and oxygen into underlying reactive mine waste. If the external surface of the landform erodes, the functions of the cover system cannot be maintained, and the cover system will most likely fail. Assessing a constructed landform’s potential ‘long-term’ erosion stability requires defensible erosion rate thresholds below which rehabilitation landform designs are considered acceptably erosion-resistant or ‘stable’. The process used to quantify erosion rates, in which rainfall simulators (flumes) measure rill and inter-rill erosion on bulk samples under laboratory conditions or on in-situ material under field conditions, will be explained.

Keywords: open-cut, mining, erosion, rainfall simulator

Procedia PDF Downloads 74
608 Characteristic Study of Polymer Sand as a Potential Substitute for Natural River Sand in Construction Industry

Authors: Abhishek Khupsare, Ajay Parmar, Ajay Agarwal, Swapnil Wanjari

Abstract:

The extreme demand for aggregate leads to the exploitation of river beds for fine aggregates, affecting the environment adversely. Therefore, a suitable alternative to natural river sand is essentially required. This study focuses on preventing environmental impact by developing polymer sand to replace natural river sand (NRS). Polymer sand was developed by mixing high-volume fly ash, bottom ash, cement, natural river sand, and locally purchased high solid content polycarboxylate ether-based superplasticizer (HS-PCE). All the physical and chemical properties of polymer sand (P-Sand) were observed and satisfied the requirements of the Indian Standard code. P-Sand yields a good specific gravity of 2.31 and is classified as zone-I sand with a satisfactory friction angle (37°) compared to natural river sand (NRS) and geopolymer fly ash sand (GFS). Though the water absorption (6.83%) and pH (12.18) are slightly higher than those of GFS and NRS, the alkali silica reaction and soundness are well within the permissible limits as per Indian Standards. The chemical analysis by X-ray fluorescence showed the presence of high amounts of SiO2 and Al2O3, at 58.879% and 26.77%, respectively. Finally, the compressive strength of M-25 grade concrete using P-Sand and geopolymer sand (GFS) was observed to be 87.51% and 83.82%, respectively, of that obtained with natural river sand (NRS) after 28 days. The results of this study indicate that P-Sand can be a good alternative to NRS for construction work, as it not only reduces the environmental effect of sand mining but also makes use of fly ash and bottom ash.

Keywords: polymer sand, fly ash, bottom ash, HSPCE plasticizer, river sand mining

Procedia PDF Downloads 42
607 Mineral Deposits in Spatial Planning Systems – Review of European Practices

Authors: Alicja Kot-Niewiadomska

Abstract:

Securing sustainable access to raw materials is vital for the growth of the European economy and for the goals laid down in the Europe 2020 strategy. Primary deposits are one of the most important sources of mineral raw materials. Managing them efficiently, including their extraction, will ensure the competitiveness of the European economy. A critical element of this approach is the safeguarding of mineral deposits, and the most important tool for it is spatial planning. The safeguarding of deposits should be understood as safeguarding land access and protecting the area against development which may potentially prevent the use of the deposit and the necessary mining activities. Many European Union countries have successfully integrated their mineral policy and spatial policy, which has ensured the proper place of mineral deposits in their spatial planning systems. These, in turn, are widely recognized as the most important mineral deposit safeguarding tool, the essence of which is to ensure long-term access to the resources. Austria, Portugal, Slovakia, the Czech Republic, Sweden, and the United Kingdom, discussed in the paper, are often mentioned as examples of good practice in this area. Although none of these countries has managed to avoid social and environmental conflicts related to mining activities, the solutions they implement certainly deserve special attention. For many countries, including Poland, they can be a potential source of solutions aimed at improving the protection of mineral deposits.

Keywords: mineral deposits, land use planning, mineral deposit safeguarding, European practices

Procedia PDF Downloads 142
606 A Practical and Theoretical Study on the Electromotor Bearing Defect Detection in a Wet Mill Using the Vibration Analysis Method and Defect Length Calculation in the Bearing

Authors: Mostafa Firoozabadi, Alireza Foroughi Nematollahi

Abstract:

Wet mills are among the most important pieces of equipment in the mining industry, and any defect in them can stop the production line and cause irrecoverable damage to the system. Electromotors are significant parts of a mill, and monitoring them is necessary to prevent unwanted defects. The purpose of this study is to investigate electromotor bearing defects, theoretically and practically, using the vibration analysis method. When a defect occurs in a bearing, it can spread to other parts of the bearing, such as the inner ring, outer ring, balls, and cage. The source of electromotor defects can be electrical or mechanical. Sometimes the electrical and mechanical defect frequencies are modulated, and bearing defect detection becomes difficult. In this paper, to detect the electromotor bearing defects, the electrical and mechanical defect frequencies are extracted first. Then, by calculating the bearing defect frequencies and analysing the spectrum and time signal, the bearing defects are detected. In addition, the obtained frequency indicates the level of the bearing at which the defect has occurred, and comparing this level with the standards determines the remaining lifetime of the bearing. Finally, the defect length is calculated by theoretical equations to demonstrate that there is no need to replace the bearing. The results of the proposed method, which has been implemented on the wet mills of the Golgohar mining and industrial company in Iran, show that this method is capable of detecting electromotor bearing defects accurately and on time.
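
The bearing defect frequencies referred to in this abstract are usually computed from bearing geometry with the standard kinematic formulas, sketched below. The sketch assumes a stationary outer ring and a rotating inner ring; the geometry values are hypothetical and are not taken from the paper.

```python
import math

def bearing_defect_frequencies(shaft_hz, n_balls, ball_d, pitch_d, contact_angle_deg=0.0):
    """Return the characteristic defect frequencies (Hz) of a rolling-element bearing.

    FTF  = cage (fundamental train) frequency
    BPFO = ball pass frequency, outer ring
    BPFI = ball pass frequency, inner ring
    BSF  = ball spin frequency
    """
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    return {
        "FTF": 0.5 * shaft_hz * (1 - ratio),
        "BPFO": 0.5 * n_balls * shaft_hz * (1 - ratio),
        "BPFI": 0.5 * n_balls * shaft_hz * (1 + ratio),
        "BSF": (pitch_d / (2 * ball_d)) * shaft_hz * (1 - ratio ** 2),
    }

# Hypothetical bearing: 8 balls, 10 mm ball diameter, 50 mm pitch diameter, 25 Hz shaft speed.
for name, value in bearing_defect_frequencies(25.0, 8, 10.0, 50.0).items():
    print(name, round(value, 2))
```

Peaks in the vibration spectrum at one of these frequencies, or at its harmonics and sidebands, point to the bearing component where the defect is located.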

Keywords: bearing defect length, defect frequency, electromotor defects, vibration analysis

Procedia PDF Downloads 472
605 Analysis of the Development of Mining Companies Social Corporate Responsibility Based on the Rating Score

Authors: Tatiana Ponomarenko, Oksana Marinina, Marina Nevskaya

Abstract:

Modern corporate social responsibility (CSR) is a sphere of multilevel responsibility of a company toward society, represented by various stakeholders. The relevance of CSR management is growing due to the active development of socially responsible investing (principles for responsible investment) that takes into account environmental, social and corporate governance (ESG) factors, and due to the growing attention of the investment community in general to the long-term stability of companies and the quality of control of nonfinancial risks. The modern approach to CSR strategic management is aimed at creating trusting relationships with stakeholders, on the basis of which a contribution to the sustainable development of companies, regions, and national economies is ensured. However, the practical concepts of social responsibility in mining companies differ, which leads to varying degrees of application of CSR. Some companies implement CSR using a traditional (limited) understanding of responsibility toward employees and counterparties, while others understand CSR much more broadly and try to use the levers of efficient cooperation. As the scope of CSR measures in large mining companies is diverse and characterized by different indices, the study was aimed at evaluating CSR efficiency on the basis of a proprietary methodology and determining the level of development of CSR management in terms of anti-crisis, reactive and proactive development. The research methodology includes analysis of integrated Global Reporting Initiative (GRI) reports of large mining companies; selection of the most representative sectoral agents by the criterion of regularity of issuance and publication of reports; and calculation of indices evaluating the CSR level of the selected companies over time. The methodology for evaluating the CSR level is based on a rating score of changes in standard indices of GRI reports in the economic, environmental, and social directions. Results: Based on the results of the analysis, companies of the fuel and energy and metallurgical complexes, the overwhelming majority of which report three indices out of a wide range of possible indicators of the SDGs (Sustainable Development Goals), were selected for the study. The evaluation of the scope of CSR of the companies Gazprom, LUKOIL, Metalloinvest, Nornikel, Rosneft, Severstal, SIBUR, and SUEK corresponds to the reactive type of development according to a scale of CSR strategic management, which is the average of the possible values. The chief drawback is that companies, in the process of analyzing global goals, often choose the goals which relate to their own activities, paying insufficient attention to the interests of the stakeholders inside the country. This fact points to the need to search for more effective mechanisms of CSR control. Acknowledgment: This article was prepared with grant support from the RFBR, project 19-510-44013 'Development of the concept of mineral resources value formation in the context of sustainable development in resource-oriented economies'.

Keywords: sustainable development, corporate social responsibility, development strategies, efficiency assessment

Procedia PDF Downloads 106
604 A Novel Approach for the Analysis of Ground Water Quality by Using Classification Rules and Water Quality Index

Authors: Kamakshaiah Kolli, R. Seshadri

Abstract:

Water is a key resource in all economic activities ranging from agriculture to industry. Only a tiny fraction of the planet's abundant water is available to us as fresh water. Assessment of water quality has always been paramount in the field of environmental quality management. It is the foundation for health, hygiene, progress and prosperity. With the ever-increasing pressure of human population, there is severe stress on water resources. Therefore, efficient water management is essential to civil society for the betterment of quality of life. The present study focuses on groundwater quality, sources of groundwater contamination, variation of groundwater quality and its spatial distribution. The bases for groundwater quality assessment are groundwater bodies and a representative monitoring network that enables determination of the chemical status of a groundwater body. For this study, water samples were collected from various areas across the entire corporation area of Guntur. Water is required by all living organisms, and 1.7% of it is available as groundwater. Water has no calories or nutrients but is essential for various metabolic activities in our body. Chemical and physical parameters can be tested to determine the potability of groundwater. Electrical conductivity, pH, alkalinity, total alkalinity, TDS, calcium, magnesium, sodium, potassium, chloride, and sulphate of the groundwater from different areas of Guntur district were analyzed. Our aim is to check whether the groundwater from these areas is potable or not. As multiple variables are present, a data mining technique using JRIP rules was employed for classifying the groundwater.

Keywords: groundwater, water quality standards, potability, data mining, JRIP, PCA, classification

Procedia PDF Downloads 404
603 Investigating Dynamic Transition Process of Issues Using Unstructured Text Analysis

Authors: Myungsu Lim, William Xiu Shun Wong, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Namgyu Kim

Abstract:

The amount of real-time data generated through various mass media has been increasing rapidly. In this study, we performed topic analysis using unstructured text data distributed through news articles. As one of the most prevalent applications of topic analysis, the issue tracking technique investigates the changes in the social issues that are identified through topic analysis. Currently, traditional issue tracking is conducted by identifying the main topics of documents that cover an entire period at the same time and analyzing the occurrence of each topic by the period of occurrence. However, this traditional issue tracking approach has the limitation that it cannot discover the dynamic mutation process of complex social issues. The purpose of this study is to overcome the limitations of the existing issue tracking method. We first derive the core issues of each period and then discover the dynamic mutation process of various issues. In this study, we further analyze the mutation process from the perspective of issue categories, in order to figure out the pattern of issue flow, including the frequency and reliability of the pattern. In other words, this study allows us to understand the components of complex issues by tracking the dynamic history of issues. This methodology can facilitate a clearer understanding of complex social phenomena by providing the mutation history and related category information of the phenomena.

Keywords: data mining, issue tracking, text mining, topic analysis, topic detection, trend detection

Procedia PDF Downloads 377
602 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the classification of an unstructured format description for the identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format identification method employs a file format classifier and associated configurations to support digital preservation experts with an estimation of the required file format. Our goal is to make use of a format specification knowledge base aggregated from different Web sources in order to select a file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert the file format for their institution. The proposed methods facilitate the selection of a file format and improve the quality of the digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises the most common terms that are characteristic of all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results, including probabilities, are presented in the evaluation section.
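
How a naive Bayes classifier can rank file formats from a free-text description may be sketched as below. This is a generic multinomial naive Bayes with Laplace smoothing, not the authors' system; the class name, format labels and training descriptions are hypothetical toy data.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Multinomial naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, doc):
        """Return the unnormalized log posterior of each class for a description."""
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for word in doc.lower().split():
                if word in self.vocab:  # words outside the vocabulary are ignored
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)

# Hypothetical format descriptions, for illustration only.
docs = [
    "open standard lossless image compression",
    "lossy image compression small size",
    "open standard document text archival",
]
labels = ["format_a", "format_b", "format_c"]
model = NaiveBayesText().fit(docs, labels)
print(model.predict("lossless open image"))
```

Exponentiating and normalizing the log scores gives per-format probabilities of the kind that could be shown to an expert alongside the recommendation.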

Keywords: data mining, digital libraries, digital preservation, file format

Procedia PDF Downloads 472