Search results for: Web Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1084

Search results for: Web Mining

544 Evaluation of Arsenic Removal in Soils Contaminated by the Phytoremediation Technique

Authors: V. Ibujes, A. Guevara, P. Barreto

Abstract:

Concentration of arsenic represents a serious threat to human health. It is a bioaccumulable toxic element and is transferred through the food chain. In Ecuador, values of 0.0423 mg/kg As are registered in potatoes of the skirts of the Tungurahua volcano. The increase of arsenic contamination in Ecuador is mainly due to mining activity, since the process of gold extraction generates toxic tailings with mercury. In the Province of Azuay, due to the mining activity, the soil reaches concentrations of 2,500 to 6,420 mg/kg As whereas in the province of Tungurahua it can be found arsenic concentrations of 6.9 to 198.7 mg/kg due to volcanic eruptions. Since the contamination by arsenic, the present investigation is directed to the remediation of the soils in the provinces of Azuay and Tungurahua by phytoremediation technique and the definition of a methodology of extraction by means of analysis of arsenic in the system soil-plant. The methodology consists in selection of two types of plants that have the best arsenic removal capacity in synthetic solutions 60 μM As, a lower percentage of mortality and hydroponics resistance. The arsenic concentrations in each plant were obtained from taking 10 ml aliquots and the subsequent analysis of the ICP-OES (inductively coupled plasma-optical emission spectrometry) equipment. Soils were contaminated with synthetic solutions of arsenic with the capillarity method to achieve arsenic concentration of 13 and 15 mg/kg. Subsequently, two types of plants were evaluated to reduce the concentration of arsenic in soils for 7 weeks. The global variance for soil types was obtained with the InfoStat program. To measure the changes in arsenic concentration in the soil-plant system, the Rhizo and Wenzel arsenic extraction methodology was used and subsequently analyzed with the ICP-OES (optima 8000 Pekin Elmer). As a result, the selected plants were bluegrass and llanten, due to the high percentages of arsenic removal of 55% and 67% and low mortality rates of 9% and 8% respectively. In conclusion, Azuay soil with an initial concentration of 13 mg/kg As reached the concentrations of 11.49 and 11.04 mg/kg As for bluegrass and llanten respectively, and for the initial concentration of 15 mg/kg As reached 11.79 and 11.10 mg/kg As for blue grass and llanten after 7 weeks. For the Tungurahua soil with an initial concentration of 13 mg/kg As it reached the concentrations of 11.56 and 12.16 mg/kg As for the bluegrass and llanten respectively, and for the initial concentration of 15 mg/kg As reached 11.97 and 12.27 mg/kg Ace for bluegrass and llanten after 7 weeks. The best arsenic extraction methodology of soil-plant system is Wenzel.

Keywords: blue grass, llanten, phytoremediation, soil of Azuay, soil of Tungurahua, synthetic arsenic solution

Procedia PDF Downloads 103
543 The Crisis of Turkey's Downing the Russian Warplane within the Concept of Country Branding: The Examples of BBC World, and Al Jazeera English

Authors: Derya Gül Ünlü, Oguz Kuş

Abstract:

The branding of a country means that the country has its own position different from other countries in its region and thus it is perceived more specifically. It is made possible by the branding efforts of a country and the uniqueness of all the national structures, by presenting it in a specific way, by creating the desired image and attracting tourists and foreign investors. Establishing a national brand involves, in a sense, the process of managing the perceptions of the citizens of the other country about the target country, by structuring the image of the country permanently and holistically. By this means, countries are not easily affected by their crisis of international relations. Therefore, within the scope of the research that will be carried out from this point, it is aimed to show how the warplane downing crisis between Turkey and Russia is perceived on social media. The Russian warplane was downed by Turkey on November 24, 2015, on the grounds that Turkey violated the airspace on the Syrian border. Whereupon the relations between the two countries have been tensed, and Russia has called on its citizens not to go to Turkey and citizens in Turkey to return to their countries. Moreover, relations between two countries have been weakened, for example, tourism tours organized in Russia to Turkey and visa-free travel were canceled and all military dialogue was cut off. After the event, various news sites on social media published plenty of news related to topic and the readers made various comments about the event and Turkey. In this context, an investigation into the perception of Turkey's national brand before and after the warplane downing crisis has been conducted. through comments fetched from the reports on the BBC World, and from Al Jazeera English news sites on Facebook accounts, which takes place widely in the social media. In order to realize study, user comments were fetched from jet downing-related news which are published on Facebook fan-page of BBC World Service, and Al Jazeera English. Regarding this, all the news published between 24.10.2015-24.12.2015 and containing Turk and Turkey keyword in its title composed data set of our study. Afterwards, comments written to these news were analyzed via text mining technique. Furthermore, by sentiment analysis, it was intended to reveal reader’s emotions before and after the crisis.

Keywords: Al Jazeera English, BBC World, country branding, social media, text mining

Procedia PDF Downloads 223
542 Extracting Opinions from Big Data of Indonesian Customer Reviews Using Hadoop MapReduce

Authors: Veronica S. Moertini, Vinsensius Kevin, Gede Karya

Abstract:

Customer reviews have been collected by many kinds of e-commerce websites selling products, services, hotel rooms, tickets and so on. Each website collects its own customer reviews. The reviews can be crawled, collected from those websites and stored as big data. Text analysis techniques can be used to analyze that data to produce summarized information, such as customer opinions. Then, these opinions can be published by independent service provider websites and used to help customers in choosing the most suitable products or services. As the opinions are analyzed from big data of reviews originated from many websites, it is expected that the results are more trusted and accurate. Indonesian customers write reviews in Indonesian language, which comes with its own structures and uniqueness. We found that most of the reviews are expressed with “daily language”, which is informal, do not follow the correct grammar, have many abbreviations and slangs or non-formal words. Hadoop is an emerging platform aimed for storing and analyzing big data in distributed systems. A Hadoop cluster consists of master and slave nodes/computers operated in a network. Hadoop comes with distributed file system (HDFS) and MapReduce framework for supporting parallel computation. However, MapReduce has weakness (i.e. inefficient) for iterative computations, specifically, the cost of reading/writing data (I/O cost) is high. Given this fact, we conclude that MapReduce function is best adapted for “one-pass” computation. In this research, we develop an efficient technique for extracting or mining opinions from big data of Indonesian reviews, which is based on MapReduce with one-pass computation. In designing the algorithm, we avoid iterative computation and instead adopt a “look up table” technique. The stages of the proposed technique are: (1) Crawling the data reviews from websites; (2) cleaning and finding root words from the raw reviews; (3) computing the frequency of the meaningful opinion words; (4) analyzing customers sentiments towards defined objects. The experiments for evaluating the performance of the technique were conducted on a Hadoop cluster with 14 slave nodes. The results show that the proposed technique (stage 2 to 4) discovers useful opinions, is capable of processing big data efficiently and scalable.

Keywords: big data analysis, Hadoop MapReduce, analyzing text data, mining Indonesian reviews

Procedia PDF Downloads 201
541 In-situ Phytoremediation Of Polluted Soils By Micropollutants From Artisanal Gold Mining Processes In Burkina Faso

Authors: Yamma Rose, Kone Martine, Yonli Arsène, Wanko Ngnien Adrien

Abstract:

Artisanal gold mining has seen a resurgence in recent years in Burkina Faso with its corollary of soil and water pollution. Indeed, in addition to visible impacts, it generates discharges rich in trace metal elements and acids. This pollution has significant environmental consequences, making these lands unusable while the population depends on the natural environment for its survival. The goal of this study is to assess the decontamination potential of Chrysopogon zizanioides on two artisanal gold processing sites in Burkina Faso. The cyanidation sites of Nebia (1Ha) and Nimbrogo (2Ha) located respectively in the Central West and Central South regions were selected. The soils were characterized to determine the initial pollution levels before the implementation of phytoremediation. After development of the site, parallel trenches equidistant 6 m apart, 30 cm deep, 40 cm wide and opposite to the water flow direction were dug and filled with earth amended with manure. The Chrysopogon zizanioides plants were transplanted 5 cm equidistant into the trenches. The mere fact that Chrysopogon zizanioides grew in the polluted soil is an indication that this plant tolerates and resists the toxicity of trace elements present on the site. The characterization shows sites very polluted with free cyanide 900 times higher than the national standard, the level of Hg in the soil is 5 times more than the limit value, iron and Zn are respectively 1000 times and 200 more than the tolerated environmental value. At time T1 (6 months) and T2 (12 months) of culture, Chrysopogon zizanioides showed less development on the Nimbrogo site than that of the Nebia site. Plant shoots and associated soil samples were collected and analyzed for total As, Hg, Fe and Zn concentration. The trace element content of the soil, the bioaccumulation factor and the hyper accumulation thresholds were also determined to assess the remediation potential. The concentration of As and Hg in the soil was below international risk thresholds, while that of Fe and Zn was well above these thresholds. The CN removal efficiency at the Nebia site is respectively 29.90% and 68.62% compared to 6.6% and 60.8% at Nimbrogo at time T1 and T2.

Keywords: chrysopogon zizanioides, in-situ phytoremediation, polluted soils, micropollutants

Procedia PDF Downloads 78
540 Predicting Success and Failure in Drug Development Using Text Analysis

Authors: Zhi Hao Chow, Cian Mulligan, Jack Walsh, Antonio Garzon Vico, Dimitar Krastev

Abstract:

Drug development is resource-intensive, time-consuming, and increasingly expensive with each developmental stage. The success rates of drug development are also relatively low, and the resources committed are wasted with each failed candidate. As such, a reliable method of predicting the success of drug development is in demand. The hypothesis was that some examples of failed drug candidates are pushed through developmental pipelines based on false confidence and may possess common linguistic features identifiable through sentiment analysis. Here, the concept of using text analysis to discover such features in research publications and investor reports as predictors of success was explored. R studios were used to perform text mining and lexicon-based sentiment analysis to identify affective phrases and determine their frequency in each document, then using SPSS to determine the relationship between our defined variables and the accuracy of predicting outcomes. A total of 161 publications were collected and categorised into 4 groups: (i) Cancer treatment, (ii) Neurodegenerative disease treatment, (iii) Vaccines, and (iv) Others (containing all other drugs that do not fit into the 3 categories). Text analysis was then performed on each document using 2 separate datasets (BING and AFINN) in R within the category of drugs to determine the frequency of positive or negative phrases in each document. A relative positivity and negativity value were then calculated by dividing the frequency of phrases with the word count of each document. Regression analysis was then performed with SPSS statistical software on each dataset (values from using BING or AFINN dataset during text analysis) using a random selection of 61 documents to construct a model. The remaining documents were then used to determine the predictive power of the models. Model constructed from BING predicts the outcome of drug performance in clinical trials with an overall percentage of 65.3%. AFINN model had a lower accuracy at predicting outcomes compared to the BING model at 62.5% but was not effective at predicting the failure of drugs in clinical trials. Overall, the study did not show significant efficacy of the model at predicting outcomes of drugs in development. Many improvements may need to be made to later iterations of the model to sufficiently increase the accuracy.

Keywords: data analysis, drug development, sentiment analysis, text-mining

Procedia PDF Downloads 157
539 A Word-to-Vector Formulation for Word Representation

Authors: Sandra Rizkallah, Amir F. Atiya

Abstract:

This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.

Keywords: natural language processing, word to vector, text similarity, text mining

Procedia PDF Downloads 275
538 Structural Analysis and Modelling in an Evolving Iron Ore Operation

Authors: Sameh Shahin, Nannang Arrys

Abstract:

Optimizing pit slope stability and reducing strip ratio of a mining operation are two key tasks in geotechnical engineering. With a growing demand for minerals and an increasing cost associated with extraction, companies are constantly re-evaluating the viability of mineral deposits and challenging their geological understanding. Within Rio Tinto Iron Ore, the Structural Geology (SG) team investigate and collect critical data, such as point based orientations, mapping and geological inferences from adjacent pits to re-model deposits where previous interpretations have failed to account for structurally controlled slope failures. Utilizing innovative data collection methods and data-driven investigation, SG aims to address the root causes of slope instability. Committing to a resource grid drill campaign as the primary source of data collection will often bias data collection to a specific orientation and significantly reduce the capability to identify and qualify complexity. Consequently, these limitations make it difficult to construct a realistic and coherent structural model that identifies adverse structural domains. Without the consideration of complexity and the capability of capturing these structural domains, mining operations run the risk of inadequately designed slopes that may fail and potentially harm people. Regional structural trends have been considered in conjunction with surface and in-pit mapping data to model multi-batter fold structures that were absent from previous iterations of the structural model. The risk is evident in newly identified dip-slope and rock-mass controlled sectors of the geotechnical design rather than a ubiquitous dip-slope sector across the pit. The reward is two-fold: 1) providing sectors of rock-mass controlled design in previously interpreted structurally controlled domains and 2) the opportunity to optimize the slope angle for mineral recovery and reduced strip ratio. Furthermore, a resulting high confidence model with structures and geometries that can account for historic slope instabilities in structurally controlled domains where design assumptions failed.

Keywords: structural geology, geotechnical design, optimization, slope stability, risk mitigation

Procedia PDF Downloads 46
537 Modernization of Translation Studies Curriculum at Higher Education Level in Armenia

Authors: A. Vahanyan

Abstract:

The paper touches upon the problem of revision and modernization of the current curriculum on translation studies at the Armenian Higher Education Institutions (HEIs). In the contemporary world where quality and speed of services provided are mostly valued, certain higher education centers in Armenia though do not demonstrate enough flexibility in terms of the revision and amendment of courses taught. This issue is present for various curricula at the university level and Translation Studies related curriculum, in particular. Technological innovations that are of great help for translators have been long ago smoothly implemented into the global Translation Industry. According to the European Master's in Translation (EMT) framework, translation service provision comprises linguistic, intercultural, information mining, thematic, and technological competencies. Therefore, to form the competencies mentioned above, the curriculum should be seriously restructured to meet the modern education and job market requirements, relevant courses should be proposed. New courses, in particular, should focus on the formation of technological competences. These suggestions have been made upon the author’s research of the problem across various HEIs in Armenia. The updated curricula should include courses aimed at familiarization with various computer-assisted translation (CAT) tools (MemoQ, Trados, OmegaT, Wordfast, etc.) in the translation process, creation of glossaries and termbases compatible with different platforms), which will ensure consistency in translation of similar texts and speeding up the translation process itself. Another aspect that may be strengthened via curriculum modification is the introduction of interdisciplinary and Project-Based Learning courses, which will enable info mining and thematic competences, which are of great importance as well. Of course, the amendment of the existing curriculum with the mentioned courses will require corresponding faculty development via training, workshops, and seminars. Finally, the provision of extensive internship with translation agencies is strongly recommended as it will ensure the synthesis of theoretical background and practical skills highly required for the specific area. Summing up, restructuring and modernization of the existing curricula on Translation Studies should focus on three major aspects, i.e., introduction of new courses that meet the global quality standards of education, professional development for faculty, and integration of extensive internship supervised by experts in the field.

Keywords: competencies, curriculum, modernization, technical literacy, translation studies

Procedia PDF Downloads 131
536 Representation Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building up a prediction model. The main objective in the time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit low dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques in dealing with uncertain data specifically those which solve uncertain data condition by minimizing the loss of compression properties.

Keywords: compression properties, uncertainty, uncertain time series, mining technique, weather prediction

Procedia PDF Downloads 428
535 Multi-Cluster Overlapping K-Means Extension Algorithm (MCOKE)

Authors: Said Baadel, Fadi Thabtah, Joan Lu

Abstract:

Clustering involves the partitioning of n objects into k clusters. Many clustering algorithms use hard-partitioning techniques where each object is assigned to one cluster. In this paper, we propose an overlapping algorithm MCOKE which allows objects to belong to one or more clusters. The algorithm is different from fuzzy clustering techniques because objects that overlap are assigned a membership value of 1 (one) as opposed to a fuzzy membership degree. The algorithm is also different from other overlapping algorithms that require a similarity threshold to be defined as a priority which can be difficult to determine by novice users.

Keywords: data mining, k-means, MCOKE, overlapping

Procedia PDF Downloads 575
534 Utilize 5G Mobile Connection as a Node in the Proof of Authority Blockchain Used for Microtransaction

Authors: Frode van der Laak

Abstract:

The paper contributes to the feasibility of using a 5G mobile connection as a node for a Proof of Authority (PoA) blockchain, which is used for microtransactions at the same time. It uses the phone number identity of the users that are linked to the crypto wallet address. It also proposed a consensus protocol based on Proof-of-Authority (PoA) blockchain; PoA is a permission blockchain where consensus is achieved through a set of designated authority rather than through mining, as is the case with a Proof of Work (PoW) blockchain. This report will first explain the concept of a PoA blockchain and how it works. It will then discuss the potential benefits and challenges of using a 5G mobile connection as a node in such a blockchain, and finally, the main open problem statement and proposed solutions with the requirements.

Keywords: 5G, mobile, connection, node, PoA, blockchain, microtransaction

Procedia PDF Downloads 96
533 Numerical Simulation of Flow and Particle Motion in Liquid – Solid Hydrocyclone

Authors: Seyed Roozbeh Pishva, Alireza Aboudi Asl

Abstract:

In this investigation a hydrocyclone by using for separation particles from fluid in oil and gas, mining and other industries is simulated. Case study is cone – cylindrical and solid - liquid hydrocyclone. The fluid is water and the solid is a type of silis having diameters of 53, 75, 106, 150, 212, 250, and 300 micron. In this investigation CFD method used for analysis flow and movement of particles in hydrocyclone. In this modeling flow is three-dimention, turbulence and RSM model have been used for solving. Particles are three dimensional, spherical and non rotating and for tracking them Lagrangian model is used. The results of this study in addition to analyzing flowfield, obtaining efficiency of hydrocyclone in 5, 7, 12, and 15 percent concentrations and compare them with experimental result that both of them had suitable agreement with each other.

Keywords: hydrocyclone, RSM Model, CFD, copper industry

Procedia PDF Downloads 572
532 A Method to Evaluate and Compare Web Information Extractors

Authors: Patricia Jiménez, Rafael Corchuelo, Hassan A. Sleiman

Abstract:

Web mining is gaining importance at an increasing pace. Currently, there are many complementary research topics under this umbrella. Their common theme is that they all focus on applying knowledge discovery techniques to data that is gathered from the Web. Sometimes, these data are relatively easy to gather, chiefly when it comes from server logs. Unfortunately, there are cases in which the data to be mined is the data that is displayed on a web document. In such cases, it is necessary to apply a pre-processing step to first extract the information of interest from the web documents. Such pre-processing steps are performed using so-called information extractors, which are software components that are typically configured by means of rules that are tailored to extracting the information of interest from a web page and structuring it according to a pre-defined schema. Paramount to getting good mining results is that the technique used to extract the source information is exact, which requires to evaluate and compare the different proposals in the literature from an empirical point of view. According to Google Scholar, about 4 200 papers on information extraction have been published during the last decade. Unfortunately, they were not evaluated within a homogeneous framework, which leads to difficulties to compare them empirically. In this paper, we report on an original information extraction evaluation method. Our contribution is three-fold: a) this is the first attempt to provide an evaluation method for proposals that work on semi-structured documents; the little existing work on this topic focuses on proposals that work on free text, which has little to do with extracting information from semi-structured documents. b) It provides a method that relies on statistically sound tests to support the conclusions drawn; the previous work does not provide clear guidelines or recommend statistically sound tests, but rather a survey that collects many features to take into account as well as related work; c) We provide a novel method to compute the performance measures regarding unsupervised proposals; otherwise they would require the intervention of a user to compute them by using the annotations on the evaluation sets and the information extracted. Our contributions will definitely help researchers in this area make sure that they have advanced the state of the art not only conceptually, but from an empirical point of view; it will also help practitioners make informed decisions on which proposal is the most adequate for a particular problem. This conference is a good forum to discuss on our ideas so that we can spread them to help improve the evaluation of information extraction proposals and gather valuable feedback from other researchers.

Keywords: web information extractors, information extraction evaluation method, Google scholar, web

Procedia PDF Downloads 248
531 Reasons for Non-Applicability of Software Entropy Metrics for Bug Prediction in Android

Authors: Arvinder Kaur, Deepti Chopra

Abstract:

Software Entropy Metrics for bug prediction have been validated on various software systems by different researchers. In our previous research, we have validated that Software Entropy Metrics calculated for Mozilla subsystem’s predict the future bugs reasonably well. In this study, the Software Entropy metrics are calculated for a subsystem of Android and it is noticed that these metrics are not suitable for bug prediction. The results are compared with a subsystem of Mozilla and a comparison is made between the two software systems to determine the reasons why Software Entropy metrics are not applicable for Android.

Keywords: android, bug prediction, mining software repositories, software entropy

Procedia PDF Downloads 578
530 Experience Modularization for New Value of Evanescent Cultural Communities: Developing Creative Tourism Services in Bangkok

Authors: Wuttigrai Ngamsirijit

Abstract:

Creative tourism is an ongoing development in many countries as an attempt to moving away from serial reproduction of culture and reviving the culture. Despite, in the destinations with diverse and potential cultural resources, creating new tourism services can be vague. This paper presents how tourism experiences are modularized and consolidated in order to form new creative tourism service offerings in evanescent cultural communities of Bangkok, Thailand. The benefits from data mining in accommodating value co-creation are discussed, and implication of experience modularization to national creative tourism policy is addressed.

Keywords: co-creation, creative tourism, new service design, experience modularization

Procedia PDF Downloads 366
529 The Effects of Human Activities on Plant Diversity in Tropical Wetlands of Lake Tana (Ethiopia)

Authors: Abrehet Kahsay Mehari

Abstract:

Aquatic plants provide the physical structure of wetlands and increase their habitat complexity and heterogeneity, and as such, have a profound influence on other biotas. In this study, we investigated how human disturbance activities influenced the species richness and community composition of aquatic plants in the wetlands of Lake Tana, Ethiopia. Twelve wetlands were selected: four lacustrine, four river mouths, and four riverine papyrus swamps. Data on aquatic plants, environmental variables, and human activities were collected during the dry and wet seasons of 2018. A linear mixed effect model and a distance-based Redundancy Analysis (db-RDA) were used to relate aquatic plant species richness and community composition, respectively, to human activities and environmental variables. A total of 113 aquatic plant species, belonging to 38 families, were identified across all wetlands during the dry and wet seasons. Emergent species had the maximum area covered at 73.45 % and attained the highest relative abundance, followed by amphibious and other forms. The mean taxonomic richness of aquatic plants was significantly lower in wetlands with high overall human disturbance scores compared to wetlands with low overall human disturbance scores. Moreover, taxonomic richness showed a negative correlation with livestock grazing, tree plantation, and sand mining. The community composition also varied across wetlands with varying levels of human disturbance and was primarily driven by turnover (i.e., replacement of species) rather than nestedness resultant(i.e., loss of species). Distance-based redundancy analysis revealed that livestock grazing, tree plantation, sand mining, waste dumping, and crop cultivation were significant predictors of variation in aquatic plant communities’ composition in the wetlands. Linear mixed effect models and distance-based redundancy analysis also revealed that water depth, turbidity, conductivity, pH, sediment depth, and temperature were important drivers of variations in aquatic plant species richness and community composition. Papyrus swamps had the highest species richness and supported different plant communities. Conservation efforts should therefore focus on these habitats and measures should be taken to restore the highly disturbed and species poor wetlands near the river mouths.

Keywords: species richness, community composition, aquatic plants, wetlands, Lake Tana, human disturbance activities

Procedia PDF Downloads 123
528 Optimization of Gold Mining Parameters by Cyanidation

Authors: Della Saddam Housseyn

Abstract:

Gold, the quintessential noble metal, is one of the most popular metals today, given its ever-increasing cost in the international market. The Amesmessa gold deposit is one of the gold-producing deposits. The first step in our job is to analyze the ore (considered rich ore). Mineralogical and chemical analysis has shown that the general constitution of the ore is quartz in addition to other phases such as Al2O3, Fe2O3, CaO, dolomite. The second step consists of all the leaching tests carried out in rolling bottles. These tests were carried out on 14 samples to determine the maximum recovery rate and the optimum consumption of reagent (NaCN and CaO). Tests carried out on a pulp density at 50% solid, 500 ppm cyanide concentration and particle size less than 0.6 mm at alkaline pH gave a recovery rate of 94.37%.

Keywords: cyanide, DRX, FX, gold, leaching, rate of recovery, SAA

Procedia PDF Downloads 180
527 Occupational Safety and Health in the Wake of Drones

Authors: Hoda Rahmani, Gary Weckman

Abstract:

The body of research examining the integration of drones into various industries is expanding rapidly. Despite progress made in addressing the cybersecurity concerns for commercial drones, knowledge deficits remain in determining potential occupational hazards and risks of drone use to employees’ well-being and health in the workplace. This creates difficulty in identifying key approaches to risk mitigation strategies and thus reflects the need for raising awareness among employers, safety professionals, and policymakers about workplace drone-related accidents. The purpose of this study is to investigate the prevalence of and possible risk factors for drone-related mishaps by comparing the application of drones in construction with manufacturing industries. The chief reason for considering these specific sectors is to ascertain whether there exists any significant difference between indoor and outdoor flights since most construction sites use drones outside and vice versa. Therefore, the current research seeks to examine the causes and patterns of workplace drone-related mishaps and suggest possible ergonomic interventions through data collection. Potential ergonomic practices to mitigate hazards associated with flying drones could include providing operators with professional pieces of training, conducting a risk analysis, and promoting the use of personal protective equipment. For the purpose of data analysis, two data mining techniques, the random forest and association rule mining algorithms, will be performed to find meaningful associations and trends in data as well as influential features that have an impact on the occurrence of drone-related accidents in construction and manufacturing sectors. In addition, Spearman’s correlation and chi-square tests will be used to measure the possible correlation between different variables. Indeed, by recognizing risks and hazards, occupational safety stakeholders will be able to pursue data-driven and evidence-based policy change with the aim of reducing drone mishaps, increasing productivity, creating a safer work environment, and extending human performance in safe and fulfilling ways. This research study was supported by the National Institute for Occupational Safety and Health through the Pilot Research Project Training Program of the University of Cincinnati Education and Research Center Grant #T42OH008432.

Keywords: commercial drones, ergonomic interventions, occupational safety, pattern recognition

Procedia PDF Downloads 209
526 Best Resource Recommendation for a Stochastic Process

Authors: Likewin Thomas, M. V. Manoj Kumar, B. Annappa

Abstract:

The aim of this study was to develop an Artificial Neural Network0 s recommendation model for an online process using the complexity of load, performance, and average servicing time of the resources. Here, the proposed model investigates the resource performance using stochastic gradient decent method for learning ranking function. A probabilistic cost function is implemented to identify the optimal θ values (load) on each resource. Based on this result the recommendation of resource suitable for performing the currently executing task is made. The test result of CoSeLoG project is presented with an accuracy of 72.856%.

Keywords: ADALINE, neural network, gradient decent, process mining, resource behaviour, polynomial regression model

Procedia PDF Downloads 390
525 The Role Of Data Gathering In NGOs

Authors: Hussaini Garba Mohammed

Abstract:

Background/Significance: The lack of data gathering is affecting NGOs world-wide in general to have good data information about educational and health related issues among communities in any country and around the world. For example, HIV/AIDS smoking (Tuberculosis diseases) and COVID-19 virus carriers is becoming a serious public health problem, especially among old men and women. But there is no full details data survey assessment from communities, villages, and rural area in some countries to show the percentage of victims and patients, especial with this world COVID-19 virus among the people. These data are essential to inform programming targets, strategies, and priorities in getting good information about data gathering in any society.

Keywords: reliable information, data assessment, data mining, data communication

Procedia PDF Downloads 179
524 Estimation of Morbidity Level of Industrial Labour Conditions at Zestafoni Ferroalloy Plant

Authors: M. Turmanauli, T. Todua, O. Gvaberidze, R. Javakhadze, N. Chkhaidze, N. Khatiashvili

Abstract:

Background: Mining process has the significant influence on human health and quality of life. In recent years the events in Georgia were reflected on the industry working process, especially minimal requirements of labor safety, hygiene standards of workplace and the regime of work and rest are not observed. This situation is often caused by the lack of responsibility, awareness, and knowledge both of workers and employers. The control of working conditions and its protection has been worsened in many of industries. Materials and Methods: For evaluation of the current situation the prospective epidemiological study by face to face interview method was conducted at Georgian “Manganese Zestafoni Ferroalloy Plant” in 2011-2013. 65.7% of employees (1428 bulletin) were surveyed and the incidence rates of temporary disability days were studied. Results: The average length of a temporary disability single accident was studied taking into consideration as sex groups as well as the whole cohort. According to the classes of harmfulness the following results were received: Class 2.0-10.3%; 3.1-12.4%; 3.2-35.1%; 3.3-12.1%; 3.4-17.6%; 4.0-12.5%. Among the employees 47.5% and 83.1% were tobacco and alcohol consumers respectively. According to the age groups and years of work on the base of previous experience ≥50 ages and ≥21 years of work data prevalence respectively. The obtained data revealed increased morbidity rate according to age and years of work. It was found that the bone and articulate system and connective tissue diseases, aggravation of chronic respiratory diseases, ischemic heart diseases, hypertension and cerebral blood discirculation were the leading among the other diseases. High prevalence of morbidity observed in the workplace with not satisfactory labor conditions from the hygienic point of view. Conclusion: According to received data the causes of morbidity are the followings: unsafety labor conditions; incomplete of preventive medical examinations (preliminary and periodic); lack of access to appropriate health care services; derangement of gathering, recording, and analysis of morbidity data. This epidemiological study was conducted at the JSC “Manganese Ferro Alloy Plant” according to State program “ Prevention of Occupational Diseases” (Program code is 35 03 02 05).

Keywords: occupational health, mining process, morbidity level, cerebral blood discirculation

Procedia PDF Downloads 428
523 Monitoring the Pollution Status of the Goan Coast Using Genotoxicity Biomarkers in the Bivalve, Meretrix ovum

Authors: Avelyno D'Costa, S. K. Shyama, M. K. Praveen Kumar

Abstract:

The coast of Goa, India receives constant anthropogenic stress through its major rivers which carry mining rejects of iron and manganese ores from upstream mining sites and petroleum hydrocarbons from shipping and harbor-related activities which put the aquatic fauna such as bivalves at risk. The present study reports the pollution status of the Goan coast by the above xenobiotics employing genotoxicity studies. This is further supplemented by the quantification of total petroleum hydrocarbons (TPHs) and various trace metals (iron, manganese, copper, cadmium, and lead) in gills of the estuarine clam, Meretrix ovum as well as from the surrounding water and sediment, over a two-year sampling period, from January 2013 to December 2014. Bivalves were collected from a probable unpolluted site at Palolem and a probable polluted site at Vasco, based upon the anthropogenic activities at these sites. Genotoxicity was assessed in the gill cells using the comet assay and micronucleus test. The quantity of TPHs and trace metals present in gill tissue, water and sediments were analyzed using spectrofluorometry and atomic absorption spectrophotometry (AAS), respectively. The statistical significance of data was analyzed employing Student’s t-test. The relationship between DNA damage and pollutant concentrations was evaluated using multiple regression analysis. Significant DNA damage was observed in the bivalves collected from Vasco which is a region of high industrial activity. Concentrations of TPHs and trace metals (iron, manganese, and cadmium) were also found to be significantly high in gills of the bivalves collected from Vasco compared to those collected from Palolem. Further, the concentrations of these pollutants were also found to be significantly high in the water and sediments at Vasco compared to that of Palolem. This may be due to the lack of industrial activity at Palolem. A high positive correlation was observed between the pollutant levels and DNA damage in the bivalves collected from Vasco suggesting the genotoxic nature of these pollutants. Further, M. ovum can be used as a bioindicator species for monitoring the level of pollution of the estuarine/coastal regions by TPHs and trace metals.

Keywords: comet assay, metals, micronucleus test, total petroleum Hydrocarbons

Procedia PDF Downloads 237
522 Analysis of Scholarly Communication Patterns in Korean Studies

Authors: Erin Hea-Jin Kim

Abstract:

This study aims to investigate scholarly communication patterns in Korean studies, which focuses on all aspects of Korea, including history, culture, literature, politics, society, economics, religion, and so on. It is called ‘national study or home study’ as the subject of the study is itself, whereas it is called ‘area study’ as the subject of the study is others, i.e., outside of Korea. Understanding of the structure of scholarly communication in Korean studies is important since the motivations, procedures, results, or outcomes of individual studies may be affected by the cooperative relationships that appear in the communication structure. To this end, we collected 1,798 articles with the (author or index) keyword ‘Korean’ published in 2018 from the Scopus database and extracted the institution and country of the authors using a text mining technique. A total of 96 countries, including South Korea, was identified. Then we constructed a co-authorship network based on the countries identified. The indicators of social network analysis (SNA), co-occurrences, and cluster analysis were used to measure the activity and connectivity of participation in collaboration in Korean studies. As a result, the highest frequency of collaboration appears in the following order: S. Korea with the United States (603), S. Korea with Japan (146), S. Korea with China (131), S. Korea with the United Kingdom (83), and China with the United States (65). This means that the most active participants are S. Korea as well as the USA. The highest rank in the role of mediator measured by betweenness centrality appears in the following order: United States (0.165), United Kingdom (0.045), China (0.043), Japan (0.037), Australia (0.026), and South Africa (0.023). These results show that these countries contribute to connecting in Korean studies. We found two major communities among the co-authorship network. Asian countries and America belong to the same community, and the United Kingdom and European countries belong to the other community. Korean studies have a long history, and the study has emerged since Japanese colonization. However, Korean studies have never been investigated by digital content analysis. The contributions of this study are an analysis of co-authorship in Korean studies with a global perspective based on digital content, which has not attempted so far to our knowledge, and to suggest ideas on how to analyze the humanities disciplines such as history, literature, or Korean studies by text mining. The limitation of this study is that the scholarly data we collected did not cover all domestic journals because we only gathered scholarly data from Scopus. There are thousands of domestic journals not indexed in Scopus that we can consider in terms of national studies, but are not possible to collect.

Keywords: co-authorship network, Korean studies, Koreanology, scholarly communication

Procedia PDF Downloads 157
521 EDM for Prediction of Academic Trends and Patterns

Authors: Trupti Diwan

Abstract:

Predicting student failure at school has changed into a difficult challenge due to both the large number of factors that can affect the reduced performance of students and the imbalanced nature of these kinds of data sets. This paper surveys the two elements needed to make prediction on Students’ Academic Performances which are parameters and methods. This paper also proposes a framework for predicting the performance of engineering students. Genetic programming can be used to predict student failure/success. Ranking algorithm is used to rank students according to their credit points. The framework can be used as a basis for the system implementation & prediction of students’ Academic Performance in Higher Learning Institute.

Keywords: classification, educational data mining, student failure, grammar-based genetic programming

Procedia PDF Downloads 422
520 Biofilm Text Classifiers Developed Using Natural Language Processing and Unsupervised Learning Approach

Authors: Kanika Gupta, Ashok Kumar

Abstract:

Biofilms are dense, highly hydrated cell clusters that are irreversibly attached to a substratum, to an interface or to each other, and are embedded in a self-produced gelatinous matrix composed of extracellular polymeric substances. Research in biofilm field has become very significant, as biofilm has shown high mechanical resilience and resistance to antibiotic treatment and constituted as a significant problem in both healthcare and other industry related to microorganisms. The massive information both stated and hidden in the biofilm literature are growing exponentially therefore it is not possible for researchers and practitioners to automatically extract and relate information from different written resources. So, the current work proposes and discusses the use of text mining techniques for the extraction of information from biofilm literature corpora containing 34306 documents. It is very difficult and expensive to obtain annotated material for biomedical literature as the literature is unstructured i.e. free-text. Therefore, we considered unsupervised approach, where no annotated training is necessary and using this approach we developed a system that will classify the text on the basis of growth and development, drug effects, radiation effects, classification and physiology of biofilms. For this, a two-step structure was used where the first step is to extract keywords from the biofilm literature using a metathesaurus and standard natural language processing tools like Rapid Miner_v5.3 and the second step is to discover relations between the genes extracted from the whole set of biofilm literature using pubmed.mineR_v1.0.11. We used unsupervised approach, which is the machine learning task of inferring a function to describe hidden structure from 'unlabeled' data, in the above-extracted datasets to develop classifiers using WinPython-64 bit_v3.5.4.0Qt5 and R studio_v0.99.467 packages which will automatically classify the text by using the mentioned sets. The developed classifiers were tested on a large data set of biofilm literature which showed that the unsupervised approach proposed is promising as well as suited for a semi-automatic labeling of the extracted relations. The entire information was stored in the relational database which was hosted locally on the server. The generated biofilm vocabulary and genes relations will be significant for researchers dealing with biofilm research, making their search easy and efficient as the keywords and genes could be directly mapped with the documents used for database development.

Keywords: biofilms literature, classifiers development, text mining, unsupervised learning approach, unstructured data, relational database

Procedia PDF Downloads 170
519 Analysis on Thermococcus achaeans with Frequent Pattern Mining

Authors: Jeongyeob Hong, Myeonghoon Park, Taeson Yoon

Abstract:

After the advent of Achaeans which utilize different metabolism pathway and contain conspicuously different cellular structure, they have been recognized as possible materials for developing quality of human beings. Among diverse Achaeans, in this paper, we compared 16s RNA Sequences of four different species of Thermococcus: Achaeans genus specialized in sulfur-dealing metabolism. Four Species, Barophilus, Kodakarensis, Hydrothermalis, and Onnurineus, live near the hydrothermal vent that emits extreme amount of sulfur and heat. By comparing ribosomal sequences of aforementioned four species, we found similarities in their sequences and expressed protein, enabling us to expect that certain ribosomal sequence or proteins are vital for their survival. Apriori algorithms and Decision Tree were used. for comparison.

Keywords: Achaeans, Thermococcus, apriori algorithm, decision tree

Procedia PDF Downloads 290
518 Affects Associations Analysis in Emergency Situations

Authors: Joanna Grzybowska, Magdalena Igras, Mariusz Ziółko

Abstract:

Association rule learning is an approach for discovering interesting relationships in large databases. The analysis of relations, invisible at first glance, is a source of new knowledge which can be subsequently used for prediction. We used this data mining technique (which is an automatic and objective method) to learn about interesting affects associations in a corpus of emergency phone calls. We also made an attempt to match revealed rules with their possible situational context. The corpus was collected and subjectively annotated by two researchers. Each of 3306 recordings contains information on emotion: (1) type (sadness, weariness, anxiety, surprise, stress, anger, frustration, calm, relief, compassion, contentment, amusement, joy) (2) valence (negative, neutral, or positive) (3) intensity (low, typical, alternating, high). Also, additional information, that is a clue to speaker’s emotional state, was annotated: speech rate (slow, normal, fast), characteristic vocabulary (filled pauses, repeated words) and conversation style (normal, chaotic). Exponentially many rules can be extracted from a set of items (an item is a previously annotated single information). To generate the rules in the form of an implication X → Y (where X and Y are frequent k-itemsets) the Apriori algorithm was used - it avoids performing needless computations. Then, two basic measures (Support and Confidence) and several additional symmetric and asymmetric objective measures (e.g. Laplace, Conviction, Interest Factor, Cosine, correlation coefficient) were calculated for each rule. Each applied interestingness measure revealed different rules - we selected some top rules for each measure. Owing to the specificity of the corpus (emergency situations), most of the strong rules contain only negative emotions. There are though strong rules including neutral or even positive emotions. Three examples of the strongest rules are: {sadness} → {anxiety}; {sadness, weariness, stress, frustration} → {anger}; {compassion} → {sadness}. Association rule learning revealed the strongest configurations of affects (as well as configurations of affects with affect-related information) in our emergency phone calls corpus. The acquired knowledge can be used for prediction to fulfill the emotional profile of a new caller. Furthermore, a rule-related possible context analysis may be a clue to the situation a caller is in.

Keywords: data mining, emergency phone calls, emotional profiles, rules

Procedia PDF Downloads 408
517 Economic Characteristics of Bitcoin: "An Analytical Study"

Authors: Abdelhalem Shahen

Abstract:

The world is now experiencing a digital revolution and greatly accelerated technological developments, in addition to the transition from the economy in its traditional form to the digital economy, which has resulted in the emergence of new tools that are appropriate to those developments, and from this, this paper attempts to explore the economic characteristics of the bitcoin currency that circulated recently. Due to the many advantages that distinguish it from money in its traditional forms, which have a range of economic effects. The study found that Bitcoin is among the technological innovations, which contain a set of characteristics that are worth studying, those that make it the focus of attention, such as the digital currency, the peer-to-peer property, Lower and Faster Transaction Costs, transparency, decentralized control, privacy, and Double-Spending, as well as security and Cryptographic, and finally mining.

Keywords: Digital Economics, Digital Currencies, Bitcoin, Features of Bitcoin

Procedia PDF Downloads 138
516 Frequent Pattern Mining for Digenic Human Traits

Authors: Atsuko Okazaki, Jurg Ott

Abstract:

Some genetic diseases (‘digenic traits’) are due to the interaction between two DNA variants. For example, certain forms of Retinitis Pigmentosa (a genetic form of blindness) occur in the presence of two mutant variants, one in the ROM1 gene and one in the RDS gene, while the occurrence of only one of these mutant variants leads to a completely normal phenotype. Detecting such digenic traits by genetic methods is difficult. A common approach to finding disease-causing variants is to compare 100,000s of variants between individuals with a trait (cases) and those without the trait (controls). Such genome-wide association studies (GWASs) have been very successful but hinge on genetic effects of single variants, that is, there should be a difference in allele or genotype frequencies between cases and controls at a disease-causing variant. Frequent pattern mining (FPM) methods offer an avenue at detecting digenic traits even in the absence of single-variant effects. The idea is to enumerate pairs of genotypes (genotype patterns) with each of the two genotypes originating from different variants that may be located at very different genomic positions. What is needed is for genotype patterns to be significantly more common in cases than in controls. Let Y = 2 refer to cases and Y = 1 to controls, with X denoting a specific genotype pattern. We are seeking association rules, ‘X → Y’, with high confidence, P(Y = 2|X), significantly higher than the proportion of cases, P(Y = 2) in the study. Clearly, generally available FPM methods are very suitable for detecting disease-associated genotype patterns. We use fpgrowth as the basic FPM algorithm and built a framework around it to enumerate high-frequency digenic genotype patterns and to evaluate their statistical significance by permutation analysis. Application to a published dataset on opioid dependence furnished results that could not be found with classical GWAS methodology. There were 143 cases and 153 healthy controls, each genotyped for 82 variants in eight genes of the opioid system. The aim was to find out whether any of these variants were disease-associated. The single-variant analysis did not lead to significant results. Application of our FPM implementation resulted in one significant (p < 0.01) genotype pattern with both genotypes in the pattern being heterozygous and originating from two variants on different chromosomes. This pattern occurred in 14 cases and none of the controls. Thus, the pattern seems quite specific to this form of substance abuse and is also rather predictive of disease. An algorithm called Multifactor Dimension Reduction (MDR) was developed some 20 years ago and has been in use in human genetics ever since. This and our algorithms share some similar properties, but they are also very different in other respects. The main difference seems to be that our algorithm focuses on patterns of genotypes while the main object of inference in MDR is the 3 × 3 table of genotypes at two variants.

Keywords: digenic traits, DNA variants, epistasis, statistical genetics

Procedia PDF Downloads 121
515 Brainbow Image Segmentation Using Bayesian Sequential Partitioning

Authors: Yayun Hsu, Henry Horng-Shing Lu

Abstract:

This paper proposes a data-driven, biology-inspired neural segmentation method of 3D drosophila Brainbow images. We use Bayesian Sequential Partitioning algorithm for probabilistic modeling, which can be used to detect somas and to eliminate cross talk effects. This work attempts to develop an automatic methodology for neuron image segmentation, which nowadays still lacks a complete solution due to the complexity of the image. The proposed method does not need any predetermined, risk-prone thresholds since biological information is inherently included in the image processing procedure. Therefore, it is less sensitive to variations in neuron morphology; meanwhile, its flexibility would be beneficial for tracing the intertwining structure of neurons.

Keywords: brainbow, 3D imaging, image segmentation, neuron morphology, biological data mining, non-parametric learning

Procedia PDF Downloads 487