Search results for: sentiment mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1214

284 Investigating the Biosorption Potential of Indigenous Filamentous Fungi from Copperbelt Tailing Dams in Zambia with Copper and Cobalt Tolerance

Authors: Leonce Dusengemungu

Abstract:

Filamentous fungi indigenous to heavy metal (HM)-contaminated environments have considerable biosorption potential yet are currently under-investigated in developing countries. In the work presented herein, the biosorption potential of three indigenous filamentous fungi (Aspergillus transmontanensis, Cladosporium cladosporioides, and Geotrichum candidum) isolated from copper and cobalt mining wasteland sites in Zambia's Copperbelt province was investigated. In Cu and Co tolerance tests, all the fungal isolates were shown to be tolerant, with mycelial growth at HM concentrations of up to 7000 ppm. However, exposure to high Cu and Co concentrations hindered the growth of the three strains to varying degrees, resulting in reduced mycelial biomass (evidenced by loss of the infrared bands at 887 and 930 cm-1 of the 1,3-glucans backbone) as well as morphological alterations, sporulation, and pigment synthesis. In addition, gas chromatography-mass spectrometry characterization of the fungal biomass extracts allowed the detection of changes in the chemical constituents upon exposure to HMs, with profiles poorer in maltol, 1,2-cyclopentadione, and n-hexadecanoic acid, and richer in furaldehydes. Biosorption tests showed that A. transmontanensis and G. candidum performed better as bioremediators than C. cladosporioides, with biosorption efficiencies of 1645, 1853 and 1253 ppm at pH 3, respectively, and may deserve further research in field conditions.

Keywords: bioremediation, fungi, biosorption, heavy metal

Procedia PDF Downloads 40
283 Recommender Systems Using Ensemble Techniques

Authors: Yeonjeong Lee, Kyoung-jae Kim, Youngtae Kim

Abstract:

This study proposes a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting users’ precise preferences. The proposed model consists of two steps. In the first step, this study uses logistic regression, decision trees, and artificial neural networks to predict customers who have a high likelihood of purchasing products in each product group. Then, this study combines the results of each predictor using multi-model ensemble techniques such as bagging and bumping. In the second step, this study uses market basket analysis to extract association rules for co-purchased products. Finally, through these two steps, the system selects customers who have a high likelihood of purchasing products in each product group and recommends suitable products from the same or different product groups to them. We test the usability of the proposed system using a prototype and real-world transaction and profile data. In addition, we survey user satisfaction with the recommended product list from the proposed system and with randomly selected product lists. The results also show that the proposed system may be useful in a real-world online shopping store.
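The two steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `majority_vote` is a simplified stand-in for the bagging and bumping ensembles named in the abstract, `pair_rules` is a hypothetical helper that mines rules between item pairs only, and the thresholds are arbitrary.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine 0/1 predictions from several models; a tie counts as 0."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) rules for item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicate items in one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Try the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)
```

A customer flagged by `majority_vote` for a product group would then be shown the consequents of the rules whose antecedent is already in that customer's basket.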

Keywords: product recommender system, ensemble technique, association rules, decision tree, artificial neural networks

Procedia PDF Downloads 263
282 Malware Beaconing Detection by Mining Large-scale DNS Logs for Targeted Attack Identification

Authors: Andrii Shalaginov, Katrin Franke, Xiongwei Huang

Abstract:

One of the leading problems in cyber security today is the emergence of targeted attacks conducted by adversaries with access to sophisticated tools. These attacks usually steal senior-level employee system privileges in order to gain unauthorized access to confidential knowledge and valuable intellectual property. The malware used for the initial compromise of the systems is sophisticated and may target zero-day vulnerabilities. In this work we utilize a common behaviour of malware called “beaconing”, which implies that infected hosts communicate with Command and Control (C2) servers at regular intervals that have relatively small time variations. By analysing such beacon activity through passive network monitoring, it is possible to detect potential malware infections. We therefore focus on time gaps as indicators of possible C2 activity in targeted enterprise networks. We represent DNS log files as a graph whose vertices are destination domains and whose edges are timestamps. Then, using four periodicity detection algorithms for each pair of internal-external communications, we check timestamp sequences to identify the beacon activities. Finally, based on the graph structure, we infer the existence of other infected hosts and malicious domains involved in the attack activities.
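The idea of using time gaps as an indicator can be sketched as follows. The abstract does not name its four periodicity detection algorithms, so this example substitutes a simple assumed measure: the coefficient of variation of the gaps between consecutive queries from one host to one domain. The function names and the `max_cv` threshold are hypothetical.

```python
from statistics import mean, pstdev


def gap_variation(timestamps):
    """Coefficient of variation of inter-query gaps, or None if it cannot be computed."""
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)


def looks_like_beacon(timestamps, max_cv=0.1):
    """Flag a timestamp sequence whose gaps are nearly constant."""
    cv = gap_variation(timestamps)
    return cv is not None and cv <= max_cv
```

A low value means the gaps are regular, which matches the "small time variations" described above; a real detector would also need to handle jitter, missing queries, and benign periodic traffic such as update checks.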

Keywords: malware detection, network security, targeted attack, computational intelligence

Procedia PDF Downloads 228
281 Evaluation of Drilling Performance through Bit-Rock Interaction Using Passive Vibration Assisted Rotation Drilling (PVARD) Tool

Authors: Md. Shaheen Shah, Abdelsalam Abugharara, Dipesh Maharjan, Syed Imtiaz, Stephen Butt

Abstract:

Drilling performance is an essential goal in the petroleum and mining industries. Drilling rate of penetration (ROP), which is inversely proportional to the mechanical specific energy (MSE), is influenced by numerous factors, among which are applied parameters: torque (T), weight on bit (WOB), fluid flow rate, revolutions per minute (rpm); rock-related parameters: rock type, rock homogeneity, rock anisotropy orientation; and mechanical parameters: bit type, configuration of the bottom hole assembly (BHA). This paper focuses on studying drilling performance by implementing a passive vibration assisted rotary drilling tool (pVARD) as part of the BHA, using different bit types (coring bit, roller cone bit, and PDC bit) and various rock types (rock-like material, granite, sandstone, etc.). The results of this study aim to produce a pVARD index for optimal drilling performance, considering the recommendations of the pVARD’s spring compression tests and stress-strain analysis of rock samples conducted prior to drilling experiments, analyzing the cutting size distribution, and evaluating the applied drilling parameters as a function of WOB. These results are compared with those obtained from drilling without pVARD, which represents the typical rigid BHA of conventional drilling.

Keywords: BHA, drilling performance, MSE, pVARD, rate of penetration, ROP, tensile and shear fractures, unconfined compressive strength

Procedia PDF Downloads 123
280 Mathematical modeling of the calculation of the absorbed dose in uranium production workers with the genetic effects.

Authors: P. Kazymbet, G. Abildinova, K. Makhambetov, M. Bakhtin, D. Rybalkina, K. Zhumadilov

Abstract:

Cytogenetic research was conducted on workers of the Stepnogorsk Mining-Chemical Combine (Akmola region), with 26341 chromosomal metaphases studied. Using regression analysis with the program DataFit, version 5.0, the dependence between exposure dose and the following cytogenetic indicators was studied: frequency of aberrant cells, frequency of chromosomal aberrations, and frequency of dicentric chromosomes and centric rings. Experimental data on "dose-effect" calibration curves enabled the development of a mathematical model that allows the absorbed dose at the time of the study to be calculated from the frequency of aberrant cells, chromosomal aberrations, and dicentric chromosomes and centric rings. In the dose range of 0.1 Gy to 5.0 Gy, the dependence of the cytogenetic parameters on dose followed these equations: Y = 0,0067e^0,3307x (R2 = 0,8206) for the frequency of chromosomal aberrations; Y = 0,0057e^0,3161x (R2 = 0,8832) for the frequency of cells with chromosomal aberrations; Y = 5 E-0,5e^0,6383 (R2 = 0,6321) for the frequency of dicentric chromosomes and centric rings per cell. On the basis of the cytogenetic parameters and regression equations, the calculated absorbed dose in uranium production workers at the time of the study did not exceed 0.3 Gy.
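Calculating the dose from an observed frequency amounts to inverting the fitted exponential Y = a·e^(bx), which gives x = ln(Y/a)/b. The sketch below assumes the commas in the printed coefficients are decimal separators and uses the first two equations only; the third is left out because its printed coefficient is ambiguous. The units of Y are not stated in the abstract, so the function and constant names are illustrative.

```python
import math

# Coefficients read from the abstract, with commas taken as decimal separators (assumption).
CHROMOSOMAL_ABERRATIONS = (0.0067, 0.3307)
ABERRANT_CELLS = (0.0057, 0.3161)


def dose_from_frequency(y, a, b):
    """Invert y = a * exp(b * x) and return the dose x in Gy."""
    if y <= 0 or a <= 0 or b == 0:
        raise ValueError("y and a must be positive and b non-zero")
    return math.log(y / a) / b


def frequency_from_dose(x, a, b):
    """Forward model y = a * exp(b * x), used here to check the inversion."""
    return a * math.exp(b * x)
```

The fits are reported for 0.1 Gy to 5.0 Gy, so an inverted dose outside that range would be an extrapolation.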

Keywords: Stepnogorsk, mathematical modeling, cytogenetic, dicentric chromosomes

Procedia PDF Downloads 447
279 Comparative Sustainability Performance Analysis of Australian Companies Using Composite Measures

Authors: Ramona Zharfpeykan, Paul Rouse

Abstract:

Organizational sustainability is important to both organizations themselves and their stakeholders. Despite its increasing popularity and increasing numbers of organizations reporting sustainability, research on evaluating and comparing the sustainability performance of companies is limited. The aim of this study was to develop models to measure sustainability performance for both cross-sectional and longitudinal comparisons across companies in the same or different industries. A secondary aim was to see if sustainability reports can be used to evaluate sustainability performance. The study used both a content analysis of Australian sustainability reports in mining and metals and financial services for 2011-2014 and a survey of Australian and New Zealand organizations. Two methods ranging from a composite index using uniform weights to data envelopment analysis (DEA) were employed to analyze the data and develop the models. The results show strong statistically significant relationships between the developed models, which suggests that each model provides a consistent, systematic and reasonably robust analysis. The results of the models show that for both industries, companies that had sustainability scores above or below the industry average stayed almost the same during the study period. These indices and models can be used by companies to evaluate their sustainability performance and compare it with previous years, or with other companies in the same or different industries. These methods can also be used by various stakeholders and sustainability ranking companies such as the Global Reporting Initiative (GRI).
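A composite index with uniform weights can be sketched as follows. The abstract does not describe how indicators were scaled, so min-max normalization across companies is an assumption here, and `composite_scores` and its input layout are hypothetical.

```python
def composite_scores(indicators):
    """Map {company: [raw indicator values]} to {company: uniform-weight score in [0, 1]}."""
    names = list(indicators)
    columns = list(zip(*(indicators[name] for name in names)))
    normalized = {name: [] for name in names}
    for column in columns:
        low, high = min(column), max(column)
        for name, value in zip(names, column):
            # Min-max scaling; an indicator with no spread contributes 0 for everyone.
            scaled = 0.0 if high == low else (value - low) / (high - low)
            normalized[name].append(scaled)
    # Uniform weights: every indicator counts equally in the average.
    return {name: sum(values) / len(values) for name, values in normalized.items()}
```

Data envelopment analysis, the other method named above, would instead choose the weights per company by solving a linear program, which is beyond this sketch.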

Keywords: data envelopment analysis, sustainability, sustainability performance measurement system, sustainability performance index, global reporting initiative

Procedia PDF Downloads 143
278 Strategic Mine Planning: A SWOT Analysis Applied to KOV Open Pit Mine in the Democratic Republic of Congo

Authors: Patrick May Mukonki

Abstract:

KOV pit (Kamoto Oliveira Virgule) is located 10 km from Kolwezi town, one of the mineral-rich towns in the Lualaba province of the Democratic Republic of Congo. The KOV pit is currently operated by Katanga Mining Limited (KML), a joint venture between Glencore and Gecamines (a state-owned company). Recently, the mine optimization process provided a life of mine of approximately 10 years with nice pushbacks using the Datamine NPV Scheduler software. In previous KOV pit studies, we outlined the impact of the accuracy of the geological information on a long-term mine plan for a big copper mine such as KOV pit. The approach taken discussed three main scenarios and outlined some weaknesses on the geological information side. In this paper, we highlight, as an overview, those weaknesses, strengths and opportunities in a global SWOT analysis. The approach taken here is essentially descriptive in terms of the steps taken to optimize KOV pit; at every step, we categorized the challenges we faced to obtain a better tradeoff between what we called strengths and what we called weaknesses. The same logic is applied to the opportunities and threats. The SWOT analysis conducted in this paper demonstrates that, despite a generally poor ore body definition and very harsh groundwater conditions, there is room for improvement for such a high-grade ore body.

Keywords: mine planning, mine optimization, mine scheduling, SWOT analysis

Procedia PDF Downloads 199
277 Short Answer Grading Using Multi-Context Features

Authors: S. Sharan Sundar, Nithish B. Moudhgalya, Nidhi Bhandari, Vineeth Vijayaraghavan

Abstract:

Automatic Short Answer Grading is one of the prime applications of artificial intelligence in education. Several approaches involving the utilization of selective handcrafted features, graphical matching techniques, concept identification and mapping, complex deep frameworks, sentence embeddings, etc. have been explored over the years. However, keeping in mind the real-world application of the task, these solutions present a slight overhead in terms of computations and resources in achieving high performance. In this work, a simple and effective solution is proposed that makes use of elemental features based on statistical and linguistic properties and word-based similarity measures, in conjunction with tree-based classifiers and regressors. The results for classification tasks show improvements ranging from 1% to 30%, while the regression task shows a stark improvement of 35%. The authors attribute these improvements to the addition of multiple similarity scores, which provide an ensemble of scoring criteria to the models. The authors also believe the work could reaffirm that classical natural language processing techniques and simple machine learning models can be used to achieve high results for short answer grading.
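Word-based similarity measures of the kind mentioned above can be illustrated with a small feature extractor. The specific features here (Jaccard overlap, reference coverage, and length ratio) are assumptions chosen for illustration, not the feature set used in the paper; in a full pipeline the resulting values would be fed to a tree-based classifier or regressor.

```python
import re


def tokens(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def similarity_features(reference, answer):
    """Return simple overlap features between a reference answer and a student answer."""
    ref, ans = tokens(reference), tokens(answer)
    ref_set, ans_set = set(ref), set(ans)
    shared = ref_set & ans_set
    union = ref_set | ans_set
    return {
        # Shared vocabulary relative to all distinct words used in either text.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Fraction of the reference vocabulary that the answer covers.
        "coverage": len(shared) / len(ref_set) if ref_set else 0.0,
        # Answer length relative to the reference length, in tokens.
        "length_ratio": len(ans) / len(ref) if ref else 0.0,
    }
```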

Keywords: artificial intelligence, intelligent systems, natural language processing, text mining

Procedia PDF Downloads 113
276 Synergy Effect of Energy and Water Saving in China's Energy Sectors: A Multi-Objective Optimization Analysis

Authors: Yi Jin, Xu Tang, Cuiyang Feng

Abstract:

The ‘11th Five-Year’ and ‘12th Five-Year’ plans clearly call for strict control of the total amount and intensity of energy and water consumption. However, the synergy effect of energy and water has rarely been considered in the process of energy and water saving in China, so its contribution cannot be maximized. Energy sectors consume large amounts of energy and water when producing massive amounts of energy, which makes them both energy and water intensive. Therefore, the synergy effect in these sectors is significant. This paper assesses and optimizes the synergy effect in three energy sectors against the background of promoting energy and water saving. The results show that, from the perspective of the critical path, the chemical industry, mining and processing of non-metal ores, and smelting and pressing of metals are coupling points in the process of energy and water flowing to energy sectors, in which the implementation of energy and water saving policies can bring a significant synergy effect. Multi-objective optimization shows that increasing efforts on input restructuring can effectively improve synergy effects; relatively large synergetic energy saving and little water saving are obtained after solely reducing the energy and water intensity of coupling sectors. By optimizing the input structure of sectors, especially the coupling sectors, the synergy effect of energy and water saving can be improved in energy sectors while keeping the economy running stably.

Keywords: critical path, energy sector, multi-objective optimization, synergy effect, water

Procedia PDF Downloads 334
275 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-

Authors: Nieto Bernal Wilson, Carmona Suarez Edgar

Abstract:

Organizations hold structured and unstructured information in different formats, sources, and systems. Part of it comes from ERP systems under OLTP processing that support the information system; however, at the OLAP processing level these organizations present some deficiencies. Part of the problem lies in a lack of interest in extracting knowledge from their data sources, as well as the absence of operational capabilities to tackle this kind of project. Data warehouses and their applications are considered non-proprietary tools, which are of great interest to business intelligence, since they are the repository basis for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. This paper presents a simple, structured methodology inspired by agile development models such as Scrum, XP and AUP, as well as by object-relational models, spatial data models, and the baseline of data modeling under UML and big data. In this way, it seeks to deliver an agile methodology for developing data warehouses that is simple and easy to apply. The methodology naturally takes into account the application of processes for the respective information analysis, visualization and data mining, particularly for pattern generation and models derived from the structured fact objects.

Keywords: data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse

Procedia PDF Downloads 381
274 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflect different aspects of people, events and activities. Data generated by various systems differ in form, and data sources differ because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need to be fused. Data from different sources collected in different ways raise several issues which need to be resolved. Problems of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data and space data. Metadata must be available to be referred to and read when an application needs to access, manipulate and display the data. Uniform metadata management ensures effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 351
273 Mining the Proteome of Fusobacterium nucleatum for Potential Therapeutics Discovery

Authors: Abdul Musaweer Habib, Habibul Hasan Mazumder, Saiful Islam, Sohel Sikder, Omar Faruk Sikder

Abstract:

The plethora of genome sequence information of bacteria in recent times has ushered in many novel strategies for antibacterial drug discovery and has enabled medical science to take up the challenge of the increasing resistance of pathogenic bacteria to current antibiotics. In this study, we adopted a subtractive genomics approach to analyze the whole genome sequence of Fusobacterium nucleatum, a human oral pathogen associated with colorectal cancer. Our study revealed 1499 proteins of Fusobacterium nucleatum that have no homolog in the human genome. These proteins were screened further using the Database of Essential Genes (DEG), which resulted in the identification of 32 vitally important proteins for the bacterium. Subsequent analysis of the identified pivotal proteins using the KEGG Automated Annotation Server (KAAS) resulted in sorting 3 key enzymes of F. nucleatum that may be good candidates as potential drug targets, since they are unique to the bacterium and absent in humans. In addition, we have demonstrated the 3-D structure of these three proteins. Finally, determination of the ligand binding sites of the key proteins, as well as screening for functional inhibitors that best fit the ligand sites, was conducted to discover effective novel therapeutic compounds against Fusobacterium nucleatum.

Keywords: colorectal cancer, drug target, Fusobacterium nucleatum, homology modeling, ligands

Procedia PDF Downloads 358
272 Numerical Simulation of Fracturing Behaviour of Pre-Cracked Crystalline Rock Using a Cohesive Grain-Based Distinct Element Model

Authors: Mahdi Saadat, Abbas Taheri

Abstract:

Understanding the cracking response of crystalline rocks at the mineralogical scale is of great importance during the design procedure of mining structures. A grain-based distinct element model (GBM) is employed to numerically study the cracking response of Barre granite at micro- and macro-scales. The GBM framework is augmented with a proposed distinct element-based cohesive model to reproduce the micro-cracking response of the inter- and intra-grain contacts. The cohesive GBM framework is implemented in the PFC2D distinct element code. The microstructural properties of Barre granite are imported into PFC2D to generate synthetic specimens. The microproperties of the model are calibrated against laboratory uniaxial compressive and Brazilian split tensile tests. The calibrated model is then used to simulate the fracturing behaviour of pre-cracked Barre granite with different flaw configurations. The numerical results of the proposed model demonstrate good agreement with their experimental counterparts. The proposed GBM framework thus appears promising for further investigation of the influence of grain microstructure and mineralogical properties on the cracking behaviour of crystalline rocks.

Keywords: discrete element modelling, cohesive grain-based model, crystalline rock, fracturing behavior

Procedia PDF Downloads 103
271 A Comparative Study on Supercritical CO2 and Water as Working Fluids in a Heterogeneous Geothermal Reservoir

Authors: Musa D. Aliyu, Ouahid Harireche, Colin D. Hills

Abstract:

The inability of supercritical CO2 to transport and dissolve mineral species from the geothermal reservoir to the fracture apertures, together with other important parameters in heat mining, makes it an attractive substance for heat extraction from hot dry rock. In other words, the thermodynamic efficiency of hot dry rock (HDR) reservoirs also increases if supercritical CO2 is circulated at temperatures in excess of 374°C, without the drawbacks connected with silica dissolution. Studies have shown that circulation of supercritical CO2 in homogeneous geothermal reservoirs is quite encouraging in comparison to that of water. This paper aims at investigating the aforementioned processes in the case of the heterogeneous geothermal reservoir located at the Soultz site (France). The multiphysics finite element package COMSOL, with an interface for coupling the different processes encountered in geothermal reservoir stimulation, is used. A fully coupled numerical model is developed to study the thermal and hydraulic processes in order to predict the long-term operation of the basic reservoir parameters that give optimum energy production. The results reveal that the temperature of the SCCO2 at the production outlet is higher than that of water in long-term stimulation, and temperature is an essential ingredient in rating the energy production. It is also observed that the mass flow rate of the SCCO2 is far more favourable compared to that of water.

Keywords: FEM, HDR, heterogeneous reservoir, stimulation, supercritical CO2

Procedia PDF Downloads 352
270 A Framework for Event-Based Monitoring of Business Processes in the Supply Chain Management of Industry 4.0

Authors: Johannes Atug, Andreas Radke, Mitchell Tseng, Gunther Reinhart

Abstract:

In modern supply chains, large numbers of SKUs (stock-keeping units) need to be managed in a timely manner, and any delays in noticing disruptions of items often limit the ability to defer the impact on customer order fulfillment. However, in supply chains of IoT-connected enterprises, the ERP (Enterprise-Resource-Planning), the MES (Manufacturing-Execution-System) and the SCADA (Supervisory-Control-and-Data-Acquisition) systems generate large amounts of data, from which much earlier notice of deviations in the business process steps can generally be gleaned. That is, analyzing these streams of data with process mining techniques allows the monitoring of the supply chain business processes and thus the identification of items that deviate from the standard order fulfillment process. In this paper, a framework to enable event-based SCM (Supply-Chain-Management) processes, including an overview of core enabling technologies, is presented, which is based on the RAMI (Reference-Architecture-Model for Industrie 4.0) architecture. The application of this framework in industry is presented, and implications for SCM in Industry 4.0 and further research are outlined.
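Identifying items that deviate from the standard order fulfillment process can be sketched as a basic conformance check over an event stream. The step names in `STANDARD_STEPS`, the `(timestamp, item_id, step)` event layout, and the function name are all hypothetical; the abstract does not specify the process model or the event schema.

```python
from collections import defaultdict

# Hypothetical standard order fulfillment process, for illustration only.
STANDARD_STEPS = ["ordered", "produced", "shipped", "delivered"]


def find_deviations(events, standard=STANDARD_STEPS):
    """Return {item_id: trace} for items whose trace is not a prefix of the standard process.

    events is an iterable of (timestamp, item_id, step) tuples in any order.
    """
    traces = defaultdict(list)
    for _, item_id, step in sorted(events):  # replay events in time order
        traces[item_id].append(step)
    return {
        item_id: trace
        for item_id, trace in traces.items()
        if trace != standard[: len(trace)]
    }
```

Because an in-progress item is compared only against a prefix of the standard sequence, a skipped or reordered step is flagged as soon as the offending event arrives rather than after the order completes.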

Keywords: cyber-physical production systems, event-based monitoring, supply chain management, RAMI (Reference-Architecture-Model for Industrie 4.0)

Procedia PDF Downloads 204
269 Optimised Path Recommendation for a Real Time Process

Authors: Likewin Thomas, M. V. Manoj Kumar, B. Annappa

Abstract:

The traditional execution process follows the path of execution drawn by the process analyst without observing the behaviour of resources and other real-time constraints. Identifying the process model, predicting the behaviour of resources and recommending the optimal path of execution for a real-time process is challenging. The proposed AlfyMiner (αyMiner) gives a new dimension to process execution with the novel techniques Process Model Analyser (PMAMiner) and Resource Behaviour Analyser (RBAMiner) for recommending the probable path of execution. PMAMiner discovers the next probable activity for the currently executing activity in an online process: a variant matching technique identifies the set of next probable activities, among which the next probable activity is discovered using a decision tree model. RBAMiner identifies the resource suitable for performing the discovered next probable activity and observes its behaviour based on load and performance, using a polynomial regression model, and on waiting time, using queueing theory. Based on the observed behaviour, αyMiner recommends the probable path of execution with the next probable activity and the most suitable resource for performing it. Experiments were conducted on process logs of the CoSeLoG Project; 72% accuracy was obtained in identifying and recommending the next probable activity, and the efficiency of resource performance was optimised by 59% by decreasing their load.
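The waiting-time estimate can be illustrated with the simplest textbook queue. The abstract does not say which queueing model is used, so the single-server M/M/1 model below is an assumption, as are the function names; its mean waiting time in the queue is Wq = λ / (μ(μ − λ)) for arrival rate λ and service rate μ.

```python
def mm1_waiting_time(arrival_rate, service_rate):
    """Mean time a task waits in an M/M/1 queue before service starts."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("arrival_rate must be >= 0 and service_rate > 0")
    if arrival_rate >= service_rate:
        return float("inf")  # the queue grows without bound
    return arrival_rate / (service_rate * (service_rate - arrival_rate))


def least_waiting_resource(resources):
    """Pick the resource name with the smallest expected wait.

    resources maps a name to an (arrival_rate, service_rate) pair.
    """
    return min(resources, key=lambda name: mm1_waiting_time(*resources[name]))
```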

Keywords: cross-organization process mining, process behaviour, path of execution, polynomial regression model

Procedia PDF Downloads 308
268 The Lifecycle of a Heritage Language: A Comparative Case Study of Volga German Descendants in North America

Authors: Ashleigh Dawn Moeller

Abstract:

This is a comparative case study which examines the language attitudes and behaviors of descendants of Volga German immigrants in North America and how these attitudes combined with surrounding social conditions have caused their heritage language to develop differently within each community. Of particular interest for this study are the accounts of second- and third-generation descendants in Oregon, Kansas, and North Dakota regarding their parents’ and grandparents’ attitudes toward their language and how this correlates with the current sentiment as well as visibility of their heritage language and culture. This study discusses the point at which cultural identity could diverge from language identity and what elements play a role in this development, establishing the potential for environments (linguistic landscapes) which uphold their heritage yet have detached from the language itself. Emigrating from Germany in the 1700s, these families settled for over a hundred years along the Volga Region of Imperial Russia. Subsequently, many descendants of these settlers immigrated to the Americas in the 1800s and 1900s. Identifying neither as German nor Russian, they called themselves Wolgadeutsche (Volga Germans). During their time in Russia, the German language was maintained relatively homogeneously, yet the use and status of their heritage language diverged considerably upon settlement across the Americas. Data shows that specific conditions, such as community isolation, size, religion, location as well as language policy established prior to and following the Volga German immigration to North America have had a substantial impact on the maintenance of their heritage language, causing complete loss in some areas and peripheral use or even full rebirth in others. These past conditions combined with the family accounts correlate directly with the general attitudes and ideologies of the descendants toward their heritage language.
Data also shows that in many locations, despite a strong presence of German within the linguistic landscape, minimal to no German is spoken nor understood; the attitude toward the language is indifferent while a staunch holding to the heritage is maintained and boasted. Data for this study was gathered from historical accounts, archived records and newspapers, and published biographies as well as from formal interviews with second- and third-generation descendants of Volga German immigrants conducted in Oregon and Kansas. Through the interviews, members of the community have shared and provided their family genealogies as well as biographies published by family members. These have helped to trace their relatives back to specific locations, thus allowing for comparisons within the same families residing in distinctly different areas of North America. This study is part of a larger ongoing project which researches the immigration of Volga and Black Sea Germans to North America and diachronically examines the over-arching sociological factors which have directly impacted the maintenance, loss, or rebirth of their heritage language. This project follows specific families who settled in areas of Colorado, Kansas, Nebraska, Illinois, Minnesota, North and South Dakota, Saskatchewan, and Manitoba, and who later had relatives move west to areas of Oregon and Washington State. Interviews for the larger project will continue into the following year.

Keywords: heritage language, immigrant language, language change, language contact, linguistic landscape, Volga Germans, Wolgadeutsche

Procedia PDF Downloads 100
267 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyzing and visualizing data streams in real time are important for prompt decision making. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytics and data visualization. This situation has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process the streaming data. Today, a range of tools which implement some of these functionalities is available. In this paper, we present an evaluation of the chronological evolution of technologies supporting real-time analytics and visualization of data streams. Based on research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified based on the Text Visualization Browser in chronological order. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges for each of the identified tools.

Keywords: information visualization, visual analytics, text mining, visual text analytics tools, big data visualization

Procedia PDF Downloads 375
266 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data are then pre-processed, fed into the proposed algorithm, and the obtained result is analyzed. Algorithms used in satellite imagery classification include U-Net, Random Forest, DeepLabv3, CNN, ANN, and ResNet. In this project, we use the DeepLabv3 (atrous convolution) algorithm for land cover classification. The dataset used is the DeepGlobe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.
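To make the term "atrous rate" concrete, the following is a minimal one-dimensional sketch of atrous (dilated) convolution in plain Python. It is an illustration only, not the DeepLabv3 implementation used in the project; the function name `atrous_conv1d`, the signal, and the kernel are hypothetical. It shows how a larger rate widens the span covered by the same three kernel weights.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D 'valid' atrous convolution (cross-correlation form).

    The kernel taps are spaced `rate` samples apart, so rate=1 is an
    ordinary convolution and larger rates cover a wider context.
    """
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    out = []
    for start in range(len(signal) - span):
        out.append(sum(k * signal[start + i * rate] for i, k in enumerate(kernel)))
    return out


# Hypothetical data: a ramp signal and a simple difference kernel.
signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)    # each output sees 3 samples
dilated = atrous_conv1d(signal, kernel, rate=2)  # each output sees 5 samples
```

With the same three weights, the rate-2 version compares samples four positions apart instead of two, which is the sense in which several atrous rates capture context at several scales.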

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 115
265 Visual Template Detection and Compositional Automatic Regular Expression Generation for Business Invoice Extraction

Authors: Anthony Proschka, Deepak Mishra, Merlyn Ramanan, Zurab Baratashvili

Abstract:

Small and medium-sized businesses receive over 160 billion invoices every year. Since these documents exhibit many subtle differences in layout and text, extracting structured fields such as sender name, amount, and VAT rate from them automatically is an open research question. In this paper, existing work in template-based document extraction is extended, and a system is devised that is able to reliably extract all required fields for up to 70% of all documents in the data set, more than any other previously reported method. The approaches are described for 1) detecting through visual features which template a given document belongs to, 2) automatically generating extraction rules for a given new template by composing regular expressions from multiple components, and 3) computing confidence scores that indicate the accuracy of the automatic extractions. The system can generate templates with as few as one training sample and requires only the ground truth field values instead of detailed annotations, such as bounding boxes, that are hard to obtain. The system is deployed and used inside a commercial accounting software product.
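The idea of composing an extraction rule from smaller regular-expression components can be sketched as follows. This is a hedged illustration, not the system described in the paper: the component names, the patterns, and the sample invoice text are all hypothetical, and the paper does not list its actual components.

```python
import re

# Hypothetical building blocks for an "invoice total" rule.
COMPONENTS = {
    "label_total": r"(?:Total|Amount due)",
    "sep": r"\s*:?\s*",
    "currency": r"(?:EUR|USD|€|\$)?\s*",
    "amount": r"(?P<amount>\d{1,3}(?:[.,]\d{3})*[.,]\d{2})",
}


def compose(*names):
    """Concatenate named components into one compiled extraction rule."""
    return re.compile("".join(COMPONENTS[name] for name in names))


total_rule = compose("label_total", "sep", "currency", "amount")

# Hypothetical OCR text of an invoice.
match = total_rule.search("Invoice 17\nTotal: EUR 1,234.56\nThank you")
extracted = match.group("amount") if match else None
```

Keeping the components separate means a rule for a new template can reuse the `amount` pattern and swap only the label, which is one plausible reason to compose rules rather than write each expression by hand.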

Keywords: data mining, information retrieval, business, feature extraction, layout, business data processing, document handling, end-user trained information extraction, document archiving, scanned business documents, automated document processing, F1-measure, commercial accounting software

Procedia PDF Downloads 99
264 Anthropogenic Impact on Migration Process of River Yamuna in Delhi-NCR Using Geospatial Techniques

Authors: Mohd Asim, K. Nageswara Rao

Abstract:

The present work was carried out on a stretch of about 130 km of the River Yamuna passing through the Delhi-National Capital Region (Delhi-NCR) of India to assess the anthropogenic impact on the channel migration process over a period of 200 years, using satellite data and topographical maps integrated in a geographic information system (GIS) environment. The Digital Shoreline Analysis System (DSAS) application was used to quantify river channel migration in the ArcGIS environment. The average river channel migration was calculated to be 22.8 m/year for the entire study area. The channel was found to migrate in both westward and eastward directions: the maximum westward migration is more than 4 km, and the eastward migration is about 4.19 km. In total, the river has migrated across 32.26 sq. km of area. The results reveal that the river is being impacted by various human activities. The impact indicators include engineering structures, sand mining, embankments, urbanization, land use/land cover, and the canal network. The DSAS application was also used to predict the future position of the river channel for 2032 and 2042 by analyzing the past and present rate and direction of movement. The length of the channel in 2032 and 2042 is predicted to be 132.5 and 141.6 km, respectively. The largest predicted migration, about 3.84 sq. km from west to east between 2022 and 2042, occurs after the channel crosses the Okhla Barrage near Faridabad.
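A migration rate in m/year can be illustrated with the end point rate, one of the rate statistics DSAS reports: the net movement between the oldest and newest channel positions along a transect, divided by the elapsed time. The abstract does not say which DSAS statistic produced the 22.8 m/year figure, so the sketch below is an assumption, and the transect positions and years are hypothetical.

```python
def end_point_rate(oldest_pos_m, newest_pos_m, oldest_year, newest_year):
    """Net movement along a transect divided by elapsed years (m/year).

    Positive values mean movement in the transect's positive direction.
    """
    return (newest_pos_m - oldest_pos_m) / (newest_year - oldest_year)


# Hypothetical transects: (position in the oldest map, position in the newest), in metres.
transects = [(0.0, 4560.0), (0.0, -4000.0), (0.0, 1200.0)]

rates = [end_point_rate(old, new, 1822, 2022) for old, new in transects]

# Average magnitude of migration over all transects, ignoring direction.
mean_abs_rate = sum(abs(r) for r in rates) / len(rates)
```

Averaging absolute rates keeps eastward and westward movement from cancelling out; averaging signed rates would instead describe the net drift of the channel.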

Keywords: river migration, remote sensing, river Yamuna, anthropogenic impacts, DSAS, Delhi-NCR

Procedia PDF Downloads 99
263 Implementation Association Rule Method in Determining the Layout of Qita Supermarket as a Strategy in the Competitive Retail Industry in Indonesia

Authors: Dwipa Rizki Utama, Hanief Ibrahim

Abstract:

The retail industry in Indonesia is developing rapidly, and retailers have adopted various strategies to raise customer satisfaction, purchase volume, and profit; one of these is layout strategy. The purpose of this study is to determine the layout of Qita supermarket, a retailer in Indonesia, in order to improve customer satisfaction and maximize overall product sales, so that infrequently purchased products are also purchased. This research uses a literature study together with association rules, a data mining method applied in market basket analysis. After pre-processing, 100 of the 160 data records were tested, giving the distribution over the 26 departments that correspond to the previous layout. From those data, the association rule method makes it possible to study which items customers purchase together, so that the supermarket layout can be determined from customer behavior. Using the RapidMiner software with a minimum support of 25% and a minimum confidence of 30%, the results showed that department 14 is purchased together with department 10, department 21 with department 13, department 15 with department 12, department 14 with department 12, and department 10 with department 14. From those results, a supermarket layout better than the previous one can be arranged.
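The support and confidence thresholds mentioned in the abstract can be illustrated with a small sketch. It is not the RapidMiner workflow used in the study: the baskets below are hypothetical department IDs per transaction, and only pairwise rules are enumerated for brevity.

```python
from itertools import permutations

# Hypothetical transactions: the set of departments visited in each purchase.
baskets = [
    {10, 14}, {10, 12, 14}, {12, 15}, {13, 21},
    {10, 14}, {12, 14}, {13, 21}, {10, 12},
]


def support(itemset):
    """Fraction of baskets that contain every department in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent): support(A and C) / support(A)."""
    return support(antecedent | consequent) / support(antecedent)


MIN_SUPPORT, MIN_CONFIDENCE = 0.25, 0.30  # thresholds quoted in the abstract

departments = sorted(set().union(*baskets))
rules = [
    (a, c, support({a, c}), confidence({a}, {c}))
    for a, c in permutations(departments, 2)
    if support({a, c}) >= MIN_SUPPORT and confidence({a}, {c}) >= MIN_CONFIDENCE
]
```

Each tuple in `rules` reads "customers who buy from department `a` also buy from department `c`", which is why a rule and its reverse (14 with 10, 10 with 14) can both appear, as they do in the reported results.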

Keywords: industry retail, strategy, association rule, supermarket

Procedia PDF Downloads 163
262 A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm

Authors: Javad Rahimipour Anaraki, Saeed Samet, Mahdi Eftekhari, Chang Wook Ahn

Abstract:

Feature selection and attribute reduction are crucial problems and widely used techniques in machine learning, data mining, and pattern recognition for overcoming the well-known curse of dimensionality. This paper presents a feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we employ the fuzzy-rough dependency degree (FRDD) of lower approximation-based fuzzy-rough feature selection (L-FRFS) because of its effectiveness in feature selection. For the search strategy, a modified version of the binary shuffled frog leaping algorithm (B-SFLA) is proposed. The proposed feature selection method is obtained by hybridizing the B-SFLA with the FRDD. Nine classifiers were employed to compare the proposed approach with several existing methods over twenty-two datasets from the UCI repository, including nine high-dimensional and large ones. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of the number of selected features and the classification accuracy.

Keywords: binary shuffled frog leaping algorithm, feature selection, fuzzy-rough set, minimal reduct

Procedia PDF Downloads 186
261 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) is used for classification of the Pima Indians Diabetes dataset. The aims of this study are to reduce the variance within the attributes of the diabetes dataset and to improve the accuracy of the classifier algorithms by transforming the dataset from non-linearly separable to linearly separable. The Pima Indians Diabetes dataset has two classes: normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of the K-means clustering method and is one of the most widely used clustering methods in data mining and machine learning applications. In the first stage, fuzzy C-means clustering is used to find the centers of the attributes in the Pima Indians Diabetes dataset, and the dataset is then weighted according to the ratios of the attribute means to their centers. In the second stage, the weighted dataset is classified with support vector machine (SVM) and k-nearest neighbor (k-NN) classifiers. Experimental results show that the proposed attribute weighting method (FCMAW) obtains very promising results in the classification of the Pima Indians Diabetes dataset.
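The two-step weighting described above can be sketched for a single attribute as follows. This is one possible reading of the abstract, not the author's code: it assumes the weight is the attribute mean divided by the mean of the fuzzy C-means cluster centers, uses two clusters with fuzzifier m = 2, and runs on hypothetical values rather than the Pima data. The helper names `fcm_1d` and `fcmaw_weight` are invented for the illustration.

```python
from statistics import mean


def fcm_1d(values, n_clusters=2, m=2.0, iters=50):
    """Fuzzy C-means on one attribute; returns the cluster centers.

    Assumes n_clusters >= 2 and at least two distinct values.
    """
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly over the value range.
    centers = [lo + (hi - lo) * k / (n_clusters - 1) for k in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    for _ in range(iters):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a center: full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((d / other) ** power for other in dists) for d in dists]
            memberships.append(row)
        # Each center is the membership-weighted mean of the values.
        centers = [
            sum(u[k] ** m * x for u, x in zip(memberships, values))
            / sum(u[k] ** m for u in memberships)
            for k in range(n_clusters)
        ]
    return centers


def fcmaw_weight(values):
    """Assumed weighting rule: attribute mean / mean of its cluster centers."""
    return mean(values) / mean(fcm_1d(values))


# Hypothetical attribute with two well-separated groups.
attribute = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
centers = sorted(fcm_1d(attribute))
weight = fcmaw_weight(attribute)
weighted_attribute = [x * weight for x in attribute]
```

In a full pipeline this would be repeated for every attribute before the weighted dataset is passed to the SVM or k-NN classifier.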

Keywords: fuzzy C-means clustering, fuzzy C-means clustering based attribute weighting, Pima Indians diabetes, SVM

Procedia PDF Downloads 383
260 ParkedGuard: An Efficient and Accurate Parked Domain Detection System Using Graphical Locality Analysis and Coarse-To-Fine Strategy

Authors: Chia-Min Lai, Wan-Ching Lin, Hahn-Ming Lee, Ching-Hao Mao

Abstract:

As the worldwide Internet continues to develop, making a profit by lending registered domain names has emerged as a new business in recent years. Unfortunately, the larger the domain lending market becomes, the greater the risk that malicious behavior or malware hides behind parked domains. Moreover, previous work on identifying parked domains suffers from two main defects: 1) too much data-collection effort and CPU latency are needed for feature engineering, and 2) it is ineffective at detecting parked domains containing external links, which are often abused by attackers, e.g., in drive-by download attacks. To alleviate these defects without sacrificing practical usability, this paper proposes ParkedGuard, an efficient and accurate parked domain detector. Several scripting behavioral features were analyzed, and those with particular statistical significance are adopted in ParkedGuard to make feature engineering much more cost-efficient. In addition, finding memberships between external links and parked domains was modeled as a graph mining problem, and a coarse-to-fine strategy was designed to leverage graphical locality, such that ParkedGuard outperforms the state of the art in terms of both recall and precision.

Keywords: coarse-to-fine strategy, domain parking service, graphical locality analysis, parked domain

Procedia PDF Downloads 386
259 Geomechanics Properties of Tuzluca (Eastern Turkey) Bedded Rock Salt and Geotechnical Safety

Authors: Mehmet Salih Bayraktutan

Abstract:

Geomechanical properties of the rock salt deposits in the Tuzluca Salt Mine area (Eastern Turkey) are studied for modeling the operation and excavation strategy. The research focuses on calculating the critical value of span height that will meet the safety requirements. The mine site, Tuzluca Hills, consists of alternating parallel beds of salt (NaCl) and gypsum (CaSO4·2H2O). Rock salt beds are more resistant than the narrow gypsum interlayers and form almost 97 percent of the total height of the hill. Therefore, the geotechnical safety of the galleries depends on the mechanical criteria of the rock salt cores. Deposition in the Tuzluca Basin was completed by the Tuzluca Evaporites, the uppermost stratigraphic unit. Mining operations are currently performed by classic mechanical excavation using the room and pillar method. Rooms and pillars are currently experiencing an initial stage of fracturing in places. The geotechnical safety of the whole mining area is evaluated by Rock Mass Rating (RMR), Rock Quality Designation (RQD), the spacing of joints, and the interaction of groundwater with the fracture system. In general, bedded rock salt shows a large lateral deformation capacity, while the deformation modulus stays at relatively small values (here E = 9.86 GPa). In such litho-stratigraphic environments, creep is a critical failure mechanism. The steady-state creep rate of rock salt is greater than that of the interbedded layers. Under long-lasting compressive stresses, creep may cause shear displacements, partly along bedding planes. Eventually, steady-state creep passes into accelerated stages. Uniaxial compression creep tests were performed on specimens to estimate rock salt strength. On rock salt cores, the average axial strength and strain were found to be 18-24 MPa and 0.43-0.45%, respectively, and the uniaxial compressive strength of bedded rock salt cores was 26-32 MPa.
The elastic modulus is comparatively low, but the lateral deformation of the rock salt is high under the uniaxial compression stress state. Measured values are: Poisson's ratio = 0.44, break load = 156 kN, cohesion c = 12.8 kg/cm2, and specific gravity SG = 2.17 g/cm3. For the fracture system, the spacing of fractures, joints, faults, and offsets is evaluated under the acting geodynamic mechanism. Two sand beds, each 4-6 m thick, exist near the upper level and at the top of the evaporite sequence. They act as aquifers and hold infiltrated water at the top for a long duration, which may result in the failure of roofs or pillars. Two major seismically active fault planes striking N30W and N70E, together with parallel fracture strands, pose a moderate risk of seismically triggered structural deformation of the rock salt bedding sequence. Earthquakes and floods are the two prevailing sources of geohazards in this region; the seismotectonic activity of the mine site is based on the crossing framework of the Kagizman and Igdir Faults. Dominant hazard risk sources include: a) weak mechanical properties (creep) of the rock salt, gypsum, and anhydrite beds; b) physical discontinuities cutting across the thick parallel layers of the evaporite mass; and c) intercalated beds of weakly cemented or loose sand and clayey sandy sediments. On the other hand, the absorbing effect of the parallel-bedded salt-gypsum deposits on seismic wave amplitudes reduces the seismic impact on the rock mass.

Keywords: bedded rock salt, creep, failure mechanism, geotechnical safety

Procedia PDF Downloads 167
258 Spreading Japan's National Image through China during the Era of Mass Tourism: The Japan National Tourism Organization’s Use of Sina Weibo

Authors: Abigail Qian Zhou

Abstract:

Since China has entered an era of mass tourism, there has been a fundamental change in the way Chinese people approach and perceive the image of other countries. With the advent of the new media era, social networking sites such as Sina Weibo have become a tool for many foreign governmental organizations to spread and promote their national image. Among them, the Japan National Tourism Organization (JNTO) was one of the first foreign official tourism agencies to register with Sina Weibo and actively implement communication activities. Due to historical and political reasons, cognition of Japan's national image by the Chinese has always been complicated and contradictory. However, since 2015, China has become the largest source of tourists visiting Japan. This clearly indicates that the broadening of Japan's national image in China has been effective and has value worthy of reference in promoting a positive Chinese perception of Japan and encouraging Japanese tourism. Within this context and using the method of content analysis in media studies through content mining software, this study analyzed how JNTO’s Sina Weibo accounts have constructed and spread Japan's national image. This study also summarized the characteristics of its content and form, and finally revealed the strategy of JNTO in building its international image. The findings of this study not only add a tourism-based perspective to traditional national image communications research, but also provide some reference for the effective international dissemination of national image in the future.

Keywords: national image, international communication, tourism, Japan, China

Procedia PDF Downloads 102
257 Relationship between the Ability of Accruals and Non-Systematic Risk of Shares for Companies Listed in Stock Exchange: Case Study, Tehran

Authors: Lina Najafian, Hamidreza Vakilifard

Abstract:

The present study focused on the relationship between the quality of accruals and non-systematic risk. The independent variables, considered as measures of accruals quality, were the ability of accruals, the information content of accruals, and the amount of discretionary accruals. The dependent variable was non-systematic risk based on the Fama and French Three Factor model (FFTFM) and the capital asset pricing model (CAPM). The control variables were firm size, financial leverage, stock return, cash flow fluctuations, and book-to-market ratio. Data were collected through library research and document mining, including financial statements. Multiple regression analysis was used to analyze the data. The results showed a significant direct relationship between financial leverage and discretionary accruals, on the one hand, and non-systematic risk based on FFTFM and CAPM, on the other. There is also a significant direct relationship between the ability of accruals, information content of accruals, firm size, and stock return and non-systematic risk based on both models. No relationship was found between book-to-market ratio or cash flow fluctuations and non-systematic risk.
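A common way to measure non-systematic risk under CAPM is the standard deviation of the residuals left after regressing a stock's excess return on the market's excess return. The abstract does not state the exact estimation procedure, so the sketch below is an assumption; the function name `capm_residual_risk` and the return series are hypothetical.

```python
from statistics import mean, pstdev


def capm_residual_risk(stock_excess, market_excess):
    """Fit stock = alpha + beta * market by least squares.

    Returns (beta, residual standard deviation); the latter is the
    proxy for non-systematic risk.
    """
    mx, my = mean(market_excess), mean(stock_excess)
    sxx = sum((x - mx) ** 2 for x in market_excess)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market_excess, stock_excess))
    beta = sxy / sxx
    alpha = my - beta * mx
    residuals = [y - (alpha + beta * x) for x, y in zip(market_excess, stock_excess)]
    return beta, pstdev(residuals)


# Hypothetical monthly excess returns.
market = [0.01, -0.02, 0.03, 0.00, 0.02]
stock = [0.03, -0.05, 0.05, 0.01, 0.03]

beta, idiosyncratic_risk = capm_residual_risk(stock, market)
```

The same residual-based idea extends to the Fama and French model by adding the size and value factors as further regressors, which would require a multiple regression rather than the single-factor fit shown here.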

Keywords: accruals quality, non-systematic risk, CAPM, FFTFM

Procedia PDF Downloads 137
256 Ecotourism Sites in Central Visayas, Philippines: A Green Business Profile

Authors: Ivy Jumao-As, Randy Lupango, Clifford Villaflores, Marites Khanser

Abstract:

Alongside inadequate implementation of ecotourism standards and other pressing issues of sustainable development is the lack of business plans and formal business structures at various ecotourism sites in Central Visayas, Philippines, and other parts of the country. Addressing these issues plays a key role in boosting ecotourism, which is a sustainability tool for the country's economic development. A three-phase study is designed to investigate the green business practices of selected ecotourism sites in the region in order to propose a business model for ecotourism destinations in the region and beyond. This paper reports the initial phase of the study, which described the profiles of the sites and their operators at the following selected destinations: Cebu City Protected Landscape and Olango Island Wildlife Bird Sanctuary in Cebu, and Rajah Sikatuna Protected Landscape in Bohol. Interviews and self-administered questionnaires with key informants, together with data mining, were employed in the data collection. Findings highlighted similarities and differences in terms of ecotourism products, type and number of visitors, manpower composition, cultural and natural resources, complementary services and products, awards and accreditation, and peak and off-peak seasons, among others. Recommendations based on common issues initially identified in this study are also highlighted.

Keywords: ecotourism, ecotourism sites, green business, sustainability

Procedia PDF Downloads 237
255 Parkinson’s Disease Detection Analysis through Machine Learning Approaches

Authors: Muhtasim Shafi Kader, Fizar Ahmed, Annesha Acharjee

Abstract:

Machine learning and data mining are crucial in health care, including medical information and disease detection. Machine learning approaches are now being used to improve understanding of a variety of critical health issues, including diabetes detection, neuron cell tumor diagnosis, COVID-19 identification, and so on. In Bangladesh, Parkinson's disease mainly affects senior citizens. Its symptoms are often progressive and worsen over time. As the condition advances, people have trouble walking and communicating. Patients can also experience psychological and behavioral changes, sleep problems, depression, memory loss, and fatigue. Parkinson's disease can occur in both men and women, though the two are not affected in equal proportion. In this research, we aim to identify the most accurate ML algorithm for detecting the disease on a prediction dataset by modeling the following machine learning classifiers. Nine ML classifiers are used in this study: Naive Bayes, Adaptive Boosting, Bagging Classifier, Decision Tree Classifier, Random Forest Classifier, XBG Classifier, K Nearest Neighbor Classifier, Support Vector Machine Classifier, and Gradient Boosting Classifier.

Keywords: naive bayes, adaptive boosting, bagging classifier, decision tree classifier, random forest classifier, XBG classifier, k nearest neighbor classifier, support vector classifier, gradient boosting classifier

Procedia PDF Downloads 102