Search results for: data mining applications and discovery
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29825

Search results for: data mining applications and discovery

29465 Web-Based Cognitive Writing Instruction (WeCWI): A Theoretical-and-Pedagogical e-Framework for Language Development

Authors: Boon Yih Mah

Abstract:

Web-based Cognitive Writing Instruction (WeCWI)’s contribution towards language development can be divided into linguistic and non-linguistic perspectives. In linguistic perspective, WeCWI focuses on the literacy and language discoveries, while the cognitive and psychological discoveries are the hubs in non-linguistic perspective. In linguistic perspective, WeCWI draws attention to free reading and enterprises, which are supported by the language acquisition theories. Besides, the adoption of process genre approach as a hybrid guided writing approach fosters literacy development. Literacy and language developments are interconnected in the communication process; hence, WeCWI encourages meaningful discussion based on the interactionist theory that involves input, negotiation, output, and interactional feedback. Rooted in the e-learning interaction-based model, WeCWI promotes online discussion via synchronous and asynchronous communications, which allows interactions happened among the learners, instructor, and digital content. In non-linguistic perspective, WeCWI highlights on the contribution of reading, discussion, and writing towards cognitive development. Based on the inquiry models, learners’ critical thinking is fostered during information exploration process through interaction and questioning. Lastly, to lower writing anxiety, WeCWI develops the instructional tool with supportive features to facilitate the writing process. To bring a positive user experience to the learner, WeCWI aims to create the instructional tool with different interface designs based on two different types of perceptual learning style.

Keywords: WeCWI, literacy discovery, language discovery, cognitive discovery, psychological discovery

Procedia PDF Downloads 545
29464 Neural Networks Models for Measuring Hotel Users Satisfaction

Authors: Asma Ameur, Dhafer Malouche

Abstract:

Nowadays, user comments on the Internet have an important impact on hotel bookings. This confirms that the e-reputation issue can influence the likelihood of customer loyalty to a hotel. In this way, e-reputation has become a real differentiator between hotels. For this reason, we have a unique opportunity in the opinion mining field to analyze the comments. In fact, this field provides the possibility of extracting information related to the polarity of user reviews. This sentimental study (Opinion Mining) represents a new line of research for analyzing the unstructured textual data. Knowing the score of e-reputation helps the hotelier to better manage his marketing strategy. The score we then obtain is translated into the image of hotels to differentiate between them. Therefore, this present research highlights the importance of hotel satisfaction ‘scoring. To calculate the satisfaction score, the sentimental analysis can be manipulated by several techniques of machine learning. In fact, this study treats the extracted textual data by using the Artificial Neural Networks Approach (ANNs). In this context, we adopt the aforementioned technique to extract information from the comments available in the ‘Trip Advisor’ website. This actual paper details the description and the modeling of the ANNs approach for the scoring of online hotel reviews. In summary, the validation of this used method provides a significant model for hotel sentiment analysis. So, it provides the possibility to determine precisely the polarity of the hotel users reviews. The empirical results show that the ANNs are an accurate approach for sentiment analysis. The obtained results show also that this proposed approach serves to the dimensionality reduction for textual data’ clustering. Thus, this study provides researchers with a useful exploration of this technique. Finally, we outline guidelines for future research in the hotel e-reputation field as comparing the ANNs with other technique.

Keywords: clustering, consumer behavior, data mining, e-reputation, machine learning, neural network, online hotel ‘reviews, opinion mining, scoring

Procedia PDF Downloads 115
29463 Planning Urban Sprawl in Mining Areas in Africa: How to Ensure Coherent Development

Authors: Pascal Rey, Anaïs Weber

Abstract:

Many mining projects are being developed in Africa the last decades. Due to the economic opportunities they offer, these projects result in a massive and rapid influx of migrants to the surrounding area. In areas where central government representation is low and local administration lack financial resources, urban development is often anarchical, beyond all public control. It leads to socio-spatial segregation, insecurity and the risk of social conflicts rising. Aware that their economic development is very correlated with local situation, mining companies get more and more involved in regional planning in setting up tools and Strategic Directions document. One of the commonly used tools in this regard is the “Influx Management Plan”. It consists in looking at the region’s absorption capacities in order to ensure its coherent development and by developing several urban centers than one macrocephalic city. It includes many other measures such as urban governance support, skills transfer, creation of strategic guidelines, financial support (local taxes, mining taxes, development funds etc.) local development projects. Through various examples of mining projects in Guinea, A country that is host to many large mining projects, we will look at the implications of regional and urban planning of which mining companies are key playor as well as public authorities. While their investment capacity offers advantages and accelerates development, their actions raise questions of the unilaterality of interests and local governance. By interfering in public affairs are mining companies not increasing the risk of central and local government shirking their responsibilities in terms of regional development, or even calling their legitimacy into question? Is such public-private collaboration really sustainable for the region as a whole and for all stakeholders?

Keywords: Africa, guinea, mine, urban planning

Procedia PDF Downloads 480
29462 Grid and Market Integration of Large Scale Wind Farms using Advanced Predictive Data Mining Techniques

Authors: Umit Cali

Abstract:

The integration of intermittent energy sources like wind farms into the electricity grid has become an important challenge for the utilization and control of electric power systems, because of the fluctuating behaviour of wind power generation. Wind power predictions improve the economic and technical integration of large amounts of wind energy into the existing electricity grid. Trading, balancing, grid operation, controllability and safety issues increase the importance of predicting power output from wind power operators. Therefore, wind power forecasting systems have to be integrated into the monitoring and control systems of the transmission system operator (TSO) and wind farm operators/traders. The wind forecasts are relatively precise for the time period of only a few hours, and, therefore, relevant with regard to Spot and Intraday markets. In this work predictive data mining techniques are applied to identify a statistical and neural network model or set of models that can be used to predict wind power output of large onshore and offshore wind farms. These advanced data analytic methods helps us to amalgamate the information in very large meteorological, oceanographic and SCADA data sets into useful information and manageable systems. Accurate wind power forecasts are beneficial for wind plant operators, utility operators, and utility customers. An accurate forecast allows grid operators to schedule economically efficient generation to meet the demand of electrical customers. This study is also dedicated to an in-depth consideration of issues such as the comparison of day ahead and the short-term wind power forecasting results, determination of the accuracy of the wind power prediction and the evaluation of the energy economic and technical benefits of wind power forecasting.

Keywords: renewable energy sources, wind power, forecasting, data mining, big data, artificial intelligence, energy economics, power trading, power grids

Procedia PDF Downloads 500
29461 Research on the Correlation between College Students' Physical Fitness and Running Habits: Data Mining of Smart Phone Sports App

Authors: Mingming Guo, Xiaozan Wang

Abstract:

Introduction: The purpose of this study is to examine the correlation between the physical fitness of Chinese college students and their daily running habits (RH). Methods: A total of 718 college students from East China Normal University participated in this study (385 boys and 333 girls). Each participant participated in the Chinese Students’ Physical Fitness Test during the 2018-2019 school year. In addition, each student is also required to use the app to record all their running results during each run during the 2018-2019 school year. Researchers can query and export all running records through the app's management platform. Results: (1) The total number of kilometers run by the students showed a significant negative correlation with their vital capacity (VC), sitting body flexion (SBF), and long jump (LJ) (rᵥ

Keywords: college students, physical fitness, running habits, data mining

Procedia PDF Downloads 127
29460 Assessment for the Backfill Using the Run of the Mine Tailings and Portland Cement

Authors: Javad Someehneshin, Weizhou Quan, Abdelsalam Abugharara, Stephen Butt

Abstract:

Narrow vein mining (NVM) is exploiting very thin but valuable ore bodies that are uneconomical to extract by conventional mining methods. NVM applies the technique of Sustainable Mining by Drilling (SMD). The SMD method is used to mine stranded, steeply dipping ore veins, which are too small or isolated to mine economically using conventional methods since the dilution is minimized. This novel mining technique uses drilling rigs to extract the ore through directional drilling surgically. This paper is focusing on utilizing the run of the mine tailings and Portland cement as backfill material to support the hanging wall for providing safe mine operation. Cemented paste backfill (CPB) is designed by mixing waste tailings, water, and cement of the precise percentage for optimal outcomes. It is a non-homogenous material that contains 70-85% solids. Usually, a hydraulic binder is added to the mixture to increase the strength of the CPB. The binder fraction mostly accounts for 2–10% of the total weight. In the mining industry, CPB has been improved and expanded gradually because it provides safety and support for the mines. Furthermore, CPB helps manage the waste tailings in an economical method and plays a significant role in environmental protection.

Keywords: backfilling, cement backfill, tailings, Portland cement

Procedia PDF Downloads 124
29459 Discovering Causal Structure from Observations: The Relationships between Technophile Attitude, Users Value and Use Intention of Mobility Management Travel App

Authors: Aliasghar Mehdizadeh Dastjerdi, Francisco Camara Pereira

Abstract:

The increasing complexity and demand of transport services strains transportation systems especially in urban areas with limited possibilities for building new infrastructure. The solution to this challenge requires changes of travel behavior. One of the proposed means to induce such change is multimodal travel apps. This paper describes a study of the intention to use a real-time multi-modal travel app aimed at motivating travel behavior change in the Greater Copenhagen Region (Denmark) toward promoting sustainable transport options. The proposed app is a multi-faceted smartphone app including both travel information and persuasive strategies such as health and environmental feedback, tailoring travel options, self-monitoring, tunneling users toward green behavior, social networking, nudging and gamification elements. The prospective for mobility management travel apps to stimulate sustainable mobility rests not only on the original and proper employment of the behavior change strategies, but also on explicitly anchoring it on established theoretical constructs from behavioral theories. The theoretical foundation is important because it positively and significantly influences the effectiveness of the system. However, there is a gap in current knowledge regarding the study of mobility-management travel app with support in behavioral theories, which should be explored further. This study addresses this gap by a social cognitive theory‐based examination. However, compare to conventional method in technology adoption research, this study adopts a reverse approach in which the associations between theoretical constructs are explored by Max-Min Hill-Climbing (MMHC) algorithm as a hybrid causal discovery method. A technology-use preference survey was designed to collect data. The survey elicited different groups of variables including (1) three groups of user’s motives for using the app including gain motives (e.g., saving travel time and cost), hedonic motives (e.g., enjoyment) and normative motives (e.g., less travel-related CO2 production), (2) technology-related self-concepts (i.e. technophile attitude) and (3) use Intention of the travel app. The questionnaire items led to the formulation of causal relationships discovery to learn the causal structure of the data. Causal relationships discovery from observational data is a critical challenge and it has applications in different research fields. The estimated causal structure shows that the two constructs of gain motives and technophilia have a causal effect on adoption intention. Likewise, there is a causal relationship from technophilia to both gain and hedonic motives. In line with the findings of the prior studies, it highlights the importance of functional value of the travel app as well as technology self-concept as two important variables for adoption intention. Furthermore, the results indicate the effect of technophile attitude on developing gain and hedonic motives. The causal structure shows hierarchical associations between the three groups of user’s motive. They can be explained by “frustration-regression” principle according to Alderfer's ERG (Existence, Relatedness and Growth) theory of needs meaning that a higher level need remains unfulfilled, a person may regress to lower level needs that appear easier to satisfy. To conclude, this study shows the capability of causal discovery methods to learn the causal structure of theoretical model, and accordingly interpret established associations.

Keywords: travel app, behavior change, persuasive technology, travel information, causality

Procedia PDF Downloads 124
29458 Customer Data Analysis Model Using Business Intelligence Tools in Telecommunication Companies

Authors: Monica Lia

Abstract:

This article presents a customer data analysis model using business intelligence tools for data modelling, transforming, data visualization and dynamic reports building. Economic organizational customer’s analysis is made based on the information from the transactional systems of the organization. The paper presents how to develop the data model starting for the data that companies have inside their own operational systems. The owned data can be transformed into useful information about customers using business intelligence tool. For a mature market, knowing the information inside the data and making forecast for strategic decision become more important. Business Intelligence tools are used in business organization as support for decision-making.

Keywords: customer analysis, business intelligence, data warehouse, data mining, decisions, self-service reports, interactive visual analysis, and dynamic dashboards, use cases diagram, process modelling, logical data model, data mart, ETL, star schema, OLAP, data universes

Procedia PDF Downloads 415
29457 Intra-miR-ExploreR, a Novel Bioinformatics Platform for Integrated Discovery of MiRNA:mRNA Gene Regulatory Networks

Authors: Surajit Bhattacharya, Daniel Veltri, Atit A. Patel, Daniel N. Cox

Abstract:

miRNAs have emerged as key post-transcriptional regulators of gene expression, however identification of biologically-relevant target genes for this epigenetic regulatory mechanism remains a significant challenge. To address this knowledge gap, we have developed a novel tool in R, Intra-miR-ExploreR, that facilitates integrated discovery of miRNA targets by incorporating target databases and novel target prediction algorithms, using statistical methods including Pearson and Distance Correlation on microarray data, to arrive at high confidence intragenic miRNA target predictions. We have explored the efficacy of this tool using Drosophila melanogaster as a model organism for bioinformatics analyses and functional validation. A number of putative targets were obtained which were also validated using qRT-PCR analysis. Additional features of the tool include downloadable text files containing GO analysis from DAVID and Pubmed links of literature related to gene sets. Moreover, we are constructing interaction maps of intragenic miRNAs, using both micro array and RNA-seq data, focusing on neural tissues to uncover regulatory codes via which these molecules regulate gene expression to direct cellular development.

Keywords: miRNA, miRNA:mRNA target prediction, statistical methods, miRNA:mRNA interaction network

Procedia PDF Downloads 486
29456 Educational Data Mining: The Case of the Department of Mathematics and Computing in the Period 2009-2018

Authors: Mário Ernesto Sitoe, Orlando Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.

Keywords: evasion and retention, cross-validation, bagging, stacking

Procedia PDF Downloads 65
29455 Improvement of the Aerodynamic Behaviour of a Land Rover Discovery 4 in Turbulent Flow Using Computational Fluid Dynamics (CFD)

Authors: Ahmed Al-Saadi, Ali Hassanpour, Tariq Mahmud

Abstract:

The main objective of this study is to investigate ways to reduce the aerodynamic drag coefficient and to increase the stability of the full-size Sport Utility Vehicle using three-dimensional Computational Fluid Dynamics (CFD) simulation. The baseline model in the simulation was the Land Rover Discovery 4. Many aerodynamic devices and external design modifications were used in this study. These reduction aerodynamic techniques were tested individually or in combination to get the best design. All new models have the same capacity and comfort of the baseline model. Uniform freestream velocity of the air at inlet ranging from 28 m/s to 40 m/s was used. ANSYS Fluent software (version 16.0) was used to simulate all models. The drag coefficient obtained from the ANSYS Fluent for the baseline model was validated with experimental data. It is found that the use of modern aerodynamic add-on devices and modifications has a significant effect in reducing the aerodynamic drag coefficient.

Keywords: aerodynamics, RANS, sport utility vehicle, turbulent flow

Procedia PDF Downloads 297
29454 Resource Framework Descriptors for Interestingness in Data

Authors: C. B. Abhilash, Kavi Mahesh

Abstract:

Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.

Keywords: RDF, interestingness, knowledge base, semantic data

Procedia PDF Downloads 141
29453 Developing Fault Tolerance Metrics of Web and Mobile Applications

Authors: Ahmad Mohsin, Irfan Raza Naqvi, Syda Fatima Usamn

Abstract:

Applications with higher fault tolerance index are considered more reliable and trustworthy to drive quality. In recent years application development has been shifted from traditional desktop and web to native and hybrid application(s) for the web and mobile platforms. With the emergence of Internet of things IOTs, cloud and big data trends, the need for measuring Fault Tolerance for these complex nature applications has increased to evaluate their performance. There is a phenomenal gap between fault tolerance metrics development and measurement. Classic quality metric models focused on metrics for traditional systems ignoring the essence of today’s applications software, hardware & deployment characteristics. In this paper, we have proposed simple metrics to measure fault tolerance considering general requirements for Web and Mobile Applications. We have aligned factors – subfactors, using GQM for metrics development considering the nature of mobile we apps. Systematic Mathematical formulation is done to measure metrics quantitatively. Three web mobile applications are selected to measure Fault Tolerance factors using formulated metrics. Applications are then analysed on the basis of results from observations in a controlled environment on different mobile devices. Quantitative results are presented depicting Fault tolerance in respective applications.

Keywords: web and mobile applications, reliability, fault tolerance metric, quality metrics, GQM based metrics

Procedia PDF Downloads 326
29452 Using Printouts as Social Media Evidence and Its Authentication in the Courtroom

Authors: Chih-Ping Chang

Abstract:

Different from traditional objective evidence, social media evidence has its own characteristics with easily tampering, recoverability, and cannot be read without using other devices (such as a computer). Simply taking a screenshot from social network sites must be questioned its original identity. When the police search and seizure digital information, a common way they use is to directly print out digital data obtained and ask the signature of the parties at the presence, without taking original digital data back. In addition to the issue on its original identity, this conduct to obtain evidence may have another two results. First, it will easily allege that is tampering evidence because the police wanted to frame the suspect and falsified evidence. Second, it is not easy to discovery hidden information. The core evidence associated with crime may not appear in the contents of files. Through discovery the original file, data related to the file, such as the original producer, creation time, modification date, and even GPS location display can be revealed from hidden information. Therefore, how to show this kind of evidence in the courtroom will be arguably the most important task for ruling social media evidence. This article, first, will introduce forensic software, like EnCase, TCT, FTK, and analyze their function to prove the identity with another digital data. Then turning back to the court, the second part of this article will discuss legal standard for authentication of social media evidence and application of that forensic software in the courtroom. As the conclusion, this article will provide a rethinking, that is, what kind of authenticity is this rule of evidence chase for. Does legal system automatically operate the transcription of scientific knowledge? Or furthermore, it wants to better render justice, not only under scientific fact, but through multivariate debating.

Keywords: federal rule of evidence, internet forensic, printouts as evidence, social media evidence, United States v. Vayner

Procedia PDF Downloads 277
29451 SPARK: An Open-Source Knowledge Discovery Platform That Leverages Non-Relational Databases and Massively Parallel Computational Power for Heterogeneous Genomic Datasets

Authors: Thilina Ranaweera, Enes Makalic, John L. Hopper, Adrian Bickerstaffe

Abstract:

Data are the primary asset of biomedical researchers, and the engine for both discovery and research translation. As the volume and complexity of research datasets increase, especially with new technologies such as large single nucleotide polymorphism (SNP) chips, so too does the requirement for software to manage, process and analyze the data. Researchers often need to execute complicated queries and conduct complex analyzes of large-scale datasets. Existing tools to analyze such data, and other types of high-dimensional data, unfortunately suffer from one or more major problems. They typically require a high level of computing expertise, are too simplistic (i.e., do not fit realistic models that allow for complex interactions), are limited by computing power, do not exploit the computing power of large-scale parallel architectures (e.g. supercomputers, GPU clusters etc.), or are limited in the types of analysis available, compounded by the fact that integrating new analysis methods is not straightforward. Solutions to these problems, such as those developed and implemented on parallel architectures, are currently available to only a relatively small portion of medical researchers with access and know-how. The past decade has seen a rapid expansion of data management systems for the medical domain. Much attention has been given to systems that manage phenotype datasets generated by medical studies. The introduction of heterogeneous genomic data for research subjects that reside in these systems has highlighted the need for substantial improvements in software architecture. To address this problem, we have developed SPARK, an enabling and translational system for medical research, leveraging existing high performance computing resources, and analysis techniques currently available or being developed. It builds these into The Ark, an open-source web-based system designed to manage medical data. SPARK provides a next-generation biomedical data management solution that is based upon a novel Micro-Service architecture and Big Data technologies. The system serves to demonstrate the applicability of Micro-Service architectures for the development of high performance computing applications. When applied to high-dimensional medical datasets such as genomic data, relational data management approaches with normalized data structures suffer from unfeasibly high execution times for basic operations such as insert (i.e. importing a GWAS dataset) and the queries that are typical of the genomics research domain. SPARK resolves these problems by incorporating non-relational NoSQL databases that have been driven by the emergence of Big Data. SPARK provides researchers across the world with user-friendly access to state-of-the-art data management and analysis tools while eliminating the need for high-level informatics and programming skills. The system will benefit health and medical research by eliminating the burden of large-scale data management, querying, cleaning, and analysis. SPARK represents a major advancement in genome research technologies, vastly reducing the burden of working with genomic datasets, and enabling cutting edge analysis approaches that have previously been out of reach for many medical researchers.

Keywords: biomedical research, genomics, information systems, software

Procedia PDF Downloads 253
29450 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating comparison and integration between JSON data to XML and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 120
29449 A Named Data Networking Stack for Contiki-NG-OS

Authors: Sedat Bilgili, Alper K. Demir

Abstract:

The current Internet has become the dominant use with continuing growth in the home, medical, health, smart cities and industrial automation applications. Internet of Things (IoT) is an emerging technology to enable such applications in our lives. Moreover, Named Data Networking (NDN) is also emerging as a Future Internet architecture where it fits the communication needs of IoT networks. The aim of this study is to provide an NDN protocol stack implementation running on the Contiki operating system (OS). Contiki OS is an OS that is developed for constrained IoT devices. In this study, an NDN protocol stack that can work on top of IEEE 802.15.4 link and physical layers have been developed and presented.

Keywords: internet of things (IoT), named-data, named data networking (NDN), operating system

Procedia PDF Downloads 151
29448 Screening for Hit Identification against Mycobacterium abscessus

Authors: Jichan Jang

Abstract:

Mycobacterium abscessus is a rapidly growing life-threatening mycobacterium with multiple drug-resistance mechanisms. In this study, we screened the library to identify active molecules targeting Mycobacterium abscessus using resazurin live/dead assays. In this screening assay, the Z-factor was 0.7, as an indication of the statistical confidence of the assay. A cut-off of 80% growth inhibition in the screening resulted in the identification of four different compounds at a single concentration (20 μM). Dose-response curves identified three different hit candidates, which generated good inhibitory curves. All hit candidates were expected to have different molecular targets. Thus, we found that compound X, identified, may be a promising candidate in the M. abscessus drug discovery pipeline.

Keywords: Mycobacterium abscessus, antibiotics, drug discovery, emerging Pathogen

Procedia PDF Downloads 191
29447 Parallel Genetic Algorithms Clustering for Handling Recruitment Problem

Authors: Walid Moudani, Ahmad Shahin

Abstract:

This research presents a study to handle the recruitment services system. It aims to enhance a business intelligence system by embedding data mining in its core engine and to facilitate the link between job searchers and recruiters companies. The purpose of this study is to present an intelligent management system for supporting recruitment services based on data mining methods. It consists to apply segmentation on the extracted job postings offered by the different recruiters. The details of the job postings are associated to a set of relevant features that are extracted from the web and which are based on critical criterion in order to define consistent clusters. Thereafter, we assign the job searchers to the best cluster while providing a ranking according to the job postings of the selected cluster. The performance of the proposed model used is analyzed, based on a real case study, with the clustered job postings dataset and classified job searchers dataset by using some metrics.

Keywords: job postings, job searchers, clustering, genetic algorithms, business intelligence

Procedia PDF Downloads 316
29446 Gravity and Magnetic Survey, Modeling and Interpretation in the Blötberget Iron-Oxide Mining Area of Central Sweden

Authors: Ezra Yehuwalashet, Alireza Malehmir

Abstract:

Blötberget mining area in central Sweden, part of the Bergslagen mineral district, is well known for its various type of mineralization particularly iron-oxide deposits since the 1600. To shed lights on the knowledge of the host rock structures, depth extent and tonnage of the mineral deposits and support deep mineral exploration potential in the study area, new ground gravity and existing aeromagnetic data (from the Geological Survey of Sweden) were used for interpretations and modelling. A major boundary separating a gravity low from a gravity high in the southern part of the study area is noticeable and likely representing a fault boundary separating two different lithological units. Gravity data and modeling offers a possible new target area in the southeast of the known mineralization while suggesting an excess high-density region down to 800 m depth.

Keywords: gravity, magnetics, ore deposit, geophysics

Procedia PDF Downloads 43
29445 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the rate of using technology that help discovering the diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. It has great potential to provide accurate medical diagnosis, to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied on such micro-array datasets to devise methods that can predict the occurrence of Leukemia disease. In this study, we compared the classification accuracy and response time among eleven decision tree methods and six rule classifier methods using five performance criteria. The experiment results show that the performance of Random Tree is producing better result. Also it takes lowest time to build model in tree classifier. The classification rules algorithms such as nearest- neighbor-like algorithm (NNge) is the best algorithm due to the high accuracy and it takes lowest time to build model in classification.

Keywords: data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data

Procedia PDF Downloads 306
29444 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

The issue with imbalance and overlapping in the class distribution becomes important in various applications of data mining. The imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., major class) heavily exceeds the number of observations of the other class (i.e., minor class). Overlapped dataset is the case where many observations are shared together between the two classes. Imbalanced and overlapped data can be frequently found in many real examples including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is the challenging issue because this situation degrades the performance of most of the standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting data space into three parts: nonoverlapping, light overlapping, and severe overlapping and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experiments study was conducted to examine the properties of the proposed method and compared it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through the experiment with real data.

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 293
29443 Representation Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building up a prediction model. The main objective in the time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit low dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques in dealing with uncertain data specifically those which solve uncertain data condition by minimizing the loss of compression properties.

Keywords: compression properties, uncertainty, uncertain time series, mining technique, weather prediction

Procedia PDF Downloads 414
29442 Coal Mining Safety Monitoring Using Wsn

Authors: Somdatta Saha

Abstract:

The main purpose was to provide an implementable design scenario for underground coal mines using wireless sensor networks (WSNs). The main reason being that given the intricacies in the physical structure of a coal mine, only low power WSN nodes can produce accurate surveillance and accident detection data. The work mainly concentrated on designing and simulating various alternate scenarios for a typical mine and comparing them based on the obtained results to arrive at a final design. In the Era of embedded technology, the Zigbee protocols are used in more and more applications. Because of the rapid development of sensors, microcontrollers, and network technology, a reliable technological condition has been provided for our automatic real-time monitoring of coal mine. The underground system collects temperature, humidity and methane values of coal mine through sensor nodes in the mine; it also collects the number of personnel inside the mine with the help of an IR sensor, and then transmits the data to information processing terminal based on ARM.

Keywords: ARM, embedded board, wireless sensor network (Zigbee)

Procedia PDF Downloads 329
29441 CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education

Authors: Eman AbuKhousa, Marwan Z. Bataineh

Abstract:

The 21st century higher education and globalization challenge new faculty members to build effective professional networks and partnership with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects together similar career-aspiration individuals who are socially influenced to join and engage in a process for domain-related knowledge and practice acquisitions. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.

Keywords: clustering analysis, community of practice, data mining, higher education, new faculty challenges, social network, social influence, professional development

Procedia PDF Downloads 166
29440 Improving Cryptographically Generated Address Algorithm in IPv6 Secure Neighbor Discovery Protocol through Trust Management

Authors: M. Moslehpour, S. Khorsandi

Abstract:

As transition to widespread use of IPv6 addresses has gained momentum, it has been shown to be vulnerable to certain security attacks such as those targeting Neighbor Discovery Protocol (NDP) which provides the address resolution functionality in IPv6. To protect this protocol, Secure Neighbor Discovery (SEND) is introduced. This protocol uses Cryptographically Generated Address (CGA) and asymmetric cryptography as a defense against threats on integrity and identity of NDP. Although SEND protects NDP against attacks, it is computationally intensive due to Hash2 condition in CGA. To improve the CGA computation speed, we parallelized CGA generation process and used the available resources in a trusted network. Furthermore, we focused on the influence of the existence of malicious nodes on the overall load of un-malicious ones in the network. According to the evaluation results, malicious nodes have adverse impacts on the average CGA generation time and on the average number of tries. We utilized a Trust Management that is capable of detecting and isolating the malicious node to remove possible incentives for malicious behavior. We have demonstrated the effectiveness of the Trust Management System in detecting the malicious nodes and hence improving the overall system performance.

Keywords: CGA, ICMPv6, IPv6, malicious node, modifier, NDP, overall load, SEND, trust management

Procedia PDF Downloads 173
29439 Sustainable and Responsible Mining - Lundin Mining’s Subsidiary in Portugal, Sociedade Mineira de Neves-Corvo Case

Authors: Jose Daniel Braga Alves, Joaquim Gois, Alexandre Leite

Abstract:

This abstract presents the responsible and sustainable mining case study of a Portuguese mine operation, highlighting how mine exploitation can sustainably exist in balance with the environment, aligned with all stakeholders. The mining operation is remotely located in a United Nations (UN) biodiversity reserve, away from major industrial centers or logistical ports, and presents an interesting investigation to assess the balanced mine operation in alignment with all key stakeholders, which presents unique opportunities as well as challenges. Based on the sustainable mining framework, it is intended to detail examples of best practices from Sociedade Mineira de Neves-Corvo (SOMINCOR), demonstrating social acceptance by the local community, health, and safety at work, reduction of environmental impacts and management of mining waste, which directly influence the acceptance and recognition of a sustainable operation. The case study aims to present the SOMINCOR approach to sustainable mining, focusing on social responsibility, considering materials provided by Lundin Mining Corporation (LMC) and SOMINCOR and the socially responsible approach of the mining operations., referencing related international guidelines, UN Sustainable Development Goals. The researchers reviewed LMC's annual Sustainability Reports (2019, 2020 and 2021) and updated information regarding material topics of the most significant interest to internal and external stakeholders. These material topics formed the basis of the corporation-wide sustainability strategy. LMC's Responsible Mining Policy (RMP) was reviewed, focusing on the commitment that guides the approach to responsible operation and management of the Company's business. Social performance, compliance, environmental management, governance, human rights, and economic contribution are principles of the RMP. The Human Rights Risk Impact Assessment (HRRIA), based on frameworks including UN Guiding Principles (UNGP), Voluntary Principles on Security and Human Rights, and a community engagement program implemented (SLO index), was part of this research. The program consists of ongoing surveys and perceptions studies using behavioural science insights, data from which was not available within the timeframe of completing this research. LMC stakeholder engagement standards and grievance mechanisms were also reviewed. Stakeholder engagement and the community's perception are key to this operation to ensure social license to operate (SLO). Preliminary surveys with local communities provided input data for the local development strategy. After the implementation of several initiatives, subsequent surveys were performed to assess acceptance and trust from the local communities and changes to the SLO index. SOMINCOR's operation contributes to 12 out of 17 sustainable development goals. From the assessed and available data, local communities and social engagement are priorities to SOMINCOR. Experience to date shows that the continual engagement with local communities and the grievance mechanisms in place are respected and followed for all concerns presented by any stakeholder. It can be concluded that this underground mine in Portugal complies with applicable regulations and goes beyond them with regard to sustainable development and engagement with key stakeholders.

Keywords: sustainable mining, development goals, portuguese mining, zinc copper

Procedia PDF Downloads 65
29438 Mathematics Bridging Theory and Applications for a Data-Driven World

Authors: Zahid Ullah, Atlas Khan

Abstract:

In today's data-driven world, the role of mathematics in bridging the gap between theory and applications is becoming increasingly vital. This abstract highlights the significance of mathematics as a powerful tool for analyzing, interpreting, and extracting meaningful insights from vast amounts of data. By integrating mathematical principles with real-world applications, researchers can unlock the full potential of data-driven decision-making processes. This abstract delves into the various ways mathematics acts as a bridge connecting theoretical frameworks to practical applications. It explores the utilization of mathematical models, algorithms, and statistical techniques to uncover hidden patterns, trends, and correlations within complex datasets. Furthermore, it investigates the role of mathematics in enhancing predictive modeling, optimization, and risk assessment methodologies for improved decision-making in diverse fields such as finance, healthcare, engineering, and social sciences. The abstract also emphasizes the need for interdisciplinary collaboration between mathematicians, statisticians, computer scientists, and domain experts to tackle the challenges posed by the data-driven landscape. By fostering synergies between these disciplines, novel approaches can be developed to address complex problems and make data-driven insights accessible and actionable. Moreover, this abstract underscores the importance of robust mathematical foundations for ensuring the reliability and validity of data analysis. Rigorous mathematical frameworks not only provide a solid basis for understanding and interpreting results but also contribute to the development of innovative methodologies and techniques. In summary, this abstract advocates for the pivotal role of mathematics in bridging theory and applications in a data-driven world. By harnessing mathematical principles, researchers can unlock the transformative potential of data analysis, paving the way for evidence-based decision-making, optimized processes, and innovative solutions to the challenges of our rapidly evolving society.

Keywords: mathematics, bridging theory and applications, data-driven world, mathematical models

Procedia PDF Downloads 57
29437 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data is then pre-processed using data pre-processing techniques. The processed data is fed into the proposed algorithm and the obtained result is analyzed. Some of the algorithms used in satellite imagery classification are U-Net, Random Forest, Deep Labv3, CNN, ANN, Resnet etc. In this project, we are using the DeepLabv3 (Atrous convolution) algorithm for land cover classification. The dataset used is the deep globe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 125
29436 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 291