Search results for: Semantic Web Usage Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1390

970 Sustainable Use of Laura Lens during Drought

Authors: Kazuhisa Koda, Tsutomu Kobayashi

Abstract:

Laura Island, located about 50 km from downtown, is a source of water supply for Majuro atoll, the capital of the Republic of the Marshall Islands. Low and flat, Majuro atoll has neither rivers nor lakes, so conserving its water resources is very important. However, upconing, the partial rise of the freshwater-saltwater boundary near a water-supply well, was caused by excess pumping from the well during the severe drought in 1998. Upconing makes the freshwater lens difficult to use. Thus, appropriate water usage is required to prevent upconing in the freshwater lens, because there is no other water source during drought. A numerical simulation of water usage applying the SEAWAT model was conducted for the central part of Laura Island, including the water supply well that was affected by upconing. The freshwater lens was created as a result of infiltration of consistent average rainfall. The lens shape was almost the same as the one in 1985. 0 of monthly rainfall and variable daily pump discharge were used to calculate the sustainable pump discharge from the water supply well. Consequently, the total amount of pump discharge increased as the daily pump discharge increased, indicating that more time is needed to recover from upconing. Thus, a pump standard to reduce the pump intensity is proposed, based on numerical simulation of the occurrence of upconing in Laura Island during the drought.

Keywords: Freshwater lens, islands, numerical simulation, sustainable water use.

Downloads: 1869
969 A Robust Salient Region Extraction Based on Color and Texture Features

Authors: Mingxin Zhang, Zhaogan Lu, Junyi Shen

Abstract:

In current research reports, salient regions are usually defined as those regions that present the main meaningful or semantic contents. However, there are no uniform saliency metrics that describe the saliency of implicit image regions. Most common metrics treat regions with many abrupt changes or unpredictable characteristics as salient. However, such metrics fail to detect useful salient regions with flat textures. In fact, according to human semantic perception, color and texture distinctions are the main characteristics that distinguish different regions. Thus, we present a novel saliency metric coupled with color and texture features, and its corresponding salient region extraction methods. In order to evaluate the saliency values of implicit regions in one image, three main colors and multi-resolution Gabor features are used as the color and texture features, respectively. For each region, its saliency value is the total sum of its Euclidean distances to the other regions in the color and texture spaces. A special synthesized image and several practical images with main salient regions are used to evaluate the performance of the proposed saliency metric and several other common metrics, i.e., scale saliency, wavelet transform modulus maxima point density, and important index based metrics. Experimental results verified that the proposed saliency metric achieves more robust performance than those common saliency metrics.

Keywords: salient regions, color and texture features, image segmentation, saliency metric
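
The distance-based score described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each region is assumed to be already reduced to one numeric feature vector (hypothetical color and texture values concatenated), and the name `region_saliency` and the sample values are invented for the example.

```python
import math

def region_saliency(features):
    """Return one saliency score per region.

    Each score is the sum of Euclidean distances from that region's
    feature vector to the feature vectors of all other regions.
    """
    scores = []
    for i, fi in enumerate(features):
        total = sum(math.dist(fi, fj) for j, fj in enumerate(features) if j != i)
        scores.append(total)
    return scores

# Hypothetical feature vectors: two similar regions and one distinct region.
features = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
scores = region_saliency(features)
# The region that lies far from the others receives the largest score.
most_salient = scores.index(max(scores))
```

Because the score depends only on distances in feature space, a region with flat texture can still rank as salient when its color or texture differs strongly from the rest of the image.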

Downloads: 1523
968 Analysis of Road Repairs in Undermined Areas

Authors: Tomáš Seidler, Marek Mihola, Denisa Cihlarova

Abstract:

The article presents the results of an analysis of maps of expected subsidence in undermined areas for road repair management. The analysis was done in the Karvina district in the Czech Republic, including undermined areas with ongoing or finished deep mining activities in the years 2003-2009. The article discusses the possibilities for local road maintenance authorities to determine, with limited data available, the areas that will need the most repairs in the future. Using the expected subsidence maps, a new map of surface curvature was calculated. Combined with road maps and historical data about repairs, the result yielded five main categories of undermined areas, providing a very simple tool for management.

Keywords: GIS, Map of Subsidence, Road, Undermined Area

Downloads: 1266
967 Parametric Analysis on Information Technology Adoption and Organizational Efficiency in Northern Nigeria

Authors: A. Y. Dutse, S. I. Ningi

Abstract:

The adoption and diffusion of Information Technology (IT) is one of the fastest growing trends in organizations operating within Nigeria’s economy. Public and private organizations make huge capital investments in an attempt to acquire and adopt state-of-the-art IT for improving operational efficiency. In this study, the level of IT adoption is considered the primary driver of the efficiency witnessed by organizations. The research gathered data on the intensity of IT usage and the resultant efficiency increase in the organizations’ operations. The data were analyzed using multiple regression analysis, which reveals that a high level of IT usage has enhanced the efficiency of private and public organizations in the northern part of Nigeria, with organizations having strategic intent on IT adoption indicating higher efficiency gains.

Keywords: IT Adoption, Nigeria, Organizational efficiency.

Downloads: 1326
966 Analysis of Physicochemical Properties on Prediction of R5, X4 and R5X4 HIV-1 Coreceptor Usage

Authors: Kai-Ti Hsu, Hui-Ling Huang, Chun-Wei Tung, Yi-Hsiung Chen, Shinn-Ying Ho

Abstract:

Bioinformatics methods for predicting the T cell coreceptor usage from the array of membrane protein of HIV-1 are investigated. In this study, we aim to propose an effective prediction method for dealing with the three-class classification problem of CXCR4 (X4), CCR5 (R5) and CCR5/CXCR4 (R5X4). We investigated the coreceptor prediction problem as follows: 1) proposing a feature set of informative physicochemical properties which is combined with an SVM to achieve a high prediction test accuracy of 81.48%, compared with the existing method with an accuracy of 70.00%; 2) establishing a large up-to-date data set by increasing the size from 159 to 1225 sequences to verify the proposed prediction method, where the mean test accuracy is 88.59%; and 3) analyzing the set of 14 informative physicochemical properties to further understand the characteristics of HIV-1 coreceptors.

Keywords: Coreceptor, genetic algorithm, HIV-1, SVM, physicochemical properties, prediction.

Downloads: 2336
965 The Conceptualization of Integrated Consumer Health Informatics Utilization Framework

Authors: Norfadzila, S.W.A., Balakrishnan, V., A. Abrizah

Abstract:

The purpose of this paper is to propose an integrated consumer health informatics utilization framework that can be used to gauge the online health information needs and usage patterns among Malaysian women. The proposed framework was developed based on four different theories/models: Use and Gratification Theory, Technology Acceptance Model 3, Health Belief Model, and Multi-level Model of Information Seeking. The relevant constructs and research hypotheses are also presented in this paper. The framework will be tested so that it can be used successfully to identify Malaysian women's preferences of online health information resources and health information seeking activities.

Keywords: Consumer Health Informatics, Consumer Preferences, Information Needs and Usage Patterns, Online Health Information, Women Studies

Downloads: 1669
964 Proposal for Cost Calculation of Warehouse Processes and Its Usage for Setting Standards for Performance Evaluation

Authors: Tomas Cechura, Michal Simon

Abstract:

This paper describes a proposal for cost calculation of warehouse processes and its usage for setting standards for performance evaluation. One of the most common options for monitoring process performance is benchmarking. The typical outcome is whether the monitored object is better or worse than an average or standard. Traditional approaches, however, cannot find any specific opportunities to improve performance or eliminate inefficiencies in processes. Higher process efficiency can be achieved, for example, by cost reduction, assuming that the same output is generated. However, costs can be reduced only if we know their structure and are able to calculate them accurately. In the warehouse process area this is rather difficult, because in most cases only aggregated values with low explanatory ability are available. The aim of this paper is to create a suitable method for calculating the storage costs. A practical example of process calculation is shown at the end.

Keywords: Calculation, Costs, Performance, Process, Warehouse.

Downloads: 9273
963 The Study of Tourists’ Behavior in Water Usage in Hotel Business: Case Study of Phuket Province, Thailand

Authors: A. Pensiri, K. Nantaporn, P. Parichut

Abstract:

Tourism is very important to the economy of many countries due to its large contribution to employment and income generation. However, with its rapid growth, tourism can also be considered one of the major water users, and can therefore have a significant and detrimental impact on the environment. Guest behavior in water usage can be used to manage water in hotels for sustainable water resources management. This research presents a study of hotel guest water usage behavior at two hotels, namely Hotel A (located in Kathu district) and Hotel B (located in Muang district) in Phuket Province, Thailand, as case studies. Primary and secondary data were collected from the hotel manager through interviews and questionnaires. The water flow rate was measured in situ from each water supply device in the standard room type at each hotel, including hand washing faucets, bathroom faucets, shower and toilet flush. For the interview, the majority of respondents (n = 204 for Hotel A and n = 244 for Hotel B) were aged between 21 years and 30 years (53% for Hotel A and 65% for Hotel B), and the majority were foreign (78% in Hotel A, and 92% in Hotel B), from America, France and Austria, visiting for purposes of tourism (63% in Hotel A, and 55% in Hotel B). The data showed that water consumption ranged from 188 litres to 507 litres, and 383 litres to 415 litres per overnight guest in Hotel A and Hotel B (n = 244), respectively. These figures exceed the water efficiency benchmark set for tropical regions by the International Tourism Partnership (ITP). It is recommended that guest water saving initiatives be implemented at hotels. Moreover, the results showed that guests are highly satisfied with the hotels: front office service received the top average scores of 4.35 in Hotel A and 4.20 in Hotel B, while luxury decoration and room cleanliness received the second-highest satisfaction scores in Hotel A and Hotel B, respectively. On the basis of this information, the findings can be very useful for improving customer service satisfaction and for paying attention to this particular aspect for better hotel management.

Keywords: Hotel, tourism, Phuket, water usage.

Downloads: 2239
962 An Algorithm Proposed for FIR Filter Coefficients Representation

Authors: Mohamed Al Mahdi Eshtawie, Masuri Bin Othman

Abstract:

Finite impulse response (FIR) filters have the advantages of linear phase, guaranteed stability, fewer finite precision errors, and efficient implementation. On the other hand, they have the major disadvantage of requiring a higher order (more coefficients) than an IIR counterpart with comparable performance. The high order imposes more hardware requirements, arithmetic operations, area usage, and power consumption when designing and fabricating the filter. Therefore, minimizing or reducing these parameters is a major goal in digital filter design. This paper presents an algorithm for modifying the values and the number of non-zero coefficients used to represent the FIR digital pulse shaping filter response. With this algorithm, the FIR filter frequency and phase response can be represented with a minimum number of non-zero coefficients, thereby reducing the arithmetic complexity needed to get the filter output. Consequently, system characteristics, i.e. power consumption, area usage, and processing time, are also reduced. The proposed algorithm is more powerful when integrated with multiplierless algorithms such as distributed arithmetic (DA) in designing high order digital FIR filters. Here the DA usage eliminates the need for multipliers when implementing the multiply and accumulate unit (MAC), and the proposed algorithm reduces the number of adders and addition operations needed by minimizing the number of non-zero coefficients used to get the filter output.

Keywords: Pulse shaping Filter, Distributed Arithmetic, Optimization algorithm.

Downloads: 3128
961 Impact of Coal Mining on River Sediment Quality in the Sydney Basin, Australia

Authors: A. Ali, V. Strezov, P. Davies, I. Wright, T. Kan

Abstract:

The environmental impacts arising from mining activities affect air, water, and soil quality. Impacts may result in unexpected and adverse environmental outcomes. This study reports on the impact of coal production on sediment in the Sydney region of Australia. Sediment samples were taken upstream and downstream of the discharge points of three mines, and 80 parameters were tested. The results were assessed against sediment quality based on the presence of metals. The study revealed an increase in metal content in the sediment downstream of the reference locations. In many cases, the sediment was above the Australia and New Zealand Environment Conservation Council and international sediment quality guidelines value (SQGV). The major outliers to the guidelines were nickel (Ni) and zinc (Zn).

Keywords: Coal mine, environmental impact, produced water, sediment quality guidelines value.

Downloads: 1362
960 M2LGP: Mining Multiple Level Gradual Patterns

Authors: Yogi Satrya Aryadinata, Anne Laurent, Michel Sala

Abstract:

Gradual patterns have been studied for many years as they contain precious information. They have been integrated in many expert systems and rule-based systems, for instance to reason on knowledge such as “the greater the number of turns, the greater the number of car crashes”. In many cases, this knowledge has been considered as a rule “the greater the number of turns → the greater the number of car crashes”. Historically, work has thus focused on the representation of such rules, studying how implication could be defined, especially fuzzy implication. These rules were defined by experts who were in charge of describing the systems they were working on so that they could operate automatically. More recently, approaches have been proposed to mine databases for automatically discovering such knowledge. Several approaches have been studied, the main scientific topics being how to determine what a relevant gradual pattern is, and how to discover such patterns as efficiently as possible (in terms of both memory and CPU usage). However, in some cases, end-users are not interested in raw level knowledge, and are rather interested in trends. Moreover, it may be the case that no relevant pattern can be discovered at a low level of granularity (e.g. city), whereas some can be discovered at a higher level (e.g. county). In this paper, we thus extend gradual pattern approaches in order to consider multiple level gradual patterns. For this purpose, we consider two aggregation policies, namely horizontal and vertical.

Keywords: Gradual Pattern.
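
To make the notion of a gradual pattern concrete, the sketch below scores the pattern "the greater A, the greater B" as the fraction of object pairs that vary in the same direction on both attributes. This is only one of several support measures found in the gradual pattern literature; the abstract does not say which measure M2LGP uses, so the measure, the function name `gradual_support`, and the sample rows are all assumptions made for illustration.

```python
from itertools import combinations

def gradual_support(rows, a, b):
    """Fraction of row pairs that are concordant on attributes a and b.

    A pair is concordant when the row with the larger value of `a`
    also has the larger value of `b`.
    """
    pairs = list(combinations(rows, 2))
    concordant = sum(
        1 for r, s in pairs if (r[a] - s[a]) * (r[b] - s[b]) > 0
    )
    return concordant / len(pairs)

# Hypothetical low-level (e.g. city) records.
rows = [
    {"turns": 1, "crashes": 2},
    {"turns": 3, "crashes": 5},
    {"turns": 4, "crashes": 4},
]
support = gradual_support(rows, "turns", "crashes")
```

Aggregating rows to a coarser level (for example, county) before calling the function is one way the same measure could be evaluated at a higher level of granularity.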

Downloads: 1453
959 Monitoring of Spectrum Usage and Signal Identification Using Cognitive Radio

Authors: O. S. Omorogiuwa, E. J. Omozusi

Abstract:

The monitoring of spectrum usage and signal identification, using cognitive radio, is done to identify frequencies that are vacant for reuse. It has been established that ‘internet of things’ devices use secondary frequencies, which are free, and thereby face the challenge of interference from other users, while some primary frequencies are not being utilised. The design was done by analysing a specific frequency spectrum, checking whether all the frequency stations in the range 87.5-108 MHz are presently being used in Benin City, Edo State, Nigeria. From the results, it was noticed that by using Software Defined Radio/Simulink, we were able to identify vacant frequencies in the frequency range under consideration. Also, we were able to use the energy detection threshold to reuse this vacant frequency spectrum when the cognitive radio displays a zero output (that is, decision H0), meaning that the channel is unoccupied. Hence, the analysis was able to find the spectrum hole and identify how it can be reused.

Keywords: Spectrum, interference, telecommunication, cognitive radio, frequency.
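
The energy detection decision mentioned in this abstract can be sketched as a simple threshold test. The paper used Software Defined Radio/Simulink; the Python below is a stand-in illustration only, and the function name `energy_detect`, the sample values, and the threshold are hypothetical.

```python
def energy_detect(samples, threshold):
    """Return 0 (decision H0, channel vacant) or 1 (decision H1, occupied).

    The test statistic is the average energy of the received samples,
    which is compared against a fixed detection threshold.
    """
    energy = sum(abs(x) ** 2 for x in samples) / len(samples)
    return 1 if energy > threshold else 0

# Hypothetical received samples for two channels.
noise_only = [0.1, -0.1, 0.05]
with_signal = [1.0, -1.2, 0.9]
threshold = 0.5  # assumed value; in practice it is set from the noise level

vacant_decision = energy_detect(noise_only, threshold)
occupied_decision = energy_detect(with_signal, threshold)
```

A zero output marks the channel as a candidate spectrum hole, matching the H0 decision described in the abstract.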

Downloads: 803
958 Using Data Mining Techniques for Finding Cardiac Outlier Patients

Authors: Farhan Ismaeel Dakheel, Raoof Smko, K. Negrat, Abdelsalam Almarimi

Abstract:

In this paper we used data mining techniques to identify outlier patients who use large amounts of drugs over a long period of time. Any healthcare or health insurance system should deal with the quantities of drugs utilized by chronic disease patients. In the Kingdom of Bahrain, about 20% of the health budget is spent on medications. The managers of healthcare systems do not have enough information about how chronic disease patients utilize drugs, or whether there is misuse or there are outlier patients. In this work, which was done in cooperation with the information department of the Bahrain Defence Force hospital, we selected the data for cardiac patients in the period from 1/1/2008 to 31/12/2008 as the data for the model in this paper. We used three techniques for finding the drug utilization of cardiac patients. First we applied a clustering technique, followed by measurement of clustering validity, and finally we applied a decision tree as the classification algorithm. The clustering partitions the 1603 patients, who received 15,806 prescriptions during this period, into three groups according to drug utilization, where 23 patients (2.59%) who received 1316 prescriptions (8.32%) are classified as outliers. The classification algorithm shows that the average drug utilization, the age, and the gender of the patient can be considered the main predictive factors in the induced model.

Keywords: Data Mining, Clustering, Classification, Drug Utilization.

Downloads: 1853
957 An Enhanced Slicing Algorithm Using Nearest Distance Analysis for Layer Manufacturing

Authors: M. Vatani, A. R. Rahimi, F. Brazandeh, A. Sanati nezhad

Abstract:

Although the STL (stereo lithography) file format is widely used as a de facto standard in the rapid prototyping industry due to its simplicity and its ability to tessellate almost all surfaces, its usage always involves some defects and shortcomings, many of which are difficult to correct manually. When processing complex models, the size of the file and the number of its defects grow greatly, so correcting STL files becomes difficult. In this paper, by optimizing the existing algorithms, the size of the files and the computer memory needed to process them are reduced. Regardless of the type and extent of the errors in STL files, the tail-to-head searching method and analysis of the nearest distance between tails and heads were used. As a result, STL models were sliced rapidly, and fully closed contours were produced effectively and without errors.

Keywords: Layer manufacturing, STL files, slicing algorithm, nearest distance analysis.
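
The idea of linking slice segments by nearest tail-to-head distance can be sketched as below. This is a simplified illustration, not the paper's algorithm: it assumes a single contour per layer, consistently oriented 2D segments stored as `(tail, head)` point pairs, and a hypothetical function name `link_contour`.

```python
import math

def link_contour(segments):
    """Order segments so each one starts nearest to where the previous ended.

    Starting from the first segment, repeatedly pick the remaining segment
    whose tail (start point) is closest to the current head (end point).
    Small gaps caused by STL defects are tolerated because the nearest
    tail is chosen instead of requiring an exact coordinate match.
    """
    remaining = list(segments)
    contour = [remaining.pop(0)]
    while remaining:
        head = contour[-1][1]
        nearest = min(remaining, key=lambda seg: math.dist(head, seg[0]))
        remaining.remove(nearest)
        contour.append(nearest)
    return contour

# Hypothetical unordered slice segments of a unit square with tiny gaps.
segments = [
    ((0.0, 0.0), (1.0, 0.0)),
    ((1.0, 1.0), (0.0, 1.0)),
    ((1.001, 0.0), (1.0, 1.0)),
    ((0.0, 1.0), (0.0, 0.001)),
]
contour = link_contour(segments)
tails_in_order = [seg[0] for seg in contour]
```

The greedy nearest-neighbor search is quadratic in the number of segments; a spatial index would be a natural refinement for large layers.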

Downloads: 4087
956 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, using a board to collect data presents many difficulties, especially when the user is not a computer programmer or is using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads: 1959
955 A Novel Approach to Optimal Cutting Tool Replacement

Authors: Cem Karacal, Sohyung Cho, William Yu

Abstract:

In metal cutting industries, mathematical/statistical models are typically used to predict tool replacement time. These off-line methods usually result in a less than optimal replacement time, thereby either wasting resources or causing quality problems. The few online real-time methods proposed use indirect measurement techniques and are prone to similar errors. Our idea is based on identifying the optimal replacement time using an electronic nose to detect the airborne compounds released when the tool wear reaches a chemical substrate doped into the tool material during fabrication. The study investigates the feasibility of the idea, and possible doping materials and methods, along with data stream mining techniques for detecting and monitoring different phases of tool wear.

Keywords: Tool condition monitoring, cutting tool replacement, data stream mining, e-Nose.

Downloads: 1849
954 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damage to health and the environment, economic losses and material damage. Traditional studies of road traffic accidents in urban zones represent a very high investment of time and money; additionally, the results are not current. However, nowadays in many countries, crowdsourced GPS-based traffic and navigation apps have emerged as an important low-cost source of information for studies of road traffic accidents and the urban congestion caused by them. In this article we identified the zones, roads and specific times in CDMX in which the largest number of road traffic accidents were concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Knowledge Discovery in Databases (KDD) for the discovery of patterns in the accident reports. Furthermore, data mining techniques were applied with the help of Weka. The selected algorithms were Expectation-Maximization (EM), to obtain the ideal number of clusters for the data, and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.

Keywords: Data mining, K-means, road traffic accidents, Waze, Weka.

Downloads: 1146
953 Design of Personal Job Recommendation Framework on Smartphone Platform

Authors: Chayaporn Kaensar

Abstract:

Recently, Job Recommender Systems have gained much attention in industry since they solve the problem of information overload on recruiting websites. Therefore, we proposed an Extended Personalized Job System that is capable of providing appropriate jobs for job seekers and recommending suitable information to them using Data Mining Techniques and a Dynamic User Profile. On the other hand, companies can also interact with the system to publish and update job information. The system supports various platforms, such as a web application and an Android mobile application. In this paper, User Profiles, Implicit User Action, User Feedback, and Clustering Techniques in WEKA libraries were applied and implemented. In addition, open source tools such as the Yii Web Application Framework, the Bootstrap Front End Framework and Android Mobile Technology were also applied.

Keywords: Recommendation, user profile, data mining, web technology, mobile technology.

Downloads: 2107
952 Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences

Authors: Chien-Hua Wang, Chin-Tzong Pang

Abstract:

In data mining, association rules are used to search for relations among the items of a transaction database. Once the data are collected and stored, rules of value can be found through association rules, helping managers to carry out marketing strategy and plan the market framework. In this paper, we apply fuzzy partition methods and decide the membership function of the quantitative values of each transaction item. Also, managers can express the importance of items as linguistic terms, which are transformed into fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth (FWFP-Growth) is used to complete the process of data mining. The method is expected to improve on the Apriori algorithm through better efficiency in finding the whole set of association rules. An example is given to clearly illustrate the proposed approach.

Keywords: Association Rule, Fuzzy Partition Methods, FWFP-Growth, Apriori algorithm
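
The fuzzy partition step described in this abstract can be illustrated with triangular membership functions, a common choice for turning a quantitative value into linguistic terms. The abstract does not specify the membership shape or the partition points, so the three terms, their breakpoints, and the function names below are assumptions made for the example.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def memberships(quantity):
    """Map a purchased quantity in [0, 10] to three hypothetical fuzzy terms."""
    return {
        "low": triangular(quantity, -5.0, 0.0, 5.0),
        "middle": triangular(quantity, 0.0, 5.0, 10.0),
        "high": triangular(quantity, 5.0, 10.0, 15.0),
    }

# A quantity of 4 belongs mostly to "middle" and partly to "low".
m = memberships(4.0)
dominant_term = max(m, key=m.get)
```

With evenly spaced, overlapping triangles the degrees of a value sum to one inside the range, so each transaction item contributes a fixed total weight to the fuzzy support counts.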

Downloads: 1602
951 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining

Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrêa, Ronaldo G. Morato

Abstract:

Environmental changes and major natural disasters have become more prevalent in the world due to the damage that humanity has caused to nature, and this damage directly affects the lives of animals. Thus, the study of animal behavior and of animals' interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine the patterns of animal behavior, and with technological advances the ability to track animals and, consequently, behavioral studies have expanded. There is a lot of research on animal movement and behavior, but we note that a proposal that combines resources, allows for exploratory analysis of animal movement, and provides statistical measures of individual animal behavior and its interaction with the environment is missing. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in answering questions about individual animal behavior and interactions with other animals over time and space. We evaluated the framework using monitored jaguar data from the city of Miranda, Pantanal, Brazil, in order to verify whether AniMoveMineR makes it possible to identify the interaction level between these jaguars. The results were positive and provided indications about the individual behavior of jaguars and about which jaguars have the highest or lowest correlation.

Keywords: Data mining, data science, trajectory, animal behavior.

Downloads: 827
950 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and such requests are often ignored by learners. Especially in the booming realm of Massive Open Online Course (MOOC) platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only, which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structure. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic- and syntactic-level sentence processing based on linguistics will render a richer representation. We thus evaluate the traditional LSTM versus other bleeding edge models that take syntactic structure into account, such as tree-structured LSTM, the Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on our MOOC dataset, which is the most performant one compared with a public dataset on sentiment analysis that is further used to cross-examine the models' results.

Keywords: Deep learning, data mining, gender prediction, MOOCs.

Downloads: 1292
949 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset

Authors: N. Poolsawad, C. Kambhampati, J. G. F. Cleland

Abstract:

In this paper, we investigated the characteristics of a clinical dataset with respect to feature selection and classification measurements that deal with the missing values problem. We also propose appropriate techniques to achieve the aim of the activity; this research aims to find features that have a high effect on mortality and the mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed a data mining process to cope with its complexity, namely missing values, high dimensionality, and the prediction problem, by using the methods of missing value replacement, feature selection, and classification. The experimental results will be extended to develop a prediction model for cardiology.

Keywords: feature selection, missing values, classification, clinical dataset, heart failure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3164
948 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa

Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana

Abstract:

Water is a scarce natural resource in South Africa. Ga-Selati River is used for both domestic and industrial purposes. This study was carried out in order to assess the quality of Ga-Selati River in a mining area of Limpopo Province (Phalaborwa). The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter, while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in the wet and dry seasons, respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in the dry season (929.29 mg/L) than in the wet season (640.72 mg/L). These values exceeded the recommended guideline of the South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mS/m), respectively. Turbidity varied between 1.78-5.20 NTU and 0.95-2.37 NTU in the wet and dry seasons, respectively. Total hardness, expressed as the concentration of CaCO3, was computed as 312.50 mg/L and 297.75 mg/L for the wet and the dry seasons, respectively, and the river water was categorised as very hard. The mean concentrations of the metals studied in the wet and the dry seasons, respectively, were: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L). Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.

Keywords: Contamination, mining activities, surface water, trace metals.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1913
947 Iran’s Gas Flare Recovery Options Using MCDM

Authors: Halle Bakhteeyar, Azadeh Maroufmashat, Abbas Maleki, Sourena Sattari Khavas

Abstract:

In this paper, five options for Iran’s gas flare recovery are compared via a multi-criteria decision-making (MCDM) method. To develop the model, the weighting factor of each indicator is determined with the AHP method using the Expert Choice software. Several cases are considered in this analysis. In each case, one criterion is always kept in first position, while the priorities of the other criteria are defined by ordinal information describing the mutual relations of the criteria and the respective indicators. The results show that, among these cases, CHP usage takes priority where the availability indicator is highly weighted, pipeline usage takes priority where the environmental indicator is highly weighted, and injection takes priority where the economic indicator is highly weighted and also when the weighting factors of all the criteria are the same.

Keywords: Flare, Gas, Iran.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3389
946 Highlighting Document's Structure

Authors: Sylvie Ratté, Wilfried Njomgue, Pierre-André Ménard

Abstract:

In this paper, we present symbolic recognition models to extract knowledge characterized by document structures. Focussing on the extraction and the meticulous exploitation of the semantic structure of documents, we obtain a meaningful contextual tagging corresponding to different unit types (title, chapter, section, enumeration, etc.).

Keywords: Information retrieval, document structures, symbolic grammars.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1188
945 Utilizing Ontologies Using Ontology Editor for Creating Initial Unified Modeling Language (UML) Object Model

Authors: Waralak Vongdoiwang Siricharoen

Abstract:

One problem in object-oriented software development is the difficulty of finding appropriate and suitable objects with which to start the system. In this work, ontologies support object discovery in the initial stage of object-oriented software development. Many studies try to demonstrate that there is great potential in the relationship between object models and ontologies. Constructing an ontology from an object model, which is called ontology engineering, can be done; conversely, this research aims to support the idea that building an object model from an ontology is also promising and practical. Ontology classes are available online in many specific areas and can be found with semantic search engines. There are also many supporting tools; those used in this research are the Protégé ontology editor and Visual Paradigm. Putting them together gives a good outcome. This research shows how the approach works efficiently in a real case study using ontology classes in the travel/tourism domain. Classes, properties, and relationships from more than two ontologies need to be combined in order to generate the object model. This paper presents a simple methodology framework that explains the process of discovering objects. The results show that this framework has great value, while expansion remains possible. Reusing existing ontologies offers a much cheaper alternative to building new ones from scratch. More ontologies are becoming available on the web, and online ontology libraries for storing and indexing ontologies are increasing in number and demand. Semantic and ontology search engines have also started to appear, to facilitate search and retrieval of online ontologies.

Keywords: Software Development, Ontology, Ontology Library, Artificial Intelligence, Protégé, Object Model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1836
944 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

A challenging task for educational institutions is to maximize student performance and minimize the failure rate of poor-performing students. An effective way to address this task is to identify student learning patterns and their most influential factors, and to obtain an early prediction of student learning outcomes in time to set up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The aim of this work is to propose EDM techniques and integrate them into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high-performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. In the context of intervention and improving learning outcomes, a feature selection method, MICHI, which combines the mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves prediction performance; the obtained dominant set is then used as information for intervention. Using the proposed EDM techniques, an academic performance prediction system (APPS) is subsequently developed for educational stakeholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals intervene and improve student performance.
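As a rough illustration of combining MI and chi-square through ranked feature scores, the sketch below scores each discrete feature with both measures, ranks the features under each measure, and keeps the features with the best average rank. The averaging rule, the function names, and the toy student data are assumptions made for illustration; the abstract does not specify how MICHI merges the two rankings.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def chi_square(xs, ys):
    """Pearson chi-square statistic of the contingency table of xs and ys."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for x in px:
        for y in py:
            expected = px[x] * py[y] / n
            stat += (pxy[(x, y)] - expected) ** 2 / expected
    return stat

def ranks(scores):
    """Map feature name -> rank, where rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}

def select_features(features, labels, k):
    """Keep the k features with the best average of MI rank and CHI rank.

    Averaging the two ranks is an assumed combination rule, not
    necessarily the one used by the MICHI method.
    """
    mi = {f: mutual_information(v, labels) for f, v in features.items()}
    chi = {f: chi_square(v, labels) for f, v in features.items()}
    mi_rank, chi_rank = ranks(mi), ranks(chi)
    combined = {f: (mi_rank[f] + chi_rank[f]) / 2 for f in features}
    return sorted(features, key=lambda f: combined[f])[:k]

# Hypothetical toy data: eight students, three categorical features.
labels = ["pass", "pass", "pass", "pass", "fail", "fail", "fail", "fail"]
features = {
    "attendance": ["high", "high", "high", "high", "low", "low", "low", "low"],
    "homework": ["done", "done", "done", "missed", "done", "missed", "missed", "missed"],
    "shoe_size": ["s", "l", "s", "l", "s", "l", "s", "l"],
}
dominant = select_features(features, labels, k=2)
print(dominant)
```

On this toy data the uninformative feature is ranked last by both measures, so it is the one dropped when two features are kept.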

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 909
943 Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Kyung Bae Park, Sung Ho Ha

Abstract:

Online user-generated content (UGC) significantly changes the way customers behave (e.g., shop, travel), and the pressing need to handle the overwhelming amount of varied UGC is one of the paramount issues for management. However, current approaches (e.g., sentiment analysis) are often ineffective at leveraging textual information to detect the problems or issues that a given management team suffers from. In this paper, we apply Latent Dirichlet Allocation (LDA) text mining to a popular online review site dedicated to complaints from users. We find that the employed LDA efficiently detects customer complaints, and that a further inspection with a visualization technique is effective for categorizing the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given limited resources. The findings provide managerial insights into how analytics on social media can help maintain and improve reputation management. Our interdisciplinary approach also highlights several insights gained by applying machine learning techniques in the marketing research domain. On a broader technical note, this paper illustrates in detail how to implement LDA in R from beginning (data collection in R) to end (LDA analysis in R), since the procedure is still largely undocumented. In this regard, it will help lower the barrier for interdisciplinary researchers to conduct related research.
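To make the topic-model step concrete, the following is a minimal collapsed Gibbs sampler for LDA written in Python. It is a sketch under stated assumptions and not the authors' R workflow: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy complaint texts are all hypothetical.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns (vocab, topic-word probabilities)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic for this token.
                probs = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=probs)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed topic-word distributions estimated from the final counts.
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return vocab, phi

# Hypothetical complaint snippets, already tokenised.
docs = [
    "late delivery package lost delivery".split(),
    "package late courier delivery".split(),
    "refund billing charge card refund".split(),
    "billing charge card overcharged".split(),
]
vocab, phi = lda_gibbs(docs, num_topics=2)
for t, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda w: row[w], reverse=True)[:3]
    print(t, [vocab[w] for w in top])
```

Inspecting the highest-probability words of each topic, as the final loop does, is the kind of manual categorisation step the abstract describes; in practice an established library would be preferable to a hand-written sampler.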

Keywords: Latent Dirichlet allocation, R program, text mining, topic model, user generated contents, visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1173
942 A Proposal for a Secure and Interoperable Data Framework for Energy Digitalization

Authors: Hebberly Ahatlan

Abstract:

The process of digitizing energy systems involves transforming traditional energy infrastructure into interconnected, data-driven systems that enhance efficiency, sustainability, and responsiveness. As smart grids become increasingly integral to the efficient distribution and management of electricity from both fossil and renewable energy sources, the energy industry faces strategic challenges associated with digitalization and interoperability — particularly in the context of modern energy business models, such as virtual power plants (VPPs). The critical challenge in modern smart grids is to seamlessly integrate diverse technologies and systems, including virtualization, grid computing and service-oriented architecture (SOA), across the entire energy ecosystem. Achieving this requires addressing issues like semantic interoperability, Information Technology (IT) and Operational Technology (OT) convergence, and digital asset scalability, all while ensuring security and risk management. This paper proposes a four-layer digitalization framework to tackle these challenges, encompassing persistent data protection, trusted key management, secure messaging, and authentication of IoT resources. Data assets generated through this framework enable AI systems to derive insights for improving smart grid operations, security, and revenue generation. Furthermore, this paper also proposes a Trusted Energy Interoperability Alliance as a universal guiding standard in the development of this digitalization framework to support more dynamic and interoperable energy markets.

Keywords: Digitalization, IT/OT convergence, semantic interoperability, TEIA alliance, VPP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20
941 Customers’ Priority to Implement SSTs Using AHP Analysis

Authors: Mohammad Jafariahangari, Marjan Habibi, Miresmaeil Mirnabibaboli, Mirza Hassan Hosseini

Abstract:

Self-service technologies (SSTs) make an important contribution to people's daily lives nowadays. However, the introduction of an SST does not necessarily lead to its usage. This paper therefore attempts to discover the SST most preferred from the customers’ point of view. To fulfill this aim, the Analytical Hierarchy Process (AHP) was applied, based on Saaty’s questionnaire, which was administered to customers of e-banking services located in Golestan province, northern Iran. This study used qualitative factors associated with consumers’ intention to use SSTs to rank three SSTs: ATM, mobile banking and internet banking. The results showed that mobile banking receives the highest weight from the consumers’ point of view. This research can be useful both for managers and service providers and for customers who intend to use e-banking.
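The AHP ranking step can be sketched as follows: a reciprocal pairwise comparison matrix on Saaty's 1 to 9 scale is turned into priority weights, and a consistency ratio checks whether the judgements are coherent. The judgement values below are hypothetical and are not the study's data, and the row geometric mean is used here as a common approximation to the principal eigenvector.

```python
import math

# Saaty's random consistency index (RI) for matrix sizes 1 to 5.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}

def ahp_weights(matrix):
    """Approximate the AHP priority vector by normalised row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(matrix, weights):
    """Return Saaty's consistency ratio CR = CI / RI for the given weights."""
    n = len(matrix)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    ri = RANDOM_INDEX[n]
    return ci / ri if ri else 0.0

# Hypothetical judgements: entry [i][j] says how strongly alternative i
# is preferred over alternative j; entry [j][i] is its reciprocal.
alternatives = ["ATM", "mobile banking", "internet banking"]
judgements = [
    [1.0, 1 / 3, 1 / 2],
    [3.0, 1.0, 2.0],
    [2.0, 1 / 2, 1.0],
]

weights = ahp_weights(judgements)
cr = consistency_ratio(judgements, weights)
ranking = sorted(zip(alternatives, weights), key=lambda p: p[1], reverse=True)
for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
print(f"consistency ratio: {cr:.4f}")
```

A consistency ratio below 0.1 is the usual threshold for accepting a set of judgements; above it, the respondent's comparisons are typically revisited.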

Keywords: Analytical Hierarchy Process, Decision-making, Ebanking, Iran, Self-service technologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2168