Search results for: distributed data stream mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26721

Search results for: distributed data stream mining

25731 Biofilm Text Classifiers Developed Using Natural Language Processing and Unsupervised Learning Approach

Authors: Kanika Gupta, Ashok Kumar

Abstract:

Biofilms are dense, highly hydrated cell clusters that are irreversibly attached to a substratum, to an interface or to each other, and are embedded in a self-produced gelatinous matrix composed of extracellular polymeric substances. Research in biofilm field has become very significant, as biofilm has shown high mechanical resilience and resistance to antibiotic treatment and constituted as a significant problem in both healthcare and other industry related to microorganisms. The massive information both stated and hidden in the biofilm literature are growing exponentially therefore it is not possible for researchers and practitioners to automatically extract and relate information from different written resources. So, the current work proposes and discusses the use of text mining techniques for the extraction of information from biofilm literature corpora containing 34306 documents. It is very difficult and expensive to obtain annotated material for biomedical literature as the literature is unstructured i.e. free-text. Therefore, we considered unsupervised approach, where no annotated training is necessary and using this approach we developed a system that will classify the text on the basis of growth and development, drug effects, radiation effects, classification and physiology of biofilms. For this, a two-step structure was used where the first step is to extract keywords from the biofilm literature using a metathesaurus and standard natural language processing tools like Rapid Miner_v5.3 and the second step is to discover relations between the genes extracted from the whole set of biofilm literature using pubmed.mineR_v1.0.11. We used unsupervised approach, which is the machine learning task of inferring a function to describe hidden structure from 'unlabeled' data, in the above-extracted datasets to develop classifiers using WinPython-64 bit_v3.5.4.0Qt5 and R studio_v0.99.467 packages which will automatically classify the text by using the mentioned sets. The developed classifiers were tested on a large data set of biofilm literature which showed that the unsupervised approach proposed is promising as well as suited for a semi-automatic labeling of the extracted relations. The entire information was stored in the relational database which was hosted locally on the server. The generated biofilm vocabulary and genes relations will be significant for researchers dealing with biofilm research, making their search easy and efficient as the keywords and genes could be directly mapped with the documents used for database development.

Keywords: biofilms literature, classifiers development, text mining, unsupervised learning approach, unstructured data, relational database

Procedia PDF Downloads 160
25730 A Study of the Performance Parameter for Recommendation Algorithm Evaluation

Authors: C. Rana, S. K. Jain

Abstract:

The enormous amount of Web data has challenged its usage in efficient manner in the past few years. As such, a range of techniques are applied to tackle this problem; prominent among them is personalization and recommender system. In fact, these are the tools that assist user in finding relevant information of web. Most of the e-commerce websites are applying such tools in one way or the other. In the past decade, a large number of recommendation algorithms have been proposed to tackle such problems. However, there have not been much research in the evaluation criteria for these algorithms. As such, the traditional accuracy and classification metrics are still used for the evaluation purpose that provides a static view. This paper studies how the evolution of user preference over a period of time can be mapped in a recommender system using a new evaluation methodology that explicitly using time dimension. We have also presented different types of experimental set up that are generally used for recommender system evaluation. Furthermore, an overview of major accuracy metrics and metrics that go beyond the scope of accuracy as researched in the past few years is also discussed in detail.

Keywords: collaborative filtering, data mining, evolutionary, clustering, algorithm, recommender systems

Procedia PDF Downloads 406
25729 The Problem of the Use of Learning Analytics in Distance Higher Education: An Analytical Study of the Open and Distance University System in Mexico

Authors: Ismene Ithai Bras-Ruiz

Abstract:

Learning Analytics (LA) is employed by universities not only as a tool but as a specialized ground to enhance students and professors. However, not all the academic programs apply LA with the same goal and use the same tools. In fact, LA is formed by five main fields of study (academic analytics, action research, educational data mining, recommender systems, and personalized systems). These fields can help not just to inform academic authorities about the situation of the program, but also can detect risk students, professors with needs, or general problems. The highest level applies Artificial Intelligence techniques to support learning practices. LA has adopted different techniques: statistics, ethnography, data visualization, machine learning, natural language process, and data mining. Is expected that any academic program decided what field wants to utilize on the basis of his academic interest but also his capacities related to professors, administrators, systems, logistics, data analyst, and the academic goals. The Open and Distance University System (SUAYED in Spanish) of the University National Autonomous of Mexico (UNAM), has been working for forty years as an alternative to traditional programs; one of their main supports has been the employ of new information and communications technologies (ICT). Today, UNAM has one of the largest network higher education programs, twenty-six academic programs in different faculties. This situation means that every faculty works with heterogeneous populations and academic problems. In this sense, every program has developed its own Learning Analytic techniques to improve academic issues. In this context, an investigation was carried out to know the situation of the application of LA in all the academic programs in the different faculties. The premise of the study it was that not all the faculties have utilized advanced LA techniques and it is probable that they do not know what field of study is closer to their program goals. In consequence, not all the programs know about LA but, this does not mean they do not work with LA in a veiled or, less clear sense. It is very important to know the grade of knowledge about LA for two reasons: 1) This allows to appreciate the work of the administration to improve the quality of the teaching and, 2) if it is possible to improve others LA techniques. For this purpose, it was designed three instruments to determinate the experience and knowledge in LA. These were applied to ten faculty coordinators and his personnel; thirty members were consulted (academic secretary, systems manager, or data analyst, and coordinator of the program). The final report allowed to understand that almost all the programs work with basic statistics tools and techniques, this helps the administration only to know what is happening inside de academic program, but they are not ready to move up to the next level, this means applying Artificial Intelligence or Recommender Systems to reach a personalized learning system. This situation is not related to the knowledge of LA, but the clarity of the long-term goals.

Keywords: academic improvements, analytical techniques, learning analytics, personnel expertise

Procedia PDF Downloads 124
25728 A Large Ion Collider Experiment (ALICE) Diffractive Detector Control System for RUN-II at the Large Hadron Collider

Authors: J. C. Cabanillas-Noris, M. I. Martínez-Hernández, I. León-Monzón

Abstract:

The selection of diffractive events in the ALICE experiment during the first data taking period (RUN-I) of the Large Hadron Collider (LHC) was limited by the range over which rapidity gaps occur. It would be possible to achieve better measurements by expanding the range in which the production of particles can be detected. For this purpose, the ALICE Diffractive (AD0) detector has been installed and commissioned for the second phase (RUN-II). Any new detector should be able to take the data synchronously with all other detectors and be operated through the ALICE central systems. One of the key elements that must be developed for the AD0 detector is the Detector Control System (DCS). The DCS must be designed to operate safely and correctly this detector. Furthermore, the DCS must also provide optimum operating conditions for the acquisition and storage of physics data and ensure these are of the highest quality. The operation of AD0 implies the configuration of about 200 parameters, from electronics settings and power supply levels to the archiving of operating conditions data and the generation of safety alerts. It also includes the automation of procedures to get the AD0 detector ready for taking data in the appropriate conditions for the different run types in ALICE. The performance of AD0 detector depends on a certain number of parameters such as the nominal voltages for each photomultiplier tube (PMT), their threshold levels to accept or reject the incoming pulses, the definition of triggers, etc. All these parameters define the efficiency of AD0 and they have to be monitored and controlled through AD0 DCS. Finally, AD0 DCS provides the operator with multiple interfaces to execute these tasks. They are realized as operating panels and scripts running in the background. These features are implemented on a SCADA software platform as a distributed control system which integrates to the global control system of the ALICE experiment.

Keywords: AD0, ALICE, DCS, LHC

Procedia PDF Downloads 300
25727 Charting Sentiments with Naive Bayes and Logistic Regression

Authors: Jummalla Aashrith, N. L. Shiva Sai, K. Bhavya Sri

Abstract:

The swift progress of web technology has not only amassed a vast reservoir of internet data but also triggered a substantial surge in data generation. The internet has metamorphosed into one of the dynamic hubs for online education, idea dissemination, as well as opinion-sharing. Notably, the widely utilized social networking platform Twitter is experiencing considerable expansion, providing users with the ability to share viewpoints, participate in discussions spanning diverse communities, and broadcast messages on a global scale. The upswing in online engagement has sparked a significant curiosity in subjective analysis, particularly when it comes to Twitter data. This research is committed to delving into sentiment analysis, focusing specifically on the realm of Twitter. It aims to offer valuable insights into deciphering information within tweets, where opinions manifest in a highly unstructured and diverse manner, spanning a spectrum from positivity to negativity, occasionally punctuated by neutrality expressions. Within this document, we offer a comprehensive exploration and comparative assessment of modern approaches to opinion mining. Employing a range of machine learning algorithms such as Naive Bayes and Logistic Regression, our investigation plunges into the domain of Twitter data streams. We delve into overarching challenges and applications inherent in the realm of subjectivity analysis over Twitter.

Keywords: machine learning, sentiment analysis, visualisation, python

Procedia PDF Downloads 46
25726 Educase–Intelligent System for Pedagogical Advising Using Case-Based Reasoning

Authors: Elionai Moura, José A. Cunha, César Analide

Abstract:

This work introduces a proposal scheme for an Intelligent System applied to Pedagogical Advising using Case-Based Reasoning, to find consolidated solutions before used for the new problems, making easier the task of advising students to the pedagogical staff. We do intend, through this work, introduce the motivation behind the choices for this system structure, justifying the development of an incremental and smart web system who learns bests solutions for new cases when it’s used, showing technics and technology.

Keywords: case-based reasoning, pedagogical advising, educational data-mining (EDM), machine learning

Procedia PDF Downloads 411
25725 Developing Digital Competencies in Aboriginal Students through University-College Partnerships

Authors: W. S. Barber, S. L. King

Abstract:

This paper reports on a pilot project to develop a collaborative partnership between a community college in rural northern Ontario, Canada, and an urban university in the greater Toronto area in Oshawa, Canada. Partner institutions will collaborate to address learning needs of university applicants whose goals are to attain an undergraduate university BA in Educational Studies and Digital Technology degree, but who may not live in a geographical location that would facilitate this pathways process. The UOIT BA degree is attained through a 2+2 program, where students with a 2 year college diploma or equivalent can attain a four year undergraduate degree. The goals reported on the project are as: 1. Our aim is to expand the BA program to include an additional stream which includes serious educational games, simulations and virtual environments, 2. Develop fully (using both synchronous and asynchronous technologies) online learning modules for use by university applicants who otherwise are not geographically located close to a physical university site, 3. Assess the digital competencies of all students, including members of local, distance and Indigenous communities using a validated tool developed and tested by UOIT across numerous populations. This tool, the General Technical Competency Use and Scale (GTCU) will provide the collaborating institutions with data that will allow for analyzing how well students are prepared to succeed in fully online learning communities. Philosophically, the UOIT BA program is based on a fully online learning communities model (FOLC) that can be accessed from anywhere in the world through digital learning environments via audio video conferencing tools such as Adobe Connect. It also follows models of adult learning and mobile learning, and makes a university degree accessible to the increasing demographic of adult learners who may use mobile devices to learn anywhere anytime. The program is based on key principles of Problem Based Learning, allowing students to build their own understandings through the co-design of the learning environment in collaboration with the instructors and their peers. In this way, this degree allows students to personalize and individualize the learning based on their own culture, background and professional/personal experiences. Using modified flipped classroom strategies, students are able to interrogate video modules on their own time in preparation for one hour discussions occurring in video conferencing sessions. As a consequence of the program flexibility, students may continue to work full or part time. All of the partner institutions will co-develop four new modules, administer the GTCU and share data, while creating a new stream of the UOIT BA degree. This will increase accessibility for students to bridge from community colleges to university through a fully digital environment. We aim to work collaboratively with Indigenous elders, community members and distance education instructors to increase opportunities for more students to attain a university education.

Keywords: aboriginal, college, competencies, digital, universities

Procedia PDF Downloads 211
25724 Optimization of Waste Plastic to Fuel Oil Plants' Deployment Using Mixed Integer Programming

Authors: David Muyise

Abstract:

Mixed Integer Programming (MIP) is an approach that involves the optimization of a range of decision variables in order to minimize or maximize a particular objective function. The main objective of this study was to apply the MIP approach to optimize the deployment of waste plastic to fuel oil processing plants in Uganda. The processing plants are meant to reduce plastic pollution by pyrolyzing the waste plastic into a cleaner fuel that can be used to power diesel/paraffin engines, so as (1) to reduce the negative environmental impacts associated with plastic pollution and also (2) to curb down the energy gap by utilizing the fuel oil. A programming model was established and tested in two case study applications that are, small-scale applications in rural towns and large-scale deployment across major cities in the country. In order to design the supply chain, optimal decisions on the types of waste plastic to be processed, size, location and number of plants, and downstream fuel applications were concurrently made based on the payback period, investor requirements for capital cost and production cost of fuel and electricity. The model comprises qualitative data gathered from waste plastic pickers at landfills and potential investors, and quantitative data obtained from primary research. It was found out from the study that a distributed system is suitable for small rural towns, whereas a decentralized system is only suitable for big cities. Small towns of Kalagi, Mukono, Ishaka, and Jinja were found to be the ideal locations for the deployment of distributed processing systems, whereas Kampala, Mbarara, and Gulu cities were found to be the ideal locations initially utilize the decentralized pyrolysis technology system. We conclude that the model findings will be most important to investors, engineers, plant developers, and municipalities interested in waste plastic to fuel processing in Uganda and elsewhere in developing economy.

Keywords: mixed integer programming, fuel oil plants, optimisation of waste plastics, plastic pollution, pyrolyzing

Procedia PDF Downloads 123
25723 Studying the Effect of Froude Number and Densimetric Froude Number on Local Scours around Circular Bridge Piers

Authors: Md Abdullah Al Faruque

Abstract:

A very large percentage of bridge failures are attributed to scouring around bridge piers and this directly influences public safety. Experiments are carried out in a 12-m long rectangular open channel flume made of transparent tempered glass. A 300 mm thick bed made up of sand particles is leveled horizontally to create the test bed and a 50 mm hollow plastic cylinder is used as a model bridge pier. Tests are carried out with varying flow depths and velocities. Data points of various scour parameters such as scour depth, width, and length are collected based on different flow conditions and visual observations of changes in the stream bed downstream the bridge pier are also made as the scour progresses. Result shows that all three major flow characteristics (flow depth, Froude number and densimetric Froude number) have one way or other affect the scour profile.

Keywords: bridge pier scour, densimetric Froude number, flow depth, Froude number, sand

Procedia PDF Downloads 163
25722 Evidence Theory Enabled Quickest Change Detection Using Big Time-Series Data from Internet of Things

Authors: Hossein Jafari, Xiangfang Li, Lijun Qian, Alexander Aved, Timothy Kroecker

Abstract:

Traditionally in sensor networks and recently in the Internet of Things, numerous heterogeneous sensors are deployed in distributed manner to monitor a phenomenon that often can be model by an underlying stochastic process. The big time-series data collected by the sensors must be analyzed to detect change in the stochastic process as quickly as possible with tolerable false alarm rate. However, sensors may have different accuracy and sensitivity range, and they decay along time. As a result, the big time-series data collected by the sensors will contain uncertainties and sometimes they are conflicting. In this study, we present a framework to take advantage of Evidence Theory (a.k.a. Dempster-Shafer and Dezert-Smarandache Theories) capabilities of representing and managing uncertainty and conflict to fast change detection and effectively deal with complementary hypotheses. Specifically, Kullback-Leibler divergence is used as the similarity metric to calculate the distances between the estimated current distribution with the pre- and post-change distributions. Then mass functions are calculated and related combination rules are applied to combine the mass values among all sensors. Furthermore, we applied the method to estimate the minimum number of sensors needed to combine, so computational efficiency could be improved. Cumulative sum test is then applied on the ratio of pignistic probability to detect and declare the change for decision making purpose. Simulation results using both synthetic data and real data from experimental setup demonstrate the effectiveness of the presented schemes.

Keywords: CUSUM, evidence theory, kl divergence, quickest change detection, time series data

Procedia PDF Downloads 326
25721 Analysis of the Fair Distribution of Urban Facilities in Kabul City by Population Modeling

Authors: Ansari Mohammad Reza, Hiroko Ono

Abstract:

In this study, we investigated how much of the urban facilities are fairly distributing in the city of Kabul based on the factor of population. To find the answer to this question we simulated a fair model for the distribution of investigated facilities in the city which is proposed based on the consideration of two factors; the number of users for each facility and the average distance of reach of each facility. Then the model was evaluated to make sure about its efficiency. And finally, the two—the existing pattern and the simulation model—were compared to find the degree of bias in the existing pattern of distribution of facilities in the city. The result of the study clearly clarified that the facilities are not fairly distributed in Kabul city based on the factor of population. Our analysis also revealed that the education services and the parks are the most and the worst fair distributed facilities in this regard.

Keywords: Afghanistan, ArcGIS Software, Kabul City, fair distribution, urban facilities

Procedia PDF Downloads 171
25720 Fashion Consumption for Fashion Innovators: A Study of Fashion Consumption Behavior of Innovators and Non-Innovators

Authors: Vaishali P. Joshi, Pallav Joshi

Abstract:

The objective of this study is to examine the differences fashion innovators and non-fashion innovators in their fashion consumption behavior in terms of their pre-purchase behavior, purchase behavior and post purchase behavior. The questionnaire was distributed to a female college student for data collection for achieving the objective of the first part of the study. Question-related to fashion innovativeness and fashion consumption behavior was asked. The sample was comprised of 81 college females ages 18 through 30 who were attending Business Management degree. A series of attitude questions was used to categorize respondents on the Innovativeness Scale. 32 respondents with a score of 21 and above were designated as Fashion innovators and the remainder (49) as Non-fashion innovators. Findings showed that there exist significant differences between innovators and non-innovators in their fashion consumption behavior. Data was analyzed through frequency distribution table. Many differences were found in the behavior of innovators and non-innovators in terms of their pre-purchase, actual purchase, and post-purchase behavior.

Keywords: fashion, innovativeness, consumption behavior, purchase

Procedia PDF Downloads 554
25719 Secure Intelligent Information Management by Using a Framework of Virtual Phones-On Cloud Computation

Authors: Mohammad Hadi Khorashadi Zadeh

Abstract:

Many new applications and internet services have been emerged since the innovation of mobile networks and devices. However, these applications have problems of security, management, and performance in business environments. Cloud systems provide information transfer, management facilities, and security for virtual environments. Therefore, an innovative internet service and a business model are proposed in the present study for creating a secure and consolidated environment for managing the mobile information of organizations based on cloud virtual phones (CVP) infrastructures. Using this method, users can run Android and web applications in the cloud which enhance performance by connecting to other CVP users and increases privacy. It is possible to combine the CVP with distributed protocols and central control which mimics the behavior of human societies. This mix helps in dealing with sensitive data in mobile devices and facilitates data management with less application overhead.

Keywords: BYOD, mobile cloud computing, mobile security, information management

Procedia PDF Downloads 310
25718 Improving Fault Tolerance and Load Balancing in Heterogeneous Grid Computing Using Fractal Transform

Authors: Saad M. Darwish, Adel A. El-Zoghabi, Moustafa F. Ashry

Abstract:

The popularity of the Internet and the availability of powerful computers and high-speed networks as low-cost commodity components are changing the way we use computers today. These technical opportunities have led to the possibility of using geographically distributed and multi-owner resources to solve large-scale problems in science, engineering, and commerce. Recent research on these topics has led to the emergence of a new paradigm known as Grid computing. To achieve the promising potentials of tremendous distributed resources, effective and efficient load balancing algorithms are fundamentally important. Unfortunately, load balancing algorithms in traditional parallel and distributed systems, which usually run on homogeneous and dedicated resources, cannot work well in the new circumstances. In this paper, the concept of a fast fractal transform in heterogeneous grid computing based on R-tree and the domain-range entropy is proposed to improve fault tolerance and load balancing algorithm by improve connectivity, communication delay, network bandwidth, resource availability, and resource unpredictability. A novel two-dimension figure of merit is suggested to describe the network effects on load balance and fault tolerance estimation. Fault tolerance is enhanced by adaptively decrease replication time and message cost while load balance is enhanced by adaptively decrease mean job response time. Experimental results show that the proposed method yields superior performance over other methods.

Keywords: Grid computing, load balancing, fault tolerance, R-tree, heterogeneous systems

Procedia PDF Downloads 482
25717 Distributed Generation Connection to the Network: Obtaining Stability Using Transient Behavior

Authors: A. Hadadi, M. Abdollahi, A. Dustmohammadi

Abstract:

The growing use of DGs in distribution networks provide many advantages and also cause new problems which should be anticipated and be solved with appropriate solutions. One of the problems is transient voltage drop and short circuit in the electrical network, in the presence of distributed generation - which can lead to instability. The appearance of the short circuit will cause loss of generator synchronism, even though if it would be able to recover synchronizing mode after removing faulty generator, it will be stable. In order to increase system reliability and generator lifetime, some strategies should be planned to apply even in some situations which a fault prevent generators from separation. In this paper, one fault current limiter is installed due to prevent DGs separation from the grid when fault occurs. Furthermore, an innovative objective function is applied to determine the impedance optimal amount of fault current limiter in order to improve transient stability of distributed generation. Fault current limiter can prevent generator rotor's sudden acceleration after fault occurrence and thereby improve the network transient stability by reducing the current flow in a fast and effective manner. In fact, by applying created impedance by fault current limiter when a short circuit happens on the path of current injection DG to the fault location, the critical fault clearing time improve remarkably. Therefore, protective relay has more time to clear fault and isolate the fault zone without any instability. Finally, different transient scenarios of connection plan sustainability of small scale synchronous generators to the distribution network are presented.

Keywords: critical clearing time, fault current limiter, synchronous generator, transient stability, transient states

Procedia PDF Downloads 191
25716 Cloud-Based Dynamic Routing with Feedback in Formal Methods

Authors: Jawid Ahmad Baktash, Mursal Dawodi, Tomokazu Nagata

Abstract:

With the rapid growth of Cloud Computing, Formal Methods became a good choice for the refinement of message specification and verification for Dynamic Routing in Cloud Computing. Cloud-based Dynamic Routing is becoming increasingly popular. We propose feedback in Formal Methods for Dynamic Routing and Cloud Computing; the model and topologies show how to send messages from index zero to all others formally. The responsibility of proper verification becomes crucial with Dynamic Routing in the cloud. Formal Methods can play an essential role in the routing and development of Networks, and the testing of distributed systems. Event-B is a formal technique that consists of describing the problem rigorously and introduces solutions or details in the refinement steps. Event-B is a variant of B, designed for developing distributed systems and message passing of the dynamic routing. In Event-B and formal methods, the events consist of guarded actions occurring spontaneously rather than being invoked.

Keywords: cloud, dynamic routing, formal method, Pro-B, event-B

Procedia PDF Downloads 412
25715 Sustainable Water Supply: Rainwater Harvesting as Flood Reduction Measures in Ibadan, Nigeria

Authors: Omolara Lade, David Oloke

Abstract:

Ibadan City suffers serious water supply problems; cases of dry taps are common in virtually every part of the City. The scarcity of piped water has made communities find alternative water sources; groundwater sources being a ready source. These wells are prone to pollution due to the close proximity of septic tanks to wells, disposal of solid or liquid wastes in pits, abandoned boreholes or even stream channels and landfills. Storms and floods in Ibadan have increased with consequent devastating effects claiming over 120 lives and displacing 600 people on August 2011 alone. In this study, an analysis of the water demand and sources of supply for the city was carried out through questionnaire survey and collection of data from City’s main water supply - Water Corporation of Oyo State (WCOS), groundwater sources were explored and 30 years rainfall data were collected from Meteorological station in Ibadan. 1067 questionnaire were administered at household level with a response rate of 86.7 %. A descriptive analysis of the survey revealed that 77.1 % of the respondents did not receive water at all from WCOS while 83.8 % depend on groundwater sources. Analysis of data from WCOS revealed that main water supply is inadequate as < 10 % of the population water demand was met. Rainfall intensity is highest in June with a mean value of 188 mm, which can be harvested at community—based level and used to complement the population water demand. Rainwater harvesting if planned, and managed properly will become a valuable alternative source of managing urban flood and alleviating water scarcity in the city.

Keywords: Ibadan, rainwater harvesting, sustainable water, urban flooding

Procedia PDF Downloads 173
25714 Cleaning of Scientific References in Large Patent Databases Using Rule-Based Scoring and Clustering

Authors: Emiel Caron

Abstract:

Patent databases contain patent related data, organized in a relational data model, and are used to produce various patent statistics. These databases store raw data about scientific references cited by patents. For example, Patstat holds references to tens of millions of scientific journal publications and conference proceedings. These references might be used to connect patent databases with bibliographic databases, e.g. to study to the relation between science, technology, and innovation in various domains. Problematic in such studies is the low data quality of the references, i.e. they are often ambiguous, unstructured, and incomplete. Moreover, a complete bibliographic reference is stored in only one attribute. Therefore, a computerized cleaning and disambiguation method for large patent databases is developed in this work. The method uses rule-based scoring and clustering. The rules are based on bibliographic metadata, retrieved from the raw data by regular expressions, and are transparent and adaptable. The rules in combination with string similarity measures are used to detect pairs of records that are potential duplicates. Due to the scoring, different rules can be combined, to join scientific references, i.e. the rules reinforce each other. The scores are based on expert knowledge and initial method evaluation. After the scoring, pairs of scientific references that are above a certain threshold, are clustered by means of single-linkage clustering algorithm to form connected components. The method is designed to disambiguate all the scientific references in the Patstat database. The performance evaluation of the clustering method, on a large golden set with highly cited papers, shows on average a 99% precision and a 95% recall. The method is therefore accurate but careful, i.e. it weighs precision over recall. Consequently, separate clusters of high precision are sometimes formed, when there is not enough evidence for connecting scientific references, e.g. in the case of missing year and journal information for a reference. The clusters produced by the method can be used to directly link the Patstat database with bibliographic databases as the Web of Science or Scopus.

Keywords: clustering, data cleaning, data disambiguation, data mining, patent analysis, scientometrics

Procedia PDF Downloads 187
25713 Cotton Crops Vegetative Indices Based Assessment Using Multispectral Images

Authors: Muhammad Shahzad Shifa, Amna Shifa, Muhammad Omar, Aamir Shahzad, Rahmat Ali Khan

Abstract:

Many applications of remote sensing to vegetation and crop response depend on spectral properties of individual leaves and plants. Vegetation indices are usually determined to estimate crop biophysical parameters like crop canopies and crop leaf area indices with the help of remote sensing. Cotton crops assessment is performed with the help of vegetative indices. Remotely sensed images from an optical multispectral radiometer MSR5 are used in this study. The interpretation is based on the fact that different materials reflect and absorb light differently at different wavelengths. Non-normalized and normalized forms of these datasets are analyzed using two complementary data mining algorithms; K-means and K-nearest neighbor (KNN). Our analysis shows that the use of normalized reflectance data and vegetative indices are suitable for an automated assessment and decision making.

Keywords: cotton, condition assessment, KNN algorithm, clustering, MSR5, vegetation indices

Procedia PDF Downloads 325
25712 Analysis of Solid Waste Management Practices and the Implications for Human Health and the Environment: A Case Study of Kayamandi Informal Settlement

Authors: Peter Iyobosa Asemota

Abstract:

This study on solid waste management practices addressed aspects of environmental and health impacts resulting from poor management of solid waste. The study was occasioned by the observed rate and volume of illegal and indiscriminate dumping of solid waste materials especially in informal settlements. The main focus of this study was to establish the impact of waste management practices on human health and the environment. The study, therefore, presents a critical analysis of the state of solid waste management in the study area and the implications for human health and the environment. The study was carried out in Kayamandi informal settlement within Stellenbosch municipality. The sustainable management of solid waste is very important in order to minimize the environmental and public health risks associated with improper solid waste management. There is no denying the fact that the problems of waste management will become critical as time goes on because of improper and inefficient waste management practices. Towns and cities exhibit the burdens of waste management which is a characteristics feature of most African cities. The study critically assess the implementation of waste management practices by the residents of the informal settlement; identify the factors affecting management issues in the operation of solid waste management system by the municipality; identify factors militating against the implementation of waste management policies and legislation. Furthermore, a waste assessment study was carried out to assess the generation; composition of the waste stream and also determine the attitudes and behavior of the residents with regard to waste management practices. Findings from the study revealed that Kayamandi is not different from other informal settlements with regards to waste management. People are of the opinion that solid waste management is the sole responsibility of municipal authorities and as such, the government should be responsible for bearing the cost of solid waste management.

Keywords: environment, waste, waste composition, waste stream, policy, waste categories, sanitary landfill, waste collection, integrated solid waste management

Procedia PDF Downloads 687
25711 Multi-Criteria Inventory Classification Process Based on Logical Analysis of Data

Authors: Diana López-Soto, Soumaya Yacout, Francisco Ángel-Bello

Abstract:

Although inventories are considered as stocks of money sitting on shelve, they are needed in order to secure a constant and continuous production. Therefore, companies need to have control over the amount of inventory in order to find the balance between excessive and shortage of inventory. The classification of items according to certain criteria such as the price, the usage rate and the lead time before arrival allows any company to concentrate its investment in inventory according to certain ranking or priority of items. This makes the decision making process for inventory management easier and more justifiable. The purpose of this paper is to present a new approach for the classification of new items based on the already existing criteria. This approach is called the Logical Analysis of Data (LAD). It is used in this paper to assist the process of ABC items classification based on multiple criteria. LAD is a data mining technique based on Boolean theory that is used for pattern recognition. This technique has been tested in medicine, industry, credit risk analysis, and engineering with remarkable results. An application on ABC inventory classification is presented for the first time, and the results are compared with those obtained when using the well-known AHP technique and the ANN technique. The results show that LAD presented very good classification accuracy.

Keywords: ABC multi-criteria inventory classification, inventory management, multi-class LAD model, multi-criteria classification

Procedia PDF Downloads 872
25710 Reuse of Huge Industrial Areas

Authors: Martina Perinkova, Lenka Kolarcikova, Marketa Twrda

Abstract:

Brownfields are one of the most important problems that must be solved by today's cities. The topic of this article is description of developing a comprehensive transformation of post-industrial area of the former iron factory national cultural heritage Lower Vítkovice. City of Ostrava used to be industrial superpower of the Czechoslovak Republic, especially in the area of coal mining and iron production, after declining industrial production and mining in the 80s left many unused areas of former factories generally brownfields and backfields. Since the late 90s we are observing how the city officials or private entities seeking to remedy this situation. Regeneration of brownfields is a very expensive and long-term process. The area is now rebuilt for tourists and residents of the city in the entertainment, cultural, and social center. It was necessary do the reconstruction of the industrial monuments. Equally important was the construction of new buildings, which helped reusing of the entire complex. This is a unique example of transformation of technical monuments and completion of necessary new objects, so that the area could start working again and reintegrate back into the urban system.

Keywords: brown fields, conversion, historical and industrial buildings, reconstruction

Procedia PDF Downloads 326
25709 Process Integration: Mathematical Model for Contaminant Removal in Refinery Process Stream

Authors: Wasif Mughees, Malik Al-Ahmad

Abstract:

This research presents the graphical design analysis and mathematical programming technique to dig out the possible water allocation distribution to minimize water usage in process units. The study involves the mass and property integration in its core methodology. Tehran Oil Refinery is studied to implement the focused water pinch technology for regeneration, reuse and recycling of water streams. Process data is manipulated in terms of sources and sinks, which are given in terms of properties. Sources are the streams to be allocated. Sinks are the units which can accept the sources. Suspended Solids (SS) is taken as a single contaminant. The model minimizes the mount of freshwater from 340 to 275m3/h (19.1%). Redesigning and allocation of water streams was built. The graphical technique and mathematical programming shows the consistency of results which confirms mass transfer dependency of water streams.

Keywords: minimization, water pinch, process integration, pollution prevention

Procedia PDF Downloads 314
25708 Educational Leadership and Artificial Intelligence

Authors: Sultan Ghaleb Aldaihani

Abstract:

- The environment in which educational leadership takes place is becoming increasingly complex due to factors like globalization and rapid technological change. - This is creating a "leadership gap" where the complexity of the environment outpaces the ability of leaders to effectively respond. - Educational leadership involves guiding teachers and the broader school system towards improved student learning and achievement. 2. Implications of Artificial Intelligence (AI) in Educational Leadership: - AI has great potential to enhance education, such as through intelligent tutoring systems and automating routine tasks to free up teachers. - AI can also have significant implications for educational leadership by providing better information and data-driven decision-making capabilities. - Computer-adaptive testing can provide detailed, individualized data on student learning that leaders can use for instructional decisions and accountability. 3. Enhancing Decision-Making Processes: - Statistical models and data mining techniques can help identify at-risk students earlier, allowing for targeted interventions. - Probability-based models can diagnose students likely to drop out, enabling proactive support. - These data-driven approaches can make resource allocation and decision-making more effective. 4. Improving Efficiency and Productivity: - AI systems can automate tasks and change processes to improve the efficiency of educational leadership and administration. - Integrating AI can free up leaders to focus more on their role's human, interactive elements.

Keywords: Education, Leadership, Technology, Artificial Intelligence

Procedia PDF Downloads 26
25707 Distribution Planning with Renewable Energy Units Based on Improved Honey Bee Mating Optimization

Authors: Noradin Ghadimi, Nima Amjady, Oveis Abedinia, Roza Poursoleiman

Abstract:

This paper proposed an Improved Honey Bee Mating Optimization (IHBMO) for a planning paradigm for network upgrade. The proposed technique is a new meta-heuristic algorithm which inspired by mating of the honey bee. The paradigm is able to select amongst several choices equi-cost one assuring the optimum in terms of voltage profile, considering various scenarios of DG penetration and load demand. The distributed generation (DG) has created a challenge and an opportunity for developing various novel technologies in power generation. DG prepares a multitude of services to utilities and consumers, containing standby generation, peaks chopping sufficiency, base load generation. The proposed algorithm is applied over the 30 lines, 28 buses power system. The achieved results demonstrate the good efficiency of the DG using the proposed technique in different scenarios.

Keywords: distributed generation, IHBMO, renewable energy units, network upgrade

Procedia PDF Downloads 482
25706 FLEX: A Backdoor Detection and Elimination Method in Federated Scenario

Authors: Shuqi Zhang

Abstract:

Federated learning allows users to participate in collaborative model training without sending data to third-party servers, reducing the risk of user data privacy leakage, and is widely used in smart finance and smart healthcare. However, the distributed architecture design of federation learning itself and the existence of secure aggregation protocols make it inherently vulnerable to backdoor attacks. To solve this problem, the federated learning backdoor defense framework FLEX based on group aggregation, cluster analysis, and neuron pruning is proposed, and inter-compatibility with secure aggregation protocols is achieved. The good performance of FLEX is verified by building a horizontal federated learning framework on the CIFAR-10 dataset for experiments, which achieves 98% success rate of backdoor detection and reduces the success rate of backdoor tasks to 0% ~ 10%.

Keywords: federated learning, secure aggregation, backdoor attack, cluster analysis, neuron pruning

Procedia PDF Downloads 88
25705 Entrepreneurial Practice and Corruption in Tourism Sector: A Study of Entrepreneurial Orientation and Organizational Corruption in Nepali Star Hotels

Authors: Prabin Raj Gautam

Abstract:

Entrepreneurship in tourism sectors, particularly hotel entrepreneurship has contributed to Nepalese Gross Domestic Production (GDP). The tourist standard and star hotels in developing countries have not only been generating revenues but also providing international hospitality to the guest in the local areas. For doing so, these hotel enterprises must need to implement different business strategies to enhance and maintain their international business benchmark. The Entrepreneurial Orientation (EO) is core for making business strategies. Meanwhile, the corruption is labeled as negative factor for economic development. This paper presents the relationship between EO of Nepalese star hotels and organizational corruption. The study employed questionnaire survey as data collection tool under the quantitative methodology. Five hypotheses are developed and tested. After gathering the data form 216 questionnaire distributed to CEOs/Managers of the sample hotels, the findings show that out of five dimensions of EO, only autonomy, pro-activeness, and innovativeness are not significant to organizational corruption; however, risk-taking and competitive aggressiveness are found significant contributor. The descriptive statistics and structural equation modeling are employed to describe the data and fit the model.

Keywords: entrepreneurship, entrepreneurial orientation, organizational corruption, dimensions

Procedia PDF Downloads 313
25704 Progress in Accuracy, Reliability and Safety in Firedamp Detection

Authors: José Luis Lorenzo Bayona, Ljiljana Medic-Pejic, Isabel Amez Arenillas, Blanca Castells Somoza

Abstract:

The communication presents the study results carried out by the Official Laboratory J. M. Madariaga (LOM) of the Polytechnic University of Madrid to analyze the reliability of methane detection systems used in underground mining. Poor firedamp control in work can cause from production stoppages to fatal accidents and since there is currently a great variety of equipment with different functional characteristics, a study is needed to indicate which measurement principles have the highest degree of confidence. For the development of the project, a series of fixed, transportable and portable methane detectors with different measurement principles have been selected to subject them to laboratory tests following the methods described in the applicable regulations. The test equipment has been the one usually used in the certification and calibration of these devices, subject to the LOM quality system, and the tests have been carried out on detectors accessible in the market. The conclusions establish the main advantages and disadvantages of the equipment according to the measurement principle used; catalytic combustion, interferometry and infrared absorption.

Keywords: ATEX standards, gas detector, methane meter, mining safety

Procedia PDF Downloads 133
25703 Evaluation of the Urban Regeneration Project: Land Use Transformation and SNS Big Data Analysis

Authors: Ju-Young Kim, Tae-Heon Moon, Jung-Hun Cho

Abstract:

Urban regeneration projects have been actively promoted in Korea. In particular, Jeonju Hanok Village is evaluated as one of representative cases in terms of utilizing local cultural heritage sits in the urban regeneration project. However, recently, there has been a growing concern in this area, due to the ‘gentrification’, caused by the excessive commercialization and surging tourists. This trend was changing land and building use and resulted in the loss of identity of the region. In this regard, this study analyzed the land use transformation between 2010 and 2016 to identify the commercialization trend in Jeonju Hanok Village. In addition, it conducted SNS big data analysis on Jeonju Hanok Village from February 14th, 2016 to March 31st, 2016 to identify visitors’ awareness of the village. The study results demonstrate that rapid commercialization was underway, unlikely the initial intention, so that planners and officials in city government should reconsider the project direction and rebuild deliberate management strategies. This study is meaningful in that it analyzed the land use transformation and SNS big data to identify the current situation in urban regeneration area. Furthermore, it is expected that the study results will contribute to the vitalization of regeneration area.

Keywords: land use, SNS, text mining, urban regeneration

Procedia PDF Downloads 288
25702 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 106