World Academy of Science, Engineering and Technology
[Computer and Information Engineering]
Online ISSN : 1307-6892
1009 Research of Data Cleaning Methods Based on Dependency Rules
Authors: Yang Bao, Shi Wei Deng, WangQun Lin
Abstract:
This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and outlines the key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation-data discovery algorithms expressed as formal formulas; these can find inconsistent data in all target columns that depend on condition attributes, whether the data is structured (SQL) or unstructured (NoSQL). Finally, it gives six data cleaning methods based on these algorithms.
Keywords: data cleaning, dependency rules, violation data discovery, data repair
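As an illustration of what violation discovery under a dependency rule can look like, the sketch below checks a rule of the form "condition attributes determine a target attribute" over a list of records. It is a minimal sketch and not the authors' algorithm: the function name `find_fd_violations`, the record layout, and the sample data are all assumptions made for this example. Records are plain dictionaries, so the same check applies to SQL rows and to NoSQL documents once they are loaded.

```python
from collections import defaultdict

def find_fd_violations(rows, lhs, rhs):
    """Return groups of rows that agree on `lhs` but disagree on `rhs`.

    `lhs` is a list of condition attributes, `rhs` is the target attribute.
    (Hypothetical helper, written only for illustration.)
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key].append(row)
    violations = {}
    for key, members in groups.items():
        # More than one distinct target value means the rule is violated.
        if len({m[rhs] for m in members}) > 1:
            violations[key] = members
    return violations

# Made-up sample data: the rule "zip determines city" is broken for 10001.
records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "94105", "city": "San Francisco"},
]
violations = find_fd_violations(records, lhs=["zip"], rhs="city")
```

A repair step would then decide which value in each violating group to keep, for example the most frequent one; that choice is outside this sketch.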
Procedia PDF Downloads 564
1008 Data Mining Spatial: Unsupervised Classification of Geographic Data
Authors: Chahrazed Zouaoui
Abstract:
In recent years, the volume of geospatial information has been increasing due to the evolution of information and communication technologies. This information is often presented by geographic information systems (GIS) and stored in spatial databases (BDS). Classical data mining has shown weaknesses in extracting knowledge from these enormous amounts of data because of the particularity of spatial entities, which are characterized by their interdependence (the first law of geography). This gave rise to spatial data mining, a process of analyzing geographic data that allows the extraction of knowledge and spatial relationships from geospatial data. Among the methods of this process we distinguish the monothematic and the thematic. Geo-clustering is one of the main tasks of spatial data mining and belongs to the monothematic methods. It groups similar geo-spatial entities into the same class and assigns dissimilar entities to different classes; in other words, it maximizes intra-class similarity and minimizes inter-class similarity, taking into account the particularity of geo-spatial data. Two approaches to geo-clustering exist: dynamic processing of the data, which applies algorithms designed for the direct treatment of spatial data, and an approach based on spatial data pre-processing, which applies classical clustering algorithms to pre-processed data (by integration of spatial relationships).
This approach (based on pre-treatment) is quite complex in different cases, so the search for approximate solutions involves the use of approximation algorithms. Among these, we are interested in dedicated approaches (partitioning and density-based clustering methods) and the bees approach (a biomimetic approach). Our study proposes a design for this problem that uses different algorithms to automatically detect geo-spatial neighborhoods in order to implement geo-clustering by pre-treatment, and applies the bees algorithm to this problem for the first time in the geo-spatial field.
Keywords: mining, GIS, geo-clustering, neighborhood
Procedia PDF Downloads 375
1007 Bibliometric Analysis of the Impact of Funding on Scientific Development of Researchers
Authors: Ashkan Ebadi, Andrea Schiffauerova
Abstract:
Every year, a considerable amount of money is invested in research, mainly in the form of funding allocated to universities and research institutes. To better distribute the available funds and to set the most appropriate R&D investment strategies for the future, evaluating the productivity of funded researchers and the impact of such funding is crucial. In this paper, using data on 15 years of journal publications by researchers funded by NSERC (Natural Sciences and Engineering Research Council of Canada) and by means of bibliometric analysis, the scientific development of the funded researchers and their scientific collaboration patterns are investigated for the period 1996-2010. According to the results, there seems to be a positive relation between the average level of funding and the quantity and quality of scientific output. In addition, whenever the funding allocated to researchers has increased, the number of co-authors per paper has also increased. Hence, an increase in the level of funding may enable researchers to get involved in larger projects and/or scientific teams and to increase their scientific output accordingly.
Keywords: bibliometrics, collaboration, funding, productivity
Procedia PDF Downloads 286
1006 Model-Based Software Regression Test Suite Reduction
Authors: Shiwei Deng, Yang Bao
Abstract:
In this paper, we present a model-based regression test suite reduction approach that uses EFSM model dependence analysis and a probability-driven greedy algorithm to reduce software regression test suites. The approach automatically identifies the difference between the original model and the modified model as a set of elementary model modifications. The EFSM dependence analysis is performed for each elementary modification to reduce the regression test suite, and then the probability-driven greedy algorithm is adopted to select the minimum set of test cases from the reduced regression test suite that covers all interaction patterns. Our initial experience shows that the approach may significantly reduce the size of regression test suites.
Keywords: dependence analysis, EFSM model, greedy algorithm, regression test
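The selection step can be pictured as a covering problem: pick few test cases whose combined interaction patterns cover everything the full suite covers. The sketch below uses the plain greedy set-cover heuristic, which is a simplification and not the probability-driven variant the abstract describes. The function name `greedy_reduce`, the test names, and the pattern labels are hypothetical.

```python
def greedy_reduce(test_coverage):
    """Greedy set cover: repeatedly pick the test that covers the most
    still-uncovered interaction patterns.

    `test_coverage` maps a test name to the set of patterns it exercises.
    The result is small but not guaranteed to be the true minimum.
    """
    required = set().union(*test_coverage.values())
    selected, covered = [], set()
    while covered != required:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(test_coverage),
                   key=lambda t: len(test_coverage[t] - covered))
        selected.append(best)
        covered |= test_coverage[best]
    return selected

# Made-up suite: t3 subsumes t1 and t2, t4 is the only test hitting p4.
suite = {
    "t1": {"p1", "p2"},
    "t2": {"p2", "p3"},
    "t3": {"p1", "p2", "p3"},
    "t4": {"p4"},
}
reduced = greedy_reduce(suite)
```

Here the four-test suite shrinks to two tests while every pattern stays covered.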
Procedia PDF Downloads 427
1005 Telecom Infrastructure Outsourcing: An Innovative Approach
Authors: Irfan Zafar
Abstract:
Over the years, the telecom industry in the country has shown a lot of progress in terms of infrastructure development, coupled with the availability of telecom services. This has, however, led to cut-throat competition among operators and thus to reduced tariffs for customers. Profit margins have shrunk, leading operators to look for other avenues by adopting new models while keeping the quality of service intact. Outsourcing of the network and its resources is one such model, and it has shown promising benefits, including lower costs, less risk, higher levels of customer support and engagement, predictable expenses, access to emerging technologies, a highly skilled workforce, adaptability, and focus on the core business while reducing capital costs. A lot of research has been done on the reasons for outsourcing and its benefits. This study, however, attempts to analyze the effects of outsourcing on an organization's performance (telecommunication sector), considering the variables (1) Cost Reduction, (2) Organizational Performance, (3) Flexibility, (4) Employee Performance, (5) Access to Specialized Skills & Technology and (6) Outsourcing Risks.
Keywords: outsourcing, ICT, telecommunication, IT, networking
Procedia PDF Downloads 398
1004 Digital Preservation: Requirement of 21st Century
Authors: Gaurav Kumar, Shilpa
Abstract:
Digital libraries have been established all over the world to create, maintain and preserve digital materials. This paper focuses on operational digital preservation systems, specifically in educational organizations in India. It considers a broad range of digital objects, including e-journals, technical reports, e-records, project documents, scientific data, etc. The paper describes the main objectives, processes and technological issues involved in the preservation of digital materials. Digital preservation refers to the various methods of keeping digital materials alive for the future. It covers everything from electronic publications on CD-ROM to online databases and collections of experimental data in digital format, and it maintains the ability to display, retrieve and use digital collections in the face of rapidly changing technological and organizational infrastructure. This paper shows the importance and objectives of digital preservation. Preservation requires hardware and software technology able to interpret the digital documents; the paper also discusses various aspects of digital preservation.
Keywords: preservation, digital preservation, digital dark age, conservation, archive, repository, document, information technology, hardware, software, organization, machine readable format
Procedia PDF Downloads 457
1003 Adaptive Routing in NoC-Based Heterogeneous MPSoCs
Authors: M. K. Benhaoua, A. E. H. Benyamina, T. Djeradi, P. Boulet
Abstract:
In this paper, we propose an adaptive routing technique that routes communications so as to optimize overall performance. The technique uses a newly proposed algorithm to route communications between tasks. The proposed routing leads to better optimization of several performance metrics (time and energy consumption). Experimental results show that the proposed routing approach provides significant performance improvements compared with static routing.
Keywords: multi-processor systems-on-chip (mpsocs), network-on-chip (noc), heterogeneous architectures, adaptive routing
Procedia PDF Downloads 375
1002 Comparative Study of Universities’ Web Structure Mining
Authors: Z. Abdullah, A. R. Hamdan
Abstract:
This paper analyzes the ranking of the website of University of Malaysia Terengganu (UMT) in the World Wide Web. Only a little research has been done on comparing the rankings of universities’ websites, so this research helps determine whether the existing UMT website is serving its purpose, which is to introduce UMT to the world. The ranking is based on hub and authority values, which are computed according to the structure of the website. These values are computed using two web-searching algorithms, HITS and SALSA. The websites of three other universities, UM, Harvard and Stanford, are used as benchmarks. The results clearly show that more work has to be done on the existing UMT website, since pages that are important according to the benchmarks do not exist among UMT’s pages. The ranking of UMT’s website will act as a guideline for the web developer in building a more efficient website.
Keywords: algorithm, ranking, website, web structure mining
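For readers unfamiliar with hub and authority values, the sketch below shows the standard HITS iteration on a tiny link graph: a page's authority is the sum of the hub scores of pages linking to it, and its hub score is the sum of the authority scores of the pages it links to, with normalization after each step. The page names and link structure are invented for illustration and do not come from the study; SALSA is not shown.

```python
import math

def hits(graph, iterations=50):
    """Basic HITS iteration. `graph` maps a page to the pages it links to."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of all in-linking pages.
        auth = {n: sum(hub[u] for u in graph if n in graph[u]) for n in nodes}
        norm = math.sqrt(sum(a * a for a in auth.values())) or 1.0
        auth = {n: a / norm for n, a in auth.items()}
        # Hub update: sum the authority scores of all linked-to pages.
        hub = {n: sum(auth[v] for v in graph.get(n, ())) for n in nodes}
        norm = math.sqrt(sum(h * h for h in hub.values())) or 1.0
        hub = {n: h / norm for n, h in hub.items()}
    return hub, auth

# Hypothetical site structure.
links = {
    "home": ["admissions", "research"],
    "news": ["admissions", "research"],
    "admissions": ["research"],
}
hub, auth = hits(links)
top_authority = max(auth, key=auth.get)
```

In this toy graph the "research" page receives the most links and ends up as the top authority, while "home" and "news" are the strongest hubs.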
Procedia PDF Downloads 517
1001 Hierarchical Clustering Algorithms in Data Mining
Authors: Z. Abdullah, A. R. Hamdan
Abstract:
Clustering is the process of grouping objects and data into clusters so that data objects in the same cluster are similar to each other. Clustering is one of the areas of data mining, and its algorithms can be classified as partition-based, hierarchical, density-based, and grid-based. In this paper, we survey and review four major hierarchical clustering algorithms: CURE, ROCK, CHAMELEON, and BIRCH. The resulting state of the art for these algorithms will help in eliminating current problems, as well as in deriving more robust and scalable clustering algorithms.
Keywords: clustering, unsupervised learning, algorithms, hierarchical
Procedia PDF Downloads 885
1000 Decision Making System for Clinical Datasets
Authors: P. Bharathiraja
Abstract:
Computer-aided decision making systems are used to enhance the diagnosis and prognosis of diseases and to assist clinicians and junior doctors in clinical decision making. Medical data used for decision making should be definite and consistent. Data mining and soft computing techniques are used for cleaning the data and for incorporating human reasoning into decision making systems. A fuzzy rule based inference technique can be used for classification in order to incorporate human reasoning into the decision making process. In this work, missing values are imputed using the mean or mode of the attribute. The data are normalized using min-max normalization to improve the design and efficiency of the fuzzy inference system. The fuzzy inference system is used to handle the uncertainties that exist in the medical data. Equal-width partitioning is used to partition the attribute values into appropriate fuzzy intervals. Fuzzy rules are generated using a Class Based Associative rule mining algorithm. The system is trained and tested using the heart disease data set from the University of California at Irvine (UCI) Machine Learning Repository. The data were split into training and testing data using a hold-out approach. From the experimental results, it can be inferred that classification using the fuzzy inference system performs better than trivial IF-THEN rule based classification approaches. Furthermore, it is observed that the use of fuzzy logic and the fuzzy inference mechanism handles uncertainty and also resembles human decision making. The system can be used in the absence of a clinical expert to assist junior doctors and clinicians in clinical decision making.
Keywords: decision making, data mining, normalization, fuzzy rule, classification
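The three preprocessing steps named in the abstract (mean or mode imputation, min-max normalization, equal-width partitioning) can be sketched with the Python standard library as follows. This is an illustration under assumptions: the helper names, the sample values, and the three interval labels are made up, and the intervals here are crisp bins, whereas the paper goes on to treat them as fuzzy intervals.

```python
from statistics import mean, mode

def impute(values, numeric=True):
    """Replace None with the mean (numeric) or mode (categorical)."""
    present = [v for v in values if v is not None]
    fill = mean(present) if numeric else mode(present)
    return [fill if v is None else v for v in values]

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant attribute: no spread
    return [(v - lo) / (hi - lo) for v in values]

def equal_width_bin(x, n_bins, lo=0.0, hi=1.0):
    """Index of the equal-width interval of [lo, hi] that contains x."""
    width = (hi - lo) / n_bins
    return min(int((x - lo) / width), n_bins - 1)  # keep x == hi in last bin

# Made-up attribute column with one missing value.
ages = [40, None, 60, 50]
filled = impute(ages)
scaled = min_max(filled)
labels = ["low", "medium", "high"]
intervals = [labels[equal_width_bin(x, 3)] for x in scaled]
```

Normalizing before partitioning means the same interval boundaries can be reused for every attribute, which keeps the rule base simpler.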
Procedia PDF Downloads 517
999 Refactoring Object Oriented Software through Community Detection Using Evolutionary Computation
Authors: R. Nagarani
Abstract:
An intrinsic property of software in a real-world environment is its need to evolve, which is usually accompanied by an increase in software complexity and a deterioration of software quality, making software maintenance a tough problem. Refactoring is regarded as an effective way to address this problem. Many refactoring approaches at the method and class level have been proposed, but research on software refactoring at the package level is limited. This work presents a novel approach to refactoring the package structures of object oriented software using genetic algorithm based community detection. It uses software networks to represent classes and their dependencies. It uses a constrained community detection algorithm to obtain the optimized community structures in software networks, which also correspond to the optimized package structures. Finally, it provides a list of classes as refactoring candidates by comparing the optimized package structures with the real package structures.
Keywords: community detection, complex network, genetic algorithm, package, refactoring
Procedia PDF Downloads 418
998 Spatio-Temporal Data Mining with Association Rules for Lake Van
Authors: Tolga Aydin, M. Fatih Alaeddinoğlu
Abstract:
Throughout history, people have made estimates and inferences about the future by using their past experiences. Developing information technologies and improvements in database management systems make it possible to extract useful information from the knowledge in hand for strategic decisions. Different methods have been developed for this purpose, and data mining by association rule learning is one of them. The Apriori algorithm, one of the well-known association rule learning algorithms, is not commonly used on spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make the Apriori algorithm a suitable data mining technique for learning spatio-temporal association rules. Lake Van, the largest lake in Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as a result of changes in the amount of water it holds. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in the Lake Van region over the years are used by the Apriori algorithm, and a spatio-temporal data mining application is developed to identify overflows and newly formed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons for overflows and underflows may be used to alert experts to take precautions and make the necessary investments.
Keywords: apriori algorithm, association rules, data mining, spatio-temporal data
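One way to read "embedding time and space features into the data set" is to turn each observation into a set of items such as `season=spring` or `zone=east`, after which ordinary Apriori applies unchanged. The sketch below follows that reading; it is a compact frequent-itemset miner, not the authors' application, and the item names and observations are invented rather than Lake Van measurements.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    frequent, k = {}, 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items()
                 if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))]
    return frequent

# Invented observations with time (season) and space (zone) as items.
observations = [
    {"season=spring", "zone=east", "rain=high", "level=rising"},
    {"season=spring", "zone=east", "rain=high", "level=rising"},
    {"season=summer", "zone=east", "rain=low", "level=falling"},
    {"season=spring", "zone=west", "rain=high", "level=rising"},
]
frequent = apriori(observations, min_support=0.75)
pair = frozenset({"rain=high", "level=rising"})
# Confidence of the rule rain=high -> level=rising.
confidence = frequent[pair] / frequent[frozenset({"rain=high"})]
```

Rules are then read off the frequent itemsets by dividing supports, as in the last line.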
Procedia PDF Downloads 374
997 An Efficient Traceability Mechanism in the Audited Cloud Data Storage
Authors: Ramya P, Lino Abraham Varghese, S. Bose
Abstract:
With cloud storage services, data can be stored in the cloud and shared across multiple users. Unexpected hardware or software failures and human errors can easily cause data stored in the cloud to be lost or corrupted, which affects the integrity of cloud data. Some mechanisms have been designed to allow both data owners and public verifiers to efficiently audit cloud data integrity without retrieving the entire data from the cloud server. However, public auditing of the integrity of shared data with the existing mechanisms will unavoidably reveal confidential information, such as the identity of the person, to public verifiers. Here, a privacy-preserving mechanism is proposed to support public auditing of shared data stored in the cloud. It uses group signatures to compute the verification metadata needed to audit the correctness of shared data. The identity of the signer of each block in the shared data is kept confidential from public verifiers, who can still verify shared data integrity without retrieving the entire file. On demand, however, the signer of each block is revealed to the owner alone. In a static group, the group private key is generated once by the owner, whereas in a dynamic group the group private key is changed when users are revoked from the group. When users leave the group, the blocks they have already signed are re-signed by the cloud service provider instead of the owner; this is handled efficiently by a proxy re-signature scheme.
Keywords: data integrity, dynamic group, group signature, public auditing
Procedia PDF Downloads 392
996 Eco-Drive Predictive Analytics
Authors: Sharif Muddsair, Eisels Martin, Giesbrecht Eugenie
Abstract:
With the development of society, the demand for the movement of people gradually increases. The various modes of transport have impacts to different extents, depending mainly on technical and operating conditions. Up-to-date telematics systems are revolutionizing the transport industry, and appropriate use of these systems can substantially improve efficiency. Vehicle monitoring and fleet tracking are among the services used to improve the efficiency and effectiveness of utility vehicles. Many telematics systems may contribute to eco-driving. Generally, they can be grouped according to their role in the driving cycle:
• Before driving: eco-route selection
• While driving: advanced driver assistance
• After driving: remote analysis
Our interest lies in the third point (after driving: remote analysis). A telematics system (TS) makes it possible to record driving patterns in real time and analyze the data later, so that hints specific to the driver classification (fast driver, slow driver, aggressive driver, ...) can be given to encourage an eco-friendly driving style. With the growing number of vehicles and the development of information technology, telematics has become an ‘active’ research subject in IT and the car industry. Telematics has come a long way from providing navigation solutions and assisting the driver to becoming an integral part of the vehicle. Today’s telematics ensures the safety, comfort and convenience of the driver.
Keywords: internet of things, iot, connected vehicle, cv, ts, telematics services, ml, machine learning
Procedia PDF Downloads 306
995 A Comparison of Bias Among Relaxed Divisor Methods Using 3 Bias Measurements
Authors: Sumachaya Harnsukworapanich, Tetsuo Ichimori
Abstract:
Apportionment methods are used by many countries to calculate the distribution of seats in political bodies. For example, such a method is used in the United States (U.S.) to distribute House seats proportionally based on the population of the electoral district. Famous apportionment methods include the divisor methods known as the Adams Method, Dean Method, Hill Method, Jefferson Method and Webster Method. Sometimes the results of these divisor methods are unfair and include errors. Therefore, it is important to examine the optimization of these methods by using a bias measurement to obtain precise and fair results. In this research, we investigate the bias of divisor methods in the U.S. House of Representatives toward large and small states by applying the Stolarsky Mean Method. We compare the bias of the apportionment methods using two famous bias measurements: the Balinski and Young measurement and the Ernst measurement. Both measurements have a formula for large and small states. The third measurement, however, which was created by the researchers, does not factor the element of large and small states into the formula. All three measurements are compared, and the results show that our measurement produces results similar to those of the other two famous measurements.
Keywords: apportionment, bias, divisor, fair, measurement
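The five divisor methods named above differ only in the divisor sequence d(n) used to rank states for their next seat. The sketch below implements them in the common "highest averages" form, awarding each seat to the state with the largest population / d(current seats). The state names and populations are made up, and none of the three bias measurements from the paper is implemented here.

```python
import math

# Divisor d(n) applied to a state that currently holds n seats.
DIVISORS = {
    "adams": lambda n: n,
    "dean": lambda n: n * (n + 1) / (n + 0.5),   # harmonic mean of n, n+1
    "hill": lambda n: math.sqrt(n * (n + 1)),    # geometric mean
    "webster": lambda n: n + 0.5,                # arithmetic mean
    "jefferson": lambda n: n + 1,
}

def apportion(populations, seats, method):
    """Allocate `seats` one at a time by highest population / d(n)."""
    d = DIVISORS[method]
    alloc = {s: 0 for s in populations}
    for _ in range(seats):
        def priority(state):
            div = d(alloc[state])
            # d(0) == 0 (Adams, Dean, Hill) guarantees a first seat.
            return math.inf if div == 0 else populations[state] / div
        winner = max(sorted(populations), key=priority)
        alloc[winner] += 1
    return alloc

# Hypothetical populations for three states sharing 10 seats.
pops = {"A": 5300, "B": 3300, "C": 1400}
jefferson = apportion(pops, 10, "jefferson")
adams = apportion(pops, 10, "adams")
```

With these numbers Jefferson gives the largest state six seats while Adams gives it five and hands the extra seat to the smallest state, which is the kind of large-state versus small-state bias the abstract sets out to measure.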
Procedia PDF Downloads 366
994 Decision Making on Smart Energy Grid Development for Availability and Security of Supply Achievement Using Reliability Merits
Authors: F. Iberraken, R. Medjoudj, D. Aissani
Abstract:
The development of the smart grid concept is built around two separate definitions: the European one, oriented towards sustainable development, and the American one, oriented towards reliability and security of supply. In this paper, we investigate reliability merits that enable decision-makers to provide a high quality of service. The approach is based, on the one hand, on system behavior, using modeling and forecasting of interruptions and failures, and, on the other hand, on the contribution of information and communication technologies (ICT) to mitigating catastrophic failures such as blackouts. It was found that this concept has been adopted by developing and emerging countries in the short and medium term, followed by the sustainability concept in long-term planning. This work highlights the reliability merits, namely benefits, opportunities, costs and risks (BOCR), considered as consistent units for measuring power customer satisfaction. From the decision-making point of view, we use the analytic hierarchy process (AHP) to achieve customer satisfaction, based on the reliability merits and the contribution of the energy resources. Fossil and nuclear resources certainly dominate energy production today, but great advances have already been made in moving to cleaner ones. It was demonstrated that these resources are not only environmentally but also economically and socially sustainable. The paper is organized as follows. Section one is devoted to the introduction, where an implicit review of smart grid development is given for the two main concepts (for the USA and European countries). The AHP method and the BOCR development of reliability merits against power customer satisfaction are presented in section two. The benefits were expressed by the high level of availability, the applicability of maintenance actions, and power quality.
Opportunities were highlighted by the implementation of ICT in data transfer and processing, the mastering of peak demand control, the decentralization of production, and power system management in default conditions. Costs were evaluated using cost-benefit analysis, including the investment expenditures in network security, as the network becomes a target for hackers and terrorists, and the profits of operating as decentralized systems with reduced energy not supplied, thanks to the availability of storage units supplied by renewable resources and to the current power lines (CPL), which enable the power dispatcher to manage load shedding optimally. For risks, we raised the question of citizens' willingness to contribute financially to the system and to the utility restructuring. To what degree do they agree, given the guarantees offered by managers about information integrity? From a technical point of view, do they have sufficient information and knowledge to cope with a smart home and a smart system? In section three, the AHP method is applied to achieve power customer satisfaction with the main energy resources as alternatives, using knowledge drawn from a country that is well advanced in its energy transition. Results and discussion are given in section four. We conclude that the choice of a given resource depends on the attitude of the decision maker (prudent, optimistic or pessimistic), and that the status quo is neither sustainable nor satisfactory.
Keywords: reliability, AHP, renewable energy resources, smart grids
Procedia PDF Downloads 442
993 Comparative Methods for Speech Enhancement and the Effects on Text-Independent Speaker Identification Performance
Authors: R. Ajgou, S. Sbaa, S. Ghendir, A. Chemsa, A. Taleb-Ahmed
Abstract:
Speech enhancement algorithms aim to improve speech quality. In this paper, we review several speech enhancement methods and evaluate their performance based on Perceptual Evaluation of Speech Quality scores (PESQ, ITU-T P.862). All methods were evaluated in the presence of different kinds of noise using the TIMIT database and the NOIZEUS noisy speech corpus. The noise was taken from the AURORA database and includes suburban train noise, babble, car, exhibition hall, restaurant, street, airport and train station noise. Simulation results showed that the approach based on tracking of non-stationary noise performs better than the other methods in terms of the PESQ measure. Moreover, we evaluated the effects of the speech enhancement technique on a speaker identification system based on an autoregressive (AR) model and Mel-frequency cepstral coefficients (MFCC).
Keywords: speech enhancement, pesq, speaker recognition, MFCC
Procedia PDF Downloads 424
992 Using Data Mining Techniques to Evaluate the Different Factors Affecting the Academic Performance of Students at the Faculty of Information Technology in Hashemite University in Jordan
Authors: Feras Hanandeh, Majdi Shannag
Abstract:
This research studies the factors that could affect the accumulative average of students in the Faculty of Information Technology at Hashemite University. The paper examines student information, background, and academic records, and how this information affects a student's ability to obtain high grades. The student information used in the study is extracted from the students’ academic records. Data mining tools and techniques are used to decide which attribute(s) affect the students’ accumulative average. The results show that the most important factor affecting the accumulative average is the student Acceptance Type. We also built a decision tree model and rules to determine how students can obtain high grades in their courses. The overall accuracy of the model is 44%, which is considered an acceptable rate.
Keywords: data mining, classification, extracting rules, decision tree
Procedia PDF Downloads 416
991 Dynamic Background Updating for Lightweight Moving Object Detection
Authors: Kelemewerk Destalem, Joongjae Cho, Jaeseong Lee, Ju H. Park, Joonhyuk Yoo
Abstract:
Background subtraction and temporal difference are often used for moving object detection in video. Both approaches are computationally simple and easy to deploy in real-time image processing. However, while background subtraction is highly sensitive to dynamic backgrounds and illumination changes, the temporal difference approach is poor at extracting the relevant pixels of the moving object and at detecting stopped or slowly moving objects in the scene. In this paper, we propose a moving object detection scheme based on adaptive background subtraction and temporal difference that exploits dynamic background updates. The proposed technique consists of histogram equalization and a linear combination of background and temporal difference, followed by novel frame-based and pixel-based background updating techniques. Finally, morphological operations are applied to the output images. Experimental results show that the proposed algorithm can overcome the drawbacks of both the background subtraction and temporal difference methods and can provide better performance than either method alone.
Keywords: background subtraction, background updating, real time, light weight algorithm, temporal difference
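To make the "linear combination of background and temporal difference" concrete, the sketch below scores each pixel by a weighted sum of its distance from the background model and from the previous frame, thresholds the score, and then updates the background only at pixels judged static. It is a rough illustration under assumptions: the weight `alpha`, the `threshold`, the learning `rate`, and the running-average update are hypothetical choices, frames are flat lists of grey levels, and the histogram equalization, frame-based update and morphological steps from the abstract are omitted.

```python
def detect_and_update(frame, prev_frame, background,
                      alpha=0.5, threshold=30, rate=0.1):
    """Return (motion mask, updated background) for one frame.

    score = alpha * |frame - background| + (1 - alpha) * |frame - prev|
    All parameter values are illustrative assumptions.
    """
    mask, new_background = [], []
    for cur, prev, bg in zip(frame, prev_frame, background):
        score = alpha * abs(cur - bg) + (1 - alpha) * abs(cur - prev)
        moving = score > threshold
        mask.append(moving)
        # Pixel-based update: blend static pixels into the background,
        # leave pixels covered by a moving object untouched.
        new_background.append(bg if moving
                              else (1 - rate) * bg + rate * cur)
    return mask, new_background

# Made-up 4-pixel frames: pixel 2 jumps, pixel 0 drifts slightly.
background = [100.0, 100.0, 100.0, 100.0]
prev_frame = [100, 100, 100, 100]
frame = [104, 100, 200, 100]
mask, background = detect_and_update(frame, prev_frame, background)
```

The small drift at pixel 0 is absorbed into the background, while the large jump at pixel 2 is flagged as motion and kept out of the model.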
Procedia PDF Downloads 342
990 Transformation to M-Learning at the Nursing Institute in the Armed Force Hospital Alhada, in Saudi Arabia Based on Activity Theory
Authors: Rahimah Abdulrahman, A. Eardle, Wilfred Alan, Abdel Hamid Soliman
Abstract:
With the rapid development of technology and advances in learning technologies, m-learning has begun to occupy a great part of our lives. The pace of life, together with the need for learning, gave rise to the concept of mobile learning (m-learning). In 2008, Saudi Arabia requested a national plan for the adoption of information technology (IT) across the country. Part of the recommendations of this plan concerns the implementation of m-learning and its prospective applications to higher education within the Kingdom of Saudi Arabia. The overall aim of the research is to explore the main issues that impact the deployment of m-learning in nursing institutes in Saudi Arabia, at the Armed Force Hospitals (AFH), Alhada, in order to develop a generic model that enables and assists educational policy makers and implementers of m-learning to comprehend and treat those issues effectively. Specifically, the research will: explore the concept of m-learning; identify and analyse the main organisational, technological and cultural issues that relate to the adoption of m-learning; develop a model of m-learning; investigate the perceptions of students of the nursing institutes regarding the use of m-learning technologies for their nursing diploma programmes, based on their experiences; validate the m-learning model using the nursing institute of the AFH, Alhada, in Saudi Arabia; and evaluate the research project as a learning experience and as a contribution to the body of knowledge. Activity Theory (AT) will be adopted for the study because it provides a conceptual framework that engenders an understanding of the structure, development and context of computer-supported activities.
The study will adopt a set of data collection methods that engage nursing students in a quantitative survey, while nurse teachers are engaged through in-depth qualitative studies to obtain first-hand information about the organisational, technological and cultural issues that impact the deployment of m-learning. The original contribution will be a model for developing m-learning material for classroom-based learning in the nursing institute that can have a general application.
Keywords: activity theory (at), mobile learning (m-learning), nursing institute, Saudi Arabia (sa)
Procedia PDF Downloads 353
989 The Search of Possibility of Running Six Sigma Process in It Education Center
Authors: Mohammad Amini, Aliakbar Alijarahi
Abstract:
This research, titled ‘The Search of Possibility of Running Six Sigma Process in IT Education Center’, aims to test the feasibility of running the Six Sigma process in an IT education center system. Six Sigma is a well-established method for reducing process errors. To evaluate running Six Sigma in the IT education center, several variables relevant to the process were selected:
- The amount of support for the process from the organization's top management.
- The current specialty.
- The ability of the training system to compensate for reductions.
- The degree of match between the current culture and the Six Sigma culture.
- The current quality compared with the quality gained from running Six Sigma.
To evaluate these variables, we selected four questions and, to obtain answers, prepared a questionnaire form with 28 questions and distributed it to our sample population. Our working environment is very competitive, and an organization needs to reduce errors to a minimum, otherwise it loses its customers. The questionnaire was given to 55 persons and was completed and returned by 50. After analyzing the forms, the following results were obtained:
- The IT education center needs to use and run this system (Six Sigma) to improve its process quality.
- Most of the factors needed to run Six Sigma exist in the IT education center, but support is needed.
Keywords: education, customer, self-action, quality, continuous improvement process
Procedia PDF Downloads 340988 Using Data Mining Technique for Scholarship Disbursement
Authors: J. K. Alhassan, S. A. Lawal
Abstract:
This work is on decision tree-based classification for the disbursement of scholarships. A tree-based data mining classification technique is used in order to determine the generic rule for disbursing the scholarship. Based on the rules defined from the tree, the system is able to determine the class (status) to which an applicant belongs: Granted or Not Granted. Applicants that fall into the Granted class have successfully acquired the scholarship, while those in the Not Granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from the tree-based classification was also developed. Tree-based classification is adopted because it is efficient, effective and easy to comprehend. The system was tested with data from the National Information Technology Development Agency (NITDA) Abuja, a parastatal of the Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found to work according to the specification. It is therefore recommended for all scholarship disbursement organizations.
Keywords: classification, data mining, decision tree, scholarship
Procedia PDF Downloads 376
987 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy
Authors: Abdullah Al Mamun, Talal Alkharobi
Abstract:
As data sizes grow, people have become more accustomed to storing large amounts of secret information in cloud storage. Companies regularly need to transfer massive business files from one end to another. We lose privacy if we transmit such data as it is and keep repeating the same scenario without securing the communication mechanism, which means proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption, it can encrypt only a limited amount of data, which makes it inapplicable to large data encryption. In this paper, we propose a probable approach based on Pretty Good Privacy for encrypting big data using both symmetric and asymmetric keys. Our goal is to encrypt a huge collection of information and transmit it through a secure communication channel to protect business and personal privacy. To justify our method, an experimental dataset from three different platforms is provided. We would like to show that our approach works efficiently and reliably for massive amounts of various data.
Keywords: big data, cloud computing, cryptography, hadoop, public key
Procedia PDF Downloads 320
986 Study of Inhibition of the End Effect Based on AR Model Predict of Combined Data Extension and Window Function
Authors: Pan Hongxia, Wang Zhenhua
Abstract:
In this paper, the end effect that arises during empirical mode decomposition (EMD) is suppressed effectively by combining two methods: data extension based on AR model prediction and a window function. Simulation with a simulated signal shows that the method achieves the desired effect. Applying the method to gearbox test data also gives good results, improving the calculation accuracy of subsequent data processing and analysis. This lays a good foundation for gearbox fault diagnosis under various working conditions.
Keywords: gearbox, fault diagnosis, AR model, end effect
Procedia PDF Downloads 366
985 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients
Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori
Abstract:
Introduction: Data mining is defined as a process of finding patterns and relationships among data in a database in order to build predictive models. The application of data mining has extended to many sectors, including healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. It applies various techniques and algorithms that differ in accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques to the diagnosis of asthma based on patient symptoms and history. Method: Data mining includes several steps, and decisions must be made by the user. The process starts with developing an understanding of the scope and the application of previous knowledge in this area and identifying the knowledge discovery (KD) process from the point of view of the stakeholders, and it finishes with acting on the discovered knowledge: applying the knowledge, integrating it with other systems, and documenting and reporting it. In this study, a stepwise methodology was followed to achieve a logical outcome. Results: The sensitivity, specificity and accuracy of the KNN, SVM, Naïve Bayes, NN, classification tree and CN2 algorithms, and of related similar studies, were evaluated, and ROC curves were plotted to show the performance of the system. Conclusion: The results show that asthma can be diagnosed with approximately ninety percent accuracy based on demographic and clinical data. The study also showed that methods based on pattern discovery and data mining have a higher sensitivity than expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be the basis of diagnostic methods; it is therefore recommended that machine learning algorithms be used in combination with knowledge-based algorithms.
Keywords: asthma, datamining, classification, machine learning
Procedia PDF Downloads 447
984 Facts of Near Field Communication
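The three evaluation measures named in the Results can be illustrated with a minimal sketch. The function name and the confusion-matrix counts below are hypothetical and are not taken from the study; they only show how sensitivity, specificity and accuracy are conventionally computed for a binary diagnosis.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts.

    tp: patients with asthma correctly classified as positive
    fp: patients without asthma wrongly classified as positive
    tn: patients without asthma correctly classified as negative
    fn: patients with asthma wrongly classified as negative
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # overall fraction correct
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only for illustration (not data from the study).
sens, spec, acc = diagnostic_metrics(tp=45, fp=5, tn=40, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

A ROC curve, as mentioned in the abstract, plots sensitivity against one minus specificity while the classifier's decision threshold is varied.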
Authors: Amin Hamrahi
Abstract:
Near Field Communication (NFC) is one of the latest wireless communication technologies. NFC enables electronic devices to communicate over a short range using radio waves. NFC offers safe yet simple communication between electronic devices. This technology provides the fastest way for two devices to communicate, within a fraction of a second. With NFC technology, communication occurs when an NFC-compatible device is brought within a few centimeters of another NFC device. NFC is an open-platform technology that is being standardized in the NFC Forum. NFC is based on and extends RFID. It operates at a frequency of 13.56 MHz.
Keywords: near field communication, NFC technology, wireless communication technologies, NFC-compatible device, NFC, communication
Procedia PDF Downloads 465
983 Implications of Learning Resource Centre in a Web Environment
Authors: Darshana Lal, Sonu Rana
Abstract:
Learning Resource Centers (LRCs) acquire different kinds of documents, such as books, journals, theses, dissertations, standards and databases, in print and electronic form. This article deals with the different types of sources available in an LRC. It also discusses the concept of the web as a tool and as a multimedia system, and the different interfaces available on the web. The reasons for establishing an LRC are highlighted along with the assignments of an LRC. Different features of LRCs, such as self-learning and group learning, are described. It also implements a group of activities such as reading, learning and educational activities. The use of LRCs by students and faculty is described, and the article concludes with the benefits.
Keywords: internet, search engine, resource centre, OPAC, self-learning, group learning
Procedia PDF Downloads 378
982 Multi-Criteria Test Case Selection Using Ant Colony Optimization
Authors: Niranjana Devi N.
Abstract:
Test case selection chooses the subset of fit test cases and removes the unfit, ambiguous, redundant and unnecessary ones, which in turn improves the quality and reduces the cost of software testing. Test case optimization is the problem of finding the best subset of test cases from a pool of test cases to be audited, one that meets all the objectives of testing concurrently. However, most research has evaluated the fitness of test cases on a single parameter, fault-detecting capability, and has optimized the test cases using a single objective. In the proposed approach, nine parameters are considered for test case selection, and the best subset of parameters for test case selection is obtained using an Interval Type-2 Fuzzy Rough Set. Test case selection is done in two stages. The first stage is a fuzzy entropy-based filtration technique, used for estimating and reducing the ambiguity in test case fitness evaluation and selection. The second stage is an ant colony optimization-based wrapper technique with a forward search strategy, employed to select test cases from the reduced test suite of the first stage. The results are evaluated using the coverage parameters, Precision, Recall, F-Measure, APSC, APDC, and SSR. The experimental evaluation demonstrates that this approach avoids considerable computational effort.
Keywords: ant colony optimization, fuzzy entropy, interval type-2 fuzzy rough set, test case selection
Procedia PDF Downloads 668
981 An Approach for Reducing Morphological Operator Dataset and Recognize Optical Character Based on Significant Features
Authors: Ashis Pradhan, Mohan P. Pradhan
Abstract:
Pattern matching is useful for recognizing characters in a digital image. OCR is one such technique, which reads characters from a digital image and recognizes them. Line segmentation is initially used to identify characters in an image, and the result is later refined by morphological operations such as binarization, erosion and thinning. This work discusses a recognition technique that defines a set of morphological operators based on their orientation in a character. These operators are further categorized into groups having similar shape but different orientation, for efficient utilization of memory. Finally, the characters are recognized according to the frequency of occurrence, in a hierarchy of significant patterns, of those morphological operators and by comparing them with the existing database for each character.
Keywords: binary image, morphological patterns, frequency count, priority, reduction data set and recognition
Procedia PDF Downloads 414
980 Characteristic Study on Conventional and Soliton Based Transmission System
Authors: Bhupeshwaran Mani, S. Radha, A. Jawahar, A. Sivasubramanian
Abstract:
Here, we study the characteristic features of conventional (ON-OFF keying) and soliton-based transmission systems. We consider a 20 Gbps transmission system implemented with Conventional Single Mode Fiber (C-SMF) to examine the role of the Gaussian pulse, which is characteristic of conventional propagation, and the hyperbolic-secant pulse, which is characteristic of soliton propagation. We note the influence of these pulses with respect to different dispersion lengths and soliton periods in the conventional and soliton systems, respectively, and evaluate the system performance in terms of quality factor. The analysis shows that the soliton pulse performs more consistently than the conventional system, even over long distances without dispersion compensation, because it is robust to dispersion. For a transmission length of 200 km, the soliton system yielded a Q of 33.958, while the conventional system was totally exhausted, with Q = 0.
Keywords: dispersion length, return-to-zero (RZ), soliton, soliton period, Q-factor
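For orientation, the quantities named in this abstract have standard textbook definitions, sketched below. The symbols are assumptions introduced here, since the abstract defines none: $T_0$ is the pulse width parameter, $\beta_2$ the group-velocity dispersion coefficient of the fiber, $\mu_{1,0}$ and $\sigma_{1,0}$ the mean and standard deviation of the received signal for the 1 and 0 bits. The authors' exact conventions may differ.

```latex
% Pulse shapes (field envelope at the fiber input)
U_{\text{Gaussian}}(t) = \exp\!\left(-\frac{t^{2}}{2T_0^{2}}\right),
\qquad
U_{\text{soliton}}(t) = \operatorname{sech}\!\left(\frac{t}{T_0}\right)

% Dispersion length and soliton period
L_D = \frac{T_0^{2}}{|\beta_2|},
\qquad
z_0 = \frac{\pi}{2}\, L_D

% Quality factor of the received signal
Q = \frac{\mu_1 - \mu_0}{\sigma_1 + \sigma_0}
```

Under these definitions, a shorter pulse gives a shorter dispersion length, so a Gaussian pulse broadens sooner, whereas a fundamental soliton keeps its shape because nonlinearity balances dispersion.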
Procedia PDF Downloads 346