Search results for: mining software repositories
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2618

Search results for: mining software repositories

2318 Quantitative Evaluation of Frameworks for Web Applications

Authors: Thirumalai Selvi, N. V. Balasubramanian, P. Sheik Abdul Khader

Abstract:

An empirical study of web applications that use software frameworks is presented here. The analysis is based on two approaches. In the first, developers using such frameworks are required, based on their experience, to assign weights to parameters such as database connection. In the second approach, a performance testing tool, OpenSTA, is used to compute start time and other such measures. From such an analysis, it is concluded that open source software is superior to proprietary software. The motivation behind this research is to examine ways in which a quantitative assessment can be made of software in general and frameworks in particular. Concepts such as metrics and architectural styles are discussed along with previously published research.

Keywords: Metrics, Frameworks, Performance Testing, WebApplications, Open Source.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1734
2317 Suitability of Black Box Approaches for the Reliability Assessment of Component-Based Software

Authors: Anjushi Verma, Tirthankar Gayen

Abstract:

Although, reliability is an important attribute of quality, especially for mission critical systems, yet, there does not exist any versatile model even today for the reliability assessment of component-based software. The existing Black Box models are found to make various assumptions which may not always be realistic and may be quite contrary to the actual behaviour of software. They focus on observing the manner in which the system behaves without considering the structure of the system, the components composing the system, their interconnections, dependencies, usage frequencies, etc.As a result, the entropy (uncertainty) in assessment using these models is much high.Though, there are some models based on operation profile yet sometimes it becomes extremely difficult to obtain the exact operation profile concerned with a given operation. This paper discusses the drawbacks, deficiencies and limitations of Black Box approaches from the perspective of various authors and finally proposes a conceptual model for the reliability assessment of software.

Keywords: Black Box, faults, failure, software reliability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1359
2316 Requirements Gathering for Improved Software Usability and the Potential for Usage-Centred Design

Authors: Kholod J. Alotaibi, Andrew M. Gravell

Abstract:

Usability is an important software quality that is often neglected at the design stage. Although methods exist to incorporate elements of usability engineering, there is a need for more balanced usability focused methods that can enhance the experience of software usability for users. In this regard, the potential for Usage-Centred Design is explored with respect to requirements gathering and is shown to lead to high software usability besides other benefits. It achieves this through its focus on usage, defining essential use cases, by conducting task modeling, encouraging user collaboration, refining requirements, and so on. The requirements gathering process in UgCD is described in detail.

Keywords: Requirements gathering, Usability, Usage-Centred Design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1925
2315 SBTAR: An Enhancing Method for Automate Test Tools

Authors: Noppakit Nawalikit, Pattarasinee Bhattarakosol

Abstract:

Since Software testing becomes an important part of Software development in order to improve the quality of software, many automation tools are created to help testing functionality of software. There are a few issues about usability of these tools, one is that the result log which is generated from tools contains useless information that the tester cannot use result log to communicate efficiently, or the result log needs to use a specific application to open. This paper introduces a new method, SBTAR that improves usability of automated test tools in a part of a result log. The practice will use the capability of tools named as IBM Rational Robot to create a customized function, the function would generate new format of a result log which contains useful information faster and easier to understand than using the original result log which was generated from the tools. This result log also increases flexibility by Microsoft Word or WordPad to make them readable.

Keywords: Software Automation Testing, Automated test tool, IBM Rational Robot.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1409
2314 Development of Software Complex for Digitalization of Enterprise Activities

Authors: G. T. Balakayeva, K. K. Nurlybayeva, M. B. Zhanuzakov

Abstract:

In the proposed work, we have developed software and designed a software architecture for the implementation of enterprise business processes. The proposed software has a multi-level architecture using a domain-specific tool. The developed architecture is a guarantor of the availability, reliability and security of the system and the implementation of business processes, which are the basis for effective enterprise management. Automating business processes, automating the algorithmic stages of an enterprise, developing optimal algorithms for managing activities, controlling and monitoring, reducing risks and improving results help organizations achieve strategic goals quickly and efficiently. The software described in this article can connect to the corporate information system via two methods: a desktop client and a web client. With an appeal to the application server, the desktop client program connects to the information system on the company's work PCs over a local network. Outside the organization, the user can interact with the information system via a web browser, which acts as a web client and connects to a web server. The developed software consists of several integrated modules that share resources and interact with each other through an API. The following technology stack was used during development: Node js, React js, MongoDB, Ngnix, Cloud Technologies, Python.

Keywords: Algorithms, document processing, automation, integrated modules, software architecture, software design, information system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 145
2313 A Cumulative Learning Approach to Data Mining Employing Censored Production Rules (CPRs)

Authors: Rekha Kandwal, Kamal K.Bharadwaj

Abstract:

Knowledge is indispensable but voluminous knowledge becomes a bottleneck for efficient processing. A great challenge for data mining activity is the generation of large number of potential rules as a result of mining process. In fact sometimes result size is comparable to the original data. Traditional data mining pruning activities such as support do not sufficiently reduce the huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, thereby making knowledge voluminous with each change. The most predominant representation of the discovered knowledge is the standard Production Rules (PRs) in the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs), as an extension of production rules, that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence, are tight or there is simply no information available as to whether it holds or not. Thus the 'If P Then D' part of the CPR expresses important information while the Unless C part acts only as a switch changes the polarity of D to ~D. In this paper a scheme based on Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from the discovered flat PRs. The discovery of CPRs from flat rules would result in considerable reduction of the already discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.

Keywords: Censored production rules, cumulative learning, data mining, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1462
2312 A Quantitative Approach to Strategic Design of Component-Based Business Process Models

Authors: Eakong Atiptamvaree, Twittie Senivongse

Abstract:

A new paradigm for software design and development models software by its business process, translates the model into a process execution language, and has it run by a supporting execution engine. This process-oriented paradigm promotes modeling of software by less technical users or business analysts as well as rapid development. Since business process models may be shared by different organizations and sometimes even by different business domains, it is interesting to apply a technique used in traditional software component technology to design reusable business processes. This paper discusses an approach to apply a technique for software component fabrication to the design of process-oriented software units, called process components. These process components result from decomposing a business process of a particular application domain into subprocesses with an aim that the process components can be reusable in different process-based software models. The approach is quantitative because the quality of process component design is measured from technical features of the process components. The approach is also strategic because the measured quality is determined against business-oriented component management goals. A software tool has been developed to measure how good a process component design is, according to the required managerial goals and comparing to other designs. We also discuss how we benefit from reusable process components.

Keywords: Business process model, process component, component management goals, measurement

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1646
2311 Improving the Performance of Proxy Server by Using Data Mining Technique

Authors: P. Jomsri

Abstract:

Currently, web usage make a huge data from a lot of user attention. In general, proxy server is a system to support web usage from user and can manage system by using hit rates. This research tries to improve hit rates in proxy system by applying data mining technique. The data set are collected from proxy servers in the university and are investigated relationship based on several features. The model is used to predict the future access websites. Association rule technique is applied to get the relation among Date, Time, Main Group web, Sub Group web, and Domain name for created model. The results showed that this technique can predict web content for the next day, moreover the future accesses of websites increased from 38.15% to 85.57 %. This model can predict web page access which tends to increase the efficient of proxy servers as a result. In additional, the performance of internet access will be improved and help to reduce traffic in networks.

Keywords: Association rule, proxy server, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3035
2310 Classifier Based Text Mining for Neural Network

Authors: M. Govindarajan, R. M. Chandrasekaran

Abstract:

Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In Neural Network that address classification problems, training set, testing set, learning rate are considered as key tasks. That is collection of input/output patterns that are used to train the network and used to assess the network performance, set the rate of adjustments. This paper describes a proposed back propagation neural net classifier that performs cross validation for original Neural Network. In order to reduce the optimization of classification accuracy, training time. The feasibility the benefits of the proposed approach are demonstrated by means of five data sets like contact-lenses, cpu, weather symbolic, Weather, labor-nega-data. It is shown that , compared to exiting neural network, the training time is reduced by more than 10 times faster when the dataset is larger than CPU or the network has many hidden units while accuracy ('percent correct') was the same for all datasets but contact-lences, which is the only one with missing attributes. For contact-lences the accuracy with Proposed Neural Network was in average around 0.3 % less than with the original Neural Network. This algorithm is independent of specify data sets so that many ideas and solutions can be transferred to other classifier paradigms.

Keywords: Back propagation, classification accuracy, textmining, time complexity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4188
2309 Text Mining Technique for Data Mining Application

Authors: M. Govindarajan

Abstract:

Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In decision tree approach is most useful in classification problem. With this technique, tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that performs rulesets, cross validation and boosting for original C5.0 in order to reduce the optimization of error ratio. The feasibility and the benefits of the proposed approach are demonstrated by means of medial data set like hypothyroid. It is shown that, the performance of a classifier on the training cases from which it was constructed gives a poor estimate by sampling or using a separate test file, either way, the classifier is evaluated on cases that were not used to build and evaluate the classifier are both are large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772 case training set and a 1000 case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of see5 is its ability to classifiers called rulesets. The ruleset has an error rate 0.5 % on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive is by f-fold –cross- validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.

Keywords: C5.0, Error Ratio, text mining, training data, test data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2457
2308 From Electroencephalogram to Epileptic Seizures Detection by Using Artificial Neural Networks

Authors: Gaetano Zazzaro, Angelo Martone, Roberto V. Montaquila, Luigi Pavone

Abstract:

Seizure is the main factor that affects the quality of life of epileptic patients. The diagnosis of epilepsy, and hence the identification of epileptogenic zone, is commonly made by using continuous Electroencephalogram (EEG) signal monitoring. Seizure identification on EEG signals is made manually by epileptologists and this process is usually very long and error prone. The aim of this paper is to describe an automated method able to detect seizures in EEG signals, using knowledge discovery in database process and data mining methods and algorithms, which can support physicians during the seizure detection process. Our detection method is based on Artificial Neural Network classifier, trained by applying the multilayer perceptron algorithm, and by using a software application, called Training Builder that has been developed for the massive extraction of features from EEG signals. This tool is able to cover all the data preparation steps ranging from signal processing to data analysis techniques, including the sliding window paradigm, the dimensionality reduction algorithms, information theory, and feature selection measures. The final model shows excellent performances, reaching an accuracy of over 99% during tests on data of a single patient retrieved from a publicly available EEG dataset.

Keywords: Artificial Neural Network, Data Mining, Electroencephalogram, Epilepsy, Feature Extraction, Seizure Detection, Signal Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1280
2307 Factors of Effective Business Software Systems Development and Enhancement Projects Work Effort Estimation

Authors: Beata Czarnacka-Chrobot

Abstract:

Majority of Business Software Systems (BSS) Development and Enhancement Projects (D&EP) fail to meet criteria of their effectiveness, what leads to the considerable financial losses. One of the fundamental reasons for such projects- exceptionally low success rate are improperly derived estimates for their costs and time. In the case of BSS D&EP these attributes are determined by the work effort, meanwhile reliable and objective effort estimation still appears to be a great challenge to the software engineering. Thus this paper is aimed at presenting the most important synthetic conclusions coming from the author-s own studies concerning the main factors of effective BSS D&EP work effort estimation. Thanks to the rational investment decisions made on the basis of reliable and objective criteria it is possible to reduce losses caused not only by abandoned projects but also by large scale of overrunning the time and costs of BSS D&EP execution.

Keywords: Benchmarking data, business software systems development and enhancement projects, effort estimation, software engineering economics, software functional size measurement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1524
2306 Analysis of a Population of Diabetic Patients Databases with Classifiers

Authors: Murat Koklu, Yavuz Unal

Abstract:

Data mining can be called as a technique to extract information from data. It is the process of obtaining hidden information and then turning it into qualified knowledge by statistical and artificial intelligence technique. One of its application areas is medical area to form decision support systems for diagnosis just by inventing meaningful information from given medical data. In this study a decision support system for diagnosis of illness that make use of data mining and three different artificial intelligence classifier algorithms namely Multilayer Perceptron, Naive Bayes Classifier and J.48. Pima Indian dataset of UCI Machine Learning Repository was used. This dataset includes urinary and blood test results of 768 patients. These test results consist of 8 different feature vectors. Obtained classifying results were compared with the previous studies. The suggestions for future studies were presented.

Keywords: Artificial Intelligence, Classifiers, Data Mining, Diabetic Patients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5406
2305 Blueprinting of a Normalized Supply Chain Processes: Results in Implementing Normalized Software Systems

Authors: Bassam Istanbouli

Abstract:

With the technology evolving every day and with the increase in global competition, industries are always under the pressure to be the best. They need to provide good quality products at competitive prices, when and how the customer wants them.  In order to achieve this level of service, products and their respective supply chain processes need to be flexible and evolvable; otherwise changes will be extremely expensive, slow and with many combinatorial effects. Those combinatorial effects impact the whole organizational structure, from a management, financial, documentation, logistics and specially the information system Enterprise Requirement Planning (ERP) perspective. By applying the normalized system concept/theory to segments of the supply chain, we believe minimal effects, especially at the time of launching an organization global software project. The purpose of this paper is to point out that if an organization wants to develop a software from scratch or implement an existing ERP software for their business needs and if their business processes are normalized and modular then most probably this will yield to a normalized and modular software system that can be easily modified when the business evolves. Another important goal of this paper is to increase the awareness regarding the design of the business processes in a software implementation project. If the blueprints created are normalized then the software developers and configurators will use those modular blueprints to map them into modular software. This paper only prepares the ground for further studies;  the above concept will be supported by going through the steps of developing, configuring and/or implementing a software system for an organization by using two methods: The Software Development Lifecycle method (SDLC) and the Accelerated SAP implementation method (ASAP). Both methods start with the customer requirements, then blue printing of its business processes and finally mapping those processes into a software system.  Since those requirements and processes are the starting point of the implementation process, then normalizing those processes will end up in a normalizing software.

Keywords: Blueprint, ERP, SDLC, Modular.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 366
2304 Ensemble Approach for Predicting Student's Academic Performance

Authors: L. A. Muhammad, M. S. Argungu

Abstract:

Educational data mining (EDM) has recorded substantial considerations. Techniques of data mining in one way or the other have been proposed to dig out out-of-sight knowledge in educational data. The result of the study got assists academic institutions in further enhancing their process of learning and methods of passing knowledge to students. Consequently, the performance of students boasts and the educational products are by no doubt enhanced. This study adopted a student performance prediction model premised on techniques of data mining with Students' Essential Features (SEF). SEF are linked to the learner's interactivity with the e-learning management system. The performance of the student's predictive model is assessed by a set of classifiers, viz. Bayes Network, Logistic Regression, and Reduce Error Pruning Tree (REP). Consequently, ensemble methods of Bagging, Boosting, and Random Forest (RF) are applied to improve the performance of these single classifiers. The study reveals that the result shows a robust affinity between learners' behaviors and their academic attainment. Result from the study shows that the REP Tree and its ensemble record the highest accuracy of 83.33% using SEF. Hence, in terms of the Receiver Operating Curve (ROC), boosting method of REP Tree records 0.903, which is the best. This result further demonstrates the dependability of the proposed model.

Keywords: Ensemble, bagging, Random Forest, boosting, data mining, classifiers, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 702
2303 Prediction of Reusability of Object Oriented Software Systems using Clustering Approach

Authors: Anju Shri, Parvinder S. Sandhu, Vikas Gupta, Sanyam Anand

Abstract:

In literature, there are metrics for identifying the quality of reusable components but the framework that makes use of these metrics to precisely predict reusability of software components is still need to be worked out. These reusability metrics if identified in the design phase or even in the coding phase can help us to reduce the rework by improving quality of reuse of the software component and hence improve the productivity due to probabilistic increase in the reuse level. As CK metric suit is most widely used metrics for extraction of structural features of an object oriented (OO) software; So, in this study, tuned CK metric suit i.e. WMC, DIT, NOC, CBO and LCOM, is used to obtain the structural analysis of OO-based software components. An algorithm has been proposed in which the inputs can be given to K-Means Clustering system in form of tuned values of the OO software component and decision tree is formed for the 10-fold cross validation of data to evaluate the in terms of linguistic reusability value of the component. The developed reusability model has produced high precision results as desired.

Keywords: CK-Metric, Desicion Tree, Kmeans, Reusability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1893
2302 Video Summarization: Techniques and Applications

Authors: Zaynab Elkhattabi, Youness Tabii, Abdelhamid Benkaddour

Abstract:

Nowadays, huge amount of multimedia repositories make the browsing, retrieval and delivery of video contents very slow and even difficult tasks. Video summarization has been proposed to improve faster browsing of large video collections and more efficient content indexing and access. In this paper, we focus on approaches to video summarization. The video summaries can be generated in many different forms. However, two fundamentals ways to generate summaries are static and dynamic. We present different techniques for each mode in the literature and describe some features used for generating video summaries. We conclude with perspective for further research.

Keywords: Semantic features, static summarization, video skimming, Video summarization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7035
2301 Arabic Light Stemmer for Better Search Accuracy

Authors: Sahar Khedr, Dina Sayed, Ayman Hanafy

Abstract:

Arabic is one of the most ancient and critical languages in the world. It has over than 250 million Arabic native speakers and more than twenty countries having Arabic as one of its official languages. In the past decade, we have witnessed a rapid evolution in smart devices, social network and technology sector which led to the need to provide tools and libraries that properly tackle the Arabic language in different domains. Stemming is one of the most crucial linguistic fundamentals. It is used in many applications especially in information extraction and text mining fields. The motivation behind this work is to enhance the Arabic light stemmer to serve the data mining industry and leverage it in an open source community. The presented implementation works on enhancing the Arabic light stemmer by utilizing and enhancing an algorithm that provides an extension for a new set of rules and patterns accompanied by adjusted procedure. This study has proven a significant enhancement for better search accuracy with an average 10% improvement in comparison with previous works.

Keywords: Arabic data mining, Arabic Information extraction, Arabic Light stemmer, Arabic stemmer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1468
2300 Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering

Authors: Dewan Md. Farid, Nouria Harbi, Suman Ahmmed, Md. Zahidur Rahman, Chowdhury Mofizur Rahman

Abstract:

Network security attacks are the violation of information security policy that received much attention to the computational intelligence society in the last decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large number of network data or logs. Naïve Bayesian classifier is one of the most popular data mining algorithm for classification, which provides an optimal way to predict the class of an unknown example. It has been tested that one set of probability derived from data is not good enough to have good classification rate. In this paper, we proposed a new learning algorithm for mining network logs to detect network intrusions through naïve Bayesian classifier, which first clusters the network logs into several groups based on similarity of logs, and then calculates the prior and conditional probabilities for each group of logs. For classifying a new log, the algorithm checks in which cluster the log belongs and then use that cluster-s probability set to classify the new log. We tested the performance of our proposed algorithm by employing KDD99 benchmark network intrusion detection dataset, and the experimental results proved that it improves detection rates as well as reduces false positives for different types of network intrusions.

Keywords: Clustering, detection rate, false positive, naïveBayesian classifier, network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5517
2299 Investigating Crime Hotspot Places and their Implication to Urban Environmental Design: A Geographic Visualization and Data Mining Approach

Authors: Donna R. Tabangin, Jacqueline C. Flores, Nelson F. Emperador

Abstract:

Information is power. Geographical information is an emerging science that is advancing the development of knowledge to further help in the understanding of the relationship of “place" with other disciplines such as crime. The researchers used crime data for the years 2004 to 2007 from the Baguio City Police Office to determine the incidence and actual locations of crime hotspots. Combined qualitative and quantitative research methodology was employed through extensive fieldwork and observation, geographic visualization with Geographic Information Systems (GIS) and Global Positioning Systems (GPS), and data mining. The paper discusses emerging geographic visualization and data mining tools and methodologies that can be used to generate baseline data for environmental initiatives such as urban renewal and rejuvenation. The study was able to demonstrate that crime hotspots can be computed and were seen to be occurring to some select places in the Central Business District (CBD) of Baguio City. It was observed that some characteristics of the hotspot places- physical design and milieu may play an important role in creating opportunities for crime. A list of these environmental attributes was generated. This derived information may be used to guide the design or redesign of the urban environment of the City to be able to reduce crime and at the same time improve it physically.

Keywords: Crime mapping, data mining, environmental design, geographic visualization, GIS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2578
2298 Using Design Sprint for Software Engineering Undergraduate Student Projects: A Method Paper

Authors: Sobhani U. Pilapitiya, Tharanga Peiris

Abstract:

Software engineering curriculums generally consist of industry-based practices such as project-based learning (PBL) which mainly focuses on efficient and innovative product development. These approaches can be tailored and used in project-based modules in software engineering curriculums. However, there are very limited attempts in the area especially related to Sri Lankan context. This paper describes a tailored pedagogical approach and its results of using design sprint which can be used for project-based modules in software engineering (SE) curriculums. A controlled group of second year software engineering students was selected for the study. The study results indicate that all of the students agreed that the design sprint approach is effective in group-based projects and 83% of students stated that it minimized the re-work compared to traditional project approaches. The tailored process was effective, easy to implement and produced desired results at the end of the session while providing students an enjoyable experience.

Keywords: design sprint, project-based learning, software engineering, curriculum

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 664
2297 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, we have seen an increasing importance of research and study on knowledge source, decision support systems, data mining and procedure of knowledge discovery in data bases and it is considered that each of these aspects affects the others. In this article, we have merged information source and knowledge source to suggest a knowledge based system within limits of management based on storing and restoring of knowledge to manage information and improve decision making and resources. In this article, we have used method of data mining and Apriori algorithm in procedure of knowledge discovery one of the problems of Apriori algorithm is that, a user should specify the minimum threshold for supporting the regularity. Imagine that a user wants to apply Apriori algorithm for a database with millions of transactions. Definitely, the user does not have necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve Apriori algorithm. To achieve our goal, we tried using fuzzy logic to put data in different clusters before applying the Apriori algorithm for existing data in the database and we also try to suggest the most suitable threshold to the user automatically.

Keywords: Decision support system, data mining, knowledge discovery, data discovery, fuzzy logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2104
2296 Proposition for a New Approach of Version Control System Based On ECA Active Rules

Authors: S. Benhamed, S. Hocine, D. Benhamamouch

Abstract:

We try to give a solution of version control for documents in web service, that-s why we propose a new approach used specially for the XML documents. The new approach is applied in a centralized repository, this repository coexist with other repositories in a decentralized system. To achieve the activities of this approach in a standard model we use the ECA active rules. We also show how the Event-Condition-Action rules (ECA rules) have been incorporated as a mechanism for the version control of documents. The need to integrate ECA rules is that it provides a clear declarative semantics and induces an immediate operational realization in the system without the need for human intervention.

Keywords: ECA Rule, Web service, version control system, propagation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1351
2295 Discovering Complex Regularities by Adaptive Self Organizing Classification

Authors: A. Faro, D. Giordano, F. Maiorana

Abstract:

Data mining uses a variety of techniques each of which is useful for some particular task. It is important to have a deep understanding of each technique and be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find out a strategy to optmize classification by adding, moving or delete a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for number of classes optimization.The tool is used to classify macroeconomic data that report the most developed countries? import and export. It is possible to classify the countries based on their economic behaviour and use an ad hoc tool to characterize the commercial behaviour of a country in a selected class from the analysis of positive and negative features that contribute to classes formation.

Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, cluster interpretation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1539
2294 A Robust Software for Advanced Analysis of Space Steel Frames

Authors: Viet-Hung Truong, Seung-Eock Kim

Abstract:

This paper presents a robust software package for practical advanced analysis of space steel framed structures. The pre- and post-processors of the presented software package are coded in the C++ programming language while the solver is written by using the FORTRAN programming language. A user-friendly graphical interface of the presented software is developed to facilitate the modeling process and result interpretation of the problem. The solver employs the stability functions for capturing the second-order effects to minimize modeling and computational time. Both the plastic-hinge and fiber-hinge beam-column elements are available in the presented software. The generalized displacement control method is adopted to solve the nonlinear equilibrium equations.

Keywords: Advanced analysis, beam-column, fiber-hinge, plastic hinge, steel frame.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1432
2293 A Survey on Usage and Diffusion of Project Risk Management Techniques and Software Tools in the Construction Industry

Authors: Muhammad Jamaluddin Thaheem, Alberto De Marco

Abstract:

The area of Project Risk Management (PRM) has been extensively researched, and the utilization of various tools and techniques for managing risk in several industries has been sufficiently reported. Formal and systematic PRM practices have been made available for the construction industry. Based on such body of knowledge, this paper tries to find out the global picture of PRM practices and approaches with the help of a survey to look into the usage of PRM techniques and diffusion of software tools, their level of maturity, and their usefulness in the construction sector. Results show that, despite existing techniques and tools, their usage is limited: software tools are used only by a minority of respondents and their cost is one of the largest hurdles in adoption. Finally, the paper provides some important guidelines for future research regarding quantitative risk analysis techniques and suggestions for PRM software tools development and improvement.

Keywords: Construction industry, Project risk management, Software tools, Survey study.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2949
2292 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is a process of grouping objects and data into groups of clusters to ensure that data objects from the same cluster are identical to each other. Clustering algorithms in one of the area in data mining and it can be classified into partition, hierarchical, density based and grid based. Therefore, in this paper we do survey and review four major hierarchical clustering algorithms called CURE, ROCK, CHAMELEON and BIRCH. The obtained state of the art of these algorithms will help in eliminating the current problems as well as deriving more robust and scalable algorithms for clustering.

Keywords: Clustering, method, algorithm, hierarchical, survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3353
2291 A Heuristics Approach for Fast Detecting Suspicious Money Laundering Cases in an Investment Bank

Authors: Nhien-An Le-Khac, Sammer Markos, M-Tahar Kechadi

Abstract:

Today, money laundering (ML) poses a serious threat not only to financial institutions but also to the nation. This criminal activity is becoming more and more sophisticated and seems to have moved from the cliché of drug trafficking to financing terrorism and surely not forgetting personal gain. Most international financial institutions have been implementing anti-money laundering solutions (AML) to fight investment fraud. However, traditional investigative techniques consume numerous man-hours. Recently, data mining approaches have been developed and are considered as well-suited techniques for detecting ML activities. Within the scope of a collaboration project for the purpose of developing a new solution for the AML Units in an international investment bank, we proposed a data mining-based solution for AML. In this paper, we present a heuristics approach to improve the performance for this solution. We also show some preliminary results associated with this method on analysing transaction datasets.

Keywords: data mining, anti money laundering, clustering, heuristics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3561
2290 A Study on Finding Similar Document with Multiple Categories

Authors: R. Saraçoğlu, N. Allahverdi

Abstract:

Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.

Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1684
2289 A Reusability Evaluation Model for OO-Based Software Components

Authors: Parvinder S. Sandhu, Hardeep Singh

Abstract:

The requirement to improve software productivity has promoted the research on software metric technology. There are metrics for identifying the quality of reusable components but the function that makes use of these metrics to find reusability of software components is still not clear. These metrics if identified in the design phase or even in the coding phase can help us to reduce the rework by improving quality of reuse of the component and hence improve the productivity due to probabilistic increase in the reuse level. CK metric suit is most widely used metrics for the objectoriented (OO) software; we critically analyzed the CK metrics, tried to remove the inconsistencies and devised the framework of metrics to obtain the structural analysis of OO-based software components. Neural network can learn new relationships with new input data and can be used to refine fuzzy rules to create fuzzy adaptive system. Hence, Neuro-fuzzy inference engine can be used to evaluate the reusability of OO-based component using its structural attributes as inputs. In this paper, an algorithm has been proposed in which the inputs can be given to Neuro-fuzzy system in form of tuned WMC, DIT, NOC, CBO , LCOM values of the OO software component and output can be obtained in terms of reusability. The developed reusability model has produced high precision results as expected by the human experts.

Keywords: CK-Metric, ID3, Neuro-fuzzy, Reusability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1797