Search results for: Data stream analysis.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13652

Search results for: Data stream analysis.

12182 Principal Component Analysis-Ranking as a Variable Selection Method for the Simultaneous Spectrophotometric Determination of Phenol, Resorcinol and Catechol in Real Samples

Authors: Nahid Ghasemi, Mohammad Goodarzi, Morteza Khosravi

Abstract:

Simultaneous determination of multicomponents of phenol, resorcinol and catechol with a chemometric technique a PCranking artificial neural network (PCranking-ANN) algorithm is reported in this study. Based on the data correlation coefficient method, 3 representative PCs are selected from the scores of original UV spectral data (35 PCs) as the original input patterns for ANN to build a neural network model. The results obtained by iterating 8000 .The RMSEP for phenol, resorcinol and catechol with PCranking- ANN were 0.6680, 0.0766 and 0.1033, respectively. Calibration matrices were 0.50-21.0, 0.50-15.1 and 0.50-20.0 μg ml-1 for phenol, resorcinol and catechol, respectively. The proposed method was successfully applied for the determination of phenol, resorcinol and catechol in synthetic and water samples.

Keywords: Phenol, Resorcinol, Catechol, Principal componentrankingArtificial Neural Network, Chemometrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1423
12181 Eliciting and Confirming Data, Information, Knowledge and Wisdom in a Specialist Health Care Setting: The WICKED Method

Authors: S. Impey, D. Berry, S. Furtado, M. Galvin, L. Grogan, O. Hardiman, L. Hederman, M. Heverin, V. Wade, L. Douris, D. O'Sullivan, G. Stephens

Abstract:

Healthcare is a knowledge-rich environment. This knowledge, while valuable, is not always accessible outside the borders of individual clinics. This research aims to address part of this problem (at a study site) by constructing a maximal data set (knowledge artefact) for motor neurone disease (MND). This data set is proposed as an initial knowledge base for a concurrent project to develop an MND patient data platform. It represents the domain knowledge at the study site for the duration of the research (12 months). A knowledge elicitation method was also developed from the lessons learned during this process - the WICKED method. WICKED is an anagram of the words: eliciting and confirming data, information, knowledge, wisdom. But it is also a reference to the concept of wicked problems, which are complex and challenging, as is eliciting expert knowledge. The method was evaluated at a second site, and benefits and limitations were noted. Benefits include that the method provided a systematic way to manage data, information, knowledge and wisdom (DIKW) from various sources, including healthcare specialists and existing data sets. Limitations surrounded the time required and how the data set produced only represents DIKW known during the research period. Future work is underway to address these limitations.

Keywords: Healthcare, knowledge acquisition, maximal data sets, action design science.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 513
12180 Temporally Coherent 3D Animation Reconstruction from RGB-D Video Data

Authors: Salam Khalifa, Naveed Ahmed

Abstract:

We present a new method to reconstruct a temporally coherent 3D animation from single or multi-view RGB-D video data using unbiased feature point sampling. Given RGB-D video data, in form of a 3D point cloud sequence, our method first extracts feature points using both color and depth information. In the subsequent steps, these feature points are used to match two 3D point clouds in consecutive frames independent of their resolution. Our new motion vectors based dynamic alignement method then fully reconstruct a spatio-temporally coherent 3D animation. We perform extensive quantitative validation using novel error functions to analyze the results. We show that despite the limiting factors of temporal and spatial noise associated to RGB-D data, it is possible to extract temporal coherence to faithfully reconstruct a temporally coherent 3D animation from RGB-D video data.

Keywords: 3D video, 3D animation, RGB-D video, Temporally Coherent 3D Animation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2067
12179 Comparison of Different Data Acquisition Techniques for Shape Optimization Problems

Authors: Attila Vámosi, Tamás Mankovits, Dávid Huri, Imre Kocsis, Tamás Szabó

Abstract:

Non-linear FEM calculations are indispensable when important technical information like operating performance of a rubber component is desired. For example rubber bumpers built into air-spring structures may undergo large deformations under load, which in itself shows non-linear behavior. The changing contact range between the parts and the incompressibility of the rubber increases this non-linear behavior further. The material characterization of an elastomeric component is also a demanding engineering task. The shape optimization problem of rubber parts led to the study of FEM based calculation processes. This type of problems was posed and investigated by several authors. In this paper the time demand of certain calculation methods are studied and the possibilities of time reduction is presented.

Keywords: Rubber bumper, data acquisition, finite element analysis, support vector regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2141
12178 Multi-Dimensional Concerns Mining for Web Applications via Concept-Analysis

Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini

Abstract:

Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering), the scientific community has focused attention to Web applications design, development, analysis, and testing, by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web Applications, based on concepts analysis, impact analysis, and token-based concern identification. This approach lets the user to analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, to document, understand and test Web applications. This technique was developed in the context of WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.

Keywords: Concepts Analysis, Concerns Mining, Multi-Dimensional Separation of Concerns, Impact Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1467
12177 Dynamic Data Partition Algorithm for a Parallel H.264 Encoder

Authors: Juntae Kim, Jaeyoung Park, Kyoungkun Lee, Jong Tae Kim

Abstract:

The H.264/AVC standard is a highly efficient video codec providing high-quality videos at low bit-rates. As employing advanced techniques, the computational complexity has been increased. The complexity brings about the major problem in the implementation of a real-time encoder and decoder. Parallelism is the one of approaches which can be implemented by multi-core system. We analyze macroblock-level parallelism which ensures the same bit rate with high concurrency of processors. In order to reduce the encoding time, dynamic data partition based on macroblock region is proposed. The data partition has the advantages in load balancing and data communication overhead. Using the data partition, the encoder obtains more than 3.59x speed-up on a four-processor system. This work can be applied to other multimedia processing applications.

Keywords: H.264/AVC, video coding, thread-level parallelism, OpenMP, multimedia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1790
12176 Neural Network Models for Actual Cost and Actual Duration Estimation in Construction Projects: Findings from Greece

Authors: Panagiotis Karadimos, Leonidas Anthopoulos

Abstract:

Predicting the actual cost and duration in construction projects concern a continuous and existing problem for the construction sector. This paper addresses this problem with modern methods and data available from past public construction projects. 39 bridge projects, constructed in Greece, with a similar type of available data were examined. Considering each project’s attributes with the actual cost and the actual duration, correlation analysis is performed and the most appropriate predictive project variables are defined. Additionally, the most efficient subgroup of variables is selected with the use of the WEKA application, through its attribute selection function. The selected variables are used as input neurons for neural network models through correlation analysis. For constructing neural network models, the application FANN Tool is used. The optimum neural network model, for predicting the actual cost, produced a mean squared error with a value of 3.84886e-05 and it was based on the budgeted cost and the quantity of deck concrete. The optimum neural network model, for predicting the actual duration, produced a mean squared error with a value of 5.89463e-05 and it also was based on the budgeted cost and the amount of deck concrete.

Keywords: Actual cost and duration, attribute selection, bridge projects, neural networks, predicting models, FANN TOOL, WEKA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1199
12175 Comparative Study of the Static and Dynamic Analysis of Multi-Storey Irregular Building

Authors: Bahador Bagheri, Ehsan Salimi Firoozabad, Mohammadreza Yahyaei

Abstract:

As the world move to the accomplishment of Performance Based Engineering philosophies in seismic design of Civil Engineering structures, new seismic design provisions require Structural Engineers to perform both static and dynamic analysis for the design of structures. While Linear Equivalent Static Analysis is performed for regular buildings up to 90m height in zone I and II, Dynamic Analysis should be performed for regular and irregular buildings in zone IV and V. Dynamic Analysis can take the form of a dynamic Time History Analysis or a linear Response Spectrum Analysis. In present study, Multi-storey irregular buildings with 20 stories have been modeled using software packages ETABS and SAP 2000 v.15 for seismic zone V in India. This paper also deals with the effect of the variation of the building height on the structural response of the shear wall building. Dynamic responses of building under actual earthquakes, EL-CENTRO 1949 and CHI-CHI Taiwan 1999 have been investigated. This paper highlights the accuracy and exactness of Time History analysis in comparison with the most commonly adopted Response Spectrum Analysis and Equivalent Static Analysis.

Keywords: Equivalent Static Analysis, Time history method, Response spectrum method, Reinforce concrete building, displacement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16125
12174 Strategy in Controlling Rice-Field Conversion in Pangkep Regency, South Sulawesi, Indonesia

Authors: Nurliani, Ida Rosada

Abstract:

The national rice consumption keeps increasing along with raising income of the households and the rapid growth of population. However, food availability, particularly rice, is limited. Impacts of rice-field conversion have run cumulatively, as we can see on potential losses of rice and crops production, as well as work opportunity that keeps increasing year-by-year. Therefore, it requires policy recommendation to control rice-field conversion through economic, social, and ecological approaches. The research was a survey method intended to: (1) Identify internal factors; quality and productivity of the land as the cause of land conversion, (2) Identify external factors of land conversion, value of the rice-field and the competitor’s land, workforce absorption, and regulation, as well as (3) Formulate strategies in controlling rice-field conversion. Population of the research was farmers who applied land conversion at Pangkep Regency, South Sulawesi. Samples were determined using the incidental sampling method. Data analysis used productivity analysis, land quality analysis, total economic value analysis, and SWOT analysis. Results of the research showed that the quality of rice-field was low as well as productivity of the grains (unhulled-rice). So that, average productivity of the grains and quality of rice-field were low as well. Total economic value of rice-field was lower than the economic value of the embankment. Workforce absorption value on rice-field was higher than on the embankment. Strategies in controlling such rice-field conversion can be done by increasing rice-field productivity, improving land quality, applying cultivation technique of specific location, improving the irrigation lines, and socializing regulation and sanction about the transfer of land use.

Keywords: Land conversion, quality of rice-field, land economic value, strategy in controlling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1302
12173 Evaluation of Easy-to-Use Energy Building Design Tools for Solar Access Analysis in Urban Contexts: Comparison of Friendly Simulation Design Tools for Architectural Practice in the Early Design Stage

Authors: M. Iommi, G. Losco

Abstract:

Current building sector is focused on reduction of energy requirements, on renewable energy generation and on regeneration of existing urban areas. These targets need to be solved with a systemic approach, considering several aspects simultaneously such as climate conditions, lighting conditions, solar radiation, PV potential, etc. The solar access analysis is an already known method to analyze the solar potentials, but in current years, simulation tools have provided more effective opportunities to perform this type of analysis, in particular in the early design stage. Nowadays, the study of the solar access is related to the easiness of the use of simulation tools, in rapid and easy way, during the design process. This study presents a comparison of three simulation tools, from the point of view of the user, with the aim to highlight differences in the easy-to-use of these tools. Using a real urban context as case study, three tools; Ecotect, Townscope and Heliodon, are tested, performing models and simulations and examining the capabilities and output results of solar access analysis. The evaluation of the ease-to-use of these tools is based on some detected parameters and features, such as the types of simulation, requirements of input data, types of results, etc. As a result, a framework is provided in which features and capabilities of each tool are shown. This framework shows the differences among these tools about functions, features and capabilities. The aim of this study is to support users and to improve the integration of simulation tools for solar access with the design process.

Keywords: Solar access analysis, energy building design tools, urban planning, solar potential.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2058
12172 Economic Factorial Analysis of CO2 Emissions: The Divisia Index with Interconnected Factors Approach

Authors: Alexander Y. Vaninsky

Abstract:

This paper presents a method of economic factorial analysis of the CO2 emissions based on the extension of the Divisia index to interconnected factors. This approach, contrary to the Kaya identity, considers three main factors of the CO2 emissions: gross domestic product, energy consumption, and population - as equally important, and allows for accounting of all of them simultaneously. The three factors are included into analysis together with their carbon intensities that allows for obtaining a comprehensive picture of the change in the CO2 emissions. A computer program in R-language that is available for free download serves automation of the calculations. A case study of the U.S. carbon dioxide emissions is used as an example. 

Keywords: CO2 emissions, Economic analysis, Factorial analysis, Divisia index, Interconnected factors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2543
12171 XML Schema Automatic Matching Solution

Authors: Huynh Quyet Thang, Vo Sy Nam

Abstract:

Schema matching plays a key role in many different applications, such as schema integration, data integration, data warehousing, data transformation, E-commerce, peer-to-peer data management, ontology matching and integration, semantic Web, semantic query processing, etc. Manual matching is expensive and error-prone, so it is therefore important to develop techniques to automate the schema matching process. In this paper, we present a solution for XML schema automated matching problem which produces semantic mappings between corresponding schema elements of given source and target schemas. This solution contributed in solving more comprehensively and efficiently XML schema automated matching problem. Our solution based on combining linguistic similarity, data type compatibility and structural similarity of XML schema elements. After describing our solution, we present experimental results that demonstrate the effectiveness of this approach.

Keywords: XML Schema, Schema Matching, SemanticMatching, Automatic XML Schema Matching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1823
12170 Improvement of Gas Turbine Performance Test in Combine Cycle

Authors: M. Khosravy-el-Hossani, Q. Dorosti

Abstract:

One of the important applications of gas turbines is their utilization for heat recovery steam generator in combine-cycle technology. Exhaust flow and energy are two key parameters for determining heat recovery steam generator performance which are mainly determined by the main gas turbine components performance data. For this reason a method was developed for determining the exhaust energy in the new edition of ASME PTC22. The result of this investigation shows that the method of standard has considerable error. Therefore in this paper a new method is presented for modifying of the performance calculation. The modified method is based on exhaust gas constituent analysis and combustion calculations. The case study presented here by two kind of General Electric gas turbine design data for validation of methodologies. The result shows that the modified method is more precise than the ASME PTC22 method. The exhaust flow calculation deviation from design data is 1.5-2 % by ASME PTC22 method so that the deviation regarding with modified method is 0.3-0.5%. Based on precision of analyzer instruments, the method can be suitable alternative for gas turbine standard performance test. In advance two methods are proposed based on known and unknown fuel in modified method procedure. The result of this paper shows that the difference between the two methods is below than %0.02. In according to reasonable esult of the second procedure (unknown fuel composition), the method can be applied to performance evaluation of gas turbine, so that the measuring cost and data gathering should be reduced.

Keywords: Gas turbine, Performance test code, Combined cycle.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2979
12169 Validation of Reverse Engineered Web Application Models

Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini

Abstract:

Web applications have become complex and crucial for many firms, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering). The scientific community has focused attention to Web application design, development, analysis, testing, by studying and proposing methodologies and tools. Static and dynamic techniques may be used to analyze existing Web applications. The use of traditional static source code analysis may be very difficult, for the presence of dynamically generated code, and for the multi-language nature of the Web. Dynamic analysis may be useful, but it has an intrinsic limitation, the low number of program executions used to extract information. Our reverse engineering analysis, used into our WAAT (Web Applications Analysis and Testing) project, applies mutational techniques in order to exploit server side execution engines to accomplish part of the dynamic analysis. This paper studies the effects of mutation source code analysis applied to Web software to build application models. Mutation-based generated models may contain more information then necessary, so we need a pruning mechanism.

Keywords: Validation, Dynamic Analysis, MutationAnalysis, Reverse Engineering, Web Applications

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1615
12168 An Intelligent Approach of Rough Set in Knowledge Discovery Databases

Authors: Hrudaya Ku. Tripathy, B. K. Tripathy, Pradip K. Das

Abstract:

Knowledge Discovery in Databases (KDD) has evolved into an important and active area of research because of theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Rough Set Theory (RST) is a mathematical formalism for representing uncertainty that can be considered an extension of the classical set theory. It has been used in many different research areas, including those related to inductive machine learning and reduction of knowledge in knowledge-based systems. One important concept related to RST is that of a rough relation. In this paper we presented the current status of research on applying rough set theory to KDD, which will be helpful for handle the characteristics of real-world databases. The main aim is to show how rough set and rough set analysis can be effectively used to extract knowledge from large databases.

Keywords: Data mining, Data tables, Knowledge discovery in database (KDD), Rough sets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2326
12167 Aggregate Angularity on the Permanent Deformation Zones of Hot Mix Asphalt

Authors: Lee P. Leon, Raymond Charles

Abstract:

This paper presents a method of evaluating the effect of aggregate angularity on hot mix asphalt (HMA) properties and its relationship to the Permanent Deformation resistance. The research concluded that aggregate particle angularity had a significant effect on the Permanent Deformation performance, and also that with an increase in coarse aggregate angularity there was an increase in the resistance of mixes to Permanent Deformation. A comparison between the measured data and predictive data of permanent deformation predictive models showed the limits of existing prediction models. The numerical analysis described the permanent deformation zones and concluded that angularity has an effect of the onset of these zones. Prediction of permanent deformation help road agencies and by extension economists and engineers determine the best approach for maintenance, rehabilitation, and new construction works of the road infrastructure.

Keywords: Aggregate angularity, asphalt concrete, permanent deformation, rutting prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2074
12166 Direct Design of Steel Bridge Using Nonlinear Inelastic Analysis

Authors: Boo-Sung Koh, Seung-Eock Kim

Abstract:

In this paper, a direct design using a nonlinear inelastic analysis is suggested. Also, this paper compares the load carrying capacity obtained by a nonlinear inelastic analysis with experiment results to verify the accuracy of the results. The allowable stress design results of a railroad through a plate girder bridge and the safety factor of the nonlinear inelastic analysis were compared to examine the safety performance. As a result, the load safety factor for the nonlinear inelastic analysis was twice as high as the required safety factor under the allowable stress design standard specified in the civil engineering structure design standards for urban magnetic levitation railways, which further verified the advantages of the proposed direct design method.

Keywords: Direct design, nonlinear inelastic analysis, residual stress, initial geometric imperfection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1447
12165 Performance Analysis of Artificial Neural Network with Decision Tree in Prediction of Diabetes Mellitus

Authors: J. K. Alhassan, B. Attah, S. Misra

Abstract:

Human beings have the ability to make logical decisions. Although human decision - making is often optimal, it is insufficient when huge amount of data is to be classified. Medical dataset is a vital ingredient used in predicting patient’s health condition. In other to have the best prediction, there calls for most suitable machine learning algorithms. This work compared the performance of Artificial Neural Network (ANN) and Decision Tree Algorithms (DTA) as regards to some performance metrics using diabetes data. WEKA software was used for the implementation of the algorithms. Multilayer Perceptron (MLP) and Radial Basis Function (RBF) were the two algorithms used for ANN, while RegTree and LADTree algorithms were the DTA models used. From the results obtained, DTA performed better than ANN. The Root Mean Squared Error (RMSE) of MLP is 0.3913 that of RBF is 0.3625, that of RepTree is 0.3174 and that of LADTree is 0.3206 respectively.

Keywords: Artificial neural network, classification, decision tree, diabetes mellitus.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2412
12164 Systematic Mapping Study of Digitization and Analysis of Manufacturing Data

Authors: R. Clancy, M. Ahern, D. O’Sullivan, K. Bruton

Abstract:

The manufacturing industry is currently undergoing a digital transformation as part of the mega-trend Industry 4.0. As part of this phase of the industrial revolution, traditional manufacturing processes are being combined with digital technologies to achieve smarter and more efficient production. To successfully digitally transform a manufacturing facility, the processes must first be digitized. This is the conversion of information from an analogue format to a digital format. The objective of this study was to explore the research area of digitizing manufacturing data as part of the worldwide paradigm, Industry 4.0. The formal methodology of a systematic mapping study was utilized to capture a representative sample of the research area and assess its current state. Specific research questions were defined to assess the key benefits and limitations associated with the digitization of manufacturing data. Research papers were classified according to the type of research and type of contribution to the research area. Upon analyzing 54 papers identified in this area, it was noted that 23 of the papers originated in Germany. This is an unsurprising finding as Industry 4.0 is originally a German strategy with supporting strong policy instruments being utilized in Germany to support its implementation. It was also found that the Fraunhofer Institute for Mechatronic Systems Design, in collaboration with the University of Paderborn in Germany, was the most frequent contributing Institution of the research papers with three papers published. The literature suggested future research directions and highlighted one specific gap in the area. There exists an unresolved gap between the data science experts and the manufacturing process experts in the industry. The data analytics expertise is not useful unless the manufacturing process information is utilized. A legitimate understanding of the data is crucial to perform accurate analytics and gain true, valuable insights into the manufacturing process. There lies a gap between the manufacturing operations and the information technology/data analytics departments within enterprises, which was borne out by the results of many of the case studies reviewed as part of this work. To test the concept of this gap existing, the researcher initiated an industrial case study in which they embedded themselves between the subject matter expert of the manufacturing process and the data scientist. Of the papers resulting from the systematic mapping study, 12 of the papers contributed a framework, another 12 of the papers were based on a case study, and 11 of the papers focused on theory. However, there were only three papers that contributed a methodology. This provides further evidence for the need for an industry-focused methodology for digitizing and analyzing manufacturing data, which will be developed in future research.

Keywords: Analytics, digitization, industry 4.0, manufacturing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 719
12163 Data Oriented Model of Image: as a Framework for Image Processing

Authors: A. Habibizad Navin, A. Sadighi, M. Naghian Fesharaki, M. Mirnia, M. Teshnelab, R. Keshmiri

Abstract:

This paper presents a new data oriented model of image. Then a representation of it, ADBT, is introduced. The ability of ADBT is clustering, segmentation, measuring similarity of images etc, with desired precision and corresponding speed.

Keywords: Data oriented modelling, image, clustering, segmentation, classification, ADBT and image processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1789
12162 MIBiClus: Mutual Information based Biclustering Algorithm

Authors: Neelima Gupta, Seema Aggarwal

Abstract:

Most of the biclustering/projected clustering algorithms are based either on the Euclidean distance or correlation coefficient which capture only linear relationships. However, in many applications, like gene expression data and word-document data, non linear relationships may exist between the objects. Mutual Information between two variables provides a more general criterion to investigate dependencies amongst variables. In this paper, we improve upon our previous algorithm that uses mutual information for biclustering in terms of computation time and also the type of clusters identified. The algorithm is able to find biclusters with mixed relationships and is faster than the previous one. To the best of our knowledge, none of the other existing algorithms for biclustering have used mutual information as a similarity measure. We present the experimental results on synthetic data as well as on the yeast expression data. Biclusters on the yeast data were found to be biologically and statistically significant using GO Tool Box and FuncAssociate.

Keywords: Biclustering, mutual information.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1623
12161 Spatio-Temporal Data Mining with Association Rules for Lake Van

Authors: T. Aydin, M. F. Alaeddinoglu

Abstract:

People, throughout the history, have made estimates and inferences about the future by using their past experiences. Developing information technologies and the improvements in the database management systems make it possible to extract useful information from knowledge in hand for the strategic decisions. Therefore, different methods have been developed. Data mining by association rules learning is one of such methods. Apriori algorithm, one of the well-known association rules learning algorithms, is not commonly used in spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make Apriori algorithm a suitable data mining technique for learning spatiotemporal association rules. Lake Van, the largest lake of Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as a result of change in water amount it holds. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in Lake Van region throughout the years are used by the Apriori algorithm and a spatio-temporal data mining application is developed to identify overflows and newlyformed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons of overflows and underflows may be used to alert the experts to take precautions and make the necessary investments.

Keywords: Apriori algorithm, association rules, data mining, spatio-temporal data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1399
12160 Exploring the Destination Image of Mainland China Tourists to Taiwan by Word-of-Mouth on Web

Authors: Y. R. Li, Y. Y. Wang

Abstract:

After allowing direct flights from Mainland China to Taiwan, Chinese tourists increased according to Tourism Bureaustatistics. There are from 0.19 to 2 million tourists from 2008 to 2011. Mainland China has become the main source of Taiwan developing tourism industry. Taiwanese government should know more about comments from Chinese tourists to Taiwan in order toproperly market Taiwan tourism and enhance the overall quality of tourism. In order to understand Chinese visitors’ comments, this study adopts content analysis to analyze electronic word-of-mouth on Web. This study collects 375 blog articles of Chinese tourists from Ctrip.com as a database during 2009 to 2011. Through the qualitative data analysis the traveling destination imagesis divided into seven dimensions, such as senic spots, shopping, food and beverages, accommodations, transportation, festivals and recreation activities. Finally, this study proposes some practical managerial implication to know both positive and negative images of the seven dimensions from Chinese tourists, providing marketing strategies and suggestions to traveling agency industry.

Keywords: Destination Image, Content Analysis, Electronic Word-of-Mouth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2533
12159 Data Extraction of XML Files using Searching and Indexing Techniques

Authors: Sushma Satpute, Vaishali Katkar, Nilesh Sahare

Abstract:

XML files contain data which is in well formatted manner. By studying the format or semantics of the grammar it will be helpful for fast retrieval of the data. There are many algorithms which describes about searching the data from XML files. There are no. of approaches which uses data structure or are related to the contents of the document. In these cases user must know about the structure of the document and information retrieval techniques using NLPs is related to content of the document. Hence the result may be irrelevant or not so successful and may take more time to search.. This paper presents fast XML retrieval techniques by using new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns the value to each tag from file. To query the system, a user is not constrained about fixed format of query.

Keywords: XML Retrieval, Indexed Search, Information Retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1775
12158 Leveraging xAPI in a Corporate e-Learning Environment to Facilitate the Tracking, Modelling, and Predictive Analysis of Learner Behaviour

Authors: Libor Zachoval, Daire O Broin, Oisin Cawley

Abstract:

E-learning platforms, such as Blackboard have two major shortcomings: limited data capture as a result of the limitations of SCORM (Shareable Content Object Reference Model), and lack of incorporation of Artificial Intelligence (AI) and machine learning algorithms which could lead to better course adaptations. With the recent development of Experience Application Programming Interface (xAPI), a large amount of additional types of data can be captured and that opens a window of possibilities from which online education can benefit. In a corporate setting, where companies invest billions on the learning and development of their employees, some learner behaviours can be troublesome for they can hinder the knowledge development of a learner. Behaviours that hinder the knowledge development also raise ambiguity about learner’s knowledge mastery, specifically those related to gaming the system. Furthermore, a company receives little benefit from their investment if employees are passing courses without possessing the required knowledge and potential compliance risks may arise. Using xAPI and rules derived from a state-of-the-art review, we identified three learner behaviours, primarily related to guessing, in a corporate compliance course. The identified behaviours are: trying each option for a question, specifically for multiple-choice questions; selecting a single option for all the questions on the test; and continuously repeating tests upon failing as opposed to going over the learning material. These behaviours were detected on learners who repeated the test at least 4 times before passing the course. These findings suggest that gauging the mastery of a learner from multiple-choice questions test scores alone is a naive approach. Thus, next steps will consider the incorporation of additional data points, knowledge estimation models to model knowledge mastery of a learner more accurately, and analysis of the data for correlations between knowledge development and identified learner behaviours. Additional work could explore how learner behaviours could be utilised to make changes to a course. For example, course content may require modifications (certain sections of learning material may be shown to not be helpful to many learners to master the learning outcomes aimed at) or course design (such as the type and duration of feedback).

Keywords: Compliance Course, Corporate Training, Learner Behaviours, xAPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 558
12157 WormHex: A Volatile Memory Analysis Tool for Retrieval of Social Media Evidence

Authors: Norah Almubairik, Wadha Almattar, Amani Alqarni

Abstract:

Social media applications are increasingly being used in our everyday communications. These applications utilise end-to-end encryption mechanisms which make them suitable tools for criminals to exchange messages. These messages are preserved in the volatile memory until the device is restarted. Therefore, volatile forensics has become an important branch of digital forensics. In this study, the WormHex tool was developed to inspect the memory dump files for Windows and Mac based workstations. The tool supports digital investigators by enabling them to extract valuable data written in Arabic and English through web-based WhatsApp and Twitter applications. The results confirm that social media applications write their data into the memory, regardless of the operating system running the application, with there being no major differences between Windows and Mac.

Keywords: Volatile memory, REGEX, digital forensics, memory acquisition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 905
12156 Case-Based Reasoning: A Hybrid Classification Model Improved with an Expert's Knowledge for High-Dimensional Problems

Authors: Bruno Trstenjak, Dzenana Donko

Abstract:

Data mining and classification of objects is the process of data analysis, using various machine learning techniques, which is used today in various fields of research. This paper presents a concept of hybrid classification model improved with the expert knowledge. The hybrid model in its algorithm has integrated several machine learning techniques (Information Gain, K-means, and Case-Based Reasoning) and the expert’s knowledge into one. The knowledge of experts is used to determine the importance of features. The paper presents the model algorithm and the results of the case study in which the emphasis was put on achieving the maximum classification accuracy without reducing the number of features.

Keywords: Case based reasoning, classification, expert's knowledge, hybrid model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1413
12155 A Modified Run Length Coding Technique for Test Data Compression Based on Multi-Level Selective Huffman Coding

Authors: C. Kalamani, K. Paramasivam

Abstract:

Test data compression is an efficient method for reducing the test application cost. The problem of reducing test data has been addressed by researchers in three different aspects: Test Data Compression, Built-in-Self-Test (BIST) and Test set compaction. The latter two methods are capable of enhancing fault coverage with cost of hardware overhead. The drawback of the conventional methods is that they are capable of reducing the test storage and test power but when test data have redundant length of runs, no additional compression method is followed. This paper presents a modified Run Length Coding (RLC) technique with Multilevel Selective Huffman Coding (MLSHC) technique to reduce test data volume, test pattern delivery time and power dissipation in scan test applications where redundant length of runs is encountered then the preceding run symbol is replaced with tiny codeword. Experimental results show that the presented method not only improves the test data compression but also reduces the overall test data volume compared to recent schemes. Experiments for the six largest ISCAS-98 benchmarks show that our method outperforms most known techniques.

Keywords: Modified run length coding, multilevel selective Huffman coding, built-in-self-test modified selective Huffman coding, automatic test equipment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1266
12154 Comparison of Different Methods to Produce Fuzzy Tolerance Relations for Rainfall Data Classification in the Region of Central Greece

Authors: N. Samarinas, C. Evangelides, C. Vrekos

Abstract:

The aim of this paper is the comparison of three different methods, in order to produce fuzzy tolerance relations for rainfall data classification. More specifically, the three methods are correlation coefficient, cosine amplitude and max-min method. The data were obtained from seven rainfall stations in the region of central Greece and refers to 20-year time series of monthly rainfall height average. Three methods were used to express these data as a fuzzy relation. This specific fuzzy tolerance relation is reformed into an equivalence relation with max-min composition for all three methods. From the equivalence relation, the rainfall stations were categorized and classified according to the degree of confidence. The classification shows the similarities among the rainfall stations. Stations with high similarity can be utilized in water resource management scenarios interchangeably or to augment data from one to another. Due to the complexity of calculations, it is important to find out which of the methods is computationally simpler and needs fewer compositions in order to give reliable results.

Keywords: Classification, fuzzy logic, tolerance relations, rainfall data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1018
12153 The Influences of Accountants’ Potential Performance on Their Working Process: Government Savings Bank, Northeast, Thailand

Authors: Prateep Wajeetongratana

Abstract:

The purpose of this research was to study the influence of accountants’ potential performance on their working process, a case study of Government Savings Banks in the northeast of Thailand. The independent variables included accounting knowledge, accounting skill, accounting value, accounting ethics, and accounting attitude, while the dependent variable included the success of the working process. A total of 155 accountants working for Government Savings Banks were selected by random sampling. A questionnaire was used as a tool for collecting data. Descriptive statistics in this research included percentage, mean, and multiple regression analyses.

The findings revealed that the majority of accountants were female with an age between 35-40 years old. Most of the respondents had an undergraduate degree with ten years of experience. Moreover, the factors of accounting knowledge, accounting skill, accounting a value and accounting ethics and accounting attitude were rated at a high level. The findings from regression analysis of observation data revealed a causal relationship in that the observation data could explain at least 51 percent of the success in the accountants’ working process.

Keywords: Influence, Potential Performance, Success, Working Process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1383