Search results for: analysis of scientific data

13499 Using the Combined Model of PROMETHEE and Fuzzy Analytic Network Process for Determining Question Weights in Scientific Exams through Data Mining Approach

Authors: Hassan Haleh, Amin Ghaffari, Parisa Farahpour

Abstract:

Need for an appropriate system of evaluating students- educational developments is a key problem to achieve the predefined educational goals. Intensity of the related papers in the last years; that tries to proof or disproof the necessity and adequacy of the students assessment; is the corroborator of this matter. Some of these studies tried to increase the precision of determining question weights in scientific examinations. But in all of them there has been an attempt to adjust the initial question weights while the accuracy and precision of those initial question weights are still under question. Thus In order to increase the precision of the assessment process of students- educational development, the present study tries to propose a new method for determining the initial question weights by considering the factors of questions like: difficulty, importance and complexity; and implementing a combined method of PROMETHEE and fuzzy analytic network process using a data mining approach to improve the model-s inputs. The result of the implemented case study proves the development of performance and precision of the proposed model.

Keywords: Assessing students, Analytic network process, Clustering, Data mining, Fuzzy sets, Multi-criteria decision making, and Preference function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538

13498 Bank Business Models and The Changes in CEE Countries

Authors: I. Erins, J. Erina

Abstract:

The aim of this article is to assess the existing business models used by the banks operating in the CEE countries in the time period from 2006 till 2011. In order to obtain research results, the authors performed qualitative analysis of the scientific literature on bank business models, which have been grouped into clusters that consist of such components as: 1) capital and reserves; 2) assets; 3) deposits, and 4) loans. In their turn, bank business models have been developed based on the types of core activities of the banks, and have been divided into four groups: Wholesale, Investment, Retail and Universal Banks. Descriptive statistics have been used to analyse the models, determining mean, minimal and maximal values of constituent cluster components, as well as standard deviation. The analysis of the data is based on such bank variable indices as Return on Assets (ROA) and Return on Equity (ROE).

Keywords: Banks, Business model, CEE, ROA, ROE.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1802

13497 An Investigation into the Application of Artificial Neural Networks to the Prediction of Injuries in Sport

Authors: J. McCullagh, T. Whitfort

Abstract:

Artificial Neural Networks (ANNs) have been used successfully in many scientific, industrial and business domains as a method for extracting knowledge from vast amounts of data. However the use of ANN techniques in the sporting domain has been limited. In professional sport, data is stored on many aspects of teams, games, training and players. Sporting organisations have begun to realise that there is a wealth of untapped knowledge contained in the data and there is great interest in techniques to utilise this data. This study will use player data from the elite Australian Football League (AFL) competition to train and test ANNs with the aim to predict the onset of injuries. The results demonstrate that an accuracy of 82.9% was achieved by the ANNs’ predictions across all examples with 94.5% of all injuries correctly predicted. These initial findings suggest that ANNs may have the potential to assist sporting clubs in the prediction of injuries.

Keywords: Artificial Neural Networks, data, injuries, sport

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2794

13496 Comprehensive Analysis of Data Mining Tools

Authors: S. Sarumathi, N. Shanthi

Abstract:

Due to the fast and flawless technological innovation there is a tremendous amount of data dumping all over the world in every domain such as Pattern Recognition, Machine Learning, Spatial Data Mining, Image Analysis, Fraudulent Analysis, World Wide Web etc., This issue turns to be more essential for developing several tools for data mining functionalities. The major aim of this paper is to analyze various tools which are used to build a resourceful analytical or descriptive model for handling large amount of information more efficiently and user friendly. In this survey the diverse tools are illustrated with their extensive technical paradigm, outstanding graphical interface and inbuilt multipath algorithms in which it is very useful for handling significant amount of data more indeed.

Keywords: Classification, Clustering, Data Mining, Machine learning, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2399

13495 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: Big data, correlation analysis, data recommendation system, urban data network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1057

13494 Comparison between Open and Closed System for Dewatering with Geotextile: Field and Comparative Study

Authors: Matheus Müller, Delma Vidal

Abstract:

The present paper aims to expose two techniques of dewatering for sludge, analyzing its operations and dewatering processes, aiming at improving the conditions of disposal of residues with high liquid content. It describes the field tests performed on two geotextile systems, a closed geotextile tube and an open geotextile drying bed, both of which are submitted to two filling cycles. The sludge used in the filling cycles for the field trials is from the water treatment plant of the Technological Center of Aeronautics – CTA, in São José dos Campos, Brazil. Data about volume and height abatement due to the dewatering and consolidation were collected per time, until it was observed constancy. With the laboratory analysis of the sludge allied to the data collected in the field, it was possible to perform a critical comparative study between the observed and the scientific literature, in this way, this paper expresses the data obtained and compares them with the bibliography. The tests were carried out on three fronts: field tests, including the filling cycles of the systems with the sludge from CTA, taking measurements of filling time per cycle and maximum filling height per cycle, heights against the abatement by dewatering of the systems over time; tests carried out in the laboratory, including the characterization of the sludge and removal of material samples from the systems to ascertain the solids content within the systems per time and; comparing the data obtained in the field and laboratory tests with the scientific literature. Through the study, it was possible to perceive that the process of densification of the material inside a closed system, such as the geotextile tube, occurs faster than the observed in the drying bed system. This process of accelerated densification can be brought about by the pumping pressure of the sludge in its filling and by the confinement of the residue through the permeable geotextile membrane (allowing water to pass through), accelerating the process of densification and dewatering by its own weight after the filling with sludge.

Keywords: Consolidation, dewatering, geotextile drying bed, geotextile tube.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 631

13493 An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service

Authors: Martin Lnenicka

Abstract:

Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and also launch open data portals to enable the release of these data in open and reusable formats. Therefore, a large number of open data repositories, catalogues and portals have been emerging in the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used for building useful applications which leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematic, in order to understand them better and assess the various types of value they generate, and identify the required improvements for increasing this value. Thus, the attention of this paper is directed particularly to the field of open data portals. The main aim of this paper is to compare the selected open data portals on the national level using content analysis and propose a new evaluation framework, which further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens to create eservices and applications that leverage on the datasets available from these portals.

Keywords: Big data, content analysis, criteria comparison, data quality, open data, open data portals, public sector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3021

13492 Application of Multi-Dimensional Principal Component Analysis to Medical Data

Authors: Naoki Yamamoto, Jun Murakami, Chiharu Okuma, Yutaro Shigeto, Satoko Saito, Takashi Izumi, Nozomi Hayashida

Abstract:

Multi-dimensional principal component analysis (PCA) is the extension of the PCA, which is used widely as the dimensionality reduction technique in multivariate data analysis, to handle multi-dimensional data. To calculate the PCA the singular value decomposition (SVD) is commonly employed by the reason of its numerical stability. The multi-dimensional PCA can be calculated by using the higher-order SVD (HOSVD), which is proposed by Lathauwer et al., similarly with the case of ordinary PCA. In this paper, we apply the multi-dimensional PCA to the multi-dimensional medical data including the functional independence measure (FIM) score, and describe the results of experimental analysis.

Keywords: multi-dimensional principal component analysis, higher-order SVD (HOSVD), functional independence measure (FIM), medical data, tensor decomposition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2454

13491 Analysis of Diverse Clustering Tools in Data Mining

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Clustering in data mining is an unsupervised learning technique of aggregating the data objects into meaningful groups such that the intra cluster similarity of objects are maximized and inter cluster similarity of objects are minimized. Over the past decades several clustering tools were emerged in which clustering algorithms are inbuilt and are easier to use and extract the expected results. Data mining mainly deals with the huge databases that inflicts on cluster analysis and additional rigorous computational constraints. These challenges pave the way for the emergence of powerful expansive data mining clustering softwares. In this survey, a variety of clustering tools used in data mining are elucidated along with the pros and cons of each software.

Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2153

13490 Spatial thinking Issues: Towards Rural Sociological Research Agenda in the Third Millennium

Authors: Abdel-Samad M. Ali

Abstract:

Does the spatial perspective provide a common thread for rural sociology? Have rural sociologists succeeded in bringing order to their data using spatial analysis models and techniques? A trial answer to such questions, as touchstones of theoretical and applied sociological studies in rural areas, is the point at issue in the present paper. Spatial analyses have changed the way rural sociologists approach scientific problems. Rural sociology is spatial by nature because much, if not most, of its research topics has a spatial “awareness." However, such spatial awareness is not quite the same as spatial analysis because it is not typically associated with underlying theories and hypotheses about spatial patterns that are designed to be tested for their specific spatial content. This paper presents pressing issues for future research to reintroduce mainstream rural sociology to the concept of space.

Keywords: Maps, Rural Sociology, Space, Spatial variations

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1900

13489 Non-negative Principal Component Analysis for Face Recognition

Authors: Zhang Yan, Yu Bin

Abstract:

Principle component analysis is often combined with the state-of-art classification algorithms to recognize human faces. However, principle component analysis can only capture these features contributing to the global characteristics of data because it is a global feature selection algorithm. It misses those features contributing to the local characteristics of data because each principal component only contains some levels of global characteristics of data. In this study, we present a novel face recognition approach using non-negative principal component analysis which is added with the constraint of non-negative to improve data locality and contribute to elucidating latent data structures. Experiments are performed on the Cambridge ORL face database. We demonstrate the strong performances of the algorithm in recognizing human faces in comparison with PCA and NREMF approaches.

Keywords: classification, face recognition, non-negativeprinciple component analysis (NPCA)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1663

13488 Corporate Social Responsibility and Corporate Reputation: A Bibliometric Analysis

Authors: Songdi Li, Louise Spry, Tony Woodall

Abstract:

Nowadays, Corporate Social responsibility (CSR) is becoming a buzz word, and more and more academics are putting efforts on CSR studies. It is believed that CSR could influence Corporate Reputation (CR), and they hold a favourable view that CSR leads to a positive CR. To be specific, the CSR related activities in the reputational context have been regarded as ways that associate to excellent financial performance, value creation, etc. Also, it is argued that CSR and CR are two sides of one coin; hence, to some extent, doing CSR is equal to establishing a good reputation. Still, there is no consensus of the CSR-CR relationship in the literature; thus, a systematic literature review is highly in need. This research conducts a systematic literature review with both bibliometric and content analysis. Data are selected from English language sources, and academic journal articles only, then, keyword combinations are applied to identify relevant sources. Data from Scopus and WoS are gathered for bibliometric analysis. Scopus search results were saved in RIS and CSV formats, and Web of Science (WoS) data were saved in TXT format and CSV formats in order to process data in the Bibexcel software for further analysis which later will be visualised by the software VOSviewer. Also, content analysis was applied to analyse the data clusters and the key articles. In terms of the topic of CSR-CR, this literature review with bibliometric analysis has made four achievements. First, this paper has developed a systematic study which quantitatively depicts the knowledge structure of CSR and CR by identifying terms closely related to CSR-CR (such as ‘corporate governance’) and clustering subtopics emerged in co-citation analysis. Second, content analysis is performed to acquire insight on the findings of bibliometric analysis in the discussion section. And it highlights some insightful implications for the future research agenda, for example, a psychological link between CSR-CR is identified from the result; also, emerging economies and qualitative research methods are new elements emerged in the CSR-CR big picture. Third, a multidisciplinary perspective presents through the whole bibliometric analysis mapping and co-word and co-citation analysis; hence, this work builds a structure of interdisciplinary perspective which potentially leads to an integrated conceptual framework in the future. Finally, Scopus and WoS are compared and contrasted in this paper; as a result, Scopus which has more depth and comprehensive data is suggested as a tool for future bibliometric analysis studies. Overall, this paper has fulfilled its initial purposes and contributed to the literature. To the author’s best knowledge, this paper conducted the first literature review of CSR-CR researches that applied both bibliometric analysis and content analysis; therefore, this paper achieves its methodological originality. And this dual approach brings advantages of carrying out a comprehensive and semantic exploration in the area of CSR-CR in a scientific and realistic method. Admittedly, its work might exist subjective bias in terms of search terms selection and paper selection; hence triangulation could reduce the subjective bias to some degree.

Keywords: Corporate social responsibility, corporate reputation, bibliometric analysis, software data analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 857

13487 Data and Spatial Analysis for Economy and Education of 28 E.U. Member-States for 2014

Authors: Alexiou Dimitra, Fragkaki Maria

Abstract:

The objective of the paper is the study of geographic, economic and educational variables and their contribution to determine the position of each member-state among the EU-28 countries based on the values of seven variables as given by Eurostat. The Data Analysis methods of Multiple Factorial Correspondence Analysis (MFCA) Principal Component Analysis and Factor Analysis have been used. The cross tabulation tables of data consist of the values of seven variables for the 28 countries for 2014. The data are manipulated using the CHIC Analysis V 1.1 software package. The results of this program using MFCA and Ascending Hierarchical Classification are given in arithmetic and graphical form. For comparison reasons with the same data the Factor procedure of Statistical package IBM SPSS 20 has been used. The numerical and graphical results presented with tables and graphs, demonstrate the agreement between the two methods. The most important result is the study of the relation between the 28 countries and the position of each country in groups or clouds, which are formed according to the values of the corresponding variables.

Keywords: Multiple factorial correspondence analysis, principal component analysis, factor analysis, E.U.-28 countries, statistical package IBM SPSS 20, CHIC Analysis V 1.1 Software, Eurostat.eu statistics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1041

13486 Downtrend Algorithm and Hedging Strategy in Futures Market

Authors: S. Masteika, A.V. Rutkauskas, A. Tamosaitis

Abstract:

The paper investigates downtrend algorithm and trading strategy based on chart pattern recognition and technical analysis in futures market. The proposed chart formation is a pattern with the lowest low in the middle and one higher low on each side. The contribution of this paper lies in the reinforcement of statements about the profitability of momentum trend trading strategies. Practical benefit of the research is a trading algorithm in falling markets and back-test analysis in futures markets. When based on daily data, the algorithm has generated positive results, especially when the market had downtrend period. Downtrend algorithm can be applied as a hedge strategy against possible sudden market crashes. The proposed strategy can be interesting for futures traders, hedge funds or scientific researchers performing technical or algorithmic market analysis based on momentum trend trading.

Keywords: trading algorithm, chart pattern, downtrend trading, futures market, hedging

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3312

13485 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1516

13484 Multidimensional and Data Mining Analysis for Property Investment Risk Analysis

Authors: Nur Atiqah Rochin Demong, Jie Lu, Farookh Khadeer Hussain

Abstract:

Property investment in the real estate industry has a high risk due to the uncertainty factors that will affect the decisions made and high cost. Analytic hierarchy process has existed for some time in which referred to an expert-s opinion to measure the uncertainty of the risk factors for the risk analysis. Therefore, different level of experts- experiences will create different opinion and lead to the conflict among the experts in the field. The objective of this paper is to propose a new technique to measure the uncertainty of the risk factors based on multidimensional data model and data mining techniques as deterministic approach. The propose technique consist of a basic framework which includes four modules: user, technology, end-user access tools and applications. The property investment risk analysis defines as a micro level analysis as the features of the property will be considered in the analysis in this paper.

Keywords: Uncertainty factors, data mining, multidimensional data model, risk analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2871

13483 Robust Regression and its Application in Financial Data Analysis

Authors: Mansoor Momeni, Mahmoud Dehghan Nayeri, Ali Faal Ghayoumi, Hoda Ghorbani

Abstract:

This research is aimed to describe the application of robust regression and its advantages over the least square regression method in analyzing financial data. To do this, relationship between earning per share, book value of equity per share and share price as price model and earning per share, annual change of earning per share and return of stock as return model is discussed using both robust and least square regressions, and finally the outcomes are compared. Comparing the results from the robust regression and the least square regression shows that the former can provide the possibility of a better and more realistic analysis owing to eliminating or reducing the contribution of outliers and influential data. Therefore, robust regression is recommended for getting more precise results in financial data analysis.

Keywords: Financial data analysis, Influential data, Outliers, Robust regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1890

13482 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Keywords: Big Data, Social Networks, Sentiment Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4289

13481 Encouraging the Development of Scientific Literacy in Early Childhood Institutions: Croatian Experience

Authors: L. Vujičić, Ž. Ivković, Ž. Boneta

Abstract:

There is a widespread belief in everyday discourse that science subjects (physics, chemistry and biology) are, along with math, the most difficult school subjects in the education of an individual. This assumption is usually justified by the following facts: low GPA in these subjects, the number of pupils who fail these subjects is high in comparison to other subjects, and the number of pupils interested in continuing their studies in the fields with a focus on science subjects is lower compared to non-science-oriented fields. From that perspective, the project: “Could it be different? How do children explore it?” becomes extremely interesting because it is focused on young children and on the introduction of new methods, with aim of arousing interest in scientific literacy development in 10 kindergartens by applying the methodology of an action research, with an ethnographic approach. We define scientific literacy as a process of encouraging and nurturing the research and explorative spirit in children, as well as their natural potential and abilities that represent an object of scientific research: to learn about exploration by conducting exploration. Upon project completion, an evaluation questionnaire was created for the parents of the children who had participated in the project, as well as for those whose children had not been involved in the project. The purpose of the first questionnaire was to examine the level of satisfaction with the project implementation and its outcomes among those parents whose children had been involved in the project (N=142), while the aim of the second questionnaire was to find out how much the parents of the children not involved (N=154) in this activity were interested in this topic.

Keywords: Documenting, early childhood education, evaluation questionnaire for parents, scientific literacy development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1348

13480 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists

Authors: George E. Tsekouras, Evi Sampanikou

Abstract:

We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory since the determination of any classes of similarities in such kind of data will provide to art-historians very fruitful information of Comics Art-s evolution. To establish this, we use a categorical data set and we study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared each other, while interpretations of the clustering results are also given.

Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1602

13479 Application of Scientific Metrics to Evaluate Academic Reputation in Different Research Areas

Authors: Cristiano R. Cervi, Renata Galante, José Palazzo M. de Oliveira

Abstract:

In this paper, we address the problem of identifying academic reputation of researchers using scientific metrics in different research areas. Due to the characteristics of each area, researchers can present different behaviors. In previous work, we define Rep-Index that makes use of a profile template to individually identify the reputation of researchers. The Rep-Index is comprehensive and adaptive because involves hole trajectory of the researcher built throughout his career and can be used in different areas and in different contexts. Now, we compare our metric (Rep-Index) with the h-index and the g-index through experiments with researchers in the fields of Economics, Dentistry and Computer Science. We analyze the trajectory of 830 Brazilian researchers from the National Council of Technological and Scientific Development (CNPq), which receive grants research productivity. The grants are aimed at productivity researchers that stand out among their peers, enhancing their scientific normative criteria established by CNPq. Of the 830 researchers, 210 are in the area of Economics, 216 of Dentistry e 404 of Computer Science. The experiments show that our metric is strongly correlated with h-index, g-index and CNPq ranking. We also show good results for our hypothesis that our metric can be used to evaluate research in several areas. We apply our metric (Rep-Index) to compare the behavior of researchers in relation to their h-index and g-index through extensive experiments. The experiments showed that our metric is strongly correlated with h-index, g-index and CNPq ranking.

Keywords: Researcher reputation, profile model, scientific metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1948

13478 A Data Mining Model for Detecting Financial and Operational Risk Indicators of SMEs

Authors: Ali Serhan Koyuncugil, Nermin Ozgulbas

Abstract:

In this paper, a data mining model to SMEs for detecting financial and operational risk indicators by data mining is presenting. The identification of the risk factors by clarifying the relationship between the variables defines the discovery of knowledge from the financial and operational variables. Automatic and estimation oriented information discovery process coincides the definition of data mining. During the formation of model; an easy to understand, easy to interpret and easy to apply utilitarian model that is far from the requirement of theoretical background is targeted by the discovery of the implicit relationships between the data and the identification of effect level of every factor. In addition, this paper is based on a project which was funded by The Scientific and Technological Research Council of Turkey (TUBITAK).

Keywords: Risk Management, Financial Risk, Operational Risk, Financial Early Warning System, Data Mining, CHAID Decision Tree Algorithm, SMEs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3074

13477 A Failure Analysis Tool for HDD Analysis

Authors: C. Kumjeera, T. Unchim, B. Marungsri, A. Oonsivilai

Abstract:

The study of piezoelectric material in the past was in T-Domain form; however, no one has studied piezoelectric material in the S-Domain form. This paper will present the piezoelectric material in the transfer function or S-Domain model. S-Domain is a well known mathematical model, used for analyzing the stability of the material and determining the stability limits. By using S-Domain in testing stability of piezoelectric material, it will provide a new tool for the scientific world to study this material in various forms.

Keywords: Hard disk drive, failure analysis, tool, time

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2698

13476 A Monte Carlo Method to Data Stream Analysis

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop, Pairote Sattayatham

Abstract:

Data stream analysis is the process of computing various summaries and derived values from large amounts of data which are continuously generated at a rapid rate. The nature of a stream does not allow a revisit on each data element. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms to balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. These techniques can be categorized as either dataoriented or task-oriented. The data-oriented approach analyzes a subset of data or a smaller transformed representation, whereas taskoriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to tackle the data stream analysis problem. The data stream has been both statistically transformed to a smaller size and computationally approximated its characteristics. We adopt a Monte Carlo method in the approximation step. The data reduction has been performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed by a series of experiments. We apply our algorithm on clustering and classification tasks to evaluate the utility of our approach.

Keywords: Data Stream, Monte Carlo, Sampling, DensityEstimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1383

13475 Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Keywords: Die-Map Clustering, Feature Extraction, Pattern Recognition, Semiconductor Manufacturing Process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3097

13474 Multidimensional Visualization Tools for Analysis of Expression Data

Authors: Urska Cvek, Marjan Trutschl, Randolph Stone II, Zanobia Syed, John L. Clifford, Anita L. Sabichi

Abstract:

Expression data analysis is based mostly on the statistical approaches that are indispensable for the study of biological systems. Large amounts of multidimensional data resulting from the high-throughput technologies are not completely served by biostatistical techniques and are usually complemented with visual, knowledge discovery and other computational tools. In many cases, in biological systems we only speculate on the processes that are causing the changes, and it is the visual explorative analysis of data during which a hypothesis is formed. We would like to show the usability of multidimensional visualization tools and promote their use in life sciences. We survey and show some of the multidimensional visualization tools in the process of data exploration, such as parallel coordinates and radviz and we extend them by combining them with the self-organizing map algorithm. We use a time course data set of transitional cell carcinoma of the bladder in our examples. Analysis of data with these tools has the potential to uncover additional relationships and non-trivial structures.

Keywords: microarrays, visualization, parallel coordinates, radviz, self-organizing maps.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2469

13473 Knowledge Impact on Measurement: A Conceptual Metric for Evaluating Performance Improvement (PI) at the Kuwait Institute for Scientific Research (KISR)

Authors: AlMatrouk H. S., Juszczak M. D.

Abstract:

Research and development R&D work involves enormous amount of work that has to do with data measurement and collection. This process evolves as new information is fed, new technologies are utilized, and eventually new knowledge is created by the stakeholders i.e., researchers, clients, and end-users. When new knowledge is created, procedures of R&D work should evolve and produce better results within improved research skills and improved methods of data measurements and collection. This measurement improvement should then be benchmarked against a metric that should be developed at the organization. In this paper, we are suggesting a conceptual metric for R&D work performance improvement (PI) at the Kuwait Institute for Scientific Research (KISR). This PI is to be measured against a set of variables in the suggested metric, which are more closely correlated to organizational output, as opposed to organizational norms. The paper also mentions and discusses knowledge creation and management as an addedvalue to R&D work and measurement improvement. The research methodology followed in this work is qualitative in nature, based on a survey that was distributed to researchers and interviews held with senior researchers at KISR. Research and analyses in this paper also include looking at and analyzing KISR-s literature.

Keywords: Knowledge Creation, Performance Improvement (PI), Conceptual Metric, Knowledge Management (KM) addedvalue.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1215

13472 Landscape Data Transformation: Categorical Descriptions to Numerical Descriptors

Authors: Dennis A. Apuan

Abstract:

Categorical data based on description of the agricultural landscape imposed some mathematical and analytical limitations. This problem however can be overcome by data transformation through coding scheme and the use of non-parametric multivariate approach. The present study describes data transformation from qualitative to numerical descriptors. In a collection of 103 random soil samples over a 60 hectare field, categorical data were obtained from the following variables: levels of nitrogen, phosphorus, potassium, pH, hue, chroma, value and data on topography, vegetation type, and the presence of rocks. Categorical data were coded, and Spearman-s rho correlation was then calculated using PAST software ver. 1.78 in which Principal Component Analysis was based. Results revealed successful data transformation, generating 1030 quantitative descriptors. Visualization based on the new set of descriptors showed clear differences among sites, and amount of variation was successfully measured. Possible applications of data transformation are discussed.

Keywords: data transformation, numerical descriptors, principalcomponent analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1465

13471 Multi-Dimensional Concerns Mining for Web Applications via Concept-Analysis

Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini

Abstract:

Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering), the scientific community has focused attention to Web applications design, development, analysis, and testing, by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web Applications, based on concepts analysis, impact analysis, and token-based concern identification. This approach lets the user to analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, to document, understand and test Web applications. This technique was developed in the context of WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.

Keywords: Concepts Analysis, Concerns Mining, Multi-Dimensional Separation of Concerns, Impact Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1428

13470 Open Science Philosophy and Paradigm of Scientific Research

Authors: C. Ardil

Abstract:

This paper presents the open science philosophy and paradigm of scientific research on how to transform classical research and innovation approaches. Open science is the practice of providing free and unrestricted online access to the products of scholarly research. Open science advocates for the immediate and unrestricted online access to published, peer-reviewed research in digital format. Open science research is made available for free in perpetuity and includes guidelines and/or licenses that communicate how researchers and readers can share and re-use the digital content. The emergence of open science has changed the scholarly research and publishing landscape, making research more broadly accessible to academic and non-academic audiences alike. Consequently, open science philosophy and its practice are discussed to cover all aspects of cyberscience in the context of research and innovation excellence for the benefit of global society.

Keywords: Open science, open data, open access, cyberscience , cybertechnology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 617