Search results for: genomic data analysis
12930 Banks Profitability Indicators in CEE Countries
Abstract:
The aim of the present article is to determine the impact of the external and internal factors of bank performance on the profitability indicators of the CEE countries banks in the period from 2006 to 2012. On the basis of research conducted abroad on bank and macroeconomic profitability indicators, in order to obtain research results, the authors evaluated return on average assets (ROAA) and return on average equity (ROAE) indicators of the CEE countries banks. The authors analyzed profitability indicators of banks using descriptive methods, SPSS data analysis methods, as well as data correlation and linear regression analysis. The authors concluded that most internal and external indicators of bank performance have no direct influence the profitability of the banks in the CEE countries. The only exceptions are credit risk and bank size, which affect one of the measures of bank profitability – return on average equity.
Keywords: Banks, CEE countries, Profitability ROAA, ROAE.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 266212929 1/Sigma Term Weighting Scheme for Sentiment Analysis
Authors: Hanan Alshaher, Jinsheng Xu
Abstract:
Large amounts of data on the web can provide valuable information. For example, product reviews help business owners measure customer satisfaction. Sentiment analysis classifies texts into two polarities: positive and negative. This paper examines movie reviews and tweets using a new term weighting scheme, called one-over-sigma (1/sigma), on benchmark datasets for sentiment classification. The proposed method aims to improve the performance of sentiment classification. The results show that 1/sigma is more accurate than the popular term weighting schemes. In order to verify if the entropy reflects the discriminating power of terms, we report a comparison of entropy values for different term weighting schemes.
Keywords: Sentiment analysis, term weighting scheme, 1/sigma.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 53312928 Analysis of Train Passenger Seat Using Ergonomic Function Deployment Method
Authors: Robertoes K. K. Wibowo, Siswoyo Soekarno, Irma Puspitasari
Abstract:
Indonesian people use trains for their transportation, especially they use economy class train transportation because it is cheaper and has a more precise schedule than any other ground transportation. Nevertheless, the economy class passenger seat raises some inconvenience issues for passengers. This is due to the design of the chair on the economic class of trains that did not adjusted to the shape of anthropometry of Indonesian people. Thus, research needs to be conducted on the design of the seats in the economic class of trains. The purpose of this research is to make the design of economy class passenger seats ergonomic. This research method uses questionnaires and anthropometry measurements. The data obtained is processed using House of Quality of Ergonomic Function Development. From the results of analysis and data processing were obtained important changes from the original design. Ergonomic chair design according to the analysis is a stainless steel frame, seat height 390 mm, with a seat width for each passenger of 400 mm and a depth of 400 mm. Design of the backrest has a height of 840 mm, width of 430 mm and length of 300 mm that can move at the angle of 105-115 degrees. The width of the footrest is 42 mm and 400 mm length. The thickness of the seat cushion is 100 mm.
Keywords: Chair, ergonomics, function development, train passenger.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 182812927 VaR Forecasting in Times of Increased Volatility
Authors: Ivo Jánský, Milan Rippel
Abstract:
The paper evaluates several hundred one-day-ahead VaR forecasting models in the time period between the years 2004 and 2009 on data from six world stock indices - DJI, GSPC, IXIC, FTSE, GDAXI and N225. The models model mean using the ARMA processes with up to two lags and variance with one of GARCH, EGARCH or TARCH processes with up to two lags. The models are estimated on the data from the in-sample period and their forecasting accuracy is evaluated on the out-of-sample data, which are more volatile. The main aim of the paper is to test whether a model estimated on data with lower volatility can be used in periods with higher volatility. The evaluation is based on the conditional coverage test and is performed on each stock index separately. The primary result of the paper is that the volatility is best modelled using a GARCH process and that an ARMA process pattern cannot be found in analyzed time series.Keywords: VaR, risk analysis, conditional volatility, garch, egarch, tarch, moving average process, autoregressive process
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 143012926 Addressing Data Security in the Cloud
Authors: Marinela Mircea
Abstract:
The development of information and communication technology, the increased use of the internet, as well as the effects of the recession within the last years, have lead to the increased use of cloud computing based solutions, also called on-demand solutions. These solutions offer a large number of benefits to organizations as well as challenges and risks, mainly determined by data visualization in different geographic locations on the internet. As far as the specific risks of cloud environment are concerned, data security is still considered a peak barrier in adopting cloud computing. The present study offers an approach upon ensuring the security of cloud data, oriented towards the whole data life cycle. The final part of the study focuses on the assessment of data security in the cloud, this representing the bases in determining the potential losses and the premise for subsequent improvements and continuous learning.Keywords: cloud computing, data life cycle, data security, security assessment.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 216112925 Analysis of the Structural Fluctuation of the Permitted Building Areas and Housing Distribution Ratios - Focused on 5 Cities Including Bucheon
Authors: Cheon Sik Min, Hyeong Wook Song, Sook Yeon Shim, Hoon Chang
Abstract:
The purpose of this study was to analyze the correlation between permitted building areas and housing distribution ratios and their fluctuation, and test a distribution model during 3 successive governments in 5 cities including Bucheon in reference to the time series administrative data, and thereby, interpret the results of the analysis in association with the policies pursued by the successive governments to examine the structural fluctuation of permitted building areas and housing distribution ratios. In order to analyze the fluctuation of permitted building areas and housing distribution ratios during 3 successive governments and examine the cycles of the time series data, the spectral analysis was performed, and in order to analyze the correlation between permitted building areas and housing distribution ratios, the tabulation was performed to describe the correlations statistically, and in order to explain about differences of fluctuation distribution of permitted building areas and housing distribution ratios among 3 governments, the goodness of fit test was conducted.Keywords: The Permitted Building Areas, Housing Distribution Ratios, the Structural Fluctuation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 119412924 Design and Implementation of Security Middleware for Data Warehouse Signature Framework
Authors: Mayada AlMeghari
Abstract:
Recently, grid middlewares have provided large integrated use of network resources as the shared data and the CPU to become a virtual supercomputer. In this work, we present the design and implementation of the middleware for Data Warehouse Signature (DWS) Framework. The aim of using the middleware in the proposed DWS framework is to achieve the high performance by the parallel computing. This middleware is developed on Alchemi.Net framework to increase the security among the network nodes through the authentication and group-key distribution model. This model achieves the key security and prevents any intermediate attacks in the middleware. This paper presents the flow process structures of the middleware design. In addition, the paper ensures the implementation of security for DWS middleware enhancement with the authentication and group-key distribution model. Finally, from the analysis of other middleware approaches, the developed middleware of DWS framework is the optimal solution of a complete covering of security issues.
Keywords: Middleware, parallel computing, data warehouse, security, group-key, high performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33812923 Analysis of Mathematical Models and Their Application to Extreme Events
Authors: Avellino I. Mondlane, Karin Hansson, Oliver Popov
Abstract:
This paper discusses the application of extreme events distribution taking the Limpopo River Basin at Xai-Xai station, in Mozambique, as a case analysis. We analyze the extreme value concepts, namely Gumbel, Fréchet, Weibull and Generalized Extreme Value Distributions and then extrapolate the original data to 1000, 5000 and 10000 figures for further simulations and we compare their outcomes based on these three main distributions.
Keywords: Catastrophes, extreme event, disasters, mathematical models, simulation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 252112922 A Network Traffic Prediction Algorithm Based On Data Mining Technique
Authors: D. Prangchumpol
Abstract:
This paper is a description approach to predict incoming and outgoing data rate in network system by using association rule discover, which is one of the data mining techniques. Information of incoming and outgoing data in each times and network bandwidth are network performance parameters, which needed to solve in the traffic problem. Since congestion and data loss are important network problems. The result of this technique can predicted future network traffic. In addition, this research is useful for network routing selection and network performance improvement.
Keywords: Traffic prediction, association rule, data mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 366912921 Fuzzy Processing of Uncertain Data
Authors: Petr Morávek, Miloš Šeda
Abstract:
In practice, we often come across situations where it is necessary to make decisions based on incomplete or uncertain data. In control systems it may be due to the unknown exact mathematical model, or its excessive complexity (e.g. nonlinearity) when it is necessary to simplify it, respectively, to solve it using a rule base. In the case of databases, searching data we compare a similarity measure with of the requirements of the selection with stored data, where both the select query and the data itself may contain vague terms, for example in the form of linguistic qualifiers. In this paper, we focus on the processing of uncertain data in databases and demonstrate it on the example multi-criteria decision making in the selection of variants, specified by higher number of technical parameters.Keywords: fuzzy logic, linguistic variable, multicriteria decision
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 141812920 Tools for Analysis and Optimization of Standalone Green Microgrids
Authors: William Anderson, Kyle Kobold, Oleg Yakimenko
Abstract:
Green microgrids using mostly renewable energy (RE) for generation, are complex systems with inherent nonlinear dynamics. Among a variety of different optimization tools there are only a few ones that adequately consider this complexity. This paper evaluates applicability of two somewhat similar optimization tools tailored for standalone RE microgrids and also assesses a machine learning tool for performance prediction that can enhance the reliability of any chosen optimization tool. It shows that one of these microgrid optimization tools has certain advantages over another and presents a detailed routine of preparing input data to simulate RE microgrid behavior. The paper also shows how neural-network-based predictive modeling can be used to validate and forecast solar power generation based on weather time series data, which improves the overall quality of standalone RE microgrid analysis.Keywords: Microgrid, renewable energy, complex systems, optimization, predictive modeling, neural networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 106012919 Fuzzy Multi-Component DEA with Shared and Undesirable Fuzzy Resources
Authors: Jolly Puri, Shiv Prasad Yadav
Abstract:
Multi-component data envelopment analysis (MC-DEA) is a popular technique for measuring aggregate performance of the decision making units (DMUs) along with their components. However, the conventional MC-DEA is limited to crisp input and output data which may not always be available in exact form. In real life problems, data may be imprecise or fuzzy. Therefore, in this paper, we propose (i) a fuzzy MC-DEA (FMC-DEA) model in which shared and undesirable fuzzy resources are incorporated, (ii) the proposed FMC-DEA model is transformed into a pair of crisp models using α cut approach, (iii) fuzzy aggregate performance of a DMU and fuzzy efficiencies of components are defined to be fuzzy numbers, and (iv) a numerical example is illustrated to validate the proposed approach.
Keywords: Multi-component DEA, fuzzy multi-component DEA, fuzzy resources.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 207312918 Automated Stereophotogrammetry Data Cleansing
Authors: Stuart Henry, Philip Morrow, John Winder, Bryan Scotney
Abstract:
The stereophotogrammetry modality is gaining more widespread use in the clinical setting. Registration and visualization of this data, in conjunction with conventional 3D volumetric image modalities, provides virtual human data with textured soft tissue and internal anatomical and structural information. In this investigation computed tomography (CT) and stereophotogrammetry data is acquired from 4 anatomical phantoms and registered using the trimmed iterative closest point (TrICP) algorithm. This paper fully addresses the issue of imaging artifacts around the stereophotogrammetry surface edge using the registered CT data as a reference. Several iterative algorithms are implemented to automatically identify and remove stereophotogrammetry surface edge outliers, improving the overall visualization of the combined stereophotogrammetry and CT data. This paper shows that outliers at the surface edge of stereophotogrammetry data can be successfully removed automatically.
Keywords: Data cleansing, stereophotogrammetry.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 184212917 Constructing an Attitude Scale: Attitudes toward Violence on Televisions
Authors: Göksu Gözen Citak
Abstract:
The process of constructing a scale measuring the attitudes of youth toward violence on televisions is reported. A 30-item draft attitude scale was applied to a working group of 232 students attending the Faculty of Educational Sciences at Ankara University between the years 2005-2006. To introduce the construct validity and dimensionality of the scale, exploratory and confirmatory factor analysis was applied to the data. Results of the exploratory factor analysis showed that the scale had three factors that accounted for 58,44% (22,46% for the first, 22,15% for the second and 13,83% for the third factor) of the common variance. It is determined that the first factor considered issues related individual effects of violence on televisions, the second factor concerned issues related social effects of violence on televisions and the third factor concerned issues related violence on television programs. Results of the confirmatory factor analysis showed that all the items under each factor are fitting the concerning factors structure. An alpha reliability of 0,90 was estimated for the whole scale. It is concluded that the scale is valid and reliable.Keywords: Attitudes toward violence, confirmatory factor analysis, constructing attitude scale, exploratory factor analysis, violence on televisions.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 195712916 Feature Selection and Predictive Modeling of Housing Data Using Random Forest
Authors: Bharatendra Rai
Abstract:
Predictive data analysis and modeling involving machine learning techniques become challenging in presence of too many explanatory variables or features. Presence of too many features in machine learning is known to not only cause algorithms to slow down, but they can also lead to decrease in model prediction accuracy. This study involves housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. Boruta algorithm that supports feature selection using a wrapper approach build around random forest is used in this study. This feature selection process leads to 49 confirmed features which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios and their impact on model accuracy are captured using coefficient of determination (r-square) and root mean square error (rsme).
Keywords: Housing data, feature selection, random forest, Boruta algorithm, root mean square error.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 171512915 Evaluation of the Urban Regeneration Project: Land Use Transformation and SNS Big Data Analysis
Authors: Ju-Young Kim, Tae-Heon Moon, Jung-Hun Cho
Abstract:
Urban regeneration projects have been actively promoted in Korea. In particular, Jeonju Hanok Village is evaluated as one of representative cases in terms of utilizing local cultural heritage sits in the urban regeneration project. However, recently, there has been a growing concern in this area, due to the ‘gentrification’, caused by the excessive commercialization and surging tourists. This trend was changing land and building use and resulted in the loss of identity of the region. In this regard, this study analyzed the land use transformation between 2010 and 2016 to identify the commercialization trend in Jeonju Hanok Village. In addition, it conducted SNS big data analysis on Jeonju Hanok Village from February 14th, 2016 to March 31st, 2016 to identify visitors’ awareness of the village. The study results demonstrate that rapid commercialization was underway, unlikely the initial intention, so that planners and officials in city government should reconsider the project direction and rebuild deliberate management strategies. This study is meaningful in that it analyzed the land use transformation and SNS big data to identify the current situation in urban regeneration area. Furthermore, it is expected that the study results will contribute to the vitalization of regeneration area.
Keywords: Land use, SNS, text mining, urban regeneration.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 121512914 An Improved Data Mining Method Applied to the Search of Relationship between Metabolic Syndrome and Lifestyles
Authors: Yi Chao Huang, Yu Ling Liao, Chiu Shuang Lin
Abstract:
A data cutting and sorting method (DCSM) is proposed to optimize the performance of data mining. DCSM reduces the calculation time by getting rid of redundant data during the data mining process. In addition, DCSM minimizes the computational units by splitting the database and by sorting data with support counts. In the process of searching for the relationship between metabolic syndrome and lifestyles with the health examination database of an electronics manufacturing company, DCSM demonstrates higher search efficiency than the traditional Apriori algorithm in tests with different support counts.Keywords: Data mining, Data cutting and sorting method, Apriori algorithm, Metabolic syndrome
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 158812913 Performance Analysis of the First-Order Characteristics of Polling Systems Based on Parallel Limited (k = 1) Services Mode
Authors: Liu Yi, Bao Liyong
Abstract:
Aiming at the problem of low efficiency of pipelined scheduling in periodic query-qualified service, this paper proposes a system service resource scheduling strategy with parallel optimized qualified service polling control. The paper constructs the polling queuing system and its mathematical model; firstly, the first-order and second-order characteristic parameter equations are obtained by partial derivation of the probability mother function of the system state variables, and the complete mathematical, analytical expressions of each system parameter are deduced after the joint solution. The simulation experimental results are consistent with the theoretical calculated values. The system performance analysis shows that the average captain and average period of the system have been greatly improved, which can better adapt to the service demand of delay-sensitive data in the dense data environment.
Keywords: Polling, parallel scheduling, mean queue length, average cycle time.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6112912 Biomechanics Analysis When Delivering Baby
Authors: Kristyanto B.
Abstract:
Plenty of analyses based on Biomechanics were carried out on many jobs in manufactures or services. Now Biomechanics analysis is being applied on mothers who are giving birth. The analysis conducted in terms of normal condition of the birth process without Gyn Bed (Obstetric Bed). The aim of analysis is to study whether it is risky or not when choosing the position of mother’s postures when delivering the baby. This investigation was applied on two positions that generally appear in common birth process. Results will show the analysis of both positions to support the birth process based on the Biomechanics analysis (Ergonomic approaches).
Keywords: Biomechanics analysis, Birth process, Position of postures analysis, Ergonomic approaches.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 230212911 Time Series Modelling and Prediction of River Runoff: Case Study of Karkheh River, Iran
Authors: Karim Hamidi Machekposhti, Hossein Sedghi, Abdolrasoul Telvari, Hossein Babazadeh
Abstract:
Rainfall and runoff phenomenon is a chaotic and complex outcome of nature which requires sophisticated modelling and simulation methods for explanation and use. Time Series modelling allows runoff data analysis and can be used as forecasting tool. In the paper attempt is made to model river runoff data and predict the future behavioural pattern of river based on annual past observations of annual river runoff. The river runoff analysis and predict are done using ARIMA model. For evaluating the efficiency of prediction to hydrological events such as rainfall, runoff and etc., we use the statistical formulae applicable. The good agreement between predicted and observation river runoff coefficient of determination (R2) display that the ARIMA (4,1,1) is the suitable model for predicting Karkheh River runoff at Iran.
Keywords: Time series modelling, ARIMA model, River runoff, Karkheh River, CLS method.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 79912910 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems
Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan
Abstract:
Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.Keywords: Data mining, hybrid storage system, recurrent neural network, support vector machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 173612909 Interactive Effects in Blended Learning Mode: Exploring Hybrid Data Sources and Iterative Linkages
Authors: Hock Chuan, Lim
Abstract:
This paper presents an approach for identifying interactive effects using Network Science (NS) supported by Social Network Analysis (SNA) techniques. Based on general observations that learning processes and behaviors are shaped by the social relationships and influenced by learning environment, the central idea was to understand both the human and non-human interactive effects for a blended learning mode of delivery of computer science modules. Important findings include (a) the importance of non-human nodes to influence the centrality and transfer; (b) the degree of non-human and human connectivity impacts learning. This project reveals that the NS pattern and connectivity as measured by node relationships offer alternative approach for hypothesis generation and design of qualitative data collection. An iterative process further reinforces the analysis, whereas the experimental simulation option itself is an interesting alternative option, a hybrid combination of both experimental simulation and qualitative data collection presents itself as a promising and viable means to study complex scenario such as blended learning delivery mode. The primary value of this paper lies in the design of the approach for studying interactive effects of human (social nodes) and non-human (learning/study environment, Information and Communication Technologies (ICT) infrastructures nodes) components. In conclusion, this project adds to the understanding and the use of SNA to model and study interactive effects in blended social learning.
Keywords: Blended learning, network science, social learning, social network analysis, study environment.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 65812908 Investigating Breakdowns in Human Robot Interaction: A Conversation Analysis Guided Single Case Study of a Human-Robot Communication in a Museum Environment
Authors: B. Arend, P. Sunnen, P. Caire
Abstract:
In a single case study, we show how a conversation analysis (CA) approach can shed light onto the sequential unfolding of human-robot interaction. Relying on video data, we are able to show that CA allows us to investigate the respective turn-taking systems of humans and a NAO robot in their dialogical dynamics, thus pointing out relevant differences. Our fine grained video analysis points out occurring breakdowns and their overcoming, when humans and a NAO-robot engage in a multimodally uttered multi-party communication during a sports guessing game. Our findings suggest that interdisciplinary work opens up the opportunity to gain new insights into the challenging issues of human robot communication in order to provide resources for developing mechanisms that enable complex human-robot interaction (HRI).
Keywords: Human-robot interaction, conversation analysis, dialogism, museum, breakdown.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 141112907 Form of Distribution of Traffic Accident and Environment Factors of Road Affecting of Traffic Accident in Dusit District, Only Area Responsible of Samsen Police Station
Authors: Musthaya Patchanee
Abstract:
This research aimed to study form of traffic distribution and environmental factors of road that affect traffic accidents in Dusit District, only areas responsible of Samsen Police Station. Data used in this analysis is the secondary data of traffic accident case from year 2011. Observed area units are 15 traffic lines that are under responsible of Samsen Police Station. Technique and method used are the Cartographic Method, the Correlation Analysis, and the Multiple Regression Analysis. The results of form of traffic accidents show that, the Samsen Road area had most traffic accidents (24.29%), second was Rachvithi Road(18.10%), third was Sukhothai Road (15.71%), fourth was Rachasrima Road (12.38%), and fifth was Amnuaysongkram Road(7.62%). The result from Dusit District, onlyareasresponsibleofSamsen police station, has suggested that the scale of accidents have high positive correlation with statistic significant at level 0.05 and the frequency of travel (r=0.857). Traffic intersection point (r=0.763)and traffic control equipments (r=0.713) are relevant factors respectively. By using the Multiple Regression Analysis, travel frequency is the only one that has considerable influences on traffic accidents in Dusit district only Samsen Police Station area. Also, a factor in frequency of travel can explain the change in traffic accidents scale to 73.40 (R2 = 0.734). By using the Multiple regression summation from analysis was Ŷ=-7.977+0.044X6
Keywords: Form of Traffic Distribution, Environmental Factors of road, Traffic Accidents, Dusit District.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 187412906 Association Rules Mining and NOSQL Oriented Document in Big Data
Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub
Abstract:
Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. Therefore, NOSQL appears to handle the problem of unstructured data. Association rules mining is one of the popular techniques of data mining to extract hidden relationship from transactional databases. The algorithm for finding association dependencies is well-solved with Map Reduce. The goal of our work is to reduce the time of generating of frequent itemsets by using Map Reduce and NOSQL database oriented document. A comparative study is given to evaluate the performances of our algorithm with the classical algorithm Apriori.
Keywords: Apriori, Association rules mining, Big Data, data mining, Hadoop, Map Reduce, MongoDB, NoSQL.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 69412905 Application of Stochastic Models to Annual Extreme Streamflow Data
Authors: Karim Hamidi Machekposhti, Hossein Sedghi
Abstract:
This study was designed to find the best stochastic model (using of time series analysis) for annual extreme streamflow (peak and maximum streamflow) of Karkheh River at Iran. The Auto-regressive Integrated Moving Average (ARIMA) model used to simulate these series and forecast those in future. For the analysis, annual extreme streamflow data of Jelogir Majin station (above of Karkheh dam reservoir) for the years 1958–2005 were used. A visual inspection of the time plot gives a little increasing trend; therefore, series is not stationary. The stationarity observed in Auto-Correlation Function (ACF) and Partial Auto-Correlation Function (PACF) plots of annual extreme streamflow was removed using first order differencing (d=1) in order to the development of the ARIMA model. Interestingly, the ARIMA(4,1,1) model developed was found to be most suitable for simulating annual extreme streamflow for Karkheh River. The model was found to be appropriate to forecast ten years of annual extreme streamflow and assist decision makers to establish priorities for water demand. The Statistical Analysis System (SAS) and Statistical Package for the Social Sciences (SPSS) codes were used to determinate of the best model for this series.Keywords: Stochastic models, ARIMA, extreme streamflow, Karkheh River.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 72212904 CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education
Authors: Eman AbuKhousa, Marwan Z. Bataineh
Abstract:
The 21st century higher education and globalization challenge new faculty members to build effective professional networks and partnership with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects together similar career-aspiration individuals who are socially influenced to join and engage in a process for domain-related knowledge and practice acquisitions. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.Keywords: Clustering analysis, community of practice, data mining, higher education, new faculty challenges, social networks, social influence, professional development.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 97212903 Determination of the Bank's Customer Risk Profile: Data Mining Applications
Authors: Taner Ersoz, Filiz Ersoz, Seyma Ozbilge
Abstract:
In this study, the clients who applied to a bank branch for loan were analyzed through data mining. The study was composed of the information such as amounts of loans received by personal and SME clients working with the bank branch, installment numbers, number of delays in loan installments, payments available in other banks and number of banks to which they are in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers have been determined on the decision tree. The classification of these types of customers has been created with the rating of those posing a risk for the bank branch and the customers have been classified according to the risk ratings.
Keywords: Client classification, loan suitability, risk rating, CART analysis, decision tree.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 107412902 Identifying Critical Success Factors for Data Quality Management through a Delphi Study
Authors: Maria Paula Santos, Ana Lucas
Abstract:
Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.
Keywords: Critical success factors, data quality, data quality management, Delphi, Q-Sort.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 110912901 Development of a Software System for Management and Genetic Analysis of Biological Samples for Forensic Laboratories
Authors: Mariana Lima, Rodrigo Silva, Victor Stange, Teodiano Bastos
Abstract:
Due to the high reliability reached by DNA tests, since the 1980s this kind of test has allowed the identification of a growing number of criminal cases, including old cases that were unsolved, now having a chance to be solved with this technology. Currently, the use of genetic profiling databases is a typical method to increase the scope of genetic comparison. Forensic laboratories must process, analyze, and generate genetic profiles of a growing number of samples, which require time and great storage capacity. Therefore, it is essential to develop methodologies capable to organize and minimize the spent time for both biological sample processing and analysis of genetic profiles, using software tools. Thus, the present work aims the development of a software system solution for laboratories of forensics genetics, which allows sample, criminal case and local database management, minimizing the time spent in the workflow and helps to compare genetic profiles. For the development of this software system, all data related to the storage and processing of samples, workflows and requirements that incorporate the system have been considered. The system uses the following software languages: HTML, CSS, and JavaScript in Web technology, with NodeJS platform as server, which has great efficiency in the input and output of data. In addition, the data are stored in a relational database (MySQL), which is free, allowing a better acceptance for users. The software system here developed allows more agility to the workflow and analysis of samples, contributing to the rapid insertion of the genetic profiles in the national database and to increase resolution of crimes. The next step of this research is its validation, in order to operate in accordance with current Brazilian national legislation.
Keywords: Database, forensic genetics, genetic analysis, sample management, software solution.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1168