Search results for: ArcGIS data analysis
41646 Data Projects for “Social Good”: Challenges and Opportunities
Authors: Mikel Niño, Roberto V. Zicari, Todor Ivanov, Kim Hee, Naveed Mushtaq, Marten Rosselli, Concha Sánchez-Ocaña, Karsten Tolle, José Miguel Blanco, Arantza Illarramendi, Jörg Besier, Harry Underwood
Abstract:
One of the application fields for data analysis techniques and technologies gaining momentum is the area of social good or “common good”, covering cases related to humanitarian crises, global health care, or ecology and environmental issues, among others. The promotion of data-driven projects in this field aims at increasing the efficacy and efficiency of social initiatives, improving the way these actions help humanity in general and people in need in particular. This application field, however, poses its own barriers and challenges when developing data-driven projects, lagging behind in comparison with other scenarios. These challenges derive from aspects such as the scope and scale of the social issue to solve, cultural and political barriers, the skills of main stakeholders and the technological resources available, the motivation to be engaged in such projects, or the ethical and legal issues related to sensitive data. This paper analyzes the application of data projects in the field of social good, reviewing its current state and noteworthy initiatives, and presenting a framework covering the key aspects to analyze in such projects. The goal is to provide guidelines to understand the main challenges and opportunities for this type of data project, as well as identifying the main differential issues compared to “classical” data projects in general. A case study is presented on the initial steps and stakeholder analysis of a data project for the inclusion of refugees in the city of Frankfurt, Germany, in order to empirically confront the framework with a real example.Keywords: data-driven projects, humanitarian operations, personal and sensitive data, social good, stakeholders analysis
Procedia PDF Downloads 32541645 The Perspective on Data Collection Instruments for Younger Learners
Authors: Hatice Kübra Koç
Abstract:
For academia, collecting reliable and valid data is one of the most significant issues for researchers. However, it is not the same procedure for all different target groups; meanwhile, during data collection from teenagers, young adults, or adults, researchers can use common data collection tools such as questionnaires, interviews, and semi-structured interviews; yet, for young learners and very young ones, these reliable and valid data collection tools cannot be easily designed or applied by the researchers. In this study, firstly, common data collection tools are examined for ‘very young’ and ‘young learners’ participant groups since it is thought that the quality and efficiency of an academic study is mainly based on its valid and correct data collection and data analysis procedure. Secondly, two different data collection instruments for very young and young learners are stated as discussing the efficacy of them. Finally, a suggested data collection tool – a performance-based questionnaire- which is specifically developed for ‘very young’ and ‘young learners’ participant groups in the field of teaching English to young learners as a foreign language is presented in this current study. The designing procedure and suggested items/factors for the suggested data collection tool are accordingly revealed at the end of the study to help researchers have studied with young and very learners.Keywords: data collection instruments, performance-based questionnaire, young learners, very young learners
Procedia PDF Downloads 9041644 A Patent Trend Analysis for Hydrogen Based Ironmaking: Identifying the Technology’s Development Phase
Authors: Ebru Kaymaz, Aslı İlbay Hamamcı, Yakup Enes Garip, Samet Ay
Abstract:
The use of hydrogen as a fuel is important for decreasing carbon emissions. For the steel industry, reducing carbon emissions is one of the most important agendas of recent times globally. Because of the Paris Agreement requirements, European steel industry studies on green steel production. Although many literature reviews have analyzed this topic from technological and hydrogen based ironmaking, there are very few studies focused on patents of decarbonize parts of the steel industry. Hence, this study focus on technological progress of hydrogen based ironmaking and on understanding the main trends through patent data. All available patent data were collected from Questel Orbit. The trend analysis of more than 900 patent documents has been carried out by using Questel Orbit Intellixir to analyze a large number of data for scientific intelligence.Keywords: hydrogen based ironmaking, DRI, direct reduction, carbon emission, steelmaking, patent analysis
Procedia PDF Downloads 14241643 Using Discriminant Analysis to Forecast Crime Rate in Nigeria
Authors: O. P. Popoola, O. A. Alawode, M. O. Olayiwola, A. M. Oladele
Abstract:
This research work is based on using discriminant analysis to forecast crime rate in Nigeria between 1996 and 2008. The work is interested in how gender (male and female) relates to offences committed against the government, against other properties, disturbance in public places, murder/robbery offences and other offences. The data used was collected from the National Bureau of Statistics (NBS). SPSS, the statistical package was used to analyse the data. Time plot was plotted on all the 29 offences gotten from the raw data. Eigenvalues and Multivariate tests, Wilks’ Lambda, standardized canonical discriminant function coefficients and the predicted classifications were estimated. The research shows that the distribution of the scores from each function is standardized to have a mean O and a standard deviation of 1. The magnitudes of the coefficients indicate how strongly the discriminating variable affects the score. In the predicted group membership, 172 cases that were predicted to commit crime against Government group, 66 were correctly predicted and 106 were incorrectly predicted. After going through the predicted classifications, we found out that most groups numbers that were correctly predicted were less than those that were incorrectly predicted.Keywords: discriminant analysis, DA, multivariate analysis of variance, MANOVA, canonical correlation, and Wilks’ Lambda
Procedia PDF Downloads 46741642 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu
Authors: Ammarah Irum, Muhammad Ali Tahir
Abstract:
Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well-suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representation from Transformer (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing BiLSTM and CNN architecture. The proposed and baseline techniques are applied on Urdu Customer Support data set and IMDB Urdu movie review data set by using pre-trained Urdu word embedding that are suitable for sentiment analysis at the document level. Results of these techniques are evaluated and our proposed model outperforms all other deep learning techniques for Urdu sentiment analysis. BiLSTM-SLMFCNN outperformed the baseline deep learning models and achieved 83%, 79%, 83% and 94% accuracy on small, medium and large sized IMDB Urdu movie review data set and Urdu Customer Support data set respectively.Keywords: urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language
Procedia PDF Downloads 7041641 Indexing and Incremental Approach Using Map Reduce Bipartite Graph (MRBG) for Mining Evolving Big Data
Authors: Adarsh Shroff
Abstract:
Big data is a collection of dataset so large and complex that it becomes difficult to process using data base management tools. To perform operations like search, analysis, visualization on big data by using data mining; which is the process of extraction of patterns or knowledge from large data set. In recent years, the data mining applications become stale and obsolete over time. Incremental processing is a promising approach to refreshing mining results. It utilizes previously saved states to avoid the expense of re-computation from scratch. This project uses i2MapReduce, an incremental processing extension to Map Reduce, the most widely used framework for mining big data. I2MapReduce performs key-value pair level incremental processing rather than task level re-computation, supports not only one-step computation but also more sophisticated iterative computation, which is widely used in data mining applications, and incorporates a set of novel techniques to reduce I/O overhead for accessing preserved fine-grain computation states. To optimize the mining results, evaluate i2MapReduce using a one-step algorithm and three iterative algorithms with diverse computation characteristics for efficient mining.Keywords: big data, map reduce, incremental processing, iterative computation
Procedia PDF Downloads 34941640 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach
Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini
Abstract:
Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on $k$-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing
Procedia PDF Downloads 16641639 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach
Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini
Abstract:
Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0 %, 80.5 %, 80.5 %, 63.6 %, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanismsKeywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing
Procedia PDF Downloads 15841638 Possible Risks for Online Orders in the Furniture Industry - Customer and Entrepreneur Perspective
Authors: Justyna Żywiołek, Marek Matulewski
Abstract:
Data, is information processed by enterprises for primary and secondary purposes as processes. Thanks to processing, the sales process takes place; in the case of the surveyed companies, sales take place online. However, this indirect form of contact with the customer causes many problems for both customers and furniture manufacturers. The article presents solutions that would solve problems related to the analysis of data and information in the order fulfillment process sent to post-warranty service. The article also presents an analysis of threats to the security of this information, both for customers and the enterprise.Keywords: ordering furniture online, information security, furniture industry, enterprise security, risk analysis
Procedia PDF Downloads 4741637 An Analysis of Privacy and Security for Internet of Things Applications
Authors: Dhananjay Singh, M. Abdullah-Al-Wadud
Abstract:
The Internet of Things is a concept of a large scale ecosystem of wireless actuators. The actuators are defined as things in the IoT, those which contribute or produces some data to the ecosystem. However, ubiquitous data collection, data security, privacy preserving, large volume data processing, and intelligent analytics are some of the key challenges into the IoT technologies. In order to solve the security requirements, challenges and threats in the IoT, we have discussed a message authentication mechanism for IoT applications. Finally, we have discussed data encryption mechanism for messages authentication before propagating into IoT networks.Keywords: Internet of Things (IoT), message authentication, privacy, security
Procedia PDF Downloads 38141636 Design and Development of Data Mining Application for Medical Centers in Remote Areas
Authors: Grace Omowunmi Soyebi
Abstract:
Data Mining is the extraction of information from a large database which helps in predicting a trend or behavior, thereby helping management make knowledge-driven decisions. One principal problem of most hospitals in rural areas is making use of the file management system for keeping records. A lot of time is wasted when a patient visits the hospital, probably in an emergency, and the nurse or attendant has to search through voluminous files before the patient's file can be retrieved; this may cause an unexpected to happen to the patient. This Data Mining application is to be designed using a Structured System Analysis and design method, which will help in a well-articulated analysis of the existing file management system, feasibility study, and proper documentation of the Design and Implementation of a Computerized medical record system. This Computerized system will replace the file management system and help to easily retrieve a patient's record with increased data security, access clinical records for decision-making, and reduce the time range at which a patient gets attended to.Keywords: data mining, medical record system, systems programming, computing
Procedia PDF Downloads 20541635 Application of Regularized Spatio-Temporal Models to the Analysis of Remote Sensing Data
Authors: Salihah Alghamdi, Surajit Ray
Abstract:
Space-time data can be observed over irregularly shaped manifolds, which might have complex boundaries or interior gaps. Most of the existing methods do not consider the shape of the data, and as a result, it is difficult to model irregularly shaped data accommodating the complex domain. We used a method that can deal with space-time data that are distributed over non-planner shaped regions. The method is based on partial differential equations and finite element analysis. The model can be estimated using a penalized least squares approach with a regularization term that controls the over-fitting. The model is regularized using two roughness penalties, which consider the spatial and temporal regularities separately. The integrated square of the second derivative of the basis function is used as temporal penalty. While the spatial penalty consists of the integrated square of Laplace operator, which is integrated exclusively over the domain of interest that is determined using finite element technique. In this paper, we applied a spatio-temporal regression model with partial differential equations regularization (ST-PDE) approach to analyze a remote sensing data measuring the greenness of vegetation, measure by an index called enhanced vegetation index (EVI). The EVI data consist of measurements that take values between -1 and 1 reflecting the level of greenness of some region over a period of time. We applied (ST-PDE) approach to irregular shaped region of the EVI data. The approach efficiently accommodates the irregular shaped regions taking into account the complex boundaries rather than smoothing across the boundaries. Furthermore, the approach succeeds in capturing the temporal variation in the data.Keywords: irregularly shaped domain, partial differential equations, finite element analysis, complex boundray
Procedia PDF Downloads 13941634 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach
Authors: Theertha Chandroth
Abstract:
This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating comparison and integration between JSON data to XML and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.Keywords: XML, JSON, data comparison, integration testing, Python, SQL
Procedia PDF Downloads 13841633 TRACE/FRAPTRAN Analysis of Kuosheng Nuclear Power Plant Dry-Storage System
Authors: J. R. Wang, Y. Chiang, W. Y. Li, H. T. Lin, H. C. Chen, C. Shih, S. W. Chen
Abstract:
The dry-storage systems of nuclear power plants (NPPs) in Taiwan have become one of the major safety concerns. There are two steps considered in this study. The first step is the verification of the TRACE by using VSC-17 experimental data. The results of TRACE were similar to the VSC-17 data. It indicates that TRACE has the respectable accuracy in the simulation and analysis of the dry-storage systems. The next step is the application of TRACE in the dry-storage system of Kuosheng NPP (BWR/6). Kuosheng NPP is the second BWR NPP of Taiwan Power Company. In order to solve the storage of the spent fuels, Taiwan Power Company developed the new dry-storage system for Kuosheng NPP. In this step, the dry-storage system model of Kuosheng NPP was established by TRACE. Then, the steady state simulation of this model was performed and the results of TRACE were compared with the Kuosheng NPP data. Finally, this model was used to perform the safety analysis of Kuosheng NPP dry-storage system. Besides, FRAPTRAN was used tocalculate the transient performance of fuel rods.Keywords: BWR, TRACE, FRAPTRAN, dry-storage
Procedia PDF Downloads 51741632 Evaluating Learning Outcomes in the Implementation of Flipped Teaching Using Data Envelopment Analysis
Authors: Huie-Wen Lin
Abstract:
This study integrated various teaching factors -based on the idea of a flipped classroom- in a financial management course. The study’s aim was to establish an effective teaching implementation strategy and evaluation mechanism with respect to learning outcomes, which can serve as a reference for the future modification of teaching methods. This study implemented a teaching method in five stages and estimated the learning efficiencies of 22 students (in the teaching scenario and over two semesters). Subsequently, data envelopment analysis (DEA) was used to compare, for each student, between the learning efficiencies before and after participation in the flipped classroom -in the first and second semesters, respectively- to identify the crucial external factors influencing learning efficiency. According to the results, the average overall student learning efficiency increased from 0.901 in the first semester to 0.967 in the second semester, which demonstrate that the flipped classroom approach can improve teaching effectiveness and learning outcomes. The results also revealed a difference in learning efficiency between male and female students.Keywords: data envelopment analysis, flipped classroom, learning outcome, teaching and learning
Procedia PDF Downloads 15541631 Research on the Spatial Evolution of Tourism-Oriented Rural Settlements: Take the Xiaochanfangyu Village, Dongshuichang Village, Maojiayu Village in Jixian County, Tianjin City as Examples
Authors: Yu Zhang, Jie Wu, Li Dong
Abstract:
Rural tourism is the service industry which regards the agricultural production, rural life, rural nature and cultural landscape as the tourist attraction. It aims to meet the needs of the city tourists such as country sightseeing, vacation, and leisure. According to the difference of the tourist resources, the rural settlements can be divided into different types: The type of tourism resources, scenic spot, and peri-urban. In the past ten years, the rural tourism has promoted the industrial transformation and economic growth in rural areas of China. And it is conducive to the coordinated development of urban and rural areas and has greatly improved the ecological environment and the standard of living for farmers in rural areas. At the same time, a large number of buildings and sites are built in the countryside in order to enhance the tourist attraction and the ability of tourist reception and also to increase the travel comfort and convenience, which has significant influence on the spatial evolution of the village settlement. This article takes the XiangYing Subdistrict, which is in JinPu District of Dalian in China as the exemplification and uses the technology of Remote Sensing (RS), Geographic Information System (GIS) and the technology of Landscape Spatial Analysis to study the influence of the rural tourism development in the rural settlement spaces in four steps. First, acquiring the remote sensing image data at different times of 8 administrative villages in the XiangYing Subdistrict, by using the remote sensing application EDRAS8.6; second, vectoring basic maps of XiangYing Subdistrict including its land-use map with the application of ArcGIS 9.3, associating with social and economic attribute data of rural settlements and analyzing on the rural evolution visually; third, quantifying the comparison of these patches in rural settlements by using the landscape spatial calculation application Fragstats 3.3 and analyzing on the evolution of the spatial structure of settlement in macro and medium scale; finally, summarizing the evolution characteristics and internal reasons of tourism-oriented rural settlements. The main findings of this article include: first of all, there is difference in the evolution of the spatial structure between the developing rural settlements and undeveloped rural settlements among the eight administrative villages; secondly, the villages relying on the surrounding tourist attractions, the villages developing agricultural ecological garden and the villages with natural or historical and cultural resources have different laws of development; then, the rural settlements whose tourism development in germination period, development period and mature period have different characteristics of spatial evolution; finally, the different evolution modes of the tourism-oriented rural settlement space have different influences on the protection and inheritance of the village scene. The development of tourism has a significant impact on the spatial evolution of rural settlement. The intensive use of rural land and natural resources is the fundamental principle to protect the rural cultural landscape and ecological environment as well as the critical way to improve the attraction of rural tourism and promote the sustainable development of countryside.Keywords: landscape pattern, rural settlement, spatial evolution, tourism-oriented, Xiangying Subdistrict
Procedia PDF Downloads 28741630 Using RASCAL and ALOHA Codes to Establish an Analysis Methodology for Hydrogen Fluoride Evaluation
Authors: J. R. Wang, Y. Chiang, W. S. Hsu, H. C. Chen, S. H. Chen, J. H. Yang, S. W. Chen, C. Shih
Abstract:
In this study, the RASCAL and ALOHA codes are used to establish an analysis methodology for hydrogen fluoride (HF) evaluation. There are three main steps in this study. First, the UF6 data were collected. Second, one postulated case was analyzed by using the RASCAL and UF6 data. This postulated case assumes that fire occurring and UF6 is releasing from a building. Third, the results of RASCAL for HF mass were as the input data of ALOHA. Two postulated cases of HF were analyzed by using ALOHA code and the results of RASCAL. These postulated cases assume fire occurring and HF is releasing with no raining (Case 1) or raining (Case 2) condition. According to the analysis results of ALOHA, the HF concentration of Case 2 is smaller than Case 1. The results can be a reference for the preparing of emergency plans for the release of HF.Keywords: RASCAL, ALOHA, UF₆, hydrogen fluoride
Procedia PDF Downloads 74741629 Using Machine Learning Techniques to Extract Useful Information from Dark Data
Authors: Nigar Hussain
Abstract:
It is a subset of big data. Dark data means those data in which we fail to use for future decisions. There are many issues in existing work, but some need powerful tools for utilizing dark data. It needs sufficient techniques to deal with dark data. That enables users to exploit their excellence, adaptability, speed, less time utilization, execution, and accessibility. Another issue is the way to utilize dark data to extract helpful information to settle on better choices. In this paper, we proposed upgrade strategies to remove the dark side from dark data. Using a supervised model and machine learning techniques, we utilized dark data and achieved an F1 score of 89.48%.Keywords: big data, dark data, machine learning, heatmap, random forest
Procedia PDF Downloads 2741628 Ethical Leadership and Individual Creativity: The Mediating Role of Psychological Safety
Authors: Hyeondal Jeong, Yoonjung Baek
Abstract:
This study examines the relationship between ethical leadership and individual creativity and focused on mediating effects of psychological safety. In order to clarify the mechanism of ethical leadership, psychological safety of the members was set as a mediator. Using data gathered from a sample of 150 employees. For data analysis, exploratory factor analysis, correlation analysis, hierarchical regression analysis and Sobel-Test were performed. The results showed that ethical leadership had a positive effect on psychological safety and individual creativity, and psychological safety had a positive mediating effect. Since the mediating effect of psychological safety has been confirmed, we need to find ways to improve the psychological safety of the members in terms of organizational management. Psychological safety has a positive effect on individual creativity, which can have a positive impact on innovation throughout the organization.Keywords: ethical leadership, creativity, psychological safety, ethics management, innovative behaviors
Procedia PDF Downloads 24841627 Social Media Data Analysis for Personality Modelling and Learning Styles Prediction Using Educational Data Mining
Authors: Srushti Patil, Preethi Baligar, Gopalkrishna Joshi, Gururaj N. Bhadri
Abstract:
In designing learning environments, the instructional strategies can be tailored to suit the learning style of an individual to ensure effective learning. In this study, the information shared on social media like Facebook is being used to predict learning style of a learner. Previous research studies have shown that Facebook data can be used to predict user personality. Users with a particular personality exhibit an inherent pattern in their digital footprint on Facebook. The proposed work aims to correlate the user's’ personality, predicted from Facebook data to the learning styles, predicted through questionnaires. For Millennial learners, Facebook has become a primary means for information sharing and interaction with peers. Thus, it can serve as a rich bed for research and direct the design of learning environments. The authors have conducted this study in an undergraduate freshman engineering course. Data from 320 freshmen Facebook users was collected. The same users also participated in the learning style and personality prediction survey. The Kolb’s Learning style questionnaires and Big 5 personality Inventory were adopted for the survey. The users have agreed to participate in this research and have signed individual consent forms. A specific page was created on Facebook to collect user data like personal details, status updates, comments, demographic characteristics and egocentric network parameters. This data was captured by an application created using Python program. The data captured from Facebook was subjected to text analysis process using the Linguistic Inquiry and Word Count dictionary. An analysis of the data collected from the questionnaires performed reveals individual student personality and learning style. The results obtained from analysis of Facebook, learning style and personality data were then fed into an automatic classifier that was trained by using the data mining techniques like Rule-based classifiers and Decision trees. This helps to predict the user personality and learning styles by analysing the common patterns. Rule-based classifiers applied for text analysis helps to categorize Facebook data into positive, negative and neutral. There were totally two models trained, one to predict the personality from Facebook data; another one to predict the learning styles from the personalities. The results show that the classifier model has high accuracy which makes the proposed method to be a reliable one for predicting the user personality and learning styles.Keywords: educational data mining, Facebook, learning styles, personality traits
Procedia PDF Downloads 22941626 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms
Authors: Naina Mahajan, Bikram Pal Kaur
Abstract:
The purpose of traffic accident analysis is to find the possible causes of an accident. Road accidents cannot be totally prevented but by suitable traffic engineering and management the accident rate can be reduced to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA Data mining tool. These techniques use on the NH (National highway) dataset. With the C4.5 and ID3 technique it gives best results and high accuracy with less computation time and error rate.Keywords: C4.5, ID3, NH(National highway), WEKA data mining tool
Procedia PDF Downloads 33641625 A Study on the HTML5 Based Multi Media Contents Authority Tool
Authors: Heesuk Seo, Yongtae Kim
Abstract:
Online learning started in the 1990s, the spread of the Internet has been through the era of e-learning paradigm of online education in the era of smart learning change. Reflecting the different nature of the mobile to anywhere anytime, anywhere was also allows the form of learning, it was also available through the learning content and interaction. We are developing a cloud system, 'TLINKS CLOUD' that allows you to configure the environment of the smart learning without the need for additional infrastructure. Using the big-data analysis for e-learning contents, we provide an integrated solution for e-learning tailored to individual study.Keywords: authority tool, big data analysis, e-learning, HTML5
Procedia PDF Downloads 40441624 Multichannel Analysis of the Surface Waves of Earth Materials in Some Parts of Lagos State, Nigeria
Authors: R. B. Adegbola, K. F. Oyedele, L. Adeoti
Abstract:
We present a method that utilizes Multi-channel Analysis of Surface Waves, which was used to measure shear wave velocities with a view to establishing the probable causes of road failure, subsidence and weakening of structures in some Local Government Area, Lagos, Nigeria. Multi channel Analysis of Surface waves (MASW) data were acquired using 24-channel seismograph. The acquired data were processed and transformed into two-dimensional (2-D) structure reflective of depth and surface wave velocity distribution within a depth of 0–15m beneath the surface using SURFSEIS software. The shear wave velocity data were compared with other geophysical/borehole data that were acquired along the same profile. The comparison and correlation illustrates the accuracy and consistency of MASW derived-shear wave velocity profiles. Rigidity modulus and N-value were also generated. The study showed that the low velocity/very low velocity are reflective of organic clay/peat materials and thus likely responsible for the failed, subsidence/weakening of structures within the study areas.Keywords: seismograph, road failure, rigidity modulus, N-value, subsidence
Procedia PDF Downloads 36041623 Fuzzy Approach for Fault Tree Analysis of Water Tube Boiler
Authors: Syed Ahzam Tariq, Atharva Modi
Abstract:
This paper presents a probabilistic analysis of the safety of water tube boilers using fault tree analysis (FTA). A fault tree has been constructed by considering all possible areas where a malfunction could lead to a boiler accident. Boiler accidents are relatively rare, causing a scarcity of data. The fuzzy approach is employed to perform a quantitative analysis, wherein theories of fuzzy logic are employed in conjunction with expert elicitation to calculate failure probabilities. The Fuzzy Fault Tree Analysis (FFTA) provides a scientific and contingent method to forecast and prevent accidents.Keywords: fault tree analysis water tube boiler, fuzzy probability score, failure probability
Procedia PDF Downloads 12441622 Big Data: Appearance and Disappearance
Authors: James Moir
Abstract:
The mainstay of Big Data is prediction in that it allows practitioners, researchers, and policy analysts to predict trends based upon the analysis of large and varied sources of data. These can range from changing social and political opinions, patterns in crimes, and consumer behaviour. Big Data has therefore shifted the criterion of success in science from causal explanations to predictive modelling and simulation. The 19th-century science sought to capture phenomena and seek to show the appearance of it through causal mechanisms while 20th-century science attempted to save the appearance and relinquish causal explanations. Now 21st-century science in the form of Big Data is concerned with the prediction of appearances and nothing more. However, this pulls social science back in the direction of a more rule- or law-governed reality model of science and away from a consideration of the internal nature of rules in relation to various practices. In effect Big Data offers us no more than a world of surface appearance and in doing so it makes disappear any context-specific conceptual sensitivity.Keywords: big data, appearance, disappearance, surface, epistemology
Procedia PDF Downloads 41941621 Recommendations Using Online Water Quality Sensors for Chlorinated Drinking Water Monitoring at Drinking Water Distribution Systems Exposed to Glyphosate
Authors: Angela Maria Fasnacht
Abstract:
Detection of anomalies due to contaminants’ presence, also known as early detection systems in water treatment plants, has become a critical point that deserves an in-depth study for their improvement and adaptation to current requirements. The design of these systems requires a detailed analysis and processing of the data in real-time, so it is necessary to apply various statistical methods appropriate to the data generated, such as Spearman’s Correlation, Factor Analysis, Cross-Correlation, and k-fold Cross-validation. Statistical analysis and methods allow the evaluation of large data sets to model the behavior of variables; in this sense, statistical treatment or analysis could be considered a vital step to be able to develop advanced models focused on machine learning that allows optimized data management in real-time, applied to early detection systems in water treatment processes. These techniques facilitate the development of new technologies used in advanced sensors. In this work, these methods were applied to identify the possible correlations between the measured parameters and the presence of the glyphosate contaminant in the single-pass system. The interaction between the initial concentration of glyphosate and the location of the sensors on the reading of the reported parameters was studied.Keywords: glyphosate, emergent contaminants, machine learning, probes, sensors, predictive
Procedia PDF Downloads 12041620 Cross Project Software Fault Prediction at Design Phase
Authors: Pradeep Singh, Shrish Verma
Abstract:
Software fault prediction models are created by using the source code, processed metrics from the same or previous version of code and related fault data. Some company do not store and keep track of all artifacts which are required for software fault prediction. To construct fault prediction model for such company, the training data from the other projects can be one potential solution. The earlier we predict the fault the less cost it requires to correct. The training data consists of metrics data and related fault data at function/module level. This paper investigates fault predictions at early stage using the cross-project data focusing on the design metrics. In this study, empirical analysis is carried out to validate design metrics for cross project fault prediction. The machine learning techniques used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as initial guideline for the projects where no previous fault data is available. We analyze seven data sets from NASA Metrics Data Program which offer design as well as code metrics. Overall, the results of cross project is comparable to the within company data learning.Keywords: software metrics, fault prediction, cross project, within project.
Procedia PDF Downloads 34141619 Reviewing Privacy Preserving Distributed Data Mining
Authors: Sajjad Baghernezhad, Saeideh Baghernezhad
Abstract:
Nowadays considering human involved in increasing data development some methods such as data mining to extract science are unavoidable. One of the discussions of data mining is inherent distribution of the data usually the bases creating or receiving such data belong to corporate or non-corporate persons and do not give their information freely to others. Yet there is no guarantee to enable someone to mine special data without entering in the owner’s privacy. Sending data and then gathering them by each vertical or horizontal software depends on the type of their preserving type and also executed to improve data privacy. In this study it was attempted to compare comprehensively preserving data methods; also general methods such as random data, coding and strong and weak points of each one are examined.Keywords: data mining, distributed data mining, privacy protection, privacy preserving
Procedia PDF Downloads 52341618 Statistical Correlation between Logging-While-Drilling Measurements and Wireline Caliper Logs
Authors: Rima T. Alfaraj, Murtadha J. Al Tammar, Khaqan Khan, Khalid M. Alruwaili
Abstract:
OBJECTIVE/SCOPE (25-75): Caliper logging data provides critical information about wellbore shape and deformations, such as stress-induced borehole breakouts or washouts. Multiarm mechanical caliper logs are often run using wireline, which can be time-consuming, costly, and/or challenging to run in certain formations. To minimize rig time and improve operational safety, it is valuable to develop analytical solutions that can estimate caliper logs using available Logging-While-Drilling (LWD) data without the need to run wireline caliper logs. As a first step, the objective of this paper is to perform statistical analysis using an extensive datasetto identify important physical parameters that should be considered in developing such analytical solutions. METHODS, PROCEDURES, PROCESS (75-100): Caliper logs and LWD data of eleven wells, with a total of more than 80,000 data points, were obtained and imported into a data analytics software for analysis. Several parameters were selected to test the relationship of the parameters with the measured maximum and minimum caliper logs. These parameters includegamma ray, porosity, shear, and compressional sonic velocities, bulk densities, and azimuthal density. The data of the eleven wells were first visualized and cleaned.Using the analytics software, several analyses were then preformed, including the computation of Pearson’s correlation coefficients to show the statistical relationship between the selected parameters and the caliper logs. RESULTS, OBSERVATIONS, CONCLUSIONS (100-200): The results of this statistical analysis showed that some parameters show good correlation to the caliper log data. For instance, the bulk density and azimuthal directional densities showedPearson’s correlation coefficients in the range of 0.39 and 0.57, which wererelatively high when comparedto the correlation coefficients of caliper data with other parameters. Other parameters such as porosity exhibited extremely low correlation coefficients to the caliper data. Various crossplots and visualizations of the data were also demonstrated to gain further insights from the field data. NOVEL/ADDITIVE INFORMATION (25-75): This study offers a unique and novel look into the relative importance and correlation between different LWD measurements and wireline caliper logs via an extensive dataset. The results pave the way for a more informed development of new analytical solutions for estimating the size and shape of the wellbore in real-time while drilling using LWD data.Keywords: LWD measurements, caliper log, correlations, analysis
Procedia PDF Downloads 12041617 Performance Analysis of Multichannel OCDMA-FSO Network under Different Pervasive Conditions
Authors: Saru Arora, Anurag Sharma, Harsukhpreet Singh
Abstract:
To meet the growing need of high data rate and bandwidth, various efforts has been made nowadays for the efficient communication systems. Optical Code Division Multiple Access over Free space optics communication system seems an effective role for providing transmission at high data rate with low bit error rate and low amount of multiple access interference. This paper demonstrates the OCDMA over FSO communication system up to the range of 7000 m at a data rate of 5 Gbps. Initially, the 8 user OCDMA-FSO system is simulated and pseudo orthogonal codes are used for encoding. Also, the simulative analysis of various performance parameters like power and core effective area that are having an effect on the Bit error rate (BER) of the system is carried out. The simulative analysis reveals that the length of the transmission is limited by the multi-access interference (MAI) effect which arises when the number of users increases in the system.Keywords: FSO, PSO, bit error rate (BER), opti system simulation, multiple access interference (MAI), q-factor
Procedia PDF Downloads 364