Search results for: terrorism data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 42193

Search results for: terrorism data analysis

41743 Spatial-Temporal Clustering Characteristics of Dengue in the Northern Region of Sri Lanka, 2010-2013

Authors: Sumiko Anno, Keiji Imaoka, Takeo Tadono, Tamotsu Igarashi, Subramaniam Sivaganesh, Selvam Kannathasan, Vaithehi Kumaran, Sinnathamby Noble Surendran

Abstract:

Dengue outbreaks are affected by biological, ecological, socio-economic and demographic factors that vary over time and space. These factors have been examined separately and still require systematic clarification. The present study aimed to investigate the spatial-temporal clustering relationships between these factors and dengue outbreaks in the northern region of Sri Lanka. Remote sensing (RS) data gathered from a plurality of satellites were used to develop an index comprising rainfall, humidity and temperature data. RS data gathered by ALOS/AVNIR-2 were used to detect urbanization, and a digital land cover map was used to extract land cover information. Other data on relevant factors and dengue outbreaks were collected through institutions and extant databases. The analyzed RS data and databases were integrated into geographic information systems, enabling temporal analysis, spatial statistical analysis and space-time clustering analysis. Our present results showed that increases in the number of the combination of ecological factor and socio-economic and demographic factors with above the average or the presence contribute to significantly high rates of space-time dengue clusters.

Keywords: ALOS/AVNIR-2, dengue, space-time clustering analysis, Sri Lanka

Procedia PDF Downloads 476
41742 Estimating Bridge Deterioration for Small Data Sets Using Regression and Markov Models

Authors: Yina F. Muñoz, Alexander Paz, Hanns De La Fuente-Mella, Joaquin V. Fariña, Guilherme M. Sales

Abstract:

The primary approach for estimating bridge deterioration uses Markov-chain models and regression analysis. Traditional Markov models have problems in estimating the required transition probabilities when a small sample size is used. Often, reliable bridge data have not been taken over large periods, thus large data sets may not be available. This study presents an important change to the traditional approach by using the Small Data Method to estimate transition probabilities. The results illustrate that the Small Data Method and traditional approach both provide similar estimates; however, the former method provides results that are more conservative. That is, Small Data Method provided slightly lower than expected bridge condition ratings compared with the traditional approach. Considering that bridges are critical infrastructures, the Small Data Method, which uses more information and provides more conservative estimates, may be more appropriate when the available sample size is small. In addition, regression analysis was used to calculate bridge deterioration. Condition ratings were determined for bridge groups, and the best regression model was selected for each group. The results obtained were very similar to those obtained when using Markov chains; however, it is desirable to use more data for better results.

Keywords: concrete bridges, deterioration, Markov chains, probability matrix

Procedia PDF Downloads 335
41741 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used to classify into groups. The result of the analysis is the pattern which can be used for identification of data set without the need to obtain input data used for creation of this pattern. An important requirement in this process is careful data preparation validation of model used and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of the genetic diversity. In case of missing pedigree information, other methods can be used for traceability of animal´s origin. Genetic diversity written in genetic data is holding relatively useful information to identify animals originated from individual countries. We can conclude that the application of data mining for molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.

Keywords: genetic data, Pinzgau cattle, supervised learning, machine learning

Procedia PDF Downloads 549
41740 Global Analysis of Modern Economic Sanctions

Authors: I. L. Yakushev

Abstract:

Economic sanctions are an integral part of the foreign policy repertoire of States. Increasingly, States and international organizations are resorting to sanctions to address a variety of issues -from fighting corruption to preventing the use of nuclear weapons. Over time, the ways in which economic sanctions have been used have changed, especially over the past two decades. In the late 1990s, the recognition of the humanitarian harm of economic sanctions and the "War on Terrorism" after the events of September 11, 2001, led to serious changes in the structure and mechanisms of their application. Questions about how these coercive tools work, when they are applied, what consequences they have, and when they are successful are still being determined by research conducted in the second half of the 20th century. The conclusions drawn from past cases of sanctions may not be fully applicable to the current sanctions policy. In the second half of the 20th century, most cases of sanctions were related to the United States, and it covered restrictions on international trade. However, over the past two decades, the European Union, the United Nations, and China have also been the main initiators of sanctions. Modern sanctions include targeted and financial restrictions and are applied against individuals, organizations, and companies. Changing the senders, targets, stakeholders, and economic instruments used in the sanctions policy has serious implications for effectiveness and results. The regulatory and bureaucratic infrastructure necessary to implement and comply with modern economic sanctions has become more reliable. This evolution of sanctions has provided the scientific community with an opportunity to study new issues of coercion and return to the old ones. The economic sanctions research program should be developed to be relevant for understanding the application of modern sanctions and their consequences.

Keywords: global analysis, economic sanctions, targeted sanctions, foreign policy, domestic policy, United Nations, European Union, USA, economic pressure

Procedia PDF Downloads 56
41739 Development of Energy Benchmarks Using Mandatory Energy and Emissions Reporting Data: Ontario Post-Secondary Residences

Authors: C. Xavier Mendieta, J. J McArthur

Abstract:

Governments are playing an increasingly active role in reducing carbon emissions, and a key strategy has been the introduction of mandatory energy disclosure policies. These policies have resulted in a significant amount of publicly available data, providing researchers with a unique opportunity to develop location-specific energy and carbon emission benchmarks from this data set, which can then be used to develop building archetypes and used to inform urban energy models. This study presents the development of such a benchmark using the public reporting data. The data from Ontario’s Ministry of Energy for Post-Secondary Educational Institutions are being used to develop a series of building archetype dynamic building loads and energy benchmarks to fill a gap in the currently available building database. This paper presents the development of a benchmark for college and university residences within ASHRAE climate zone 6 areas in Ontario using the mandatory disclosure energy and greenhouse gas emissions data. The methodology presented includes data cleaning, statistical analysis, and benchmark development, and lessons learned from this investigation are presented and discussed to inform the development of future energy benchmarks from this larger data set. The key findings from this initial benchmarking study are: (1) the importance of careful data screening and outlier identification to develop a valid dataset; (2) the key features used to develop a model of the data are building age, size, and occupancy schedules and these can be used to estimate energy consumption; and (3) policy changes affecting the primary energy generation significantly affected greenhouse gas emissions, and consideration of these factors was critical to evaluate the validity of the reported data.

Keywords: building archetypes, data analysis, energy benchmarks, GHG emissions

Procedia PDF Downloads 306
41738 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 407
41737 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map

Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo

Abstract:

Recently, technologies based on three-dimensional (3D) space information are being developed and quality of life is improving as a result. Research on real-time digital map (RDM) is being conducted now to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. Using RDM will be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.

Keywords: RDM, multi-source data, big data, U-City

Procedia PDF Downloads 432
41736 A Critical Analysis on Gaps Associated with Culture Policy Milieu Governing Traditional Male Circumcision in the Eastern Cape, South Africa

Authors: Thanduxolo Nomngcoyiya, Simon M. Kang’ethe

Abstract:

The paper aimed to critically analyse gaps pertaining to the cultural policy environments governing traditional male circumcision in the Eastern Cape as exemplified by an empirical case study. The original study which this paper is derived from utilized qualitative paradigm; and encompassed 28 participants. It used in-depth one-on-one interviews complemented by focus group discussions and key informants as a method of data collection. It also adopted interview guide as a data collection instrument. The original study was cross-sectional in nature, and the data was audio recorded and transcribed later during the data analysis and coding process. The study data analysis was content thematic analysis and identified the following key major findings on the culture of male circumcision policy: Lack of clarity on culture of male circumcision policy operations; Myths surrounding procedures on culture of male circumcision; Divergent views on cultural policies between government and male circumcision custodians; Unclear cultural policies on selection criteria of practitioners; and Lack of policy enforcement and implementation on transgressors of culture of male circumcision. It recommended: a stringent selection criteria of practitioners; a need to carry out death-free male circumcision; a need for male circumcision stakeholders to work with other culture and tradition-friendly stakeholders.

Keywords: human rights, policy enforcement, traditional male circumcision, traditional surgeons and nurses

Procedia PDF Downloads 296
41735 Human-Centred Data Analysis Method for Future Design of Residential Spaces: Coliving Case Study

Authors: Alicia Regodon Puyalto, Alfonso Garcia-Santos

Abstract:

This article presents a method to analyze the use of indoor spaces based on data analytics obtained from inbuilt digital devices. The study uses the data generated by the in-place devices, such as smart locks, Wi-Fi routers, and electrical sensors, to gain additional insights on space occupancy, user behaviour, and comfort. Those devices, originally installed to facilitate remote operations, report data through the internet that the research uses to analyze information on human real-time use of spaces. Using an in-place Internet of Things (IoT) network enables a faster, more affordable, seamless, and scalable solution to analyze building interior spaces without incorporating external data collection systems such as sensors. The methodology is applied to a real case study of coliving, a residential building of 3000m², 7 floors, and 80 users in the centre of Madrid. The case study applies the method to classify IoT devices, assess, clean, and analyze collected data based on the analysis framework. The information is collected remotely, through the different platforms devices' platforms; the first step is to curate the data, understand what insights can be provided from each device according to the objectives of the study, this generates an analysis framework to be escalated for future building assessment even beyond the residential sector. The method will adjust the parameters to be analyzed tailored to the dataset available in the IoT of each building. The research demonstrates how human-centered data analytics can improve the future spatial design of indoor spaces.

Keywords: in-place devices, IoT, human-centred data-analytics, spatial design

Procedia PDF Downloads 196
41734 Predicting Medical Check-Up Patient Re-Coming Using Sequential Pattern Mining and Association Rules

Authors: Rizka Aisha Rahmi Hariadi, Chao Ou-Yang, Han-Cheng Wang, Rajesri Govindaraju

Abstract:

As the increasing of medical check-up popularity, there are a huge number of medical check-up data stored in database and have not been useful. These data actually can be very useful for future strategic planning if we mine it correctly. In other side, a lot of patients come with unpredictable coming and also limited available facilities make medical check-up service offered by hospital not maximal. To solve that problem, this study used those medical check-up data to predict patient re-coming. Sequential pattern mining (SPM) and association rules method were chosen because these methods are suitable for predicting patient re-coming using sequential data. First, based on patient personal information the data was grouped into … groups then discriminant analysis was done to check significant of the grouping. Second, for each group some frequent patterns were generated using SPM method. Third, based on frequent patterns of each group, pairs of variable can be extracted using association rules to get general pattern of re-coming patient. Last, discussion and conclusion was done to give some implications of the results.

Keywords: patient re-coming, medical check-up, health examination, data mining, sequential pattern mining, association rules, discriminant analysis

Procedia PDF Downloads 640
41733 A Review on the Comparison of EU Countries Based on Research and Development Efficiencies

Authors: Yeliz Ekinci, Raife Merve Ön

Abstract:

Nowadays, technological progress is one of the most important components of economic growth and the efficiency of R&D activities is particularly essential for countries. This study is an attempt to analyze the R&D efficiencies of EU countries. The indicators related to R&D efficiencies should be determined in advance in order to use DEA. For this reason a list of input and output indicators are derived from the literature review. Considering the data availability, a final list is given for the numerical analysis for future research.

Keywords: data envelopment analysis, economic growth, EU countries, R&D efficiency

Procedia PDF Downloads 535
41732 Effects of Video Games and Online Chat on Mathematics Performance in High School: An Approach of Multivariate Data Analysis

Authors: Lina Wu, Wenyi Lu, Ye Li

Abstract:

Regarding heavy video game players for boys and super online chat lovers for girls as a symbolic phrase in the current adolescent culture, this project of data analysis verifies the displacement effect on deteriorating mathematics performance. To evaluate correlation or regression coefficients between a factor of playing video games or chatting online and mathematics performance compared with other factors, we use multivariate analysis technique and take gender difference into account. We find the most important reason for the negative sign of the displacement effect on mathematics performance due to students’ poor academic background. Statistical analysis methods in this project could be applied to study internet users’ academic performance from the high school education to the college education.

Keywords: correlation coefficients, displacement effect, multivariate analysis technique, regression coefficients

Procedia PDF Downloads 361
41731 The Importance of Knowledge Innovation for External Audit on Anti-Corruption

Authors: Adel M. Qatawneh

Abstract:

This paper aimed to determine the importance of knowledge innovation for external audit on anti-corruption in the entire Jordanian bank companies are listed in Amman Stock Exchange (ASE). The study importance arises from the need to recognize the Knowledge innovation for external audit and anti-corruption as the development in the world of business, the variables that will be affected by external audit innovation are: reliability of financial data, relevantly of financial data, consistency of the financial data, Full disclosure of financial data and protecting the rights of investors to achieve the objectives of the study a questionnaire was designed and distributed to the society of the Jordanian bank are listed in Amman Stock Exchange. The data analysis found out that the banks in Jordan have a positive importance of Knowledge innovation for external audit on anti-corruption. They agree on the benefit of Knowledge innovation for external audit on anti-corruption. The statistical analysis showed that Knowledge innovation for external audit had a positive impact on the anti-corruption and that external audit has a significantly statistical relationship with anti-corruption, reliability of financial data, consistency of the financial data, a full disclosure of financial data and protecting the rights of investors.

Keywords: knowledge innovation, external audit, anti-corruption, Amman Stock Exchange

Procedia PDF Downloads 463
41730 From Text to Data: Sentiment Analysis of Presidential Election Political Forums

Authors: Sergio V Davalos, Alison L. Watkins

Abstract:

User generated content (UGC) such as website post has data associated with it: time of the post, gender, location, type of device, and number of words. The text entered in user generated content (UGC) can provide a valuable dimension for analysis. In this research, each user post is treated as a collection of terms (words). In addition to the number of words per post, the frequency of each term is determined by post and by the sum of occurrences in all posts. This research focuses on one specific aspect of UGC: sentiment. Sentiment analysis (SA) was applied to the content (user posts) of two sets of political forums related to the US presidential elections for 2012 and 2016. Sentiment analysis results in deriving data from the text. This enables the subsequent application of data analytic methods. The SASA (SAIL/SAI Sentiment Analyzer) model was used for sentiment analysis. The application of SASA resulted with a sentiment score for each post. Based on the sentiment scores for the posts there are significant differences between the content and sentiment of the two sets for the 2012 and 2016 presidential election forums. In the 2012 forums, 38% of the forums started with positive sentiment and 16% with negative sentiment. In the 2016 forums, 29% started with positive sentiment and 15% with negative sentiment. There also were changes in sentiment over time. For both elections as the election got closer, the cumulative sentiment score became negative. The candidate who won each election was in the more posts than the losing candidates. In the case of Trump, there were more negative posts than Clinton’s highest number of posts which were positive. KNIME topic modeling was used to derive topics from the posts. There were also changes in topics and keyword emphasis over time. Initially, the political parties were the most referenced and as the election got closer the emphasis changed to the candidates. The performance of the SASA method proved to predict sentiment better than four other methods in Sentibench. The research resulted in deriving sentiment data from text. In combination with other data, the sentiment data provided insight and discovery about user sentiment in the US presidential elections for 2012 and 2016.

Keywords: sentiment analysis, text mining, user generated content, US presidential elections

Procedia PDF Downloads 190
41729 Data Projects for “Social Good”: Challenges and Opportunities

Authors: Mikel Niño, Roberto V. Zicari, Todor Ivanov, Kim Hee, Naveed Mushtaq, Marten Rosselli, Concha Sánchez-Ocaña, Karsten Tolle, José Miguel Blanco, Arantza Illarramendi, Jörg Besier, Harry Underwood

Abstract:

One of the application fields for data analysis techniques and technologies gaining momentum is the area of social good or “common good”, covering cases related to humanitarian crises, global health care, or ecology and environmental issues, among others. The promotion of data-driven projects in this field aims at increasing the efficacy and efficiency of social initiatives, improving the way these actions help humanity in general and people in need in particular. This application field, however, poses its own barriers and challenges when developing data-driven projects, lagging behind in comparison with other scenarios. These challenges derive from aspects such as the scope and scale of the social issue to solve, cultural and political barriers, the skills of main stakeholders and the technological resources available, the motivation to be engaged in such projects, or the ethical and legal issues related to sensitive data. This paper analyzes the application of data projects in the field of social good, reviewing its current state and noteworthy initiatives, and presenting a framework covering the key aspects to analyze in such projects. The goal is to provide guidelines to understand the main challenges and opportunities for this type of data project, as well as identifying the main differential issues compared to “classical” data projects in general. A case study is presented on the initial steps and stakeholder analysis of a data project for the inclusion of refugees in the city of Frankfurt, Germany, in order to empirically confront the framework with a real example.

Keywords: data-driven projects, humanitarian operations, personal and sensitive data, social good, stakeholders analysis

Procedia PDF Downloads 325
41728 The Perspective on Data Collection Instruments for Younger Learners

Authors: Hatice Kübra Koç

Abstract:

For academia, collecting reliable and valid data is one of the most significant issues for researchers. However, it is not the same procedure for all different target groups; meanwhile, during data collection from teenagers, young adults, or adults, researchers can use common data collection tools such as questionnaires, interviews, and semi-structured interviews; yet, for young learners and very young ones, these reliable and valid data collection tools cannot be easily designed or applied by the researchers. In this study, firstly, common data collection tools are examined for ‘very young’ and ‘young learners’ participant groups since it is thought that the quality and efficiency of an academic study is mainly based on its valid and correct data collection and data analysis procedure. Secondly, two different data collection instruments for very young and young learners are stated as discussing the efficacy of them. Finally, a suggested data collection tool – a performance-based questionnaire- which is specifically developed for ‘very young’ and ‘young learners’ participant groups in the field of teaching English to young learners as a foreign language is presented in this current study. The designing procedure and suggested items/factors for the suggested data collection tool are accordingly revealed at the end of the study to help researchers have studied with young and very learners.

Keywords: data collection instruments, performance-based questionnaire, young learners, very young learners

Procedia PDF Downloads 90
41727 A Patent Trend Analysis for Hydrogen Based Ironmaking: Identifying the Technology’s Development Phase

Authors: Ebru Kaymaz, Aslı İlbay Hamamcı, Yakup Enes Garip, Samet Ay

Abstract:

The use of hydrogen as a fuel is important for decreasing carbon emissions. For the steel industry, reducing carbon emissions is one of the most important agendas of recent times globally. Because of the Paris Agreement requirements, European steel industry studies on green steel production. Although many literature reviews have analyzed this topic from technological and hydrogen based ironmaking, there are very few studies focused on patents of decarbonize parts of the steel industry. Hence, this study focus on technological progress of hydrogen based ironmaking and on understanding the main trends through patent data. All available patent data were collected from Questel Orbit. The trend analysis of more than 900 patent documents has been carried out by using Questel Orbit Intellixir to analyze a large number of data for scientific intelligence.

Keywords: hydrogen based ironmaking, DRI, direct reduction, carbon emission, steelmaking, patent analysis

Procedia PDF Downloads 143
41726 Using Discriminant Analysis to Forecast Crime Rate in Nigeria

Authors: O. P. Popoola, O. A. Alawode, M. O. Olayiwola, A. M. Oladele

Abstract:

This research work is based on using discriminant analysis to forecast crime rate in Nigeria between 1996 and 2008. The work is interested in how gender (male and female) relates to offences committed against the government, against other properties, disturbance in public places, murder/robbery offences and other offences. The data used was collected from the National Bureau of Statistics (NBS). SPSS, the statistical package was used to analyse the data. Time plot was plotted on all the 29 offences gotten from the raw data. Eigenvalues and Multivariate tests, Wilks’ Lambda, standardized canonical discriminant function coefficients and the predicted classifications were estimated. The research shows that the distribution of the scores from each function is standardized to have a mean O and a standard deviation of 1. The magnitudes of the coefficients indicate how strongly the discriminating variable affects the score. In the predicted group membership, 172 cases that were predicted to commit crime against Government group, 66 were correctly predicted and 106 were incorrectly predicted. After going through the predicted classifications, we found out that most groups numbers that were correctly predicted were less than those that were incorrectly predicted.

Keywords: discriminant analysis, DA, multivariate analysis of variance, MANOVA, canonical correlation, and Wilks’ Lambda

Procedia PDF Downloads 467
41725 The Effectiveness of Executive Order in the Implementation of Human Security Policies: The Violent Case of the Special Anti-Robbery Squad and Youths in Nigeria

Authors: Cita Ayeni

Abstract:

Amidst numerous arguments on reasons for low Human Development (low HDI) in Nigeria ranging from corruption, incompetence of the government and its agencies, mismanagement of funds, terrorism, violence, and crime in the country, just to mention a few. There have been several actions by agencies of the government that for years has threatened the security and development of the citizens, and the country in a broader sense. This paper analyses the activities of SARS (Special Anti-Robbery Squad) as a government agency with a mandate to tackling the high rate of crime in the country but instead have been marred with allegations of violence, killings, extortion, harsh treatment, and terror of the Nigerian citizenry, predominantly the youths. This paper establishes the effect of these actions of the agency on human development in Nigeria, hindering the capacity of the Nigerian youths to earn a decent living due to constant terrorism, extortion, and extrajudicial activities, which in numerous cases resulted in maiming and death, thus instigating fear in the vast majority. This research further analyses the executive order by the then Acting President of Nigeria (Vice-President) that overhauled the agency following many years of continuous public outcry, complaint, grievance, and protest. This work establishes that this order carried out in the absence of the President was to a large extent enough to stop these violations, thereby resulting in little or no recorded complaint or grievance by the public, as many of the officials involved in the gruesome activities were said to have been put away. This would pave way and give freedom to the youths to realize their potentials free from intimidation, violence, and fear from the agencies created to protect them, and on the other hand refocus the new agency FSARS (Federal Special Anti-Robbery Squad) on its real mandate in collaboration with independent organizations acting as a check to its actions. This work thus depicts how direct executive orders on policies pertaining to individual insecurities, on youths in this case, in a country can be a potential drive to increased human development.

Keywords: special anti-robbery squad, Nigerian youths, overhaul, insecurities, human development

Procedia PDF Downloads 169
41724 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu

Authors: Ammarah Irum, Muhammad Ali Tahir

Abstract:

Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well-suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representation from Transformer (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing BiLSTM and CNN architecture. The proposed and baseline techniques are applied on Urdu Customer Support data set and IMDB Urdu movie review data set by using pre-trained Urdu word embedding that are suitable for sentiment analysis at the document level. Results of these techniques are evaluated and our proposed model outperforms all other deep learning techniques for Urdu sentiment analysis. BiLSTM-SLMFCNN outperformed the baseline deep learning models and achieved 83%, 79%, 83% and 94% accuracy on small, medium and large sized IMDB Urdu movie review data set and Urdu Customer Support data set respectively.

Keywords: urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language

Procedia PDF Downloads 70
41723 Indexing and Incremental Approach Using Map Reduce Bipartite Graph (MRBG) for Mining Evolving Big Data

Authors: Adarsh Shroff

Abstract:

Big data is a collection of dataset so large and complex that it becomes difficult to process using data base management tools. To perform operations like search, analysis, visualization on big data by using data mining; which is the process of extraction of patterns or knowledge from large data set. In recent years, the data mining applications become stale and obsolete over time. Incremental processing is a promising approach to refreshing mining results. It utilizes previously saved states to avoid the expense of re-computation from scratch. This project uses i2MapReduce, an incremental processing extension to Map Reduce, the most widely used framework for mining big data. I2MapReduce performs key-value pair level incremental processing rather than task level re-computation, supports not only one-step computation but also more sophisticated iterative computation, which is widely used in data mining applications, and incorporates a set of novel techniques to reduce I/O overhead for accessing preserved fine-grain computation states. To optimize the mining results, evaluate i2MapReduce using a one-step algorithm and three iterative algorithms with diverse computation characteristics for efficient mining.

Keywords: big data, map reduce, incremental processing, iterative computation

Procedia PDF Downloads 349
41722 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on $k$-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 166
41721 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0 %, 80.5 %, 80.5 %, 63.6 %, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 158
41720 Possible Risks for Online Orders in the Furniture Industry - Customer and Entrepreneur Perspective

Authors: Justyna Żywiołek, Marek Matulewski

Abstract:

Data, is information processed by enterprises for primary and secondary purposes as processes. Thanks to processing, the sales process takes place; in the case of the surveyed companies, sales take place online. However, this indirect form of contact with the customer causes many problems for both customers and furniture manufacturers. The article presents solutions that would solve problems related to the analysis of data and information in the order fulfillment process sent to post-warranty service. The article also presents an analysis of threats to the security of this information, both for customers and the enterprise.

Keywords: ordering furniture online, information security, furniture industry, enterprise security, risk analysis

Procedia PDF Downloads 47
41719 Fragile States as the Fertile Ground for Non-State Actors: Colombia and Somalia

Authors: Giorgi Goguadze, Jakub Zajączkowski

Abstract:

This paper is written due to overview the connection between fragile states and non-state actors, we should take into account that fragile states may vary from weak, failing and failed. In this paper we will discuss about two countries, one of them is weak (Colombia/ second one is already failed- Somalia. We will try to understand what feeds ill non-state actors such as: terrorist organizations, criminal entities and other cells in these countries, what threats are they representing and how to eliminate these dangers in both national and international scope. This paper is mainly based on literature overview and personal attitude and doesn’t claim to be in scientific chain.

Keywords: fragile States, terrorism, tribalism, Somalia

Procedia PDF Downloads 365
41718 An Analysis of Privacy and Security for Internet of Things Applications

Authors: Dhananjay Singh, M. Abdullah-Al-Wadud

Abstract:

The Internet of Things is a concept of a large scale ecosystem of wireless actuators. The actuators are defined as things in the IoT, those which contribute or produces some data to the ecosystem. However, ubiquitous data collection, data security, privacy preserving, large volume data processing, and intelligent analytics are some of the key challenges into the IoT technologies. In order to solve the security requirements, challenges and threats in the IoT, we have discussed a message authentication mechanism for IoT applications. Finally, we have discussed data encryption mechanism for messages authentication before propagating into IoT networks.

Keywords: Internet of Things (IoT), message authentication, privacy, security

Procedia PDF Downloads 381
41717 Design and Development of Data Mining Application for Medical Centers in Remote Areas

Authors: Grace Omowunmi Soyebi

Abstract:

Data Mining is the extraction of information from a large database which helps in predicting a trend or behavior, thereby helping management make knowledge-driven decisions. One principal problem of most hospitals in rural areas is making use of the file management system for keeping records. A lot of time is wasted when a patient visits the hospital, probably in an emergency, and the nurse or attendant has to search through voluminous files before the patient's file can be retrieved; this may cause an unexpected to happen to the patient. This Data Mining application is to be designed using a Structured System Analysis and design method, which will help in a well-articulated analysis of the existing file management system, feasibility study, and proper documentation of the Design and Implementation of a Computerized medical record system. This Computerized system will replace the file management system and help to easily retrieve a patient's record with increased data security, access clinical records for decision-making, and reduce the time range at which a patient gets attended to.

Keywords: data mining, medical record system, systems programming, computing

Procedia PDF Downloads 207
41716 Application of Regularized Spatio-Temporal Models to the Analysis of Remote Sensing Data

Authors: Salihah Alghamdi, Surajit Ray

Abstract:

Space-time data can be observed over irregularly shaped manifolds, which might have complex boundaries or interior gaps. Most of the existing methods do not consider the shape of the data, and as a result, it is difficult to model irregularly shaped data accommodating the complex domain. We used a method that can deal with space-time data that are distributed over non-planner shaped regions. The method is based on partial differential equations and finite element analysis. The model can be estimated using a penalized least squares approach with a regularization term that controls the over-fitting. The model is regularized using two roughness penalties, which consider the spatial and temporal regularities separately. The integrated square of the second derivative of the basis function is used as temporal penalty. While the spatial penalty consists of the integrated square of Laplace operator, which is integrated exclusively over the domain of interest that is determined using finite element technique. In this paper, we applied a spatio-temporal regression model with partial differential equations regularization (ST-PDE) approach to analyze a remote sensing data measuring the greenness of vegetation, measure by an index called enhanced vegetation index (EVI). The EVI data consist of measurements that take values between -1 and 1 reflecting the level of greenness of some region over a period of time. We applied (ST-PDE) approach to irregular shaped region of the EVI data. The approach efficiently accommodates the irregular shaped regions taking into account the complex boundaries rather than smoothing across the boundaries. Furthermore, the approach succeeds in capturing the temporal variation in the data.

Keywords: irregularly shaped domain, partial differential equations, finite element analysis, complex boundray

Procedia PDF Downloads 139
41715 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating comparison and integration between JSON data to XML and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 138
41714 TRACE/FRAPTRAN Analysis of Kuosheng Nuclear Power Plant Dry-Storage System

Authors: J. R. Wang, Y. Chiang, W. Y. Li, H. T. Lin, H. C. Chen, C. Shih, S. W. Chen

Abstract:

The dry-storage systems of nuclear power plants (NPPs) in Taiwan have become one of the major safety concerns. There are two steps considered in this study. The first step is the verification of the TRACE by using VSC-17 experimental data. The results of TRACE were similar to the VSC-17 data. It indicates that TRACE has the respectable accuracy in the simulation and analysis of the dry-storage systems. The next step is the application of TRACE in the dry-storage system of Kuosheng NPP (BWR/6). Kuosheng NPP is the second BWR NPP of Taiwan Power Company. In order to solve the storage of the spent fuels, Taiwan Power Company developed the new dry-storage system for Kuosheng NPP. In this step, the dry-storage system model of Kuosheng NPP was established by TRACE. Then, the steady state simulation of this model was performed and the results of TRACE were compared with the Kuosheng NPP data. Finally, this model was used to perform the safety analysis of Kuosheng NPP dry-storage system. Besides, FRAPTRAN was used tocalculate the transient performance of fuel rods.

Keywords: BWR, TRACE, FRAPTRAN, dry-storage

Procedia PDF Downloads 517