Search results for: forest cover-type dataset
913 Imputation of Urban Movement Patterns Using Big Data
Authors: Eusebio Odiari, Mark Birkin, Susan Grant-Muller, Nicolas Malleson
Abstract:
Big data typically refers to consumer datasets revealing some detailed heterogeneity in human behavior, which if harnessed appropriately, could potentially revolutionize our understanding of the collective phenomena of the physical world. Inadvertent missing values skew these datasets and compromise the validity of the thesis. Here we discuss a conceptually consistent strategy for identifying other relevant datasets to combine with available big data, to plug the gaps and to create a rich requisite comprehensive dataset for subsequent analysis. Specifically, emphasis is on how these methodologies can for the first time enable the construction of more detailed pictures of passenger demand and drivers of mobility on the railways. These methodologies can predict the influence of changes within the network (like a change in time-table or impact of a new station), explain local phenomena outside the network (like rail-heading) and the other impacts of urban morphology. Our analysis also reveals that our new imputation data model provides for more equitable revenue sharing amongst network operators who manage different parts of the integrated UK railways.Keywords: big-data, micro-simulation, mobility, ticketing-data, commuters, transport, synthetic, population
Procedia PDF Downloads 231912 Artificial Intelligence Based Method in Identifying Tumour Infiltrating Lymphocytes of Triple Negative Breast Cancer
Authors: Nurkhairul Bariyah Baharun, Afzan Adam, Reena Rahayu Md Zin
Abstract:
Tumor microenvironment (TME) in breast cancer is mainly composed of cancer cells, immune cells, and stromal cells. The interaction between cancer cells and their microenvironment plays an important role in tumor development, progression, and treatment response. The TME in breast cancer includes tumor-infiltrating lymphocytes (TILs) that are implicated in killing tumor cells. TILs can be found in tumor stroma (sTILs) and within the tumor (iTILs). TILs in triple negative breast cancer (TNBC) have been demonstrated to have prognostic and potentially predictive value. The international Immune-Oncology Biomarker Working Group (TIL-WG) had developed a guideline focus on the assessment of sTILs using hematoxylin and eosin (H&E)-stained slides. According to the guideline, the pathologists use “eye balling” method on the H&E stained- slide for sTILs assessment. This method has low precision, poor interobserver reproducibility, and is time-consuming for a comprehensive evaluation, besides only counted sTILs in their assessment. The TIL-WG has therefore recommended that any algorithm for computational assessment of TILs utilizing the guidelines provided to overcome the limitations of manual assessment, thus providing highly accurate and reliable TILs detection and classification for reproducible and quantitative measurement. This study is carried out to develop a TNBC digital whole slide image (WSI) dataset from H&E-stained slides and IHC (CD4+ and CD8+) stained slides. TNBC cases were retrieved from the database of the Department of Pathology, Hospital Canselor Tuanku Muhriz (HCTM). TNBC cases diagnosed between the year 2010 and 2021 with no history of other cancer and available block tissue were included in the study (n=58). Tissue blocks were sectioned approximately 4 µm for H&E and IHC stain. The H&E staining was performed according to a well-established protocol. Indirect IHC stain was also performed on the tissue sections using protocol from Diagnostic BioSystems PolyVue™ Plus Kit, USA. The slides were stained with rabbit monoclonal, CD8 antibody (SP16) and Rabbit monoclonal, CD4 antibody (EP204). The selected and quality-checked slides were then scanned using a high-resolution whole slide scanner (Pannoramic DESK II DW- slide scanner) to digitalize the tissue image with a pixel resolution of 20x magnification. A manual TILs (sTILs and iTILs) assessment was then carried out by the appointed pathologist (2 pathologists) for manual TILs scoring from the digital WSIs following the guideline developed by TIL-WG 2014, and the result displayed as the percentage of sTILs and iTILs per mm² stromal and tumour area on the tissue. Following this, we aimed to develop an automated digital image scoring framework that incorporates key elements of manual guidelines (including both sTILs and iTILs) using manually annotated data for robust and objective quantification of TILs in TNBC. From the study, we have developed a digital dataset of TNBC H&E and IHC (CD4+ and CD8+) stained slides. We hope that an automated based scoring method can provide quantitative and interpretable TILs scoring, which correlates with the manual pathologist-derived sTILs and iTILs scoring and thus has potential prognostic implications.Keywords: automated quantification, digital pathology, triple negative breast cancer, tumour infiltrating lymphocytes
Procedia PDF Downloads 118911 Phylogenetic Analysis and a Review of the History of the Accidental Phytoplankter, Phaeodactylum tricornutum Bohlin (Bacillariophyta)
Authors: Jamal S. M. Sabir, Edward C. Theriot, Schonna R. Manning, Abdulrahman L. Al-Malki, Mohammad, Mumdooh J. Sabir, Dwight K. Romanovicz, Nahid H. Hajrah, Robert K. Jansen, Matt P. Ashworth
Abstract:
The diatom Phaeodactylum tricornutum has been used as a model for cell biologists and ecologists for over a century. We have incorporated several new raphid pennates into a three-gene phylogenetic dataset (SSU, rbcL, psbC), and recover Gomphonemopsis sp. as sister to P. tricornutum with 100% BS support. This is the first time a close relative has been identified for P. tricornutum with robust statistical support. We test and reject a succession of hypotheses for other relatives. Our molecular data are statistically significantly incongruent with placement of either or both species among the Cymbellales, an order of diatoms with which both have been associated. We believe that further resolution of the phylogenetic position of P. tricornutum will rely more on increased taxon sampling than increased genetic sampling. Gomphonemopsis is a benthic diatom, and its phylogenetic relationship with P. tricornutum is congruent with the hypothesis that P. tricornutum is a benthic diatom with specific adaptations that lead to active recruitment into the plankton. We hypothesize that other benthic diatoms are likely to have similar adaptations and are not merely passively recruited into the plankton.Keywords: benthic, diatoms; ecology, Phaeodactylum tricornutum, phylogeny, tychoplankton
Procedia PDF Downloads 238910 Walmart Sales Forecasting using Machine Learning in Python
Authors: Niyati Sharma, Om Anand, Sanjeev Kumar Prasad
Abstract:
Assuming future sale value for any of the organizations is one of the major essential characteristics of tactical development. Walmart Sales Forecasting is the finest illustration to work with as a beginner; subsequently, it has the major retail data set. Walmart uses this sales estimate problem for hiring purposes also. We would like to analyzing how the internal and external effects of one of the largest companies in the US can walk out their Weekly Sales in the future. Demand forecasting is the planned prerequisite of products or services in the imminent on the basis of present and previous data and different stages of the market. Since all associations is facing the anonymous future and we do not distinguish in the future good demand. Hence, through exploring former statistics and recent market statistics, we envisage the forthcoming claim and building of individual goods, which are extra challenging in the near future. As a result of this, we are producing the required products in pursuance of the petition of the souk in advance. We will be using several machine learning models to test the exactness and then lastly, train the whole data by Using linear regression and fitting the training data into it. Accuracy is 8.88%. The extra trees regression model gives the best accuracy of 97.15%.Keywords: random forest algorithm, linear regression algorithm, extra trees classifier, mean absolute error
Procedia PDF Downloads 149909 Hybrid Anomaly Detection Using Decision Tree and Support Vector Machine
Authors: Elham Serkani, Hossein Gharaee Garakani, Naser Mohammadzadeh, Elaheh Vaezpour
Abstract:
Intrusion detection systems (IDS) are the main components of network security. These systems analyze the network events for intrusion detection. The design of an IDS is through the training of normal traffic data or attack. The methods of machine learning are the best ways to design IDSs. In the method presented in this article, the pruning algorithm of C5.0 decision tree is being used to reduce the features of traffic data used and training IDS by the least square vector algorithm (LS-SVM). Then, the remaining features are arranged according to the predictor importance criterion. The least important features are eliminated in the order. The remaining features of this stage, which have created the highest level of accuracy in LS-SVM, are selected as the final features. The features obtained, compared to other similar articles which have examined the selected features in the least squared support vector machine model, are better in the accuracy, true positive rate, and false positive. The results are tested by the UNSW-NB15 dataset.Keywords: decision tree, feature selection, intrusion detection system, support vector machine
Procedia PDF Downloads 266908 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data
Authors: Muthukumarasamy Govindarajan
Abstract:
Detection of unwanted, unsolicited mails called spam from email is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient email filtering approach based on ensemble methods is addressed for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM) and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier was performing with accuracy greater than individual classifiers, and also hybrid model results are found to be better than the combined models for the e-mail dataset. The proposed ensemble-based classifiers turn out to be good in terms of classification accuracy, which is considered to be an important criterion for building a robust spam classifier.Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine
Procedia PDF Downloads 143907 Identifying Degradation Patterns of LI-Ion Batteries from Impedance Spectroscopy Using Machine Learning
Authors: Yunwei Zhang, Qiaochu Tang, Yao Zhang, Jiabin Wang, Ulrich Stimming, Alpha Lee
Abstract:
Forecasting the state of health and remaining useful life of Li-ion batteries is an unsolved challenge that limits technologies such as consumer electronics and electric vehicles. Here we build an accurate battery forecasting system by combining electrochemical impedance spectroscopy (EIS) -- a real-time, non-invasive and information-rich measurement that is hitherto underused in battery diagnosis -- with Gaussian process machine learning. We collect over 20,000 EIS spectra of commercial Li-ion batteries at different states of health, states of charge and temperatures -- the largest dataset to our knowledge of its kind. Our Gaussian process model takes the entire spectrum as input, without further feature engineering, and automatically determines which spectral features predict degradation. Our model accurately predicts the remaining useful life, even without complete knowledge of past operating conditions of the battery. Our results demonstrate the value of EIS signals in battery management systems.Keywords: battery degradation, machine learning method, electrochemical impedance spectroscopy, battery diagnosis
Procedia PDF Downloads 149906 Indexing and Incremental Approach Using Map Reduce Bipartite Graph (MRBG) for Mining Evolving Big Data
Authors: Adarsh Shroff
Abstract:
Big data is a collection of dataset so large and complex that it becomes difficult to process using data base management tools. To perform operations like search, analysis, visualization on big data by using data mining; which is the process of extraction of patterns or knowledge from large data set. In recent years, the data mining applications become stale and obsolete over time. Incremental processing is a promising approach to refreshing mining results. It utilizes previously saved states to avoid the expense of re-computation from scratch. This project uses i2MapReduce, an incremental processing extension to Map Reduce, the most widely used framework for mining big data. I2MapReduce performs key-value pair level incremental processing rather than task level re-computation, supports not only one-step computation but also more sophisticated iterative computation, which is widely used in data mining applications, and incorporates a set of novel techniques to reduce I/O overhead for accessing preserved fine-grain computation states. To optimize the mining results, evaluate i2MapReduce using a one-step algorithm and three iterative algorithms with diverse computation characteristics for efficient mining.Keywords: big data, map reduce, incremental processing, iterative computation
Procedia PDF Downloads 353905 Artificial Neural Network-Based Short-Term Load Forecasting for Mymensingh Area of Bangladesh
Authors: S. M. Anowarul Haque, Md. Asiful Islam
Abstract:
Electrical load forecasting is considered to be one of the most indispensable parts of a modern-day electrical power system. To ensure a reliable and efficient supply of electric energy, special emphasis should have been put on the predictive feature of electricity supply. Artificial Neural Network-based approaches have emerged to be a significant area of interest for electric load forecasting research. This paper proposed an Artificial Neural Network model based on the particle swarm optimization algorithm for improved electric load forecasting for Mymensingh, Bangladesh. The forecasting model is developed and simulated on the MATLAB environment with a large number of training datasets. The model is trained based on eight input parameters including historical load and weather data. The predicted load data are then compared with an available dataset for validation. The proposed neural network model is proved to be more reliable in terms of day-wise load forecasting for Mymensingh, Bangladesh.Keywords: load forecasting, artificial neural network, particle swarm optimization
Procedia PDF Downloads 172904 Incorporating Multiple Supervised Learning Algorithms for Effective Intrusion Detection
Authors: Umar Albalawi, Sang C. Suh, Jinoh Kim
Abstract:
As internet continues to expand its usage with an enormous number of applications, cyber-threats have significantly increased accordingly. Thus, accurate detection of malicious traffic in a timely manner is a critical concern in today’s Internet for security. One approach for intrusion detection is to use Machine Learning (ML) techniques. Several methods based on ML algorithms have been introduced over the past years, but they are largely limited in terms of detection accuracy and/or time and space complexity to run. In this work, we present a novel method for intrusion detection that incorporates a set of supervised learning algorithms. The proposed technique provides high accuracy and outperforms existing techniques that simply utilizes a single learning method. In addition, our technique relies on partial flow information (rather than full information) for detection, and thus, it is light-weight and desirable for online operations with the property of early identification. With the mid-Atlantic CCDC intrusion dataset publicly available, we show that our proposed technique yields a high degree of detection rate over 99% with a very low false alarm rate (0.4%).Keywords: intrusion detection, supervised learning, traffic classification, computer networks
Procedia PDF Downloads 351903 Monitoring the Vegetation Cover Dynamics of the African Great Green Wall in Yobe State Nigeria
Authors: Isa Muhammad Zumo
Abstract:
The African Great Green Wall (GGW) is a significant initiative in northern Nigeria because it promotes land restoration and conservation utilizing both commercial and species of forest trees while also helping to mitigate desertification and hazards from the sand dunes and shifting Sahara deserts. Conflicts and weather, however, pose a significant danger to the achievement of these goals. The scientific method for monitoring the vegetation dynamics since inception has not received the required attention, despite the African Development Bank (ADB)'s help in funding the project and its integration into the state's development plans for GGW initiatives. This study will monitor the changes in the vegetation cover of the great green wall within Yobe State Nigeria from 2014 to 2023. The vegetation dynamics will be monitored using Landsat 8 Operational Land Imager (OLI) for 6 years at 2 years intervals. The result will show the fluctuations in the vegetation cover density within the period of study. This will guide the design and implementation of policies of the GGW in achieving its objectives. The result can also contribute to the realization of Sustainable Development Goal (SDG) Target 13.2: Integrate climate change measures into national policies, strategies, and planning.Keywords: monitoring, green wall, Landsat 8, Nigeria
Procedia PDF Downloads 86902 Analysis of Patient No-Shows According to Health Conditions
Authors: Sangbok Lee
Abstract:
There has been much effort on process improvement for outpatient clinics to provide quality and acute care to patients. One of the efforts is no-show analysis or prediction. This work analyzes patient no-shows along with patient health conditions. The health conditions refer to clinical symptoms that each patient has, out of the followings; hyperlipidemia, diabetes, metastatic solid tumor, dementia, chronic obstructive pulmonary disease, hypertension, coronary artery disease, myocardial infraction, congestive heart failure, atrial fibrillation, stroke, drug dependence abuse, schizophrenia, major depression, and pain. A dataset from a regional hospital is used to find the relationship between the number of the symptoms and no-show probabilities. Additional analysis reveals how each symptom or combination of symptoms affects no-shows. In the above analyses, cross-classification of patients by age and gender is carried out. The findings from the analysis will be used to take extra care to patients with particular health conditions. They will be forced to visit clinics by being informed about their health conditions and possible consequences more clearly. Moreover, this work will be used in the preparation of making institutional guidelines for patient reminder systems.Keywords: healthcare system, no show analysis, process improvment, statistical data analysis
Procedia PDF Downloads 233901 Analysing Causal Effect of London Cycle Superhighways on Traffic Congestion
Authors: Prajamitra Bhuyan
Abstract:
Transport operators have a range of intervention options available to improve or enhance their networks. But often such interventions are made in the absence of sound evidence on what outcomes may result. Cycling superhighways were promoted as a sustainable and healthy travel mode which aims to cut traffic congestion. The estimation of the impacts of the cycle superhighways on congestion is complicated due to the non-random assignment of such intervention over the transport network. In this paper, we analyse the causal effect of cycle superhighways utilising pre-innervation and post-intervention information on traffic and road characteristics along with socio-economic factors. We propose a modeling framework based on the propensity score and outcome regression model. The method is also extended to doubly robust set-up. Simulation results show the superiority of the performance of the proposed method over existing competitors. The method is applied to analyse a real dataset on the London transport network, and the result would help effective decision making to improve network performance.Keywords: average treatment effect, confounder, difference-in-difference, intelligent transportation system, potential outcome
Procedia PDF Downloads 242900 Assessment of Heavy Metal Contamination in Roadside Soils along Shenyang-Dalian Highway in Liaoning Province, China
Authors: Zhang Hui, Wu Caiqiu, Yuan Xuyin, Qiu Jie, Zhang Hanpei
Abstract:
The heavy metal contaminations were determined with a detailed soil survey in roadside soils along Shenyang-Dalian Highway of Liaoning Province (China) and Pb, Cu, Cd, Ni and Zn were analyzed using the atomic absorption spectrophotometric method. The average concentration of Pb, Cu, Cd, Ni and Zn in roadside soils was determined to be 43.8, 26.5, 0.119, 32.1, 71.3 mg/kg respectively, and all of the heavy metal contents were higher than the background values. Different heavy metal distribution regularity was found in different land use type of roadside soil, there was an obvious peak of heavy concentration at 25m from road edge in the farmland, while in the forest and orchard soil, all heavy metals gradually decreased with the increase of distance from road edge and conformed to the exponential model. Furthermore, the heavy metal contents of heavy metals except Cd were markedly increased compared with those in 1999 and 2007, and the heavy metals concentrations of Shenyang- Dalian Highway were considered medium or low in comparison with those in other cities around the world. The assessment of heavy metal contamination of roadside soils illustrated a common low pollution for all heavy metal and recommended that more attention should be paid to Pb contamination in roadside soils in Shenyang-Dalian Highway.Keywords: heavy metal contamination, roadside, highway, Nemerow Pollution Index
Procedia PDF Downloads 267899 Toward Automatic Chest CT Image Segmentation
Authors: Angely Sim Jia Wun, Sasa Arsovski
Abstract:
Numerous studies have been conducted on the segmentation of medical images. Segmenting the lungs is one of the common research topics in those studies. Our research stemmed from the lack of solutions for automatic bone, airway, and vessel segmentation, despite the existence of multiple lung segmentation techniques. Consequently, currently, available software tools used for medical image segmentation do not provide automatic lung, bone, airway, and vessel segmentation. This paper presents segmentation techniques along with an interactive software tool architecture for segmenting bone, lung, airway, and vessel tissues. Additionally, we propose a method for creating binary masks from automatically generated segments. The key contribution of our approach is the technique for automatic image thresholding using adjustable Hounsfield values and binary mask extraction. Generated binary masks can be successfully used as a training dataset for deep-learning solutions in medical image segmentation. In this paper, we also examine the current software tools used for medical image segmentation, discuss our approach, and identify its advantages.Keywords: lung segmentation, binary masks, U-Net, medical software tools
Procedia PDF Downloads 98898 Vehicle Detection and Tracking Using Deep Learning Techniques in Surveillance Image
Authors: Abe D. Desta
Abstract:
This study suggests a deep learning-based method for identifying and following moving objects in surveillance video. The proposed method uses a fast regional convolution neural network (F-RCNN) trained on a substantial dataset of vehicle images to first detect vehicles. A Kalman filter and a data association technique based on a Hungarian algorithm are then used to monitor the observed vehicles throughout time. However, in general, F-RCNN algorithms have been shown to be effective in achieving high detection accuracy and robustness in this research study. For example, in one study The study has shown that the vehicle detection and tracking, the system was able to achieve an accuracy of 97.4%. In this study, the F-RCNN algorithm was compared to other popular object detection algorithms and was found to outperform them in terms of both detection accuracy and speed. The presented system, which has application potential in actual surveillance systems, shows the usefulness of deep learning approaches in vehicle detection and tracking.Keywords: artificial intelligence, computer vision, deep learning, fast-regional convolutional neural networks, feature extraction, vehicle tracking
Procedia PDF Downloads 129897 Effect of Monotonically Decreasing Parameters on Margin Softmax for Deep Face Recognition
Authors: Umair Rashid
Abstract:
Normally softmax loss is used as the supervision signal in face recognition (FR) system, and it boosts the separability of features. In the last two years, a number of techniques have been proposed by reformulating the original softmax loss to enhance the discriminating power of Deep Convolutional Neural Networks (DCNNs) for FR system. To learn angularly discriminative features Cosine-Margin based softmax has been adjusted as monotonically decreasing angular function, that is the main challenge for angular based softmax. On that issue, we propose monotonically decreasing element for Cosine-Margin based softmax and also, we discussed the effect of different monotonically decreasing parameters on angular Margin softmax for FR system. We train the model on publicly available dataset CASIA- WebFace via our proposed monotonically decreasing parameters for cosine function and the tests on YouTube Faces (YTF, Labeled Face in the Wild (LFW), VGGFace1 and VGGFace2 attain the state-of-the-art performance.Keywords: deep convolutional neural networks, cosine margin face recognition, softmax loss, monotonically decreasing parameter
Procedia PDF Downloads 102896 Churn Prediction for Savings Bank Customers: A Machine Learning Approach
Authors: Prashant Verma
Abstract:
Commercial banks are facing immense pressure, including financial disintermediation, interest rate volatility and digital ways of finance. Retaining an existing customer is 5 to 25 less expensive than acquiring a new one. This paper explores customer churn prediction, based on various statistical & machine learning models and uses under-sampling, to improve the predictive power of these models. The results show that out of the various machine learning models, Random Forest which predicts the churn with 78% accuracy, has been found to be the most powerful model for the scenario. Customer vintage, customer’s age, average balance, occupation code, population code, average withdrawal amount, and an average number of transactions were found to be the variables with high predictive power for the churn prediction model. The model can be deployed by the commercial banks in order to avoid the customer churn so that they may retain the funds, which are kept by savings bank (SB) customers. The article suggests a customized campaign to be initiated by commercial banks to avoid SB customer churn. Hence, by giving better customer satisfaction and experience, the commercial banks can limit the customer churn and maintain their deposits.Keywords: savings bank, customer churn, customer retention, random forests, machine learning, under-sampling
Procedia PDF Downloads 144895 Spatiotemporal Neural Network for Video-Based Pose Estimation
Authors: Bin Ji, Kai Xu, Shunyu Yao, Jingjing Liu, Ye Pan
Abstract:
Human pose estimation is a popular research area in computer vision for its important application in human-machine interface. In recent years, 2D human pose estimation based on convolution neural network has got great progress and development. However, in more and more practical applications, people often need to deal with tasks based on video. It’s not far-fetched for us to consider how to combine the spatial and temporal information together to achieve a balance between computing cost and accuracy. To address this issue, this study proposes a new spatiotemporal model, namely Spatiotemporal Net (STNet) to combine both temporal and spatial information more rationally. As a result, the predicted keypoints heatmap is potentially more accurate and spatially more precise. Under the condition of ensuring the recognition accuracy, the algorithm deal with spatiotemporal series in a decoupled way, which greatly reduces the computation of the model, thus reducing the resource consumption. This study demonstrate the effectiveness of our network over the Penn Action Dataset, and the results indicate superior performance of our network over the existing methods.Keywords: convolutional long short-term memory, deep learning, human pose estimation, spatiotemporal series
Procedia PDF Downloads 150894 A Longitudinal Exploration into Computer-Mediated Communication Use (CMC) and Relationship Change between 2005-2018
Authors: Laurie Dempsey
Abstract:
Relationships are considered to be beneficial for emotional wellbeing, happiness and physical health. However, they are also complicated: individuals engage in a multitude of complex and volatile relationships during their lifetime, where the change to or ending of these dynamics can be deeply disruptive. As the internet is further integrated into everyday life and relationships are increasingly mediated, Media Studies’ and Sociology’s research interests intersect and converge. This study longitudinally explores how relationship change over time corresponds with the developing UK technological landscape between 2005-2018. Since the early 2000s, the use of computer-mediated communication (CMC) in the UK has dramatically reshaped interaction. Its use has compelled individuals to renegotiate how they consider their relationships: some argue it has allowed for vast networks to be accumulated and strengthened; others contend that it has eradicated the core values and norms associated with communication, damaging relationships. This research collaborated with UK media regulator Ofcom, utilising the longitudinal dataset from their Adult Media Lives study to explore how relationships and CMC use developed over time. This is a unique qualitative dataset covering 2005-2018, where the same 18 participants partook in annual in-home filmed depth interviews. The interviews’ raw video footage was examined year-on-year to consider how the same people changed their reported behaviour and outlooks towards their relationships, and how this coincided with CMC featuring more prominently in their everyday lives. Each interview was transcribed, thematically analysed and coded using NVivo 11 software. This study allowed for a comprehensive exploration into these individuals’ changing relationships over time, as participants grew older, experienced marriages or divorces, conceived and raised children, or lost loved ones. It found that as technology developed between 2005-2018, everyday CMC use was increasingly normalised and incorporated into relationship maintenance. It played a crucial role in altering relationship dynamics, even factoring in the breakdown of several ties. Three key relationships were identified as being shaped by CMC use: parent-child; extended family; and friendships. Over the years there were substantial instances of relationship conflict: for parents renegotiating their dynamic with their child as they tried to both restrict and encourage their child’s technology use; for estranged family members ‘forced’ together in the online sphere; and for friendships compelled to publicly display their relationship on social media, for fear of social exclusion. However, it was also evident that CMC acted as a crucial lifeline for these participants, providing opportunities to strengthen and maintain their bonds via previously unachievable means, both over time and distance. A longitudinal study of this length and nature utilising the same participants does not currently exist, thus provides crucial insight into how and why relationship dynamics alter over time. This unique and topical piece of research draws together Sociology and Media Studies, illustrating how the UK’s changing technological landscape can reshape one of the most basic human compulsions. This collaboration with Ofcom allows for insight that can be utilised in both academia and policymaking alike, making this research relevant and impactful across a range of academic fields and industries.Keywords: computer mediated communication, longitudinal research, personal relationships, qualitative data
Procedia PDF Downloads 123893 Face Recognition Using Eigen Faces Algorithm
Authors: Shweta Pinjarkar, Shrutika Yawale, Mayuri Patil, Reshma Adagale
Abstract:
Face recognition is the technique which can be applied to the wide variety of problems like image and film processing, human computer interaction, criminal identification etc. This has motivated researchers to develop computational models to identify the faces, which are easy and simple to implement. In this, demonstrates the face recognition system in android device using eigenface. The system can be used as the base for the development of the recognition of human identity. Test images and training images are taken directly with the camera in android device.The test results showed that the system produces high accuracy. The goal is to implement model for particular face and distinguish it with large number of stored faces. face recognition system detects the faces in picture taken by web camera or digital camera and these images then checked with training images dataset based on descriptive features. Further this algorithm can be extended to recognize the facial expressions of a person.recognition could be carried out under widely varying conditions like frontal view,scaled frontal view subjects with spectacles. The algorithm models the real time varying lightning conditions. The implemented system is able to perform real-time face detection, face recognition and can give feedback giving a window with the subject's info from database and sending an e-mail notification to interested institutions using android application. Face recognition is the technique which can be applied to the wide variety of problems like image and film processing, human computer interaction, criminal identification etc. This has motivated researchers to develop computational models to identify the faces, which are easy and simple to implement. In this , demonstrates the face recognition system in android device using eigenface. The system can be used as the base for the development of the recognition of human identity. Test images and training images are taken directly with the camera in android device.The test results showed that the system produces high accuracy. The goal is to implement model for particular face and distinguish it with large number of stored faces. face recognition system detects the faces in picture taken by web camera or digital camera and these images then checked with training images dataset based on descriptive features. Further this algorithm can be extended to recognize the facial expressions of a person.recognition could be carried out under widely varying conditions like frontal view,scaled frontal view subjects with spectacles. The algorithm models the real time varying lightning conditions. The implemented system is able to perform real-time face detection, face recognition and can give feedback giving a window with the subject's info from database and sending an e-mail notification to interested institutions using android application.Keywords: face detection, face recognition, eigen faces, algorithm
Procedia PDF Downloads 361892 Development of Fake News Model Using Machine Learning through Natural Language Processing
Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini
Abstract:
Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those cases where we have tacit knowledge or the knowledge that is not known. We used machine learning algorithms and for identification of fake news; we applied three classifiers; Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and after that incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake due to the unavailability of corpora. We applied three different machine learning classifiers on two publicly available datasets. Experimental analysis based on the existing dataset indicates a very encouraging and improved performance.Keywords: fake news detection, natural language processing, machine learning, classification techniques.
Procedia PDF Downloads 168891 Machine Learning Approach for Automating Electronic Component Error Classification and Detection
Authors: Monica Racha, Siva Chandrasekaran, Alex Stojcevski
Abstract:
The engineering programs focus on promoting students' personal and professional development by ensuring that students acquire technical and professional competencies during four-year studies. The traditional engineering laboratory provides an opportunity for students to "practice by doing," and laboratory facilities aid them in obtaining insight and understanding of their discipline. Due to rapid technological advancements and the current COVID-19 outbreak, the traditional labs were transforming into virtual learning environments. Aim: To better understand the limitations of the physical laboratory, this research study aims to use a Machine Learning (ML) algorithm that interfaces with the Augmented Reality HoloLens and predicts the image behavior to classify and detect the electronic components. The automated electronic components error classification and detection automatically detect and classify the position of all components on a breadboard by using the ML algorithm. This research will assist first-year undergraduate engineering students in conducting laboratory practices without any supervision. With the help of HoloLens, and ML algorithm, students will reduce component placement error on a breadboard and increase the efficiency of simple laboratory practices virtually. Method: The images of breadboards, resistors, capacitors, transistors, and other electrical components will be collected using HoloLens 2 and stored in a database. The collected image dataset will then be used for training a machine learning model. The raw images will be cleaned, processed, and labeled to facilitate further analysis of components error classification and detection. For instance, when students conduct laboratory experiments, the HoloLens captures images of students placing different components on a breadboard. The images are forwarded to the server for detection in the background. A hybrid Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) algorithm will be used to train the dataset for object recognition and classification. The convolution layer extracts image features, which are then classified using Support Vector Machine (SVM). By adequately labeling the training data and classifying, the model will predict, categorize, and assess students in placing components correctly. As a result, the data acquired through HoloLens includes images of students assembling electronic components. It constantly checks to see if students appropriately position components in the breadboard and connect the components to function. When students misplace any components, the HoloLens predicts the error before the user places the components in the incorrect proportion and fosters students to correct their mistakes. This hybrid Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) algorithm automating electronic component error classification and detection approach eliminates component connection problems and minimizes the risk of component damage. Conclusion: These augmented reality smart glasses powered by machine learning provide a wide range of benefits to supervisors, professionals, and students. It helps customize the learning experience, which is particularly beneficial in large classes with limited time. It determines the accuracy with which machine learning algorithms can forecast whether students are making the correct decisions and completing their laboratory tasks.Keywords: augmented reality, machine learning, object recognition, virtual laboratories
Procedia PDF Downloads 137890 Multi-Level Attentional Network for Aspect-Based Sentiment Analysis
Authors: Xinyuan Liu, Xiaojun Jing, Yuan He, Junsheng Mu
Abstract:
Aspect-based Sentiment Analysis (ABSA) has attracted much attention due to its capacity to determine the sentiment polarity of the certain aspect in a sentence. In previous works, great significance of the interaction between aspect and sentence has been exhibited in ABSA. In consequence, a Multi-Level Attentional Networks (MLAN) is proposed. MLAN consists of four parts: Embedding Layer, Encoding Layer, Multi-Level Attentional (MLA) Layers and Final Prediction Layer. Among these parts, MLA Layers including Aspect Level Attentional (ALA) Layer and Interactive Attentional (ILA) Layer is the innovation of MLAN, whose function is to focus on the important information and obtain multiple levels’ attentional weighted representation of aspect and sentence. In the experiments, MLAN is compared with classical TD-LSTM, MemNet, RAM, ATAE-LSTM, IAN, AOA, LCR-Rot and AEN-GloVe on SemEval 2014 Dataset. The experimental results show that MLAN outperforms those state-of-the-art models greatly. And in case study, the works of ALA Layer and ILA Layer have been proven to be effective and interpretable.Keywords: deep learning, aspect-based sentiment analysis, attention, natural language processing
Procedia PDF Downloads 139889 A Study on Sentiment Analysis Using Various ML/NLP Models on Historical Data of Indian Leaders
Authors: Sarthak Deshpande, Akshay Patil, Pradip Pandhare, Nikhil Wankhede, Rushali Deshmukh
Abstract:
Among the highly significant duties for any language most effective is the sentiment analysis, which is also a key area of NLP, that recently made impressive strides. There are several models and datasets available for those tasks in popular and commonly used languages like English, Russian, and Spanish. While sentiment analysis research is performed extensively, however it is lagging behind for the regional languages having few resources such as Hindi, Marathi. Marathi is one of the languages that included in the Indian Constitution’s 8th schedule and is the third most widely spoken language in the country and primarily spoken in the Deccan region, which encompasses Maharashtra and Goa. There isn’t sufficient study on sentiment analysis methods based on Marathi text due to lack of available resources, information. Therefore, this project proposes the use of different ML/NLP models for the analysis of Marathi data from the comments below YouTube content, tweets or Instagram posts. We aim to achieve a short and precise analysis and summary of the related data using our dataset (Dates, names, root words) and lexicons to locate exact information.Keywords: multilingual sentiment analysis, Marathi, natural language processing, text summarization, lexicon-based approaches
Procedia PDF Downloads 76888 Educational Data Mining: The Case of the Department of Mathematics and Computing in the Period 2009-2018
Authors: Mário Ernesto Sitoe, Orlando Zacarias
Abstract:
University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.Keywords: evasion and retention, cross-validation, bagging, stacking
Procedia PDF Downloads 83887 Artificial Reproduction System and Imbalanced Dataset: A Mendelian Classification
Authors: Anita Kushwaha
Abstract:
We propose a new evolutionary computational model called Artificial Reproduction System which is based on the complex process of meiotic reproduction occurring between male and female cells of the living organisms. Artificial Reproduction System is an attempt towards a new computational intelligence approach inspired by the theoretical reproduction mechanism, observed reproduction functions, principles and mechanisms. A reproductive organism is programmed by genes and can be viewed as an automaton, mapping and reducing so as to create copies of those genes in its off springs. In Artificial Reproduction System, the binding mechanism between male and female cells is studied, parameters are chosen and a network is constructed also a feedback system for self regularization is established. The model then applies Mendel’s law of inheritance, allele-allele associations and can be used to perform data analysis of imbalanced data, multivariate, multiclass and big data. In the experimental study Artificial Reproduction System is compared with other state of the art classifiers like SVM, Radial Basis Function, neural networks, K-Nearest Neighbor for some benchmark datasets and comparison results indicates a good performance.Keywords: bio-inspired computation, nature- inspired computation, natural computing, data mining
Procedia PDF Downloads 274886 Efficient Deep Neural Networks for Real-Time Strawberry Freshness Monitoring: A Transfer Learning Approach
Authors: Mst. Tuhin Akter, Sharun Akter Khushbu, S. M. Shaqib
Abstract:
A real-time system architecture is highly effective for monitoring and detecting various damaged products or fruits that may deteriorate over time or become infected with diseases. Deep learning models have proven to be effective in building such architectures. However, building a deep learning model from scratch is a time-consuming and costly process. A more efficient solution is to utilize deep neural network (DNN) based transfer learning models in the real-time monitoring architecture. This study focuses on using a novel strawberry dataset to develop effective transfer learning models for the proposed real-time monitoring system architecture, specifically for evaluating and detecting strawberry freshness. Several state-of-the-art transfer learning models were employed, and the best performing model was found to be Xception, demonstrating higher performance across evaluation metrics such as accuracy, recall, precision, and F1-score.Keywords: strawberry freshness evaluation, deep neural network, transfer learning, image augmentation
Procedia PDF Downloads 91885 Classification Using Worldview-2 Imagery of Giant Panda Habitat in Wolong, Sichuan Province, China
Authors: Yunwei Tang, Linhai Jing, Hui Li, Qingjie Liu, Xiuxia Li, Qi Yan, Haifeng Ding
Abstract:
The giant panda (Ailuropoda melanoleuca) is an endangered species, mainly live in central China, where bamboos act as the main food source of wild giant pandas. Knowledge of spatial distribution of bamboos therefore becomes important for identifying the habitat of giant pandas. There have been ongoing studies for mapping bamboos and other tree species using remote sensing. WorldView-2 (WV-2) is the first high resolution commercial satellite with eight Multi-Spectral (MS) bands. Recent studies demonstrated that WV-2 imagery has a high potential in classification of tree species. The advanced classification techniques are important for utilising high spatial resolution imagery. It is generally agreed that object-based image analysis is a more desirable method than pixel-based analysis in processing high spatial resolution remotely sensed data. Classifiers that use spatial information combined with spectral information are known as contextual classifiers. It is suggested that contextual classifiers can achieve greater accuracy than non-contextual classifiers. Thus, spatial correlation can be incorporated into classifiers to improve classification results. The study area is located at Wuyipeng area in Wolong, Sichuan Province. The complex environment makes it difficult for information extraction since bamboos are sparsely distributed, mixed with brushes, and covered by other trees. Extensive fieldworks in Wuyingpeng were carried out twice. The first one was on 11th June, 2014, aiming at sampling feature locations for geometric correction and collecting training samples for classification. The second fieldwork was on 11th September, 2014, for the purposes of testing the classification results. In this study, spectral separability analysis was first performed to select appropriate MS bands for classification. Also, the reflectance analysis provided information for expanding sample points under the circumstance of knowing only a few. Then, a spatially weighted object-based k-nearest neighbour (k-NN) classifier was applied to the selected MS bands to identify seven land cover types (bamboo, conifer, broadleaf, mixed forest, brush, bare land, and shadow), accounting for spatial correlation within classes using geostatistical modelling. The spatially weighted k-NN method was compared with three alternatives: the traditional k-NN classifier, the Support Vector Machine (SVM) method and the Classification and Regression Tree (CART). Through field validation, it was proved that the classification result obtained using the spatially weighted k-NN method has the highest overall classification accuracy (77.61%) and Kappa coefficient (0.729); the producer’s accuracy and user’s accuracy achieve 81.25% and 95.12% for the bamboo class, respectively, also higher than the other methods. Photos of tree crowns were taken at sample locations using a fisheye camera, so the canopy density could be estimated. It is found that it is difficult to identify bamboo in the areas with a large canopy density (over 0.70); it is possible to extract bamboos in the areas with a median canopy density (from 0.2 to 0.7) and in a sparse forest (canopy density is less than 0.2). In summary, this study explores the ability of WV-2 imagery for bamboo extraction in a mountainous region in Sichuan. The study successfully identified the bamboo distribution, providing supporting knowledge for assessing the habitats of giant pandas.Keywords: bamboo mapping, classification, geostatistics, k-NN, worldview-2
Procedia PDF Downloads 313884 Generalized Approach to Linear Data Transformation
Authors: Abhijith Asok
Abstract:
This paper presents a generalized approach for the simple linear data transformation, Y=bX, through an integration of multidimensional coordinate geometry, vector space theory and polygonal geometry. The scaling is performed by adding an additional ’Dummy Dimension’ to the n-dimensional data, which helps plot two dimensional component-wise straight lines on pairs of dimensions. The end result is a set of scaled extensions of observations in any of the 2n spatial divisions, where n is the total number of applicable dimensions/dataset variables, created by shifting the n-dimensional plane along the ’Dummy Axis’. The derived scaling factor was found to be dependent on the coordinates of the common point of origin for diverging straight lines and the plane of extension, chosen on and perpendicular to the ’Dummy Axis’, respectively. This result indicates the geometrical interpretation of a linear data transformation and hence, opportunities for a more informed choice of the factor ’b’, based on a better choice of these coordinate values. The paper follows on to identify the effect of this transformation on certain popular distance metrics, wherein for many, the distance metric retained the same scaling factor as that of the features.Keywords: data transformation, dummy dimension, linear transformation, scaling
Procedia PDF Downloads 299