Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2341

Search results for: Adult dataset

2251 Analysis of Real Time Seismic Signal Dataset Using Machine Learning

Authors: Sujata Kulkarni, Udhav Bhosle, Vijaykumar T.

Abstract:

Because seismic and non-seismic signals closely resemble each other, detecting earthquakes with conventional methods is difficult. To distinguish seismic events from non-seismic events based on their amplitude, our study processes data from seismic sensors. The authors propose a robust noise suppression technique that uses a bandpass filter and an IIR Wiener filter, together with recursive short-term average/long-term average (STA/LTA) and Carl STA/LTA for event identification. The trigger ratio used to differentiate between seismic and non-seismic activity is determined. The proposed work focuses on extracting significant features for machine learning-based seismic event detection, which motivates compiling a dataset of all features for the identification and forecasting of seismic signals. Because of the time complexity, we emphasize techniques for reducing the dimension of the feature vector. The proposed features were tested experimentally using a machine learning model, and the results on unseen data are optimal. Finally, a demonstration using a hybrid dataset (captured by different sensors) shows how this model may also be employed in a real-time setting while lowering false alarm rates. The study is based on the examination of seismic signals obtained from both individual sensors and sensor networks (SN). The experimental dataset consists of wideband seismic signals from the BSVK and CUKG station sensors, located near Basavakalyan, Karnataka, and the Central University of Karnataka, respectively.
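As a rough illustration of the trigger ratio mentioned in the abstract, the following minimal sketch computes the classic (non-recursive) STA/LTA ratio on a synthetic trace. It is not the authors' implementation: the window lengths, the threshold of 3.0, the squared-amplitude characteristic function and the synthetic signal are all assumptions chosen for the example, and the recursive and Carl variants named in the abstract are not shown.

```python
def sta_lta_ratio(signal, n_sta, n_lta):
    """Classic STA/LTA ratio computed on the squared signal (energy).

    n_sta and n_lta are window lengths in samples, with n_sta < n_lta.
    Samples before the first full long-term window get a ratio of 0.0.
    """
    energy = [x * x for x in signal]
    # Prefix sums turn every window average into a constant-time lookup.
    prefix = [0.0]
    for e in energy:
        prefix.append(prefix[-1] + e)

    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


def detect_events(signal, n_sta=5, n_lta=50, threshold=3.0):
    """Return the sample indices where the STA/LTA ratio exceeds the threshold."""
    ratios = sta_lta_ratio(signal, n_sta, n_lta)
    return [i for i, r in enumerate(ratios) if r > threshold]


# Hypothetical trace: low-amplitude background with a burst at samples 120-129.
background = [0.1 if i % 2 == 0 else -0.1 for i in range(200)]
trace = background[:]
for i in range(120, 130):
    trace[i] = 2.0 if i % 2 == 0 else -2.0

triggers = detect_events(trace)
```

The short window reacts quickly to the burst while the long window still reflects the background level, so the ratio rises at the onset and falls again once the burst has passed through the short window.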

Keywords: Carl STA/LTA, features extraction, real time, dataset, machine learning, seismic detection

Procedia PDF Downloads 74

2250 Adult Child Labour Migration and Elderly Parent Health: Recent Evidence from Indonesian Panel Data

Authors: Alfiah Hasanah, Silvia Mendolia, Oleg Yerokhin

Abstract:

This paper explores the impact of adult child migration on the health of elderly parents left behind. Maternal and child health is a priority of health-related policy in most low- and middle-income countries, and so there is a lack of evidence on the health of the older population, particularly in Indonesia. With increasing life expectancy and limited access to social security and social services for the elderly in this country, the consequences of the increasing out-migration of adult children for parents’ health are important to investigate. This study uses the Indonesia Family Life Survey (IFLS), the only large-scale continuing longitudinal socioeconomic and health survey, which is based on a sample of households representing about 83 percent of the Indonesian population in its first wave. Using four waves of the IFLS, including the most recent wave of 2014, several indicators (self-rated health status, interviewer-rated health status and days of illness) are used to estimate the impact of labour out-migration of adult children on parents’ health status. To incorporate both individual fixed effects, which control for unobservable factors in migrant and non-migrant households, and the ordered response of self-rated health, this study applies the “Blow-up and Cluster” (BUC) ordered logit estimator. The results show that labour out-migration of adult children significantly improves the self-rated health status of the elderly parents left behind. The findings are consistent with the view that migration increases family resources and contributes to better health care and nutrition for the family left behind.

Keywords: aging, migration, panel data, self-rated health

Procedia PDF Downloads 323

2249 Impact of Social Distancing on the Correlation Between Adults’ Participation in Learning and Acceptance of Technology

Authors: Liu Yi Hui

Abstract:

The COVID-19 pandemic in 2020 affected all aspects of life globally, with social distancing and quarantine orders causing turmoil and learning in community colleges being temporarily paused. This is the first time that adult education has faced such a severe challenge, and it forces researchers to reflect on the impact of pandemics on adult education and on ways to respond. Distance learning appears to be one of the pedagogical tools capable of dealing with the interpersonal isolation and social distancing caused by the pandemic. This research aims to examine whether social distancing during COVID-19 leads to increased acceptance of technology and, subsequently, to an increase in adults’ willingness to participate in distance learning. The hypothesis that social distancing and the desire to participate in distance learning affect learners’ tendency to accept technology is investigated. Teachers’ participation in distance education and acceptance of technology are used as adjustment variables for the relationships among learners’ “social distancing,” “participation in distance learning,” and “acceptance of technology.” A questionnaire survey was conducted over a period of twelve months among teachers and learners at all community colleges in Taiwan who enrolled in a basic unit course. Community colleges were sampled using multi-stage cluster sampling, with their locations (metropolitan, non-urban, south, and east) as criteria. Using the G*Power software, 660 samples were selected and analyzed. The results show that adult learners’ willingness to participate in distance learning can be influenced through appropriate pedagogical strategies or teachers’ own acceptance of technology. A diverse model of participation can be developed, improving adult education institutions’ ability to plan flexible curricula that avoid the risks associated with epidemic diseases.

Keywords: social distancing, adult learning, community colleges, technology acceptance model

Procedia PDF Downloads 115

2248 Combining the Deep Neural Network with the K-Means for Traffic Accident Prediction

Authors: Celso L. Fernando, Toshio Yoshii, Takahiro Tsubota

Abstract:

Understanding the causes of road accidents and predicting their occurrence is key to preventing deaths and serious injuries. Traditional statistical methods such as Poisson and logistic regression have been used to find associations between traffic environmental factors and accident occurrence; recently, the artificial neural network (ANN), a computational technique that learns from historical data to make more accurate predictions, has emerged. Despite its ability to make accurate predictions, the ANN has difficulty dealing with a highly unbalanced distribution of attribute patterns in the training dataset; in such circumstances, the ANN treats the minority group as noise. However, in real-world data, the minority group is often the group of interest; e.g., in road traffic accident data, the accident events are the group of interest. This study proposes combining k-means with the ANN to improve the predictive ability of the neural network model by alleviating the effect of the unbalanced distribution of attribute patterns in the training dataset. The results show that the proposed method improves the ability of the neural network to make predictions on a dataset with a highly unbalanced distribution of attribute patterns; however, on a dataset with an evenly distributed set of attribute patterns, the proposed method performs almost like a standard neural network.

Keywords: accident risks estimation, artificial neural network, deep learning, k-means, road safety

Procedia PDF Downloads 125

2247 Differences in Innovative Orientation of the Entrepreneurially Active Adults: The Case of Croatia

Authors: Nataša Šarlija, Sanja Pfeifer

Abstract:

This study analyzes the innovative orientation of Croatian entrepreneurs. Innovative orientation is represented by the perceived extent to which an entrepreneur’s product, service or technology is new and no other businesses offer the same product. The sample is extracted from the GEM Croatia Adult Population Survey dataset for the years 2003-2013. We apply descriptive statistics, the t-test, the Chi-square test and logistic regression. Findings indicate that innovative orientation varies with personal, firm, meso and macro level variables, and between different stages of the entrepreneurship process. Significant predictors for both early stage and established entrepreneurs are the occupation of the entrepreneur, the size of the firm and export aspiration. In addition, fear of failure, expecting to start a new business and seeing an entrepreneurial career as a desirable choice are predictors of innovative orientation among early stage entrepreneurs.

Keywords: multilevel determinants of the innovative orientation, Croatian early stage entrepreneurs, established businesses, GEM evidence

Procedia PDF Downloads 472

2246 Comparison between XGBoost, LightGBM and CatBoost Using a Home Credit Dataset

Authors: Essam Al Daoud

Abstract:

Gradient boosting methods have proven to be a very important strategy. Many successful machine learning solutions have been developed using XGBoost and its derivatives. The aim of this study is to investigate and compare the efficiency of three gradient boosting methods. The Home Credit dataset, which contains 219 features and 356251 records, is used in this work. In addition, new features are generated and several techniques are used to rank and select the best features. The implementation indicates that LightGBM is faster and more accurate than CatBoost and XGBoost across varying numbers of features and records.

Keywords: gradient boosting, XGBoost, LightGBM, CatBoost, home credit

Procedia PDF Downloads 134

2245 Effects of Intergenerational Social Mobility on General Health, Oral Health and Physical Function among Older Adults in England

Authors: Alejandra Letelier, Anja Heilmann, Richard G. Watt, Stephen Jivraj, Georgios Tsakos

Abstract:

Background: Socioeconomic position (SEP) influences adult health. People who experienced material disadvantages in childhood or adulthood tend to have higher adult disease levels than their peers from more advantaged backgrounds. Even so, life is a dynamic process and contains a series of transitions that could lead people through different socioeconomic paths. Research on social mobility takes this into account by adopting a trajectory approach, thereby providing a long-term view of the effect of SEP on health. Aim: This research examines the effects of intergenerational social mobility on adult general health, oral health and functioning in a population aged 50 and over in England. Methods: This study is based on the secondary analysis of data from the English Longitudinal Study of Ageing (ELSA). Using cross-sectional data, nine social trajectories were created based on parental and adult occupational socio-economic position. Regression models were used to estimate the associations between social trajectories and the following outcomes: adult self-rated health, self-rated oral health, oral health related quality of life, total tooth loss and grip strength, while controlling for socio-economic background and health related behaviours. Results: Associations with adult SEP were generally stronger than with childhood SEP, suggesting a stronger influence of proximal rather than distal SEP on health and oral health. Compared to the stable high group, being in the low SEP groups in childhood and adulthood was associated with poorer health and oral health for all examined outcome measures. For adult self-rated health and edentulousness, graded associations with social mobility trajectories were observed. Conclusion: Intergenerational social mobility was associated with self-rated health and total tooth loss.
Compared to the stable high group, only those who remained in a low SEP group over time reported worse self-rated oral health and oral health related quality of life, and had lower grip strength measurements. Potential limitations in relation to data quality will be discussed.

Keywords: social determinants of oral health, social mobility, socioeconomic position and oral health, older adults oral health

Procedia PDF Downloads 250

2244 PaSA: A Dataset for Patent Sentiment Analysis to Highlight Patent Paragraphs

Authors: Renukswamy Chikkamath, Vishvapalsinhji Ramsinh Parmar, Christoph Hewel, Markus Endres

Abstract:

Given a patent document, identifying distinct semantic annotations is an interesting research aspect. Text annotation helps patent practitioners such as examiners and patent attorneys to quickly identify the key arguments of any invention, subsequently providing a timely marking of a patent text. In manual patent analysis, it is common practice to mark paragraphs in order to recognise semantic information and attain better readability. This semantic annotation process is laborious and time-consuming. To alleviate this problem, we propose a dataset for training machine learning algorithms to automate the highlighting process. The contributions of this work are: i) we developed a multi-class dataset of 150k samples by traversing USPTO patents over a decade, ii) we articulated statistics and distributions of the data using exploratory data analysis, iii) we developed baseline machine learning models that use the dataset to address the patent paragraph highlighting task, and iv) we outline a future path to extend this work using deep learning and domain-specific pre-trained language models to develop a highlighting tool. This work assists patent practitioners in highlighting semantic information automatically and aids in creating sustainable and efficient patent analysis using the aptitude of machine learning.

Keywords: machine learning, patents, patent sentiment analysis, patent information retrieval

Procedia PDF Downloads 66

2243 Generation of High-Quality Synthetic CT Images from Cone Beam CT Images Using A.I. Based Generative Networks

Authors: Heeba A. Gurku

Abstract:

Introduction: Cone Beam CT (CBCT) images play an integral part in proper patient positioning for cancer patients undergoing radiation therapy, but these images are low in quality. The purpose of this study is to generate high-quality synthetic CT images from CBCT using generative models. Material and Methods: This study utilized two datasets from The Cancer Imaging Archive (TCIA): 1) a lung cancer dataset of 20 patients (with full view CBCT images) and 2) a pancreatic cancer dataset of 40 patients (only the 27 patients with limited view images were included in the study). Cycle Generative Adversarial Network (Cycle GAN) and its variant, Attention Guided Generative Adversarial Network (AGGAN), models were used to generate the synthetic CTs. Models were evaluated visually and on four metrics, Structural Similarity Index Measure (SSIM), Peak Signal-to-Noise Ratio (PSNR), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), comparing the synthetic CT with the original CT images. Results: For the pancreatic dataset with limited view CBCT images, the Cycle GAN model improved MAE, RMSE and PSNR from 12.57 to 8.49, 20.94 to 15.29 and 21.85 to 24.63, respectively, but structural similarity increased only marginally, from 0.78 to 0.79. Similar results were achieved with AGGAN, with no improvement over Cycle GAN. However, for the lung dataset with full view CBCT images, Cycle GAN reduced MAE significantly, from 89.44 to 15.11, and AGGAN reduced it to 19.77. Similarly, RMSE decreased from 92.68 to 23.50 with Cycle GAN and to 29.02 with AGGAN. SSIM and PSNR also improved significantly, from 0.17 to 0.59 and from 8.81 to 21.06, respectively, with Cycle GAN, while with AGGAN SSIM increased to 0.52 and PSNR to 19.31. In both datasets, the GAN models reduced artifacts and noise and produced better resolution and contrast enhancement.
Conclusion and Recommendation: Both Cycle GAN and AGGAN significantly reduced MAE and RMSE and increased PSNR in both datasets. However, the full view lung dataset showed more improvement in SSIM and image quality than the limited view pancreatic dataset.
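For readers less familiar with the error metrics reported in this abstract, the sketch below computes MAE, RMSE and PSNR for two flattened images. It is only an illustration under stated assumptions: the pixel values are made up, the data range of 255 is assumed rather than taken from the study, and SSIM is omitted because it needs windowed statistics. It also shows why a lower RMSE goes together with a higher PSNR.

```python
import math


def mae(reference, image):
    """Mean absolute error between two equally sized, flattened images."""
    return sum(abs(r - x) for r, x in zip(reference, image)) / len(reference)


def rmse(reference, image):
    """Root mean square error between two equally sized, flattened images."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, image)) / len(reference)
    return math.sqrt(mse)


def psnr(reference, image, data_range=255.0):
    """Peak signal-to-noise ratio in dB; data_range is an assumed maximum value."""
    err = rmse(reference, image)
    if err == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(data_range / err)


# Hypothetical pixel values, not data from the study.
reference = [0.0, 100.0, 200.0, 255.0]
coarse = [10.0, 110.0, 190.0, 245.0]  # every pixel is off by 10
fine = [5.0, 105.0, 195.0, 250.0]     # every pixel is off by 5

coarse_scores = (mae(reference, coarse), rmse(reference, coarse), psnr(reference, coarse))
fine_scores = (mae(reference, fine), rmse(reference, fine), psnr(reference, fine))
```

Because PSNR is a decreasing function of RMSE, an improvement shows up as a decrease in MAE and RMSE but as an increase in PSNR.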

Keywords: CT images, CBCT images, cycle GAN, AGGAN

Procedia PDF Downloads 57

2242 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein

Abstract:

The goal of a network-based intrusion detection system is to classify network traffic activities into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning play an important role in many sciences, including intrusion detection systems (IDS), using both supervised and unsupervised techniques. One of the essential steps of data mining is feature selection, which helps improve the efficiency, performance and prediction rate of the proposed approach. This paper applies the unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we used the NSL-KDD dataset, which is a modified version of the KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a two-class classification (Normal, Attack) was implemented. The Weka framework, a Java-based open source software package consisting of a collection of machine learning algorithms for data mining tasks, was used in the testing process. The experimental results show that the proposed approach is very accurate, with a low false positive rate and a high true positive rate, and that it takes less learning time than using the full feature set of the dataset with the same algorithm.
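The information gain ranking used for feature selection can be sketched as follows for discrete features. This is a plain-Python illustration, not the Weka evaluator used in the paper; the feature names and the four toy records are hypothetical, and continuous features would first need to be discretized.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy records: one informative and one uninformative feature.
labels = ["normal", "normal", "attack", "attack"]
features = {
    "protocol": ["tcp", "tcp", "udp", "udp"],  # separates the classes perfectly
    "flag": ["a", "b", "a", "b"],              # carries no class information
}

# Rank features from most to least informative.
ranking = sorted(features, key=lambda name: information_gain(features[name], labels), reverse=True)
```

Features at the top of such a ranking would be kept, and the reduced feature set would then be passed to the clustering step.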

Keywords: information gain (IG), intrusion detection system (IDS), k-means clustering, Weka

Procedia PDF Downloads 267

2241 Demographic Characteristics of the Atlas Barbary Sheep in Amassine Nature Reserve, Atlas Range, Morocco: Implications For Conservation and Management

Authors: Hakim Bachiri, Mohammed Znari, Moulay Abdeljalil Ait Baamranne

Abstract:

Population characteristics of Atlas Barbary sheep (Ammotragus lervia lervia) were investigated 20 years after the 1999 introduction of 10 individuals into the fenced nature reserve of Amassine, High Atlas range, Morocco, to promote wildlife watching and tourism. Population age-sex structure and density were determined in late winter-early spring during four consecutive years (2016-2019) by direct observation before the dispersal of the herd. In this latter case, line transect distance sampling was successfully applied. Population size increased from 37 to 62 animals during the four-year study period; the maximal population size was 82 individuals, recorded in 2006. Estimated population density ranged from 0.25 to 0.41 Barbary sheep/ha during the study period. The adult sex ratio varied from 91 to 67 males per 100 females. The apparent birth rate was 14 to 73 per 100 females. Juveniles and subadults comprised 27-43% of the population, adult males 26-31% and adult females 29-45%. The survival rate from birth to 1 year of age approximated 35%, and that of adult males was estimated to average 69% per year. The results should be helpful for developing a sustainable population management and habitat restoration plan and for assessing the feasibility of potential reintroduction/restocking in other areas of the Atlas range.

Keywords: atlas mountains, barbary sheep, demography, management

Procedia PDF Downloads 441

2240 Dataset Quality Index: Development of Composite Indicator Based on Standard Data Quality Indicators

Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros

Abstract:

Nowadays, poor data quality is considered one of the major costs of a data project. A data project with data quality awareness spends considerable time on data quality processes, while a data project without data quality awareness suffers negative impacts on financial resources, efficiency, productivity, and credibility. One of the processes that takes a long time is defining the expectations and measurements of data quality, because the expectations differ according to the purpose of each data project. This is especially true of big data projects, which may involve many datasets and stakeholders and therefore take a long time to discuss and define quality expectations and measurements. Therefore, this study aimed to develop meaningful indicators that describe overall data quality for each dataset, for quick comparison and prioritization. The objectives of this study were to: (1) develop practical data quality indicators and measurements, (2) develop data quality dimensions based on statistical characteristics and (3) develop a Composite Indicator that can describe overall data quality for each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After the datasets were collected, the Dataset Quality Index (SDQI) was developed in five steps. First, we defined standard data quality expectations. Second, we identified indicators that can be measured directly on the data within datasets. Third, the indicators were aggregated into dimensions using factor analysis. Next, the indicators and dimensions were weighted by the effort required for the data preparation process and by usability. Finally, the dimensions were aggregated into the Composite Indicator. The results of these analyses showed that: (1) The developed indicators and measurements comprised ten indicators. (2) For the data quality dimensions based on statistical characteristics, we found that the ten indicators can be reduced to 4 dimensions.
(3) For the Composite Indicator, we found that the SDQI can describe the overall quality of each dataset and can separate datasets into three levels: Good Quality, Acceptable Quality, and Poor Quality. In conclusion, the SDQI provides an overall description of data quality within datasets, with a meaningful composition. We can use the SDQI to assess all data in a data project and for effort estimation and prioritization. The SDQI also works well with Agile methods, where it can be used for assessment in the first sprint. After passing the initial evaluation, we can add more specific data quality indicators in the next sprint.
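The final aggregation step, in which weighted dimensions are combined into one composite score and mapped to a quality level, might look like the sketch below. Everything specific in it is an assumption made for illustration: the abstract does not name the four dimensions, give their weights, or state the cut-offs between levels, so the placeholder names `dim_1` to `dim_4`, the weights and the thresholds 0.8 and 0.5 are hypothetical.

```python
def composite_index(dimension_scores, weights):
    """Weighted average of dimension scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in dimension_scores)
    weighted_sum = sum(score * weights[name] for name, score in dimension_scores.items())
    return weighted_sum / total_weight


def quality_level(index, good=0.8, acceptable=0.5):
    """Map a composite score to one of three levels (thresholds are hypothetical)."""
    if index >= good:
        return "Good Quality"
    if index >= acceptable:
        return "Acceptable Quality"
    return "Poor Quality"


# Hypothetical scores for one dataset on four unnamed dimensions.
scores = {"dim_1": 1.0, "dim_2": 0.5, "dim_3": 0.5, "dim_4": 1.0}
equal_weights = {"dim_1": 1.0, "dim_2": 1.0, "dim_3": 1.0, "dim_4": 1.0}
effort_weights = {"dim_1": 2.0, "dim_2": 1.0, "dim_3": 1.0, "dim_4": 2.0}

equal_level = quality_level(composite_index(scores, equal_weights))
effort_level = quality_level(composite_index(scores, effort_weights))
```

The same dataset can land in different levels depending on the weights, which is why the choice of weighting scheme matters when comparing datasets.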

Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis

Procedia PDF Downloads 112

2239 Enhancing Cultural Heritage Data Retrieval by Mapping COURAGE to CIDOC Conceptual Reference Model

Authors: Ghazal Faraj, Andras Micsik

Abstract:

The CIDOC Conceptual Reference Model (CRM) is an extensible ontology that provides integrated access to heterogeneous and digital datasets. The CIDOC-CRM offers a “semantic glue” intended to promote accessibility to several diverse and dispersed sources of cultural heritage data. That is achieved by providing a formal structure for the implicit and explicit concepts and their relationships in the cultural heritage field. The COURAGE (“Cultural Opposition – Understanding the CultuRal HeritAGE of Dissent in the Former Socialist Countries”) project aimed to explore the methods of socialist-era cultural resistance during 1950-1990 and planned to serve as a basis for further narratives and digital humanities (DH) research. This project highlights the diversity of the alternative cultural scenes that flourished in Eastern Europe before 1989. Moreover, the COURAGE dataset is an online RDF-based registry that consists of historical people, organizations, collections, and featured items. To increase the inter-links between different datasets and retrieve more relevant data from various data silos, a shared federated ontology for reconciled data is needed. As a first step towards these goals, a full understanding of the CIDOC CRM ontology (the target ontology), as well as of the COURAGE dataset, was required. Subsequently, the queries toward the ontology were determined, and a table of equivalent properties from COURAGE and CIDOC CRM was created. The structural diagrams that clarify the mapping process and construct queries are in progress, in order to map person, organization, and collection entities to the ontology. By mapping the COURAGE dataset to the CIDOC-CRM ontology, the dataset will have a common ontological foundation with several other datasets.
Therefore, the expected results are: 1) retrieving more detailed data about existing entities, 2) retrieving new entities’ data, 3) aligning the COURAGE dataset to a standard vocabulary, and 4) running distributed SPARQL queries over several CIDOC-CRM datasets and testing the potential of distributed query answering using SPARQL. The next step is to map CIDOC-CRM to other upper-level ontologies or large datasets (e.g., DBpedia, Wikidata), and to address similar questions on a wide variety of knowledge bases.

Keywords: CIDOC CRM, cultural heritage data, COURAGE dataset, ontology alignment

Procedia PDF Downloads 120

2238 Case Studies in Three Domains of Learning: Cognitive, Affective, Psychomotor

Authors: Zeinabsadat Haghshenas

Abstract:

Bloom’s Taxonomy has changed over the years. This paper discusses the revision that has taken place in both facts and terms. It also contains case studies of using the cognitive domain of Bloom’s taxonomy in teaching geometric solids to secondary school students, affective objectives in a creative workshop for adults, and psychomotor objectives in fixing a malfunctioning refrigerator lamp. It also points to the important role of classifying objectives in adult education as a way to prevent memory loss.

Keywords: adult education, affective domain, cognitive domain, memory loss, psychomotor domain

Procedia PDF Downloads 434

2237 The Role of Art and Music in Enriching Adult Learning in Maltese as a Second Language

Authors: Jacqueline Zammit

Abstract:

Currently, a considerable number of individuals from different backgrounds are being drawn to Malta due to its favourable environment for business, investment, and employment. This influx has led to a growing interest among expats in learning Maltese as a second language (ML2) to enrich their experience of working and residing in Malta. However, the intricacies of Maltese grammar, particularly challenging for second language (L2) learners unfamiliar with Arabic, can pose difficulties in the learning process. Furthermore, it's worth noting that the teaching of ML2 is an emerging field with limited existing research on effective pedagogical strategies. The realm of second language acquisition (SLA) can be notably demanding for adults, requiring well-founded interventions to facilitate learning. Among these interventions, approaches grounded in empirical evidence have incorporated artistic and musical elements to augment SLA. Both art and music have proven roles in facilitating L2 communication, aiding vocabulary retention, and improving comprehension skills. This study aims to delve into the utilization of music and art as catalysts for enhancing the progress of adult learners in mastering ML2. The research employs a qualitative methodology, employing a sample selected through convenience sampling, which encompassed 37 adult learners of ML2. These participants engaged in individual interviews. The data derived from these interviews were subjected to thorough analysis. The outcomes of the study underscore the substantial positive influence exerted by art and music on the academic advancement of adult ML2 learners. Notably, it emerged from the participants' accounts that the current ML2 curricula lack the integration of art and music. Therefore, this study advocates for the incorporation of art and music components within both traditional classroom settings and online ML2 courses. 
The intention is to bolster the academic accomplishments of adult learners in the realm of Maltese as a second language, bridging the current gap between theory and practice.

Keywords: academic accomplishment, mature learners, visual art, learning Maltese as a second language, musical involvement, acquiring a second language

Procedia PDF Downloads 40

2236 Experiences Using Autoethnography as a Methodology for Research in Education

Authors: Sarah Amodeo

Abstract:

Drawing on the author’s research about the experiences of female immigrant students in academic Adult Education in Montreal, Quebec, this paper deconstructs the benefits of autoethnography as a methodology for educators in Adult Education. Autoethnography is an advantageous methodology for teachers in Adult Education as it allows for deep engagement, enabling educators to reflect on student experiences and their day-to-day realities, which in turn supports professional development, improved andragogy, and changes to classroom practices. Autoethnography is a qualitative research methodology that cultivates strategies for improving adult learning. The paper begins by outlining the context that inspired autoethnography for the author’s work, highlighting the emergence of autoethnography as a method, while examining how it is evolving and drawing on foundational work that continues to inspire research. The basic autoethnographic methodologies that are explored in this paper include the use of memory work in episode formation, the use of personal photographs, and textual readings of artworks. Memory work allows the researcher to use their professional experience and the lived/shared experiences of their students in their research, drawing on episodes from their past. Personal photographs and descriptions of artwork allow researchers to explore images of learning environments/realities in ways that complement student experiences. Major findings of the text are examined through the analysis of categories of autoethnography. Specific categories include realism, impressionism, and conceptualism, which aid in orienting the analysis and the emergent themes that develop through self-study. Finally, the text presents a discussion surrounding the limitations of autoethnography, with attention to trustworthiness and ethical issues.
The paper concludes with a consideration of the implications of autoethnography for adult educators in juxtaposition with youth sector work.

Keywords: artwork, autoethnography, conceptualism, episode formation, impressionism, memory work, personal photographs, realism

Procedia PDF Downloads 157

2235 Plant Identification Using Convolution Neural Network and Vision Transformer-Based Models

Authors: Virender Singh, Mathew Rees, Simon Hampton, Sivaram Annadurai

Abstract:

Plant identification is a challenging task that aims to identify the family, genus, and species according to plant morphological features. Automated deep learning-based computer vision algorithms are widely used for identifying plants and can help users narrow down the possibilities. However, numerous morphological similarities between and within species render correct classification difficult. In this paper, we tested custom convolution neural network (CNN) and vision transformer (ViT) based models using the PyTorch framework to classify plants. We used a large dataset of 88,000 images provided by the Royal Horticultural Society (RHS) and a smaller dataset of 16,000 images from the PlantClef 2015 dataset for classifying plants at genus and species levels, respectively. Our results show that for classifying plants at the genus level, ViT models perform better than the CNN-based models ResNet50 and ResNet-RS-420 and other state-of-the-art CNN-based models suggested in previous studies on a similar dataset. The ViT model achieved a top accuracy of 83.3% for classifying plants at the genus level. For classifying plants at the species level, ViT models again perform better than the CNN-based models ResNet50 and ResNet-RS-420, with a top accuracy of 92.5%. We show that the correct set of augmentation techniques plays an important role in classification success. In conclusion, these results could help end users, professionals and the general public alike in identifying plants more quickly and with improved accuracy.

Keywords: plant identification, CNN, image processing, vision transformer, classification

Procedia PDF Downloads 67

2234 PatchMix: Learning Transferable Semi-Supervised Representation by Predicting Patches

Authors: Arpit Rai

Abstract:

In this work, we propose PatchMix, a semi-supervised method for pre-training visual representations. PatchMix mixes patches of two images and then solves an auxiliary task of predicting the label of each patch in the mixed image. Our experiments on the CIFAR-10, CIFAR-100 and SVHN datasets show that the representations learned by this method encode useful information for transfer to new tasks and outperform the baseline Residual Network encoders: by 12% with ResNet-101 and 2% with ResNet-56 on CIFAR-10, by 4% with ResNet-101 on CIFAR-100, and by 6% with ResNet-101 on SVHN.
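
The patch-mixing step described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the patch size, the mixing probability, or the form of the label grid, so the function name `patchmix`, the 8-pixel patches and the 50/50 choice per patch are all hypothetical.

```python
import numpy as np

def patchmix(img_a, img_b, label_a, label_b, patch=8, rng=None):
    """Mix two equally sized H x W x C images patch by patch.

    Returns the mixed image and a (H/patch) x (W/patch) grid holding,
    for each patch, the label of the image that patch was taken from.
    Patch size and the 0.5 mixing probability are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[:2]
    assert img_a.shape == img_b.shape and h % patch == 0 and w % patch == 0
    grid_h, grid_w = h // patch, w // patch
    # True means "take this patch from img_b".
    take_b = rng.random((grid_h, grid_w)) < 0.5
    # Expand the patch-level mask to pixel resolution.
    pixel_mask = np.repeat(np.repeat(take_b, patch, axis=0), patch, axis=1)
    mixed = np.where(pixel_mask[..., None], img_b, img_a)
    # Per-patch targets for the auxiliary prediction task.
    patch_labels = np.where(take_b, label_b, label_a)
    return mixed, patch_labels
```

An encoder would then be trained to predict `patch_labels` from `mixed`; that training loop is omitted here because the abstract does not describe it.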

Keywords: self-supervised learning, representation learning, computer vision, generalization

Procedia PDF Downloads 62

2233 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling

Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari

Abstract:

A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t=Zw with ‖w‖=1. Denoting by Z' the transpose of Z, we define herein, a latent variable by t=ZZ’q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w=Z’q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u=YY’q to which we associate the variable t= XX’YY’q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY’XX’YY’ associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u=YY’q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k=1, ..., K) namely, tk= XkXk'YY’q. 
Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k=1, ..., K). The solution to this problem is given by q, eigenvector of YY’XX’YY’, where X is the dataset obtained by horizontally merging datasets Xk (k=1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t= XX’YY’q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies and performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.
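
The eigenvector solution stated in the abstract follows from a short derivation, written out here as a sketch. It assumes, as the abstract does, that X and Y are centered; the transpose written Z' in the abstract is written with a superscript here, and the symbol M is introduced only for this illustration.

```latex
% Assumes X and Y are column-centered and share the same n rows.
\begin{align*}
u &= YY^{\top}q, \qquad t = XX^{\top}u = XX^{\top}YY^{\top}q,\\
\operatorname{cov}(t,u) &\propto t^{\top}u = q^{\top}YY^{\top}XX^{\top}YY^{\top}q,\\
\max_{\|q\|=1} q^{\top}Mq, &\quad M = YY^{\top}XX^{\top}YY^{\top}
  \;\Rightarrow\; Mq = \lambda_{\max}\,q .
\end{align*}
% Multi-block case with X = [X_1 | \dots | X_K], so that XX^T = \sum_k X_k X_k^T:
\begin{align*}
\frac{1}{K}\sum_{k=1}^{K}\operatorname{cov}(t_k,u)
  \propto q^{\top}YY^{\top}\Big(\sum_{k=1}^{K}X_kX_k^{\top}\Big)YY^{\top}q
  = q^{\top}YY^{\top}XX^{\top}YY^{\top}q .
\end{align*}
```

Since M is symmetric and positive semi-definite, the maximizer of the quadratic form under the unit-norm constraint is the eigenvector associated with its largest eigenvalue, in both the two-block and the multi-block case.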

Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis

Procedia PDF Downloads 113

2232 Automated Evaluation Approach for Time-Dependent Question Answering Pairs on Web Crawler Based Question Answering System

Authors: Shraddha Chaudhary, Raksha Agarwal, Niladri Chatterjee

Abstract:

This work demonstrates a web crawler-based generalized end-to-end open domain Question Answering (QA) system. An efficient QA system requires a significant amount of domain knowledge to answer any question, with the aim of finding an exact and correct answer to the user's question in the form of a number, a noun, a short phrase, or a brief piece of text. Analysis of the question, searching for the relevant documents, and choosing an answer are three important steps in a QA system. This work uses a web scraper (Beautiful Soup) to extract K documents from the web. The value of K can be calibrated on the basis of a trade-off between time and accuracy. This is followed by a passage ranking process, using a model trained on 500K queries from the MS-Marco dataset, that extracts the most relevant text passage and so shortens the lengthy documents. Further, a QA system is used to extract the answers from the shortened documents based on the query and return the top 3 answers. For evaluation of such systems, accuracy is judged by the exact match between predicted answers and gold answers. But automatic evaluation methods fail due to the linguistic ambiguities inherent in the questions. Moreover, reference answers are often not exhaustive or are out of date. Hence correct answers predicted by the system are often judged incorrect according to the automated metrics. One such scenario arises from the original Google Natural Question (GNQ) dataset, which was collected and made available in the year 2016. Any such dataset proves inadequate for questions that have time-varying answers. For illustration, consider the query "Where will be the next Olympics?" The gold answer for this query in the GNQ dataset is “Tokyo”. Since the dataset was collected in 2016 and the next Olympics after 2016 were the 2020 Games in Tokyo, that answer was correct at the time. But if the same question is asked in 2022, the answer is “Paris, 2024”.
Consequently, any evaluation of such questions based on the GNQ dataset will be incorrect. Such erroneous predictions are usually given to human evaluators for further validation, which is quite expensive and time-consuming. To address this erroneous evaluation, the present work proposes an automated approach for evaluating time-dependent question-answer pairs. In particular, it proposes a metric using the current timestamp along with the top-n predicted answers from a given QA system. To test the proposed approach, the GNQ dataset was used, and the system achieved an accuracy of 78% on a test dataset comprising 100 QA pairs. This test data was automatically extracted using an analysis-based approach from 10K QA pairs of the GNQ dataset. The results obtained are encouraging. The proposed technique has the potential to develop into a useful scheme for gathering precise, reliable, and specific information in a real-time and efficient manner. Our subsequent experiments will be directed towards establishing the efficacy of the above system for a larger set of time-dependent QA pairs.
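
The abstract does not define the proposed metric, so the following is only a hypothetical sketch of one way a timestamp and the top-n predictions could be combined. The validity windows, the token-containment match and all function names are assumptions made for illustration; they are not taken from the paper or from the GNQ dataset.

```python
import re
from datetime import date

def normalize(text):
    """Lower-case, strip punctuation, and split into tokens."""
    return re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()

def gold_at(windows, when):
    """Return the gold answer whose validity window contains `when`."""
    for start, end, answer in windows:
        if start <= when <= end:
            return answer
    return None

def top_n_match(predictions, windows, when, n=3):
    """True if any of the first n predictions contains the gold answer
    that is valid at the query timestamp `when`."""
    gold = gold_at(windows, when)
    if gold is None:
        return False
    gold_tokens = set(normalize(gold))
    return any(gold_tokens <= set(normalize(p)) for p in predictions[:n])

# Illustrative validity windows for the "next Olympics" query (assumed).
olympics = [
    (date(2016, 1, 1), date(2020, 12, 31), "Tokyo"),
    (date(2021, 1, 1), date(2024, 12, 31), "Paris"),
]
print(top_n_match(["Paris, 2024", "Tokyo"], olympics, date(2022, 6, 1)))
```

Under this sketch a prediction of "Tokyo" is accepted for a query dated 2017 but rejected for one dated 2022, which is the behaviour the abstract argues a static gold answer cannot provide.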

Keywords: web-based information retrieval, open domain question answering system, time-varying QA, QA evaluation

Procedia PDF Downloads 76

2231 Age-Stage, Two-Sex Life Table Characteristics of Aedes albopictus (Skuse) and Aedes aegypti (Linnaeus) (Diptera: Culicidae) in Penang Island, Malaysia

Authors: A. H. Maimusa, A. Abu Hassan, Nur Faeza A. Kassim

Abstract:

In this study, we report on the main life table developmental attributes of laboratory colonies of wild strains of Ae. albopictus and Ae. aegypti. The raw life history data of the two species were analyzed and compared based on the age-stage, two-sex life table. The total pre-adult development times were 9.47 days (Ae. albopictus) and 8.76 days (Ae. aegypti). The adult pre-oviposition period (APOP) was 1.61 days for Ae. albopictus and 2.02 days for Ae. aegypti. The total pre-oviposition period (TPOP) of Ae. albopictus (11.66 days) was significantly longer than that of Ae. aegypti (10.75 days). The mean intrinsic rate of increase (r) was 0.124 per day (Ae. albopictus) and 1.151 per day (Ae. aegypti), while the mean finite rate of increase (λ) was 1.13 per day (Ae. albopictus) and 1.16 per day (Ae. aegypti). The net reproductive rates (Ro) were 8.10 and 10.75 for Ae. albopictus and Ae. aegypti, respectively. The mean generation times (T) for Ae. albopictus and Ae. aegypti were 16.81 days and 15.77 days, respectively. The mean development time for each stage was not significantly correlated with temperature (r = -0.208, p > 0.05 for Ae. albopictus; r = -0.312, p > 0.05 for Ae. aegypti). The life expectancy was 19.01 and 19.94 days for Ae. albopictus and Ae. aegypti, respectively. Mortality occurred mostly during the adult stage and ranged between 0.01 and 0.07%. The population parameters suggest that Ae. albopictus and Ae. aegypti populations are r-strategists, characterized by a high r, a large Ro, and a short T. This kind of information is crucial in understanding mosquito population dynamics in disease transmission and control.
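
The population parameters reported above are linked by standard life table identities: the finite rate is λ = exp(r) and the mean generation time is T = ln(Ro)/r. The sketch below applies them to the values printed in the abstract; the function names are illustrative and not taken from the paper.

```python
import math

def finite_rate(r):
    """Finite rate of increase (lambda) from the intrinsic rate r (per day)."""
    return math.exp(r)

def generation_time(r0, r):
    """Mean generation time T (days) from net reproductive rate R0 and r."""
    return math.log(r0) / r

# Ae. albopictus, using the reported r = 0.124 per day and Ro = 8.10.
print(round(finite_rate(0.124), 2))            # compare with the reported 1.13
print(round(generation_time(8.10, 0.124), 2))  # compare with the reported 16.81

# Intrinsic rate implied by the reported lambda = 1.16 for Ae. aegypti.
print(round(math.log(1.16), 3))
```

For Ae. albopictus the identities reproduce the reported λ and come close to the reported T. For Ae. aegypti, the reported λ of 1.16 corresponds to an r of about 0.148 per day under the same identity.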

Keywords: Ae. aegypti, Ae. albopictus, age-stage, life table, two-sex

Procedia PDF Downloads 295

2230 Cosmetic Recommendation Approach Using Machine Learning

Authors: Shakila N. Senarath, Dinesh Asanka, Janaka Wijayanayake

Abstract:

The demand for cosmetic products is rising to fulfill consumer needs of personal appearance and hygiene. A cosmetic product consists of various chemical ingredients, which may help to keep the skin healthy or may damage it. Not every chemical ingredient in a cosmetic product suits every person. The most appropriate way to select a healthy cosmetic product is to identify the consumer's skin type first and then select the most suitable product with safe ingredients. Therefore, the selection process of cosmetic products is complicated. Consumer surveys have shown that, most of the time, consumers select cosmetic products improperly. This study proposes a content-based system that recommends cosmetic products according to human factors, namely skin type, gender and price range. The proposed system will be implemented using machine learning. Consumer skin type, gender and price range will be taken as inputs to the system. The consumer's skin type will be derived using the Baumann Skin Type Questionnaire, a value-based approach that uses a series of questions to assign the user to one of the 16 skin types of the Baumann Skin Type Indicator (BSTI). Two datasets are collected for the research: the user dataset and the cosmetic dataset. The user dataset was collected using a questionnaire given to the public. The cosmetic dataset contains product details for 5 product categories (Moisturizer, Cleanser, Sun protector, Face Mask, Eye Cream). An alternate approach of TF-IDF (Term Frequency – Inverse Document Frequency) is applied to vectorize cosmetic ingredients in the generic cosmetic products dataset and the user-preferred dataset.
Using the IF-IPF vectors, each user-preferred products dataset and generic cosmetic products dataset can be represented as sparse vectors. The similarity between each user-preferred product and generic cosmetic product will be calculated using the cosine similarity method. For the recommendation process, a similarity matrix can be used: the higher the similarity, the better the match for the consumer. Sorting a user's column of the similarity matrix in descending order retrieves the recommended products from best to worst match. Because user information such as gender and price range for product purchasing has also been gathered, the list of similar products can be further optimized by weighting those parameters once a set of recommended products for a user has been retrieved.
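
The vectorize-then-rank pipeline described above can be sketched in a few lines. The abstract does not specify its IF-IPF variant, so plain TF-IDF is used here as a stand-in; the product names and ingredient lists are invented for illustration only.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of ingredient lists. Returns one sparse {term: weight} dict per list."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical catalogue and one user-preferred product.
catalog = {
    "moisturizer_a": ["water", "glycerin", "niacinamide"],
    "cleanser_b": ["water", "sodium laureth sulfate", "fragrance"],
    "cream_c": ["water", "glycerin", "shea butter"],
}
preferred = ["water", "glycerin", "niacinamide", "panthenol"]

names = list(catalog)
vectors = tfidf_vectors([catalog[name] for name in names] + [preferred])
user_vec = vectors[-1]
scores = {name: cosine(user_vec, vec) for name, vec in zip(names, vectors)}
# Descending sort: best match first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Gender and price range could then be applied as filters or weights on `ranking`, as the abstract suggests; that step is left out because its form is not described.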

Keywords: content-based filtering, cosmetics, machine learning, recommendation system

Procedia PDF Downloads 111

2229 Assessing the Prevalence of Accidental Iatrogenic Paracetamol Overdose in Adult Hospital Patients Weighing <50kg: A Quality Improvement Project

Authors: Elisavet Arsenaki

Abstract:

Paracetamol overdose is associated with significant and possibly permanent consequences including hepatotoxicity, acute and chronic liver failure, and death. This quality improvement project explores the prevalence of accidental iatrogenic paracetamol overdose in hospital patients with a low body weight, defined as <50kg, and assesses the impact of educational posters in trying to reduce it. The study included all adult inpatients on the admissions ward, a short stay ward for patients requiring 12-72 hour treatment, and consisted of three cycles. Each cycle consisted of 3 days of data collection in a given month (January 2022 for cycle 1, February 2022 for cycle 2 and March 2022 for cycle 3). All patients given paracetamol had their prescribed dose checked against their charted weight to identify the percentage of adult inpatients <50kg who were prescribed 1g of paracetamol instead of 500mg. In the first cycle of the audit, data were collected from 83 patients who were prescribed paracetamol on the admissions ward. Subsequently, four A4 educational posters were displayed across the ward on two separate occasions, with a one-month interval between the poster displays. The aim of this was to remind prescribing doctors of their responsibility to check patient body weight prior to prescribing paracetamol. Data were collected again one week after each round of poster display, from 72 and 70 patients respectively. Over the 3 cycles with a cumulative 225 patients, 15 weighed <50kg (6.67%) and of those, 5 were incorrectly prescribed 1g of paracetamol, yielding a 33.3% prevalence of accidental iatrogenic paracetamol overdose in adult inpatients weighing <50kg. In cycle 1 of the project, 3 out of 6 adult patients weighing <50kg were overdosed on paracetamol, meaning that 50% of low weight patients were prescribed the wrong dose of paracetamol for their weight.
In the second data collection cycle, 1 out of 5 patients weighing <50kg was overdosed (20%) and in the third cycle, 1 out of 4 (25%). The use of educational posters was followed by a lower prevalence of accidental iatrogenic paracetamol overdose in low body weight adult inpatients. However, the differences observed were not statistically significant (p values 0.993 and 0.995 respectively), so the posters did not induce a significant decrease in the prevalence of accidental iatrogenic paracetamol overdose. More robust strategies need to be employed to further decrease paracetamol overdose in patients weighing <50kg.
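
The percentages quoted in this abstract can be reproduced from the counts it reports. The short check below uses only those counts; the variable and function names are illustrative.

```python
# Per audit cycle: (patients weighing <50kg, of whom prescribed 1g), as reported.
cycles = {1: (6, 3), 2: (5, 1), 3: (4, 1)}
# Patients prescribed paracetamol in cycles 1, 2 and 3.
total_patients = 83 + 72 + 70

def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole

per_cycle = {cycle: percent(over, low) for cycle, (low, over) in cycles.items()}
low_weight = sum(low for low, _ in cycles.values())
overdosed = sum(over for _, over in cycles.values())

print(per_cycle)
print(round(percent(low_weight, total_patients), 2))
print(round(percent(overdosed, low_weight), 1))
```

The per-cycle denominators are small (6, 5 and 4 patients), which helps explain why the drop from 50% to 20% and 25% did not reach statistical significance.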

Keywords: iatrogenic, overdose, paracetamol, patient, safety

Procedia PDF Downloads 93

2228 Developing an Intonation Labeled Dataset for Hindi

Authors: Esha Banerjee, Atul Kumar Ojha, Girish Nath Jha

Abstract:

This study aims to develop an intonation labeled database for Hindi. Although no single standard for prosody labeling exists in Hindi, researchers have employed perceptual and statistical methods to draw inferences about the behavior of prosody patterns in Hindi. Based on such existing research and largely agreed upon intonational theories in Hindi, this study attempts to develop a manually annotated prosodic corpus of Hindi speech data, which can be used in the future for training speech models that produce natural-sounding speech. 100 sentences (500 words) each of the declarative and interrogative types have been labeled using Praat.

Keywords: speech dataset, Hindi, intonation, labeled corpus

Procedia PDF Downloads 166

2227 An Enhanced Support Vector Machine Based Approach for Sentiment Classification of Arabic Tweets of Different Dialects

Authors: Gehad S. Kaseb, Mona F. Ahmed

Abstract:

Arabic Sentiment Analysis (SA) is one of the most common research fields with many open areas. Few studies apply SA to Arabic dialects. This paper proposes different pre-processing steps and a modified methodology to improve the accuracy using normal Support Vector Machine (SVM) classification. The paper works on two datasets, Arabic Sentiment Tweets Dataset (ASTD) and Extended Arabic Tweets Sentiment Dataset (Extended-AATSD), which are publicly available for academic use. The results show that the classification accuracy approaches 86%.

Keywords: Arabic, classification, sentiment analysis, tweets

Procedia PDF Downloads 117

2226 Using Machine Learning to Build a Real-Time COVID-19 Mask Safety Monitor

Authors: Yash Jain

Abstract:

The US Centers for Disease Control and Prevention has recommended wearing masks to slow the spread of the virus. The research uses a video feed from a camera to conduct real-time classification of whether a human is correctly wearing a mask, incorrectly wearing a mask, or not wearing a mask at all. A mask detection network was trained using two distinct datasets from the open-source website Kaggle. The first dataset was titled 'Face Mask Detection'; the second was titled 'Face Mask Dataset' and provided the data in YOLO format so that the TinyYoloV3 model could be trained. Based on the data from Kaggle, two machine learning models were implemented and trained: a TinyYoloV3 real-time model and a two-stage neural network classifier. The first stage of the two-stage classifier identified distinct faces within the image, and the second stage classified the state of the mask on each face: worn correctly, worn incorrectly, or no mask at all. The TinyYoloV3 model, trained using the darknet neural network framework, was used for the live feed and as a point of comparison against the two-stage classifier. The two-stage classifier attained a mean average precision (MAP) of 80%, while the model trained using TinyYoloV3 real-time detection had a MAP of 59%. Overall, both models were able to correctly classify the scenarios of no mask, mask, and incorrectly worn mask.

Keywords: datasets, classifier, mask-detection, real-time, TinyYoloV3, two-stage neural network classifier

Procedia PDF Downloads 131

2225 Effects, Causes, and Prevention of Teen Dating Violence

Authors: Isabel Jones

Abstract:

As adolescence is a formative time, experiences during adolescence often affect the rest of one’s life. Therefore, dating, and specifically violence in dating, can have lasting effects. To find sources, searches were conducted on PsycINFO, specifically EBSCO, and narrowed down under the criteria that the source contained information about adolescent rather than adult dating violence and focused on causes, effects, or prevention methods. This literature review examines research regarding the effects and causes of teen dating violence (TDV), and then which methods are effective in preventing TDV. This gives a clear picture of how these prevention methods work and why they are important. The effects of TDV extend beyond the physical and include long-lasting psychological and sexual effects. TDV is caused by a number of factors, including learned behavior, inhibitory issues/substance abuse, and cultural factors. When both effects and causes are taken into account, the effectiveness of preventative measures such as school-based interventions, parental/adult monitoring, and the presence of positive family examples becomes clearer. This literature review may raise awareness of this public health crisis and give the public a view of how adolescents are affected by TDV on their path from child to adult.

Keywords: adolescence, dating violence, risk factors, predictors, relationship

Procedia PDF Downloads 40

2224 Gender Construction in Contemporary Dystopian Fiction in Young Adult Literature: A South African Example

Authors: Johan Anker

Abstract:

The purpose of this paper is to discuss the nature of gender construction in modern dystopian fiction, the development of this genre in Young Adult Literature and the reasons for its enormous appeal to adolescent readers. A recent award-winning South African text in this genre, The Mark by Edyth Bulbring (2014), will be used as an example and compared to international bestsellers like Divergent (Roth: 2011), The Hunger Games (Collins: 2008) and others. Theoretical insights from critics and academics in the field of children’s literature, like Ames, Coats, Bradford, Booker, Basu, Green-Barteet, Hintz, McAlear, McCallum, Moylan, Ostry, Ryan, Stephens and Westerfield, will be used as part of the analysis of The Mark. The role of relevant and recurring themes in this genre, like global concerns, environmental destruction, liberty, self-determination, social and political critique, surveillance and repression by the state or other institutions, will also be discussed. The paper will briefly refer to the history and emergence of dystopian literature as a genre in adult and young adult literature, as part of the long tradition since the publication of Orwell’s 1984 and Huxley’s Brave New World. Different factors appeal to adolescent readers in the modern versions of this hybrid genre for young adults: teenage protagonists who question the underlying values of a flawed society, such as an inhuman or tyrannical government, a growing understanding of the society around them, feelings of isolation and the dynamic of relationships. This unease leads to a growing sense of the potential to act against society (rebellion), of their role as agents in a larger community and of their ability to make independent decisions. This awareness also leads to a growing sense of self (identity and agency) and the development of romantic relationships.
The paper will emphasize the specific modern tendency towards a female protagonist who leads the rebellion against the state and state apparatus and gains agency and independence in doing so. This tendency is an important part of the identification with and construction of gender, while remaining part of the traditional coming-of-age young adult novel. The traditional themes, structures and plots of young adult literature (YAL) will be compared with those of adult dystopian literature and of recent dystopian YAL. The hybrid nature of this genre and the 'sense of unease', but also of hope, in the closure of these novels, an essential part of youth literature, will be discussed. Important questions about the didactic nature of these texts, the political issues they raise, the formation of agency and identity for the young adult reader, and identification with the protagonists in this genre are also part of this discussion of The Mark and other YAL novels.

Keywords: agency, dystopian literature, gender construction, young adult literature

Procedia PDF Downloads 154

2223 Assessment of Patient Cooperation and Compliance in Three Stages of Orthodontic Treatment in Adult Patients: A Cross-Sectional Study

Authors: Hafsa Qabool, Rashna Sukhia, Mubassar Fida

Abstract:

Introduction: The success of orthodontic mechanotherapy is highly dependent upon patient cooperation and compliance throughout the duration of treatment. This study was conducted to assess the cooperation and compliance of adult orthodontic patients during the leveling and alignment, space closure/molar correction, and finishing stages of tooth movement. Materials and Methods: Patient cooperation and compliance across the three stages of orthodontic treatment were assessed using the Orthodontic Patient Cooperation Scale (OPCS) and the Clinical Compliance Evaluation (CCE) form. A sample size of 38 was calculated for each stage of treatment; therefore, 114 subjects were included in the study. The Shapiro-Wilk test indicated that the data were normally distributed. One-way ANOVA was used to evaluate the percentage cooperation and compliance among the three stages. Pair-wise comparisons between the three stages were performed using Tukey's post-hoc test. Results: A statistically significant difference was seen in patient compliance scores using the CCE (p = 0.01); however, the OPCS showed a non-significant difference in patient cooperation (p = 0.16) among the three stages of treatment. Post-hoc analysis showed significant differences (p = 0.01) in patient cooperation and compliance between the space closure and finishing stages. A highly significant (p < 0.001) decline in oral hygiene was found with the progression of orthodontic treatment. Conclusions: Cooperation and compliance levels of adult orthodontic patients improved during the space closure/molar correction stage and then declined as treatment progressed. Oral hygiene was progressively compromised as orthodontic treatment progressed.

Keywords: patient compliance, adult orthodontics, orthodontic motivation, orthodontic patient adherence

Procedia PDF Downloads 128

2222 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Five implementations of three data mining classification techniques were applied to extract important insights from tourism data. The aim was to find the best-performing algorithm among those compared for tourism knowledge discovery. The knowledge discovery from data process was used as the process model, and 10-fold cross validation was used for testing. Various data preprocessing activities were performed to obtain the final dataset for model building. Classification models of the selected algorithms were built under different scenarios on the preprocessed dataset. The best-performing algorithm on the tourism dataset was Random Forest (76%) before applying information gain based attribute selection, and J48 (C4.5) (75%) after selecting the attributes most relevant to the class (target) attribute. In terms of model building time, attribute selection improved the efficiency of all algorithms; the Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented. They reveal intricate, non-trivial knowledge that simple statistical analysis would not uncover, although the accuracy of the classification algorithms is moderate.
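
Information gain based attribute selection, mentioned above, scores each attribute by how much it reduces the entropy of the class attribute. The following is a minimal sketch for categorical attributes; the toy attribute values and class labels are invented and are not from the tourism dataset.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in class entropy obtained by splitting on one categorical attribute."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: does the visitor return (class attribute)?
returns = ["yes", "yes", "no", "no"]
attributes = {
    "season": ["summer", "summer", "winter", "winter"],  # separates the classes
    "gender": ["f", "m", "f", "m"],                      # carries no information
}
ranked = sorted(attributes, key=lambda a: information_gain(attributes[a], returns), reverse=True)
print(ranked)
```

Keeping only the top-ranked attributes shrinks the input to each classifier, which is consistent with the shorter model building times reported after selection.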

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 269