Search results for: text embedding
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1502

Search results for: text embedding

782 Classification of Contexts for Mentioning Love in Interviews with Victims of the Holocaust

Authors: Marina Yurievna Aleksandrova

Abstract:

Research of the Holocaust retains value not only for history but also for sociology and psychology. One of the most important fields of study is how people were coping during and after this traumatic event. The aim of this paper is to identify the main contexts of the topic of love and to determine which contexts are more characteristic for different groups of victims of the Holocaust (gender, nationality, age). In this research, transcripts of interviews with Holocaust victims that were collected during 1946 for the "Voices of the Holocaust" project were used as data. Main contexts were analyzed with methods of network analysis and latent semantic analysis and classified by gender, age, and nationality with random forest. The results show that love is articulated and described significantly differently for male and female informants, nationality is shown results with lower values of quality metrics, as well as the age.

Keywords: Holocaust, latent semantic analysis, network analysis, text-mining, random forest

Procedia PDF Downloads 180
781 Scalable Learning of Tree-Based Models on Sparsely Representable Data

Authors: Fares Hedayatit, Arnauld Joly, Panagiotis Papadimitriou

Abstract:

Many machine learning tasks such as text annotation usually require training over very big datasets, e.g., millions of web documents, that can be represented in a sparse input space. State-of the-art tree-based ensemble algorithms cannot scale to such datasets, since they include operations whose running time is a function of the input space size rather than a function of the non-zero input elements. In this paper, we propose an efficient splitting algorithm to leverage input sparsity within decision tree methods. Our algorithm improves training time over sparse datasets by more than two orders of magnitude and it has been incorporated in the current version of scikit-learn.org, the most popular open source Python machine learning library.

Keywords: big data, sparsely representable data, tree-based models, scalable learning

Procedia PDF Downloads 263
780 Information Retrieval for Kafficho Language

Authors: Mareye Zeleke Mekonen

Abstract:

The Kafficho language has distinct issues in information retrieval because of its restricted resources and dearth of standardized methods. In this endeavor, with the cooperation and support of linguists and native speakers, we investigate the creation of information retrieval systems specifically designed for the Kafficho language. The Kafficho information retrieval system allows Kafficho speakers to access information easily in an efficient and effective way. Our objective is to conduct an information retrieval experiment using 220 Kafficho text files, including fifteen sample questions. Tokenization, normalization, stop word removal, stemming, and other data pre-processing chores, together with additional tasks like term weighting, were prerequisites for the vector space model to represent each page and a particular query. The three well-known measurement metrics we used for our word were Precision, Recall, and and F-measure, with values of 87%, 28%, and 35%, respectively. This demonstrates how well the Kaffiho information retrieval system performed well while utilizing the vector space paradigm.

Keywords: Kafficho, information retrieval, stemming, vector space

Procedia PDF Downloads 57
779 Interactive Image Search for Mobile Devices

Authors: Komal V. Aher, Sanjay B. Waykar

Abstract:

Nowadays every individual having mobile device with them. In both computer vision and information retrieval Image search is currently hot topic with many applications. The proposed intelligent image search system is fully utilizing multimodal and multi-touch functionalities of smart phones which allows search with Image, Voice, and Text on mobile phones. The system will be more useful for users who already have pictures in their minds but have no proper descriptions or names to address them. The paper gives system with ability to form composite visual query to express user’s intention more clearly which helps to give more precise or appropriate results to user. The proposed algorithm will considerably get better in different aspects. System also uses Context based Image retrieval scheme to give significant outcomes. So system is able to achieve gain in terms of search performance, accuracy and user satisfaction.

Keywords: color space, histogram, mobile device, mobile visual search, multimodal search

Procedia PDF Downloads 367
778 Identifying Biomarker Response Patterns to Vitamin D Supplementation in Type 2 Diabetes Using K-means Clustering: A Meta-Analytic Approach to Glycemic and Lipid Profile Modulation

Authors: Oluwafunmibi Omotayo Fasanya, Augustine Kena Adjei

Abstract:

Background and Aims: This meta-analysis aimed to evaluate the effect of vitamin D supplementation on key metabolic and cardiovascular parameters, such as glycated hemoglobin (HbA1C), fasting blood sugar (FBS), low-density lipoprotein (LDL), high-density lipoprotein (HDL), systolic blood pressure (SBP), and total vitamin D levels in patients with Type 2 diabetes mellitus (T2DM). Methods: A systematic search was performed across databases, including PubMed, Scopus, Embase, Web of Science, Cochrane Library, and ClinicalTrials.gov, from January 1990 to January 2024. A total of 4,177 relevant studies were initially identified. Using an unsupervised K-means clustering algorithm, publications were grouped based on common text features. Maximum entropy classification was then applied to filter studies that matched a pre-identified training set of 139 potentially relevant articles. These selected studies were manually screened for relevance. A parallel manual selection of all initially searched studies was conducted for validation. The final inclusion of studies was based on full-text evaluation, quality assessment, and meta-regression models using random effects. Sensitivity analysis and publication bias assessments were also performed to ensure robustness. Results: The unsupervised K-means clustering algorithm grouped the patients based on their responses to vitamin D supplementation, using key biomarkers such as HbA1C, FBS, LDL, HDL, SBP, and total vitamin D levels. Two primary clusters emerged: one representing patients who experienced significant improvements in these markers and another showing minimal or no change. Patients in the cluster associated with significant improvement exhibited lower HbA1C, FBS, and LDL levels after vitamin D supplementation, while HDL and total vitamin D levels increased. The analysis showed that vitamin D supplementation was particularly effective in reducing HbA1C, FBS, and LDL within this cluster. Furthermore, BMI, weight gain, and disease duration were identified as factors that influenced cluster assignment, with patients having lower BMI and shorter disease duration being more likely to belong to the improvement cluster. Conclusion: The findings of this machine learning-assisted meta-analysis confirm that vitamin D supplementation can significantly improve glycemic control and reduce the risk of cardiovascular complications in T2DM patients. The use of automated screening techniques streamlined the process, ensuring the comprehensive evaluation of a large body of evidence while maintaining the validity of traditional manual review processes.

Keywords: HbA1C, T2DM, SBP, FBS

Procedia PDF Downloads 10
777 Exploring Research Trends and Topics in Intervention on Metabolic Syndrome Using Network Analysis

Authors: Lee Soo-Kyoung, Kim Young-Su

Abstract:

This study established a network related to metabolic syndrome intervention by conducting a social network analysis of titles, keywords, and abstracts, and it identified emerging topics of research. It visualized an interconnection between critical keywords and investigated their frequency of appearance to construe the trends in metabolic syndrome intervention measures used in studies conducted over 38 years (1979–2017). It examined a collection of keywords from 8,285 studies using text rank analyzer, NetMiner 4.0. The analysis revealed 5 groups of newly emerging keywords in the research. By examining the relationship between keywords with reference to their betweenness centrality, the following clusters were identified. Thus if new researchers refer to existing trends to establish the subject of their study and the direction of the development of future research on metabolic syndrome intervention can be predicted.

Keywords: intervention, metabolic syndrome, network analysis, research, the trend

Procedia PDF Downloads 200
776 Innovative Screening Tool Based on Physical Properties of Blood

Authors: Basant Singh Sikarwar, Mukesh Roy, Ayush Goyal, Priya Ranjan

Abstract:

This work combines two bodies of knowledge which includes biomedical basis of blood stain formation and fluid communities’ wisdom that such formation of blood stain depends heavily on physical properties. Moreover biomedical research tells that different patterns in stains of blood are robust indicator of blood donor’s health or lack thereof. Based on these valuable insights an innovative screening tool is proposed which can act as an aide in the diagnosis of diseases such Anemia, Hyperlipidaemia, Tuberculosis, Blood cancer, Leukemia, Malaria etc., with enhanced confidence in the proposed analysis. To realize this powerful technique, simple, robust and low-cost micro-fluidic devices, a micro-capillary viscometer and a pendant drop tensiometer are designed and proposed to be fabricated to measure the viscosity, surface tension and wettability of various blood samples. Once prognosis and diagnosis data has been generated, automated linear and nonlinear classifiers have been applied into the automated reasoning and presentation of results. A support vector machine (SVM) classifies data on a linear fashion. Discriminant analysis and nonlinear embedding’s are coupled with nonlinear manifold detection in data and detected decisions are made accordingly. In this way, physical properties can be used, using linear and non-linear classification techniques, for screening of various diseases in humans and cattle. Experiments are carried out to validate the physical properties measurement devices. This framework can be further developed towards a real life portable disease screening cum diagnostics tool. Small-scale production of screening cum diagnostic devices is proposed to carry out independent test.

Keywords: blood, physical properties, diagnostic, nonlinear, classifier, device, surface tension, viscosity, wettability

Procedia PDF Downloads 376
775 Surface to the Deeper: A Universal Entity Alignment Approach Focusing on Surface Information

Authors: Zheng Baichuan, Li Shenghui, Li Bingqian, Zhang Ning, Chen Kai

Abstract:

Entity alignment (EA) tasks in knowledge graphs often play a pivotal role in the integration of knowledge graphs, where structural differences often exist between the source and target graphs, such as the presence or absence of attribute information and the types of attribute information (text, timestamps, images, etc.). However, most current research efforts are focused on improving alignment accuracy, often along with an increased reliance on specific structures -a dependency that inevitably diminishes their practical value and causes difficulties when facing knowledge graph alignment tasks with varying structures. Therefore, we propose a universal knowledge graph alignment approach that only utilizes the common basic structures shared by knowledge graphs. We have demonstrated through experiments that our method achieves state-of-the-art performance in fair comparisons.

Keywords: knowledge graph, entity alignment, transformer, deep learning

Procedia PDF Downloads 45
774 Quality and Quantity in the Strategic Network of Higher Education Institutions

Authors: Juha Kettunen

Abstract:

This study analyzes the quality and the size of the strategic network of higher education institutions. The study analyses the concept of fitness for purpose in quality assurance. It also analyses the transaction costs of networking that have consequences on the number of members in the network. Empirical evidence is presented of the Consortium on Applied Research and Professional Education, which is a European strategic network of six higher education institutions. The results of the study support the argument that the number of members in the strategic network should be relatively small to provide high quality results. The practical importance is that networking has been able to promote international research and development projects. The results of this study are important for those who want to design and improve international networks in higher education.

Keywords: balanced scorecard, higher education, social networking, strategic planning

Procedia PDF Downloads 348
773 Interacting with Multi-Scale Structures of Online Political Debates by Visualizing Phylomemies

Authors: Quentin Lobbe, David Chavalarias, Alexandre Delanoe

Abstract:

The ICT revolution has given birth to an unprecedented world of digital traces and has impacted a wide number of knowledge-driven domains such as science, education or policy making. Nowadays, we are daily fueled by unlimited flows of articles, blogs, messages, tweets, etc. The internet itself can thus be considered as an unsteady hyper-textual environment where websites emerge and expand every day. But there are structures inside knowledge. A given text can always be studied in relation to others or in light of a specific socio-cultural context. By way of their textual traces, human beings are calling each other out: hypertext citations, retweets, vocabulary similarity, etc. We are in fact the architects of a giant web of elements of knowledge whose structures and shapes convey their own information. The global shapes of these digital traces represent a source of collective knowledge and the question of their visualization remains an opened challenge. How can we explore, browse and interact with such shapes? In order to navigate across these growing constellations of words and texts, interdisciplinary innovations are emerging at the crossroad between fields of social and computational sciences. In particular, complex systems approaches make it now possible to reconstruct the hidden structures of textual knowledge by means of multi-scale objects of research such as semantic maps and phylomemies. The phylomemy reconstruction is a generic method related to the co-word analysis framework. Phylomemies aim to reveal the temporal dynamics of large corpora of textual contents by performing inter-temporal matching on extracted knowledge domains in order to identify their conceptual lineages. This study aims to address the question of visualizing the global shapes of online political discussions related to the French presidential and legislative elections of 2017. We aim to build phylomemies on top of a dedicated collection of thousands of French political tweets enriched with archived contemporary news web articles. Our goal is to reconstruct the temporal evolution of online debates fueled by each political community during the elections. To that end, we want to introduce an iterative data exploration methodology implemented and tested within the free software Gargantext. There we combine synchronic and diachronic axis of visualization to reveal the dynamics of our corpora of tweets and web pages as well as their inner syntagmatic and paradigmatic relationships. In doing so, we aim to provide researchers with innovative methodological means to explore online semantic landscapes in a collaborative and reflective way.

Keywords: online political debate, French election, hyper-text, phylomemy

Procedia PDF Downloads 186
772 Scour Damaged Detection of Bridge Piers Using Vibration Analysis - Numerical Study of a Bridge

Authors: Solaine Hachem, Frédéric Bourquin, Dominique Siegert

Abstract:

The brutal collapse of bridges is mainly due to scour. Indeed, the soil erosion in the riverbed around a pier modifies the embedding conditions of the structure, reduces its overall stiffness and threatens its stability. Hence, finding an efficient technique that allows early scour detection becomes mandatory. Vibration analysis is an indirect method for scour detection that relies on real-time monitoring of the bridge. It tends to indicate the presence of a scour based on its consequences on the stability of the structure and its dynamic response. Most of the research in this field has focused on the dynamic behavior of a single pile and has examined the depth of the scour. In this paper, a bridge is fully modeled with all piles and spans and the scour is represented by a reduction in the foundation's stiffnesses. This work aims to identify the vibration modes sensitive to the rigidity’s loss in the foundations so that their variations can be considered as a scour indicator: the decrease in soil-structure interaction rigidity leads to a decrease in the natural frequencies’ values. By using the first-order perturbation method, the expression of sensitivity, which depends only on the selected vibration modes, is established to determine the deficiency of foundations stiffnesses. The solutions are obtained by using the singular value decomposition method for the regularization of the inverse problem. The propagation of uncertainties is also calculated to verify the efficiency of the inverse problem method. Numerical simulations describing different scenarios of scour are investigated on a simplified model of a real composite steel-concrete bridge located in France. The results of the modal analysis show that the modes corresponding to in-plane and out-of-plane piers vibrations are sensitive to the loss of foundation stiffness. While the deck bending modes are not affected by this damage.

Keywords: bridge’s piers, inverse problems, modal sensitivity, scour detection, vibration analysis

Procedia PDF Downloads 104
771 Pueblos Mágicos in Mexico: The Loss of Intangible Cultural Heritage and Cultural Tourism

Authors: Claudia Rodriguez-Espinosa, Erika Elizabeth Pérez Múzquiz

Abstract:

Since the creation of the “Pueblos Mágicos” program in 2001, a series of social and cultural events had directly affected the heritage conservation of the 121 registered localities until 2018, when the federal government terminated the program. Many studies have been carried out that seek to analyze from different perspectives and disciplines the consequences that these appointments have generated in the “Pueblos Mágicos.” Multidisciplinary groups such as the one headed by Carmen Valverde and Liliana López Levi, have brought together specialists from all over the Mexican Republic to create a set of diagnoses of most of these settlements, and although each one has unique specificities, there is a constant in most of them that has to do with the loss of cultural heritage and that is related to transculturality. There are several factors identified that have fostered a cultural loss, as a direct reflection of the economic crisis that prevails in Mexico. It is important to remember that the origin of this program had as its main objective to promote the growth and development of local economies since one of the conditions for entering the program is that they have less than 20,000 inhabitants. With this goal in mind, one of the first actions that many “Pueblos Mágicos” carried out was to improve or create an infrastructure to receive both national and foreign tourists since this was practically non-existent. Creating hotels, restaurants, cafes, training certified tour guides, among other actions, have led to one of the great problems they face: globalization. Although by itself it is not bad, its impact in many cases has been negative for heritage conservation. The entry into and contact with new cultures has led to the undervaluation of cultural traditions, their transformation and even their total loss. This work seeks to present specific cases of transformation and loss of cultural heritage, as well as to reflect on the problem and propose scenarios in which the negative effects can be reversed. For this text, 36 “Pueblos Mágicos” have been selected for study, based on those settlements that are cited in volumes I and IV (the first and last of the collection) of the series produced by the multidisciplinary group led by Carmen Valverde and Liliana López Levi (researchers from UNAM and UAM Xochimilco respectively) in the project supported by CONACyT entitled “Pueblos Mágicos. An interdisciplinary vision”, of which we are part. This sample is considered representative since it forms 30% of the total of 121 “Pueblos Mágicos” existing at that moment. With this information, the elements of its intangible heritage loss or transformation have been identified in every chapter based on the texts written by the participants of that project. Finally, this text shows an analysis of the effects that this federal program, as a public policy applied to 132 populations, has had on the conservation or transformation of the intangible cultural heritage of the “Pueblos Mágicos.” Transculturality, globalization, the creation of identities and the desire to increase the flow of tourists have impacted the changes that traditions (main intangible cultural heritage) have had in the 18 years that the federal program lasted.

Keywords: public policies, cultural tourism, heritage preservation, pueblos mágicos program

Procedia PDF Downloads 189
770 Twitter's Impact on Print Media with Respect to Real World Events

Authors: Basit Shahzad, Abdullatif M. Abdullatif

Abstract:

Recent advancements in Information and Communication Technologies (ICT) and easy access to Internet have made social media the first choice for information sharing related to any important events or news. On Twitter, trend is a common feature that quantifies the level of popularity of a certain news or event. In this work, we examine the impact of Twitter trends on real world events by hypothesizing that Twitter trends have an influence on print media in Pakistan. For this, Twitter is used as a platform and Twitter trends as a base line. We first collect data from two sources (Twitter trends and print media) in the period May to August 2016. Obtained data from two sources is analyzed and it is observed that social media is significantly influencing the print media and majority of the news printed in newspaper are posted on Twitter earlier.

Keywords: twitter trends, text mining, effectiveness of trends, print media

Procedia PDF Downloads 258
769 A Study on the Nostalgia Contents Analysis of Hometown Alumni in the Online Community

Authors: Heejin Yun, Juanjuan Zang

Abstract:

This study aims to analyze the text terms posted on an online community of people from the same hometown and to understand the topic and trend of nostalgia composed online. For this purpose, this study collected 144 writings which the natives of Yeongjong Island, Incheon, South-Korea have posted on an online community. And it analyzed association relations. As a result, online community texts means that just defining nostalgia as ‘a mind longing for hometown’ is not an enough explanation. Second, texts composed online have abstractness rather than persons’ individual stories. This study figured out the relationship that had the most critical and closest mutual association among the terms that constituted nostalgia through literature research and association rule concerning nostalgia. The result of this study has a characteristic that it summed up the core terms and emotions related to nostalgia.

Keywords: nostalgia, cultural memory, data mining, association rule

Procedia PDF Downloads 229
768 A Qualitative Study into the Success and Challenges in Embedding Evidence-Based Research Methods in Operational Policing Interventions

Authors: Ahmed Kadry, Gwyn Dodd

Abstract:

There has been a growing call globally for police forces to embed evidence-based policing research methods into police interventions in order to better understand and evaluate their impact. This research study highlights the success and challenges that police forces may encounter when trying to embed evidence-based research methods within their organisation. 10 in-depth qualitative interviews were conducted with police officers and staff at Greater Manchester Police (GMP) who were tasked with integrating evidence-based research methods into their operational interventions. The findings of the study indicate that with adequate resources and individual expertise, evidence-based research methods can be applied to operational work, including the testing of initiatives with strict controls in order to fully evaluate the impact of an intervention. However, the findings also indicate that this may only be possible where an operational intervention is heavily resourced with police officers and staff who have a strong understanding of evidence-based policing research methods, attained for example through their own graduate studies. In addition, the findings reveal that ample planning time was needed to trial operational interventions that would require strict parameters for what would be tested and how it would be evaluated. In contrast, interviewees underscored that operational interventions with the need for a speedy implementation were less likely to have evidence-based research methods applied. The study contributes to the wider literature on evidence-based policing by providing considerations for police forces globally wishing to apply evidence-based research methods to more of their operational work in order to understand their impact. The study also provides considerations for academics who work closely with police forces in assisting them to embed evidence-based policing. This includes how academics can provide their expertise to police decision makers wanting to underpin their work through evidence-based research methods, such as providing guidance on how to evaluate the impact of their work with varying research methods that they may otherwise be unaware of.  

Keywords: evidence based policing, evidence-based practice, operational policing, organisational change

Procedia PDF Downloads 142
767 Documents Emotions Classification Model Based on TF-IDF Weighting Measure

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Emotions classification of text documents is applied to reveal if the document expresses a determined emotion from its writer. As different supervised methods are previously used for emotion documents’ classification, in this research we present a novel model that supports the classification algorithms for more accurate results by the support of TF-IDF measure. Different experiments have been applied to reveal the applicability of the proposed model, the model succeeds in raising the accuracy percentage according to the determined metrics (precision, recall, and f-measure) based on applying the refinement of the lexicon, integration of lexicons using different perspectives, and applying the TF-IDF weighting measure over the classifying features. The proposed model has also been compared with other research to prove its competence in raising the results’ accuracy.

Keywords: emotion detection, TF-IDF, WEKA tool, classification algorithms

Procedia PDF Downloads 484
766 Uplift Modeling Approach to Optimizing Content Quality in Social Q/A Platforms

Authors: Igor A. Podgorny

Abstract:

TurboTax AnswerXchange is a social Q/A system supporting users working on federal and state tax returns. Content quality and popularity in the AnswerXchange can be predicted with propensity models using attributes of the question and answer. Using uplift modeling, we identify features of questions and answers that can be modified during the question-asking and question-answering experience in order to optimize the AnswerXchange content quality. We demonstrate that adding details to the questions always results in increased question popularity that can be used to promote good quality content. Responding to close-ended questions assertively improve content quality in the AnswerXchange in 90% of cases. Answering knowledge questions with web links increases the likelihood of receiving a negative vote from 60% of the askers. Our findings provide a rationale for employing the uplift modeling approach for AnswerXchange operations.

Keywords: customer relationship management, human-machine interaction, text mining, uplift modeling

Procedia PDF Downloads 244
765 Gender Construction in Contemporary Dystopian Fiction in Young Adult Literature: A South African Example

Authors: Johan Anker

Abstract:

The purpose of this paper is to discuss the nature of gender construction in modern dystopian fiction, the development of this genre in Young Adult Literature and reasons for the enormous appeal on the adolescent readers. A recent award winning South African text in this genre, The Mark by Edith Bullring (2014), will be used as example while also comparing this text to international bestsellers like Divergent (Roth:2011), The Hunger Games (Collins:2008) and others. Theoretical insights from critics and academics in the field of children’s literature, like Ames, Coats, Bradford, Booker, Basu, Green-Barteet, Hintz, McAlear, McCallum, Moylan, Ostry, Ryan, Stephens and Westerfield will be referred to and their insights used as part of the analysis of The Mark. The role of relevant and recurring themes in this genre, like global concerns, environmental destruction, liberty, self-determination, social and political critique, surveillance and repression by the state or other institutions will also be referred to. The paper will shortly refer to the history and emergence of dystopian literature as genre in adult and young adult literature as part of the long tradition since the publishing of Orwell’s 1984 and Huxley’s Brave New World. Different factors appeal to adolescent readers in the modern versions of this hybrid genre for young adults: teenage protagonists who are questioning the underlying values of a flawed society like an inhuman or tyrannical government, a growing understanding of the society around them, feelings of isolation and the dynamic of relationships. This unease leads to a growing sense of the potential to act against society (rebellion), and of their role as agents in a larger community and independent decision-making abilities. This awareness also leads to a growing sense of self (identity and agency) and the development of romantic relationships. The specific modern tendency of a female protagonist as leader in the rebellion against state and state apparatus, who gains in agency and independence in this rebellion, an important part of the identification with and construction of gender, while being part of the traditional coming-of-age young adult novel will be emphasized. A comparison between the traditional themes, structures and plots of young adult literature (YAL) with adult dystopian literature and those of recent dystopian YAL will be made while the hybrid nature of this genre and the 'sense of unease' but also of hope, as an essential part of youth literature, in the closure to these novels will be discussed. Important questions about the role of the didactic nature of these texts and the political issues and the importance of the formation of agency and identity for the young adult reader, as well as identification with the protagonists in this genre, are also part of this discussion of The Mark and other YAL novels.

Keywords: agency, dystopian literature, gender construction, young adult literature

Procedia PDF Downloads 190
764 Automated Evaluation Approach for Time-Dependent Question Answering Pairs on Web Crawler Based Question Answering System

Authors: Shraddha Chaudhary, Raksha Agarwal, Niladri Chatterjee

Abstract:

This work demonstrates a web crawler-based generalized end-to-end open domain Question Answering (QA) system. An efficient QA system requires a significant amount of domain knowledge to answer any question with the aim to find an exact and correct answer in the form of a number, a noun, a short phrase, or a brief piece of text for the user's questions. Analysis of the question, searching the relevant document, and choosing an answer are three important steps in a QA system. This work uses a web scraper (Beautiful Soup) to extract K-documents from the web. The value of K can be calibrated on the basis of a trade-off between time and accuracy. This is followed by a passage ranking process using the MS-Marco dataset trained on 500K queries to extract the most relevant text passage, to shorten the lengthy documents. Further, a QA system is used to extract the answers from the shortened documents based on the query and return the top 3 answers. For evaluation of such systems, accuracy is judged by the exact match between predicted answers and gold answers. But automatic evaluation methods fail due to the linguistic ambiguities inherent in the questions. Moreover, reference answers are often not exhaustive or are out of date. Hence correct answers predicted by the system are often judged incorrect according to the automated metrics. One such scenario arises from the original Google Natural Question (GNQ) dataset which was collected and made available in the year 2016. Use of any such dataset proves to be inefficient with respect to any questions that have time-varying answers. For illustration, if the query is where will be the next Olympics? Gold Answer for the above query as given in the GNQ dataset is “Tokyo”. Since the dataset was collected in the year 2016, and the next Olympics after 2016 were in 2020 that was in Tokyo which is absolutely correct. But if the same question is asked in 2022 then the answer is “Paris, 2024”. Consequently, any evaluation based on the GNQ dataset will be incorrect. Such erroneous predictions are usually given to human evaluators for further validation which is quite expensive and time-consuming. To address this erroneous evaluation, the present work proposes an automated approach for evaluating time-dependent question-answer pairs. In particular, it proposes a metric using the current timestamp along with top-n predicted answers from a given QA system. To test the proposed approach GNQ dataset has been used and the system achieved an accuracy of 78% for a test dataset comprising 100 QA pairs. This test data was automatically extracted using an analysis-based approach from 10K QA pairs of the GNQ dataset. The results obtained are encouraging. The proposed technique appears to have the possibility of developing into a useful scheme for gathering precise, reliable, and specific information in a real-time and efficient manner. Our subsequent experiments will be guided towards establishing the efficacy of the above system for a larger set of time-dependent QA pairs.

Keywords: web-based information retrieval, open domain question answering system, time-varying QA, QA evaluation

Procedia PDF Downloads 101
763 Surge Analysis of Water Transmission Mains in Una, Himachal Pradesh, India

Authors: Baldev Setia, Raj Rajeshwari, Maneesh Kumar

Abstract:

Present paper is an analysis of water transmission mains failed due to surge analysis by using basic software known as Surge Analysis Program (SAP). It is a real time failure case study of a pipe laid in Una, Himachal Pradesh. The transmission main is a 13 kilometer long pipe with 7.9 kilometers as pumping main and 5.1 kilometers as gravitational main. The analysis deals with mainly pumping mains. The results are available in two text files. Besides, several files are prepared with specific view to obtain results in a graphical form. These results help to observe the pressure difference and surge occurrence at different locations along the pipe profile, which help to redesign the transmission main with different but suitable safety measures against possible surge. A technically viable and economically feasible design has been provided as per the relevant manual and standard code of practice.

Keywords: surge, water hammer, transmission mains, SAP 2000

Procedia PDF Downloads 366
762 Music Reading Expertise Facilitates Implicit Statistical Learning of Sentence Structures in a Novel Language: Evidence from Eye Movement Behavior

Authors: Sara T. K. Li, Belinda H. J. Chung, Jeffery C. N. Yip, Janet H. Hsiao

Abstract:

Music notation and text reading both involve statistical learning of music or linguistic structures. However, it remains unclear how music reading expertise influences text reading behavior. The present study examined this issue through an eye-tracking study. Chinese-English bilingual musicians and non-musicians read English sentences, Chinese sentences, musical phrases, and sentences in Tibetan, a language novel to the participants, with their eye movement recorded. Each set of stimuli consisted of two conditions in terms of structural regularity: syntactically correct and syntactically incorrect musical phrases/sentences. They then completed a sentence comprehension (for syntactically correct sentences) or a musical segment/word recognition task afterwards to test their comprehension/recognition abilities. The results showed that in reading musical phrases, as compared with non-musicians, musicians had a higher accuracy in the recognition task, and had shorter reading time, fewer fixations, and shorter fixation duration when reading syntactically correct (i.e., in diatonic key) than incorrect (i.e., in non-diatonic key/atonal) musical phrases. This result reflects their expertise in music reading. Interestingly, in reading Tibetan sentences, which was novel to both participant groups, while non-musicians did not show any behavior differences between reading syntactically correct or incorrect Tibetan sentences, musicians showed a shorter reading time and had marginally fewer fixations when reading syntactically correct sentences than syntactically incorrect ones. However, none of the musicians reported discovering any structural regularities in the Tibetan stimuli after the experiment when being asked explicitly, suggesting that they may have implicitly acquired the structural regularities in Tibetan sentences. This group difference was not observed when they read English or Chinese sentences. This result suggests that music reading expertise facilities reading texts in a novel language (i.e., Tibetan), but not in languages that the readers are already familiar with (i.e., English and Chinese). This phenomenon may be due to the similarities between reading music notations and reading texts in a novel language, as in both cases the stimuli follow particular statistical structures but do not involve semantic or lexical processing. Thus, musicians may transfer their statistical learning skills stemmed from music notation reading experience to implicitly discover structures of sentences in a novel language. This speculation is consistent with a recent finding showing that music reading expertise modulates the processing of English nonwords (i.e., words that do not follow morphological or orthographic rules) but not pseudo- or real words. These results suggest that the modulation of music reading expertise on language processing depends on the similarities in the cognitive processes involved. It also has important implications for the benefits of music education on language and cognitive development.

Keywords: eye movement behavior, eye-tracking, music reading expertise, sentence reading, structural regularity, visual processing

Procedia PDF Downloads 380
761 Citation Analysis of New Zealand Court Decisions

Authors: Tobias Milz, L. Macpherson, Varvara Vetrova

Abstract:

The law is a fundamental pillar of human societies as it shapes, controls and governs how humans conduct business, behave and interact with each other. Recent advances in computer-assisted technologies such as NLP, data science and AI are creating opportunities to support the practice, research and study of this pervasive domain. It is therefore not surprising that there has been an increase in investments into supporting technologies for the legal industry (also known as “legal tech” or “law tech”) over the last decade. A sub-discipline of particular appeal is concerned with assisted legal research. Supporting law researchers and practitioners to retrieve information from the vast amount of ever-growing legal documentation is of natural interest to the legal research community. One tool that has been in use for this purpose since the early nineteenth century is legal citation indexing. Among other use cases, they provided an effective means to discover new precedent cases. Nowadays, computer-assisted network analysis tools can allow for new and more efficient ways to reveal the “hidden” information that is conveyed through citation behavior. Unfortunately, access to openly available legal data is still lacking in New Zealand and access to such networks is only commercially available via providers such as LexisNexis. Consequently, there is a need to create, analyze and provide a legal citation network with sufficient data to support legal research tasks. This paper describes the development and analysis of a legal citation Network for New Zealand containing over 300.000 decisions from 125 different courts of all areas of law and jurisdiction. Using python, the authors assembled web crawlers, scrapers and an OCR pipeline to collect and convert court decisions from openly available sources such as NZLII into uniform and machine-readable text. This facilitated the use of regular expressions to identify references to other court decisions from within the decision text. The data was then imported into a graph-based database (Neo4j) with the courts and their respective cases represented as nodes and the extracted citations as links. Furthermore, additional links between courts of connected cases were added to indicate an indirect citation between the courts. Neo4j, as a graph-based database, allows efficient querying and use of network algorithms such as PageRank to reveal the most influential/most cited courts and court decisions over time. This paper shows that the in-degree distribution of the New Zealand legal citation network resembles a power-law distribution, which indicates a possible scale-free behavior of the network. This is in line with findings of the respective citation networks of the U.S. Supreme Court, Austria and Germany. The authors of this paper provide the database as an openly available data source to support further legal research. The decision texts can be exported from the database to be used for NLP-related legal research, while the network can be used for in-depth analysis. For example, users of the database can specify the network algorithms and metrics to only include specific courts to filter the results to the area of law of interest.

Keywords: case citation network, citation analysis, network analysis, Neo4j

Procedia PDF Downloads 106
760 Communicative and Artistic Machines: A Survey of Models and Experiments on Artificial Agents

Authors: Artur Matuck, Guilherme F. Nobre

Abstract:

Machines can be either tool, media, or social agents. Advances in technology have been delivering machines capable of autonomous expression, both through communication and art. This paper deals with models (theoretical approach) and experiments (applied approach) related to artificial agents. On one hand it traces how social sciences' scholars have worked with topics such as text automatization, man-machine writing cooperation, and communication. On the other hand it covers how computer sciences' scholars have built communicative and artistic machines, including the programming of creativity. The aim is to present a brief survey on artificially intelligent communicators and artificially creative writers, and provide the basis to understand the meta-authorship and also to new and further man-machine co-authorship.

Keywords: artificial communication, artificial creativity, artificial writers, meta-authorship, robotic art

Procedia PDF Downloads 292
759 Surge Analysis of Water Transmission Mains in Una, Himachal Pradesh (India)

Authors: Baldev Setia, Raj Rajeshwari, Maneesh Kumar

Abstract:

Present paper is an analysis of water transmission mains failed due to surge analysis by using basic software known as Surge Analysis Program (SAP). It is a real time failure case study of a pipe laid in Una, Himachal Pradesh. The transmission main is a 13 kilometres long pipe with 7.9 kilometres as pumping main and 5.1 kilometres as gravitational main. The analysis deals with mainly pumping mains. The results are available in two text files. Besides, several files are prepared with specific view to obtain results in a graphical form. These results help to observe the pressure difference and surge occurrence at different locations along the pipe profile, which help to redesign the transmission main with different but suitable safety measures against possible surge. A technically viable and economically feasible design has been provided as per the relevant manual and standard code of practice.

Keywords: surge, water hammer, transmission mains, SAP 2000

Procedia PDF Downloads 403
758 An Analysis of the Strategies Employed to Curate, Conserve and Digitize the Timbuktu Manuscripts

Authors: F. Saptouw

Abstract:

This paper briefly reviews the range of curatorial interventions made to preserve and display the Timbuktu Manuscripts. The government of South Africa and Mali collaborated to preserve the manuscripts, and brief notes will be presented about the value of archives in those specific spaces. The research initiatives of the Tombouctou Manuscripts Project, based at the University of Cape Town, feature prominently in the text. A brief overview of the history of the archive will be presented and its preservation as a key turning point in curating the intellectual history of the continent. ­­­The strategies of preservation, curation, publication and digitization are presented as complimentary interventions. Each materialization of the manuscripts contributes something significant; the complexity of the contribution is dependent primarily on the format of presentation. This integrated reading of the manuscripts is presented as a means to gain a more nuanced understanding of the past, which greatly surpasses how much information would be gleaned from relying on a single media format.

Keywords: archive, curatorship, cultural heritage, museum practice, Timbuktu manuscripts

Procedia PDF Downloads 113
757 Radical Web Text Classification Using a Composite-Based Approach

Authors: Kolade Olawande Owoeye, George R. S. Weir

Abstract:

The widespread of terrorism and extremism activities on the internet has become a major threat to the government and national securities due to their potential dangers which have necessitated the need for intelligence gathering via web and real-time monitoring of potential websites for extremist activities. However, the manual classification for such contents is practically difficult or time-consuming. In response to this challenge, an automated classification system called composite technique was developed. This is a computational framework that explores the combination of both semantics and syntactic features of textual contents of a web. We implemented the framework on a set of extremist webpages dataset that has been subjected to the manual classification process. Therein, we developed a classification model on the data using J48 decision algorithm, this is to generate a measure of how well each page can be classified into their appropriate classes. The classification result obtained from our method when compared with other states of arts, indicated a 96% success rate in classifying overall webpages when matched against the manual classification.

Keywords: extremist, web pages, classification, semantics, posit

Procedia PDF Downloads 145
756 A Book Review of Inside the Battle of Algiers, by Zohra Drif: A Thematic Analysis on Women’s Agency

Authors: W. Zekri

Abstract:

This paper explores Zohra Drif’s memoir, Inside the Battle of Algiers, which narrates her desires as a student to become a revolutionary activist. She exemplified, in her narrative, the different roles, she and her fellows performed as combatants in the Casbah during the Algerian Revolution 1954-1962. This book review aims to evaluate the concept of women’s agency through education and language learning, and its impact on empowering women’s desires. Close-reading method and thematic analysis are used to explore the text. The analysis identified themes that refine the meaning of agency which are social and cultural supports, education, and language proficiency. These themes aim to contribute to the representation in Inside the Battle of Algiers of a woman guerrilla who engaged herself to perform national acts of resistance.

Keywords: agency, education, learning, women

Procedia PDF Downloads 176
755 Improving Fingerprinting-Based Localization System Using Generative Artificial Intelligence

Authors: Getaneh Berie Tarekegn

Abstract:

A precise localization system is crucial for many artificial intelligence Internet of Things (AI-IoT) applications in the era of smart cities. Their applications include traffic monitoring, emergency alarming, environmental monitoring, location-based advertising, intelligent transportation, and smart health care. The most common method for providing continuous positioning services in outdoor environments is by using a global navigation satellite system (GNSS). Due to nonline-of-sight, multipath, and weather conditions, GNSS systems do not perform well in dense urban, urban, and suburban areas.This paper proposes a generative AI-based positioning scheme for large-scale wireless settings using fingerprinting techniques. In this article, we presented a novel semi-supervised deep convolutional generative adversarial network (S-DCGAN)-based radio map construction method for real-time device localization. We also employed a reliable signal fingerprint feature extraction method with t-distributed stochastic neighbor embedding (t-SNE), which extracts dominant features while eliminating noise from hybrid WLAN and long-term evolution (LTE) fingerprints. The proposed scheme reduced the workload of site surveying required to build the fingerprint database by up to 78.5% and significantly improved positioning accuracy. The results show that the average positioning error of GAILoc is less than 39 cm, and more than 90% of the errors are less than 82 cm. That is, numerical results proved that, in comparison to traditional methods, the proposed SRCLoc method can significantly improve positioning performance and reduce radio map construction costs.

Keywords: location-aware services, feature extraction technique, generative adversarial network, long short-term memory, support vector machine

Procedia PDF Downloads 71
754 Modern Management Principles Enshrined in Ancient Vedic Texts

Authors: M. Kishore Kumar

Abstract:

The ancient Vedas and Upanishads are a treasure of knowledge gifted to the world by India. The four Vedas, a conglomerate of Hindu scriptures, contain many principles of modern management at organisation as well as at individual levels. It lays down the duties of a King and ministers as well as its citizens and cites values for leadership. Bhagawadgita (or ‘Gita’ in short), popularly cited as Pancham (Fifth) Veda, is stated to be sermoned about 5000 years ago by Lord Krishna. In the midst of the Kurukshetra battle, Gitopadesh was given various aspects such as dharma (duties), karma (action), stithaprajna (stable mind), nishkama (detachment from results) and ethics. Arjun was steered to victory by Lord Krishna as his charioteer, and the 700-odd-verse holy text Bhagawadgita can become a valuable guide for all of us to achieve success in business management. Many parallels exist between modern-day management theories and principles enshrined in Vedic texts.

Keywords: goal, motivation, leadership, mind, management

Procedia PDF Downloads 87
753 Investigation of Topic Modeling-Based Semi-Supervised Interpretable Document Classifier

Authors: Dasom Kim, William Xiu Shun Wong, Yoonjin Hyun, Donghoon Lee, Minji Paek, Sungho Byun, Namgyu Kim

Abstract:

There have been many researches on document classification for classifying voluminous documents automatically. Through document classification, we can assign a specific category to each unlabeled document on the basis of various machine learning algorithms. However, providing labeled documents manually requires considerable time and effort. To overcome the limitations, the semi-supervised learning which uses unlabeled document as well as labeled documents has been invented. However, traditional document classifiers, regardless of supervised or semi-supervised ones, cannot sufficiently explain the reason or the process of the classification. Thus, in this paper, we proposed a methodology to visualize major topics and class components of each document. We believe that our methodology for visualizing topics and classes of each document can enhance the reliability and explanatory power of document classifiers.

Keywords: data mining, document classifier, text mining, topic modeling

Procedia PDF Downloads 402