Search results for: semantic repository
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 672

192 Little Girls and Big Stories: A Thematic Analysis of Gender Representations in Selected Asian Room to Read Storybooks

Authors: Cheeno Marlo Sayuno

Abstract:

Room to Read is an international nonprofit organization aimed at empowering young readers through literature and literacy education. In particular, the organization is focused on girls’ education in schools and bettering their social status through crafting stories and making sure that these stories are accessible to them. In 2019, Room to Read visited the Philippines and partnered with Philippine children’s literature publishers Adarna House, Lampara Books, Anvil Publishing, and OMF-Hiyas with the goal of producing contextualized stories that Filipino children can read. The result is a set of 20 storybooks developed by Filipino writers and illustrators, the author of this paper included. The project led to narratives of experiences in storybook production from conceptualization to publication, towards translations and reimagining in online repository, storytelling, and audiobook formats. During the production process, we were particularly reminded of gender representations, child’s rights, and telling stories that can empower the children in vulnerable communities, who are the beneficiaries of the project. The storybooks, along with many others produced in Asia and the world, are available online through the literacycloud.org website of Room to Read. In this study, the goal is to survey the stories produced in Asia and look at how gender is represented in the storybooks. By analyzing both the texts and the illustrations of the storybooks produced across Asian countries, themes of portrayals of young boys and girls, their characteristics and narratives, and how they are empowered in the stories are identified, with the goal of mapping how Room to Read is able to address the problem of access to literacy among young girls and ensuring them that they can do anything, the way they are portrayed in the stories. The paper hopes to determine how gender is represented in Asian storybooks produced by the international nonprofit organization Room to Read. 
Thematic textual analysis was used as methodology, where the storybooks are analyzed qualitatively to identify arising themes of gender representation. This study will shed light on the importance of responsible portrayal of gender in storybooks and how it can impact and empower children. The results of the study can also aid writers and illustrators in developing gender-sensitive storybooks.

Keywords: room to read, asian storybooks, young girls, thematic analysis, child empowerment, literacy, education

Procedia PDF Downloads 79
191 Conceptual Model for Massive Open Online Blended Courses Based on Disciplines’ Concepts Capitalization and Obstacles’ Detection

Authors: N. Hammid, F. Bouarab-Dahmani, T. Berkane

Abstract:

Since its appearance, the MOOC (massive open online course) has been gaining more and more attention from educational communities around the world. Apart from the current design and purposes of MOOCs, the creators of the MOOC focused on the importance of connection and knowledge exchange between individuals in learning. In this paper, we present a conceptual model for massive open online blended courses where teachers around the world can collaborate and exchange their experience to produce common, efficient content designed as a MOOC open to their students, giving them a better learning experience. This model is based on the capitalization of disciplines’ concepts and the detection of the obstacles met by students when faced with problem situations (exercises, projects, case studies, etc.). This detection is made possible by analyzing the frequency of semantic errors committed by the students. The participation of teachers in the design of the course and the attendance of their students can guarantee efficient and extensive participation (a large number of participants) in the course, and address the learners’ motivation and the evaluation issues, in that the teachers designing the course assess their own students. Thus, the teachers’ review, together with their knowledge, offers a better assessment and efficient connections to their students.

Keywords: massive open online course, MOOC, online learning, e-learning

Procedia PDF Downloads 268
190 English Grammatical Errors of Arabic Sentence Translations Done by Machine Translations

Authors: Muhammad Fathurridho

Abstract:

Grammar, as the set of rules that every language uses in order to be understood, is always related to syntax and morphology. Arabic grammar differs from the grammars of other languages: it has more rules and difficulties. This paper aims to investigate and describe the English grammatical errors of machine translation systems in translating Arabic sentences, including declarative, exclamatory, imperative, and interrogative sentences, specifically in the year 2018, in which such systems can be supported by artificial intelligence. The Arabic sample sentences, which are divided into two kinds (verbal and nominal sentences) and drawn from several published Arabic texts, will be examined as the source language samples. The translated sentences produced by several popular online machine translation systems, including Google Translate, Microsoft Bing, Babylon, Facebook, Hellotalk, Worldlingo, Yandex Translate, and Tradukka Translate, are the material objects of this research. The descriptive method adopted in this research will show the grammatical errors in the English target language and classify them. The conclusion of this paper shows that the grammatical errors in machine translation results are varied and can generally be classified into morphological, syntactical, and semantic errors across all types of Arabic words (noun, verb, and particle), and that this classification can serve as one of the evaluations for machine translation providers to correct these errors and make their results more understandable.

Keywords: Arabic, Arabic-English translation, machine translation, grammatical errors

Procedia PDF Downloads 155
189 How Unicode Glyphs Revolutionized the Way We Communicate

Authors: Levi Corallo

Abstract:

Typed language made by humans on computers and cell phones has made a significant distinction from previous modes of written language exchanges. While acronyms remain one of the most predominant markings of typed language, another and perhaps more recent revolution in the way humans communicate has been with the use of symbols or glyphs, primarily Emojis—globally introduced on the iPhone keyboard by Apple in 2008. This paper seeks to analyze the use of symbols in typed communication from both a linguistic and machine learning perspective. The Unicode system will be explored and methods of encoding will be juxtaposed with the current machine and human perception. Topics in how typed symbol usage exists in conversation will be explored as well as topics across current research methods dealing with Emojis like sentiment analysis, predictive text models, and so on. This study proposes that sequential analysis is a significant feature for analyzing unicode characters in a corpus with machine learning. Current models that are trying to learn or translate the meaning of Emojis should be starting to learn using bi- and tri-grams of Emoji, as well as observing the relationship between combinations of different Emoji in tandem. The sociolinguistics of an entire new vernacular of language referred to here as ‘typed language’ will also be delineated across my analysis with unicode glyphs from both a semantic and technical perspective.
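The bi- and tri-gram idea mentioned in this abstract can be illustrated with a short sketch. This is not the author's method: the emoji test below is a coarse code-point range check chosen for illustration (an assumption), it ignores skin-tone modifiers and zero-width-joiner sequences, and the sample corpus is invented.

```python
from collections import Counter

# Assumption: two broad code-point ranges stand in for a full emoji lexicon.
EMOJI_RANGES = [(0x1F300, 0x1FAFF), (0x2600, 0x27BF)]


def is_emoji(ch):
    """Return True if the character falls in one of the assumed emoji ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def emoji_ngrams(message, n):
    """Return n-grams over the emoji of one message, in order of appearance.

    Non-emoji text between two emoji is skipped, so the n-grams describe
    the emoji sequence only.
    """
    emojis = [ch for ch in message if is_emoji(ch)]
    return [tuple(emojis[i:i + n]) for i in range(len(emojis) - n + 1)]


def count_ngrams(corpus, n):
    """Count emoji n-grams over a list of messages."""
    counts = Counter()
    for message in corpus:
        counts.update(emoji_ngrams(message, n))
    return counts


# Hypothetical toy corpus.
corpus = ["so tired 😴😴 but happy 😊", "😴😊 long day", "party 🎉🎉🎉"]
bigrams = count_ngrams(corpus, 2)
trigrams = count_ngrams(corpus, 3)
```

Counting n-grams per message, rather than over the concatenated corpus, keeps sequences from spilling across message boundaries; whether intervening words should break a sequence is a design choice the abstract does not settle.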

Keywords: unicode, text symbols, emojis, glyphs, communication

Procedia PDF Downloads 194
188 An Experiential Learning of Ontology-Based Multi-document Summarization by Removal Summarization Techniques

Authors: Pranjali Avinash Yadav-Deshmukh

Abstract:

Remarkable development of the Internet along with the new technological innovation, such as high-speed systems and affordable large storage space have led to a tremendous increase in the amount and accessibility to digital records. For any person, studying of all these data is tremendously time intensive, so there is a great need to access effective multi-document summarization (MDS) systems, which can successfully reduce details found in several records into a short, understandable summary or conclusion. For semantic representation of textual details in ontology area, as a theoretical design, our system provides a significant structure. The stability of using the ontology in fixing multi-document summarization problems in the sector of catastrophe control is finding its recommended design. Saliency ranking is usually allocated to each phrase and phrases are rated according to the ranking, then the top rated phrases are chosen as the conclusion. With regards to the conclusion quality, wide tests on a selection of media announcements are appropriate for “Jammu Kashmir Overflow in 2014” records. Ontology centered multi-document summarization methods using “NLP centered extraction” outshine other baselines. Our participation in recommended component is to implement the details removal methods (NLP) to enhance the results.
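The rank-then-select step described in this abstract (score each sentence, keep the top-rated ones as the summary) can be sketched as follows. The saliency score here is a plain average word frequency, swapped in for the ontology-based scoring the abstract refers to, which is not specified; the function names and sample documents are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    """Split text on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lower-case alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def summarize(documents, top_k):
    """Pick the top_k most salient sentences across all documents.

    Assumption: saliency = mean corpus frequency of the sentence's words.
    Selected sentences are returned in their original order.
    """
    sentences = [s for doc in documents for s in split_sentences(doc)]
    freq = Counter(w for s in sentences for w in tokenize(s))

    def saliency(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: saliency(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_k])]


# Hypothetical input documents.
docs = [
    "Flood water rose in the valley. Rescue teams reached the valley.",
    "The weather was mild. Flood water damaged roads in the valley.",
]
summary = summarize(docs, top_k=2)
```

Because no stop-word list is applied, frequent function words inflate the scores; a real system would likely filter them or use a weighting such as TF-IDF.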

Keywords: disaster management, extraction technique, k-means, multi-document summarization, NLP, ontology, sentence extraction

Procedia PDF Downloads 386
187 Information Disclosure And Financial Sentiment Index Using a Machine Learning Approach

Authors: Alev Atak

Abstract:

In this paper, we aim to create a financial sentiment index by investigating companies’ voluntary information disclosures. We retrieve structured content from BIST 100 companies’ financial reports for the period 1998-2018 and extract relevant financial information for sentiment analysis through Natural Language Processing. We measure strategy-related disclosures and their cross-sectional variation and classify report content into generic sections using synonym lists divided into four main categories according to their liquidity risk profile, risk positions, intra-annual information, and exposure to risk. We use Word Error Rate and Cosine Similarity for comparing and measuring text similarity and derivation in sets of texts. In addition to performing text extraction, we will provide a range of text analysis options, such as readability metrics, word counts using pre-determined lists (e.g., forward-looking, uncertainty, tone, etc.), and comparison with a reference corpus (at the word, part-of-speech and semantic levels). Thus, we create an adequate analytical tool and a financial dictionary to depict the importance of granular financial disclosure for investors to correctly identify risk-taking behavior and hence make the aggregated effects traceable.
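The two text-comparison measures named in this abstract have standard definitions, sketched below. Word Error Rate is the word-level edit distance divided by the reference length, and cosine similarity is computed here over simple bag-of-words counts; the whitespace tokenization and the sample sentences are assumptions, since the abstract does not describe the authors' preprocessing.

```python
import math
from collections import Counter


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between bag-of-words count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical report fragments.
wer = word_error_rate("liquidity risk increased sharply",
                      "liquidity risk decreased")
sim = cosine_similarity("exposure to liquidity risk",
                        "liquidity risk exposure")
```

WER is sensitive to word order while bag-of-words cosine similarity is not, which is one reason the two measures are often reported together.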

Keywords: financial sentiment, machine learning, information disclosure, risk

Procedia PDF Downloads 94
186 Number Variation of the Personal Pronoun we Used by Chinese English Learners

Authors: Qiong Hu, Ming Yue

Abstract:

Language variation signals the newest usage of a language community, which might become the developmental trend of that language. However, language textbooks cannot keep up with these emergent usages. Most Chinese English learners nowadays are still exposed to the traditional grammar prescribed in textbooks, so some variational usages cannot be acquired. The personal pronoun we is prescribed as a plural pronoun in textbook grammar, but its number value is more flexible in actual use. Based on the Chinese Learner English Corpus (CLEC), and with a homemade Friends corpus as reference, the present research explores the number value of the first person pronoun we as used by Chinese English learners. Taking the subjectivity of we into consideration, this paper annotates the number value of all instances of we in “we + PCU (perception-cognition-utterance) verb” collocations. Results show that although the learners are exposed to traditional textbooks that prescribe the plural reference of we, some unconventional usage (singular or vague in reference) still exists in the writings of Chinese English learners, though it is less frequent than in native speech. Corpus data and results from manual semantic annotation show that this could be due to the impact of formulaic sequences on the learners and to positive transfer from their native language. An improved SLA model of native language, target language and interlanguage is put forward to recognize the existence of variation in second language acquisition, which should be given more attention during teaching.

Keywords: Chinese English learners, number, PCU verbs, Personal pronoun we

Procedia PDF Downloads 355
185 Synthetic Method of Contextual Knowledge Extraction

Authors: Olga Kononova, Sergey Lyapin

Abstract:

The requirements of the global information society are transparency and reliability of data, as well as the ability to manage information resources independently, in particular to search, analyze, and evaluate information, thereby obtaining new expertise. Moreover, satisfying society's information needs increases the efficiency of enterprise management and public administration. The study of structurally organized thematic and semantic contexts of different types, automatically extracted from unstructured data, is one of the important tasks for the application of information technologies in education, science, culture, governance and business. The objectives of this study are the typologization of contextual knowledge and the selection or creation of effective tools for extracting and analyzing contextual knowledge. Explication of various kinds and forms of contextual knowledge involves the development and use of full-text search information systems. For implementation purposes, the authors use services of the e-library 'Humanitariana', such as contextual search, different types of queries (paragraph-oriented query, frequency-ranked query), and automatic extraction of knowledge from scientific texts. The multifunctional e-library 'Humanitariana' is realized in the Internet architecture in a WWS configuration (Web-browser / Web-server / SQL-server). An advantage of using 'Humanitariana' is the possibility of combining the resources of several organizations. Scholars and research groups may work in a local network mode and in distributed IT environments, with the ability to access the resources of any participating organization's servers. The paper discusses some specific cases of contextual knowledge explication with the use of the e-library services and focuses on the possibilities of new types of contextual knowledge. The experimental research base consists of scientific texts about 'e-government' and 'computer games'.
An analysis of the subject-themed texts trends allowed to propose the content analysis methodology, that combines a full-text search with automatic construction of 'terminogramma' and expert analysis of the selected contexts. 'Terminogramma' is made out as a table that contains a column with a frequency-ranked list of words (nouns), as well as columns with an indication of the absolute frequency (number) and the relative frequency of occurrence of the word (in %% ppm). The analysis of 'e-government' materials showed, that the state takes a dominant position in the processes of the electronic interaction between the authorities and society in modern Russia. The media credited the main role in these processes to the government, which provided public services through specialized portals. Factor analysis revealed two factors statistically describing the used terms: human interaction (the user) and the state (government, processes organizer); interaction management (public officer, processes performer) and technology (infrastructure). Isolation of these factors will lead to changes in the model of electronic interaction between government and society. In this study, the dominant social problems and the prevalence of different categories of subjects of computer gaming in science papers from 2005 to 2015 were identified. Therefore, there is an evident identification of several types of contextual knowledge: micro context; macro context; dynamic context; thematic collection of queries (interactive contextual knowledge expanding a composition of e-library information resources); multimodal context (functional integration of iconographic and full-text resources through hybrid quasi-semantic algorithm of search). Further studies can be pursued both in terms of expanding the resource base on which they are held, and in terms of the development of appropriate tools.
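The 'terminogramma' table described in this abstract (a frequency-ranked word list with absolute and relative frequencies) can be sketched in a few lines. This is an illustration only: the source restricts the list to nouns, which would need a part-of-speech tagger and is skipped here, and because the source's notation for relative frequency is ambiguous, the sketch reports both percent and parts per million. The function name and sample text are hypothetical.

```python
import re
from collections import Counter


def terminogramma(text, top_n=5):
    """Build a frequency-ranked table of words.

    Each row is (word, absolute count, relative frequency in percent,
    relative frequency in parts per million).
    Assumption: every alphabetic token is counted; no noun filtering.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    total = len(words)
    rows = []
    for word, count in Counter(words).most_common(top_n):
        rows.append((word, count,
                     100.0 * count / total,
                     1_000_000 * count / total))
    return rows


# Hypothetical sample text.
sample = "government portal government service portal government"
table = terminogramma(sample, top_n=3)
for word, count, percent, ppm in table:
    print(f"{word:<12}{count:>4}{percent:>8.2f}%{ppm:>12.0f} ppm")
```

In the methodology the abstract outlines, a table like this would be built from the contexts returned by a full-text query and then passed to expert analysis.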

Keywords: contextual knowledge, contextual search, e-library services, frequency-ranked query, paragraph-oriented query, technologies of the contextual knowledge extraction

Procedia PDF Downloads 359
184 Original and the Translated: A Comparative Evaluation of Native and Non-Native English Translations of Faiz

Authors: Anam Nawaz

Abstract:

The present study is an attempt to compare the translations of Faiz’s poetry made by native and non-native translators, to determine the role of the translator in terms of preserving the cultural ethos of the original text. Peter Newmark and Katharine Reiss’s approaches to translation criticism have been used to provide a theoretical framework for the study. This study also emphasizes those cultural and semantic aspects of the original which are translated more convincingly by a native translator, and contrasts them with those features which the non-natives can tackle more ably. The research also highlights the linguistic sockets ignored by the interpreters in the translation process. The analysis showed that both native and non-native translators have made an admirable effort to stay as close to the original as possible. The natives, with their advantage of belonging to the same culture, have excelled in preserving the original subject matter, whereas the non-native renderings have been presented in a much more rhythmic and poetic manner with an excellent choice of words. Though none of the four translators has successfully recreated Faiz’s magic, V. G. Kiernan and Sarvat Rahman’s translations can be regarded as the closest to the original. Whereas V. G. Kiernan, with his outstanding command of English, mesmerizes the readers, Sarvat Rahman’s profound understanding of cultural ties helps establish her translations as a brilliant example of faithful re-renderings.

Keywords: comparative translations, linguistic and cultural constraints, native translators, non-native translators, poetry and translation, Faiz Ahmad Faiz

Procedia PDF Downloads 262
183 Quantitative Analysis of the Quality of Housing and Land Use in the Built-up area of Croatian Coastal City of Zadar

Authors: Silvija Šiljeg, Ante Šiljeg, Branko Cavrić

Abstract:

Housing is considered as a basic human need and important component of the quality of life (QoL) in urban areas worldwide. In contemporary housing studies, the concept of the quality of housing (QoH) is considered as a multi-dimensional and multi-disciplinary field. It emphasizes connection between various aspects of the QoL which could be measured by quantitative and qualitative indicators at different spatial levels (e.g. local, city, metropolitan, regional). The main goal of this paper is to examine the QoH and compare results of quantitative analysis with the clutter land use categories derived for selected local communities in Croatian Coastal City of Zadar. The qualitative housing analysis based on the four housing indicators (out of total 24 QoL indicators) has provided identification of the three Zadar’s local communities with the highest estimated QoH ranking. Furthermore, by using GIS overlay techniques, the QoH was merged with the urban environment analysis and introduction of spatial metrics based on the three categories: the element, class and environment as a whole. In terms of semantic-content analysis, the research has also generated a set of indexes suitable for evaluation of “housing state of affairs” and future decision making aiming at improvement of the QoH in selected local communities.

Keywords: housing, quality, indicators, indexes, urban environment, GIS, element, class

Procedia PDF Downloads 410
182 Development of a French to Yorùbá Machine Translation System

Authors: Benjamen Nathaniel, Eludiora Safiriyu Ijiyemi, Egume Oneme Lucky

Abstract:

A review of machine translation systems shows that many computational artefacts have been developed to translate written or spoken texts from a source language into Yorùbá through Machine Translation systems. However, there is no work on a French to Yorùbá machine translation system; hence, the study investigated the process involved in translating French into its Yorùbá equivalent, with a view to adopting a rule-based MT approach to build a Machine Translation framework from simple sentences administered through a questionnaire. Articles and relevant textbooks were reviewed, and key speakers of both languages were interviewed, to find out the processes involved in translating French simple sentences into their Yorùbá equivalents using home domain terminologies. To achieve this, a model was formulated using phrase grammar structure, re-write rules, parse trees, and automata theory-based techniques, then designed and implemented with the unified modeling language (UML) and the Python programming language, respectively. Analysis of the results showed that the Machine Translation system performed 18.45% above the Experimental Subject Respondent and 2.7% below the Linguistics Expert when analysed for word orthography, sentence syntax and semantic correctness of the sentences. When compared with the Google Machine Translation system, the developed system performed better on lexicons of the target language.

Keywords: machine translation (MT), rule-based, French language, Yorùbá language

Procedia PDF Downloads 77
181 Analysis Model for the Relationship of Users, Products, and Stores on Online Marketplace Based on Distributed Representation

Authors: Ke He, Wumaier Parezhati, Haruka Yamashita

Abstract:

Recently, online marketplaces in the e-commerce industry, such as Rakuten and Alibaba, have become some of the most popular online marketplaces in Asia. In these shopping websites, consumers can select purchase products from a large number of stores. Additionally, consumers of the e-commerce site have to register their name, age, gender, and other information in advance, to access their registered account. Therefore, establishing a method for analyzing consumer preferences from both the store and the product side is required. This study uses the Doc2Vec method, which has been studied in the field of natural language processing. Doc2Vec has been used in many cases to analyze the extraction of semantic relationships between documents (represented as consumers) and words (represented as products) in the field of document classification. This concept is applicable to represent the relationship between users and items; however, the problem is that one more factor (i.e., shops) needs to be considered in Doc2Vec. More precisely, a method for analyzing the relationship between consumers, stores, and products is required. The purpose of our study is to combine the analysis of the Doc2vec model for users and shops, and for users and items in the same feature space. This method enables the calculation of similar shops and items for each user. In this study, we derive the real data analysis accumulated in the online marketplace and demonstrate the efficiency of the proposal.
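Once users, shops, and items are embedded in the same feature space, as this abstract proposes, finding similar shops and items for a user reduces to a nearest-neighbour lookup. The sketch below shows only that lookup step with cosine similarity; it does not train Doc2Vec, and the three-dimensional vectors and all names are invented placeholders standing in for learned embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_for_user(user_vec, candidates):
    """Return candidate names sorted from most to least similar to the user."""
    return sorted(candidates,
                  key=lambda name: cosine(user_vec, candidates[name]),
                  reverse=True)


# Hypothetical embeddings; in practice these would come from the trained model.
users = {"user_1": [0.9, 0.1, 0.0]}
shops = {"shop_a": [0.8, 0.2, 0.1], "shop_b": [0.0, 0.1, 0.9]}
items = {"item_x": [0.7, 0.3, 0.0], "item_y": [0.1, 0.0, 1.0]}

similar_shops = rank_for_user(users["user_1"], shops)
similar_items = rank_for_user(users["user_1"], items)
```

The same ranking function serves both shops and items precisely because all three entity types share one space, which is the property the abstract's combined model is meant to provide.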

Keywords: Doc2Vec, online marketplace, marketing, recommendation systems

Procedia PDF Downloads 112
180 LLM-Powered User-Centric Knowledge Graphs for Unified Enterprise Intelligence

Authors: Rajeev Kumar, Harishankar Kumar

Abstract:

Fragmented data silos within enterprises impede the extraction of meaningful insights and hinder efficiency in tasks such as product development, client understanding, and meeting preparation. To address this, we propose a system-agnostic framework that leverages large language models (LLMs) to unify diverse data sources into a cohesive, user-centered knowledge graph. By automating entity extraction, relationship inference, and semantic enrichment, the framework maps interactions, behaviors, and data around the user, enabling intelligent querying and reasoning across various data types, including emails, calendars, chats, documents, and logs. Its domain adaptability supports applications in contextual search, task prioritization, expertise identification, and personalized recommendations, all rooted in user-centric insights. Experimental results demonstrate its effectiveness in generating actionable insights, enhancing workflows such as trip planning, meeting preparation, and daily task management. This work advances the integration of knowledge graphs and LLMs, bridging the gap between fragmented data systems and intelligent, unified enterprise solutions focused on user interactions.

Keywords: knowledge graph, entity extraction, relation extraction, LLM, activity graph, enterprise intelligence

Procedia PDF Downloads 5
179 Understanding the Cultural Landscape of Kuttanad: Life within the Constraints of Nature

Authors: K. Nikilsha, Lakshmi Manohar, Debayan Chatterjee

Abstract:

Landscape is a setting that informs the way of life of a set of people, and the repository of intangible values and human meanings that nurture our very existence. Along with the linkage that it forms with our lives, it can be argued that landscape and memory cannot be separated, as landscape is the nucleus of our memories. In this context, this paper studies the landscape evolution of a region with a unique geographic setting, where the dependency of the inhabitants on its resources led to the formation of certain peculiar beliefs and taboos that formed the basis of a set of unwritten rules and guidelines which they still follow as a part of their lifestyle. One such example is Kuttanad, a low-lying region in Kerala which is a complex mosaic of fragmented agricultural landscape incorporating coastal backwaters, rivers, marshes, paddy fields and water channels. The greater the physical involvement with the resources, the greater the inhabitants' attachment to them. This attachment of the inhabitants to the place is very strong because the creation of this land was the result of the toil of the low-caste labourers who strived day and night to create Kuttanad, which was reclaimed from water with the help of finance supplied by their landlords. However, the greatest challenge they face is posed by the forces of water in the form of floods. As this land is fed by five rivers, even a slight variation in rainfall in its watershed area can cause a large imbalance in the water level, causing the reclaimed land to be inundated. The effects of climate change, including increase in rainfall, rise in sea level and change of seasons, can act as a catalyst to this damage. Hasty urbanization has led to the conversion of paddy fields to housing plots and coconut/plantain fields, giving no regard to the traditional systems which had once respected nature and combated floods and droughts through the various cultural practices and taboos observed by the people.
Thus it is essential to look back at the landscape evolution of Kuttanad and to recognise methods used traditionally in the region to establish a cultural landscape, and to understand how climate change and urbanisation shall pose a challenge to the existing landscape and lifestyle. This research also explores the possibilities of alternative and sustainable approaches for resilient urban development learned from Kuttanad as a case study.

Keywords: ecological conservation, landscape and ecological engineering, landscape evolution, man-made landscapes

Procedia PDF Downloads 266
178 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data is then pre-processed using data pre-processing techniques. The processed data is fed into the proposed algorithm and the obtained result is analyzed. Some of the algorithms used in satellite imagery classification are U-Net, Random Forest, DeepLabv3, CNN, ANN, ResNet, etc. In this project, we are using the DeepLabv3 (atrous convolution) algorithm for land cover classification. The dataset used is the DeepGlobe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.
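Atrous (dilated) convolution, the operation this abstract relies on, spaces the kernel taps `rate` samples apart so that the same number of weights covers a wider context. The one-dimensional sketch below illustrates the idea in plain Python with no padding; it is not DeepLabv3 itself, and the signal, kernel, and rates are arbitrary values chosen for illustration.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous convolution (cross-correlation form, no padding).

    Kernel tap k reads signal[i + k * rate], so the receptive field is
    (len(kernel) - 1) * rate + 1 samples wide.
    """
    span = (len(kernel) - 1) * rate
    return [
        sum(kernel[k] * signal[i + k * rate] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # simple difference filter

# Applying several rates in parallel gives responses at several scales.
multi_scale = {rate: atrous_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}
```

With rate 1 the filter compares samples two positions apart; with rate 3 the same three weights compare samples six positions apart, which is how multiple rates capture context at multiple scales without adding parameters.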

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 140
177 The Involvement of Visual and Verbal Representations Within a Quantitative and Qualitative Visual Change Detection Paradigm

Authors: Laura Jenkins, Tim Eschle, Joanne Ciafone, Colin Hamilton

Abstract:

An original working memory model suggested the separation of visual and verbal systems in working memory architecture, in which only visual working memory components were used during visual working memory tasks. It was later suggested that the visuospatial sketchpad was the only memory component in use during visual working memory tasks, and components such as the phonological loop were not considered. In more recent years, a contrasting approach has been developed with the use of an executive resource to incorporate both visual and verbal representations in visual working memory paradigms. This was supported by research demonstrating the use of verbal representations and an executive resource in a visual matrix patterns task. The aim of the current research is to investigate the working memory architecture during both a quantitative and a qualitative visual working memory task. A dual task method will be used. Three secondary tasks will be used which are designed to target specific components within the working memory architecture: Dynamic Visual Noise (visual components), Visual Attention (spatial components) and Verbal Attention (verbal components). A comparison of the visual working memory tasks will be made to discover whether verbal representations are in use, as the previous literature suggested. This direct comparison has not been made so far in the literature. Consideration will be given to whether a domain-specific approach should be employed when discussing visual working memory tasks, or whether a more domain-general approach could be used instead.

Keywords: semantic organisation, visual memory, change detection

Procedia PDF Downloads 595
176 Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Authors: F. Jiménez, R. Jódar, M. Martín, G. Sánchez, G. Sciavicco

Abstract:

Attribute or feature selection is one of the basic strategies to improve the performance of data classification tasks and, at the same time, to reduce the complexity of classifiers; it is particularly fundamental when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed of a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model according to an a posteriori process in a multi-objective context), and testing. We compare the performance of the classifier obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists who collected the data.
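With two objectives to maximize (clustering likelihood and classifier accuracy), candidate feature subsets are compared by Pareto dominance rather than by a single score. The sketch below shows only that comparison and the resulting non-dominated set; it is not ENORA or NSGA-II, and the candidate names and objective values are invented for illustration.

```python
def dominates(a, b):
    """True if objective vector a is at least as good as b everywhere
    and strictly better somewhere (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(o["objectives"], s["objectives"])
                   for o in solutions)
    ]


# Hypothetical candidates: objectives are (log-likelihood, accuracy).
candidates = [
    {"name": "A", "objectives": (-120.0, 0.80)},
    {"name": "B", "objectives": (-100.0, 0.75)},
    {"name": "C", "objectives": (-130.0, 0.70)},
    {"name": "D", "objectives": (-110.0, 0.85)},
]
front = [s["name"] for s in pareto_front(candidates)]
```

The front contains the trade-off solutions among which a decision maker would choose a posteriori, which corresponds to the decision-making stage the abstract mentions.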

Keywords: evolutionary computation, feature selection, classification, clustering

Procedia PDF Downloads 371
175 Implications of Measuring the Progress towards Financial Risk Protection Using Varied Survey Instruments: A Case Study of Ghana

Authors: Jemima C. A. Sumboh

Abstract:

Given the urgency and consensus for countries to move towards Universal Health Coverage (UHC), health financing systems need to be accurately and consistently monitored to provide valuable data to inform policy and practice. Most of the indicators for monitoring UHC, particularly catastrophe and impoverishment, are established based on the impact of out-of-pocket health payments (OOPHP) on households’ living standards, collected through varied household surveys. These surveys, however, vary substantially in survey methods such as the length of the recall period, the number of items included in the survey questionnaire, or the framing of questions, potentially influencing the level of OOPHP. Using different survey instruments can produce inaccurate, inconsistent, erroneous and misleading estimates of UHC, subsequently leading to wrong policy decisions. Using data from a household budget survey conducted by the Navrongo Health Research Center in Ghana from May 2017 to December 2018, this study intends to explore the potential implications of using surveys with varied levels of disaggregation of OOPHP data on estimates of financial risk protection. The household budget survey, structured around food and non-food expenditure, compared three OOPHP measuring instruments: Version I (existing questions used to measure OOPHP in household budget surveys), Version II (new questions developed through benchmarking the existing Classification of Individual Consumption by Purpose (COICOP) OOPHP questions in household surveys) and Version III (existing questions used to measure OOPHP in health surveys integrated into household budget surveys; for this, the demographic and health surveillance (DHS) health survey was used). Versions I, II and III contained 11, 44, and 56 health items, respectively. However, the choice of recall periods was held constant across versions. The sample sizes for Versions I, II and III were 930, 1032 and 1068 households, respectively.
Financial risk protection will be measured based on the catastrophic and impoverishment methodologies using STATA 15 and Adept Software for each version. It is expected that findings from this study will present valuable contributions to the repository of knowledge on standardizing survey instruments to obtain estimates of financial risk protection that are valid and consistent.
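As a rough illustration of the catastrophic-payment methodology mentioned above, the sketch below computes the share of households whose out-of-pocket health payments exceed a fixed fraction of total expenditure. The abstract does not state which threshold or denominator the study uses; the 10% default, the field names, and the sample figures are assumptions for illustration only.

```python
def catastrophic_headcount(households, threshold=0.10):
    """Share of households whose OOP health payments exceed `threshold` of total expenditure.

    `threshold` is an assumed budget-share cut-off, not a value taken from the study.
    """
    if not households:
        raise ValueError("no households supplied")
    flagged = sum(
        1
        for h in households
        if h["total_expenditure"] > 0
        and h["oop_health"] / h["total_expenditure"] > threshold
    )
    return flagged / len(households)


# Hypothetical households; amounts are in arbitrary currency units.
sample = [
    {"oop_health": 50.0, "total_expenditure": 1000.0},   # 5% budget share
    {"oop_health": 300.0, "total_expenditure": 1200.0},  # 25% budget share
    {"oop_health": 0.0, "total_expenditure": 800.0},     # 0% budget share
    {"oop_health": 90.0, "total_expenditure": 600.0},    # 15% budget share
]
rate = catastrophic_headcount(sample)
```

Running the same function on each survey version's data would show how the number of health items alone can shift the estimated headcount.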

Keywords: Ghana, household budget surveys, measuring financial risk protection, out-of-pocket health payments, survey instruments, universal health coverage

Procedia PDF Downloads 137
174 Validation of an Impedance-Based Flow Cytometry Technique for High-Throughput Nanotoxicity Screening

Authors: Melanie Ostermann, Eivind Birkeland, Ying Xue, Alexander Sauter, Mihaela R. Cimpan

Abstract:

Background: New reliable and robust techniques to assess biological effects of nanomaterials (NMs) in vitro are needed to speed up safety analysis and to identify key physicochemical parameters of NMs, which are responsible for their acute cytotoxicity. The central aim of this study was to validate and evaluate the applicability and reliability of an impedance-based flow cytometry (IFC) technique for the high-throughput screening of NMs. Methods: Eight inorganic NMs from the European Commission Joint Research Centre Repository were used: NM-302 and NM-300K (Ag: 200 nm rods and 16.7 nm spheres, respectively), NM-200 and NM-203 (SiO₂: 18.3 nm and 24.7 nm amorphous, respectively), NM-100 and NM-101 (TiO₂: 100 nm and 6 nm anatase, respectively), and NM-110 and NM-111 (ZnO: 147 nm and 141 nm, respectively). The aim was to assess the biological effects of these materials on human monoblastoid (U937) cells. Dispersions of NMs were prepared as described in the NANOGENOTOX dispersion protocol and cells were exposed to NMs at relevant concentrations (2, 10, 20, 50, and 100 µg/mL) for 24 hrs. The change in electrical impedance was measured at 0.5, 2, 6, and 12 MHz using the IFC AmphaZ30 (Amphasys AG, Switzerland). A traditional toxicity assay, Trypan Blue Dye Exclusion assay, and dark-field microscopy were used to validate the IFC method. Results: Spherical Ag particles (NM-300K) showed the highest toxic effect on U937 cells followed by ZnO (NM-111 ≥ NM-110) particles. Silica particles were moderate to non-toxic at all used concentrations under these conditions. A higher toxic effect was seen with smaller sized TiO₂ particles (NM-101) compared to their larger analogues (NM-100). No interferences between the IFC and the used NMs were seen. Uptake and internalization of NMs were observed after 24 hours exposure, confirming actual NM-cell interactions.
Conclusion: Results collected with the IFC demonstrate the applicability of this method for rapid nanotoxicity assessment, which proved to be less prone to nano-related interference issues compared to some traditional toxicity assays. Furthermore, this label-free and novel technique shows good potential for up-scaling in directions of an automated high-throughput screening and for future NM toxicity assessment. This work was supported by the EC FP7 NANoREG (Grant Agreement NMP4-LA-2013-310584), the Research Council of Norway, project NorNANoREG (239199/O70), the EuroNanoMed II 'GEMN' project (246672), and the UH-Nett Vest project.

Keywords: cytotoxicity, high-throughput, impedance, nanomaterials

Procedia PDF Downloads 362
173 Transfer of Constraints or Constraints on Transfer? Syntactic Islands in Danish L2 English

Authors: Anne Mette Nyvad, Ken Ramshøj Christensen

Abstract:

In the syntax literature, it has standardly been assumed that relative clauses and complement wh-clauses are islands for extraction in English, and that constraints on extraction from syntactic islands are universal. However, the Mainland Scandinavian languages have been known to provide counterexamples. Previous research on Danish has shown that neither relative clauses nor embedded questions are strong islands in Danish. Instead, extraction from this type of syntactic environment is degraded due to structural complexity, and it interacts with nonstructural factors such as the frequency of occurrence of the matrix verb, the possibility of temporary misanalysis leading to semantic incongruity, and exposure over time. We argue that these facts can be accounted for with parametric variation in the availability of CP-recursion, resulting in the patterns observed, as Danish would then “suspend” the ban on movement out of relative clauses and embedded questions. Given that Danish does not seem to adhere to allegedly universal syntactic constraints, such as the Complex NP Constraint and the Wh-Island Constraint, what happens in L2 English? We present results from a study investigating how native Danish speakers judge extractions from island structures in L2 English. Our findings suggest that Danes transfer their native language parameter setting when asked to judge island constructions in English. This is compatible with the Full Transfer Full Access Hypothesis, as the latter predicts that Danes would have difficulties resetting their [+/- CP-recursion] parameter in English because they are not exposed to negative evidence.

Keywords: syntax, islands, second language acquisition, danish

Procedia PDF Downloads 127
172 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence

Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park

Abstract:

Scripts are one of the basic text resources for understanding broadcasting contents. Since broadcast media wield considerable influence over the public, tools for understanding broadcasting contents are increasingly required. Topic modeling is a method for obtaining a summary of broadcasting contents from their scripts. Generally, scripts represent contents descriptively with directions and speeches. Scripts also provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. Because scripts consist mainly of speeches, however, relatively few co-occurrences among words are observed in the scene segments. This inevitably degrades the quality of topics learned by statistical methods. To tackle this problem, we propose a method of learning with additional word co-occurrence information obtained using scene similarities. The main idea for improving topic quality is that knowing that two or more texts are topically related can be useful for learning high-quality topics. In addition, high-quality topics give more accurate information about whether two texts are related. In this paper, we regard two scene segments as related if their topical similarity is high enough. We also consider words to co-occur if they appear together in topically related scene segments. In the experiments, we showed that the proposed method generates higher-quality topics from Korean drama scripts than the baselines.

Keywords: broadcasting contents, scripts, text similarity, topic model

Procedia PDF Downloads 318
171 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations, a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because it is more common to find related words close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper we generalise this technique, arriving at a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting of the text of the Bible, split into verses.
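The weighted-distance idea can be sketched in a few lines. The abstract does not give the weighting function, so the inverse-distance weight below, the window semantics, and the function name `weighted_cooccurrence` are assumptions made for illustration rather than the authors' formulation.

```python
from collections import defaultdict


def weighted_cooccurrence(tokens, window=5, weight=lambda d: 1.0 / d):
    """Build a cooccurrence graph as {(a, b): accumulated weight}, with a < b.

    `window` is the number of tokens the window spans; `weight` maps the
    distance between two tokens to a score (inverse distance is an assumption).
    """
    graph = defaultdict(float)
    for i, left in enumerate(tokens):
        # Pair the token at i with each later token that fits in the window.
        for j in range(i + 1, min(i + window, len(tokens))):
            right = tokens[j]
            if left == right:
                continue  # skip self-loops
            edge = tuple(sorted((left, right)))
            graph[edge] += weight(j - i)
    return dict(graph)


# One verse treated as one text unit, tokenized naively on whitespace.
verse = "in the beginning god created the heaven and the earth".split()
graph = weighted_cooccurrence(verse, window=3)
```

Passing `weight=lambda d: 1.0` recovers the plain sliding-window count, which makes it easy to compare the two schemes on the same text.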

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

Procedia PDF Downloads 154
170 Cognitive Semantics Study of Conceptual and Metonymical Expressions in Johnson's Speeches about COVID-19

Authors: Hussain Hameed Mayuuf

Abstract:

The study investigates the conceptual metonymies used in political discourse about COVID-19, analyzing how the conceptual metonymies in Johnson's speeches about coronavirus are constructed. It aims to identify how metonymies are relevant to understanding the messages in Boris Johnson's speeches and to find out how conceptual blending theory can help people understand the messages in political speech about COVID-19. Lastly, it tries to point out the kinds of integration networks that are common in political speech. The study is based on the hypotheses that conceptual blending theory is a powerful tool for investigating the intended messages in Johnson's speech and that there are different processes of blending networks and conceptual mapping that enable listeners to identify the messages in political speech. This study presents a qualitative and quantitative analysis of four speeches about COVID-19 delivered by Boris Johnson. The selected data have been approached from the cognitive-semantic perspective by adopting Conceptual Blending Theory (CBT) as a model for the analysis. It concludes that CBT is applicable to the analysis of metonymies in political discourse. Its mechanisms enable listeners to analyze and understand these speeches. The listener can also identify and understand the hidden messages in Biden and Johnson's discourse about COVID-19 by using different conceptual networks. Finally, it is concluded that double-scope networks are the most common type of blending of metonymies in political speech.

Keywords: cognitive, semantics, conceptual, metonymical, Covid-19

Procedia PDF Downloads 129
169 Machine Learning Model to Predict TB Bacteria-Resistant Drugs from TB Isolates

Authors: Rosa Tsegaye Aga, Xuan Jiang, Pavel Vazquez Faci, Siqing Liu, Simon Rayner, Endalkachew Alemu, Markos Abebe

Abstract:

Tuberculosis (TB) is a major cause of disease globally. In most cases, TB is treatable and curable, but only with the proper treatment. Drug-resistant TB occurs when bacteria become resistant to the drugs that are used to treat TB. Current strategies to identify drug-resistant TB bacteria are laboratory-based, and it takes a long time to identify the drug-resistant bacteria and treat the patient accordingly. Machine learning (ML) and data science, however, can offer new approaches to the problem. In this study, we propose to develop an ML-based model to predict the antibiotic resistance phenotypes of TB isolates in minutes so that the right treatment can be given to the patient immediately. The study uses as training data the whole genome sequences (WGS) of TB isolates extracted from the NCBI repository, which contain samples from different countries. Samples from different countries have been included so that the model generalizes across the large group of TB isolates from different regions of the world. This exposes the model to different behaviors of the TB bacteria during training and makes it more robust. Model training considers three pieces of information extracted from the WGS data: all variants found within the candidate genes (F1), predetermined resistance-associated variants (F2), and resistance-associated gene information for the particular drug. Two major datasets have been constructed using these three pieces of information. F1 and F2 are treated as two independent datasets, and the third piece of information is used as a class to label the two datasets. Five machine learning algorithms have been considered to train the model: Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Gradient Boosting, and AdaBoost.
The models have been trained on the datasets F1, F2, and F1F2, which is the F1 and F2 datasets merged. Additionally, an ensemble approach has been used to train the model. The ensemble approach runs the F1 and F2 datasets through the gradient boosting algorithm, uses the output as a single dataset called the F1F2 ensemble dataset, and trains a model on this dataset with each of the five algorithms. As the experiment shows, the ensemble approach model trained with the Gradient Boosting algorithm outperformed the rest of the models. In conclusion, this study suggests the ensemble approach, that is, the RF + Gradient Boosting model, to predict the antibiotic resistance phenotypes of TB isolates, as it outperformed the rest of the models.

Keywords: machine learning, MTB, WGS, drug resistant TB

Procedia PDF Downloads 52
168 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning applications in computer vision are rapidly advancing, making it possible to monitor the public and quickly identify potentially anomalous behaviour in crowd scenes. The purpose of the current work is therefore to improve the safety of people at crowd events during panic behaviour by introducing the idea of Aggregation of Ensembles (AOE), which makes use of pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. Building on algorithms that apply K-means, KNN, CNN, SVD, Faster-CNN, and YOLOv5, whose architectures learn different levels of semantic representation from crowd videos, the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNNs), allowing for the extraction of enriched feature sets. In addition to the above algorithms, the approach uses a long short-term memory neural network to forecast future feature values and a handcrafted feature that takes into consideration the peculiarities of the crowd to understand human behavior. Experiments are run on well-known datasets of panic situations to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 162
167 English Pashto Contact: Morphological Adaptation of Bilingual Compound Words in Pashto

Authors: Imran Ullah Imran

Abstract:

Language contact is a familiar concept in the present global world. Across the globe, languages get mixed up at different levels. Borrowing and code-switching are some of the means through which languages interact. This study examines Pashto-English contact at word and syllable levels. By recording the speech of 30 Pashto native speakers, selected via 'social network' sampling, the study located a number of Pashto-English compound words, a unique kind of contact. In data analysis, tokens were categorized on the basis of their pattern and morphological structure. The study shows that Pashto-English Bilingual Compound Words (BCWs) are very prevalent in the Pashto language. The study also found that the BCWs in Pashto are completely productive and have their own meanings. It also shows that the dominant pattern of hybrid words in Pashto is the combination of an independent English root word followed by a Pashto inflectional morpheme, which contributes to the core semantic content of the construction. The construction of BCWs shows how close the two languages are to each other. Pashto-English contact results in bilingual compound and hybrid words, which form a considerable number of tokens in present-day spoken Pashto. On the basis of these findings, the study assumes that the same phenomenon may increase with the passage of time, which would, in turn, result in the formation of more bilingual compound or hybrid words.

Keywords: code-mixing, bilingual compound words, pashto-english contact, hybrid words, inflectional lexical morpheme

Procedia PDF Downloads 249
166 The Use of Artificial Intelligence in Digital Forensics and Incident Response in a Constrained Environment

Authors: Dipo Dunsin, Mohamed C. Ghanem, Karim Ouazzane

Abstract:

Digital investigators often have a hard time spotting evidence in digital information. It has become hard to determine which source of proof relates to a specific investigation. A growing concern is that the various processes, technology, and specific procedures used in digital investigation are not keeping up with criminal developments. Therefore, criminals are taking advantage of these weaknesses to commit further crimes. In digital forensics investigations, artificial intelligence is invaluable in identifying crime. It has been observed that an algorithm based on artificial intelligence (AI) is highly effective in detecting risks, preventing criminal activity, and forecasting illegal activity. Providing objective data and conducting an assessment is the goal of digital forensics and digital investigation, which will assist in developing a plausible theory that can be presented as evidence in court. Researchers and other authorities have used the available data as evidence in court to convict a person. This research paper aims at developing a multiagent framework for digital investigations using specific intelligent software agents (ISA). The agents communicate to address particular tasks jointly and keep the same objectives in mind during each task. The rules and knowledge contained within each agent are dependent on the investigation type. A criminal investigation is classified quickly and efficiently using the case-based reasoning (CBR) technique. The MADIK framework is implemented using the Java Agent Development Framework, Eclipse, a Postgres repository, and a rule engine for agent reasoning. The proposed framework was tested using the Lone Wolf image files and datasets. Experiments were conducted using various sets of ISA and VMs. There was a significant reduction in the time taken for the Hash Set Agent to execute.
As a result of loading the agents, 5 percent of the time was lost, as the File Path Agent prescribed deleting 1,510, while the Timeline Agent found multiple executable files. In comparison, the integrity check carried out on the Lone Wolf image file using a digital forensic tool kit took approximately 48 minutes (2,880 s), whereas the MADIK framework accomplished this in 16 minutes (960 s). The framework is integrated with Python, allowing for further integration of other digital forensic tools, such as AccessData Forensic Toolkit (FTK), Wireshark, Volatility, and Scapy.

Keywords: artificial intelligence, computer science, criminal investigation, digital forensics

Procedia PDF Downloads 212
165 Factor Analysis Based on Semantic Differential of the Public Perception of Public Art: A Case Study of the Malaysia National Monument

Authors: Yuhanis Ibrahim, Sung-Pil Lee

Abstract:

This study attempts to identify the factors that should frame the assessment of public art, specifically memorial monuments. Memorial monuments carry a significant and rich message, whether the intention of the art is to mark and commemorate an important event or to inform the younger generation about the past. A public monument should relate to the public and raise awareness about the significant issue. Investigating the impact of existing public memorial art should therefore shed some light for the stakeholders of upcoming public art projects, helping to ensure that a lucid memorial message is delivered to the public directly. The public is the main actor, as the public is the fundamental purpose for which the art was created. Perception is framed as one of the reliable evaluation tools for assessing the impact factors of public art. The Malaysia National Monument was selected as the case study for the investigation. The public’s perceptions were gathered using a questionnaire that involved participants (n = 115) to obtain keywords, and the Semantical Differential Methodology (SDM) was then adopted to evaluate the perceptions of the memorial monument. These perceptions were then measured with Reliability Factor and factorised using Factor Analysis with the Principal Component Analysis (PCA) method to obtain concise factors for the monument assessment. The results revealed that four factors influence the public’s perception of the monument: aesthetic, audience, topology, and public reception. The study concludes by proposing these factors for the assessment of public memorial art in future public memorial projects, especially in Malaysia.

Keywords: factor analysis, public art, public perception, semantical differential methodology

Procedia PDF Downloads 501
164 Text as Reader Device Improving Subjectivity on the Role of Attestation between Interpretative Semiotics and Discursive Linguistics

Authors: Marco Castagna

Abstract:

The proposed paper aims to inquire into the relation between text and reader, focusing on the concept of ‘attestation’. Indeed, despite being widely accepted in semiotic research, even today the concept of text remains uncertainly defined. It seems undeniable that what is called ‘text’ offers an image of internal cohesion and coherence that makes it possible to analyze it as an object. Nevertheless, this same object remains problematic when it is pragmatically activated by the act of reading. In fact, like the T.A.R.D.I.S., the unique space-temporal vehicle used by the well-known BBC character Doctor Who in his adventures, every text appears to its own readers not only “bigger inside than outside”, but also as offering spaces that change according to the different traveller standing in it. In a few words, this singular condition raises questions about the gnosiological relation between text and reader. How can a text be considered the ‘same’, even if it can be read in different ways by different subjects? How can readers be previously provided with the knowledge required for ‘understanding’ a text, while at the same time learning something more from it? In order to explain this singular condition it seems useful to start thinking about text as a device more than an object. In other words, this unique status is more clearly understandable when ‘text’ ceases to be considered as a box designed to move meaning from a sender to a recipient (marking the semiotic priority of the “code”) and starts to be recognized as a performative meaning hypothesis that is discursively configured by one or more forms and empirically perceivable by means of one or more substances. Thus, a text appears as a “semantic hanger”, potentially offered to the “unending deferral of interpretant”, and from time to time fixed as an “instance of Discourse”.
In this perspective, every reading can be considered as an answer to the continuous request for confirming or denying the meaning configuration (the meaning hypothesis) expressed by text. Finally, ‘attestation’ is exactly what regulates this dynamic of request and answer, through which the reader is able to confirm his previous hypothesis on reality or maybe acquire some new ones.

Keywords: attestation, meaning, reader, text

Procedia PDF Downloads 237
163 Perceiving Casual Speech: A Gating Experiment with French Listeners of L2 English

Authors: Naouel Zoghlami

Abstract:

Spoken-word recognition involves the simultaneous activation of potential word candidates which compete with each other for final correct recognition. In continuous speech, the activation-competition process gets more complicated due to speech reductions existing at word boundaries. Lexical processing is more difficult in L2 than in L1 because L2 listeners often lack phonetic, lexico-semantic, syntactic, and prosodic knowledge in the target language. In this study, we investigate the on-line lexical segmentation hypotheses that French listeners of L2 English form and then revise as subsequent perceptual evidence is revealed. Our purpose is to shed further light on the processes of L2 spoken-word recognition in context and better understand L2 listening difficulties through a comparison of skilled and unskilled listeners' reactions at the point where their working hypothesis is rejected. We use a variant of the gating experiment in which subjects transcribe an English sentence presented in increments of progressively greater duration. The spoken sentence was “And this amazing athlete has just broken another world record”, chosen mainly because it included common reductions and phonetic features in English, such as elision and assimilation. Our preliminary results show that there is an important difference in the manner in which proficient and less-proficient L2 listeners handle connected speech. Less-proficient listeners delay recognition of words as they wait for lexical and syntactic evidence to appear in the gates. Further statistical analyses are currently under way.

Keywords: gating paradigm, spoken word recognition, online lexical segmentation, L2 listening

Procedia PDF Downloads 464