Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 538

Search results for: lexical retrieval

178 Written Narrative Texts as the Indicators of Communication Competence of Pupils and Students with Hearing Impairment in the Czech Language

Authors: Marie Komorna, Katerina Hadkova

Abstract:

One reason why hearing disabilities are considered less serious than other disabilities is the belief that deaf and hard of hearing persons can read and write without problems and can therefore fairly easily compensate for problems related to their limited ability to hear sound. In reality, however, this is not the case: especially in written Czech, deaf persons are often unable to communicate their message clearly to its recipients. Their inability to communicate fully in written language is one of the most severe problems facing many deaf persons, and it makes it difficult for them to function in a sound-based environment. Despite this, the issue has received only minimal attention in the Czech Republic. That is why we decided to focus our research on it, specifically targeting the written communication of deaf pupils in primary and secondary schools. The paper summarizes the background and objectives of this research. The written work of deaf respondents was obtained in response to a narrative based on a series of images which depicted a continuous storyline. Based on an analysis of the obtained written work, we tried to describe the specifics of the narrative abilities of the deaf authors of these texts. We also analyzed other aspects and specific traits of text written by deaf authors at the phonetic-phonological, lexical-semantic, morphological, syntactic, and pragmatic levels. The results of the project will make it possible to increase knowledge of the communication abilities of deaf persons in written Czech. The obtained data may be used in future research and for teaching purposes and/or education concepts for teaching Czech to deaf pupils.

Keywords: communication competence, deaf, narrative, written texts

Procedia PDF Downloads 312

177 A Linguistic Product of K-Pop: A Corpus-Based Study on the Korean-Originated Chinese Neologism Simida

Authors: Hui Shi

Abstract:

This article examines the online popularity of the Chinese neologism simida, a loanword derived from the Korean declarative sentence-final suffix seumnida. Using corpus data obtained from Weibo, the Chinese counterpart of Twitter, this study analyzes the morphological and syntactic processes behind simida’s coinage, as well as the causes of its prevalence on Chinese social media. The findings show that simida is used by Weibo bloggers in two ways: (1) as an alternative word for 'Korea' and 'Korean'; (2) as a redundant sentence-final particle which adds a Korean-like speech style to a statement. Additionally, Weibo user profile analysis reveals demographic distribution patterns for this neologism and highlights young Weibo users in third-tier cities as the leading adopters of simida. These results are accounted for under the theoretical framework of social indexicality, especially how variations generate style in the indexical field. This article argues that the creation of such an ethnically targeted neologism is a linguistic demonstration of Chinese netizens’ two-sided attitudes toward the previously heated Korean wave. The exotic suffix seumnida is borrowed into Chinese as simida because of its high frequency in Korean cultural exports. It therefore gradually becomes a replacement for Korea-related lexical items due to markedness, regardless of semantic prosody. Its innovative implantation into Chinese syntax, on the other hand, reflects Chinese netizens’ active manipulation of language for their online identity building. This study has implications for research on the linguistic construction of identity and style and lays the groundwork for the study of linguistic creativity in Chinese new media.

Keywords: Chinese neologism, loanword, humor, new media

Procedia PDF Downloads 154

176 Grounding Chinese Language Vocabulary Teaching and Assessment in the Working Memory Research

Authors: Chan Kwong Tung

Abstract:

Since Baddeley and Hitch’s seminal research in 1974 on working memory (WM), this topic has been of great interest to language educators. Although there are some variations in the definitions of WM, recent findings in WM have contributed vastly to our understanding of language learning, especially its effects on second language acquisition (SLA). For example, the phonological component of WM (PWM) and the executive component of WM (EWM) have been found to be positively correlated with language learning. This paper discusses two general, yet highly relevant WM findings that could directly affect the effectiveness of Chinese Language (CL) vocabulary teaching and learning, as well as the quality of its assessment. First, PWM is found to be critical for the long-term learning of phonological forms of new words. Second, EWM is heavily involved in interpreting the semantic characteristics of new words, which consequently affects the quality of learners’ reading comprehension. These two ideas are hardly discussed in the Chinese literature, both conceptual and empirical. While past vocabulary acquisition studies have mainly focused on the cognitive-processing approach, active processing, ‘elaborate processing’ (or lexical elaboration) and other effective learning tasks and strategies, it is high time to balance the spotlight to the WM (particularly PWM and EWM) to ensure an optimum control on the teaching and learning effectiveness of such approaches, as well as the validity of this language assessment. Given the unique phonological, orthographical and morphological properties of the CL, this discussion will shed some light on the vocabulary acquisition of this Sino-Tibetan language family member. Together, these two WM concepts could have crucial implications for the design, development, and planning of vocabularies and ultimately reading comprehension teaching and assessment in language education. 
Hopefully, this will raise awareness and trigger a dialogue about the meaning of these findings for future language teaching, learning, and assessment.

Keywords: Chinese Language, working memory, vocabulary assessment, vocabulary teaching

Procedia PDF Downloads 314

175 The Priming Effect of Morphology, Phonology, Semantics, and Orthography in Mandarin Chinese: A Prime Paradigm Study

Authors: Bingqing Xu, Wenxing Shuai

Abstract:

This study investigates the priming effects of different Chinese compound words on native Mandarin speakers. Chinese has many homonyms, polysemous words, and synonyms. However, it is unclear which kind of word produces the largest priming effect. Native Mandarin speakers were tested in a visual-word lexical decision experiment. The stimuli, all two-character compound words, consisted of primes and targets. Five types of prime-target relationship were used: a morphologically related condition, in which the prime and the target contain the same morpheme; an orthographically related condition, in which the target and the prime contain different morphemes with the same form; a phonologically related condition, in which the target and the prime contain different morphemes with the same phonology; a semantically related condition, in which the target and the prime contain different morphemes with similar meanings; and a totally unrelated condition. The time from when participants saw the target until they responded was recorded. Analyses of reaction time showed that the average reaction time for morphologically related targets was much shorter than for the others, suggesting that the morphological priming effect is the largest. The reaction time in the phonologically related condition, however, was the longest, even longer than in the unrelated condition. According to scatter plot analyses, 86.7% of participants showed priming effects in the morphologically related condition, whereas only 20% of participants showed priming effects in the phonologically related condition. These results suggest that the morphologically related condition had the largest priming effect. The orthographically and semantically related conditions also produced priming effects, whereas the phonologically related condition produced little priming.

Keywords: priming effect, morphology, phonology, semantics, orthography

Procedia PDF Downloads 119

174 Extraction of Compound Words in Malay Sentences Using Linguistic and Statistical Approaches

Authors: Zamri Abu Bakar Zamri, Normaly Kamal Ismail Normaly, Mohd Izani Mohamed Rawi Izani

Abstract:

Malay noun compounds are phrases that consist of two or more nouns. Their key characteristic is that they occur frequently within text. Extracting these noun compounds is therefore essential for several domains of research such as Information Retrieval, Sentiment Analysis and Question Answering. Many research efforts have proposed extracting Malay noun compounds using linguistic and statistical approaches. Most of the existing methods have concentrated on the extraction of bi-gram noun+noun compounds. However, extracting noun+verb, noun+adjective and noun+prepositional compounds is challenging because it is difficult to select an appropriate method that gives effective results. Thus, there is still room for improvement in the effectiveness of compound word extraction. This study therefore proposes a combination of a linguistic approach and statistical measures in order to enhance the extraction of compound words. Several preprocessing steps are involved, including normalization, tokenization, and stemming. The linguistic approach used in this study is Part-of-Speech (POS) tagging. In addition, a new linguistic pattern for named entities, based on a list of Malay named entities, is used to enhance the linguistic approach to noun compound recognition. The proposed statistical measures consist of NC-value, NTC-value and NLC-value.
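
The combination of a POS-pattern filter with a statistical score can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the toy corpus and its tags are hand-written assumptions, and pointwise mutual information (PMI) is swapped in as a generic association measure because the abstract does not define NC-value, NTC-value or NLC-value.

```python
import math
from collections import Counter

# Hypothetical, hand-tagged toy corpus of (token, POS) pairs. A real pipeline
# would obtain tags from a POS tagger after normalization, tokenization and
# stemming; these sentences and tags are illustrative assumptions only.
TAGGED = [
    [("kereta", "NOUN"), ("api", "NOUN"), ("tiba", "VERB"), ("lewat", "ADV")],
    [("dia", "PRON"), ("naik", "VERB"), ("kereta", "NOUN"), ("api", "NOUN")],
    [("rumah", "NOUN"), ("sakit", "ADJ"), ("itu", "DET"), ("besar", "ADJ")],
    [("kereta", "NOUN"), ("itu", "DET"), ("besar", "ADJ")],
]

# Linguistic filter: POS patterns that a candidate bigram must match.
PATTERNS = {("NOUN", "NOUN"), ("NOUN", "ADJ"), ("NOUN", "VERB")}


def candidate_bigrams(sentences, patterns=PATTERNS):
    """Count all tokens and those adjacent pairs whose tags match a pattern."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        unigrams.update(tok for tok, _ in sent)
        for (w1, t1), (w2, t2) in zip(sent, sent[1:]):
            if (t1, t2) in patterns:
                bigrams[(w1, w2)] += 1
    return unigrams, bigrams


def pmi_scores(unigrams, bigrams):
    """Statistical filter: PMI = log2(c(w1,w2) * N / (c(w1) * c(w2)))."""
    n_tokens = sum(unigrams.values())
    return {
        pair: math.log2(count * n_tokens / (unigrams[pair[0]] * unigrams[pair[1]]))
        for pair, count in bigrams.items()
    }


unigrams, bigrams = candidate_bigrams(TAGGED)
scores = pmi_scores(unigrams, bigrams)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(pair), round(score, 2))
```

On a corpus this small PMI favours rare pairs, which is one reason termhood-style measures that also weigh frequency and context are commonly preferred in practice.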

Keywords: Compound Word, Noun Compound, Linguistic Approach, Statistical Approach

Procedia PDF Downloads 320

173 Augmented Reality for Maintenance Operator for Problem Inspections

Authors: Chong-Yang Qiao, Teeravarunyou Sakol

Abstract:

Current production-oriented factories need maintenance operators to work in shifts, monitoring and inspecting complex systems and different equipment when a mechanical breakdown occurs. Augmented reality (AR) is an emerging technology that embeds data into the environment for situation awareness to help maintenance operators make decisions and solve problems. An application was designed to identify problems in steam generators and to inspect centrifugal pumps. The objective of this research was to find the best AR medium and the best type of problem-solving strategy among analogy, the focal object method and means-ends analysis. The two leakage inspection scenarios were temperature and vibration. Two experiments were used for usability evaluation and future innovation, covering the decision-making process and the problem-solving strategy. This study found that maintenance operators prefer a built-in magnifier to zoom in on components (55.6%), a 3D exploded view to track the problem parts (50%), and a line chart to find altered data or information (61.1%). There is a significant difference in the use of the analogy (44.4%), focal object (38.9%) and means-ends (16.7%) strategies. The marked differences between maintainers and operators lie in the application of a problem-solving strategy. Future work should explore multimedia information retrieval that supports maintenance operators in decision-making.

Keywords: augmented reality, situation awareness, decision-making, problem-solving

Procedia PDF Downloads 197

172 Post Pandemic Mobility Analysis through Indexing and Sharding in MongoDB: Performance Optimization and Insights

Authors: Karan Vishavjit, Aakash Lakra, Shafaq Khan

Abstract:

The COVID-19 pandemic has pushed healthcare professionals to use big data analytics as a vital tool for tracking and evaluating the effects of contagious viruses. To analyze huge datasets effectively, efficient NoSQL databases are needed. By integrating several datasets, this research makes it possible to analyze post-COVID-19 health and well-being outcomes and to evaluate the effectiveness of government efforts during the pandemic, while cutting down on query processing time and creating predictive visual artifacts. We recommend applying sharding and indexing to improve query efficiency and scalability as the dataset expands. Effective data retrieval and analysis are made possible by spreading the datasets across a sharded database and indexing the individual shards. The key goal is to analyze the connections between governmental activities, poverty levels, and post-pandemic well-being. We want to evaluate the effectiveness of governmental initiatives to improve health and lower poverty levels, and we do so by using advanced data analysis and visualisations. The findings provide relevant data that supports the advancement of the UN sustainable development objectives, future pandemic preparation, and evidence-based decision-making. This study shows how Big Data and NoSQL databases may be used to address problems in global health.
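
The idea of spreading documents across shards and indexing each shard separately can be sketched in plain Python. This is a conceptual illustration only, not the MongoDB API and not the authors' setup: the class `ShardedStore`, the field names `country` and `year`, and the sample documents are all hypothetical.

```python
import hashlib
from collections import defaultdict


class ShardedStore:
    """Toy hash-sharded document store with one secondary index per shard."""

    def __init__(self, n_shards, shard_key, index_field):
        self.n_shards = n_shards
        self.shard_key = shard_key
        self.index_field = index_field
        self.shards = [[] for _ in range(n_shards)]
        # Per-shard index: indexed field value -> positions inside that shard.
        self.indexes = [defaultdict(list) for _ in range(n_shards)]

    def _shard_for(self, key_value):
        # Use a stable hash so routing does not change between runs.
        digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.n_shards

    def insert(self, doc):
        s = self._shard_for(doc[self.shard_key])
        self.shards[s].append(doc)
        self.indexes[s][doc[self.index_field]].append(len(self.shards[s]) - 1)

    def find_by_key(self, key_value):
        """Targeted query: the shard key is known, so one shard is scanned."""
        s = self._shard_for(key_value)
        return [d for d in self.shards[s] if d[self.shard_key] == key_value]

    def find_by_indexed_field(self, value):
        """Scatter-gather query: every shard answers from its own index."""
        hits = []
        for shard, index in zip(self.shards, self.indexes):
            hits.extend(shard[pos] for pos in index.get(value, []))
        return hits


store = ShardedStore(n_shards=3, shard_key="country", index_field="year")
for country in ("CA", "IN", "BR", "DE"):
    for year in (2020, 2021, 2022):
        store.insert({"country": country, "year": year})

print(len(store.find_by_key("CA")))            # documents for one country
print(len(store.find_by_indexed_field(2021)))  # documents for one year
```

The two query methods show the trade-off the abstract alludes to: queries that include the shard key touch a single shard, while other queries fan out to all shards and rely on the per-shard indexes to stay fast.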

Keywords: big data, COVID-19, health, indexing, NoSQL, sharding, scalability, well being

Procedia PDF Downloads 42

171 Visual Template Detection and Compositional Automatic Regular Expression Generation for Business Invoice Extraction

Authors: Anthony Proschka, Deepak Mishra, Merlyn Ramanan, Zurab Baratashvili

Abstract:

Small and medium-sized businesses receive over 160 billion invoices every year. Since these documents exhibit many subtle differences in layout and text, extracting structured fields such as sender name, amount, and VAT rate from them automatically is an open research question. In this paper, existing work in template-based document extraction is extended, and a system is devised that is able to reliably extract all required fields for up to 70% of all documents in the data set, more than any other previously reported method. The approaches are described for 1) detecting through visual features which template a given document belongs to, 2) automatically generating extraction rules for a given new template by composing regular expressions from multiple components, and 3) computing confidence scores that indicate the accuracy of the automatic extractions. The system can generate templates with as little as one training sample and only requires the ground truth field values instead of detailed annotations such as bounding boxes that are hard to obtain. The system is deployed and used inside a commercial accounting software.
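
Composing an extraction rule from reusable regular-expression components, given only a ground-truth field value from one training sample, might look roughly like the sketch below. The component patterns, the helper names `compose_rule` and `learn_rule`, and the sample invoices are assumptions made for illustration; the abstract does not describe the system's actual components.

```python
import re

# Hypothetical reusable value patterns, one per field type.
COMPONENTS = {
    "amount": r"(?P<amount>\d{1,3}(?:[.,]\d{3})*[.,]\d{2})",
    "vat_rate": r"(?P<vat_rate>\d{1,2}(?:[.,]\d)?)\s?%",
}


def compose_rule(label, field, components=COMPONENTS):
    """Rule = literal label + optional colon/whitespace + value component."""
    return re.compile(re.escape(label) + r"\s*:?\s*" + components[field])


def learn_rule(text, field, true_value, components=COMPONENTS):
    """Derive a rule from one sample, using only the ground-truth value.

    The text preceding the value on its line is taken as the label anchor.
    """
    for line in text.splitlines():
        pos = line.find(true_value)
        if pos > 0:
            label = line[:pos].strip().rstrip(":").strip()
            if label:
                return compose_rule(label, field, components)
    return None


training_invoice = "Invoice 17\nTotal amount: 1.234,56 EUR\nVAT: 19 %"
amount_rule = learn_rule(training_invoice, "amount", "1.234,56")
vat_rule = learn_rule(training_invoice, "vat_rate", "19")

# Apply the learned rules to a new document that follows the same template.
new_invoice = "Invoice 18\nTotal amount: 89,90 EUR\nVAT: 7 %"
print(amount_rule.search(new_invoice).group("amount"))
print(vat_rule.search(new_invoice).group("vat_rate"))
```

Because the rule is anchored on a label rather than on page coordinates, no bounding-box annotation is needed for training, which matches the constraint described in the abstract.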

Keywords: data mining, information retrieval, business, feature extraction, layout, business data processing, document handling, end-user trained information extraction, document archiving, scanned business documents, automated document processing, F1-measure, commercial accounting software

Procedia PDF Downloads 100

170 Fuzzy Inference-Assisted Saliency-Aware Convolution Neural Networks for Multi-View Summarization

Authors: Tanveer Hussain, Khan Muhammad, Amin Ullah, Mi Young Lee, Sung Wook Baik

Abstract:

The Big Data generated by distributed vision sensors installed on a large scale in smart cities creates hurdles for efficient and beneficial exploration through browsing, retrieval, and indexing. This paper presents a three-fold framework for effective video summarization of such data that provides a compact and representative format of Big Video Data. In the first fold, the framework acquires input video data from the installed cameras and collects clues such as the type and count of objects and the clarity of the view from a chunk of a pre-defined number of frames of each view. In the second fold, the representative view for a particular interval is selected by a fuzzy inference system that uses these clues to reach a precise, human-like decision. In the third fold, the selected view frames are forwarded to the summary generation mechanism, which is supported by a saliency-aware convolutional neural network (CNN) model. The new trend of fuzzy rules for view selection followed by a CNN architecture for saliency computation makes the multi-view video summarization (MVS) framework a suitable candidate for real-world practice in smart cities.

Keywords: big video data analysis, fuzzy logic, multi-view video summarization, saliency detection

Procedia PDF Downloads 163

169 The Phenomena of False Cognates and Deceptive Cognates: Issues to Foreign Language Learning and Teaching Methodology Based on Set Theory

Authors: Marilei Amadeu Sabino

Abstract:

The aim of this study is to establish differences between the terms ‘false cognates’, ‘false friends’ and ‘deceptive cognates’, usually considered to be synonyms. It will be shown they are not synonyms, since they do not designate the same linguistic process or phenomenon. Despite their differences in meaning, many pairs of formally similar words in two (or more) different languages are true cognates, although they are usually known as ‘false’ cognates – such as, for instance, the English and Italian lexical items ‘assist x assistere’; ‘attend x attendere’; ‘argument x argomento’; ‘apology x apologia’; ‘camera x camera’; ‘cucumber x cocomero’; ‘fabric x fabbrica’; ‘factory x fattoria’; ‘firm x firma’; ‘journal x giornale’; ‘library x libreria’; ‘magazine x magazzino’; ‘parent x parente’; ‘preservative x preservativo’; ‘pretend x pretendere’; ‘vacancy x vacanza’, to name but a few examples. Thus, one of the theoretical objectives of this paper is firstly to elaborate definitions establishing a distinction between the words that are definitely ‘false cognates’ (derived from different etyma) and those that are just ‘deceptive cognates’ (derived from the same etymon). Secondly, based on Set Theory and on the concepts of equal sets, subsets, intersection of sets and disjoint sets, this study is intended to elaborate some theoretical and practical questions that will be useful in identifying more precisely similarities and differences between cognate words of different languages, and according to graphic interpretation of sets it will be possible to classify them and provide discernment about the processes of semantic changes. Therefore, these issues might be helpful not only to the Learning of Second and Foreign Languages, but they could also give insights into Foreign and Second Language Teaching Methodology. Acknowledgements: FAPESP – São Paulo State Research Support Foundation – the financial support offered (proc. n° 2017/02064-7).
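
The four set relations named in the abstract (equal sets, subsets, intersection, disjoint sets) can be applied to word meanings by treating each word as a set of senses. The sketch below is only an illustration of that idea: the sense labels are simplified, hypothetical glosses, not lexicographic data from the study.

```python
def set_relation(senses_a, senses_b):
    """Classify how the sense sets of two formally similar words relate."""
    a, b = set(senses_a), set(senses_b)
    if a == b:
        return "equal sets"
    if a < b or b < a:
        return "subset"
    if a & b:
        return "intersection"
    return "disjoint sets"


# Hypothetical, heavily simplified sense inventories (English word first,
# Italian word second); real entries would come from dictionaries.
PAIRS = {
    ("library", "libreria"): ({"book-lending place"}, {"bookshop", "bookcase"}),
    ("argument", "argomento"): ({"dispute", "line of reasoning"},
                                {"topic", "line of reasoning"}),
    ("journal", "giornale"): ({"periodical"}, {"periodical", "newspaper"}),
}

for (english, italian), (senses_en, senses_it) in PAIRS.items():
    print(english, "x", italian, "->", set_relation(senses_en, senses_it))
```

Under this reading, disjoint sets correspond to pairs whose meanings never coincide, while subsets and intersections capture pairs that are interchangeable only in some contexts.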

Keywords: deceptive cognates, false cognates, foreign language learning, teaching methodology

Procedia PDF Downloads 309

168 Translating the Gendered Discourse: A Corpus-Based Study of the Chinese Science Fiction The Three Body Problem

Authors: Yi Gu

Abstract:

The Three-Body Problem by Cixin Liu has been a bestselling Chinese sci-fi novel for years since 2008. The book was translated into English by Ken Liu in 2014 and won the prestigious 2015 Hugo Award for science fiction and fantasy writing, drawing greater attention from wider international communities. The story exposes the horrors of the Chinese Cultural Revolution in the 1960s in a narrative that intrigues readers at home and abroad. However, without access to the source text, western readers may not be aware that the original Chinese version of the book is rich in gender bias. Some Chinese scholars have previously applied feminist translation theories to their analysis of this book, but on the basis of isolated, cherry-picked examples. This paper therefore aims to obtain a more thorough picture of how translators can cope with gender discrimination and reshape the gendered discourse of the source text, by systematically investigating the lexical and syntactic patterns in the translation of Liu’s entire book of 400 pages. The source text and the translation were downloaded as digital files, automatically aligned at paragraph level and then manually post-edited. They were then compiled into a parallel corpus of 114,629 English words and 204,145 Chinese characters using Sketch Engine. Gender-discrimination markers, such as the overuse of ‘girl’ to describe an adult woman, were searched for in the source text, and the alignment made it possible to identify the strategies adopted by the translator to mitigate gender discrimination. The results provide a framework for translators to address gender bias. The study also shows how corpus methods can be used to further research in feminist translation and critical discourse analysis.

Keywords: corpus, discourse analysis, feminist translation, science fiction translation

Procedia PDF Downloads 233

167 Design of a Real Time Closed Loop Simulation Test Bed on a General Purpose Operating System: Practical Approaches

Authors: Pratibha Srivastava, Chithra V. J., Sudhakar S., Nitin K. D.

Abstract:

A closed-loop system comprises a controller, a response system, and an actuating system. The controller, which is the system under test for us, periodically excites the actuators based on feedback from the sensors. The sensors should provide the feedback to the System Under Test (SUT) within a deterministic time after excitation of the actuators. Any delay or miss in the generation of the response or the acquisition of excitation pulses may lead to computation errors in the control loop controller, which can be catastrophic in certain cases. Such systems are categorised as hard real-time systems and need special strategies. The real-time operating systems available on the market may be the best solutions for this kind of simulation, but they pose limitations in the availability of the X Window System, graphical interfaces, and other user tools. In this paper, we present strategies that can be used on a general purpose operating system (bare Linux kernel) to achieve a deterministic deadline and hence retain the advantages of a GPOS while adding real-time features. We discuss techniques for running the time-critical application at the highest priority without interruption, reducing network latency in a distributed architecture, and handling real-time data acquisition, data storage and retrieval, user interactions, etc.
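
One common way to run a time-critical process at high priority on Linux is to request a real-time scheduling policy. The sketch below is a generic illustration under that assumption; the abstract does not say which mechanisms the authors use, and the helper name `request_realtime_priority` and the priority value 80 are hypothetical. The call normally needs root privileges or the CAP_SYS_NICE capability, so the function reports failure instead of raising.

```python
import os


def request_realtime_priority(priority=80):
    """Try to move the calling process to the SCHED_FIFO real-time policy.

    Returns True on success and False when the platform lacks the call or
    the process is not permitted to change its scheduling policy.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # not available on this platform (for example, non-Linux)
    try:
        # pid 0 means "the calling process".
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except OSError:  # includes PermissionError
        return False


if request_realtime_priority():
    print("running under SCHED_FIFO")
else:
    print("real-time policy not granted; continuing with default scheduling")
```

A SCHED_FIFO task is not preempted by normally scheduled tasks, which is why such a policy is usually paired with CPU pinning and memory locking in latency-sensitive setups.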

Keywords: real time data acquisition, real time kernel preemption, scheduling, network latency

Procedia PDF Downloads 110

166 A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment

Authors: Isaac K. E. Ampomah, Seong-Bae Park, Sang-Jo Lee

Abstract:

Over the past decade, there have been promising developments in Natural Language Processing (NLP), with several investigations of approaches focusing on Recognizing Textual Entailment (RTE). These approaches include models based on lexical similarities, models based on formal reasoning, and most recently deep neural models. In this paper, we present a sentence encoding model that exploits sentence-to-sentence relation information for RTE. In terms of sentence modeling, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) adopt different approaches. RNNs are known to be well suited for sequence modeling, whilst CNNs are suited for the extraction of n-gram features through their filters and can learn ranges of relations via the pooling mechanism. We combine these strengths of RNNs and CNNs to present a unified model for the RTE task. Our model combines relation vectors computed from the phrasal representation of each sentence and from the final encoded sentence representations. Firstly, we pass each sentence through a convolutional layer to extract a sequence of higher-level phrase representations, from which the first relation vector is computed. Secondly, the phrasal representation of each sentence from the convolutional layer is fed into a Bidirectional Long Short-Term Memory (Bi-LSTM) network to obtain the final sentence representations, from which a second relation vector is computed. The relation vectors are combined and then used, in the same fashion as an attention mechanism, over the Bi-LSTM outputs to yield the final sentence representations for classification. Experiments on the Stanford Natural Language Inference (SNLI) corpus suggest that this is a promising technique for RTE.

Keywords: deep neural models, natural language inference, recognizing textual entailment (RTE), sentence-to-sentence relation

Procedia PDF Downloads 325

165 TACTICAL: Ram Image Retrieval in Linux Using Protected Mode Architecture’s Paging Technique

Authors: Sedat Aktas, Egemen Ulusoy, Remzi Yildirim

Abstract:

This article explains how to obtain a RAM image from a computer running a Linux operating system and which steps should be followed while doing so. By taking a RAM image we mean dumping the physical memory at a given instant and writing it to a file. The process can be likened to taking a picture of everything in the computer’s memory at that moment. It is very important for tools that analyze RAM images, such as Volatility, because an image must be taken before these tools can analyze it. These tools are used extensively in the forensic world. Forensics, in turn, is a set of processes for digitally examining the information on any computer or server on behalf of official authorities. In this article, the protected mode architecture in the Linux operating system is examined, and the steps for saving an image of the kernel driver and system memory to disk are followed. The tables and access methods to be used are examined on the basis of the basic architecture of the operating system, and the most appropriate methods and ways of applying them are presented in the article. Since no article in the literature directly addresses this topic on Linux, this study aims to contribute to the literature on obtaining RAM images. LIME can be mentioned as a similar tool, but there is no explanation of its memory dumping method. Given how frequently these tools are used and the intense research on RAM images in the field of forensics, contributing to that field has been the main motivation of the study.
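
To make the paging step concrete, the sketch below shows how a 32-bit linear address is split under classic two-level x86 protected-mode paging with 4 KiB pages: the top 10 bits index the page directory, the next 10 bits index the page table, and the low 12 bits are the offset within the page. This layout is an assumption for illustration, since the abstract does not state which paging mode is examined (64-bit Linux uses more levels), and the dictionary-based tables in `translate` are a hypothetical stand-in for the real in-memory structures.

```python
PAGE_SIZE = 4096  # 4 KiB pages


def split_linear_address(addr):
    """Return (directory index, table index, page offset) for a 32-bit address."""
    if not 0 <= addr < 2**32:
        raise ValueError("expected a 32-bit linear address")
    directory_index = (addr >> 22) & 0x3FF  # bits 31..22
    table_index = (addr >> 12) & 0x3FF      # bits 21..12
    offset = addr & 0xFFF                   # bits 11..0
    return directory_index, table_index, offset


def translate(addr, page_directory):
    """Walk a toy page directory {dir_idx: {tbl_idx: frame_number}}.

    Returns the physical address, or None if the page is not mapped.
    """
    directory_index, table_index, offset = split_linear_address(addr)
    page_table = page_directory.get(directory_index)
    if page_table is None:
        return None
    frame_number = page_table.get(table_index)
    if frame_number is None:
        return None
    return frame_number * PAGE_SIZE + offset


toy_directory = {768: {257: 0x101}}
print(split_linear_address(0xC0101234))
print(hex(translate(0xC0101234, toy_directory)))
```

A memory-dumping tool that works page by page performs this kind of walk for every mapped page and writes each resolved frame to the output file.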

Keywords: linux, paging, addressing, ram-image, memory dumping, kernel modules, forensic

Procedia PDF Downloads 80

164 Information Retrieval from Internet Using Hand Gestures

Authors: Aniket S. Joshi, Aditya R. Mane, Arjun Tukaram

Abstract:

In the 21st century, in the era of the e-world, people are continuously updated with daily information such as weather conditions, news, stock exchange market updates, new projects, cricket updates, sports and other such applications. When busy, they want this information with little use of the keyboard and in little time. Today, in order to get such information, users have to repeat the same mouse and keyboard actions, which costs time and causes inconvenience. In India, owing to their rural background, many people are also not very familiar with the use of computers and the internet. In small clinics, small offices, hotels and airports, too, there should be a system which retrieves daily information with minimal use of keyboard and mouse actions. We plan to design an application-based project that can easily retrieve information with minimal use of keyboard and mouse actions and make the task more convenient and easier. This is possible with an image processing application that captures real-time hand gestures, matches them in the system and retrieves the information. Once a function has been selected with a hand gesture, the system reports the action information to the user. In this project we use real-time hand gesture movements to select the required option, which is stored on the screen in the form of RSS feeds. The gesture selects the required option, and the corresponding information pops up for the user. Real-time hand gestures make the application handier and easier to use.

Keywords: hand detection, hand tracking, hand gesture recognition, HSV color model, Blob detection

Procedia PDF Downloads 260

163 PaSA: A Dataset for Patent Sentiment Analysis to Highlight Patent Paragraphs

Authors: Renukswamy Chikkamath, Vishvapalsinhji Ramsinh Parmar, Christoph Hewel, Markus Endres

Abstract:

Given a patent document, identifying distinct semantic annotations is an interesting research aspect. Text annotation helps patent practitioners such as examiners and patent attorneys to quickly identify the key arguments of any invention, in turn enabling timely marking of a patent text. In manual patent analysis, it is common practice to mark paragraphs in order to recognise their semantic information and attain better readability. This semantic annotation process is laborious and time-consuming. To alleviate this problem, we propose a dataset for training machine learning algorithms to automate the highlighting process. The contributions of this work are: i) a multi-class dataset of 150k samples developed by traversing USPTO patents over a decade, ii) statistics and distributions of the data articulated using exploratory data analysis, iii) baseline Machine Learning models that use the dataset to address the patent paragraph highlighting task, and iv) a future path for extending this work using Deep Learning and domain-specific pre-trained language models to develop a highlighting tool. This work assists patent practitioners in highlighting semantic information automatically and aids in creating a sustainable and efficient patent analysis using machine learning.

Keywords: machine learning, patents, patent sentiment analysis, patent information retrieval

Procedia PDF Downloads 65

162 Finding the Longest Common Subsequence in Normal DNA and Disease Affected Human DNA Using Self Organizing Map

Authors: G. Tamilpavai, C. Vishnuppriya

Abstract:

Bioinformatics is an active research area which combines biology and computer science research. Finding the longest common subsequence (LCSS) is one of the major challenges in various bioinformatics applications. The computation of the LCSS plays a vital role in biomedicine and is also an essential task in DNA sequence analysis in genetics. It is part of a wide range of disease diagnosis steps. The objective of the proposed system is to find the longest common subsequence that is present in normal and various disease-affected human DNA sequences using a Self Organizing Map (SOM) and LCSS. The human DNA sequences are collected from the National Center for Biotechnology Information (NCBI) database. Initially, each human DNA sequence is separated into k-mers using a k-mer separation rule. Mean and median values are calculated from each separated k-mer. These calculated values are fed as input to the Self Organizing Map for clustering. The obtained clusters are then given to the Longest Common Subsequence (LCSS) algorithm to find the common subsequences present in every cluster. It returns n(n-1)/2 subsequences for each cluster, where n is the number of k-mers in that cluster. Experimental outcomes of the proposed system give the possible longest common subsequences of normal and disease-affected DNA data. The proposed system will thus be a good initial aid for finding disease-causing sequences. Finally, performance analysis is carried out for different DNA sequences. The obtained values show that the retrieval of the LCSS takes less time than in the existing system.
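
The k-mer splitting and pairwise LCSS steps can be sketched as follows. This is a minimal illustration under stated assumptions: k-mers are taken with an overlapping sliding window (the abstract does not specify its separation rule), the LCSS is computed with the standard dynamic-programming algorithm, and the SOM clustering step is omitted, so the "cluster" here is simply a list of k-mers. The function names are hypothetical.

```python
from itertools import combinations


def kmers(sequence, k):
    """Split a sequence into overlapping k-mers (sliding window of width k)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def lcs(a, b):
    """Longest common subsequence of two strings via dynamic programming."""
    m, n = len(a), len(b)
    # dp[i][j] = length of the LCS of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Backtrack from the bottom-right corner to recover one LCS.
    out, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def pairwise_lcs(cluster):
    """One LCS per unordered pair: n * (n - 1) / 2 results for n k-mers."""
    return [(x, y, lcs(x, y)) for x, y in combinations(cluster, 2)]


cluster = kmers("ACGTAC", 4)
results = pairwise_lcs(cluster)
print(cluster)
print(len(results))
print(lcs("AGGTAB", "GXTXAYB"))
```

The dynamic-programming table costs time and memory proportional to the product of the two sequence lengths, so comparing short k-mers within clusters rather than whole sequences keeps each comparison cheap.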

Keywords: clustering, k-mers, longest common subsequence, SOM

Procedia PDF Downloads 233

161 Navigating Government Finance Statistics: Effortless Retrieval and Comparative Analysis through Data Science and Machine Learning

Authors: Kwaku Damoah

Abstract:

This paper presents a methodology and software application (App) designed to empower users in accessing, retrieving, and comparatively exploring data within the hierarchical network framework of the Government Finance Statistics (GFS) system. It explores the ease of navigating the GFS system and identifies the gaps filled by the new methodology and App. The GFS embodies a complex Hierarchical Network Classification (HNC) structure, encapsulating institutional units, revenues, expenses, assets, liabilities, and economic activities. Navigating this structure demands specialized knowledge, experience, and skill, posing a significant challenge for effective analytics and fiscal policy decision-making. Many professionals encounter difficulties deciphering these classifications, hindering confident utilization of the system. This accessibility barrier prevents a vast number of professionals, students, policymakers, and members of the public from leveraging the abundant data and information within the GFS. An efficient methodology enabling users to access, navigate, and conduct exploratory comparisons was developed using the R programming language, data science analytics and machine learning. The machine learning Fiscal Analytics App (FLOWZZ) democratizes access to advanced analytics through its user-friendly interface, breaking down expertise barriers.

Keywords: data science, data wrangling, drilldown analytics, government finance statistics, hierarchical network classification, machine learning, web application

Procedia PDF Downloads 34

160 Scholastic Ability and Achievement as Predictors of College Performance among Selected Second Year College Students at University of Perpetual Help System DALTA, Calamba

Authors: Shielilo R. Amihan, Ederliza De Jesus

Abstract:

The study determined the predictors of college performance of second-year students of UPHSD-Calamba. This quantitative study conducted a survey using the Scholastic Abilities Test for Adults (SATA) and retrieved the entrance examination results and current General Weighted Average (GWA) of the 242 randomly selected respondents. The mean, Pearson r and multiple regression analyses through SPSS revealed that students are capable of verbal, non-verbal and quantitative reasoning, reading vocabulary, comprehension, math calculation, and writing mechanics but have difficulty in math application and writing composition. The study found that scholastic ability and achievement, except in mathematics, are significantly related to college performance. It concludes that students with high ability and achievement may perform better in college. However, only the English subset results in the entrance exam predict the academic success of students in college, while SATA and Math entrance exam results do not. The study recommends providing pre-college Math and Writing courses as requisites in college. It also suggests implementing formative curriculum-based enhancement programs on specific priority areas, profiling programs towards informed individual academic decision-making, revising the Entrance Examinations, monitoring the development of the students, and exploring other predictors of college academic performance such as non-cognitive factors.

Keywords: scholastic ability, scholastic achievement, entrance exam, college performance

Procedia PDF Downloads 236

159 Context and Culture in EFL Learners' and Native Speakers' Discourses

Authors: Emad A. S. Abu-Ayyash

Abstract:

Cohesive devices, the linguistic tools that are usually employed to hold the different parts of the text together, have been the focus of a significant number of discourse analysis studies. These linguistic tools have attracted the attention of researchers since the inception of the first and most comprehensive model of cohesion in 1976. However, it was noticed that some cohesive devices (e.g., endophoric reference, conjunctions, ellipsis, substitution, and lexical ties) – being thought of as more popular than others (e.g., exophoric reference) – were over-researched. The present paper explores the usage of two cohesive devices that have been almost absent from discourse analysis studies. These cohesive devices are exophoric and homophoric references, the linguistic items that can be interpreted in terms of the physical and cultural contexts of discourse. The significance of the current paper, therefore, stems from the fact that it attempts to fill a gap in the research conducted so far on cohesive devices. This study provides an explanation of the concepts of the cohesive devices that have been employed in a plethora of research on cohesion and elucidates the relevant context-related concepts. The paper also identifies the gap in cohesive devices research. Exophora and homophora, the least visited cohesive devices in previous studies, were qualitatively and quantitatively explored in six opinion articles, four produced by eight postgraduate English as a Foreign Language (EFL) students in a university in the United Arab Emirates and two by professional native-speaker (NS) writers in the Independent and the Guardian. The six pieces were about the UK Independence Party (UKIP) leader’s call to ban the burqa in the UK and were analysed vis-a-vis the employment and function of homophora and exophora.
The study found that both EFL students and native speakers employed exophora and homophora considerably in their writing to serve a variety of functions, including building assumptions, supporting main ideas, and involving the readers among others.

Keywords: cohesive devices, context, culture, exophoric reference, homophoric reference

Procedia PDF Downloads 103

158 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 87

157 Translation of the Verbal Nouns (Masadars) Originating from Three-Letter Verbs in the Holy Quran: Verbal Noun with More than One Pattern (Wazn) As a Model

Authors: Montasser Mohamed Abdelwahab Mahmoud, Abdelwahab Saber Esawi

Abstract:

The language of the Qur’an has a wide range of understanding, reflection, and meanings. Therefore, translation of the Qur’an is inevitably nothing but a translation of the interpretation of the meanings of the Qur’an. It requires special competencies and skills for translators so that they can get close to the intended meaning of the verse of the Qur’an and convey it with precision. In the Arabic language, the verbal noun “AlMasdar” is a very important derivative that properly expresses the verbal idea in the form of a noun. It sounds the same as the base form of the verb with minor changes in the vowel pattern. It is one of the important topics in morphology. The morphologists divided verbal nouns into auditory and analogical, and they stated that the verbal nouns (Masadars) originating from three-letter verbs are auditory, although they set controls for some of them in order to preserve them. As for the lexicographers, they mentioned the verbal nouns while talking about the lexical materials, and in some cases, their explanation of them exceeded that made by the morphologists, especially in their discussion of structures that the morphologists did not refer to in their books. The verb kafara (to disbelieve), for example, has three patterns, namely: al-kufr, al-kufrān, and al-kufūr, and it was mentioned in the Holy Qur’an with different connotations. The verb ṣāma (fasted) with its two patterns (al-ṣaūm and al-ṣīām) was mentioned in the Holy Qur’an, while their semantic meanings are different. The problem discussed in this research paper lay in the "linguistic loss" committed by translators when dealing with Islamic religious texts, especially the Qur’an. The study tried to identify the strategy adopted by translators of the Holy Qur’an in translating words that were classified as verbal nouns by analyzing five translations of the Qur’an into English: Yusuf Ali, Pickthall, Mohsin Khan, Muhammad Sarwar, and Shakir.
This study was limited to the verbal nouns in the Qur’an that originate from three-letter verbs and have different semantic meanings.

Keywords: pattern, three-letter verbs, translation of the Quran, verbal nouns

Procedia PDF Downloads 128

156 Swahili Codification of Emotions: A Cognitive Linguistic Analysis

Authors: Rosanna Tramutoli

Abstract:

Studies on several languages have demonstrated how different emotions are categorized in various linguistic constructions. Several works exist on the codification of emotions in West African languages. A recent study on the semantic description of Swahili body terminology has demonstrated that body part terms, such as moyo (heart), uso (face) and jicho (eye), are involved in several metaphorical expressions describing emotions. However, so far hardly anything has been written on the linguistic description of emotions in Swahili. Thus, this study describes how emotional concepts, such as ‘love’ and ‘anger’, are codified in Swahili, in order to highlight common semantic and syntactic patterns, etymological sources and metaphorical expressions. The research seeks to answer a number of questions, such as: which are the Swahili terms for ‘emotions’? Is there a distinction between ‘emotions’ and ‘feelings’? Which emotional lexical items have Bantu origin and which come from Arabic? Which metaphorical expressions/cognitive schemas are used to codify emotions? (e.g. kumpanda mtu kichwani, lit. ‘to climb on somebody’s head’, to make somebody feel angry, kushuka moyo, lit. ‘to be down the heart’, to feel discouraged, kumpa mtu moyo lit. ‘to give someone heart’, to encourage someone). Which body terms are involved as ‘containers/locus of emotions’? For instance, it has been shown that moyo (‘heart’) occurs as container of ‘love’ (e.g. kumtia mtu moyoni, lit. ‘to put somebody in the heart’, to love somebody very much) and ‘kindness’ (moyo wake ulijaa hisani, ‘his heart was filled with kindness’). The study also takes into account the syntactic patterns used to code emotions. For instance, when does the experiencer occur in subject position (e.g. nina furaha, nimefurahi, ‘I am happy’) and when in object position (e.g. Huruma iliniingia moyoni, lit. ‘Pity entered me inside my heart’, ‘I felt pity’)?
Data have been collected mostly through the analysis of Swahili digital corpora, containing different kinds of Swahili texts (e.g. novels, drama, political essays).

Keywords: emotions, cognitive linguistics, metaphors, Swahili

Procedia PDF Downloads 545

155 Deep Well Grounded Magnetite Anode Chains Retrieval and Installation for Raslanuf Complex Impressed Current Cathodic Protection System Rectification

Authors: Mohamed Ahmed Khali

Abstract:

A number of deep well anode ground beds (GBs) have been retrieved because their anode chains were inoperative. New identical magnetite anode chains (MACs) have been installed in the Raslanuf complex impressed current cathodic protection (ICCP) system, distributed across different plants (utility, ethylene and polyethylene). All problems associated with the retrieval and installation of MACs have been discussed, rectified and presented. All severely corroded wellhead casings associated with the GBs were maintained and/or replaced by newly fabricated and modified ones. The main cause of the internal corrosion of the wellhead casings is discussed, and the remedial action taken to prevent future corrosion problems is presented. All anode junction boxes (AJBs) and shunts connected to the GBs were closely inspected and maintained, and the necessary replacements and/or modifications were carried out on the shunts. All damaged GB concrete foundations (CFs) have been inspected and completely replaced. All transformer-rectifier units (TRUs) associated with the GBs were subjected to thorough inspection, and the necessary maintenance has been performed on each individual TRU. After completion of all MAC and TRU maintenance activities, each cathodic protection station (CPS) was put back into operation. Alternating current (AC), direct current (DC), voltage and structure-to-soil potential (S/P) measurements have been conducted and recorded, and all obtained test results are presented. The DC current outputs have been adjusted, and the DC current output of each MAC has been recorded for each GB AJB.

Keywords: magnetite anode, deep well, ground bed, cathodic protection, transformer rectifier, impressed current, junction box

Procedia PDF Downloads 74

154 Digitalisation of the Railway Industry: Recent Advances in the Field of Dialogue Systems: Systematic Review

Authors: Andrei Nosov

Abstract:

This paper discusses the development directions of dialogue systems within the digitalisation of the railway industry, where technologies based on conversational AI are already potentially applied or will be applied. Conversational AI is one of the popular natural language processing (NLP) tasks, as it has great prospects for real-world applications today. At the same time, it is a challenging task as it involves many areas of NLP based on complex computations and deep insights from linguistics and psychology. In this review, we focus on dialogue systems and their implementation in the railway domain. We comprehensively review the state-of-the-art research results on dialogue systems and analyse them from three perspectives: type of problem to be solved, type of model, and type of system. In particular, from the perspective of the type of tasks to be solved, we discuss characteristics and applications. This will help to understand how to prioritise tasks. In terms of the type of models, we give an overview that will allow researchers to become familiar with how to apply them in dialogue systems. By analysing the types of dialogue systems, we propose an unconventional approach in contrast to colleagues who traditionally contrast goal-oriented dialogue systems with open-domain systems. Our view focuses on considering retrieval and generative approaches. Furthermore, the work comprehensively presents evaluation methods and datasets for dialogue systems in the railway domain to pave the way for future research. Finally, some possible directions for future research are identified based on recent research results.

Keywords: digitalisation, railway, dialogue systems, conversational AI, natural language processing, natural language understanding, natural language generation

Procedia PDF Downloads 35

153 Advantages of Multispectral Imaging for Accurate Gas Temperature Profile Retrieval from Fire Combustion Reactions

Authors: Jean-Philippe Gagnon, Benjamin Saute, Stéphane Boubanga-Tombet

Abstract:

Infrared thermal imaging is used for a wide range of applications, especially in the combustion domain. However, it is well known that most combustion gases such as carbon dioxide (CO₂), water vapor (H₂O), and carbon monoxide (CO) selectively absorb/emit infrared radiation at discrete energies, i.e., over a very narrow spectral range. Therefore, temperature profiles of most combustion processes derived from conventional broadband imaging are inaccurate without prior knowledge or assumptions about the spectral emissivity properties of the combustion gases. Using spectral filters allows estimating these critical emissivity parameters in addition to providing selectivity regarding the chemical nature of the combustion gases. However, due to the turbulent nature of most flames, it is crucial that such information be obtained without sacrificing temporal resolution. For this reason, Telops has developed a time-resolved multispectral imaging system which combines a high-performance broadband camera synchronized with a rotating spectral filter wheel. In order to illustrate the benefits of using this system to characterize combustion experiments, measurements were carried out using a Telops MS-IR MW on a very simple combustion system: a wood fire. The temperature profiles calculated using the spectral information from the different channels were compared with corresponding temperature profiles obtained with conventional broadband imaging. The results illustrate the benefits of the Telops MS-IR cameras for the characterization of laminar and turbulent combustion systems at a high temporal resolution.

Keywords: infrared, multispectral, fire, broadband, gas temperature, IR camera

Procedia PDF Downloads 101

152 Rehabilitation and Conservation of Mangrove Forest as Pertamina Corporate Social Responsibility Approach in Prevention Damage Climate in Indonesia

Authors: Nor Anisa

Abstract:

This paper aims to describe the use of conservation and rehabilitation of mangrove forests as an alternative approach to protecting the natural environment, ecosystems and ecology, to community education, and to innovation in the sustainable development of industries such as oil, gas and coal companies. Globalization drives demand for energy such as gas, diesel and coal as an unaffected resource that is a basic need for human life, while environmental degradation and natural phenomena continue to occur in Indonesia, especially global warming, sea water pollution and the extinction of animal species. This damage to nature in Indonesia is caused by a population explosion that causes unemployment and the disappearance of residential land, which in turn encourages the exploitation of nature and the environment. Therefore, Pertamina, as a state-owned oil and gas company, carries out its social responsibility efforts, namely the conservation and rehabilitation of mangroves and the management of mangrove fruit seeds, which has an educational effect regarding the benefits of mangrove seed maintenance. The method used in this study is qualitative, with secondary data retrieval techniques in which data are taken from Pertamina activity journals and websites that can be accounted for. The paper concludes by describing the physical, chemical, biological, social and economic benefits and functions of mangrove forest conservation in Indonesia, which can also bring innovation to the company's CSR (Corporate Social Responsibility) in continuing its social responsibility in the areas of environmental conservation and social education.

Keywords: mangrove, environmental damage, conservation and rehabilitation, innovation of corporate social responsibility

Procedia PDF Downloads 108

151 Collaboration During Planning and Reviewing in Writing: Effects on L2 Writing

Authors: Amal Sellami, Ahlem Ammar

Abstract:

Writing is acknowledged to be a cognitively demanding and complex task. Indeed, the writing process is composed of three iterative sub-processes, namely planning, translating (writing), and reviewing. Not only do second or foreign language learners need to write according to this process, but they also need to respect the norms and rules of language and writing in the text to be produced. Accordingly, researchers have suggested approaching writing as a collaborative task in order to alleviate its complexity. Consequently, collaboration has been implemented during the whole writing process or only during planning or reviewing. Researchers report that implementing collaboration during the whole process might be demanding in terms of time in comparison to individual writing tasks. Consequently, because of time constraints, teachers may avoid it. For this reason, it might be pedagogically more realistic to limit collaboration to one of the writing sub-processes (i.e., planning or reviewing). However, previous research implementing collaboration in planning or reviewing is limited and fails to explore the effects of these conditions on the written text. Consequently, the present study examines the effects of collaboration in planning and collaboration in reviewing on the written text. To reach this objective, quantitative as well as qualitative methods are deployed to examine the written texts holistically and in terms of fluency, complexity, and accuracy. Participants of the study include 4 pairs in each group (n=8). They participated in two experimental conditions, which are: (1) collaborative planning followed by individual writing and individual reviewing and (2) individual planning followed by individual writing and collaborative reviewing.
The comparative research findings indicate that while collaborative planning resulted in better overall text quality (specifically, better content and organization ratings), better fluency, better complexity, and fewer lexical errors, collaborative reviewing produced better accuracy and fewer syntactical and mechanical errors. The discussion of the findings suggests the need to conduct more comparative research in order to further explore the effects of collaboration in planning or in reviewing. Pedagogical implications of the current study include advising teachers to choose between implementing collaboration in planning or in reviewing depending on their students’ needs and what they need to improve.

Keywords: collaboration, writing, collaborative planning, collaborative reviewing

Procedia PDF Downloads 64

150 The Prevalence and Impact of Anxiety Among Medical Students in the MENA Region: A Systematic Review, Meta-Analysis, and Meta-Regression

Authors: Kawthar F. Albasri, Abdullah M. AlHudaithi, Dana B. AlTurairi, Abdullaziz S. AlQuraini, Adoub Y. AlDerazi, Reem A. Hubail, Haitham A. Jahrami

Abstract:

Several studies have found that medical students have a significant prevalence of anxiety. The purpose of this review paper is to carefully evaluate the current research on anxiety among medical students in the MENA region and, as a result, estimate the prevalence of these disturbances. Multiple databases, including the CINAHL (Cumulative Index to Nursing and Allied Health Literature), Cochrane Library, Embase, MEDLINE (Medical Literature Analysis and Retrieval System Online), PubMed, PsycINFO (Psychological Information Database), Scopus, Web of Science, UpToDate, ClinicalTrials.gov, WHO Global Health Library, EbscoHost, ProQuest, JAMA Network, and ScienceDirect, were searched. The reference lists of the retrieved articles were rigorously searched and rated for quality. A random effects meta-analysis was performed to compute estimates. The current meta-analysis revealed an alarming estimated pooled prevalence of anxiety (K = 46, N = 27023) of 52.5% [95% CI 43.3%; 61.6%]. A total of 62.0% [95% CI 42.9%; 78.0%] of the students (K = 18, N = 16466) suffered from anxiety during the COVID-19 pandemic, while 52.5% [95% CI 43.3%; 61.6%] had anxiety before COVID-19. Based on the GAD-7 measure, a total of 55.7% [95% CI 30.5%; 78.3%] of the students (K = 10, N = 5830) had anxiety, and a total of 54.7% [95% CI 42.8%; 66.0%] of the students (K = 18, N = 12154) had anxiety using the DASS-21 or DASS-42 measure. Anxiety is a common issue among medical students, making it a genuine problem. Further research should be conducted post-COVID-19, with a focus on anxiety prevention and intervention initiatives for medical students.

Keywords: anxiety, medical students, MENA, meta-analysis, prevalence

Procedia PDF Downloads 44

149 Vascular Crossed Aphasia in Dextrals: A Study on Bengali-Speaking Population in Eastern India

Authors: Durjoy Lahiri, Vishal Madhukar Sawale, Ashwani Bhat, Souvik Dubey, Gautam Das, Biman Kanti Roy, Suparna Chatterjee, Goutam Gangopadhyay

Abstract:

Crossed aphasia has been an area of considerable interest for cognitive researchers as it offers a fascinating insight into cerebral lateralization for language function. We conducted an observational study in the stroke unit of a tertiary care neurology teaching hospital in eastern India on subjects with crossed aphasia over a period of four years. During the study period, we detected twelve cases of crossed aphasia in strongly right-handed patients, caused by ischemic stroke. The age, gender, vernacular language and educational status of the patients were noted. Aphasia type and severity were assessed using the validated Bengali version of the Western Aphasia Battery. Computed tomography, magnetic resonance imaging and angiography were used to evaluate the location and extent of the ischemic lesion in the brain. Our series of 12 cases of crossed aphasia included 7 males and 5 females, with a mean age of 58.6 years. Eight patients were found to have Broca’s aphasia, 3 had trans-cortical motor aphasia and 1 patient suffered from global aphasia. Nine patients had very severe aphasia and 3 suffered from mild aphasia. The mirror-image type of crossed aphasia was found in 3 patients, whereas 9 had the anomalous variety. In our study, crossed aphasia was found to be more frequent in males. The anomalous pattern was more common than the mirror-image pattern. The majority of the patients had motor-type aphasia, and no patient was found to have a pure comprehension deficit. We hypothesize that in the Bengali-speaking right-handed population, the lexical-semantic system of the language network remains loyal to the left hemisphere even if the phonological output system is anomalously located in the right hemisphere.

Keywords: aphasia, crossed, lateralization, language function, vascular

Procedia PDF Downloads 155