Search results for: short text analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29628

29088 Linguistic Features for Sentence Difficulty Prediction in Aspect-Based Sentiment Analysis

Authors: Adrian-Gabriel Chifu, Sebastien Fournier

Abstract:

One of the challenges of natural language understanding is to deal with the subjectivity of sentences, which may express opinions and emotions that add layers of complexity and nuance. Sentiment analysis is a field that aims to extract and analyze these subjective elements from text, and it can be applied at different levels of granularity, such as document, paragraph, sentence, or aspect. Aspect-based sentiment analysis is a well-studied topic with many available data sets and models. However, there is no clear definition of what makes a sentence difficult for aspect-based sentiment analysis. In this paper, we explore this question by conducting an experiment with three data sets: "Laptops", "Restaurants", and "MTSC" (Multi-Target-dependent Sentiment Classification), and a merged version of these three data sets. We study the impact of domain diversity and syntactic diversity on difficulty. We use a combination of classifiers to identify the most difficult sentences and analyze their characteristics. We employ two ways of defining sentence difficulty. The first one is binary and labels a sentence as difficult if the classifiers fail to correctly predict the sentiment polarity. The second one is a six-level scale based on how many of the top five best-performing classifiers can correctly predict the sentiment polarity. We also define 9 linguistic features that, combined, aim at estimating the difficulty at sentence level.
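The two difficulty definitions in this abstract can be illustrated with a minimal Python sketch. The function names, the label strings, and the direction of the scale (0 for easiest, 5 for hardest) are assumptions made for illustration. The binary rule is read here as "every classifier misses", which the abstract does not state explicitly.

```python
from typing import Sequence


def is_difficult_binary(predictions: Sequence[str], gold: str) -> bool:
    """Binary definition (assumed reading): difficult when every classifier misses."""
    return all(p != gold for p in predictions)


def difficulty_level(top5_predictions: Sequence[str], gold: str) -> int:
    """Six-level scale: 0 when all five classifiers are correct, 5 when none is."""
    if len(top5_predictions) != 5:
        raise ValueError("expected predictions from exactly five classifiers")
    correct = sum(p == gold for p in top5_predictions)
    return 5 - correct


# Hypothetical predictions from five classifiers for one sentence/aspect pair.
preds = ["positive", "positive", "negative", "neutral", "positive"]
print(difficulty_level(preds, "positive"))     # three correct -> level 2
print(is_difficult_binary(preds, "positive"))  # False
```

Counting correct classifiers gives six possible values (0 through 5), which is where the six levels come from.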

Keywords: sentiment analysis, difficulty, classification, machine learning

Procedia PDF Downloads 55
29087 A New Method to Reduce 5G Application Layer Payload Size

Authors: Gui Yang Wu, Bo Wang, Xin Wang

Abstract:

Nowadays, the 5G service-based interface architecture uses text-based payloads such as JSON to transfer business data between network functions, which has obvious advantages for internet services but causes unnecessarily large traffic. In this paper, a new method for reducing 5G application payload size is presented. It provides a mechanism for network functions to negotiate a new capability when network communication starts up, and describes how 5G application data are reduced according to the information negotiated with the peer network function. Without losing the advantages of the 5G text-based payload, this method demonstrates an excellent result on application payload size reduction and does not increase the usage of computing resources. Implementing this method does not impact any standards or specifications, nor does it change any encoding or decoding functionality. In a real 5G network, this method will contribute to network efficiency and eventually save considerable computing resources.

Keywords: 5G, JSON, payload size, service-based interface

Procedia PDF Downloads 150
29086 Forecasting Nokoué Lake Water Levels Using Long Short-Term Memory Network

Authors: Namwinwelbere Dabire, Eugene C. Ezin, Adandedji M. Firmin

Abstract:

The prediction of hydrological flows (rainfall-depth or rainfall-discharge) is becoming increasingly important in the management of hydrological risks such as floods. In this study, the Long Short-Term Memory (LSTM) network, a state-of-the-art algorithm dedicated to time series, is applied to predict the daily water level of Nokoué Lake in Benin. This paper aims to provide an effective and reliable method capable of reproducing the future daily water level of Nokoué Lake, which is influenced by a combination of two phenomena: rainfall and river flow (runoff from the Ouémé River, the Sô River, the Porto-Novo lagoon, and the Atlantic Ocean). Performance analysis based on the forecasting horizon indicates that LSTM can predict the water level of Nokoué Lake up to a forecast horizon of t+10 days. Performance metrics such as Root Mean Square Error (RMSE), coefficient of correlation (R²), Nash-Sutcliffe Efficiency (NSE), and Mean Absolute Error (MAE) agree on a forecast horizon of up to t+3 days. The values of these metrics remain stable for forecast horizons of t+1 days, t+2 days, and t+3 days. The values of R² and NSE are greater than 0.97 during the training and testing phases in the Nokoué Lake basin. Based on the evaluation indices used to assess the model's performance for the appropriate forecast horizon of water level in the Nokoué Lake basin, the forecast horizon of t+3 days is chosen for predicting future daily water levels.
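For readers unfamiliar with the evaluation metrics named in this abstract, the following Python sketch computes RMSE, MAE, and NSE from their standard definitions. The water-level values are made up for illustration and are not data from the study.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Root Mean Square Error between observed and simulated values."""
    squared = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / len(observed))


def mae(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Mean Absolute Error between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical daily water levels (metres) and forecasts, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, forecast), mae(observed, forecast), nse(observed, forecast))
```

An NSE of 1 indicates a perfect match, while a value of 0 means the forecast is no better than predicting the mean of the observations.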

Keywords: forecasting, long short-term memory cell, recurrent artificial neural network, Nokoué lake

Procedia PDF Downloads 39
29085 Sales Patterns Clustering Analysis on Seasonal Product Sales Data

Authors: Soojin Kim, Jiwon Yang, Sungzoon Cho

Abstract:

As a seasonal product is only in demand for a short time, inventory management is critical to profits. Both markdowns and stockouts decrease the return on perishable products; therefore, researchers have been interested in the distribution of seasonal products with the aim of maximizing profits. In this study, we propose a data-driven seasonal product sales pattern analysis method for individual retail outlets based on observed sales data clustering; the proposed method helps in determining distribution strategies.

Keywords: clustering, distribution, sales pattern, seasonal product

Procedia PDF Downloads 576
29084 Semantic Indexing Improvement for Textual Documents: Contribution of Classification by Fuzzy Association Rules

Authors: Mohsen Maraoui

Abstract:

With the aim of improving natural language processing applications such as information retrieval, machine translation, and lexical disambiguation, we focus on a statistical approach to semantic indexing for multilingual text documents based on conceptual network formalism. We propose to use this formalism as an indexing language to represent the descriptive concepts and their weighting. These concepts represent the content of the document. Our contribution is based on two steps. In the first step, we propose the extraction of index terms using the multilingual lexical resource Euro WordNet (EWN). In the second step, we pass from the representation of index terms to the representation of index concepts through conceptual network formalism. This network is generated using the EWN resource and passes through a classification step based on an association rules model (in an attempt to discover the non-taxonomic or contextual relations between the concepts of a document). These relations are latent relations buried in the text and carried by the semantic context of the co-occurrence of concepts in the document. Our proposed indexing approach can be applied to text documents in various languages because it is based on a linguistic method adapted to the language through a multilingual thesaurus. Next, we apply the same statistical process regardless of the language in order to extract the significant concepts and their associated weights. We show that the proposed indexing approach provides encouraging results.

Keywords: concept extraction, conceptual network formalism, fuzzy association rules, multilingual thesaurus, semantic indexing

Procedia PDF Downloads 123
29083 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Ben Soltane Cheima, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing the identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still a need for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficient (MFCC) as feature extraction and a suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with the Expectation Maximization algorithm (EM) for speaker modeling. The use of a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and Statistical Modeling of Background Noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM, for estimating the underlying parameters in the EM step, improved the convergence rate and the system's performance. The system also uses a relative index as a confidence measure in case of contradiction between the GMM and VQ in the identification process. Simulation results carried out on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.
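The idea of a voice activity detector driven by short-time energy and a background-noise model can be sketched in a few lines of Python. This is not the authors' hybrid method: the thresholding rule (noise mean plus k standard deviations, estimated from the leading frames), the function names, and all parameter values are assumptions chosen for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def short_time_energy(signal: Sequence[float], frame_len: int, hop: int) -> List[float]:
    """Sum of squared samples in each frame of the signal."""
    last_start = len(signal) - frame_len
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, last_start + 1, hop)]


def detect_voice(signal: Sequence[float], frame_len: int = 4, hop: int = 4,
                 noise_frames: int = 2, k: float = 3.0) -> List[bool]:
    """Mark a frame as speech when its energy exceeds a noise-based threshold.

    Assumption: the first `noise_frames` frames contain background noise only.
    """
    energies = short_time_energy(signal, frame_len, hop)
    noise = energies[:noise_frames]
    threshold = mean(noise) + k * pstdev(noise)
    return [e > threshold for e in energies]


# Toy signal: two quiet frames, one loud frame, one quiet frame.
signal = [0.01] * 4 + [0.02] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.01] * 4
print(detect_voice(signal))  # [False, False, True, False]
```

In practice, frames would be tens of milliseconds long and the noise statistics would be updated over time. The sketch only shows how an energy threshold separates speech frames from background.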

Keywords: feature extraction, speaker modeling, feature matching, Mel frequency cepstrum coefficient (MFCC), Gaussian mixture model (GMM), vector quantization (VQ), Linde-Buzo-Gray (LBG), expectation maximization (EM), pre-processing, voice activity detection (VAD), short time energy (STE), background noise statistical modeling, closed-set text-independent speaker identification system (CISI)

Procedia PDF Downloads 285
29082 Direct Blind Separation Methods for Convolutive Images Mixtures

Authors: Ahmed Hammed, Wady Naanaa

Abstract:

In this paper, we propose a general approach to deal with the problem of a convolutive mixture of images. We use a direct blind source separation method, adding only one non-statistically justified constraint that describes the relationships between the different mixing matrices, with the aim of making the problem easy to solve. Provided that this constraint is known, this method can be applied to degraded documents affected by the overlapping of text-patterns and images. This overlapping is due to chemical and physical reactions of the materials (paper, inks, ...) occurring as the documents age, and to other unpredictable causes such as humidity, microorganism infestation, human handling, etc. We will demonstrate that this problem corresponds to a convolutive mixture of images. Subsequently, we will validate our method through numerical examples. We can thus obtain clear images from unreadable ones, which can be caused by page superposition, a phenomenon similar to that which we find very often in archival documents.

Keywords: blind source separation, convoluted mixture, degraded documents, text-patterns overlapping

Procedia PDF Downloads 304
29081 Analysis of Short Counter-Flow Heat Exchanger (SCFHE) Using Non-Circular Micro-Tubes Operated on Water-CuO Nanofluid

Authors: Avdhesh K. Sharma

Abstract:

Key to the development of energy-efficient micro-scale heat exchanger devices is selecting a large ratio of heat transfer surface to volume without much expense on recirculation pumps. The increased interest in short heat exchangers (SHE) is due to the accessibility of advanced technologies for manufacturing micro-tubes in the range of 1 µm to 1 mm. Such SHEs using micro-tubes are highly effective for high-flux heat transfer technologies. Nanofluids are used to enhance the thermal conductivity of the recirculated coolant and thus further enhance the heat transfer rate. However, the higher viscosity associated with nanofluids demands more pumping power. Thus, there is a trade-off between heat transfer rate and pressure drop that depends on the geometry of the micro-tubes. Herein, a novel design of a short counter-flow heat exchanger (SCFHE) using non-circular micro-tubes flooded with CuO-water nanofluid is conceptualized by varying the ratio of surface area to cross-sectional area of the micro-tubes. A framework for comparative analysis of SCFHEs using non-circular micro-tubes flooded with CuO-water nanofluid is presented. In the SCFHE concept, micro-tubes having various geometrical shapes (viz., triangular, rectangular and trapezoidal) have been arranged row-wise to facilitate two aspects: (1) allowing easy flow distribution for cold and hot streams, and (2) maximizing the thermal interactions with neighboring channels. An adequate distribution of rows for the cold and hot flow streams enables these two aspects. For comparative analysis, a specific volume or cross-sectional area is assigned to each elemental cell (which includes the flow area and the area corresponding to half the wall thickness). This area is held constant for each elemental cell, and variation in surface area is allowed by selecting different micro-tube geometries in the SCFHE.
An effective thermal conductivity model for CuO-water nanofluid has been adopted, while the viscosity values for water-based nanofluids are obtained empirically. Correlations for the Nusselt number (Nu) and Poiseuille number (Po) for micro-tubes have been derived or adopted. The entrance effect is accounted for. The thermal and hydrodynamic performances of the SCFHE are defined in terms of effectiveness and pressure drop or pumping power, respectively. To define the overall performance index of the SCFHE, two links are employed. The first relates the heat transfer between the fluid streams, q, to the pumping power, PP, as q_j/PP_j, while the second relates the effectiveness, eff, to the pressure drop, dP, as eff_j/dP_j. For the analysis, the inlet temperatures of the hot and cold streams are varied in the usual range of 20 °C to 65 °C. A fully turbulent regime is seldom encountered in micro-tubes, and the transition of the flow regime occurs much earlier (i.e., at Re ≈ 1000). Thus, Re is fixed at 900; however, the uncertainty in Re due to the addition of nanoparticles to the base fluid is quantified by averaging Re. Moreover, to minimize error, the volumetric concentration is limited to the range of 0% to 4%. Such a framework may be helpful in utilizing the maximum peripheral surface area of the SCFHE without a severe penalty in pumping power, and in developing advanced short heat exchangers.
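The two performance indices named in this abstract (heat transfer over pumping power, and effectiveness over pressure drop) can be sketched numerically. The abstract does not give its working formulas, so the sketch below assumes the textbook relations: effectiveness as q divided by the maximum possible heat transfer C_min (T_hot,in - T_cold,in), and pumping power as volumetric flow rate times pressure drop. All function names and input values are hypothetical.

```python
from typing import Tuple


def pumping_power(volumetric_flow_m3_s: float, pressure_drop_pa: float) -> float:
    """Ideal pumping power in watts (assumed relation: flow rate times pressure drop)."""
    return volumetric_flow_m3_s * pressure_drop_pa


def effectiveness(q_w: float, c_min_w_per_k: float,
                  t_hot_in_c: float, t_cold_in_c: float) -> float:
    """Heat exchanger effectiveness: actual over maximum possible heat transfer."""
    q_max = c_min_w_per_k * (t_hot_in_c - t_cold_in_c)
    return q_w / q_max


def performance_indices(q_w: float, pp_w: float,
                        eff: float, pressure_drop_pa: float) -> Tuple[float, float]:
    """Return the two indices named in the abstract: q/PP and eff/dP."""
    return q_w / pp_w, eff / pressure_drop_pa


# Hypothetical operating point: inlet temperatures at the ends of the 20-65 °C range.
q = 225.0    # W, heat transferred between the streams
dp = 1000.0  # Pa, pressure drop
pp = pumping_power(1e-6, dp)
eff = effectiveness(q, 10.0, 65.0, 20.0)
print(performance_indices(q, pp, eff, dp))
```

Comparing these ratios across tube geometries (triangular, rectangular, trapezoidal) is what lets the heat transfer gain be weighed against the hydraulic cost.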

Keywords: CuO-water nanofluid, non-circular micro-tubes, performance index, short counter flow heat exchanger

Procedia PDF Downloads 194
29080 Metadiscourse in Chinese and Thai Request Emails: Analysis and Pedagogical Application

Authors: Chia-Ling Hsieh, Kankanit Potikit

Abstract:

Metadiscourse refers to linguistic resources employed by writers to organize text and interact with readers. While metadiscourse has received considerable attention within the field of discourse analysis, few studies have explored the use of metadiscourse in email, one of the most popular forms of computer-mediated communication. Furthermore, the diversity of cross-linguistic research required to uncover the influence of cultural factors on metadiscourse use is lacking. The present study compares metadiscourse markers employed in Chinese and Thai-language request emails with the purpose of discovering cross-cultural similarities and differences that are meaningful and applicable to foreign language teaching. The analysis is based on a corpus of 200 request emails: 100 composed in Chinese and 100 in Thai, with half of the emails from each language data set addressed to professors and the other half addressed to classmates. Adopting Hyland’s model as an analytical framework, two primary categories of metadiscourse are identified. Textual metadiscourse helps to create text coherence, while interpersonal metadiscourse functions to convey authorial stance. Results of the study make clear that both Chinese and Thai-language emails use significantly more interpersonal markers than textual markers, indicating that email, as a unique communicative medium, is characterized by high degrees of concision and interactivity. Users of both languages further deploy similar patterns in writing emails to recipients of different social statuses. Compared with emails addressed to classmates, emails addressed to professors are notably longer and include more transition and engagement markers. Nevertheless, cultural factors do play a role. 
Emails composed in Thai, for example, include more textual markers than those in Chinese, as Thai favors formal expressions and detailed explanations, while in contrast, emails composed in Chinese employ more interpersonal markers than those in Thai, since Chinese tends to emphasize recipient involvement and attitudinal warmth. These findings thereby demonstrate the combined effects of email as a communicative medium, social status, and cultural values on metadiscourse usage. The study concludes by applying these findings to pedagogical suggestions for teaching email writing to Chinese and Thai language learners based on similarities and differences in metadiscourse strategy between the two languages.

Keywords: discourse analysis, email, metadiscourse, writing instruction

Procedia PDF Downloads 109
29079 Stock Price Prediction Using Time Series Algorithms

Authors: Sumit Sen, Sohan Khedekar, Umang Shinde, Shivam Bhargava

Abstract:

This study was undertaken to investigate whether deep learning models are able to predict future stock prices when trained on historical stock price data. Since this work requires time series analysis, we consider several models available today for that purpose, such as the LSTM recurrent neural network, ARIMA, and Facebook Prophet. These models are applied to predict the movement of a stock's price and to forecast its future price. The final product will be a stock price prediction web application that makes it easy for the user to analyze stocks and that provides the predicted stock price for the next seven days.

Keywords: Autoregressive Integrated Moving Average, Deep Learning, Long Short Term Memory, Time-series

Procedia PDF Downloads 119
29078 A Comparison of Short- and Long-Haul Vacation Tourists on Evaluation of Attractiveness: The Case of Hong Kong

Authors: Zhaoyu Chen

Abstract:

In this study, an attempt was made to find reasons why tourists go to particular attractions. Tourists may be either motivated by the attractions or simply make the choice to satisfy their needs and desires. Based on the attractions in Hong Kong, this research was conducted to explore the attraction-related concepts to discuss how the attraction system works. Due to the limited studies on exploring the attractiveness of attractions through tourist movement patterns, the study aims to evaluate such indicators to determine whether tourists are motivated by attractiveness or their own needs. The investigation is conducted through the comparison of different source markets - Mainland China, short haul markets (excluding Mainland China) and long haul markets. The latest finding of Departing Visitor Survey (DVS) implemented by the Hong Kong Tourism Board (HKTB) is employed for the analysis. Various tourist movement patterns are drawn from the practical data. The managerial implication to destination management organizations (DMOs) is suggested to better allocate attractions according to the needs of tourists.

Keywords: attractions, attraction system, Hong Kong, tourist movement patterns

Procedia PDF Downloads 492
29077 Topic-to-Essay Generation with Event Element Constraints

Authors: Yufen Qin

Abstract:

Topic-to-essay generation is a challenging task in natural language processing, which aims to generate novel, diverse, and topic-related text based on user input. Previous research has overlooked the generation of articles under the constraints of event elements, resulting in issues such as incomplete event elements and logical inconsistencies in the generated results. To fill this gap, this paper proposes an event-constrained approach for topic-to-essay generation that enforces the completeness of event elements during the generation process. Additionally, a language model is employed to verify the logical consistency of the generated results. Experimental results demonstrate that the proposed model achieves a better BLEU-2 score and performs better than the baseline in terms of subjective evaluation on a real dataset, indicating its capability to generate higher-quality topic-related text.

Keywords: event element, language model, natural language processing, topic-to-essay generation

Procedia PDF Downloads 211
29076 Perceptions and Expectations by Participants of Monitoring and Evaluation Short Course Training Programmes in Africa

Authors: Mokgophana Ramasobana

Abstract:

Background: At the core of the demand to utilize evidence-based approaches in the policy-making cycle, prioritization of limited financial resources, and results-driven initiatives is the urgency to develop a cohort of competent Monitoring and Evaluation (M&E) practitioners and public servants. The ongoing strides in evaluation capacity building (ECB) initiatives are a direct response to the need to produce highly sought-after M&E skills. Notwithstanding the rapid growth of M&E short courses, participants' perceived value and expectations of M&E short courses as a panacea for ECB have not been empirically quantified or measured. The objective of this article is to explicitly illustrate the importance of measuring ECB interventions and understanding what works in ECB and why it works. Objectives: This article illustrates the importance of establishing empirical ECB measurement tools to evaluate ECB interventions in order to ascertain their contribution to the broader evaluation practice. Method: The study was primarily a desktop review of existing literature, juxtaposed with a survey of participants across the African continent based on the 43 M&E short courses hosted by the Centre for Learning on Evaluation and Results Anglophone Africa (CLEAR-AA) in collaboration with the Department of Planning Monitoring and Evaluation (DPME). Results: The article established that participants perceive short course training as a panacea to improve their practical M&E skills, which are critical to executing their organizational duties. In tandem, participants are likely to demand customized training as opposed to general topics in evaluation. However, organizational environments constrain the application of the newly acquired skills. Conclusion: This article aims to contribute to the discourse on how to measure ECB interventions and towards improving the evaluation of ECB interventions. The study finds that participants prefer training courses of longer duration that cover more topics. 
At the same time, whilst organizations call for customization of programmes, the study found that individual participants demand knowledge of generic and popular evaluation topics.

Keywords: evaluation capacity building, effectiveness and training, monitoring and evaluation (M&E) short course training, perceptions and expectations

Procedia PDF Downloads 114
29075 Present and Future of Micromobility in the City of Medellin

Authors: Saul Emilio Rivero Mejia, Estefanya Marin Tabares, Carlos Andres Rodriguez Toro, Katherine Bolano Restrepo, Sarita Santa Cortes

Abstract:

Medellin is the Colombian city with the best public transportation system in the country, composed of two subway lines, five metro cables, two Bus Rapid Transit lines, and a streetcar. Despite this, the Aburra Valley, the area in which the city is located, has fewer built urban roads per inhabitant than the national average. In addition, since there is approximately one vehicle for every three inhabitants in Medellin, the problems of congestion and environmental pollution have become more acute over the years, and it has even been necessary to implement permanent restrictions on the use of private vehicles. Given the limitations of physical space and the low public investment in road infrastructure, it is necessary to opt for mobility alternatives suited to these conditions. One of the options for the city is what is known as micromobility. Micromobility is understood as small, light means of transport that use electrical energy, such as skateboards and bicycles, and are used to travel short distances. These transport alternatives have a high potential for use by the city's young population, but this requires adequate infrastructure and state regulation. Taking this into account, this paper analyzes the current state and future of micromobility in the city of Medellin through a prospective analysis supported by a PEST (political, economic, social and technological) analysis. On this basis, it is expected to identify the growth in demand for these alternative means of transport and its impact on the mobility of the city in the short and medium term.

Keywords: electric, micromobility, transport, sustainable

Procedia PDF Downloads 104
29074 An Unexpected Helping Hand: Consequences of Redistribution on Personal Ideology

Authors: Simon B.A. Egli, Katja Rost

Abstract:

Literature on redistributive preferences has proliferated in past decades. A core assumption behind it is that variation in redistributive preferences can explain different levels of redistribution. In contrast, this paper considers the reverse. What if it is redistribution that changes redistributive preferences? The core assumption behind the argument is that if self-interest - which we label concrete preferences - and ideology - which we label abstract preferences - come into conflict, the former will prevail and lead to an adjustment of the latter. To test the hypothesis, data from a survey conducted in Switzerland during the first wave of the COVID-19 crisis is used. A significant portion of the workforce at the time unexpectedly received state money through the short-time working program. Short-time work was used as a proxy for self-interest and was tested (1) on the support given to hypothetical, ailing firms during the crisis and (2) on the prioritization of justice principles guiding state action. In a first step, several models using OLS-regressions on political orientation were estimated to test our hypothesis as well as to check for non-linear effects. We expected support for ailing firms to be the same regardless of ideology but only for people on short-time work. The results both confirm our hypothesis and suggest a non-linear effect. Far-right individuals on short-time work were disproportionally supportive compared to moderate ones. In a second step, ordered logit models were estimated to test the impact of short-time work and political orientation on the rankings of the distributive justice principles need, performance, entitlement, and equality. The results show that being on short-time work significantly alters the prioritization of justice principles. Right-wing individuals are much more likely to prioritize need and equality over performance and entitlement when they receive government assistance. 
No such effect is found among left-wing individuals. In conclusion, we provide moderate to strong evidence that unexpectedly finding oneself at the receiving end changes redistributive preferences if personal ideology is antithetical to redistribution. The implications of our findings on the study of populism, personal ideologies, and political change are discussed.

Keywords: COVID-19, ideology, redistribution, redistributive preferences, self-interest

Procedia PDF Downloads 124
29073 Intentionality and Context in the Paradox of Reward and Punishment in the Meccan Surahs

Authors: Asmaa Fathy Mohamed Desoky

Abstract:

The subject of this research is the inference of intentionality and context from the verses of the Meccan surahs that contain the paradox of reward and punishment, applied to the duality of disbelief and faith. The Holy Quran is the most important sacred linguistic reference in the Arabic language because, in addition to being a linguistic miracle, it is rich in all the rules of the language. The Quranic text is a first-class intentional text, sent down to convey something to the recipient (Muhammad first, who then communicates it to Muslims) and to influence and convince him, which opens the door to many forms of Ijtihad, out of a desire to reach the will of Allah and the intention behind His words. Intentionality is one of the most important deliberative terms, but it will be modified to suit the Quranic discourse, especially since intentionality is related to intention (as noted earlier); that is, it turns the reader or recipient into a predictor of the unseen, which does not correspond to the Quranic discourse. Hence, in this research, a set of dualities will be identified and studied in order to clarify their meaning according to the opinions of previous interpreters and in accordance with the sanctity of the Quranic discourse, which is intentionally related to the dualities of reward and punishment, such as the duality of disbelief and faith. This duality combines opposites and paradox on one level, because it may be an external paradox between action and reaction, an internal paradox in matters related to faith, or a situational paradox in a specific event or a certain fact. It should be noted that the intention of the Quranic text is fully realized in form and content, in whole and in part, and this research includes a presentation of some applied models of the issues of intention and context that appear in the verses of the paradox of reward and punishment in the Meccan surahs of the Quran.

Keywords: intentionality, context, the paradox, reward, punishment, Meccan surahs

Procedia PDF Downloads 47
29072 Switching to the Latin Alphabet in Kazakhstan: A Brief Overview of Character Recognition Methods

Authors: Ainagul Yermekova, Liudmila Goncharenko, Ali Baghirzade, Sergey Sybachin

Abstract:

In this article, we address the problem of Kazakhstan's transition to the Latin alphabet. The transition process started in 2017 and is scheduled to be completed in 2025. In connection with these events, the problem of recognizing the characters of the new alphabet is raised. Well-known character recognition programs such as ABBYY FineReader, FormReader, MyScript Stylus did not recognize specific Kazakh letters that were used in Cyrillic. The author tries to give an assessment of the well-known method of character recognition that could be in demand as part of the country's transition to the Latin alphabet. Three methods of character recognition: template, structured, and feature-based, are considered through the algorithms of operation. At the end of the article, a general conclusion is made about the possibility of applying a certain method to a particular recognition process: for example, in the process of population census, recognition of typographic text in Latin, or recognition of photos of car numbers, store signs, etc.

Keywords: text detection, template method, recognition algorithm, structured method, feature method

Procedia PDF Downloads 162
29071 An Interactive Online Academic Writing Resource for Research Students in Engineering

Authors: Eleanor K. P. Kwan

Abstract:

English academic writing, it has been argued, is an acquired language even for English speakers. For research students whose English is not their first language, however, the acquisition process is often more challenging. Instead of hoping that students would acquire the conventions themselves through extensive reading, there is a need for the explicit teaching of linguistic conventions in academic writing, as explicit teaching could help students to be more aware of the different generic conventions in different disciplines in science. This paper presents an interuniversity effort to develop an online academic writing resource for research students in five subdisciplines in engineering, upon the completion of the needs analysis which indicates that students and faculty members are more concerned about students’ ability to organize an extended text than about grammatical accuracy per se. In particular, this paper focuses on the materials developed for thesis writing (also called dissertation writing in some tertiary institutions), as theses form an essential graduation requirement for all research students and this genre is also expected to demonstrate the writer’s competence in research and contributions to the research community. Drawing on Swalesian move analysis of research articles, this online resource includes authentic materials written by students and faculty members from the participating institutes. Highlight will be given to several aspects and challenges of developing this online resource. First, as the online resource aims at moving beyond providing instructions on academic writing, a range of interactive activities need to be designed to engage the users, which is one feature which differentiates this online resource from other equally informative websites on academic writing. Second, it will also include discussion on divergent textual practices in different subdisciplines, which help to illustrate different practices among these subdisciplines. 
Third, since theses, probably among the most extended texts a research student will complete, require effective use of signposting devices to facilitate readers’ understanding, this online resource will also provide both explanations and activities on the different components that contribute to text coherence. Finally, results from piloting will also be included to shed light on the effectiveness of the materials, which could be useful for future development.

Keywords: academic writing, English for academic purposes, online language learning materials, scientific writing

Procedia PDF Downloads 247
29070 Unsupervised Domain Adaptive Text Retrieval with Query Generation

Authors: Rui Yin, Haojie Wang, Xun Li

Abstract:

Recently, mainstream dense retrieval methods have obtained state-of-the-art results on some datasets and tasks. However, they require large amounts of training data, which is not available in most domains. The severe performance degradation of dense retrievers on new data domains has limited the use of dense retrieval methods to only a few domains with large training datasets. In this paper, we propose an unsupervised domain-adaptive approach based on query generation. First, a generative model is used to generate relevant queries for each passage in the target corpus, and then the generated queries are used for mining negative passages. Finally, the query-passage pairs are labeled with a cross-encoder and used to train a domain-adapted dense retriever. Experiments show that our approach is more robust than previous methods in target domains that require less unlabeled data.
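
The negative-mining step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the token-overlap score stands in for a real retriever, and the function names, corpus, and query are hypothetical.

```python
from typing import Dict, List


def overlap_score(query: str, passage: str) -> float:
    """Toy relevance score: share of query tokens that also occur in the passage.
    This is an assumed stand-in for a real retriever score."""
    query_tokens = set(query.lower().split())
    passage_tokens = set(passage.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & passage_tokens) / len(query_tokens)


def mine_negatives(query: str, positive_id: str, corpus: Dict[str, str], k: int = 2) -> List[str]:
    """Return the k highest-scoring passages other than the passage the query was generated from."""
    candidates = [
        (overlap_score(query, text), passage_id)
        for passage_id, text in corpus.items()
        if passage_id != positive_id
    ]
    # Highest score first; ties are broken by passage id so the result is deterministic.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [passage_id for _, passage_id in candidates[:k]]


# Hypothetical target-domain corpus and a query assumed to be generated for passage "p1".
corpus = {
    "p1": "aspirin reduces fever and mild pain",
    "p2": "ibuprofen reduces pain and inflammation",
    "p3": "the stock market closed higher today",
    "p4": "fever in children is often treated with paracetamol",
}
generated_query = "what reduces fever and pain"
negatives = mine_negatives(generated_query, "p1", corpus, k=2)
```

In the pipeline the abstract outlines, the resulting query-passage pairs would then be scored by a cross-encoder before being used to train the dense retriever; that step is omitted here.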

Keywords: dense retrieval, query generation, unsupervised training, text retrieval

Procedia PDF Downloads 46
29069 Prediction of Sepsis Illness from Patients Vital Signs Using Long Short-Term Memory Network and Dynamic Analysis

Authors: Marcio Freire Cruz, Naoaki Ono, Shigehiko Kanaya, Carlos Arthur Mattos Teixeira Cavalcante

Abstract:

The systems that record patient care information, known as Electronic Medical Records (EMRs), and those that monitor patients’ vital signs, such as heart rate, body temperature, and blood pressure, have been extremely valuable for the effectiveness of patient treatment. Several studies have used data from EMRs and patients’ vital signs to predict illnesses. Among them, we highlight those that aim to predict, classify, or at least identify patterns of sepsis in patients under vital signs monitoring. Sepsis is an organ dysfunction caused by a dysregulated patient response to an infection, and it affects millions of people worldwide. Early detection of sepsis is expected to provide a significant improvement in its treatment. Previous works usually combined medical, statistical, mathematical, and computational models to develop methods for early prediction, aiming for higher accuracy with the smallest number of variables. Among other techniques, studies using survival analysis, expert systems, machine learning, and deep learning have reported strong results. In our research, patients are modeled as points moving each hour in an n-dimensional space, where n is the number of vital signs (variables). These points can reach a sepsis target point after some time. For now, the sepsis target point is calculated using the median of all patients’ variables at sepsis onset. From these points, we calculate for each hour the position vector, the first derivative (velocity vector), and the second derivative (acceleration vector) of the variables to evaluate their behavior. We then construct a prediction model based on a Long Short-Term Memory (LSTM) network, including these derivatives as explanatory variables. The accuracy of the prediction 6 hours before the time of sepsis reached 83.24% when considering only the vital signs; by including the position, velocity, and acceleration vectors, we obtained 94.96%.
The data are being collected from the Medical Information Mart for Intensive Care (MIMIC) database, a public database that contains vital signs, laboratory test results, observations, notes, and so on, from more than 60,000 patients.
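
As a rough illustration of the position, velocity, and acceleration vectors described in this abstract, the sketch below derives them from hourly readings with simple backward differences. The differencing scheme, the function names, and the sample values are assumptions made for illustration; the abstract does not state how the derivatives were computed.

```python
from typing import List

Vector = List[float]


def differences(series: List[Vector]) -> List[Vector]:
    """Hour-to-hour change of every variable (backward difference, step of one hour)."""
    return [
        [current - previous for previous, current in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(series, series[1:])
    ]


def kinematic_features(positions: List[Vector]) -> List[Vector]:
    """Concatenate position, velocity, and acceleration for each hour that has both derivatives."""
    velocity = differences(positions)      # one row fewer than positions
    acceleration = differences(velocity)   # two rows fewer than positions
    features = []
    for hour in range(2, len(positions)):
        features.append(positions[hour] + velocity[hour - 1] + acceleration[hour - 2])
    return features


# Hypothetical hourly readings for one patient: [heart rate, systolic blood pressure].
hourly = [[80.0, 120.0], [84.0, 118.0], [90.0, 113.0], [99.0, 105.0]]
features = kinematic_features(hourly)
```

Each resulting row could serve as one time step of the input sequence to a recurrent model such as an LSTM.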

Keywords: dynamic analysis, long short-term memory, prediction, sepsis

Procedia PDF Downloads 101
29068 A Method for Clinical Concept Extraction from Medical Text

Authors: Moshe Wasserblat, Jonathan Mamou, Oren Pereg

Abstract:

Natural Language Processing (NLP) has made a major leap in the last few years in its practical integration into medical solutions, for example, extracting clinical concepts such as medical conditions, medications, treatments, and symptoms from medical texts. However, training and deploying those models in real environments still demands a large amount of annotated data and NLP/Machine Learning (ML) expertise, which makes this process costly and time-consuming. We present a practical and efficient method for clinical concept extraction that requires neither costly labeled data nor ML expertise. The method includes three steps. Step 1: the user supplies a large in-domain text corpus (e.g., PubMed). Then, the system builds a contextual model containing vector representations of concepts in the corpus in an unsupervised manner (e.g., Phrase2Vec). Step 2: the user provides a seed set of terms representing a specific medical concept (e.g., for the concept of symptoms, the user may provide ‘dry mouth,’ ‘itchy skin,’ and ‘blurred vision’). Then, the system matches the seed set against the contextual model and extracts the most semantically similar terms (e.g., additional symptoms). The result is a complete set of terms related to the medical concept. Step 3: in production, medical concepts need to be extracted from unseen medical text. The system extracts key-phrases from the new text, then matches them against the complete set of terms from step 2, and the most semantically similar are annotated with the same medical concept category. As an example, the seed symptom concepts would result in the following annotation: “The patient complains of fatigue [symptom], dry skin [symptom], and weight loss [symptom], which can be an early sign of diabetes.” Our evaluations show promising results for extracting concepts from medical corpora.
The method allows medical analysts to easily and efficiently build taxonomies (in step 2) representing their domain-specific concepts, and automatically annotate a large number of texts (in step 3) for classification/summarization of medical reports.
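
The seed-set expansion of step 2 can be sketched as a nearest-neighbour search in the embedding space. The example below is only an illustration under stated assumptions: the two-dimensional vectors are made up, comparing each term with the centroid of the seed vectors is one possible matching rule (the abstract does not specify one), and the threshold is arbitrary.

```python
import math
from typing import Dict, List, Set


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_seed_set(embeddings: Dict[str, List[float]], seeds: List[str], threshold: float) -> Set[str]:
    """Add every term whose vector is close enough to the centroid of the seed vectors."""
    seed_vectors = [embeddings[s] for s in seeds if s in embeddings]
    if not seed_vectors:
        raise ValueError("none of the seed terms has an embedding")
    centroid = [sum(column) / len(seed_vectors) for column in zip(*seed_vectors)]
    expanded = set(seeds)
    for term, vector in embeddings.items():
        if cosine(vector, centroid) >= threshold:
            expanded.add(term)
    return expanded


# Hypothetical embeddings; a real model would have hundreds of dimensions.
embeddings = {
    "dry mouth": [0.90, 0.10],
    "itchy skin": [0.80, 0.20],
    "blurred vision": [0.85, 0.15],
    "fatigue": [0.88, 0.12],
    "weight loss": [0.80, 0.25],
    "metformin": [0.10, 0.90],
    "insulin": [0.15, 0.95],
}
symptoms = expand_seed_set(embeddings, ["dry mouth", "itchy skin", "blurred vision"], threshold=0.95)
```

With these toy vectors, the symptom-like terms fall inside the threshold while the medication terms do not; in practice the threshold would need tuning on the target corpus.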

Keywords: clinical concepts, concept expansion, medical records annotation, medical records summarization

Procedia PDF Downloads 116
29067 Molecular Dynamics Simulation for Vibration Analysis at Nanocomposite Plates

Authors: Babak Safaei, A. M. Fattahi

Abstract:

Polymer/carbon nanotube nanocomposites have a wide range of promising applications due to their enhanced properties. In this work, free vibration analysis of single-walled carbon nanotube-reinforced composite plates is conducted, in which carbon nanotubes are embedded in an amorphous polyethylene. The rule of mixture based on various types of plate models, namely classical plate theory (CLPT), first-order shear deformation theory (FSDT), and higher-order shear deformation theory (HSDT), was employed to obtain the fundamental frequencies of the nanocomposite plates. The generalized differential quadrature (GDQ) method was used to discretize the governing differential equations along with the simply supported and clamped boundary conditions. The material properties of the nanocomposite plates were evaluated using molecular dynamics (MD) simulation corresponding to both short-(10,10) SWCNT and long-(10,10) SWCNT composites. Then the results obtained directly from MD simulations were fitted with those calculated by the rule of mixture to extract appropriate values of carbon nanotube efficiency parameters accounting for the scale-dependent material properties. The selected numerical results are presented to address the influences of nanotube volume fraction and edge supports on the value of the fundamental frequency of carbon nanotube-reinforced composite plates corresponding to both long- and short-nanotube composites.

Keywords: nanocomposites, molecular dynamics simulation, free vibration, generalized differential quadrature (GDQ) method

Procedia PDF Downloads 306
29066 Aggregate Supply Response of Some Livestock Commodities in Algeria: Cointegration- Vector Error Correction Model Approach

Authors: Amine M. Benmehaia, Amine Oulmane

Abstract:

The supply response of agricultural commodities to changes in price incentives is an important issue for the success of any policy reform in the agricultural sector. This study aims to quantify the responsiveness of producers of some livestock commodities to price incentives in the Algerian context. Time series analysis is used on annual data for a period of 52 years (1966-2018). Both co-integration and the vector error correction model (VECM) are used through the Nerlove model of partial adjustment. The study attempts to determine the long-run and short-run relationships along with the magnitudes of disequilibria in the selected commodities. Results show that the short-run price elasticities are low in the cow and sheep meat sectors (8.7 and 8% respectively), while their respective long-run elasticities are 16.5 and 10.5, whereas eggs and milk have very high short-run price elasticities (82 and 90% respectively) with long-run elasticities of 40 and 46 respectively. The error correction coefficient, reflecting the speed of adjustment towards the long-run equilibrium, is statistically significant and has the expected negative sign. Its estimates are 12.7 for cow meat, 33.5 for sheep meat, 46.7 for eggs and 8.4 for milk. It seems that cow meat and milk producers have a weak feedback of about 12.7% and 8.4% respectively of the previous year's disequilibrium from the long-run price elasticity, whereas sheep meat and egg producers adjust to correct long-run disequilibrium with a high speed of adjustment (33.5% and 46.7% respectively). The implication is that much more in-depth research is needed to identify those factors that affect agricultural supply and to describe the effect of factors that shift supply in response to price incentives. This could provide valuable information for the government in the use of appropriate policy measures.
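
To make the reported speeds of adjustment more concrete, the sketch below converts each error correction estimate into the share of an initial disequilibrium that remains after a number of years and into a half-life. It assumes a simple geometric adjustment in which the stated percentage of the previous year's gap is closed every year and no new shocks occur; this is an illustrative reading of the coefficients, not a calculation taken from the paper.

```python
import math
from typing import Dict

# Speeds of adjustment reported in the abstract, expressed as fractions per year.
adjustment_speed: Dict[str, float] = {
    "cow meat": 0.127,
    "sheep meat": 0.335,
    "eggs": 0.467,
    "milk": 0.084,
}


def remaining_gap(speed: float, years: int) -> float:
    """Share of the initial disequilibrium still present after the given number of years."""
    return (1.0 - speed) ** years


def half_life(speed: float) -> float:
    """Years needed to close half of the initial disequilibrium under geometric adjustment."""
    return math.log(0.5) / math.log(1.0 - speed)


half_lives = {commodity: half_life(speed) for commodity, speed in adjustment_speed.items()}
```

Under this assumption, egg producers close half of a gap in roughly one year, while milk producers need several years, which matches the contrast between fast and weak feedback drawn in the abstract.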

Keywords: Algeria, cointegration, livestock, supply response, vector error correction model

Procedia PDF Downloads 112
29065 Data Quality Enhancement with String Length Distribution

Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda

Abstract:

Recently, the volume of collectable manufacturing data has been increasing rapidly. At the same time, mega recalls are becoming a serious social problem. Under such circumstances, there is an increasing need to prevent mega recalls through defect analysis, such as root cause analysis and anomaly detection, utilizing manufacturing data. However, the time needed to classify strings in manufacturing data with traditional methods is too long to meet the requirement of quick defect analysis. Therefore, we present the String Length Distribution Classification (SLDC) method to correctly classify strings in a short time. This method learns character features, especially the string length distribution, from Product IDs and Machine IDs in the BOM and asset list. By applying the proposed method to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, it can be estimated that the requirement of quick defect analysis can be fulfilled.
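
The idea of classifying strings by their length distribution can be sketched as follows. This is a simplified illustration, not the SLDC method itself: it uses string length as the only feature, applies additive smoothing with an assumed maximum length, and the identifiers and labels are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def fit_length_distributions(labeled: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count, for every label, how often each string length occurs in the training strings."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for text, label in labeled:
        counts[label][len(text)] += 1
    return counts


def classify(text: str, counts: Dict[str, Counter], smoothing: float = 1.0, max_length: int = 64) -> str:
    """Pick the label under which the length of the string is most probable."""
    best_label, best_score = "", -math.inf
    for label, histogram in counts.items():
        total = sum(histogram.values())
        # Additive smoothing keeps unseen lengths from getting zero probability.
        probability = (histogram[len(text)] + smoothing) / (total + smoothing * max_length)
        score = math.log(probability)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training strings, e.g. taken from a BOM and an asset list.
training = [
    ("PRD-000123", "product_id"),
    ("PRD-000456", "product_id"),
    ("PRD-001789", "product_id"),
    ("MC-017", "machine_id"),
    ("MC-204", "machine_id"),
    ("MC-9", "machine_id"),
]
model = fit_length_distributions(training)
```

Because only the length of each string is inspected, classification costs a single histogram lookup per label, which is in the spirit of the speed-up the abstract reports.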

Keywords: string classification, data quality, feature selection, probability distribution, string length

Procedia PDF Downloads 301
29064 Transcriptome and Metabolome Analysis of a Tomato Solanum Lycopersicum STAYGREEN1 Null Line Generated Using Clustered Regularly Interspaced Short Palindromic Repeats/Cas9 Technology

Authors: Jin Young Kim, Kwon Kyoo Kang

Abstract:

The SGR1 (STAYGREEN1) protein is a critical regulator of chlorophyll degradation and senescence in plant leaves. The functions and mechanisms of tomato SGR1 action are poorly understood and worthy of further investigation. To investigate the function of the SGR1 gene, we generated an SGR1-knockout (KO) null line via clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-mediated gene editing and conducted RNA sequencing and gas chromatography tandem mass spectrometry (GC-MS/MS) analysis to identify the differentially expressed genes. The SlSGR1 (Solanum lycopersicum SGR1) knockout null line clearly showed a turbid brown color with significantly higher chlorophyll and carotenoid content compared to wild-type (WT) fruit. Differential gene expression analysis revealed 728 differentially expressed genes (DEGs) between the WT and the sgr1 #1-6 line, including 263 downregulated and 465 upregulated genes, for which the fold change was >2 and the adjusted p-value was <0.05. Most of the DEGs were related to photosynthesis and chloroplast function. In addition, the pigment and carotenoid changes in the sgr1 #1-6 line were accompanied by the accumulation of key primary metabolites such as sucrose and its derivatives (fructose, galactinol, raffinose), glycolytic intermediates (glucose, G6P, Fru6P) and tricarboxylic acid cycle (TCA) intermediates (malate and fumarate). Taken together, the transcriptome and metabolite profiles of SGR1-KO lines presented here provide evidence for the mechanisms underlying the effects of SGR1 and the molecular pathways involved in chlorophyll degradation and carotenoid biosynthesis.
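
The DEG selection criteria quoted above (fold change >2, adjusted p-value <0.05) can be written as a small filter. The sketch assumes that downregulation is judged symmetrically, that is, a fold change below 1/2; the abstract does not spell this out, and the gene names and values are hypothetical.

```python
from typing import Dict, Tuple


def classify_gene(fold_change: float, adjusted_p: float, fc_cutoff: float = 2.0, alpha: float = 0.05) -> str:
    """Label a gene as 'up', 'down', or 'not significant' from its KO/WT fold change."""
    if adjusted_p >= alpha:
        return "not significant"
    if fold_change > fc_cutoff:
        return "up"
    if fold_change < 1.0 / fc_cutoff:  # assumed symmetric cutoff for downregulation
        return "down"
    return "not significant"


# Hypothetical results: gene name -> (fold change KO vs WT, adjusted p-value).
results: Dict[str, Tuple[float, float]] = {
    "gene_a": (3.5, 0.001),
    "gene_b": (0.3, 0.020),
    "gene_c": (2.6, 0.200),
    "gene_d": (1.4, 0.001),
}
labels = {gene: classify_gene(fc, p) for gene, (fc, p) in results.items()}
```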

Keywords: tomato, CRISPR/Cas9, null line, RNA-sequencing, metabolite profiling

Procedia PDF Downloads 94
29063 Determinants of Inward Foreign Direct Investment: New Evidence from Bangladesh

Authors: Mohammad Maruf Hasan

Abstract:

Foreign Direct Investment (FDI) has increased remarkably around the globe, with emerging economies receiving more FDI than industrialized economies. This study aims to examine the determinants of inward FDI flows in Bangladesh. To estimate the long- and short-run impact of the FDI determinants for 1996-2020, we employed the Autoregressive Distributed Lag (ARDL) model. Results show that: (1) macroeconomic determinants, such as economic growth, infrastructure, and market size, have a significant and strong positive effect. (2) Inflation and the exchange rate show insignificant effects, while trade openness has mixed effects on FDI inflows (negative in the short run, positive in the long run). (3) Among current institutional determinants, the rule of law has a positive but statistically insignificant effect on FDI inflows, political stability has a negative effect, and the rule of law has a considerable beneficial impact on FDI inflows. (4) The macroeconomic factors have been determined to impact Bangladesh's FDI inflows. Finally, as this study confirms, a stable macroeconomic climate is more effective at attracting FDI. From a policy perspective, this study will help the government and policymakers to formulate a new investment policy.

Keywords: determinants, FDI, ARDL, Bangladesh

Procedia PDF Downloads 57
29062 The Translation of Code-Switching in African Literature: Comparing the Two German Translations of Ngugi Wa Thiongo’s "Petals of Blood"

Authors: Omotayo Olalere

Abstract:

The relevance of code-switching for intercultural communication through literary translation cannot be overemphasized. The translation of code-switching and its implications for translation studies have been studied in the context of African literature. In these cases, code-switching was examined in the more general terms of its usage in the source text and not particularly in Ngugi’s novels and their translations. In addition, the functions of translation and code-switching in the lyrics of some popular African songs have been studied, but this work relates more to oral performance than to written literature. As such, little has been done on the German translation of code-switching in African works. This study intends to fill this lacuna by examining the concept of code-switching in the German translations of Ngugi’s Petals of Blood. The aim is to highlight the significance of code-switching as a phenomenon in this African (Ngugi’s) novel written in English and to focus on its representation in the two German translations. The target texts to be used are Verbrannte Blueten and Land der flammenden Blueten. “Abrogation” as a concept will play an important role in the analysis of the data. Findings will show that the ideology of a translator plays a huge role in representing the concept of “abrogation” in the translation of code-switching in the selected source text. The study will contribute to knowledge in translation studies by bringing to light the need to foreground aspects of language contact in translation theory and practice, particularly in the African context. Relevant translation theories adopted for the study include Bandia’s (2008) postcolonial theory of translation and Snell-Hornby’s (1988) cultural translation theory.

Keywords: code-switching, German translation, Ngugi wa Thiong’o, Petals of Blood

Procedia PDF Downloads 62
29061 The Paralinguistic Function of Emojis in Twitter Communication

Authors: Yasmin Tantawi, Mary Beth Rosson

Abstract:

In response to the dearth of information about emoji use for different purposes in different settings, this paper investigates the paralinguistic function of emojis within Twitter communication in the United States. To conduct this investigation, the Twitter feeds from 16 population centers spread throughout the United States were collected from the Twitter public API. One hundred tweets were collected from each population center, totaling 1,600 tweets. Tweets containing emojis were next extracted using the “emot” Python package; these were then analyzed via the IBM Watson API Natural Language Understanding module to identify the topics discussed. A manual content analysis was then conducted to ascertain the paralinguistic and emotional features of the emojis used in these tweets. We present our characterization of emoji usage in Twitter and discuss implications for the design of Twitter and other text-based communication tools.
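
The filtering step, keeping only tweets that contain emojis, can be approximated without any third-party package. The sketch below is a stand-in for the “emot” package named in the abstract, whose API is not reproduced here: it matches a few common emoji code point ranges with a regular expression, which is only an approximation of full emoji detection, and the sample tweets are made up.

```python
import re
from typing import List

# Approximate emoji ranges (an assumption, not an exhaustive definition):
# U+1F300-U+1FAFF covers most pictographs and emoticons, U+2600-U+27BF covers
# miscellaneous symbols and dingbats.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def tweets_with_emojis(tweets: List[str]) -> List[str]:
    """Keep only the tweets that contain at least one character in the emoji ranges."""
    return [tweet for tweet in tweets if EMOJI_PATTERN.search(tweet)]


# Hypothetical sample; the escapes are a laughing face and a sun symbol.
sample = [
    "Great game tonight \U0001F602",
    "Traffic on the highway again",
    "Sunny day \u2600",
]
with_emojis = tweets_with_emojis(sample)
```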

Keywords: computer-mediated communication, content analysis, paralinguistics, sociology

Procedia PDF Downloads 147
29060 An End-to-end Piping and Instrumentation Diagram Information Recognition System

Authors: Taekyong Lee, Joon-Young Kim, Jae-Min Cha

Abstract:

A piping and instrumentation diagram (P&ID) is an essential design drawing describing the interconnection of process equipment and the instrumentation installed to control the process. P&IDs are modified and managed throughout the whole life cycle of a process plant. For ease of data transfer, P&IDs are generally handed over from a design company to an engineering company in portable document format (PDF), which is hard to modify. Therefore, engineering companies have to spend a great deal of time and human resources solely on manually converting P&ID images into a computer-aided design (CAD) file format. To reduce the inefficiency of the P&ID conversion, the various symbols and texts in P&ID images should be automatically recognized. However, recognizing information in P&ID images is not an easy task. A P&ID image usually contains hundreds of symbol and text objects. Most objects are small compared to the size of the whole image and are densely packed together. Traditional recognition methods based on geometrical features are not capable of recognizing every element of a P&ID image. To overcome these difficulties, two state-of-the-art deep learning models, RetinaNet and the connectionist text proposal network (CTPN), were used to build a system for recognizing symbols and texts in a P&ID image. Using the RetinaNet and CTPN models, carefully modified and tuned for a P&ID image dataset, the developed system recognizes texts, equipment symbols, piping symbols and instrumentation symbols in an input P&ID image and saves the recognition results in a pre-defined extensible markup language format. In a test using a commercial P&ID image, the P&ID information recognition system correctly recognized 97% of the symbols and 81.4% of the texts.

Keywords: object recognition system, P&ID, symbol recognition, text recognition

Procedia PDF Downloads 129
29059 Solar Cell Degradation by Electron Irradiation Effect of Irradiation Fluence

Authors: H. Mazouz, A. Belghachi, F. Hadjaj

Abstract:

Solar cells used in orbit are exposed to a radiation environment consisting mainly of protons and high-energy electrons. These particles degrade the output parameters of the solar cell. The aim of this work is to characterize, by numerical simulation, the effects of electron irradiation fluence on the J(V) characteristic and output parameters of a GaAs solar cell. The results obtained demonstrate that the electron irradiation-induced degradation of cell performance mainly concerns the short-circuit current.

Keywords: GaAs solar cell, MeV electron irradiation, irradiation fluence, short circuit

Procedia PDF Downloads 439