Search results for: protein language processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9266

Search results for: protein language processing

9266 Selection of Pichia kudriavzevii Strain for the Production of Single-Cell Protein from Cassava Processing Waste

Authors: Phakamas Rachamontree, Theerawut Phusantisampan, Natthakorn Woravutthikul, Peerapong Pornwongthong, Malinee Sriariyanun

Abstract:

A total of 115 yeast strains isolated from local cassava processing wastes were measured for crude protein content. Among these strains, the strain MSY-2 possessed the highest protein concentration (>3.5 mg protein/mL). By using molecular identification tools, it was identified to be a strain of Pichia kudriavzevii based on similarity of D1/D2 domain of 26S rDNA region. In this study, to optimize the protein production by MSY-2 strain, Response Surface Methodology (RSM) was applied. The tested parameters were the carbon content, nitrogen content, and incubation time. Here, the value of regression coefficient (R2) = 0.7194 could be explained by the model, which is high to support the significance of the model. Under the optimal condition, the protein content was produced up to 3.77 g per L of the culture and MSY-2 strain contain 66.8 g protein per 100 g of cell dry weight. These results revealed the plausibility of applying the novel strain of yeast in single-cell protein production.

Keywords: single cell protein, response surface methodology, yeast, cassava processing waste

Procedia PDF Downloads 401
9265 Resource Creation Using Natural Language Processing Techniques for Malay Translated Qur'an

Authors: Nor Diana Ahmad, Eric Atwell, Brandon Bennett

Abstract:

Text processing techniques for English have been developed for several decades. But for the Malay language, text processing methods are still far behind. Moreover, there are limited resources, tools for computational linguistic analysis available for the Malay language. Therefore, this research presents the use of natural language processing (NLP) in processing Malay translated Qur’an text. As the result, a new language resource for Malay translated Qur’an was created. This resource will help other researchers to build the necessary processing tools for the Malay language. This research also develops a simple question-answer prototype to demonstrate the use of the Malay Qur’an resource for text processing. This prototype has been developed using Python. The prototype pre-processes the Malay Qur’an and an input query using a stemming algorithm and then searches for occurrences of the query word stem. The result produced shows improved matching likelihood between user query and its answer. A POS-tagging algorithm has also been produced. The stemming and tagging algorithms can be used as tools for research related to other Malay texts and can be used to support applications such as information retrieval, question answering systems, ontology-based search and other text analysis tasks.

Keywords: language resource, Malay translated Qur'an, natural language processing (NLP), text processing

Procedia PDF Downloads 316
9264 BiFormerDTA: Structural Embedding of Protein in Drug Target Affinity Prediction Using BiFormer

Authors: Leila Baghaarabani, Parvin Razzaghi, Mennatolla Magdy Mostafa, Ahmad Albaqsami, Al Warith Al Rushaidi, Masoud Al Rawahi

Abstract:

Predicting the interaction between drugs and their molecular targets is pivotal for advancing drug development processes. Due to the time and cost limitations, computational approaches have emerged as an effective approach to drug-target interaction (DTI) prediction. Most of the introduced computational based approaches utilize the drug molecule and protein sequence as input. This study does not only utilize these inputs, it also introduces a protein representation developed using a masked protein language model. In this representation, for every individual amino acid residue within the protein sequence, there exists a corresponding probability distribution that indicates the likelihood of each amino acid being present at that particular position. Then, the similarity between each pair of amino acids is computed to create a similarity matrix. To encode the knowledge of the similarity matrix, Bi-Level Routing Attention (BiFormer) is utilized, which combines aspects of transformer-based models with protein sequence analysis and represents a significant advancement in the field of drug-protein interaction prediction. BiFormer has the ability to pinpoint the most effective regions of the protein sequence that are responsible for facilitating interactions between the protein and drugs, thereby enhancing the understanding of these critical interactions. Thus, it appears promising in its ability to capture the local structural relationship of the proteins by enhancing the understanding of how it contributes to drugprotein interactions, thereby facilitating more accurate predictions. To evaluate the proposed method, it was tested on two widely recognized datasets: Davis and KIBA. A comprehensive series of experiments was conducted to illustrate its effectiveness in comparison to cutting edge techniques.

Keywords: BiFormer, transformer, protein language processing, self-attention mechanism, binding affinity, drug target interaction, similarity matrix, protein masked representation, protein language model

Procedia PDF Downloads 0
9263 Role of Natural Language Processing in Information Retrieval; Challenges and Opportunities

Authors: Khaled M. Alhawiti

Abstract:

This paper aims to analyze the role of natural language processing (NLP). The paper will discuss the role in the context of automated data retrieval, automated question answer, and text structuring. NLP techniques are gaining wider acceptance in real life applications and industrial concerns. There are various complexities involved in processing the text of natural language that could satisfy the need of decision makers. This paper begins with the description of the qualities of NLP practices. The paper then focuses on the challenges in natural language processing. The paper also discusses major techniques of NLP. The last section describes opportunities and challenges for future research.

Keywords: data retrieval, information retrieval, natural language processing, text structuring

Procedia PDF Downloads 336
9262 Impact of Natural Language Processing in Educational Setting: An Effective Approach towards Improved Learning

Authors: Khaled M. Alhawiti

Abstract:

Natural Language Processing (NLP) is an effective approach for bringing improvement in educational setting. This involves initiating the process of learning through the natural acquisition in the educational systems. It is based on following effective approaches for providing the solution for various problems and issues in education. Natural Language Processing provides solution in a variety of different fields associated with the social and cultural context of language learning. It is based on involving various tools and techniques such as grammar, syntax, and structure of text. It is effective approach for teachers, students, authors, and educators for providing assistance for writing, analysis, and assessment procedure. Natural Language Processing is widely integrated in the large number of educational contexts such as research, science, linguistics, e-learning, evaluations system, and various other educational settings such as schools, higher education system, and universities. Natural Language Processing is based on applying scientific approach in the educational settings. In the educational settings, NLP is an effective approach to ensure that students can learn easily in the same way as they acquired language in the natural settings.

Keywords: natural language processing, education, application, e-learning, scientific studies, educational system

Procedia PDF Downloads 501
9261 Nutritional Potential and Functionality of Whey Powder Influenced by Different Processing Temperature and Storage

Authors: Zarmina Gillani, Nuzhat Huma, Aysha Sameen, Mulazim Hussain Bukhari

Abstract:

Whey is an excellent food ingredient owing to its high nutritive value and its functional properties. However, composition of whey varies depending on composition of milk, processing conditions, processing method, and its whey protein content. The aim of this study was to prepare a whey powder from raw whey and to determine the influence of different processing temperatures (160 and 180 °C) on the physicochemical, functional properties during storage of 180 days and on whey protein denaturation. Results have shown that temperature significantly (P < 0.05) affects the pH, acidity, non-protein nitrogen (NPN), protein total soluble solids, fat and lactose contents. Significantly (p < 0.05) higher foaming capacity (FC), foam stability (FS), whey protein nitrogen index (WPNI), and a lower turbidity and solubility index (SI) were observed in whey powder processed at 160 °C compared to whey powder processed at 180 °C. During storage of 180 days, slow but progressive changes were noticed on the physicochemical and functional properties of whey powder. Reverse phase-HPLC analysis revealed a significant (P < 0.05) effect of temperature on whey protein contents. Denaturation of β-Lactoglobulin is followed by α-lacalbumin, casein glycomacropeptide (CMP/GMP), and bovine serum albumin (BSA).

Keywords: whey powder, temperature, denaturation, reverse phase, HPLC

Procedia PDF Downloads 297
9260 A Review of Research on Pre-training Technology for Natural Language Processing

Authors: Moquan Gong

Abstract:

In recent years, with the rapid development of deep learning, pre-training technology for natural language processing has made great progress. The early field of natural language processing has long used word vector methods such as Word2Vec to encode text. These word vector methods can also be regarded as static pre-training techniques. However, this context-free text representation brings very limited improvement to subsequent natural language processing tasks and cannot solve the problem of word polysemy. ELMo proposes a context-sensitive text representation method that can effectively handle polysemy problems. Since then, pre-training language models such as GPT and BERT have been proposed one after another. Among them, the BERT model has significantly improved its performance on many typical downstream tasks, greatly promoting the technological development in the field of natural language processing, and has since entered the field of natural language processing. The era of dynamic pre-training technology. Since then, a large number of pre-trained language models based on BERT and XLNet have continued to emerge, and pre-training technology has become an indispensable mainstream technology in the field of natural language processing. This article first gives an overview of pre-training technology and its development history, and introduces in detail the classic pre-training technology in the field of natural language processing, including early static pre-training technology and classic dynamic pre-training technology; and then briefly sorts out a series of enlightening technologies. Pre-training technology, including improved models based on BERT and XLNet; on this basis, analyze the problems faced by current pre-training technology research; finally, look forward to the future development trend of pre-training technology.

Keywords: natural language processing, pre-training, language model, word vectors

Procedia PDF Downloads 55
9259 Natural Language Processing; the Future of Clinical Record Management

Authors: Khaled M. Alhawiti

Abstract:

This paper investigates the future of medicine and the use of Natural language processing. The importance of having correct clinical information available online is remarkable; improving patient care at affordable costs could be achieved using automated applications to use the online clinical information. The major challenge towards the retrieval of such vital information is to have it appropriately coded. Majority of the online patient reports are not found to be coded and not accessible as its recorded in natural language text. The use of Natural Language processing provides a feasible solution by retrieving and organizing clinical information, available in text and transforming clinical data that is available for use. Systems used in NLP are rather complex to construct, as they entail considerable knowledge, however significant development has been made. Newly formed NLP systems have been tested and have established performance that is promising and considered as practical clinical applications.

Keywords: clinical information, information retrieval, natural language processing, automated applications

Procedia PDF Downloads 402
9258 Effect of Different Processing Methods on the Quality Attributes of Pigeon Pea Used in Bread Production

Authors: B. F. Olanipekun, O. J. Oyelade, C. O. Osemobor

Abstract:

Pigeon pea is a very good source of protein and micronutrient, but it is being underutilized in Nigeria because of several constraints. This research considered the effect of different processing methods on the quality attributes of pigeon pea used in bread production towards enhancing its utility. Pigeon pea was obtained at a local market and processed into the flour using three processing methods: soaking, sprouting and roasting and were used to bake bread in different proportions. Chemical composition and sensory attributes of the breads were thereafter determined. The highest values of protein and ash contents were obtained from 20 % substitution of sprouted pigeon pea in wheat flour and may be attributable to complex biochemical changes occurring during hydration, to invariably lead to protein constituent being broken down. Hydrolytic activities of the enzymes from the sprouted sample resulted in improvement in the constituent of total protein probably due to reduction in the carbohydrate content. Sensory qualities analyses showed that bread produced with soaked and roasted pigeon pea flours at 5 and 10% inclusion, respectively were mostly accepted than other blends, and products with sprouted pigeon pea flour were least accepted. The findings of this research suggest that supplementing wheat flour with sprouted pigeon peas have more nutritional potentials. However, with sensory analysis indices, the soaked and roasted pigeon peas up to 10% are majorly accepted, and also can improve the nutritional status. Overall, this will be very beneficial to population dependent on plant protein in order to combat malnutrition problems.

Keywords: pigeon pea, processing, protein, malnutrition

Procedia PDF Downloads 248
9257 The Output Fallacy: An Investigation into Input, Noticing, and Learners’ Mechanisms

Authors: Samantha Rix

Abstract:

The purpose of this research paper is to investigate the cognitive processing of learners who receive input but produce very little or no output, and who, when they do produce output, exhibit a similar language proficiency as do those learners who produced output more regularly in the language classroom. Previous studies have investigated the benefits of output (with somewhat differing results); therefore, the presentation will begin with an investigation of what may underlie gains in proficiency without output. Consequently, a pilot study was designed and conducted to gain insight into the cognitive processing of low-output language learners looking, for example, at quantity and quality of noticing. This will be carried out within the paradigm of action classroom research, observing and interviewing low-output language learners in an intensive English program at a small Midwest university. The results of the pilot study indicated that autonomy in language learning, specifically utilizing strategies such self-monitoring, self-talk, and thinking 'out-loud', were crucial in the development of language proficiency for academic-level performance. The presentation concludes with an examination of pedagogical implication for classroom use in order to aide students in their language development.

Keywords: cognitive processing, language learners, language proficiency, learning strategies

Procedia PDF Downloads 473
9256 Understanding the Heart of the Matter: A Pedagogical Framework for Apprehending Successful Second Language Development

Authors: Cinthya Olivares Garita

Abstract:

Untangling language processing in second language development has been either a taken-for-granted and overlooked task for some English language teaching (ELT) instructors or a considerable feat for others. From the most traditional language instruction to the most communicative methodologies, how to assist L2 learners in processing language in the classroom has become a challenging matter in second language teaching. Amidst an ample array of methods, strategies, and techniques to teach a target language, finding a suitable model to lead learners to process, interpret, and negotiate meaning to communicate in a second language has imposed a great responsibility on language teachers; committed teachers are those who are aware of their role in equipping learners with the appropriate tools to communicate in the target language in a 21stcentury society. Unfortunately, one might find some English language teachers convinced that their job is only to lecture students; others are advocates of textbook-based instruction that might hinder second language processing, and just a few might courageously struggle to facilitate second language learning effectively. Grounded on the most representative empirical studies on comprehensible input, processing instruction, and focus on form, this analysis aims to facilitate the understanding of how second language learners process and automatize input and propose a pedagogical framework for the successful development of a second language. In light of this, this paper is structured to tackle noticing and attention and structured input as the heart of processing instruction, comprehensible input as the missing link in second language learning, and form-meaning connections as opposed to traditional grammar approaches to language teaching. The author finishes by suggesting a pedagogical framework involving noticing-attention-comprehensible-input-form (NACIF based on their acronym) to support ELT instructors, teachers, and scholars on the challenging task of facilitating the understanding of effective second language development.

Keywords: second language development, pedagogical framework, noticing, attention, comprehensible input, form

Procedia PDF Downloads 26
9255 Prediction, Production, and Comprehension: Exploring the Influence of Salience in Language Processing

Authors: Andy H. Clark

Abstract:

This research looks into the relationship between language comprehension and production with a specific focus on the role of salience in shaping these processes. Salience, our most immediate perception of what is most probable out of all possible situations and outcomes strongly affects our perception and action in language production and comprehension. This study investigates the impact of geographic and emotional attachments to the target language on the differences in the learners’ comprehension and production abilities. Using quantitative research methods (Qualtrics, SPSS), this study examines preferential choices of two groups of Japanese English language learners: those residing in the United States and those in Japan. By comparing and contrasting these two groups, we hope to gain a better understanding of how salience of linguistics cues influences language processing.

Keywords: intercultural pragmatics, salience, production, comprehension, pragmatics, action, perception, cognition

Procedia PDF Downloads 70
9254 How Western Donors Allocate Official Development Assistance: New Evidence From a Natural Language Processing Approach

Authors: Daniel Benson, Yundan Gong, Hannah Kirk

Abstract:

Advancement in national language processing techniques has led to increased data processing speeds, and reduced the need for cumbersome, manual data processing that is often required when processing data from multilateral organizations for specific purposes. As such, using named entity recognition (NER) modeling and the Organisation of Economically Developed Countries (OECD) Creditor Reporting System database, we present the first geotagged dataset of OECD donor Official Development Assistance (ODA) projects on a global, subnational basis. Our resulting data contains 52,086 ODA projects geocoded to subnational locations across 115 countries, worth a combined $87.9bn. This represents the first global, OECD donor ODA project database with geocoded projects. We use this new data to revisit old questions of how ‘well’ donors allocate ODA to the developing world. This understanding is imperative for policymakers seeking to improve ODA effectiveness.

Keywords: international aid, geocoding, subnational data, natural language processing, machine learning

Procedia PDF Downloads 76
9253 An Event-Related Potentials Study on the Processing of English Subjunctive Mood by Chinese ESL Learners

Authors: Yan Huang

Abstract:

Event-related potentials (ERPs) technique helps researchers to make continuous measures on the whole process of language comprehension, with an excellent temporal resolution at the level of milliseconds. The research on sentence processing has developed from the behavioral level to the neuropsychological level, which brings about a variety of sentence processing theories and models. However, the applicability of these models to L2 learners is still under debate. Therefore, the present study aims to investigate the neural mechanisms underlying English subjunctive mood processing by Chinese ESL learners. To this end, English subject clauses with subjunctive moods are used as the stimuli, all of which follow the same syntactic structure, “It is + adjective + that … + (should) do + …” Besides, in order to examine the role that language proficiency plays on L2 processing, this research deals with two groups of Chinese ESL learners (18 males and 22 females, mean age=21.68), namely, high proficiency group (Group H) and low proficiency group (Group L). Finally, the behavioral and neurophysiological data analysis reveals the following findings: 1) Syntax and semantics interact with each other on the SECOND phase (300-500ms) of sentence processing, which is partially in line with the Three-phase Sentence Model; 2) Language proficiency does affect L2 processing. Specifically, for Group H, it is the syntactic processing that plays the dominant role in sentence processing while for Group L, semantic processing also affects the syntactic parsing during the THIRD phase of sentence processing (500-700ms). Besides, Group H, compared to Group L, demonstrates a richer native-like ERPs pattern, which further demonstrates the role of language proficiency in L2 processing. Based on the research findings, this paper also provides some enlightenment for the L2 pedagogy as well as the L2 proficiency assessment.

Keywords: Chinese ESL learners, English subjunctive mood, ERPs, L2 processing

Procedia PDF Downloads 129
9252 Effect of Extrusion Processing Parameters on Protein in Banana Flour Extrudates: Characterisation Using Fourier-Transform Infrared Spectroscopy

Authors: Surabhi Pandey, Pavuluri Srinivasa Rao

Abstract:

Extrusion processing is a high-temperature short time (HTST) treatment which can improve protein quality and digestibility together with retaining active nutrients. In-vitro protein digestibility of plant protein-based foods is generally enhanced by extrusion. The current study aimed to investigate the effect of extrusion cooking on in-vitro protein digestibility (IVPD) and conformational modification of protein in green banana flour extrudates. Green banana flour was extruded through a co-rotating twin-screw extruder varying the moisture content, barrel temperature, screw speed in the range of 10-20 %, 60-80 °C, 200-300 rpm, respectively, at constant feed rate. Response surface methodology was used to optimise the result for IVPD. Fourier-transform infrared spectroscopy (FTIR) analysis provided a convenient and powerful means to monitor interactions and changes in functional and conformational properties of extrudates. Results showed that protein digestibility was highest in extrudate produced at 80°C, 250 rpm and 15% feed moisture. FTIR analysis was done for the optimised sample having highest IVPD. FTIR analysis showed that there were no changes in primary structure of protein while the secondary protein structure changed. In order to explain this behaviour, infrared spectroscopy analysis was carried out, mainly in the amide I and II regions. Moreover, curve fitting analysis showed the conformational changes produced in the flour due to protein denaturation. The quantitative analysis of the changes in the amide I and II regions provided information about the modifications produced in banana flour extrudates.

Keywords: extrusion, FTIR, protein conformation, raw banana flour, SDS-PAGE method

Procedia PDF Downloads 160
9251 Resume Ranking Using Custom Word2vec and Rule-Based Natural Language Processing Techniques

Authors: Subodh Chandra Shakya, Rajendra Sapkota, Aakash Tamang, Shushant Pudasaini, Sujan Adhikari, Sajjan Adhikari

Abstract:

Lots of efforts have been made in order to measure the semantic similarity between the text corpora in the documents. Techniques have been evolved to measure the similarity of two documents. One such state-of-art technique in the field of Natural Language Processing (NLP) is word to vector models, which converts the words into their word-embedding and measures the similarity between the vectors. We found this to be quite useful for the task of resume ranking. So, this research paper is the implementation of the word2vec model along with other Natural Language Processing techniques in order to rank the resumes for the particular job description so as to automate the process of hiring. The research paper proposes the system and the findings that were made during the process of building the system.

Keywords: chunking, document similarity, information extraction, natural language processing, word2vec, word embedding

Procedia PDF Downloads 157
9250 Lentil Protein Fortification in Cranberry Squash

Authors: Sandhya Devi A

Abstract:

The protein content of the cranberry squash (protein: 0g) may be increased by extracting protein from the lentils (9 g), which is particularly linked to a lower risk of developing heart disease. Using the technique of alkaline extraction from the lentils flour, protein may be extracted. Alkaline extraction of protein from lentil flour was optimized utilizing response surface approach in order to maximize both protein content and yield. Cranberry squash may be taken if a protein fortification syrup is prepared and processed into the squash.

Keywords: alkaline extraction, cranberry squash, protein fortification, response surface methodology

Procedia PDF Downloads 107
9249 Hydration of Protein-RNA Recognition Sites

Authors: Amita Barik, Ranjit Prasad Bahadur

Abstract:

We investigate the role of water molecules in 89 protein-RNA complexes taken from the Protein Data Bank. Those with tRNA and single-stranded RNA are less hydrated than with duplex or ribosomal proteins. Protein-RNA interfaces are hydrated less than protein-DNA interfaces, but more than protein-protein interfaces. Majority of the waters at protein-RNA interfaces makes multiple H-bonds; however, a fraction does not make any. Those making Hbonds have preferences for the polar groups of RNA than its partner protein. The spatial distribution of waters makes interfaces with ribosomal proteins and single-stranded RNA relatively ‘dry’ than interfaces with tRNA and duplex RNA. In contrast to protein-DNA interfaces, mainly due to the presence of the 2’OH, the ribose in protein-RNA interfaces is hydrated more than the phosphate or the bases. The minor groove in protein-RNA interfaces is hydrated more than the major groove, while in protein-DNA interfaces it is reverse. The strands make the highest number of water-mediated H-bonds per unit interface area followed by the helices and the non-regular structures. The preserved waters at protein-RNA interfaces make higher number of H-bonds than the other waters. Preserved waters contribute toward the affinity in protein-RNA recognition and should be carefully treated while engineering protein-RNA interfaces.

Keywords: h-bonds, minor-major grooves, preserved water, protein-RNA interfaces

Procedia PDF Downloads 300
9248 Language Processing of Seniors with Alzheimer’s Disease: From the Perspective of Temporal Parameters

Authors: Lai Yi-Hsiu

Abstract:

The present paper aims to examine the language processing of Chinese-speaking seniors with Alzheimer’s disease (AD) from the perspective of temporal cues. Twenty healthy adults, 17 healthy seniors, and 13 seniors with AD in Taiwan participated in this study to tell stories based on two sets of pictures. Nine temporal cues were fetched and analyzed. Oral productions in Mandarin Chinese were compared and discussed to examine to what extent and in what way these three groups of participants performed with significant differences. Results indicated that the age effects were significant in filled pauses. The dementia effects were significant in mean duration of pauses, empty pauses, filled pauses, lexical pauses, normalized mean duration of filled pauses and lexical pauses. The findings reported in the current paper help characterize the nature of language processing in seniors with or without AD, and contribute to the interactions between the AD neural mechanism and their temporal parameters.

Keywords: language processing, Alzheimer’s disease, Mandarin Chinese, temporal cues

Procedia PDF Downloads 445
9247 Protein Crystallization Induced by Surface Plasmon Resonance

Authors: Tetsuo Okutsu

Abstract:

We have developed a crystallization plate with the function of promoting protein crystallization. A gold thin film is deposited on the crystallization plate. A protein solution is dropped thereon, and crystallization is promoted when the protein is irradiated with light of a wavelength that protein does not absorb. Protein is densely adsorbed on the gold thin film surface. The light excites the surface plasmon resonance of the gold thin film, the protein is excited by the generated enhanced electric field induced by surface plasmon resonance, and the amino acid residues are radicalized to produce protein dimers. The dimers function as templates for protein crystals, crystallization is promoted.

Keywords: lysozyme, plasmon, protein, crystallization, RNaseA

Procedia PDF Downloads 216
9246 Application of Natural Language Processing in Education

Authors: Khaled M. Alhawiti

Abstract:

Reading capability is a major segment of language competency. On the other hand, discovering topical writings at a fitting level for outside and second language learners is a test for educators. We address this issue utilizing natural language preparing innovation to survey reading level and streamline content. In the connection of outside and second-language learning, existing measures of reading level are not appropriate to this errand. Related work has demonstrated the profit of utilizing measurable language preparing procedures; we expand these thoughts and incorporate other potential peculiarities to measure intelligibility. In the first piece of this examination, we join characteristics from measurable language models, customary reading level measures and other language preparing apparatuses to deliver a finer technique for recognizing reading level. We examine the execution of human annotators and assess results for our finders concerning human appraisals. A key commitment is that our identifiers are trainable; with preparing and test information from the same space, our finders beat more general reading level instruments (Flesch-Kincaid and Lexile). Trainability will permit execution to be tuned to address the needs of specific gatherings or understudies.

Keywords: natural language processing, trainability, syntactic simplification tools, education

Procedia PDF Downloads 488
9245 Gender Bias in Natural Language Processing: Machines Reflect Misogyny in Society

Authors: Irene Yi

Abstract:

Machine learning, natural language processing, and neural network models of language are becoming more and more prevalent in the fields of technology and linguistics today. Training data for machines are at best, large corpora of human literature and at worst, a reflection of the ugliness in society. Machines have been trained on millions of human books, only to find that in the course of human history, derogatory and sexist adjectives are used significantly more frequently when describing females in history and literature than when describing males. This is extremely problematic, both as training data, and as the outcome of natural language processing. As machines start to handle more responsibilities, it is crucial to ensure that they do not take with them historical sexist and misogynistic notions. This paper gathers data and algorithms from neural network models of language having to deal with syntax, semantics, sociolinguistics, and text classification. Results are significant in showing the existing intentional and unintentional misogynistic notions used to train machines, as well as in developing better technologies that take into account the semantics and syntax of text to be more mindful and reflect gender equality. Further, this paper deals with the idea of non-binary gender pronouns and how machines can process these pronouns correctly, given its semantic and syntactic context. This paper also delves into the implications of gendered grammar and its effect, cross-linguistically, on natural language processing. Languages such as French or Spanish not only have rigid gendered grammar rules, but also historically patriarchal societies. The progression of society comes hand in hand with not only its language, but how machines process those natural languages. These ideas are all extremely vital to the development of natural language models in technology, and they must be taken into account immediately.

Keywords: gendered grammar, misogynistic language, natural language processing, neural networks

Procedia PDF Downloads 118
9244 Literacy in First and Second Language: Implication for Language Education

Authors: Inuwa Danladi Bawa

Abstract:

One of the challenges of African states in the development of education in the past and the present is the problem of literacy. Literacy in the first language is seen as a strong base for the development of second language; they are mostly the language of education. Language development is an offshoot of language planning; so the need to develop literacy in both first and second language affects language education and predicts the extent of achievement of the entire education sector. The need to balance literacy acquisition in first language for good conditioning the acquisition of second language is paramount. Likely constraints that includes; non-standardization, underdeveloped and undeveloped first languages are among many. Solutions to some of these include the development of materials and use of the stages and levels of literacy acquisition. This is with believed that a child writes well in second language if he has literacy in the first language.

Keywords: first language, second language, literacy, english language, linguistics

Procedia PDF Downloads 449
9243 Influence of Thermal Treatments on Ovomucoid as Allergenic Protein

Authors: Nasser A. Al-Shabib

Abstract:

Food allergens are most common non-native form when exposed to the immune system. Most food proteins undergo various treatments (e.g. thermal or proteolytic processing) during food manufacturing. Such treatments have the potential to impact the chemical structure of food allergens so as to convert them to more denatured or unfolded forms. The conformational changes in the proteins may affect the allergenicity of treated-allergens. However, most allergenic proteins possess high resistance against thermal modification or digestive enzymes. In the present study, ovomucoid (a major allergenic protein of egg white) was heated in phosphate-buffered saline (pH 7.4) at different temperatures, aqueous solutions and on different surfaces for various times. The results indicated that different antibody-based methods had different sensitivities in detecting the heated ovomucoid. When using one particular immunoassay‚ the immunoreactivity of ovomucoid increased rapidly after heating in water whereas immunoreactivity declined after heating in alkaline buffer (pH 10). Ovomucoid appeared more immunoreactive when dissolved in PBS (pH 7.4) and heated on a stainless steel surface. To the best of our knowledge‚ this is the first time that antibody-based methods have been applied for the detection of ovomucoid adsorbed onto different surfaces under various conditions. The results obtained suggest that use of antibodies to detect ovomucoid after food processing may be problematic. False assurance will be given with the use of inappropriate‚ non-validated immunoassays such as those available commercially as ‘Swab’ tests. A greater understanding of antibody-protein interaction after processing of a protein is required.

Keywords: ovomucoid, thermal treatment, solutions, surfaces

Procedia PDF Downloads 446
9242 Social-Cognitive Aspects of Interpretation: Didactic Approaches in Language Processing and English as a Second Language Difficulties in Dyslexia

Authors: Schnell Zsuzsanna

Abstract:

Background: The interpretation of written texts, language processing in the visual domain, in other words, atypical reading abilities, also known as dyslexia, is an ever-growing phenomenon in today’s societies and educational communities. The much-researched problem affects cognitive abilities and, coupled with normal intelligence normally manifests difficulties in the differentiation of sounds and orthography and in the holistic processing of written words. The factors of susceptibility are varied: social, cognitive psychological, and linguistic factors interact with each other. Methods: The research will explain the psycholinguistics of dyslexia on the basis of several empirical experiments and demonstrate how domain-general abilities of inhibition, retrieval from the mental lexicon, priming, phonological processing, and visual modality transfer affect successful language processing and interpretation. Interpretation of visual stimuli is hindered, and the problem seems to be embedded in a sociocultural, psycholinguistic, and cognitive background. This makes the picture even more complex, suggesting that the understanding and resolving of the issues of dyslexia has to be interdisciplinary, aided by several disciplines in the field of humanities and social sciences, and should be researched from an empirical approach, where the practical, educational corollaries can be analyzed on an applied basis. Aim and applicability: The lecture sheds light on the applied, cognitive aspects of interpretation, social cognitive traits of language processing, the mental underpinnings of cognitive interpretation strategies in different languages (namely, Hungarian and English), offering solutions with a few applied techniques for success in foreign language learning that can be useful advice for the developers of testing methodologies and measures across ESL teaching and testing platforms.

Keywords: dyslexia, social cognition, transparency, modalities

Procedia PDF Downloads 83
9241 Benchmarking Bert-Based Low-Resource Language: Case Uzbek NLP Models

Authors: Jamshid Qodirov, Sirojiddin Komolov, Ravilov Mirahmad, Olimjon Mirzayev

Abstract:

Nowadays, natural language processing tools play a crucial role in our daily lives, including various techniques with text processing. There are very advanced models in modern languages, such as English, Russian etc. But, in some languages, such as Uzbek, the NLP models have been developed recently. Thus, there are only a few NLP models in Uzbek language. Moreover, there is no such work that could show which Uzbek NLP model behaves in different situations and when to use them. This work tries to close this gap and compares the Uzbek NLP models existing as of the time this article was written. The authors try to compare the NLP models in two different scenarios: sentiment analysis and sentence similarity, which are the implementations of the two most common problems in the industry: classification and similarity. Another outcome from this work is two datasets for classification and sentence similarity in Uzbek language that we generated ourselves and can be useful in both industry and academia as well.

Keywords: NLP, benchmak, bert, vectorization

Procedia PDF Downloads 52
9240 COVID-19 Genomic Analysis and Complete Evaluation

Authors: Narin Salehiyan, Ramin Ghasemi Shayan

Abstract:

In order to investigate coronavirus RNA replication, transcription, recombination, protein processing and transport, virion assembly, the identification of coronavirus-specific cell receptors, and polymerase processing, the manipulation of coronavirus clones and complementary DNAs (cDNAs) of defective-interfering (DI) RNAs is the subject of this chapter. The idea of the Covid genome is nonsegmented, single-abandoned, and positive-sense RNA. When compared to other RNA viruses, its size is significantly greater, ranging from 27 to 32 kb. The quality encoding the enormous surface glycoprotein depends on 4.4 kb, encoding a forcing trimeric, profoundly glycosylated protein. This takes off exactly 20 nm over the virion envelope, giving the infection the appearance-with a little creative mind of a crown or coronet. Covid research has added to the comprehension of numerous parts of atomic science as a general rule, like the component of RNA union, translational control, and protein transport and handling. It stays a fortune equipped for creating startling experiences.

Keywords: covid-19, corona, virus, genome, genetic

Procedia PDF Downloads 70
9239 Grounding Chinese Language Vocabulary Teaching and Assessment in the Working Memory Research

Authors: Chan Kwong Tung

Abstract:

Since Baddeley and Hitch’s seminal research in 1974 on working memory (WM), this topic has been of great interest to language educators. Although there are some variations in the definitions of WM, recent findings in WM have contributed vastly to our understanding of language learning, especially its effects on second language acquisition (SLA). For example, the phonological component of WM (PWM) and the executive component of WM (EWM) have been found to be positively correlated with language learning. This paper discusses two general, yet highly relevant WM findings that could directly affect the effectiveness of Chinese Language (CL) vocabulary teaching and learning, as well as the quality of its assessment. First, PWM is found to be critical for the long-term learning of phonological forms of new words. Second, EWM is heavily involved in interpreting the semantic characteristics of new words, which consequently affects the quality of learners’ reading comprehension. These two ideas are hardly discussed in the Chinese literature, both conceptual and empirical. While past vocabulary acquisition studies have mainly focused on the cognitive-processing approach, active processing, ‘elaborate processing’ (or lexical elaboration) and other effective learning tasks and strategies, it is high time to balance the spotlight to the WM (particularly PWM and EWM) to ensure an optimum control on the teaching and learning effectiveness of such approaches, as well as the validity of this language assessment. Given the unique phonological, orthographical and morphological properties of the CL, this discussion will shed some light on the vocabulary acquisition of this Sino-Tibetan language family member. Together, these two WM concepts could have crucial implications for the design, development, and planning of vocabularies and ultimately reading comprehension teaching and assessment in language education. Hopefully, this will raise an awareness and trigger a dialogue about the meaning of these findings for future language teaching, learning, and assessment.

Keywords: Chinese Language, working memory, vocabulary assessment, vocabulary teaching

Procedia PDF Downloads 342
9238 Potential Use of Cnidoscolus Chayamansa Leaf from Mexico as High-Quality Protein Source

Authors: Diana Karina Baigts Allende, Mariana Gonzalez Diaz, Luis Antonio Chel Guerrero, Mukthar Sandoval Peraza

Abstract:

Poverty and food insecurity are still incident problems in the developing countries, where population´s diet is based on cereals which are lack in protein content. Nevertheless, during last years the use of native plants has been studied as an alternative source of protein in order to improve the nutritional intake. Chaya crop also called Spinach tree, is a prehispanic plant native from Central America and South of Mexico (Mayan culture), which has been especially valued due to its high nutritional content particularly protein and some medicinal properties. The aim of this work was to study the effect of protein isolation processing from Chaya leaf harvest in Yucatan, Mexico on its structure quality in order: i) to valorize the Chaya crop and ii) to produce low-cost and high-quality protein. Chaya leaf was extruded, clarified and recovered using: a) acid precipitation by decreasing the pH value until reach the isoelectric point (3.5) and b) thermal coagulation, by heating the protein solution at 80 °C during 30 min. Solubilized protein was re-dissolved in water and spray dried. The presence of Fraction I protein, known as RuBisCO (Rubilose-1,5-biphosfate carboxylase/oxygenase) was confirmed by gel electrophoresis (SDS-PAGE) where molecular weight bands of 55 KDa and 12 KDa were observed. The infrared spectrum showed changes in protein structure due to the isolation method. The use of high temperatures (thermal coagulation) highly decreased protein solubility in comparison to isoelectric precipitated protein, the nutritional properties according to amino acid profile was also disturbed, showing minor amounts of overall essential amino acids from 435.9 to 367.8 mg/g. Chaya protein isolate obtained by acid precipitation showed higher protein quality according to essential amino acid score compared to FAO recommendations, which could represent an important sustainable source of protein for human consumption.

Keywords: chaya leaf, nutritional properties, protein isolate, protein structure

Procedia PDF Downloads 339
9237 Protein Remote Homology Detection and Fold Recognition by Combining Profiles with Kernel Methods

Authors: Bin Liu

Abstract:

Protein remote homology detection and fold recognition are two most important tasks in protein sequence analysis, which is critical for protein structure and function studies. In this study, we combined the profile-based features with various string kernels, and constructed several computational predictors for protein remote homology detection and fold recognition. Experimental results on two widely used benchmark datasets showed that these methods outperformed the competing methods, indicating that these predictors are useful computational tools for protein sequence analysis. By analyzing the discriminative features of the training models, some interesting patterns were discovered, reflecting the characteristics of protein superfamilies and folds, which are important for the researchers who are interested in finding the patterns of protein folds.

Keywords: protein remote homology detection, protein fold recognition, profile-based features, Support Vector Machines (SVMs)

Procedia PDF Downloads 160