Search results for: semantic processing
3024 A Gradient Orientation Based Efficient Linear Interpolation Method
Authors: S. Khan, A. Khan, Abdul R. Soomrani, Raja F. Zafar, A. Waqas, G. Akbar
Abstract:
This paper proposes a low-complexity image interpolation method. Image interpolation is used to convert a low dimension video/image to high dimension video/image. The objective of a good interpolation method is to upscale an image in such a way that it provides better edge preservation at the cost of very low complexity so that real-time processing of video frames can be made possible. However, low complexity methods tend to provide real-time interpolation at the cost of blurring, jagging and other artifacts due to errors in slope calculation. Non-linear methods, on the other hand, provide better edge preservation, but at the cost of high complexity and hence they can be considered very far from having real-time interpolation. The proposed method is a linear method that uses gradient orientation for slope calculation, unlike conventional linear methods that uses the contrast of nearby pixels. Prewitt edge detection is applied to separate uniform regions and edges. Simple line averaging is applied to unknown uniform regions, whereas unknown edge pixels are interpolated after calculation of slopes using gradient orientations of neighboring known edge pixels. As a post-processing step, bilateral filter is applied to interpolated edge regions in order to enhance the interpolated edges.Keywords: edge detection, gradient orientation, image upscaling, linear interpolation, slope tracing
Procedia PDF Downloads 2613023 Trust: The Enabler of Knowledge-Sharing Culture in an Informal Setting
Authors: Emmanuel Ukpe, S. M. F. D. Syed Mustapha
Abstract:
Trust in an organization has been perceived as one of the key factors behind knowledge sharing, mainly in an unstructured work environment. In an informal working environment, to instill trust among individuals is a challenge and even more in the virtual environment. The study has contributed in developing the framework for building trust in an unstructured organization in performing knowledge sharing in a virtual environment. The artifact called KAPE (Knowledge Acquisition, Processing, and Exchange) was developed for knowledge sharing for the informal organization where the framework was incorporated. It applies to Cassava farmers to facilitate knowledge sharing using web-based platform. A survey was conducted; data were collected from 382 farmers from 21 farm communities. Multiple regression technique, Cronbach’s Alpha reliability test; Tukey’s Honestly significant difference (HSD) analysis; one way Analysis of Variance (ANOVA), and all trust acceptable measures (TAM) were used to test the hypothesis and to determine noteworthy relationships. The results show a significant difference when there is a trust in knowledge sharing between farmers, the ones who have high in trust acceptable factors found in the model (M = 3.66 SD = .93) and the ones who have low on trust acceptable factors (M = 2.08 SD = .28), (t (48) = 5.69, p = .00). Furthermore, when applying Cognitive Expectancy Theory, the farmers with cognitive-consonance show higher level of trust and satisfaction with knowledge and information from KAPE, as compared with a low level of cognitive-dissonance. These results imply that the adopted trust model KAPE positively improved knowledge sharing activities in an informal environment amongst rural farmers.Keywords: trust, knowledge, sharing, knowledge acquisition, processing and exchange, KAPE
Procedia PDF Downloads 1213022 Modern Agriculture and Industrialization Nexus in the Nigerian Context
Authors: Ese Urhie, Olabisi Popoola, Obindah Gershon, Olabanji Ewetan
Abstract:
Modern agriculture involves the use of improved tools and equipment (instead of crude and ineffective tools) like tractors, hand operated planters, hand operated fertilizer drills and combined harvesters - which increase agricultural productivity. Farmers in Nigeria still have huge potentials to enhance their productivity. The study argues that the increase in agricultural output due to increased productivity, orchestrated by modern agriculture will promote forward linkages and opportunities in the processing sub-sector; both the manufacturing of machines and the processing of raw materials. Depending on existing incentives, foreign investment could be attracted to augment local investment in the sector. The availability of raw materials in large quantity – which prices are competitive – will attract investment in other industries. In addition, potentials for backward linkages will also be created. In a nutshell, adopting the unbalanced growth theory in favour of the agricultural sector could engender industrialization in a country with untapped potentials. The paper highlights the numerous potentials of modern agriculture that are yet to be tapped in Nigeria and also provides a theoretical analysis of how the realization of such potentials could promote industrialization in the country. The study adopts the Lewis’ theory of structural–change model and Hirschman’s theory of unbalanced growth in the design of the analytical framework. The framework will be useful in empirical studies that will guide policy formulation.Keywords: modern agriculture, industrialization, structural change model, unbalanced growth
Procedia PDF Downloads 3083021 Hot Deformation Behavior and Recrystallization of Inconel 718 Superalloy under Double Cone Compression
Authors: Wang Jianguo, Ding Xiao, Liu Dong, Wang Haiping, Yang Yanhui, Hu Yang
Abstract:
The hot deformation behavior of Inconel 718 alloy was studied by uniaxial compression tests under the deformation temperature of 940~1040℃ and strain rate of 0.001-10s⁻¹. The double cone compression (DCC) tests develop strains range from 30% to the 79% strain including all intermediate values of stains at different temperature (960~1040℃). DCC tests were simulated by finite element software which shown the strain and strain rates distribution. The result shows that the peak stress level of the alloy decreased with increasing deformation temperature and decreasing strain rate, which could be characterized by a Zener-Hollomon parameter in the hyperbolic-sine equation. The characterization method of hot processing window containing recrystallization volume fraction and average grain size was proposed for double cone compression test of uniform coarse grain, mixed crystal and uniform fine grain double conical specimen in hydraulic press and screw press. The results show that uniform microstructures can be obtained by low temperature with high deformation followed by high temperature with small deformation on the hydraulic press and low temperature, medium deformation, multi-pass on the screw press. The two methods were applied in industrial forgings process, and the forgings with uniform microstructure were obtained successfully.Keywords: inconel 718 superalloy, hot processing windows, double cone compression, uniform microstructure
Procedia PDF Downloads 2203020 Synthesis and Characterization of Carboxymethyl Cellulose-Chitosan Based Composite Hydrogels for Biomedical and Non-Biomedical Applications
Abstract:
Hydrogels have attracted much academic and industrial attention due to their unique properties and potential biomedical and non-biomedical applications. Limitations on extending their applications have resulted from the synthesis of hydrogels using toxic materials and complex irreproducible processing techniques. In order to promote environmental sustainability, hydrogel efficiency, and wider application, this study focused on the synthesis of composite hydrogels matrices from an edible non-toxic crosslinker-citric acid (CA) using a simple low energy processing method based on carboxymethyl cellulose (CMC) and chitosan (CSN) natural polymers. Composite hydrogels were developed by chemical crosslinking. The results demonstrated that CMC:2CSN:CA exhibited good performance properties and super-absorbency 21× its original weight. This makes it promising for biomedical applications such as chronic wound healing and regeneration, next generation skin substitute, in situ bone regeneration and cell delivery. On the other hand, CMC:CSN:CA exhibited durable well-structured internal network with minimum swelling degrees, water absorbency, excellent gel fraction, and infra-red reflectance. These properties make it a suitable composite hydrogel matrix for warming effect and controlled and efficient release of loaded materials. CMC:2CSN:CA and CMC:CSN:CA composite hydrogels developed also exhibited excellent chemical, morphological, and thermal properties.Keywords: citric acid, fumaric acid, tartaric acid, zinc nitrate hexahydrate
Procedia PDF Downloads 1533019 The Effects of Blanching, Boiling and Steaming on Ascorbic Acid Content, Total Phenolic Content, and Colour in Cauliflowers (Brassica oleracea var. Botrytis)
Authors: Huei Lin Lee, Wee Sim Choo
Abstract:
The effects of blanching, boiling and steaming on the ascorbic acid content, total phenolic content and colour in cauliflower (Brassica oleraceavar. Botrytis) was investigated. It was found that blanching was the best thermal processing to be applied on cauliflower compared to boiling and steaming processes. Blanching and steaming processes on cauliflower retained most of the ascorbic acid content (AAC) compared to those of boiling. As for the total phenolic content (TPC), blanching process retained a higher TPC in cauliflower compared to those of boiling and steaming processes. There were no significant differences between the TPC of boiled and steamed cauliflowers. As for the colour measurement, there were no significant differences in the colour of the cauliflower at different lead time (after processing to the point of consumption) of 30 minutes interval up to 3 hours but there were slight variations in L*, a*, and b* values among the thermal processed cauliflowers (blanched, boiled and steamed). The cauliflowers in this study were found to give a desirable white colour (L* value in the range of 77-83) in all the three thermal processes (blanching, boiling and steaming). There was no significant difference on the effect of lead time (30-minutes interval up to 3 hours) in raw and all the three thermal processed (blanched, boiled and steamed) cauliflowers.Keywords: ascorbic acid, cauliflower, colour, phenolics
Procedia PDF Downloads 3143018 The Effect of Development of Two-Phase Flow Regimes on the Stability of Gas Lift Systems
Authors: Khalid. M. O. Elmabrok, M. L. Burby, G. G. Nasr
Abstract:
Flow instability during gas lift operation is caused by three major phenomena – the density wave oscillation, the casing heading pressure and the flow perturbation within the two-phase flow region. This paper focuses on the causes and the effect of flow instability during gas lift operation and suggests ways to control it in order to maximise productivity during gas lift operations. A laboratory-scale two-phase flow system to study the effects of flow perturbation was designed and built. The apparatus is comprised of a 2 m long by 66 mm ID transparent PVC pipe with air injection point situated at 0.1 m above the base of the pipe. This is the point where stabilised bubbles were visibly clear after injection. Air is injected into the water filled transparent pipe at different flow rates and pressures. The behavior of the different sizes of the bubbles generated within the two-phase region was captured using a digital camera and the images were analysed using the advanced image processing package. It was observed that the average maximum bubbles sizes increased with the increase in the length of the vertical pipe column from 29.72 to 47 mm. The increase in air injection pressure from 0.5 to 3 bars increased the bubble sizes from 29.72 mm to 44.17 mm and then decreasing when the pressure reaches 4 bars. It was observed that at higher bubble velocity of 6.7 m/s, larger diameter bubbles coalesce and burst due to high agitation and collision with each other. This collapse of the bubbles causes pressure drop and reverse flow within two phase flow and is the main cause of the flow instability phenomena.Keywords: gas lift instability, bubbles forming, bubbles collapsing, image processing
Procedia PDF Downloads 4213017 Referencing Anna: Findings From Eye-tracking During Dutch Pronoun Resolution
Authors: Robin Devillers, Chantal van Dijk
Abstract:
Children face ambiguities in everyday language use. Particularly ambiguity in pronoun resolution can be challenging, whereas adults can rapidly identify the antecedent of the mentioned pronoun. Two main factors underlie this process, namely the accessibility of the referent and the syntactic cues of the pronoun. After 200ms, adults have converged the accessibility and the syntactic constraints, while relieving cognitive effort by considering contextual cues. As children are still developing their cognitive capacity, they are not able yet to simultaneously assess and integrate accessibility, contextual cues and syntactic information. As such, they fail to identify the correct referent and possibly fixate more on the competitor in comparison to adults. In this study, Dutch while-clauses were used to investigate the interpretation of pronouns by children. The aim is to a) examine the extent to which 7-10 year old children are able to utilise discourse and syntactic information during online and offline sentence processing and b) analyse the contribution of individual factors, including age, working memory, condition and vocabulary. Adult and child participants are presented with filler-items and while-clauses, and the latter follows a particular structure: ‘Anna and Sophie are sitting in the library. While Anna is reading a book, she is taking a sip of water.’ This sentence illustrates the ambiguous situation, as it is unclear whether ‘she’ refers to Anna or Sophie. In the unambiguous situation, either Anna or Sophie would be substituted by a boy, such as ‘Peter’. The pronoun in the second sentence will unambiguously refer to one of the characters due to the syntactic constraints of the pronoun. Children’s and adults’ responses were measured by means of a visual world paradigm. This paradigm consisted of two characters, of which one was the referent (the target) and the other was the competitor. A sentence was presented and followed by a question, which required the participant to choose which character was the referent. Subsequently, this paradigm yields an online (fixations) and offline (accuracy) score. These findings will be analysed using Generalised Additive Mixed Models, which allow for a thorough estimation of the individual variables. These findings will contribute to the scientific literature in several ways; firstly, the use of while-clauses has not been studied much and it’s processing has not yet been identified. Moreover, online pronoun resolution has not been investigated much in both children and adults, and therefore, this study will contribute to adults and child’s pronoun resolution literature. Lastly, pronoun resolution has not been studied yet in Dutch and as such, this study adds to the languagesKeywords: pronouns, online language processing, Dutch, eye-tracking, first language acquisition, language development
Procedia PDF Downloads 1003016 Explaining the Steps of Designing and Calculating the Content Validity Ratio Index of the Screening Checklist of Preschool Students (5 to 7 Years Old) Exposed to Learning Difficulties
Authors: Sajed Yaghoubnezhad, Sedygheh Rezai
Abstract:
Background and Aim: Since currently in Iran, students with learning disabilities are identified after entering school, and with the approach to the gap between IQ and academic achievement, the purpose of this study is to design and calculate the content validity of the pre-school screening checklist (5-7) exposed to learning difficulties. Methods: This research is a fundamental study, and in terms of data collection method, it is quantitative research with a descriptive approach. In order to design this checklist, after reviewing the research background and theoretical foundations, cognitive abilities (visual processing, auditory processing, phonological awareness, executive functions, spatial visual working memory and fine motor skills) are considered the basic variables of school learning. The basic items and worksheets of the screening checklist of pre-school students 5 to 7 years old with learning difficulties were compiled based on the mentioned abilities and were provided to the specialists in order to calculate the content validity ratio index. Results: Based on the results of the table, the validity of the CVR index of the background information checklist is equal to 0.9, and the CVR index of the performance checklist of preschool children (5 to7 years) is equal to 0.78. In general, the CVR index of this checklist is reported to be 0.84. The results of this study provide good evidence for the validity of the pre-school sieve screening checklist (5-7) exposed to learning difficulties.Keywords: checklist, screening, preschoolers, learning difficulties
Procedia PDF Downloads 1033015 The Implementation of an E-Government System in Developing Countries: A Case of Taita Taveta County, Kenya
Authors: Tabitha Mberi, Tirus Wanyoike, Joseph Sevilla
Abstract:
The use of Information and Communication Technology (ICT) in Government is gradually becoming a major requirement to transform delivery of services to its stakeholders by improving quality of service and efficiency. In Kenya, the devolvement of government from local authorities to county governments has resulted in many counties adopting online revenue collection systems which can be easily accessed by its stakeholders. Strathmore Research and Consortium Centre (SRCC) implemented a revenue collection system in Taita Taveta, a County in coastal Kenya. It consisted of two systems that are integrated; an online system dubbed “CountyPro” for processing county services such as Business Permit applications, General Billing, Property Rates Payments and any other revenue streams from the county. The second part was a Point of Sale(PoS) system used by the county revenue collectors to charge for market fees and vehicle parking fees. This study assesses the success and challenges in adoption of the integrated system. Qualitative and quantitative data collection methods were used to collect data on the adoption of the system with the researcher using focus groups, interviews, and questionnaires to collect data from various users of the system An analysis was carried out and revealed that 87% of the county revenue officers who are situated in county offices describe the system as efficient and has made their work easier in terms of processing of transactions for customers.Keywords: e-government, counties, information technology, online system, point of sale
Procedia PDF Downloads 2503014 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services
Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme
Abstract:
Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing
Procedia PDF Downloads 1143013 Time and Cost Prediction Models for Language Classification Over a Large Corpus on Spark
Authors: Jairson Barbosa Rodrigues, Paulo Romero Martins Maciel, Germano Crispim Vasconcelos
Abstract:
This paper presents an investigation of the performance impacts regarding the variation of five factors (input data size, node number, cores, memory, and disks) when applying a distributed implementation of Naïve Bayes for text classification of a large Corpus on the Spark big data processing framework. Problem: The algorithm's performance depends on multiple factors, and knowing before-hand the effects of each factor becomes especially critical as hardware is priced by time slice in cloud environments. Objectives: To explain the functional relationship between factors and performance and to develop linear predictor models for time and cost. Methods: the solid statistical principles of Design of Experiments (DoE), particularly the randomized two-level fractional factorial design with replications. This research involved 48 real clusters with different hardware arrangements. The metrics were analyzed using linear models for screening, ranking, and measurement of each factor's impact. Results: Our findings include prediction models and show some non-intuitive results about the small influence of cores and the neutrality of memory and disks on total execution time, and the non-significant impact of data input scale on costs, although notably impacts the execution time.Keywords: big data, design of experiments, distributed machine learning, natural language processing, spark
Procedia PDF Downloads 1203012 Voice Liveness Detection Using Kolmogorov Arnold Networks
Authors: Arth J. Shah, Madhu R. Kamble
Abstract:
Voice biometric liveness detection is customized to certify an authentication process of the voice data presented is genuine and not a recording or synthetic voice. With the rise of deepfakes and other equivalently sophisticated spoofing generation techniques, it’s becoming challenging to ensure that the person on the other end is a live speaker or not. Voice Liveness Detection (VLD) system is a group of security measures which detect and prevent voice spoofing attacks. Motivated by the recent development of the Kolmogorov-Arnold Network (KAN) based on the Kolmogorov-Arnold theorem, we proposed KAN for the VLD task. To date, multilayer perceptron (MLP) based classifiers have been used for the classification tasks. We aim to capture not only the compositional structure of the model but also to optimize the values of univariate functions. This study explains the mathematical as well as experimental analysis of KAN for VLD tasks, thereby opening a new perspective for scientists to work on speech and signal processing-based tasks. This study emerges as a combination of traditional signal processing tasks and new deep learning models, which further proved to be a better combination for VLD tasks. The experiments are performed on the POCO and ASVSpoof 2017 V2 database. We used Constant Q-transform, Mel, and short-time Fourier transform (STFT) based front-end features and used CNN, BiLSTM, and KAN as back-end classifiers. The best accuracy is 91.26 % on the POCO database using STFT features with the KAN classifier. In the ASVSpoof 2017 V2 database, the lowest EER we obtained was 26.42 %, using CQT features and KAN as a classifier.Keywords: Kolmogorov Arnold networks, multilayer perceptron, pop noise, voice liveness detection
Procedia PDF Downloads 443011 Synthetic Method of Contextual Knowledge Extraction
Authors: Olga Kononova, Sergey Lyapin
Abstract:
Global information society requirements are transparency and reliability of data, as well as ability to manage information resources independently; particularly to search, to analyze, to evaluate information, thereby obtaining new expertise. Moreover, it is satisfying the society information needs that increases the efficiency of the enterprise management and public administration. The study of structurally organized thematic and semantic contexts of different types, automatically extracted from unstructured data, is one of the important tasks for the application of information technologies in education, science, culture, governance and business. The objectives of this study are the contextual knowledge typologization, selection or creation of effective tools for extracting and analyzing contextual knowledge. Explication of various kinds and forms of the contextual knowledge involves the development and use full-text search information systems. For the implementation purposes, the authors use an e-library 'Humanitariana' services such as the contextual search, different types of queries (paragraph-oriented query, frequency-ranked query), automatic extraction of knowledge from the scientific texts. The multifunctional e-library «Humanitariana» is realized in the Internet-architecture in WWS-configuration (Web-browser / Web-server / SQL-server). Advantage of use 'Humanitariana' is in the possibility of combining the resources of several organizations. Scholars and research groups may work in a local network mode and in distributed IT environments with ability to appeal to resources of any participating organizations servers. Paper discusses some specific cases of the contextual knowledge explication with the use of the e-library services and focuses on possibilities of new types of the contextual knowledge. Experimental research base are science texts about 'e-government' and 'computer games'. An analysis of the subject-themed texts trends allowed to propose the content analysis methodology, that combines a full-text search with automatic construction of 'terminogramma' and expert analysis of the selected contexts. 'Terminogramma' is made out as a table that contains a column with a frequency-ranked list of words (nouns), as well as columns with an indication of the absolute frequency (number) and the relative frequency of occurrence of the word (in %% ppm). The analysis of 'e-government' materials showed, that the state takes a dominant position in the processes of the electronic interaction between the authorities and society in modern Russia. The media credited the main role in these processes to the government, which provided public services through specialized portals. Factor analysis revealed two factors statistically describing the used terms: human interaction (the user) and the state (government, processes organizer); interaction management (public officer, processes performer) and technology (infrastructure). Isolation of these factors will lead to changes in the model of electronic interaction between government and society. In this study, the dominant social problems and the prevalence of different categories of subjects of computer gaming in science papers from 2005 to 2015 were identified. Therefore, there is an evident identification of several types of contextual knowledge: micro context; macro context; dynamic context; thematic collection of queries (interactive contextual knowledge expanding a composition of e-library information resources); multimodal context (functional integration of iconographic and full-text resources through hybrid quasi-semantic algorithm of search). Further studies can be pursued both in terms of expanding the resource base on which they are held, and in terms of the development of appropriate tools.Keywords: contextual knowledge, contextual search, e-library services, frequency-ranked query, paragraph-oriented query, technologies of the contextual knowledge extraction
Procedia PDF Downloads 3603010 Original and the Translated: A Comparative Evaluation of Native and Non-Native English Translations of Faiz
Authors: Anam Nawaz
Abstract:
The present study is an attempt to compare the translations of Faiz’s poetry made by native and non-native translators, to determine the role of the translator in terms of preserving the cultural ethos of the original text. Peter Newmark and Katharine Reiss’s approaches to translation criticism have been used to provide a theoretical framework for the study. This study also emphasizes those cultural and semantic aspects of the original which are translated more convincingly by a native translator, and contrasting those features which the non-natives can tackle more ably. The research also highlights the linguistic sockets, ignored by the interpreters in the translation process. The analysis showed that both native and non-native translators have made an admirable effort to stay as close to the original as possible. The natives with their advantage of belonging to the same culture have excelled in preserving the original subject matter, whereas the non-native renderings have been presented in a much rhythmic and poetic manner with an excellent choice of words. Though none of the four translators has been successfully able to recreate Faiz’s magic, however V. G. Kiernan and Sarvat Rahman’s translations can be regarded as the closest to the original. Whereas V. G. Kiernan with his outstanding command over English mesmerizes the readers, Sarvat Rahman’s profound understanding of cultural ties helps establish her translations as a brilliant example of faithful re-renderings.Keywords: comparative translations, linguistic and cultural constraints, native translators, non-native translators, poetry and translation, Faiz Ahmad Faiz
Procedia PDF Downloads 2623009 Quantitative Analysis of the Quality of Housing and Land Use in the Built-up area of Croatian Coastal City of Zadar
Authors: Silvija Šiljeg, Ante Šiljeg, Branko Cavrić
Abstract:
Housing is considered as a basic human need and important component of the quality of life (QoL) in urban areas worldwide. In contemporary housing studies, the concept of the quality of housing (QoH) is considered as a multi-dimensional and multi-disciplinary field. It emphasizes connection between various aspects of the QoL which could be measured by quantitative and qualitative indicators at different spatial levels (e.g. local, city, metropolitan, regional). The main goal of this paper is to examine the QoH and compare results of quantitative analysis with the clutter land use categories derived for selected local communities in Croatian Coastal City of Zadar. The qualitative housing analysis based on the four housing indicators (out of total 24 QoL indicators) has provided identification of the three Zadar’s local communities with the highest estimated QoH ranking. Furthermore, by using GIS overlay techniques, the QoH was merged with the urban environment analysis and introduction of spatial metrics based on the three categories: the element, class and environment as a whole. In terms of semantic-content analysis, the research has also generated a set of indexes suitable for evaluation of “housing state of affairs” and future decision making aiming at improvement of the QoH in selected local communities.Keywords: housing, quality, indicators, indexes, urban environment, GIS, element, class
Procedia PDF Downloads 4103008 Clinical Validation of an Automated Natural Language Processing Algorithm for Finding COVID-19 Symptoms and Complications in Patient Notes
Authors: Karolina Wieczorek, Sophie Wiliams
Abstract:
Introduction: Patient data is often collected in Electronic Health Record Systems (EHR) for purposes such as providing care as well as reporting data. This information can be re-used to validate data models in clinical trials or in epidemiological studies. Manual validation of automated tools is vital to pick up errors in processing and to provide confidence in the output. Mentioning a disease in a discharge letter does not necessarily mean that a patient suffers from this disease. Many of them discuss a diagnostic process, different tests, or discuss whether a patient has a certain disease. The COVID-19 dataset in this study used natural language processing (NLP), an automated algorithm which extracts information related to COVID-19 symptoms, complications, and medications prescribed within the hospital. Free-text patient clinical patient notes are rich sources of information which contain patient data not captured in a structured form, hence the use of named entity recognition (NER) to capture additional information. Methods: Patient data (discharge summary letters) were exported and screened by an algorithm to pick up relevant terms related to COVID-19. Manual validation of automated tools is vital to pick up errors in processing and to provide confidence in the output. A list of 124 Systematized Nomenclature of Medicine (SNOMED) Clinical Terms has been provided in Excel with corresponding IDs. Two independent medical student researchers were provided with a dictionary of SNOMED list of terms to refer to when screening the notes. They worked on two separate datasets called "A” and "B”, respectively. Notes were screened to check if the correct term had been picked-up by the algorithm to ensure that negated terms were not picked up. Results: Its implementation in the hospital began on March 31, 2020, and the first EHR-derived extract was generated for use in an audit study on June 04, 2020. The dataset has contributed to large, priority clinical trials (including International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC) by bulk upload to REDcap research databases) and local research and audit studies. Successful sharing of EHR-extracted datasets requires communicating the provenance and quality, including completeness and accuracy of this data. The results of the validation of the algorithm were the following: precision (0.907), recall (0.416), and F-score test (0.570). Percentage enhancement with NLP extracted terms compared to regular data extraction alone was low (0.3%) for relatively well-documented data such as previous medical history but higher (16.6%, 29.53%, 30.3%, 45.1%) for complications, presenting illness, chronic procedures, acute procedures respectively. Conclusions: This automated NLP algorithm is shown to be useful in facilitating patient data analysis and has the potential to be used in more large-scale clinical trials to assess potential study exclusion criteria for participants in the development of vaccines.Keywords: automated, algorithm, NLP, COVID-19
Procedia PDF Downloads 1023007 Development of a French to Yorùbá Machine Translation System
Authors: Benjamen Nathaniel, Eludiora Safiriyu Ijiyemi, Egume Oneme Lucky
Abstract:
A review on machine translation systems shows that a lot of computational artefacts has been carried out to translate written or spoken texts from a source language to Yorùbá language through Machine Translation systems. However, there are no work on French to Yorùbá language machine translation system; hence, the study investigated the process involved in the translation of French-to-Yorùbá language equivalent with the view to adopting a rule- based MT approach to build a Machine Translation framework from simple sentences administered through questionnaire. Articles and relevant textbooks were reviewed with key speakers of both languages interviewed to find out the processes involved in the translation of French language and their equivalent in Yorùbálanguage simple sentences using home domain terminologies. Achieving this, a model was formulated using phrase grammar structure, re-write rule, parse tree, automata theory- based techniques, designed and implemented respectively with unified modeling language (UML) and python programming language. Analysing the result, it was observed when carrying out the result that, the Machine Translation system performed 18.45% above Experimental Subject Respondent and 2.7% below Linguistics Expert when analysed with word orthography, sentence syntax and semantic correctness of the sentences. And, when compared with Google Machine Translation system, it was noticed that the developed system performed better on lexicons of the target language.Keywords: machine translation (MT), rule-based, French language, Yoru`ba´ language
Procedia PDF Downloads 783006 Combined Synchrotron Radiography and Diffraction for in Situ Study of Reactive Infiltration of Aluminum into Iron Porous Preform
Authors: S. Djaziri, F. Sket, A. Hynowska, S. Milenkovic
Abstract:
The use of Fe-Al based intermetallics as an alternative to Cr/Ni based stainless steels is very promising for industrial applications that use critical raw materials parts under extreme conditions. However, the development of advanced Fe-Al based intermetallics with appropriate mechanical properties presents several challenges that involve appropriate processing and microstructure control. A processing strategy is being developed which aims at producing a net-shape porous Fe-based preform that is infiltrated with molten Al or Al-alloy. In the present work, porous Fe-based preforms produced by two different methods (selective laser melting (SLM) and Kochanek-process (KE)) are studied during infiltration with molten aluminum. In the objective to elucidate the mechanisms underlying the formation of Fe-Al intermetallic phases during infiltration, an in-house furnace has been designed for in situ observation of infiltration at synchrotron facilities combining x-ray radiography (XR) and x-ray diffraction (XRD) techniques. The feasibility of this approach has been demonstrated, and information about the melt flow front propagation has been obtained. In addition, reactive infiltration has been achieved where a bi-phased intermetallic layer has been identified to be formed between the solid Fe and liquid Al. In particular, a tongue-like Fe₂Al₅ phase adhering to the Fe and a needle-like Fe₄Al₁₃ phase adhering to the Al were observed. The growth of the intermetallic compound was found to be dependent on the temperature gradient present along the preform as well as on the reaction time which will be discussed in view of the different obtained results.Keywords: combined synchrotron radiography and diffraction, Fe-Al intermetallic compounds, in-situ molten Al infiltration, porous solid Fe preforms
Procedia PDF Downloads 2263005 Reverse Logistics Network Optimization for E-Commerce
Authors: Albert W. K. Tan
Abstract:
This research consolidates a comprehensive array of publications from peer-reviewed journals, case studies, and seminar reports focused on reverse logistics and network design. By synthesizing this secondary knowledge, our objective is to identify and articulate key decision factors crucial to reverse logistics network design for e-commerce. Through this exploration, we aim to present a refined mathematical model that offers valuable insights for companies seeking to optimize their reverse logistics operations. The primary goal of this research endeavor is to develop a comprehensive framework tailored to advising organizations and companies on crafting effective networks for their reverse logistics operations, thereby facilitating the achievement of their organizational goals. This involves a thorough examination of various network configurations, weighing their advantages and disadvantages to ensure alignment with specific business objectives. The key objectives of this research include: (i) Identifying pivotal factors pertinent to network design decisions within the realm of reverse logistics across diverse supply chains. (ii) Formulating a structured framework designed to offer informed recommendations for sound network design decisions applicable to relevant industries and scenarios. (iii) Propose a mathematical model to optimize its reverse logistics network. A conceptual framework for designing a reverse logistics network has been developed through a combination of insights from the literature review and information gathered from company websites. This framework encompasses four key stages in the selection of reverse logistics operations modes: (1) Collection, (2) Sorting and testing, (3) Processing, and (4) Storage. Key factors to consider in reverse logistics network design: I) Centralized vs. decentralized processing: Centralized processing, a long-standing practice in reverse logistics, has recently gained greater attention from manufacturing companies. In this system, all products within the reverse logistics pipeline are brought to a central facility for sorting, processing, and subsequent shipment to their next destinations. Centralization offers the advantage of efficiently managing the reverse logistics flow, potentially leading to increased revenues from returned items. Moreover, it aids in determining the most appropriate reverse channel for handling returns. On the contrary, a decentralized system is more suitable when products are returned directly from consumers to retailers. In this scenario, individual sales outlets serve as gatekeepers for processing returns. Considerations encompass the product lifecycle, product value and cost, return volume, and the geographic distribution of returns. II) In-house vs. third-party logistics providers: The decision between insourcing and outsourcing in reverse logistics network design is pivotal. In insourcing, a company handles the entire reverse logistics process, including material reuse. In contrast, outsourcing involves third-party providers taking on various aspects of reverse logistics. Companies may choose outsourcing due to resource constraints or lack of expertise, with the extent of outsourcing varying based on factors such as personnel skills and cost considerations. Based on the conceptual framework, the authors have constructed a mathematical model that optimizes reverse logistics network design decisions. The model will consider key factors identified in the framework, such as transportation costs, facility capacities, and lead times. The authors have employed mixed LP to find the optimal solutions that minimize costs while meeting organizational objectives.Keywords: reverse logistics, supply chain management, optimization, e-commerce
Procedia PDF Downloads 413004 Context Detection in Spreadsheets Based on Automatically Inferred Table Schema
Authors: Alexander Wachtel, Michael T. Franzen, Walter F. Tichy
Abstract:
Programming requires years of training. With natural language and end user development methods, programming could become available to everyone. It enables end users to program their own devices and extend the functionality of the existing system without any knowledge of programming languages. In this paper, we describe an Interactive Spreadsheet Processing Module (ISPM), a natural language interface to spreadsheets that allows users to address ranges within the spreadsheet based on inferred table schema. Using the ISPM, end users are able to search for values in the schema of the table and to address the data in spreadsheets implicitly. Furthermore, it enables them to select and sort the spreadsheet data by using natural language. ISPM uses a machine learning technique to automatically infer areas within a spreadsheet, including different kinds of headers and data ranges. Since ranges can be identified from natural language queries, the end users can query the data using natural language. During the evaluation 12 undergraduate students were asked to perform operations (sum, sort, group and select) using the system and also Excel without ISPM interface, and the time taken for task completion was compared across the two systems. Only for the selection task did users take less time in Excel (since they directly selected the cells using the mouse) than in ISPM, by using natural language for end user software engineering, to overcome the present bottleneck of professional developers.Keywords: natural language processing, natural language interfaces, human computer interaction, end user development, dialog systems, data recognition, spreadsheet
Procedia PDF Downloads 3133003 Studying the Effect of Reducing Thermal Processing over the Bioactive Composition of Non-Centrifugal Cane Sugar: Towards Natural Products with High Therapeutic Value
Authors: Laura Rueda-Gensini, Jader Rodríguez, Juan C. Cruz, Carolina Munoz-Camargo
Abstract:
There is an emerging interest in botanicals and plant extracts for medicinal practices due to their widely reported health benefits. A large variety of phytochemicals found in plants have been correlated with antioxidant, immunomodulatory, and analgesic properties, which makes plant-derived products promising candidates for modulating the progression and treatment of numerous diseases. Non-centrifugal cane sugar (NCS), in particular, has been known for its high antioxidant and nutritional value, but composition-wise variability due to changing environmental and processing conditions have considerably limited its use in the nutraceutical and biomedical fields. This work is therefore aimed at assessing the effect of thermal exposure during NCS production over its bioactive composition and, in turn, its therapeutic value. Accordingly, two modified dehydration methods are proposed that employ: (i) vacuum-aided evaporation, which reduces the necessary temperatures to dehydrate the sample, and (ii) window refractance evaporation, which reduces thermal exposure time. The biochemical composition of NCS produced under these two methods was compared to traditionally-produced NCS by estimating their total polyphenolic and protein content with Folin-Ciocalteu and Bradford assays, as well as identifying the major phenolic compounds in each sample via HPLC-coupled mass spectrometry. Their antioxidant activities were also compared as measured by their scavenging potential of ABTS and DPPH radicals. Results show that the two modified production methods enhance polyphenolic and protein yield in resulting NCS samples when compared to traditional production methods. In particular, reducing employed temperatures with vacuum-aided evaporation demonstrated to be superior at preserving polyphenolic compounds, as evidenced both in the total and individual polyphenol concentrations. However, antioxidant activities were not significantly different between these. Although additional studies should be performed to determine if the observed compositional differences affect other therapeutic activities (e.g., anti-inflammatory, analgesic, and immunoprotective), these results suggest that reducing thermal exposure holds great promise for the production of natural products with enhanced nutritional value.Keywords: non-centrifugal cane sugar, polyphenolic compounds, thermal processing, antioxidant activity
Procedia PDF Downloads 923002 Enhancing Temporal Extrapolation of Wind Speed Using a Hybrid Technique: A Case Study in West Coast of Denmark
Authors: B. Elshafei, X. Mao
Abstract:
The demand for renewable energy is significantly increasing, major investments are being supplied to the wind power generation industry as a leading source of clean energy. The wind energy sector is entirely dependable and driven by the prediction of wind speed, which by the nature of wind is very stochastic and widely random. This s0tudy employs deep multi-fidelity Gaussian process regression, used to predict wind speeds for medium term time horizons. Data of the RUNE experiment in the west coast of Denmark were provided by the Technical University of Denmark, which represent the wind speed across the study area from the period between December 2015 and March 2016. The study aims to investigate the effect of pre-processing the data by denoising the signal using empirical wavelet transform (EWT) and engaging the vector components of wind speed to increase the number of input data layers for data fusion using deep multi-fidelity Gaussian process regression (GPR). The outcomes were compared using root mean square error (RMSE) and the results demonstrated a significant increase in the accuracy of predictions which demonstrated that using vector components of the wind speed as additional predictors exhibits more accurate predictions than strategies that ignore them, reflecting the importance of the inclusion of all sub data and pre-processing signals for wind speed forecasting models.Keywords: data fusion, Gaussian process regression, signal denoise, temporal extrapolation
Procedia PDF Downloads 1363001 LLM-Powered User-Centric Knowledge Graphs for Unified Enterprise Intelligence
Authors: Rajeev Kumar, Harishankar Kumar
Abstract:
Fragmented data silos within enterprises impede the extraction of meaningful insights and hinder efficiency in tasks such as product development, client understanding, and meeting preparation. To address this, we propose a system-agnostic framework that leverages large language models (LLMs) to unify diverse data sources into a cohesive, user-centered knowledge graph. By automating entity extraction, relationship inference, and semantic enrichment, the framework maps interactions, behaviors, and data around the user, enabling intelligent querying and reasoning across various data types, including emails, calendars, chats, documents, and logs. Its domain adaptability supports applications in contextual search, task prioritization, expertise identification, and personalized recommendations, all rooted in user-centric insights. Experimental results demonstrate its effectiveness in generating actionable insights, enhancing workflows such as trip planning, meeting preparation, and daily task management. This work advances the integration of knowledge graphs and LLMs, bridging the gap between fragmented data systems and intelligent, unified enterprise solutions focused on user interactions.Keywords: knowledge graph, entity extraction, relation extraction, LLM, activity graph, enterprise intelligence
Procedia PDF Downloads 103000 The Relation between Cognitive Fluency and Utterance Fluency in Second Language Spoken Fluency: Studying Fluency through a Psycholinguistic Lens
Authors: Tannistha Dasgupta
Abstract:
This study explores the aspects of second language (L2) spoken fluency that are related to L2 linguistic knowledge and processing skill. It draws on Levelt’s ‘blueprint’ of the L2 speaker which discusses the cognitive issues underlying the act of speaking. However, L2 speaking assessments have largely neglected the underlying mechanism involved in language production; emphasis is given on the relationship between subjective ratings of L2 speech sample and objectively measured aspects of fluency. Hence, in this study, the relation between L2 linguistic knowledge and processing skill i.e. Cognitive Fluency (CF), and objectively measurable aspects of L2 spoken fluency i.e. Utterance Fluency (UF) is examined. The participants of the study are L2 learners of English, studying at high school level in Hyderabad, India. 50 participants with intermediate level of proficiency in English performed several lexical retrieval tasks and attention-shifting tasks to measure CF, and 8 oral tasks to measure UF. Each aspect of UF (speed, pause, and repair) were measured against the scores of CF to find out those aspects of UF which are reliable indicators of CF. Quantitative analysis of the data shows that among the three aspects of UF; speed is the best predictor of CF, and pause is weakly related to CF. The study suggests that including the speed aspect of UF could make L2 fluency assessment more reliable, valid, and objective. Thus, incorporating the assessment of psycholinguistic mechanisms into L2 spoken fluency testing, could result in fairer evaluation.Keywords: attention-shifting, cognitive fluency, lexical retrieval, utterance fluency
Procedia PDF Downloads 7112999 Digitalisation of the Railway Industry: Recent Advances in the Field of Dialogue Systems: Systematic Review
Authors: Andrei Nosov
Abstract:
This paper discusses the development directions of dialogue systems within the digitalisation of the railway industry, where technologies based on conversational AI are already potentially applied or will be applied. Conversational AI is one of the popular natural language processing (NLP) tasks, as it has great prospects for real-world applications today. At the same time, it is a challenging task as it involves many areas of NLP based on complex computations and deep insights from linguistics and psychology. In this review, we focus on dialogue systems and their implementation in the railway domain. We comprehensively review the state-of-the-art research results on dialogue systems and analyse them from three perspectives: type of problem to be solved, type of model, and type of system. In particular, from the perspective of the type of tasks to be solved, we discuss characteristics and applications. This will help to understand how to prioritise tasks. In terms of the type of models, we give an overview that will allow researchers to become familiar with how to apply them in dialogue systems. By analysing the types of dialogue systems, we propose an unconventional approach in contrast to colleagues who traditionally contrast goal-oriented dialogue systems with open-domain systems. Our view focuses on considering retrieval and generative approaches. Furthermore, the work comprehensively presents evaluation methods and datasets for dialogue systems in the railway domain to pave the way for future research. Finally, some possible directions for future research are identified based on recent research results.Keywords: digitalisation, railway, dialogue systems, conversational AI, natural language processing, natural language understanding, natural language generation
Procedia PDF Downloads 632998 Predictive Analysis of Chest X-rays Using NLP and Large Language Models with the Indiana University Dataset and Random Forest Classifier
Authors: Azita Ramezani, Ghazal Mashhadiagha, Bahareh Sanabakhsh
Abstract:
This study researches the combination of Random. Forest classifiers with large language models (LLMs) and natural language processing (NLP) to improve diagnostic accuracy in chest X-ray analysis using the Indiana University dataset. Utilizing advanced NLP techniques, the research preprocesses textual data from radiological reports to extract key features, which are then merged with image-derived data. This improved dataset is analyzed with Random Forest classifiers to predict specific clinical results, focusing on the identification of health issues and the estimation of case urgency. The findings reveal that the combination of NLP, LLMs, and machine learning not only increases diagnostic precision but also reliability, especially in quickly identifying critical conditions. Achieving an accuracy of 99.35%, the model shows significant advancements over conventional diagnostic techniques. The results emphasize the large potential of machine learning in medical imaging, suggesting that these technologies could greatly enhance clinician judgment and patient outcomes by offering quicker and more precise diagnostic approximations.Keywords: natural language processing (NLP), large language models (LLMs), random forest classifier, chest x-ray analysis, medical imaging, diagnostic accuracy, indiana university dataset, machine learning in healthcare, predictive modeling, clinical decision support systems
Procedia PDF Downloads 472997 Sentiment Analysis of Chinese Microblog Comments: Comparison between Support Vector Machine and Long Short-Term Memory
Authors: Xu Jiaqiao
Abstract:
Text sentiment analysis is an important branch of natural language processing. This technology is widely used in public opinion analysis and web surfing recommendations. At present, the mainstream sentiment analysis methods include three parts: sentiment analysis based on a sentiment dictionary, based on traditional machine learning, and based on deep learning. This paper mainly analyzes and compares the advantages and disadvantages of the SVM method of traditional machine learning and the Long Short-term Memory (LSTM) method of deep learning in the field of Chinese sentiment analysis, using Chinese comments on Sina Microblog as the data set. Firstly, this paper classifies and adds labels to the original comment dataset obtained by the web crawler, and then uses Jieba word segmentation to classify the original dataset and remove stop words. After that, this paper extracts text feature vectors and builds document word vectors to facilitate the training of the model. Finally, SVM and LSTM models are trained respectively. After accuracy calculation, it can be obtained that the accuracy of the LSTM model is 85.80%, while the accuracy of SVM is 91.07%. But at the same time, LSTM operation only needs 2.57 seconds, SVM model needs 6.06 seconds. Therefore, this paper concludes that: compared with the SVM model, the LSTM model is worse in accuracy but faster in processing speed.Keywords: sentiment analysis, support vector machine, long short-term memory, Chinese microblog comments
Procedia PDF Downloads 962996 Profiling Risky Code Using Machine Learning
Authors: Zunaira Zaman, David Bohannon
Abstract:
This study explores the application of machine learning (ML) for detecting security vulnerabilities in source code. The research aims to assist organizations with large application portfolios and limited security testing capabilities in prioritizing security activities. ML-based approaches offer benefits such as increased confidence scores, false positives and negatives tuning, and automated feedback. The initial approach using natural language processing techniques to extract features achieved 86% accuracy during the training phase but suffered from overfitting and performed poorly on unseen datasets during testing. To address these issues, the study proposes using the abstract syntax tree (AST) for Java and C++ codebases to capture code semantics and structure and generate path-context representations for each function. The Code2Vec model architecture is used to learn distributed representations of source code snippets for training a machine-learning classifier for vulnerability prediction. The study evaluates the performance of the proposed methodology using two datasets and compares the results with existing approaches. The Devign dataset yielded 60% accuracy in predicting vulnerable code snippets and helped resist overfitting, while the Juliet Test Suite predicted specific vulnerabilities such as OS-Command Injection, Cryptographic, and Cross-Site Scripting vulnerabilities. The Code2Vec model achieved 75% accuracy and a 98% recall rate in predicting OS-Command Injection vulnerabilities. The study concludes that even partial AST representations of source code can be useful for vulnerability prediction. The approach has the potential for automated intelligent analysis of source code, including vulnerability prediction on unseen source code. State-of-the-art models using natural language processing techniques and CNN models with ensemble modelling techniques did not generalize well on unseen data and faced overfitting issues. However, predicting vulnerabilities in source code using machine learning poses challenges such as high dimensionality and complexity of source code, imbalanced datasets, and identifying specific types of vulnerabilities. Future work will address these challenges and expand the scope of the research.Keywords: code embeddings, neural networks, natural language processing, OS command injection, software security, code properties
Procedia PDF Downloads 1092995 Extracting the Coupled Dynamics in Thin-Walled Beams from Numerical Data Bases
Authors: Mohammad A. Bani-Khaled
Abstract:
In this work we use the Discrete Proper Orthogonal Decomposition transform to characterize the properties of coupled dynamics in thin-walled beams by exploiting numerical simulations obtained from finite element simulations. The outcomes of the will improve our understanding of the linear and nonlinear coupled behavior of thin-walled beams structures. Thin-walled beams have widespread usage in modern engineering application in both large scale structures (aeronautical structures), as well as in nano-structures (nano-tubes). Therefore, detailed knowledge in regard to the properties of coupled vibrations and buckling in these structures are of great interest in the research community. Due to the geometric complexity in the overall structure and in particular in the cross-sections it is necessary to involve computational mechanics to numerically simulate the dynamics. In using numerical computational techniques, it is not necessary to over simplify a model in order to solve the equations of motions. Computational dynamics methods produce databases of controlled resolution in time and space. These numerical databases contain information on the properties of the coupled dynamics. In order to extract the system dynamic properties and strength of coupling among the various fields of the motion, processing techniques are required. Time- Proper Orthogonal Decomposition transform is a powerful tool for processing databases for the dynamics. It will be used to study the coupled dynamics of thin-walled basic structures. These structures are ideal to form a basis for a systematic study of coupled dynamics in structures of complex geometry.Keywords: coupled dynamics, geometric complexity, proper orthogonal decomposition (POD), thin walled beams
Procedia PDF Downloads 420