Search results for: text analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1595

1385 Scaling Siamese Neural Network for Cross-Domain Few Shot Learning in Medical Imaging

Authors: Jinan Fiaidhi, Sabah Mohammed

Abstract:

Cross-domain learning in the medical field is a research challenge because many conditions, as in oncology imaging, use different imaging modalities. Moreover, in most medical learning applications, the training sample size is relatively small. Although few-shot learning (FSL) using a Siamese neural network can be trained on a small sample with remarkable accuracy, FSL fails to be effective across multiple domains because its convolution weights are set for task-specific applications. In this paper, we address this problem by enabling FSL to shift across domains: we design a two-layer FSL network that learns individually from each domain and produces a shared feature map with extra modulation, which the second layer uses to recognize important targets from mixed domains. Our initial experiments on mixed medical datasets such as Medical-MNIST reveal promising results. We aim to continue this research to perform full-scale analytics for testing our cross-domain FSL learning.

Keywords: Siamese neural network, few-shot learning, meta-learning, metric-based learning, thick data transformation and analytics

Procedia PDF Downloads 22
1384 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There is an overload of news on the web. Searching the web for useful documents on a specific topic written in the Amharic language is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in this domain that needs to be addressed. This study attempts to design an automatic Amharic news classifier using a supervised learning mechanism on four previously untouched classes. For this research, 4,182 news articles were used. Naive Bayes (NB) and decision tree (J48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifiers. The results show that these algorithms are applicable to Amharic news categorization. The best average accuracies achieved by the J48 decision tree and naive Bayes are 95.2345% and 94.6245%, respectively, using three categories. This research indicates that a typical decision tree algorithm is more applicable to Amharic news categorization.
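The evaluation procedure named in this abstract, k-fold cross-validation of a naive Bayes text classifier, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the add-one smoothing, the interleaved fold assignment, and the tiny English toy corpus (standing in for tokenized Amharic articles) are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model from token lists and labels."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_accuracy(docs, labels, k=3):
    """Average accuracy over k folds; fold f holds out items with index % k == f."""
    accuracies = []
    for fold in range(k):
        train_idx = [i for i in range(len(docs)) if i % k != fold]
        test_idx = [i for i in range(len(docs)) if i % k == fold]
        model = train_nb([docs[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(predict_nb(model, docs[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k


# Hypothetical toy corpus; real input would be tokenized Amharic news articles.
DOCS = [
    ["goal", "match", "team"], ["election", "vote", "party"],
    ["team", "coach", "goal"], ["party", "minister", "vote"],
    ["match", "goal", "coach"], ["vote", "election", "minister"],
]
LABELS = ["sport", "politics", "sport", "politics", "sport", "politics"]

print(k_fold_accuracy(DOCS, LABELS, k=3))
```

Averaging over folds, rather than scoring a single train/test split, is what lets a modest corpus give a more stable accuracy estimate, since every article is held out exactly once.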

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 166
1383 Google Translate: AI Application

Authors: Shaima Almalhan, Lubna Shukri, Miriam Talal, Safaa Teskieh

Abstract:

Since artificial intelligence is a rapidly evolving topic that has had a significant impact on technical growth and innovation, this paper examines people's awareness, use, and engagement with the Google Translate application. To see how familiar users are with the app and its features, quantitative and qualitative research was conducted. The findings revealed that consumers have a high level of confidence in the application, and showed how far people benefit from this sort of innovation and how convenient it makes communication.

Keywords: artificial intelligence, google translate, speech recognition, language translation, camera translation, speech to text, text to speech

Procedia PDF Downloads 126
1382 A Rational Intelligent Agent to Promote Metacognition a Situation of Text Comprehension

Authors: Anass Hsissi, Hakim Allali, Abdelmajid Hajami

Abstract:

This article presents the results of doctoral research that aims to integrate a metacognitive dimension into the design of computing environments for human learning (ILE). We conducted a detailed study of the relationship between metacognitive processes and learning, specifically their positive impact on the performance of learners in the area of reading comprehension. Our contribution is to implement methods, using an intelligent agent based on the BDI paradigm, that ensure intelligent and reliable support for weak readers, in order to encourage regulation and a conscious and rational use of their metacognitive abilities.

Keywords: metacognition, text comprehension EIAH, autoregulation, BDI agent

Procedia PDF Downloads 301
1381 Research on the Landscape of Xi'an Ancient City Based on the Poetry Text of Tang Dynasty

Authors: Zou Yihui

Abstract:

The integration of the traditional landscape of the ancient city with the poet's emotions and symbolization in ancient poetry is the unique cultural gene and spiritual core of the historical city. Re-understanding the historical landscape pattern from poetry is conducive to continuing the historical city context and to improving the current situation, in which the poetic quality of the modern historical urban landscape is gradually declining. Starting from Tang poetry, semantic analysis methods were combined with text mining technology: entry mining, word frequency analysis, and cluster analysis of the landscape information of Tang Chang'an City were carried out, and a method framework for analyzing urban landscape form based on poetry text was constructed. Nearly 160 poems describing the landscape of Tang Chang'an City were screened, and the poetic landscape characteristics of Tang Chang'an City were sorted out locally in order to combine them with modern urban spatial development and continue the urban spatial context.

Keywords: Tang Chang'an City, poetic texts, semantic analysis, historical landscape

Procedia PDF Downloads 15
1380 Increasing the Ability of State Senior High School 12 Pekanbaru Students in Writing an Analytical Exposition Text through Comic Strips

Authors: Budiman Budiman

Abstract:

This research aimed to describe and test whether students’ ability to write analytical exposition text is increased by using comic strips at SMAN 12 Pekanbaru. The respondents of this study were second-grade students, specifically class XI Science 3 in the 2011-2012 academic year. The total number of students in this class was forty-two (42). The quantitative and qualitative data were collected using a writing test and observation sheets. The research findings reveal a significant increase in students’ ability to write analytical exposition text through comic strips. This is shown by the average pre-test score of 43.7 and the average post-test score of 65.37. In addition, the students’ interest and motivation in learning also improved. This can be seen from the increase in students’ awareness and activeness in the learning process, based on the observation sheets. The findings suggest that the use of comic strips in teaching and learning is beneficial for better learning outcomes.

Keywords: analytical exposition, comic strips, secondary school students, writing ability

Procedia PDF Downloads 135
1379 Improving Technical Translation Ability of the Iranian Students of Translation Through Multimedia: An Empirical Study

Authors: Dina Zakeri, Ali Aminzad

Abstract:

Multimedia-assisted teaching results in eliminating traditional training barriers, facilitating the cognition process and upgrading learning outcomes. This study attempted to examine the effects of implementing multimedia on teaching technical translation model and on the technical text translation ability of Iranian students of translation. To fulfill the purpose of the study, a total of forty-six learners were selected out of fifty-seven participants in a higher education center in Tehran based on their scores in Preliminary English Test (PET) and were divided randomly into the experimental and control groups. Prior to the treatment, a technical text translation questionnaire was devised and then approved and validated by three assistant professors of technical fields and three assistant professors of Teaching English as a Foreign Language (TEFL) at the university. This questionnaire was administered as a pretest to both groups. Control and experimental groups were trained for five successive weeks using identical course books but with a different lesson plan that allowed employing multimedia for the experimental group only. The devised and approved questionnaire was administered as a posttest to both groups at the end of the instruction. A multivariate ANOVA was run to compare the two groups’ means on the PET, pretest and posttest. The results showed the rejection of all null hypotheses of the study and revealed that multimedia significantly improved technical text translation ability of the learners.

Keywords: multimedia, multimedia-mediated teaching, technical translation model, technical text, translation ability

Procedia PDF Downloads 102
1378 Temporality, Place and Autobiography in J.M. Coetzee’s 'Summertime'

Authors: Barbara Janari

Abstract:

In this paper it is argued that the effect of the disjunctive temporality in Summertime (the third of J.M. Coetzee’s fictionalised memoirs) is two-fold: firstly, it reflects the memoir’s ambivalent, contradictory representations of place in order to emphasize the fractured sense of self growing up in South Africa during apartheid entailed for Coetzee. Secondly, it reconceives the autobiographical discourse as one that foregrounds the inherent fictionality of all texts. The memoir’s narrative is filtered through intricate textual strategies that disrupt the chronological movement of the narrative, evoking the labyrinthine ways in which the past and present intersect and interpenetrate each other. It is framed by entries from Coetzee’s Notebooks: it opens with entries that cover the years 1972–1975, and ends with a number of undated fragments from his Notebooks. Most of the entries include a short ‘memo’ at the end, added between 1999 and 2000. While the memos follow the Notebook entries in the text, they are separated by decades. Between the Notebook entries is a series of interviews conducted by Vincent, the text’s putative biographer, between 2007 and 2008, based on recollections from five people who had known Coetzee in the 1970s – a key period in John’s life as it marks both his return to South Africa after a failed emigration attempt to America, and the beginning of his writing career, with the publication of Dusklands in 1974. The relationship between the memoir’s various parts is a key feature of Coetzee’s representation of place in Summertime, which is constructed as a composite one in which the principle of reflexive referencing has to be adopted. In other words, readers have to suspend individual references temporarily until the relationships between the parts have been connected to each other. In order to apprehend meaning in the text, the disparate narrative elements have to first be tied together. 
In this text, then, the experience of time as ordered and chronological is ruptured. Instead, the memoir’s themes and patterns become apparent most clearly through reflexive referencing, by which relationships between disparate sections of the text are linked. The image of the fictional John that emerges from the text is a composite of this John and the author, J.M. Coetzee, and is one which embodies Coetzee’s often fraught relationship with his home country, South Africa.

Keywords: autobiography, place, reflexive referencing, temporality

Procedia PDF Downloads 47
1377 Evaluating the Total Costs of a Ransomware-Resilient Architecture for Healthcare Systems

Authors: Sreejith Gopinath, Aspen Olmsted

Abstract:

This paper is based on our previous work that proposed a risk-transference-based architecture for healthcare systems to store sensitive data outside the system boundary, rendering the system unattractive to would-be bad actors. This architecture also allows a compromised system to be abandoned and a new system instance spun up in place to ensure business continuity without paying a ransom or engaging with a bad actor. This paper delves into the details of various attacks we simulated against the prototype system. In the paper, we discuss at length the time and computational costs associated with storing and retrieving data in the prototype system, abandoning a compromised system, and setting up a new instance with existing data. Lastly, we simulate some analytical workloads over the data stored in our specialized data storage system and discuss the time and computational costs associated with running analytics over data in a specialized storage system outside the system boundary. In summary, this paper discusses the total costs of data storage, access, and analytics incurred with the proposed architecture.

Keywords: cybersecurity, healthcare, ransomware, resilience, risk transference

Procedia PDF Downloads 113
1376 Effect of Mobile Phone Text Message Reminders on Adherence to Routine Prenatal Iron/Folic Acid Supplement among Pregnant Women: A Pilot Study

Authors: Nneka U. Igboeli, Maxwell O. Adibe

Abstract:

Iron and folate supplementation in pregnancy are important interventions that prevent maternal anaemia and fetal anomaly. Thus, daily oral doses of iron and folic acid are recommended throughout pregnancy as part of antenatal care. However, low adherence has been a major drawback leading to low effectiveness of these programs. The effect on adherence of mobile text message reminders to pregnant women to take their routine medications was evaluated in this study. The first 100 women who consented to the study were recruited and randomized either to receive a text message reminder on adherence to routine medications or not. Adherence was assessed using the 8-item Modified Morisky Adherence Scale (8-MMAS). The folders of successfully recruited women were tagged with the study number assigned to each of them. The women’s phone numbers were collected and were used to send text message reminders on adhering to routine drugs only to women in the intervention group. The text messages were sent three times per week for a period of four weeks, with an adherence reassessment at the one-month follow-up antenatal visit for recruited women. At the one-month follow-up, those lost to follow-up were 6 (16%) women in the intervention group and 17 (34%) in the control group. The across-group mean difference in adherence score was 0.07 (-0.96 – 1.10) at baseline and 0.3 (-0.31 – 0.92) after intervention, both insignificant at p > 0.05. The within-group changes were increases of 0.58 (0.00 – 1.16) (p = 0.05) from baseline for the intervention group and 0.35 (-0.51 – 1.20) (p = 0.395) for the control group. Non-significant increases in adherence scores were recorded for both groups. However, the increase in adherence scores of women in the intervention group was greater and shows promise for more positive results if the study period is lengthened and study drop-outs are reduced.

Keywords: adherence, mobile phone, pregnant women, reminders

Procedia PDF Downloads 152
1375 Systematic Mapping Study of Digitization and Analysis of Manufacturing Data

Authors: R. Clancy, M. Ahern, D. O’Sullivan, K. Bruton

Abstract:

The manufacturing industry is currently undergoing a digital transformation as part of the mega-trend Industry 4.0. As part of this phase of the industrial revolution, traditional manufacturing processes are being combined with digital technologies to achieve smarter and more efficient production. To successfully digitally transform a manufacturing facility, the processes must first be digitized. This is the conversion of information from an analogue format to a digital format. The objective of this study was to explore the research area of digitizing manufacturing data as part of the worldwide paradigm, Industry 4.0. The formal methodology of a systematic mapping study was utilized to capture a representative sample of the research area and assess its current state. Specific research questions were defined to assess the key benefits and limitations associated with the digitization of manufacturing data. Research papers were classified according to the type of research and type of contribution to the research area. Upon analyzing 54 papers identified in this area, it was noted that 23 of the papers originated in Germany. This is an unsurprising finding as Industry 4.0 is originally a German strategy with supporting strong policy instruments being utilized in Germany to support its implementation. It was also found that the Fraunhofer Institute for Mechatronic Systems Design, in collaboration with the University of Paderborn in Germany, was the most frequent contributing Institution of the research papers with three papers published. The literature suggested future research directions and highlighted one specific gap in the area. There exists an unresolved gap between the data science experts and the manufacturing process experts in the industry. The data analytics expertise is not useful unless the manufacturing process information is utilized. 
A legitimate understanding of the data is crucial to perform accurate analytics and gain true, valuable insights into the manufacturing process. There lies a gap between the manufacturing operations and the information technology/data analytics departments within enterprises, which was borne out by the results of many of the case studies reviewed as part of this work. To test the concept of this gap existing, the researcher initiated an industrial case study in which they embedded themselves between the subject matter expert of the manufacturing process and the data scientist. Of the papers resulting from the systematic mapping study, 12 of the papers contributed a framework, another 12 of the papers were based on a case study, and 11 of the papers focused on theory. However, there were only three papers that contributed a methodology. This provides further evidence for the need for an industry-focused methodology for digitizing and analyzing manufacturing data, which will be developed in future research.

Keywords: analytics, digitization, industry 4.0, manufacturing

Procedia PDF Downloads 85
1374 Multimodal Sentiment Analysis With Web Based Application

Authors: Shreyansh Singh, Afroz Ahmed

Abstract:

Sentiment analysis aims to automatically uncover the underlying attitude that we hold towards an entity. The aggregation of this sentiment over a population amounts to opinion polling and has various applications. Current text-based sentiment analysis depends on the construction of word embeddings and machine learning models that learn sentiment from large text corpora. Sentiment analysis from text is currently widely used for customer satisfaction assessment and brand perception analysis. With the growth of online media, multimodal sentiment analysis is set to bring new opportunities, with the arrival of complementary data streams for improving and going beyond text-based sentiment analysis using new transform methods. Since sentiment can be detected through the affective traces it leaves, such as facial and vocal displays, multimodal sentiment analysis offers promising avenues for analyzing facial and vocal expressions in addition to the transcript or textual content. These approaches use Recurrent Neural Networks (RNNs) with LSTM models to increase their performance. In this study, we define sentiment and the problem of multimodal sentiment analysis, and review recent developments in multimodal sentiment analysis in various domains, including spoken reviews, images, video blogs, human-machine interactions, and human-human interactions. Challenges and opportunities of this emerging field are also discussed, supporting our thesis that multimodal sentiment analysis holds significant untapped potential.

Keywords: sentiment analysis, RNN, LSTM, word embeddings

Procedia PDF Downloads 95
1373 A Word-to-Vector Formulation for Word Representation

Authors: Sandra Rizkallah, Amir F. Atiya

Abstract:

This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.
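The geometric idea in this abstract can be illustrated with a short sketch: words are unit vectors on a sphere, similarity is their dot product, and an antonym is the negated vector. The three-dimensional vectors and the word choices below are hand-picked, hypothetical values for illustration only; the abstract does not give the authors' training procedure or dimensionality.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so that it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity(u, v):
    """Dot product of two unit vectors; the value lies between -1 and 1."""
    return sum(a * b for a, b in zip(u, v))


def antonym_of(vec):
    """The antipodal point on the sphere, i.e. the negated vector."""
    return [-x for x in vec]


# Hypothetical embeddings, chosen by hand rather than learned.
hot = normalize([3.0, 4.0, 0.0])
warm = normalize([4.0, 3.0, 0.0])
cold = antonym_of(hot)

print(similarity(hot, warm))  # close to 1: near-synonyms point in similar directions
print(similarity(hot, cold))  # close to -1: an antonym is the opposite point
```

Because every vector has unit length, the dot product is also the cosine of the angle between two words, so the full range from -1 to 1 is available to encode polarity as well as closeness.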

Keywords: natural language processing, word to vector, text similarity, text mining

Procedia PDF Downloads 243
1372 Fake News Detection for Korean News Using Machine Learning Techniques

Authors: Tae-Uk Yun, Pullip Chung, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Fake news is defined as news articles that are intentionally and verifiably false and could mislead readers. The spread of fake news may provoke anxiety, chaos, fear, or irrational decisions among the public. Thus, detecting fake news and preventing its spread has become a very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible for a human to identify it. In this context, researchers have tried to develop automated fake news detection using machine learning techniques over the past years. However, to the best of our knowledge, no prior studies have proposed an automated fake news detection method for Korean news. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news contents to be analyzed are converted to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). Then, in step 2, classifiers are trained using the values produced in step 1. As the classifiers, machine learning techniques such as logistic regression, backpropagation network, support vector machine, and deep neural network can be applied. To validate the effectiveness of the proposed method, we collected about 200 short Korean news items from Seoul National University’s FactCheck, which provides detailed analysis reports from 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.
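The two-step method described in this abstract can be sketched as follows: step 1 turns tokenized news into TF-IDF vectors, and step 2 trains a classifier on those vectors. This is only an illustrative sketch under several assumptions: the function names are hypothetical, the IDF variant (log of N over document frequency, plus one) is one common choice among several, logistic regression is picked from the classifiers the abstract lists, and the English toy headlines stand in for tokenized Korean news.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Step 1: map each token list to a TF-IDF vector over a sorted vocabulary."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    # Assumed IDF variant: log(N / df) + 1, so that no term is weighted to zero.
    idf = {tok: math.log(n_docs / doc_freq[tok]) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[tok] / len(doc) * idf[tok] for tok in vocab])
    return vocab, vectors


def train_logistic(vectors, labels, epochs=500, lr=1.0):
    """Step 2: fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log-loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (fake) if the linear score is positive, otherwise 0 (real)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if z > 0 else 0


# Hypothetical toy data: 1 = fake, 0 = real.
DOCS = [
    ["shocking", "secret", "cure"], ["official", "report", "budget"],
    ["secret", "miracle", "cure"], ["budget", "vote", "report"],
]
LABELS = [1, 0, 1, 0]

VOCAB, VECTORS = tfidf_vectors(DOCS)
WEIGHTS, BIAS = train_logistic(VECTORS, LABELS)
print([predict(WEIGHTS, BIAS, x) for x in VECTORS])
```

Keeping the two steps as separate functions mirrors the abstract's design: the feature extraction in step 1 can be swapped (for example, for topic-model features) without changing the classifier in step 2.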

Keywords: fake news detection, Korean news, machine learning, text mining

Procedia PDF Downloads 245
1371 Translation and Ideology: New Perspectives

Authors: Hamza Salih

Abstract:

Since translation is no longer viewed as a mere replacement of linguistic codes from one language to another, it has increasingly been considered, especially with the advent of the cultural turn in the late 70's, in relation to the broader external context in which it takes place. According to scholars in the field, the translation process is determined by the political, economic and cultural values which exert external pressures on the translator. Correspondingly, the relationship between translation as an act of re-writing the original text and ideology has already been established. This paper addresses the issue of how ideology comes into play in the translational process and what strategies the translator adopts to foreground or circumvent ideological constraints. Along with this, the paper will touch upon the notions of censorship, manipulation, subversion and domestication which are deemed of relevance to this very topic. In fact, after the domination of the empirically-oriented linguistic approaches in translation studies, the relationship between translation and ideology has to be foregrounded to draw attention to the fact that the translation process is not a mere text-to-text linguistic transfer, but, on the contrary, takes place in the midst of economic, political, cultural and religious variables, which some scholars subsume under the category ideology.

Keywords: translation, language, ideology, subversion, censorship and manipulation

Procedia PDF Downloads 226
1370 How Is a Machine-Translated Literary Text Organized in Coherence? An Analysis Based upon Theme-Rheme Structure

Authors: Jiang Niu, Yue Jiang

Abstract:

With the ultimate goal to automatically generate translated texts with high quality, machine translation has made tremendous improvements. However, its translations of literary works are still plagued with problems in coherence, esp. the translation between distant language pairs. One of the causes of the problems is probably the lack of linguistic knowledge to be incorporated into the training of machine translation systems. In order to enable readers to better understand the problems of machine translation in coherence, to seek out the potential knowledge to be incorporated, and thus to improve the quality of machine translation products, this study applies Theme-Rheme structure to examine how a machine-translated literary text is organized and developed in terms of coherence. Theme-Rheme structure in Systemic Functional Linguistics is a useful tool for analysis of textual coherence. Theme is the departure point of a clause and Rheme is the rest of the clause. In a text, as Themes and Rhemes may be connected with each other in meaning, they form thematic and rhematic progressions throughout the text. Based on this structure, we can look into how a text is organized and developed in terms of coherence. Methodologically, we chose Chinese and English as the language pair to be studied. Specifically, we built a comparable corpus with two modes of English translations, viz. machine translation (MT) and human translation (HT) of one Chinese literary source text. The translated texts were annotated with Themes, Rhemes and their progressions throughout the texts. The annotated texts were analyzed from two respects, the different types of Themes functioning differently in achieving coherence, and the different types of thematic and rhematic progressions functioning differently in constructing texts. 
By analyzing and contrasting the two modes of translations, it is found that compared with the HT, 1) the MT features “pseudo-coherence”, with lots of ill-connected fragments of information using “and”; 2) the MT system produces a static and less interconnected text that reads like a list; these two points, in turn, lead to the less coherent organization and development of the MT than that of the HT; 3) novel to traditional and previous studies, Rhemes do contribute to textual connection and coherence though less than Themes do and thus are worthy of notice in further studies. Hence, the findings suggest that Theme-Rheme structure be applied to measuring and assessing the coherence of machine translation, to being incorporated into the training of the machine translation system, and Rheme be taken into account when studying the textual coherence of both MT and HT.

Keywords: coherence, corpus-based, literary translation, machine translation, Theme-Rheme structure

Procedia PDF Downloads 180
1369 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still at an early stage, as this is a relatively new phenomenon in terms of the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays, especially in those cases where we have tacit knowledge or knowledge that is not known. We used machine learning algorithms for the identification of fake news, applying three classifiers: Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and then incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake news due to the unavailability of corpora. We applied three different machine learning classifiers to two publicly available datasets. Experimental analysis based on the existing datasets indicates very encouraging and improved performance.
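Of the three classifiers named in this abstract, Passive Aggressive is the least widely known, so a minimal sketch of its standard online update may help. The code shows the basic textbook form for binary labels in {-1, +1}; the abstract does not say which variant or library the authors used, and the function name and the example vector are hypothetical.

```python
def pa_update(weights, x, y):
    """One Passive Aggressive step for a feature vector x and a label y in {-1, +1}."""
    margin = y * sum(w * xi for w, xi in zip(weights, x))
    loss = max(0.0, 1.0 - margin)  # hinge loss on this single example
    norm_sq = sum(xi * xi for xi in x)
    if loss == 0.0 or norm_sq == 0.0:
        # Passive: the example is already classified with margin at least 1
        # (or carries no features), so the weights are left unchanged.
        return list(weights)
    # Aggressive: the smallest change to the weights that brings the
    # margin on this example up to 1.
    tau = loss / norm_sq
    return [w + tau * y * xi for w, xi in zip(weights, x)]


# Hypothetical two-feature example labelled as fake (+1).
w0 = [0.0, 0.0]
w1 = pa_update(w0, [1.0, 2.0], 1)
w2 = pa_update(w1, [1.0, 2.0], 1)  # second pass: margin is met, so no change
print(w1, w2)
```

The update needs only the current example, which is why this family of classifiers suits streams of news items that arrive over time.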

Keywords: fake news detection, natural language processing, machine learning, classification techniques.

Procedia PDF Downloads 135
1368 Data Mining in Healthcare for Predictive Analytics

Authors: Ruzanna Muradyan

Abstract:

Medical data mining is a crucial field in contemporary healthcare that offers cutting-edge tactics with enormous potential to transform patient care. This abstract examines how sophisticated data mining techniques could transform the healthcare industry, with a special focus on how they might improve patient outcomes. Healthcare data repositories have dynamically evolved, producing a rich tapestry of different, multi-dimensional information that includes genetic profiles, lifestyle markers, electronic health records, and more. By utilizing data mining techniques inside this vast library, a variety of prospects for precision medicine, predictive analytics, and insight production become visible. Predictive modeling for illness prediction, risk stratification, and therapy efficacy evaluations are important points of focus. Healthcare providers may use this abundance of data to tailor treatment plans, identify high-risk patient populations, and forecast disease trajectories by applying machine learning algorithms and predictive analytics. Better patient outcomes, more efficient use of resources, and early treatments are made possible by this proactive strategy. Furthermore, data mining techniques act as catalysts to reveal complex relationships between apparently unrelated data pieces, providing enhanced insights into the cause of disease, genetic susceptibilities, and environmental factors. Healthcare practitioners can get practical insights that guide disease prevention, customized patient counseling, and focused therapies by analyzing these associations. The abstract explores the problems and ethical issues that come with using data mining techniques in the healthcare industry. In order to properly use these approaches, it is essential to find a balance between data privacy, security issues, and the interpretability of complex models. 
Finally, this abstract demonstrates the revolutionary power of modern data mining methodologies in transforming the healthcare sector. Healthcare practitioners and researchers can uncover unique insights, enhance clinical decision-making, and ultimately elevate patient care to unprecedented levels of precision and efficacy by employing cutting-edge methodologies.

Keywords: data mining, healthcare, patient care, predictive analytics, precision medicine, electronic health records, machine learning, predictive modeling, disease prognosis, risk stratification, treatment efficacy, genetic profiles, precision health

Procedia PDF Downloads 33
1367 Competitive DNA Calibrators as Quality Reference Standards (QRS™) for Germline and Somatic Copy Number Variations/Variant Allelic Frequencies Analyses

Authors: Eirini Konstanta, Cedric Gouedard, Aggeliki Delimitsou, Stefania Patera, Samuel Murray

Abstract:

Introduction: Quality reference DNA standards (QRS) for molecular testing by next-generation sequencing (NGS) are essential for accurate quantitation of copy number variations (CNV) for germline and variant allelic frequencies (VAF) for somatic analyses. Objectives: Presently, several molecular analytics for oncology patients are reliant upon quantitative metrics. Test validation and standardisation are also reliant upon the availability of surrogate control materials that allow the test's limit of detection (LOD), sensitivity, and specificity to be understood. We have developed a dual calibration platform allowing for QRS pairs to be included in analysed DNA samples, allowing for accurate quantitation of CNV and VAF metrics within and between patient samples. Methods: QRS™ blocks up to 500nt were designed for common NGS panel targets incorporating ≥ 2 identification tags (IDTDNA.com). These were analysed upon spiking into gDNA, somatic, and ctDNA using a proprietary CalSuite™ platform adaptable to common LIMS. Results: We demonstrate QRS™ calibration reproducibility spiked to 5–25% at ± 2.5% in gDNA and ctDNA. Furthermore, we demonstrate CNV and VAF within and between samples (gDNA and ctDNA) with the same reproducibility (± 2.5%) in a clinical sample of lung cancer and HBOC (EGFR and BRCA1, respectively). CNV analytics was performed with similar accuracy using a single pair of QRS calibrators when using multiple single targeted sequencing controls. Conclusion: Dual paired QRS™ calibrators allow for accurate and reproducible quantitative analyses of CNV, VAF, intrinsic sample allele measurement, and inter- and intra-sample measurement, not only simplifying NGS analytics but also allowing clinically relevant biomarker VAF to be monitored across patient ctDNA samples with improved accuracy.

Keywords: calibrator, CNV, gene copy number, VAF

Procedia PDF Downloads 125
1366 Thick Data Analytics for Learning Cataract Severity: A Triplet Loss Siamese Neural Network Model

Authors: Jinan Fiaidhi, Sabah Mohammed

Abstract:

Diagnosing cataract severity is an important factor in deciding to undertake surgery. It is usually conducted by an ophthalmologist or through taking a variety of fundus photographs that need to be examined by the ophthalmologist. This paper carries out an investigation using a Siamese neural net that can be trained with small anchor samples to score cataract severity. The model used in this paper is based on a triplet loss function that draws on the ophthalmologist's best experience in rating positive and negative anchors against a specific cataract scaling system. This approach, which takes in the heuristics of the ophthalmologist, is generally called the thick data approach, a kind of machine learning approach that learns from a few shots. Clinical Relevance: The lens of the eye is mostly made up of water and proteins. A cataract occurs when these proteins in the eye lens start to clump together and block light, causing impaired vision. This research aims at employing thick data machine learning techniques to rate the severity of the cataract using a Siamese neural network.
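A triplet loss compares an anchor sample with a positive sample (same grade) and a negative sample (different grade). The sketch below shows one common hinge-style form of that loss on toy embeddings. The Euclidean distance, the margin value, and the embedding vectors are all assumptions for illustration; the abstract does not state which variant or values the authors used.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(gap, 0.0)


# Toy 2-D embeddings (hypothetical values, not taken from the paper).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # assumed to share the anchor's severity grade
negative = [3.0, 4.0]  # assumed to have a different severity grade

print(triplet_loss(anchor, positive, negative))
```

In a Siamese setup the three embeddings would come from the same network with shared weights, and the loss would be averaged over many triplets during training.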

Keywords: thick data analytics, siamese neural network, triplet-loss model, few shot learning

Procedia PDF Downloads 75
1365 Twitter Sentiment Analysis during the Lockdown on New-Zealand

Authors: Smah Almotiri

Abstract:

One of the most common fields of natural language processing (NLP) is sentiment analysis. The feeling inferred from a text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data source for sentiment analytics studies since people used social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. The processing of such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to look at how sentiment differed in a single geographic region during the lockdown at two different times. A total of 1162 tweets related to the COVID-19 pandemic lockdown were analyzed using keyword hashtags (lockdown, COVID-19). The first sample of tweets was from March 23, 2020, until April 23, 2020, and the second sample, for the following year, was from March 1, 2020, until April 4, 2020. Natural language processing (NLP), which is a form of artificial intelligence, was used in this research to calculate the sentiment value of all of the tweets using the AFINN lexicon sentiment analysis method. The findings revealed that sentiment at both times during the region's lockdown was positive in the samples of this study, which are specific to the geographical area of New Zealand. This research suggests applying machine learning sentiment methods such as Crystal Feel and extending the size of the tweet sample by collecting tweets over a longer period of time.
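AFINN-style scoring assigns each word an integer valence and sums the valences of the words found in a text. The following is a minimal sketch of that idea. The lexicon entries and the example tweets are illustrative placeholders, not the published AFINN list or the study's data, and the study itself lists NodeJS rather than Python among its keywords.

```python
import re

# Toy lexicon in the AFINN style (word -> integer valence).
# Placeholder entries for illustration only, not the published AFINN list.
TOY_LEXICON = {
    "good": 3,
    "safe": 1,
    "happy": 3,
    "bad": -3,
    "worried": -3,
    "terrible": -3,
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (drops '#', '@', digits)."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every token; words outside the lexicon count as 0."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def label(score):
    """Map a summed score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical example tweets.
tweets = [
    "Feeling safe and happy during #lockdown",
    "Worried about this terrible situation",
]
for tweet in tweets:
    score = sentiment_score(tweet)
    print(score, label(score), tweet)
```

Averaging the per-tweet scores within each sampling window would give one number per period, which is one simple way to compare the two periods.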

Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS

Procedia PDF Downloads 158
1364 Advancing in Cricket Analytics: Novel Approaches for Pitch and Ball Detection Employing OpenCV and YOLOV8

Authors: Pratham Madnur, Prathamkumar Shetty, Sneha Varur, Gouri Parashetti

Abstract:

In order to overcome conventional obstacles, this research paper investigates novel approaches for cricket pitch and ball detection that make use of cutting-edge technologies. The research integrates OpenCV for pitch inspection and modifies the YOLOv8 model for cricket ball detection in order to overcome the shortcomings of manual pitch assessment and traditional ball detection techniques. To ensure flexibility in a range of pitch environments, the pitch detection method leverages OpenCV’s color space transformation, contour extraction, and accurate color range defining features. Regarding ball detection, the YOLOv8 model emphasizes the preservation of minor object details to improve accuracy and is specifically trained to the unique properties of cricket balls. The methods are more reliable because of the careful preparation of the datasets, which include novel ball and pitch information. These cutting-edge methods not only improve cricket analytics but also set the stage for flexible methods in more general sports technology applications.

Keywords: OpenCV, YOLOv8, cricket, custom dataset, computer vision, sports

Procedia PDF Downloads 40
1363 The Use of Punctuation by Primary School Students Writing Texts Collaboratively: A Franco-Brazilian Comparative Study

Authors: Cristina Felipeto, Catherine Bore, Eduardo Calil

Abstract:

This work aims to analyze and compare the punctuation marks (PM) in school texts of Brazilian and French students and the comments on these PM made spontaneously by the students while the text was being written. Assuming textual genetics as an investigative field within a dialogical and enunciative approach, we defined a common methodological design in two 1st year classrooms (7 years old) of the primary school, one classroom in Brazil (Maceio) and the other one in France (Paris). Through a multimodal capture system of writing processes in real time and space (Ramos System), we recorded the collaborative writing proposal in dyads in each of the classrooms. This system preserves the classroom’s ecological characteristics and provides a video recording synchronized with dialogues, gestures and facial expressions of the students, the stroke of the pen’s ink on the sheet of paper and the movement of the teacher and students in the classroom. The multimodal register of the writing process allowed access to the text in progress and the comments made by the students on what was being written. In each proposed text production, teachers organized their students in dyads and requested that they talk, agree on and write a fictional narrative. We selected a Dyad of Brazilian students (BD) and another Dyad of French students (FD) and filmed 6 proposals for each of the dyads. The proposals were collected during the 2nd Term of 2013 (Brazil) and 2014 (France). In the 6 texts written by the BD, 39 PMs and 825 written words were identified (on average, a PM every 23 words). Of these 39 PMs, 27 were highlighted orally and commented on by either student. In the texts written by the FD, 48 PMs and 258 written words were identified (on average, 1 PM every 5 words). Of these 48 PM, 39 were commented on by the French students. Unlike what the studies on punctuation acquisition point out, the PM that occurred the most were hyphens (BD) and commas (FD).
Despite the significant difference between the types and quantities of PM in the written texts, the recognition of the need for writing PM in the text in progress and the comments have some common characteristics: i) the writing of the PM was not anticipated in relation to the text in progress, then they were added after the end of a sentence or after the finished text itself; ii) the need to add punctuation marks in the text came after one of the students had ‘remembered’ that a particular sign was needed; iii) most of the PM inscribed were not related to their linguistic functions, but the graphic-visual feature of the text; iv) the comments justify or explain the PM, indicating metalinguistic reflections made by the students. Our results indicate how the comments of the BD and FD express the dialogic and subjective nature of knowledge acquisition. Our study suggests that the initial learning of PM depends more on its graphic features and interactional conditions than on its linguistic functions.

Keywords: collaborative writing, erasure, graphic marks, learning, metalinguistic awareness, textual genesis

Procedia PDF Downloads 141
1362 Architectural Experience of the Everyday in Bangkok CBD

Authors: Thirayu Jumsai Na Ayudhya

Abstract:

The attempt to understand what architecture means to people as they go about their everyday lives revealed that fields of knowledge such as environmental psychology, environmental perception, and environmental aesthetics inadequately address the contextualized and holistic theoretical framework. In my previous research, it was found that people’s sense-making of their everyday architecture can be addressed in terms of four super‐ordinate themes: (1) building in urban (text), (2) building in (text), (3) building in human (text), and (4) building in time (text). In this research, Bangkok CBD was selected as the focal urban context, in which the integrated style of architecture is noticeable. It is expected that in a unique urban context like Bangkok CBD, unprecedented super-ordinate themes will be unveiled through the reflection of people’s everyday experiences. In this research, people’s architectural experience in Bangkok CBD, Thailand, will be presented succinctly. The research addresses the question of how people make sense of their everyday architecture/buildings, especially in a unique urban context, Bangkok CBD, and identifies ways in which people make sense of their everyday architecture. Two key methodologies are adopted. First, Participant-Produced-Photograph (PPP) allows people to express their experiences of the everyday urban context freely without any interference or forced data generation by researchers. Second, Interpretative Phenomenological Analysis (IPA) is applied. With the IPA methodology, a small pool of participants is considered, given the detailed level of analysis and its potential to produce a meaningful outcome.

Keywords: architectural experience, building appreciation, design psychology, environmental psychology, sense-making, the everyday experience, transactional theory

Procedia PDF Downloads 301
1361 Intertextuality as a Dialogue Between Postmodern Writer J. Fowles and Mid-English Writer J. Donne

Authors: Isahakyan Heghine

Abstract:

Intertextuality, being in the centre of attention of both linguists and literary critics, is vividly expressed in the works of the outstanding British novelist and philosopher J. Fowles. 'The Magus' is a deep psychological and philosophical novel with vivid intertextual links with Greek mythology and authors from different epochs. The aim of the paper is to show how intertextuality might serve as a dialogue between two authors (J. Fowles and J. Donne) disguised in the dialogue of two protagonists of the novel: Conchis and Nicholas. Contrasting viewpoints concerning man's isolation and loneliness are stated in the dialogue. Through conceptual analysis of the text, it becomes possible both to decode the conceptual information of the text and to find out its intertextual links.

Keywords: dialogue, conceptual analysis, isolation, intertextuality

Procedia PDF Downloads 306
1360 A Multivariate Exploratory Data Analysis of a Crisis Text Messaging Service in Order to Analyse the Impact of the COVID-19 Pandemic on Mental Health in Ireland

Authors: Hamda Ajmal, Karen Young, Ruth Melia, John Bogue, Mary O'Sullivan, Jim Duggan, Hannah Wood

Abstract:

The Covid-19 pandemic led to a range of public health mitigation strategies in order to suppress the SARS-CoV-2 virus. The drastic changes in everyday life due to lockdowns had the potential for a significant negative impact on public mental health, and a key public health goal is to now assess the evidence from available Irish datasets to provide useful insights on this issue. Text-50808 is an online text-based mental health support service, established in Ireland in 2020, and can provide a measure of revealed distress and mental health concerns across the population. The aim of this study is to explore statistical associations between public mental health in Ireland and the Covid-19 pandemic. Uniquely, this study combines two measures of emotional wellbeing in Ireland: (1) weekly text volume at Text-50808, and (2) emotional wellbeing indicators reported by respondents of the Amárach public opinion survey, carried out on behalf of the Department of Health, Ireland. For this analysis, a multivariate graphical exploratory data analysis (EDA) was performed on the Text-50808 dataset dated from 15th June 2020 to 30th June 2021. This was followed by time-series analysis of key mental health indicators including: (1) the percentage of daily/weekly texts at Text-50808 that mention Covid-19 related issues; (2) the weekly percentage of people experiencing anxiety, boredom, enjoyment, happiness, worry, fear and stress in Amárach survey; and Covid-19 related factors: (3) daily new Covid-19 case numbers; (4) daily stringency index capturing the effect of government non-pharmaceutical interventions (NPIs) in Ireland. The cross-correlation function was applied to measure the relationship between the different time series. EDA of the Text-50808 dataset reveals significant peaks in the volume of texts on days prior to level 3 lockdown and level 5 lockdown in October 2020, and full level 5 lockdown in December 2020. 
A significantly high positive correlation was observed between the percentage of texts at Text-50808 that reported Covid-19 related issues and the percentage of respondents experiencing anxiety, worry and boredom (at a lag of 1 week) in the Amárach survey data. There is a significant negative correlation between the percentage of texts with Covid-19 related issues and the percentage of respondents experiencing happiness in the Amárach survey. The daily percentage of texts at Text-50808 that reported Covid-19 related issues was found to have a weak positive correlation with daily new Covid-19 cases in Ireland at a lag of 10 days and with the daily stringency index of NPIs in Ireland at a lag of 2 days. The sudden peaks in text volume at Text-50808 immediately prior to new restrictions in Ireland indicate an association between the announcement of new restrictions and a rise in mental health concerns. There is also a high correlation between emotional wellbeing variables in the Amárach dataset and the number of weekly texts at Text-50808, and this confirms that Text-50808 reflects overall public sentiment. This analysis confirms the benefits of the texting service as a community surveillance tool for mental health in the population. This initial EDA will be extended to use multivariate modeling to predict the effect of additional Covid-19 related factors on public mental health in Ireland.
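A correlation "at a lag of k" compares one series at time t with another at time t + k. The sketch below illustrates this with a simplified variant that takes the Pearson correlation of the overlapping segments. This is an assumption made for clarity: the cross-correlation function in statistical packages typically normalizes by the full-series mean and variance instead. The two series are made-up toy data, not the Text-50808 or case-count data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] for a non-negative lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


def best_lag(x, y, max_lag):
    """Return the lag in 0..max_lag with the largest absolute correlation."""
    return max(range(max_lag + 1), key=lambda k: abs(lagged_correlation(x, y, k)))


# Toy data: `texts` repeats `cases` two steps later (hypothetical values).
cases = [1, 3, 2, 5, 4, 6, 3, 7]
texts = [0, 0, 1, 3, 2, 5, 4, 6]

for k in range(4):
    print(k, round(lagged_correlation(cases, texts, k), 3))
print("best lag:", best_lag(cases, texts, 3))
```

On real data the lag with the strongest correlation is only a descriptive summary; it does not by itself establish that one series drives the other.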

Keywords: COVID-19 pandemic, data analysis, digital health, mental health, public health

Procedia PDF Downloads 114
1359 Predicting Success and Failure in Drug Development Using Text Analysis

Authors: Zhi Hao Chow, Cian Mulligan, Jack Walsh, Antonio Garzon Vico, Dimitar Krastev

Abstract:

Drug development is resource-intensive, time-consuming, and increasingly expensive with each developmental stage. The success rates of drug development are also relatively low, and the resources committed are wasted with each failed candidate. As such, a reliable method of predicting the success of drug development is in demand. The hypothesis was that some examples of failed drug candidates are pushed through developmental pipelines based on false confidence and may possess common linguistic features identifiable through sentiment analysis. Here, the concept of using text analysis to discover such features in research publications and investor reports as predictors of success was explored. RStudio was used to perform text mining and lexicon-based sentiment analysis to identify affective phrases and determine their frequency in each document, and SPSS was then used to determine the relationship between our defined variables and the accuracy of predicting outcomes. A total of 161 publications were collected and categorised into 4 groups: (i) Cancer treatment, (ii) Neurodegenerative disease treatment, (iii) Vaccines, and (iv) Others (containing all other drugs that do not fit into the 3 categories). Text analysis was then performed on each document using 2 separate datasets (BING and AFINN) in R within the category of drugs to determine the frequency of positive or negative phrases in each document. Relative positivity and negativity values were then calculated by dividing the frequency of phrases by the word count of each document. Regression analysis was then performed with SPSS statistical software on each dataset (values from using the BING or AFINN dataset during text analysis) using a random selection of 61 documents to construct a model. The remaining documents were then used to determine the predictive power of the models. The model constructed from BING predicted the outcome of drug performance in clinical trials with an overall accuracy of 65.3%. 
The AFINN model had a lower accuracy at predicting outcomes than the BING model, at 62.5%, and was not effective at predicting the failure of drugs in clinical trials. Overall, the study did not show significant efficacy of the model at predicting outcomes of drugs in development. Many improvements may need to be made to later iterations of the model to sufficiently increase the accuracy.
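The relative positivity and negativity values described in this abstract are frequencies divided by document word count. As a rough illustration, the calculation can be sketched as follows. The word sets are small placeholders standing in for a binary positive/negative lexicon, the sample sentence is invented, and the study itself worked in R rather than Python.

```python
import re

# Placeholder word sets standing in for a binary sentiment lexicon.
# They are illustrative only and are not the BING lexicon.
POSITIVE = {"promising", "effective", "robust", "significant"}
NEGATIVE = {"adverse", "failure", "toxic", "risk"}


def relative_sentiment(text):
    """Return (relative positivity, relative negativity) for one document.

    Each value is the count of matching words divided by the total
    number of word tokens in the document.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    if total == 0:
        return 0.0, 0.0
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    return positive_hits / total, negative_hits / total


# Invented sample sentence with 11 word tokens.
doc = "The candidate showed promising and robust results with no adverse events"
positivity, negativity = relative_sentiment(doc)
print(round(positivity, 3), round(negativity, 3))
```

Dividing by word count keeps long and short documents comparable, which matters when the resulting values are fed into a regression as predictors.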

Keywords: data analysis, drug development, sentiment analysis, text-mining

Procedia PDF Downloads 126
1358 Social Media Mining with R. Twitter Analyses

Authors: Diana Codat

Abstract:

Tweet analysis is part of text mining. Each document is a written text. It is possible to apply the usual text search techniques, in particular by switching to the bag-of-words representation. But tweets have peculiarities. Some may enrich the analysis. Thus, their length is calibrated (at least as far as public messages are concerned), special characters make it possible to identify authors (@) and themes (#), and the tweet and retweet mechanisms make it possible to follow the diffusion of information. Conversely, other characteristics may disrupt the analyses. Because space is limited, authors often use abbreviations and emoticons to express feelings, and they do not pay much attention to spelling. All this creates noise that can complicate the task. Tweets carry a lot of potentially interesting information. Their exploitation is one of the main axes of the analysis of social networks. We show how to access Twitter-related messages. We will initiate a study of the properties of the tweets, and we will follow up on the exploitation of the content of the messages. We will work under R with the package 'twitteR'. The study of tweets is a strong focus of analysis of social networks because Twitter has become an important vector of communication. This example shows that it is easy to initiate an analysis from data extracted directly online. The data preparation phase is of great importance.
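The tweet-specific markers mentioned in this abstract (@ for authors, # for themes, the retweet prefix) can be pulled out before building a bag-of-words. The sketch below is one possible way to do that with regular expressions. It is written in Python for illustration, whereas the paper works in R with 'twitteR'; the sample tweet, the user name, and the URL are made up.

```python
import re
from collections import Counter


def parse_tweet(text):
    """Split a tweet into mentions, hashtags, a retweet flag and a bag of words."""
    mentions = re.findall(r"@(\w+)", text)
    hashtags = re.findall(r"#(\w+)", text)
    is_retweet = text.startswith("RT @")

    # Remove URLs, then @mentions and #hashtags, before counting plain words.
    cleaned = re.sub(r"https?://\S+", " ", text)
    cleaned = re.sub(r"[@#]\w+", " ", cleaned)
    words = [w for w in re.findall(r"[a-z']+", cleaned.lower()) if w != "rt"]

    return {
        "mentions": mentions,
        "hashtags": hashtags,
        "is_retweet": is_retweet,
        "bag_of_words": Counter(words),
    }


# Made-up tweet for illustration.
sample = "RT @alice: Loving the new #rstats release! #rstats rocks https://example.com"
parsed = parse_tweet(sample)
print(parsed["mentions"])
print(parsed["hashtags"])
print(parsed["is_retweet"])
print(parsed["bag_of_words"])
```

Keeping mentions and hashtags separate from the word counts lets the noisy free text be cleaned aggressively without losing the markers that identify authors and themes.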

Keywords: data mining, language R, social networks, Twitter

Procedia PDF Downloads 155
1357 Study on the Focus of Attention of Special Education Students in Primary School

Authors: Tung-Kuang Wu, Hsing-Pei Hsieh, Ying-Ru Meng

Abstract:

Special Education in Taiwan has been facing difficulties including a shortage of teachers and a lack of resources. Some students who need to receive special education are thus not identified or admitted. Fortunately, information technologies can be applied to relieve some of the difficulties. For example, on-line multimedia courseware can be used to assist the learning of special education students and take much of the workload off special education teachers. However, there may exist cognitive variations between students in special or regular education, which suggests the design of online courseware requires different considerations. This study aims to investigate the difference in focus of attention (FOA) between special and regular education students of primary school in viewing the computer screen. The study is essential as it helps courseware developers in determining where to put the learning elements that matter the most in the right position on the screen. It may also assist special education specialists to better understand the subtle differences among various subtypes of learning disabilities. This study involves 76 special education students (among them, 39 are students with mental retardation, MR, and 37 are students with learning disabilities, LDs) and 42 regular education students. The participants were asked to view a computer screen showing a picture partitioned into 3 × 3 areas with each area filled with text or an icon. The subjects were then instructed to mark, on paper sheets given beforehand that were also partitioned into 3 × 3 grids, the areas corresponding to the parts of the picture on the computer screen that they first set their eyes on. The data were then collected and analyzed. Major findings are listed: 1. In both the text and icon scenarios, significant differences exist in the first preferred FOA between special and regular education students. 
The first FOA for the former is mainly on area 1 (upper left area, 53.8% / 51.3% for MR / LDs students in the text scenario; and 53.8% / 56.8% for MR / LDs students in the icons scenario), while for the latter it is on area 5 (middle area, 50.0% and 57.1% in the text and icons scenarios). 2. The second most preferred areas in the text scenario for students with MR and LDs are area 2 (upper-middle, 20.5%) and area 5 (middle area, 24.3%), respectively. In the icons scenario, the results are similar, but with lower percentages. 3. Students with LDs who show a similar FOA preference (in either the text or icons scenario) to regular education students tend to be of some specific sub-type of learning disabilities. For instance, students with LDs who chose area 5 (middle area, in either the text or icon scenario) as their FOA are mostly ones who have a reading or writing disability. Also, three (out of 13) subjects in this category, after going through the rediagnosis process, were no longer classified as having learning disabilities. In summary, the findings suggest that when designing multimedia courseware for students with MR and LDs, the essential learning elements should be placed on areas 1, 2 and 5. In addition, FOA preference may also potentially be used as an indicator for diagnosing students with LDs.

Keywords: focus of attention, learning disabilities, mental retardation, on-line multimedia courseware, special education

Procedia PDF Downloads 145
1356 Understanding the Challenges of Lawbook Translation via the Framework of Functional Theory of Language

Authors: Tengku Sepora Tengku Mahadi

Abstract:

Where the speed of book writing lags behind the high need for such material for tertiary studies, translation offers a way to enhance the equilibrium in this demand-supply equation. Nevertheless, translation is confronted by obstacles that threaten its effectiveness. The primary challenge to the production of efficient translations may well be related to the text-type and in terms of its complexity. A text that is intricately written with unique rhetorical devices, subject-matter foundation and cultural references will undoubtedly challenge the translator. Longer time and greater effort would be the consequence. To understand these text-related challenges, the present paper set out to analyze a lawbook entitled Learning the Law by David Melinkoff. The book is chosen because it has often been used as a textbook or for reference in many law courses in the United Kingdom and has seen over thirteen editions; therefore, it can be said to be a worthy book for studies in law. Another reason is the existence of a ready translation in Malay. Reference to this translation enables confirmation to some extent of the potential problems that might occur in its translation. Understanding the organization and the language of the book will help translators to prepare themselves better for the task. They can anticipate the research and time that may be needed to produce an effective translation. Another premise here is that this text-type implies certain ways of writing and organization. Accordingly, it seems practicable to adopt the functional theory of language as suggested by Michael Halliday as its theoretical framework. Concepts of the context of culture, the context of situation and measures of the field, tenor and mode form the instruments for analysis. Additional examples from similar materials can also be used to validate the findings. 
Some interesting findings include the presence of several other text-types or sub-text-types in the book and the dependence on literary discourse and devices to capture the meanings better or add color to the dry field of law. In addition, many elements of culture can be seen, for example, the use of familiar alternatives, allusions, and even terminology and references that date back to various periods of time and languages. Also found are parts which discuss origins of words and terms that may be relevant to readers within the United Kingdom but make little sense to readers of the book in other languages. In conclusion, the textual analysis in terms of its functions and the linguistic and textual devices used to achieve them can then be applied as a guide to determine the effectiveness of the translation that is produced.

Keywords: functional theory of language, lawbook text-type, rhetorical devices, culture

Procedia PDF Downloads 125