Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 27583

Search results for: sesimic data processing

26623 Information Extraction for Short-Answer Question for the University of the Cordilleras

Authors: Thelma Palaoag, Melanie Basa, Jezreel Mark Panilo

Abstract:

Checking short-answer questions and essays, whether it may be paper or electronic in form, is a tiring and tedious task for teachers. Evaluating a student’s output require wide array of domains. Scoring the work is often a critical task. Several attempts in the past few years to create an automated writing assessment software but only have received negative results from teachers and students alike due to unreliability in scoring, does not provide feedback and others. The study aims to create an application that will be able to check short-answer questions which incorporate information extraction. Information extraction is a subfield of Natural Language Processing (NLP) where a chunk of text (technically known as unstructured text) is being broken down to gather necessary bits of data and/or keywords (structured text) to be further analyzed or rather be utilized by query tools. The proposed system shall be able to extract keywords or phrases from the individual’s answers to match it into a corpora of words (as defined by the instructor), which shall be the basis of evaluation of the individual’s answer. The proposed system shall also enable the teacher to provide feedback and re-evaluate the output of the student for some writing elements in which the computer cannot fully evaluate such as creativity and logic. Teachers can formulate, design, and check short answer questions efficiently by defining keywords or phrases as parameters by assigning weights for checking answers. With the proposed system, teacher’s time in checking and evaluating students output shall be lessened, thus, making the teacher more productive and easier.

Keywords: information extraction, short-answer question, natural language processing, application

Procedia PDF Downloads 428

26622 3D Images Representation to Provide Information on the Type of Castella Beams Hole

Authors: Cut Maisyarah Karyati, Aries Muslim, Sulardi

Abstract:

Digital image processing techniques to obtain detailed information from an image have been used in various fields, including in civil engineering, where the use of solid beam profiles in buildings and bridges has often been encountered since the early development of beams. Along with this development, the founded castellated beam profiles began to be more diverse in shape, such as the shape of a hexagon, triangle, pentagon, circle, ellipse and oval that could be a practical solution in optimizing a construction because of its characteristics. The purpose of this research is to create a computer application to edge detect the profile of various shapes of the castella beams hole. The digital image segmentation method has been used to obtain the grayscale images and represented in 2D and 3D formats. This application has been successfully made according to the desired function, which is to provide information on the type of castella beam hole.

Keywords: digital image, image processing, edge detection, grayscale, castella beams

Procedia PDF Downloads 142

26621 The Effect of the Internal Organization Communications' Effectiveness through Employee's Performance of Faculty of Management Science, Suan Sunandha Rajabhat University

Authors: Malaiphan Pansap, Surasit Vithayarat

Abstract:

The purpose of this study was to study the relationship between internal organization communications’ effectiveness and employee’s performance of Faculty of Management Science, Suan Sunandha Rajabhat University. Study on solutions of communication were carried out within the organization. Questionnaire was used to collect information from 136 people of staff and instructor and data were analyzed by using frequency, percentage, mean and standard deviation and then data processing statistic programs. The result found that organization communication that affects their employee’s performance is sender which lack the skills for speaking and writing to convince audiences ready before taking message and the message which organizations are not always informed. The employees believe the behavior of good organization communication has a positive impact on the development of organization because the employees feel involved and be a part of the organization, by the cooperation in working to achieve the goal, the employees can work in the same direction and meet goal quickly.

Keywords: employee’s performance, faculty of management science, internal organization communications’ effectiveness, management accounting, Suan Sunandha Rajabhat University

Procedia PDF Downloads 239

26620 ICanny: CNN Modulation Recognition Algorithm

Authors: Jingpeng Gao, Xinrui Mao, Zhibin Deng

Abstract:

Aiming at the low recognition rate on the composite signal modulation in low signal to noise ratio (SNR), this paper proposes a modulation recognition algorithm based on ICanny-CNN. Firstly, the radar signal is transformed into the time-frequency image by Choi-Williams Distribution (CWD). Secondly, we propose an image processing algorithm using the Guided Filter and the threshold selection method, which is combined with the hole filling and the mask operation. Finally, the shallow convolutional neural network (CNN) is combined with the idea of the depth-wise convolution (Dw Conv) and the point-wise convolution (Pw Conv). The proposed CNN is designed to complete image classification and realize modulation recognition of radar signal. The simulation results show that the proposed algorithm can reach 90.83% at 0dB and 71.52% at -8dB. Therefore, the proposed algorithm has a good classification and anti-noise performance in radar signal modulation recognition and other fields.

Keywords: modulation recognition, image processing, composite signal, improved Canny algorithm

Procedia PDF Downloads 191

26619 A Review Paper on Data Mining and Genetic Algorithm

Authors: Sikander Singh Cheema, Jasmeen Kaur

Abstract:

In this paper, the concept of data mining is summarized and its one of the important process i.e KDD is summarized. The data mining based on Genetic Algorithm is researched in and ways to achieve the data mining Genetic Algorithm are surveyed. This paper also conducts a formal review on the area of data mining tasks and genetic algorithm in various fields.

Keywords: data mining, KDD, genetic algorithm, descriptive mining, predictive mining

Procedia PDF Downloads 593

26618 Polarity Classification of Social Media Comments in Turkish

Authors: Migena Ceyhan, Zeynep Orhan, Dimitrios Karras

Abstract:

People in modern societies are continuously sharing their experiences, emotions, and thoughts in different areas of life. The information reaches almost everyone in real-time and can have an important impact in shaping people’s way of living. This phenomenon is very well recognized and advantageously used by the market representatives, trying to earn the most from this means. Given the abundance of information, people and organizations are looking for efficient tools that filter the countless data into important information, ready to analyze. This paper is a modest contribution in this field, describing the process of automatically classifying social media comments in the Turkish language into positive or negative. Once data is gathered and preprocessed, feature sets of selected single words or groups of words are build according to the characteristics of language used in the texts. These features are used later to train, and test a system according to different machine learning algorithms (Naïve Bayes, Sequential Minimal Optimization, J48, and Bayesian Linear Regression). The resultant high accuracies can be important feedback for decision-makers to improve the business strategies accordingly.

Keywords: feature selection, machine learning, natural language processing, sentiment analysis, social media reviews

Procedia PDF Downloads 147

26617 Topic-to-Essay Generation with Event Element Constraints

Authors: Yufen Qin

Abstract:

Topic-to-Essay generation is a challenging task in Natural language processing, which aims to generate novel, diverse, and topic-related text based on user input. Previous research has overlooked the generation of articles under the constraints of event elements, resulting in issues such as incomplete event elements and logical inconsistencies in the generated results. To fill this gap, this paper proposes an event-constrained approach for a topic-to-essay generation that enforces the completeness of event elements during the generation process. Additionally, a language model is employed to verify the logical consistency of the generated results. Experimental results demonstrate that the proposed model achieves a better BLEU-2 score and performs better than the baseline in terms of subjective evaluation on a real dataset, indicating its capability to generate higher-quality topic-related text.

Keywords: event element, language model, natural language processing, topic-to-essay generation.

Procedia PDF Downloads 237

26616 The Processing of Implicit Stereotypes in Everyday Scene Perception

Authors: Magali Mari, Fabrice Clement

Abstract:

The present study investigated the influence of implicit stereotypes on adults’ visual information processing, using an eye-tracking device. Implicit stereotyping is an automatic and implicit process; it happens relatively quickly, outside of awareness. In the presence of a member of a social group, a set of expectations about the characteristics of this social group appears automatically in people’s minds. The study aimed to shed light on the cognitive processes involved in stereotyping and to further investigate the use of eye movements to measure implicit stereotypes. With an eye-tracking device, the eye movements of participants were analyzed, while they viewed everyday scenes depicting women and men in congruent or incongruent gender role activities (e.g., a woman ironing or a man ironing). The settings of these scenes had to be analyzed to infer the character’s role. Also, participants completed an implicit association test that combined the concept of gender with attributes of occupation (home/work), while measuring reaction times to assess participants’ implicit stereotypes about gender. The results showed that implicit stereotypes do influence people’s visual attention; within a fraction of a second, the number of returns, between stereotypical and counter-stereotypical scenes, differed significantly, meaning that participants interpreted the scene itself as a whole before identifying the character. They predicted that, in such a situation, the character was supposed to be a woman or a man. Also, the study showed that eye movements could be used as a fast and reliable supplement for traditional implicit association tests to measure implicit stereotypes. Altogether, this research provides further understanding of implicit stereotypes processing as well as a natural method to study implicit stereotypes.

Keywords: eye-tracking, implicit stereotypes, social cognition, visual attention

Procedia PDF Downloads 160

26615 Preprocessing and Fusion of Multiple Representation of Finger Vein patterns using Conventional and Machine Learning techniques

Authors: Tomas Trainys, Algimantas Venckauskas

Abstract:

Application of biometric features to the cryptography for human identification and authentication is widely studied and promising area of the development of high-reliability cryptosystems. Biometric cryptosystems typically are designed for patterns recognition, which allows biometric data acquisition from an individual, extracts feature sets, compares the feature set against the set stored in the vault and gives a result of the comparison. Preprocessing and fusion of biometric data are the most important phases in generating a feature vector for key generation or authentication. Fusion of biometric features is critical for achieving a higher level of security and prevents from possible spoofing attacks. The paper focuses on the tasks of initial processing and fusion of multiple representations of finger vein modality patterns. These tasks are solved by applying conventional image preprocessing methods and machine learning techniques, Convolutional Neural Network (SVM) method for image segmentation and feature extraction. An article presents a method for generating sets of biometric features from a finger vein network using several instances of the same modality. Extracted features sets were fused at the feature level. The proposed method was tested and compared with the performance and accuracy results of other authors.

Keywords: bio-cryptography, biometrics, cryptographic key generation, data fusion, information security, SVM, pattern recognition, finger vein method.

Procedia PDF Downloads 152

26614 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances of computer science and data analysis are helping to provide continuously huge volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information-integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, we focus on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 449

26613 An Experimental Investigation of Air Entrainment Due to Water Jets in Crossflows

Authors: Mina Esmi Jahromi, Mehdi Khiadani

Abstract:

Vertical water jets discharging into free surface turbulent cross flows result in the ingression of a large amount of air in the body of water and form a region of two-phase air-water flow with a considerable interfacial area. This research presents an experimental study of the two-phase bubbly flow using image processing technique. The air ingression and the trajectories of bubble swarms under different experimental conditions are evaluated. The rate of air entrainment and the bubble characteristics such as penetration depth, and dispersion pattern were found to be affected by the most influential parameters of water jet and cross flow including water jet-to-crossflow velocity ratio, water jet falling height, and cross flow depth. This research improves understanding of the underwater flow structure due to the water jet impingement in crossflow and advances the practical applications of water jets such as artificial aeration, circulation, and mixing where crossflow is present.

Keywords: air entrainment, image processing, jet in cross flow, two-phase flow

Procedia PDF Downloads 369

26612 COVID-19 Genomic Analysis and Complete Evaluation

Authors: Narin Salehiyan, Ramin Ghasemi Shayan

Abstract:

In order to investigate coronavirus RNA replication, transcription, recombination, protein processing and transport, virion assembly, the identification of coronavirus-specific cell receptors, and polymerase processing, the manipulation of coronavirus clones and complementary DNAs (cDNAs) of defective-interfering (DI) RNAs is the subject of this chapter. The idea of the Covid genome is nonsegmented, single-abandoned, and positive-sense RNA. When compared to other RNA viruses, its size is significantly greater, ranging from 27 to 32 kb. The quality encoding the enormous surface glycoprotein depends on 4.4 kb, encoding a forcing trimeric, profoundly glycosylated protein. This takes off exactly 20 nm over the virion envelope, giving the infection the appearance-with a little creative mind of a crown or coronet. Covid research has added to the comprehension of numerous parts of atomic science as a general rule, like the component of RNA union, translational control, and protein transport and handling. It stays a fortune equipped for creating startling experiences.

Keywords: covid-19, corona, virus, genome, genetic

Procedia PDF Downloads 72

26611 Newly Designed Ecological Task to Assess Cognitive Map Reading Ability: Behavioral Neuro-Anatomic Correlates of Mental Navigation

Authors: Igor Faulmann, Arnaud Saj, Roland Maurer

Abstract:

Spatial cognition consists in a plethora of high level cognitive abilities: among them, the ability to learn and to navigate in large scale environments is probably one of the most complex skills. Navigation is thought to rely on the ability to read a cognitive map, defined as an allocentric representation of ones environment. Those representations are of course intimately related to the two geometrical primitives of the environment: distance and direction. Also, many recent studies point to a predominant hippocampal and para-hippocampal role in spatial cognition, as well as in the more specific cluster of navigational skills. In a previous study in humans, we used a newly validated test assessing cognitive map processing by evaluating the ability to judge relative distances and directions: the CMRT (Cognitive Map Recall Test). This study identified in topographically disorientated patients (1) behavioral differences between the evaluation of distances and of directions, and (2) distinct causality patterns assessed via VLSM (i.e., distinct cerebral lesions cause distinct response patterns depending on the modality (distance vs direction questions). Thus, we hypothesized that: (1) if the CMRT really taps into the same resources as real navigation, there would be hippocampal, parahippocampal, and parietal activation, and (2) there exists underlying neuroanatomical and functional differences between the processing of this two modalities. Aiming toward a better understanding of the neuroanatomical correlates of the CMRT in humans, and more generally toward a better understanding of how the brain processes the cognitive map, we adapted the CMRT as an fMRI procedure. 23 healthy subjects (11 women, 12 men), all living in Geneva for at least 2 years, underwent the CMRT in fMRI. Results show, for distance and direction taken together, than the most active brain regions are the parietal, frontal and cerebellar parts. Additionally, and as expected, patterns of brain activation differ when comparing the two modalities. Furthermore, distance processing seems to rely more on parietal regions (compared to other brain regions in the same modality and also to direction). It is interesting to notice that no significant activity was observed in the hippocampal or parahippocampal areas. Direction processing seems to tap more into frontal and cerebellar brain regions (compared to other brain regions in the same modality and also to distance). Significant hippocampal and parahippocampal activity has been shown only in this modality. This results demonstrated a complex interaction of structures which are compatible with response patterns observed in other navigational tasks, thus showing that the CMRT taps at least partially into the same brain resources as real navigation. Additionally, differences between the processing of distances and directions leads to the conclusion that the human brain processes each modality distinctly. Further research should focus on the dynamics of this processing, allowing a clearer understanding between the two sub-processes.

Keywords: cognitive map, navigation, fMRI, spatial cognition

Procedia PDF Downloads 295

26610 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

As there can be cases where the use of real data is somehow limited, such as when it is hard to get access to a large volume of real data, we need to go for synthetic data generation. This produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data since the MTS is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that carries out the data categorization into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 131

26609 Central African Republic Government Recruitment Agency Based on Identity Management and Public Key Encryption

Authors: Koyangbo Guere Monguia Michel Alex Emmanuel

Abstract:

In e-government and especially recruitment, many researches have been conducted to build a trustworthy and reliable online or application system capable to process users or job applicant files. In this research (Government Recruitment Agency), cloud computing, identity management and public key encryption have been used to management domains, access control authorization mechanism and to secure data exchange between entities for reliable procedure of processing files.

Keywords: cloud computing network, identity management systems, public key encryption, access control and authorization

Procedia PDF Downloads 360

26608 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga. Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI have offered a platform that is user-friendly and accessible to non-technical professionals to generate synthetic data to augment existing data for further analysis. The SDV also provides for additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC and AUC curves for the data generated by adding the layer of Gaussian copula are much higher than the data generated by the CTGAN.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 84

26607 Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models

Authors: Danielle Shackley, Yetunde Folajimi

Abstract:

As more people turn to the internet seeking health-related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores to text, ranging from positive, neutral, and negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing reliably labeled health article headlines that were supplemented with health information collected about COVID-19 from social media sources. We started with data preprocessing and tested out various vectorization methods such as Count and TFIDF vectorization. We implemented 3 Naive Bayes classifier models, including Bernoulli, Multinomial, and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis, and those same models were reproduced, and the feature was added. We evaluated using the precision and accuracy scores. The Bernoulli initial model performed with 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels performed with 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, we had a 1.9% improvement margin of the precision score with the Complement model. Future expansion of this work could include replicating the experiment process and substituting the Naive Bayes for a deep learning neural network model.

Keywords: sentiment analysis, Naive Bayes model, natural language processing, topic analysis, fake health news classification model

Procedia PDF Downloads 97

26606 Effects of Bilingual Education in the Teaching and Learning Practices in the Continuous Improvement and Development of k12 Program

Authors: Miriam Sebastian

Abstract:

This research focused on the effects of bilingual education as medium of instruction to the academic performance of selected intermediate students of Miriam’s Academy of Valenzuela Inc. . An experimental design was used, with language of instruction as the independent variable and the different literacy skills as dependent variables. The sample consisted of experimental students comprises of 30 students were exposed to bilingual education (Filipino and English) . They were given pretests and were divided into three groups: Monolingual Filipino, Monolingual English, and Bilingual. They were taught different literacy skills for eight weeks and were then administered the posttests. Data was analyzed and evaluated in the light of the central processing and script-dependent hypotheses. Based on the data, it can be inferred that monolingual instruction in either Filipino or English had a stronger effect on the students’ literacy skills compared to bilingual instruction. Moreover, mother tongue-based instruction, as compared to second-language instruction, had stronger effect on the preschoolers’ literacy skills. Such results have implications not only for mother tongue-based (MTB) but also for English as a second language (ESL) instruction in the country

Keywords: bilingualism, effects, monolingual, function, multilingual, mother tongue

Procedia PDF Downloads 130

26605 Grey Prediction of Atmospheric Pollutants in Shanghai Based on GM(1,1) Model Group

Authors: Diqin Qi, Jiaming Li, Siman Li

Abstract:

Based on the use of the three-point smoothing method for selectively processing original data columns, this paper establishes a group of grey GM(1,1) models to predict the concentration ranges of four major air pollutants in Shanghai from 2023 to 2024. The results indicate that PM₁₀, SO₂, and NO₂ maintain the national Grade I standards, while the concentration of PM₂.₅ has decreased but still remains within the national Grade II standards. Combining the forecast results, recommendations are provided for the Shanghai municipal government's efforts in air pollution prevention and control.

Keywords: atmospheric pollutant prediction, Grey GM(1, 1), model group, three-point smoothing method

Procedia PDF Downloads 38

26604 Automatic Identification of Pectoral Muscle

Authors: Ana L. M. Pavan, Guilherme Giacomini, Allan F. F. Alves, Marcela De Oliveira, Fernando A. B. Neto, Maria E. D. Rosa, Andre P. Trindade, Diana R. De Pina

Abstract:

Mammography is a worldwide image modality used to diagnose breast cancer, even in asymptomatic women. Due to its large availability, mammograms can be used to measure breast density and to predict cancer development. Women with increased mammographic density have a four- to sixfold increase in their risk of developing breast cancer. Therefore, studies have been made to accurately quantify mammographic breast density. In clinical routine, radiologists perform image evaluations through BIRADS (Breast Imaging Reporting and Data System) assessment. However, this method has inter and intraindividual variability. An automatic objective method to measure breast density could relieve radiologist’s workload by providing a first aid opinion. However, pectoral muscle is a high density tissue, with similar characteristics of fibroglandular tissues. It is consequently hard to automatically quantify mammographic breast density. Therefore, a pre-processing is needed to segment the pectoral muscle which may erroneously be quantified as fibroglandular tissue. The aim of this work was to develop an automatic algorithm to segment and extract pectoral muscle in digital mammograms. The database consisted of thirty medio-lateral oblique incidence digital mammography from São Paulo Medical School. This study was developed with ethical approval from the authors’ institutions and national review panels under protocol number 3720-2010. An algorithm was developed, in Matlab® platform, for the pre-processing of images. The algorithm uses image processing tools to automatically segment and extract the pectoral muscle of mammograms. Firstly, it was applied thresholding technique to remove non-biological information from image. Then, the Hough transform is applied, to find the limit of the pectoral muscle, followed by active contour method. Seed of active contour is applied in the limit of pectoral muscle found by Hough transform. An experienced radiologist also manually performed the pectoral muscle segmentation. Both methods, manual and automatic, were compared using the Jaccard index and Bland-Altman statistics. The comparison between manual and the developed automatic method presented a Jaccard similarity coefficient greater than 90% for all analyzed images, showing the efficiency and accuracy of segmentation of the proposed method. The Bland-Altman statistics compared both methods in relation to area (mm²) of segmented pectoral muscle. The statistic showed data within the 95% confidence interval, enhancing the accuracy of segmentation compared to the manual method. Thus, the method proved to be accurate and robust, segmenting rapidly and freely from intra and inter-observer variability. It is concluded that the proposed method may be used reliably to segment pectoral muscle in digital mammography in clinical routine. The segmentation of the pectoral muscle is very important for further quantifications of fibroglandular tissue volume present in the breast.

Keywords: active contour, fibroglandular tissue, hough transform, pectoral muscle

Procedia PDF Downloads 351

26603 Consumer Load Profile Determination with Entropy-Based K-Means Algorithm

Authors: Ioannis P. Panapakidis, Marios N. Moschakis

Abstract:

With the continuous increment of smart meter installations across the globe, the need for processing of the load data is evident. Clustering-based load profiling is built upon the utilization of unsupervised machine learning tools for the purpose of formulating the typical load curves or load profiles. The most commonly used algorithm in the load profiling literature is the K-means. While the algorithm has been successfully tested in a variety of applications, its drawback is the strong dependence in the initialization phase. This paper proposes a novel modified form of the K-means that addresses the aforementioned problem. Simulation results indicate the superiority of the proposed algorithm compared to the K-means.

Keywords: clustering, load profiling, load modeling, machine learning, energy efficiency and quality

Procedia PDF Downloads 165

26602 Relative Entropy Used to Determine the Divergence of Cells in Single Cell RNA Sequence Data Analysis

Authors: An Chengrui, Yin Zi, Wu Bingbing, Ma Yuanzhu, Jin Kaixiu, Chen Xiao, Ouyang Hongwei

Abstract:

Single cell RNA sequence (scRNA-seq) is one of the effective tools to study transcriptomics of biological processes. Recently, similarity measurement of cells is Euclidian distance or its derivatives. However, the process of scRNA-seq is a multi-variate Bernoulli event model, thus we hypothesize that it would be more efficient when the divergence between cells is valued with relative entropy than Euclidian distance. In this study, we compared the performances of Euclidian distance, Spearman correlation distance and Relative Entropy using scRNA-seq data of the early, medial and late stage of limb development generated in our lab. Relative Entropy is better than other methods according to cluster potential test. Furthermore, we developed KL-SNE, an algorithm modifying t-SNE whose definition of divergence between cells Euclidian distance to Kullback–Leibler divergence. Results showed that KL-SNE was more effective to dissect cell heterogeneity than t-SNE, indicating the better performance of relative entropy than Euclidian distance. Specifically, the chondrocyte expressing Comp was clustered together with KL-SNE but not with t-SNE. Surprisingly, cells in early stage were surrounded by cells in medial stage in the processing of KL-SNE while medial cells neighbored to late stage with the process of t-SNE. This results parallel to Heatmap which showed cells in medial stage were more heterogenic than cells in other stages. In addition, we also found that results of KL-SNE tend to follow Gaussian distribution compared with those of the t-SNE, which could also be verified with the analysis of scRNA-seq data from another study on human embryo development. Therefore, it is also an effective way to convert non-Gaussian distribution to Gaussian distribution and facilitate the subsequent statistic possesses. Thus, relative entropy is potentially a better way to determine the divergence of cells in scRNA-seq data analysis.

Keywords: Single cell RNA sequence, Similarity measurement, Relative Entropy, KL-SNE, t-SNE

Procedia PDF Downloads 340

26601 Management Information System to Help Managers for Providing Decision Making in an Organization

Authors: Ajayi Oluwasola Felix

Abstract:

Management information system (MIS) provides information for the managerial activities in an organization. The main purpose of this research is, MIS provides accurate and timely information necessary to facilitate the decision-making process and enable the organizations planning control and operational functions to be carried out effectively. Management information system (MIS) is basically concerned with processing data into information and is then communicated to the various departments in an organization for appropriate decision-making. MIS is a subset of the overall planning and control activities covering the application of humans technologies, and procedures of the organization. The information system is the mechanism to ensure that information is available to the managers in the form they want it and when they need it.

Keywords: Management Information Systems (MIS), information technology, decision-making, MIS in Organizations

Procedia PDF Downloads 557

26600 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name

Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing

Abstract:

Named Data Networking (NDN) replaces IP address of traditional network with data name, and adopts dynamic cache mechanism. In the existing mechanism, however, only one-to-one search can be achieved because every data has a unique name corresponding to it. There is a certain mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and user’s interest can hardly be guaranteed. In order to solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption to reduce the query overhead by optimizing the caching strategy. In this scheme, we use hash value to ensure the user’s query safe from each node in the process of search, so does the privacy of the requiring data content.

Keywords: NDN, order-preserving encryption, fuzzy search, privacy

Procedia PDF Downloads 487

26599 E4D-MP: Time-Lapse Multiphysics Simulation and Joint Inversion Toolset for Large-Scale Subsurface Imaging

Authors: Zhuanfang Fred Zhang, Tim C. Johnson, Yilin Fang, Chris E. Strickland

Abstract:

A variety of geophysical techniques are available to image the opaque subsurface with little or no contact with the soil. It is common to conduct time-lapse surveys of different types for a given site for improved results of subsurface imaging. Regardless of the chosen survey methods, it is often a challenge to process the massive amount of survey data. The currently available software applications are generally based on the one-dimensional assumption for a desktop personal computer. Hence, they are usually incapable of imaging the three-dimensional (3D) processes/variables in the subsurface of reasonable spatial scales; the maximum amount of data that can be inverted simultaneously is often very small due to the capability limitation of personal computers. Presently, high-performance or integrating software that enables real-time integration of multi-process geophysical methods is needed. E4D-MP enables the integration and inversion of time-lapsed large-scale data surveys from geophysical methods. Using the supercomputing capability and parallel computation algorithm, E4D-MP is capable of processing data across vast spatiotemporal scales and in near real time. The main code and the modules of E4D-MP for inverting individual or combined data sets of time-lapse 3D electrical resistivity, spectral induced polarization, and gravity surveys have been developed and demonstrated for sub-surface imaging. E4D-MP provides capability of imaging the processes (e.g., liquid or gas flow, solute transport, cavity development) and subsurface properties (e.g., rock/soil density, conductivity) critical for successful control of environmental engineering related efforts such as environmental remediation, carbon sequestration, geothermal exploration, and mine land reclamation, among others.

Keywords: gravity survey, high-performance computing, sub-surface monitoring, electrical resistivity tomography

Procedia PDF Downloads 158

26598 A Performance Comparison between Conventional and Flexible Box Erecting Machines Using Dispatching Rules

Authors: Min Kyu Kim, Eun Young Lee, Dong Woo Son, Yoon Seok Chang

Abstract:

In this paper, we introduce a flexible box erecting machine (BEM) that swiftly and automatically transforms cardboard into a three dimensional box. Recently, the parcel service and home-shopping industries have grown rapidly, and there is an increasing need for various box types to ship various products. However, workers cannot fold thousands of boxes manually in a day. As such, automatic BEMs are garnering greater attention. This study takes equipment operation into consideration as well as mechanical improvements in order to design a BEM that is able to outperform its conventional counterparts. We analyzed six dispatching rules – First In First Out (FIFO), Shortest Processing Time (SPT), Earliest Due Date (EDD), Setup Avoidance, EDD + SPT, and EDD + Setup Avoidance – to determine which one was most suitable for BEM operation. Consequently, SPT and Setup Avoidance were found to be the most critical rules, followed by EDD + Setup Avoidance, EDD + SPT, EDD, and FIFO. This hierarchy was valid for both our conventional BEM and our new flexible BEM from the viewpoint of processing time. We believe that this research can contribute to flexible BEM management, which has the potential to increase productivity and convenience.

Keywords: automation, box erecting machine, dispatching rule, setup time

Procedia PDF Downloads 365

26597 Optimization and Design of Current-Mode Multiplier Circuits with Applications in Analog Signal Processing for Gas Industrial Package Systems

Authors: Mohamad Baqer Heidari, Hefzollah.Mohammadian

Abstract:

This brief presents two original implementations of improved accuracy current-mode multiplier/divider circuits. Besides the advantage of their simplicity, these original multiplier/divider structures present the advantage of very small linearity errors that can be obtained as a result of the proposed design techniques (0.75% and 0.9%, respectively, for an extended range of the input currents). The original multiplier/divider circuits permit a facile reconfiguration, the presented structures representing the functional basis for implementing complex function synthesizer circuits. The proposed computational structures are designed for implementing in 0.18-µm CMOS technology, with a low-voltage operation (a supply voltage of 1.2 V). The circuits’ power consumptions are 60 and 75 µW, respectively, while their frequency bandwidths are 79.6 and 59.7 MHz, respectively.

Keywords: analog signal processing, current-mode operation, functional core, multiplier, reconfigurable circuits, industrial package systems

Procedia PDF Downloads 375

26596 Enhancing Spatial Interpolation: A Multi-Layer Inverse Distance Weighting Model for Complex Regression and Classification Tasks in Spatial Data Analysis

Authors: Yakin Hajlaoui, Richard Labib, Jean-François Plante, Michel Gamache

Abstract:

This study introduces the Multi-Layer Inverse Distance Weighting Model (ML-IDW), inspired by the mathematical formulation of both multi-layer neural networks (ML-NNs) and Inverse Distance Weighting model (IDW). ML-IDW leverages ML-NNs' processing capabilities, characterized by compositions of learnable non-linear functions applied to input features, and incorporates IDW's ability to learn anisotropic spatial dependencies, presenting a promising solution for nonlinear spatial interpolation and learning from complex spatial data. it employ gradient descent and backpropagation to train ML-IDW, comparing its performance against conventional spatial interpolation models such as Kriging and standard IDW on regression and classification tasks using simulated spatial datasets of varying complexity. the results highlight the efficacy of ML-IDW, particularly in handling complex spatial datasets, exhibiting lower mean square error in regression and higher F1 score in classification.

Keywords: deep learning, multi-layer neural networks, gradient descent, spatial interpolation, inverse distance weighting

Procedia PDF Downloads 55

26595 Enhancing Large Language Models' Data Analysis Capability with Planning-and-Execution and Code Generation Agents: A Use Case for Southeast Asia Real Estate Market Analytics

Authors: Kien Vu, Jien Min Soh, Mohamed Jahangir Abubacker, Piyawut Pattamanon, Soojin Lee, Suvro Banerjee

Abstract:

Recent advances in Generative Artificial Intelligence (GenAI), in particular Large Language Models (LLMs) have shown promise to disrupt multiple industries at scale. However, LLMs also present unique challenges, notably, these so-called "hallucination" which is the generation of outputs that are not grounded in the input data that hinders its adoption into production. Common practice to mitigate hallucination problem is utilizing Retrieval Agmented Generation (RAG) system to ground LLMs'response to ground truth. RAG converts the grounding documents into embeddings, retrieve the relevant parts with vector similarity between user's query and documents, then generates a response that is not only based on its pre-trained knowledge but also on the specific information from the retrieved documents. However, the RAG system is not suitable for tabular data and subsequent data analysis tasks due to multiple reasons such as information loss, data format, and retrieval mechanism. In this study, we have explored a novel methodology that combines planning-and-execution and code generation agents to enhance LLMs' data analysis capabilities. The approach enables LLMs to autonomously dissect a complex analytical task into simpler sub-tasks and requirements, then convert them into executable segments of code. In the final step, it generates the complete response from output of the executed code. When deployed beta version on DataSense, the property insight tool of PropertyGuru, the approach yielded promising results, as it was able to provide market insights and data visualization needs with high accuracy and extensive coverage by abstracting the complexities for real-estate agents and developers from non-programming background. In essence, the methodology not only refines the analytical process but also serves as a strategic tool for real estate professionals, aiding in market understanding and enhancement without the need for programming skills. The implication extends beyond immediate analytics, paving the way for a new era in the real estate industry characterized by efficiency and advanced data utilization.

Keywords: large language model, reasoning, planning and execution, code generation, natural language processing, prompt engineering, data analysis, real estate, data sense, PropertyGuru

Procedia PDF Downloads 88

26594 Data Disorders in Healthcare Organizations: Symptoms, Diagnoses, and Treatments

Authors: Zakieh Piri, Shahla Damanabi, Peyman Rezaii Hachesoo

Abstract:

Introduction: Healthcare organizations like other organizations suffer from a number of disorders such as Business Sponsor Disorder, Business Acceptance Disorder, Cultural/Political Disorder, Data Disorder, etc. As quality in healthcare care mostly depends on the quality of data, we aimed to identify data disorders and its symptoms in two teaching hospitals. Methods: Using a self-constructed questionnaire, we asked 20 questions in related to quality and usability of patient data stored in patient records. Research population consisted of 150 managers, physicians, nurses, medical record staff who were working at the time of study. We also asked their views about the symptoms and treatments for any data disorders they mentioned in the questionnaire. Using qualitative methods we analyzed the answers. Results: After classifying the answers, we found six main data disorders: incomplete data, missed data, late data, blurred data, manipulated data, illegible data. The majority of participants believed in their important roles in treatment of data disorders while others believed in health system problems. Discussion: As clinicians have important roles in producing of data, they can easily identify symptoms and disorders of patient data. Health information managers can also play important roles in early detection of data disorders by proactively monitoring and periodic check-ups of data.

Keywords: data disorders, quality, healthcare, treatment

Procedia PDF Downloads 434