Search results for: sensor node data processing
26367 One-Shot Text Classification with Multilingual-BERT
Authors: Hsin-Yang Wang, K. M. A. Salam, Ying-Jia Lin, Daniel Tan, Tzu-Hsuan Chou, Hung-Yu Kao
Abstract:
Detecting user intent from natural language expressions has a wide variety of use cases in different natural language processing applications. Recently, few-shot training has seen a spike in usage in commercial domains. Due to the lack of significant sample features, downstream task performance has been limited or unstable across different domains. As a state-of-the-art method, the pre-trained BERT model, which gathers sentence-level information from a large text corpus, shows improvement on several NLP benchmarks. In this research, we propose a method that changes multi-class classification tasks into binary classification tasks and then uses the confidence score to rank the results. As a language model, BERT performs well on sequence data. In our experiment, we change the objective from predicting labels to finding the relations between words in sequence data. Our proposed method achieved 71.0% accuracy on the internal intent detection dataset and 63.9% accuracy on the HuffPost dataset. Acknowledgment: This work was supported by NCKU-B109-K003, which is the collaboration between National Cheng Kung University, Taiwan, and SoftBank Corp., Tokyo.
Keywords: OSML, BERT, text classification, one shot
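The reformulation described in this abstract, turning one multi-class decision into a set of binary (utterance, label) decisions ranked by confidence, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `rank_labels` and `token_overlap_score` are hypothetical, and the toy word-overlap scorer only stands in for a fine-tuned BERT pair classifier.

```python
from typing import Callable, Dict, List, Tuple


def token_overlap_score(utterance: str, example: str) -> float:
    """Toy confidence score (Jaccard word overlap); a stand-in for a BERT pair classifier."""
    a, b = set(utterance.lower().split()), set(example.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_labels(
    utterance: str,
    label_examples: Dict[str, str],
    score_pair: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score the utterance against the single example of each label (one-shot)
    as an independent binary decision, then rank labels by confidence."""
    scored = [(label, score_pair(utterance, example))
              for label, example in label_examples.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    examples = {"book_flight": "book me a flight", "play_music": "play some music"}
    print(rank_labels("book a flight to tokyo", examples, token_overlap_score))
```

Swapping `token_overlap_score` for a model-based scorer would leave the ranking logic unchanged, which is the point of the binary reformulation.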
Procedia PDF Downloads 101
26366 Comparative Analysis of the Computer Methods' Usage for Calculation of Hydrocarbon Reserves in the Baltic Sea
Authors: Pavel Shcherban, Vlad Golovanov
Abstract:
Nowadays, the depletion of hydrocarbon deposits on the land of the Kaliningrad region leads to active geological exploration and development of oil and natural gas reserves in the southeastern part of the Baltic Sea. LLC 'Lukoil-Kaliningradmorneft' is implementing a comprehensive program for the development of the region's shelf in 2014-2023. Due to the heterogeneity of reservoir rocks in various open fields, as well as ambiguous conclusions on the contours of deposits, additional geological prospecting and refinement of the recoverable oil reserves are carried out. The key element is the use of an effective technique of computer stock modeling at the first stage of processing the received data. The following step uses this information for cluster analysis, which makes it possible to optimize the field development approaches. The article analyzes the effectiveness of various reserve calculation methods and computer modelling methods for offshore hydrocarbon fields. Cluster analysis makes it possible to measure the influence of the obtained data on the development of a technical and economic model for mining deposits. A relationship is observed between the accuracy of the calculation of recoverable reserves and the need to modernize existing mining infrastructure, as well as to optimize the scheme of opening and developing oil deposits.
Keywords: cluster analysis, computer modelling of deposits, correction of the feasibility study, offshore hydrocarbon fields
Procedia PDF Downloads 166
26365 An Analysis System for Integrating High-Throughput Transcript Abundance Data with Metabolic Pathways in Green Algae
Authors: Han-Qin Zheng, Yi-Fan Chiang-Hsieh, Chia-Hung Chien, Wen-Chi Chang
Abstract:
As the most important non-vascular plants, algae have many research applications, including high species diversity, biofuel sources, adsorption of heavy metals and, following processing, health supplements. With the increasing availability of next-generation sequencing (NGS) data for algae genomes and transcriptomes, an integrated resource for retrieving gene expression data and metabolic pathways is essential for functional analysis and systems biology in algae. However, gene expression profiles and biological pathways are displayed separately in current resources, making it impossible to search current databases directly to identify the cellular response mechanisms. Therefore, this work develops a novel AlgaePath database to retrieve gene expression profiles efficiently under various conditions in numerous metabolic pathways. AlgaePath, a web-based database, integrates gene information, biological pathways, and next-generation sequencing (NGS) datasets in Chlamydomonas reinhardtii and Neodesmus sp. UTEX 2219-4. Users can identify gene expression profiles and pathway information by using five query pages (i.e. Gene Search, Pathway Search, Differentially Expressed Genes (DEGs) Search, Gene Group Analysis, and Co-Expression Analysis). The gene expression data of 45 and 4 samples can be obtained directly on pathway maps in C. reinhardtii and Neodesmus sp. UTEX 2219-4, respectively. Genes that are differentially expressed between two conditions can be identified in Folds Search. Furthermore, the Gene Group Analysis of AlgaePath includes pathway enrichment analysis, and can easily compare the gene expression profiles of functionally related genes in a map. Finally, Co-Expression Analysis provides co-expressed transcripts of a target gene. The analysis results provide a valuable reference for designing further experiments and elucidating critical mechanisms from high-throughput data.
More than an effective interface to clarify the transcript response mechanisms in different metabolic pathways under various conditions, AlgaePath is also a data mining system to identify critical mechanisms based on high-throughput sequencing.
Keywords: next-generation sequencing (NGS), algae, transcriptome, metabolic pathway, co-expression
Procedia PDF Downloads 407
26364 A Simple and Empirical Refraction Correction Method for UAV-Based Shallow-Water Photogrammetry
Authors: I GD Yudha Partama, A. Kanno, Y. Akamatsu, R. Inui, M. Goto, M. Sekine
Abstract:
The aerial photogrammetry of shallow water bottoms has the potential to be an efficient high-resolution survey technique for shallow water topography, thanks to the advent of convenient UAVs and automatic image processing techniques (Structure-from-Motion (SfM) and Multi-View Stereo (MVS)). However, it suffers from systematic overestimation of the bottom elevation, due to light refraction at the air-water interface. In this study, we present an empirical method to correct for the effect of refraction after the usual SfM-MVS processing, using common software. The presented method utilizes the empirical relation between the measured true depth and the estimated apparent depth to generate an empirical correction factor. This correction factor is then used to convert the apparent water depth into a refraction-corrected (real-scale) water depth. To examine its effectiveness, we applied the method to two river sites and compared the RMS errors in the corrected bottom elevations with those obtained by three existing methods. The results show that the presented method is more effective than two of the existing methods: the method that applies no correction factor and the method that uses the refractive index of water (1.34) as the correction factor. In comparison with the remaining existing method, which adds an additive term (offset) after calculating the correction factor, the presented method performs well in Site 2 and worse in Site 1. However, we found this linear regression method to be unstable when the training data used for calibration are limited. It also suffers from a large negative bias in the correction factor when the estimated apparent water depth is affected by noise, according to our numerical experiment. Overall, the accuracy of a refraction correction method depends on various factors such as the locations, image acquisition, and GPS measurement conditions. The most effective method can be selected by using statistical selection (e.g. leave-one-out cross validation).
Keywords: bottom elevation, MVS, river, SfM
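The empirical correction factor and the leave-one-out selection mentioned in this abstract can be illustrated with a short sketch. It rests on stated assumptions rather than the authors' procedure: the factor is taken to be the slope of a least-squares line through the origin (true depth = k × apparent depth), and the function names and sample depths are hypothetical.

```python
import math
from typing import Sequence


def fit_correction_factor(apparent: Sequence[float], true: Sequence[float]) -> float:
    """Least-squares slope through the origin: true_depth ~ k * apparent_depth (assumed model)."""
    denominator = sum(a * a for a in apparent)
    if denominator == 0:
        raise ValueError("apparent depths must not all be zero")
    return sum(a * t for a, t in zip(apparent, true)) / denominator


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between predicted and observed depths."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def loocv_rmse(apparent: Sequence[float], true: Sequence[float]) -> float:
    """Leave-one-out cross validation: refit k without point i, then predict point i."""
    apparent, true = list(apparent), list(true)
    predictions = []
    for i in range(len(apparent)):
        k = fit_correction_factor(apparent[:i] + apparent[i + 1:], true[:i] + true[i + 1:])
        predictions.append(k * apparent[i])
    return rmse(predictions, true)


def fixed_factor_rmse(apparent: Sequence[float], true: Sequence[float],
                      factor: float = 1.34) -> float:
    """Baseline: multiply apparent depth by the refractive index of water."""
    return rmse([factor * a for a in apparent], true)


if __name__ == "__main__":
    apparent_depths = [0.5, 1.0, 1.5, 2.0]   # hypothetical SfM-MVS depths (m)
    measured_depths = [0.7, 1.4, 2.1, 2.8]   # hypothetical surveyed depths (m)
    print(fit_correction_factor(apparent_depths, measured_depths))
    print(loocv_rmse(apparent_depths, measured_depths),
          fixed_factor_rmse(apparent_depths, measured_depths))
```

Comparing the cross-validated error of each candidate method on the same calibration points is one way to carry out the statistical selection the abstract refers to.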
Procedia PDF Downloads 299
26363 Eye Tracking Syntax in Language Education
Authors: Marcus Maia
Abstract:
The present study reports and discusses the use of eye tracking qualitative data in reading workshops in Brazilian middle and high schools and in Generative Syntax and Sentence Processing courses at the undergraduate and graduate levels at the Federal University of Rio de Janeiro, respectively. Both endeavors take the sentential level as the proper object to be metacognitively explored in language education (cf. Chomsky, Gallego & Ott, 2019) to develop innate science forming capacity and knowledge of language. In both projects, non-discrepant qualitative eye tracking data collected and quantitatively analyzed in experimental syntax and psycholinguistic studies carried out in Lapex (Experimental Psycholinguistics Laboratory of the Federal University of Rio de Janeiro) were displayed to students as a point of departure, triggering discussions. Classes would generally start with the display of videos showing eye tracking data, such as gaze plots and heatmaps from several studies in Psycholinguistics and Experimental Syntax that we had already developed in our laboratory. The videos usually triggered discussions with students about linguistic and psycholinguistic issues, such as the reading of sentences for gist, garden-path sentences, syntactic and semantic anomalies, the filled-gap effect, island effects, direct and indirect cause, and recursive constructions, among other topics. Active, problem-solving based methodologies were employed with the objective of stimulating student participation. The communication also discusses the importance of developing full literacy, epistemic vigilance and intellectual self-defense in an infodemic world along the lines of Maia (2022).
Keywords: reading, educational psycholinguistics, eye-tracking, active methodology
Procedia PDF Downloads 66
26362 FPGA Implementation of a Marginalized Particle Filter for Delineation of P and T Waves of ECG Signal
Authors: Jugal Bhandari, K. Hari Priya
Abstract:
The ECG signal provides important clinical information which could be used to prevent diseases related to the heart. Accordingly, delineation of the ECG signal is an important task. Delineation of the P and T waves, however, is a complex task. This paper deals with the study of the ECG signal and its analysis by means of a Verilog design of efficient filters and the MATLAB tool. It includes generation and simulation of the ECG signal by means of real-time ECG data, and ECG signal filtering and processing through the analysis of different algorithms and techniques. In this paper, we design a basic particle filter which generates a dynamic model depending on the present and past input samples and then produces the desired output. Afterwards, the output is processed in MATLAB to get the actual shape and accurate values of the ranges of the P-wave and T-wave of the ECG signal. QuestaSim, a Mentor Graphics tool, is used for simulation and functional verification. The same design is again verified using Xilinx ISE, which is also used for synthesis, mapping and bit file generation. A Xilinx FPGA board will be used for implementation of the system. The final FPGA results shall be verified with ChipScope Pro, where the output data can be observed.
Keywords: ECG, MATLAB, Bayesian filtering, particle filter, Verilog hardware descriptive language
Procedia PDF Downloads 367
26361 Mobile Microscope for the Detection of Pathogenic Cells Using Image Processing
Authors: P. S. Surya Meghana, K. Lingeshwaran, C. Kannan, V. Raghavendran, C. Priya
Abstract:
One of the most basic and powerful tools in all of science and medicine is the light microscope, the fundamental device for laboratory as well as research purposes. With improving technology, portable, economic and user-friendly instruments are in high demand. The conventional microscope fails to live up to this emerging trend. Also, adequate access to healthcare is not widely available, especially in developing countries. The most basic step towards curing a malady is the diagnosis of the disease itself. The main aim of this paper is to diagnose malaria with the most common device, the cell phone, which proves to be the immediate solution for most modern-day needs, with the development of wireless infrastructure making it possible to compute and communicate on the move. This opens up the opportunity to develop novel imaging, sensing, and diagnostics platforms using mobile phones as an underlying platform to address the global demand for accurate, sensitive, cost-effective, and field-portable measurement devices for use in remote and resource-limited settings around the world.
Keywords: cellular, hand-held, health care, image processing, malarial parasites, microscope
Procedia PDF Downloads 267
26360 Data Mining Practices: Practical Studies on the Telecommunication Companies in Jordan
Authors: Dina Ahmad Alkhodary
Abstract:
This study aimed to investigate the practices of Data Mining in the telecommunication companies in Jordan, from the viewpoint of the respondents. In order to achieve the goal of the study and test the validity of the hypotheses, the researcher designed a questionnaire to collect data from managers and staff members of the main departments in the researched companies. The results show the stages of improvement of the telecommunications companies toward Data Mining.
Keywords: data, mining, development, business
Procedia PDF Downloads 498
26359 Joint Simulation and Estimation for Geometallurgical Modeling of Crushing Consumption Energy in the Mineral Processing Plants
Authors: Farzaneh Khorram, Xavier Emery
Abstract:
This paper aims to create a crushing consumption energy (CCE) block model and to determine the blocks with the potential for maximum grinding process energy consumption in the study area. For this purpose, joint estimation (co-kriging) and joint simulation (turning bands and plurigaussian methods) are used to predict the CCE based on its correlation with the SAG power index (SPI), A×B, and the ball mill Bond work index (BWI). The analysis shows that TBCOSIM and plurigaussian simulation give more realistic results than cokriging. This seems logical given the nature of geometallurgical data, the linearity of the kriging method, and its smoothing effect.
Keywords: plurigaussian, turning band, cokriging, geometallurgy
Procedia PDF Downloads 70
26358 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering
Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi
Abstract:
In the past decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA. Accordingly, identifying the tissue origin of these DNA fragments from the plasma can result in more accurate and faster disease diagnosis and precise treatment protocols. Open chromatin regions are important epigenetic features of DNA that reflect cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. There have been several studies in the area of cancer liquid biopsy that integrate distinct genomic and epigenomic features for early cancer detection along with tissue of origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which is costly and time-consuming. To overcome these limitations, the idea of predicting OCRs from WGS is of particular importance. In this regard, we proposed a computational approach to target the prediction of open chromatin regions as an important epigenetic feature from cell-free DNA whole genome sequence data. To fulfill this objective, local sequencing depth is fed to our proposed algorithm, and the most probable open chromatin regions are predicted from whole genome sequencing data. Our method integrates the signal processing method with sequencing depth data and includes count normalization, Discrete Fourier Transform conversion, graph construction, graph cut optimization by linear programming, and clustering.
To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions related to human blood samples of the ATAC-DB database. The percentage of overlap between predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates meaningful prediction. As is evident, OCRs are mostly located in the transcription start sites (TSS) of the genes. In this regard, we compared the concordance between the predicted OCRs and the human gene TSS regions obtained from refTSS, which showed concordance of about 52.04% with all genes and ~78% with the housekeeping genes. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to the existence of several confounding factors, such as technical and biological variations. Although this approach is in its infancy, there has already been an attempt to apply it, which led to a tool named OCRDetector, with some restrictions such as the need for high-depth cfDNA WGS data, prior information about OCR distribution, and the consideration of multiple features. However, we implemented graph signal clustering based on a single depth feature in an unsupervised learning manner, which resulted in faster performance and decent accuracy. Overall, we tried to investigate the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate genetic and epigenetic aspects of a single whole genome sequencing dataset for efficient liquid biopsy-related analysis.
Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering
Procedia PDF Downloads 150
26357 FT-NIR Method to Determine Moisture in Gluten Free Rice-Based Pasta during Drying
Authors: Navneet Singh Deora, Aastha Deswal, H. N. Mishra
Abstract:
Pasta is one of the most widely consumed food products around the world. Rapid determination of the moisture content in pasta will assist food processors in providing online quality control of pasta during large scale production. A rapid Fourier transform near-infrared (FT-NIR) method was developed for determining moisture content in pasta. A calibration set of 150 samples, a validation set of 30 samples and a prediction set of 25 samples of pasta were used. The diffuse reflection spectra of different types of pasta were measured by an FT-NIR analyzer in the 4,000-12,000 cm-1 spectral range. Calibration and validation sets were designed for the conception and evaluation of the method adequacy in the moisture content range of 10 to 15 percent (w.b) of the pasta. The prediction models, based on partial least squares (PLS) regression, were developed in the near-infrared. Conventional criteria such as the R2, the root mean square error of cross validation (RMSECV), the root mean square error of estimation (RMSEE), and the number of PLS factors were considered for the selection among three pre-processing methods (vector normalization, minimum-maximum normalization and multiplicative scatter correction). Spectra of pasta samples were treated with different mathematical pre-treatments before being used to build models between the spectral information and moisture content. The moisture content in pasta predicted by FT-NIR methods had very good correlation with the values determined via traditional methods (R2 = 0.983), which clearly indicated that FT-NIR methods could be used as an effective tool for rapid determination of moisture content in pasta. The best calibration model was developed with min-max normalization (MMN) spectral pre-processing (R2 = 0.9775).
The MMN pre-processing method was found most suitable, and the maximum coefficient of determination (R2) value of 0.9875 was obtained for the calibration model developed.
Keywords: FT-NIR, pasta, moisture determination, food engineering
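Two of the spectral pre-processing steps named in this abstract can be sketched in a few lines. The definitions below are common textbook forms and are assumptions here, since the abstract does not state the exact formulas: min-max normalization rescales each spectrum to the range 0 to 1, and vector normalization mean-centers a spectrum and scales it to unit Euclidean length. The function names and sample values are hypothetical.

```python
import math
from typing import List, Sequence


def min_max_normalize(spectrum: Sequence[float]) -> List[float]:
    """Rescale one spectrum so its minimum maps to 0 and its maximum to 1."""
    low, high = min(spectrum), max(spectrum)
    if high == low:
        raise ValueError("a flat spectrum cannot be min-max normalized")
    return [(value - low) / (high - low) for value in spectrum]


def vector_normalize(spectrum: Sequence[float]) -> List[float]:
    """Mean-center one spectrum, then divide by its Euclidean norm (assumed definition)."""
    mean = sum(spectrum) / len(spectrum)
    centered = [value - mean for value in spectrum]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0:
        raise ValueError("a flat spectrum cannot be vector normalized")
    return [c / norm for c in centered]


if __name__ == "__main__":
    absorbance = [2.0, 4.0, 6.0]  # hypothetical absorbance values at three wavenumbers
    print(min_max_normalize(absorbance))
    print(vector_normalize(absorbance))
```

Each spectrum is normalized independently, so the same functions would be applied row by row before fitting a PLS regression model.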
Procedia PDF Downloads 258
26356 Damage Identification Using Experimental Modal Analysis
Authors: Niladri Sekhar Barma, Satish Dhandole
Abstract:
Damage identification in the context of safety has become a fundamental research interest in the field of mechanical, civil, and aerospace engineering structures. This research aims to identify damage in a mechanical beam structure, quantify the severity or extent of the damage in terms of loss of stiffness, and obtain an updated analytical Finite Element (FE) model. An FE model is used for analysis, and the location of damage for single and multiple damage cases is identified numerically using the modal strain energy method and the mode shape curvature method. Experimental data has been acquired with the help of an accelerometer. The Fast Fourier Transform (FFT) algorithm is applied to the measured signal, and subsequently, post-processing is done in MEscopeVes software. The two sets of data, the numerical FE model and the experimental results, are compared to locate the damage accurately. The extent of the damage is identified via modal frequencies using a mixed numerical-experimental technique. Mode shape comparison is performed using the Modal Assurance Criterion (MAC). The analytical FE model is adjusted by the direct method of model updating. The same study has been extended to some real-life structures such as plate and GARTEUR structures.
Keywords: damage identification, damage quantification, damage detection using modal analysis, structural damage identification
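The Modal Assurance Criterion used for the mode shape comparison has a standard closed form for real-valued mode shapes, MAC = (φa·φb)² / ((φa·φa)(φb·φb)), which the sketch below implements. The function names and the example vectors are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def mac(phi_a: Sequence[float], phi_b: Sequence[float]) -> float:
    """Modal Assurance Criterion for two real mode shape vectors.
    Returns 1.0 for perfectly correlated shapes and 0.0 for orthogonal ones."""
    if len(phi_a) != len(phi_b):
        raise ValueError("mode shapes must have the same number of degrees of freedom")
    cross = sum(a * b for a, b in zip(phi_a, phi_b))
    auto_a = sum(a * a for a in phi_a)
    auto_b = sum(b * b for b in phi_b)
    if auto_a == 0 or auto_b == 0:
        raise ValueError("mode shapes must be non-zero")
    return cross * cross / (auto_a * auto_b)


def mac_matrix(modes_a: Sequence[Sequence[float]],
               modes_b: Sequence[Sequence[float]]) -> List[List[float]]:
    """MAC value for every pairing of, e.g., FE modes (rows) and measured modes (columns)."""
    return [[mac(a, b) for b in modes_b] for a in modes_a]


if __name__ == "__main__":
    fe_modes = [[1.0, 2.0, 3.0], [1.0, 0.0, -1.0]]        # hypothetical FE mode shapes
    measured_modes = [[2.0, 4.0, 6.0], [0.9, 0.1, -1.1]]  # hypothetical measured shapes
    for row in mac_matrix(fe_modes, measured_modes):
        print(row)
```

Because the criterion is insensitive to scaling, a measured shape that is a scalar multiple of the FE shape still yields a value of 1.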
Procedia PDF Downloads 116
26355 Integrated Life Skill Training and Executive Function Strategies in Children with Autism Spectrum Disorder in Qatar: A Study Protocol for a Randomized Controlled Trial
Authors: Bara M Yousef, Naresh B Raj, Nadiah W Arfah, Brightlin N Dhas
Abstract:
Background: Executive function (EF) impairment is common in children with autism spectrum disorder (ASD). EF strategies are considered effective in improving the therapeutic outcomes of children with ASD. Aims: This study primarily aims to explore whether integrating EF strategies with regular occupational therapy intervention is more effective in improving daily life skills (DLS) and sensory integration/processing (SI/SP) skills than regular occupational therapy alone in children with ASD, and secondarily aims to assess treatment outcomes on improving visual motor integration (VMI) skills. Procedures: A total of 92 children with ASD will be recruited and, following baseline assessments, randomly assigned to the treatment group (45-min once weekly individual occupational therapy plus EF strategies) and control group (45-min once weekly individual therapy sessions alone). Results and Outcomes: All children will be evaluated systematically by assessing SI/SP, DLS, and VMI skills at baseline, 7 weeks, and 14 weeks of treatment. Data will be analyzed using ANCOVA and t-test. Conclusions and Implications: This single-blind, randomized controlled trial will provide empirical evidence for the effectiveness of EF strategies when combined with regular occupational therapy programs. Based on trial results, EF strategies could be recommended in multidisciplinary programs for children with ASD. Trial Registration: The trial has been registered in the ClinicalTrials.gov registry, protocol ID: MRC-01-22-509, ClinicalTrials.gov Identifier: NCT05829577, registered 25th April 2023.
Keywords: autism spectrum disorder, executive function strategies, daily life skills, sensory integration/processing, visual motor integration, occupational therapy, effectiveness
Procedia PDF Downloads 123
26354 Entrepreneurial Orientation and Business Performance: The Case of Micro Scale Food Processors Operating in a War-Recovery Environment
Authors: V. Suganya, V. Balasuriya
Abstract:
The functioning of Micro and Small Scale (MSS) businesses in the northern part of Sri Lanka was vulnerable due to three decades of internal conflict, and the subsequent post-war economic openings have resulted in new market prospects for MSS businesses. MSS businesses survive and operate with limited resources and struggle to access finance, raw material, markets, and technology. This study attempts to identify the manner in which entrepreneurial orientation is put into practice by business operators to overcome these business challenges. Business operators in the traditional food processing sector are taken for this study, as this sub-sector of the food industry is developing at a rapid pace. A review of the literature was done to recognize the concepts of entrepreneurial orientation, the definition of MSS businesses, and the manner in which business performance is measured. A direct interview method supported by a structured questionnaire is used to collect data from 80 respondents, based on a fixed interval random sampling technique. This study reveals that more than half of the business operators opted to commence their business ventures as a result of identifying a market opportunity. 41 per cent of the business operators are highly entrepreneurially oriented on a scale of 1 to 5. Entrepreneurial orientation shows a significant relationship and is strongly correlated with business performance. Pro-activeness, innovativeness and competitive aggressiveness show a significant relationship with business performance, while risk taking is negatively related and autonomy is not significantly related to business performance. It is evident that entrepreneurially oriented business practices contribute to better business performance, even though 70 per cent prefer the ideas/views of the support agencies to those of the stakeholders when making business decisions.
It is recommended that appropriate training be introduced to develop entrepreneurial skills, focusing on improving business networks so that new business opportunities and innovative business practices are identified.
Keywords: Micro and Small Scale (MSS) businesses, entrepreneurial orientation (EO), food processing, business operators
Procedia PDF Downloads 495
26353 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain
Authors: Amal M. Alrayes
Abstract:
Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance. Given the importance of system and data quality to an organization, it is relevant to highlight their importance for organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was administered among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, which affects organizational success.
Keywords: data quality, performance, system quality, Kingdom of Bahrain
Procedia PDF Downloads 493
26352 Investigating the Effect of Orthographic Transparency on Phonological Awareness in Bilingual Children with Dyslexia
Authors: Sruthi Raveendran
Abstract:
Developmental dyslexia, characterized by reading difficulties despite normal intelligence, presents a significant challenge for bilingual children navigating languages with varying degrees of orthographic transparency. This study bridges a critical gap in dyslexia interventions for bilingual populations in India by examining how consistency and predictability of letter-sound relationships in a writing system (orthographic transparency) influence the ability to understand and manipulate the building blocks of sound in language (phonological processing). The study employed a computerized visual rhyme-judgment task with concurrent EEG (electroencephalogram) recording. The task compared reaction times, accuracy of performance, and event-related potential (ERP) components (N170, N400, and LPC) for rhyming and non-rhyming stimuli in two orthographies: English (opaque orthography) and Kannada (transparent orthography). As hypothesized, the results revealed advantages in phonological processing tasks for transparent orthography (Kannada). Children with dyslexia were faster and more accurate when judging rhymes in Kannada compared to English. This suggests that a language with consistent letter-sound relationships (transparent orthography) facilitates processing, especially for tasks that involve manipulating sounds within words (rhyming). Furthermore, brain activity measured by event-related potentials (ERP) showed less effort required for processing words in Kannada, as reflected by smaller N170, N400, and LPC amplitudes. These findings highlight the crucial role of orthographic transparency in optimizing reading performance for bilingual children with dyslexia. These findings emphasize the need for language-specific intervention strategies that consider the unique linguistic characteristics of each language.
While acknowledging the complexity of factors influencing dyslexia, this research contributes valuable insights into the impact of orthographic transparency on phonological awareness in bilingual children. This knowledge paves the way for developing tailored interventions that promote linguistic inclusivity and optimize literacy outcomes for children with dyslexia.
Keywords: developmental dyslexia, phonological awareness, rhyme judgment, orthographic transparency, Kannada, English, N170, N400, LPC
Procedia PDF Downloads 9
26351 Impact of Node Density and Transmission Range on the Performance of OLSR and DSDV Routing Protocols in VANET City Scenarios
Authors: Yassine Meraihi, Dalila Acheli, Rabah Meraihi
Abstract:
A Vehicular Ad hoc Network (VANET) is a special case of a Mobile Ad hoc Network (MANET) used to establish communications and exchange information among nearby vehicles and between vehicles and nearby fixed infrastructure. VANET is seen as a promising technology for providing safety, efficiency, assistance and comfort to road users. Routing is an important issue in VANETs, since finding and maintaining communication between vehicles is made difficult by the highly dynamic topology, frequently disconnected network and mobility constraints. This paper evaluates the performance of two of the most popular proactive routing protocols, OLSR and DSDV, in a real city traffic scenario on the basis of three metrics, namely packet delivery ratio, throughput and average end-to-end delay, by varying vehicle density and transmission range.
Keywords: DSDV, OLSR, quality of service, routing protocols, VANET
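The three evaluation metrics named in this abstract can be computed from a per-packet trace as sketched below. The definitions used are the common ones (delivered over sent packets, delivered bits over observation time, mean delay of delivered packets) and are assumptions, since the abstract does not spell them out; the `PacketRecord` type, the function names, and the sample trace are hypothetical.

```python
from typing import List, NamedTuple, Optional


class PacketRecord(NamedTuple):
    send_time: float             # seconds
    recv_time: Optional[float]   # seconds, or None if the packet was lost
    size_bytes: int


def packet_delivery_ratio(records: List[PacketRecord]) -> float:
    """Fraction of sent packets that reached the destination."""
    delivered = sum(1 for r in records if r.recv_time is not None)
    return delivered / len(records)


def throughput_bps(records: List[PacketRecord], duration_s: float) -> float:
    """Delivered payload in bits divided by the observation time."""
    delivered_bits = sum(r.size_bytes * 8 for r in records if r.recv_time is not None)
    return delivered_bits / duration_s


def average_end_to_end_delay(records: List[PacketRecord]) -> float:
    """Mean of (receive time - send time) over delivered packets only."""
    delays = [r.recv_time - r.send_time for r in records if r.recv_time is not None]
    if not delays:
        raise ValueError("no packets were delivered")
    return sum(delays) / len(delays)


if __name__ == "__main__":
    trace = [
        PacketRecord(0.0, 0.5, 100),
        PacketRecord(1.0, None, 100),   # lost packet
        PacketRecord(2.0, 2.25, 100),
        PacketRecord(3.0, 3.75, 100),
    ]
    print(packet_delivery_ratio(trace))
    print(throughput_bps(trace, duration_s=10.0))
    print(average_end_to_end_delay(trace))
```

In a simulation study, the same functions would be evaluated once per combination of vehicle density and transmission range to produce the comparison curves for each protocol.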
Procedia PDF Downloads 471
26350 Cloud Computing in Data Mining: A Technical Survey
Authors: Ghaemi Reza, Abdollahi Hamid, Dashti Elham
Abstract:
Cloud computing poses a diversity of challenges for data mining operations, arising from the dynamic structure of data distribution as opposed to the typical database scenarios of conventional architecture. Due to the immense number of users seeking data on a daily basis, there are serious security concerns for cloud providers as well as for data providers who put their data in the cloud computing environment. Big data analytics use compute-intensive data mining algorithms and tools (hidden Markov models, MapReduce parallel programming, the Mahout project, the Hadoop Distributed File System, K-Means and K-Medoids, Apriori) that require efficient high-performance processors to produce timely results. Data mining algorithms are used to solve or optimize the model parameters. The challenges the operation has to face are establishing successful transactions with the existing virtual machine environment and keeping the databases under control. Several factors have led from normal or centralized mining to distributed data mining. The approach is delivered as SaaS and uses multi-agent systems to implement the different tasks of the system. There are still some problems with data mining based on cloud computing, including the design and selection of data mining algorithms.
Keywords: cloud computing, data mining, computing models, cloud services
Procedia PDF Downloads 479
26349 Cross-border Data Transfers to and from South Africa
Authors: Amy Gooden, Meshandren Naidoo
Abstract:
Genetic research and transfers of big data are not confined to a particular jurisdiction, but there is a lack of clarity regarding the legal requirements for importing and exporting such data. Using direct-to-consumer genetic testing (DTC-GT) as an example, this research assesses the status of data sharing into and out of South Africa (SA). While SA laws cover the sending of genetic data out of SA, prohibiting such transfer unless a legal ground exists, the position where genetic data comes into the country depends on the laws of the country from where it is sent – making the legal position less clear.
Keywords: cross-border, data, genetic testing, law, regulation, research, sharing, South Africa
Procedia PDF Downloads 125
26348 The Study of Security Techniques on Information System for Decision Making
Authors: Tejinder Singh
Abstract:
An information system (IS) is the flow of data between different levels and in different directions for decision making and data operations. Data can be compromised in different ways, such as manual or technical errors, data tampering or loss of integrity. The security system of an IS, called a firewall, is affected by such violations. Data flows among the various levels of an information system over a network, in the form of packets or frames. Network security is an important factor in protecting these packets from unauthorized access and virus attacks and in maintaining their integrity. Various security techniques are used to protect the data from being pirated. This paper presents the various security techniques and characterizes different harmful attacks with the help of detailed data analysis. It will help organizations make their systems more secure and effective and support future decision making.
Keywords: information systems, data integrity, TCP/IP network, vulnerability, decision, data
Procedia PDF Downloads 307
26347 Efficient Computer-Aided Design-Based Multilevel Optimization of the LS89
Authors: A. Chatel, I. S. Torreguitart, T. Verstraete
Abstract:
The paper deals with a single-point optimization of the LS89 turbine using an adjoint optimization and defining the design variables within a CAD system. The advantage of including the CAD model in the design system is that higher-level constraints can be imposed on the shape, allowing the optimized model or component to be manufactured. However, CAD-based approaches restrict the design space compared to node-based approaches where every node is free to move. In order to preserve a rich design space, we develop a methodology to refine the CAD model during the optimization and to create the best parameterization to use at each time. This study presents a methodology to progressively refine the design space, which combines parametric effectiveness with a differential evolutionary algorithm in order to create an optimal parameterization. In this manuscript, we show that by doing the parameterization at the CAD level, we can impose higher-level constraints on the shape, such as the axial chord length, the trailing edge radius and G2 geometric continuity between the suction side and pressure side at the leading edge. Additionally, the adjoint sensitivities are filtered out and only smooth shapes are produced during the optimization process. The use of algorithmic differentiation for the CAD kernel and grid generator allows computing the grid sensitivities to machine accuracy and avoids the limited arithmetic precision and the truncation error of finite differences. Then, the parametric effectiveness is computed to rate the ability of a set of CAD design parameters to produce the design shape change dictated by the adjoint sensitivities. During the optimization process, the design space is progressively enlarged using the knot insertion algorithm, which allows introducing new control points whilst preserving the initial shape. The position of the inserted knots is generally assumed. However, this assumption can hinder the creation of better parameterizations that would allow producing more localized shape changes where the adjoint sensitivities dictate. To address this, we propose using a differential evolutionary algorithm to maximize the parametric effectiveness by optimizing the location of the inserted knots. This allows the optimizer to gradually explore larger design spaces and to use an optimal CAD-based parameterization during the course of the optimization. The method is tested on the LS89 turbine cascade, and large aerodynamic improvements in the entropy generation are achieved whilst keeping the exit flow angle fixed. The trailing edge and axial chord length are kept fixed as manufacturing constraints. The optimization results show that the multilevel optimizations were more efficient than the single-level optimization, even though they used the same number of design variables at the end of the multilevel optimizations. Furthermore, the multilevel optimization where the parameterization is created using the optimal knot positions results in a more efficient strategy to reach a better optimum than the multilevel optimization where the position of the knots is arbitrarily assumed.
Keywords: adjoint, CAD, knots, multilevel, optimization, parametric effectiveness
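The shape-preserving property of knot insertion mentioned in this abstract can be checked numerically. The sketch below is the textbook single-knot (Boehm) insertion for a B-spline curve plus a de Boor evaluator, not the authors' CAD kernel; the quadratic curve, its control points and the inserted knot value 0.25 are made-up illustration data, and the choice of knot location by differential evolution described in the abstract is not shown.

```python
def find_span(knots, degree, u):
    """Index k such that knots[k] <= u < knots[k+1] (last span if u is at the end)."""
    n = len(knots) - degree - 2  # index of the last control point
    if u >= knots[n + 1]:
        return n
    k = degree
    while not (knots[k] <= u < knots[k + 1]):
        k += 1
    return k


def de_boor(knots, ctrl, degree, u):
    """Evaluate the B-spline curve at parameter u."""
    k = find_span(knots, degree, u)
    d = [ctrl[j + k - degree] for j in range(degree + 1)]
    for r in range(1, degree + 1):
        for j in range(degree, r - 1, -1):
            i = j + k - degree
            alpha = (u - knots[i]) / (knots[i + degree - r + 1] - knots[i])
            d[j] = tuple((1 - alpha) * a + alpha * b for a, b in zip(d[j - 1], d[j]))
    return d[degree]


def insert_knot(knots, ctrl, degree, u):
    """Insert knot u once; returns a new knot vector and one extra control point."""
    k = find_span(knots, degree, u)
    new_ctrl = []
    for i in range(len(ctrl) + 1):
        if i <= k - degree:
            new_ctrl.append(ctrl[i])          # unaffected leading points
        elif i > k:
            new_ctrl.append(ctrl[i - 1])      # unaffected trailing points, shifted
        else:
            a = (u - knots[i]) / (knots[i + degree] - knots[i])
            new_ctrl.append(tuple(a * c + (1 - a) * b
                                  for b, c in zip(ctrl[i - 1], ctrl[i])))
    return knots[:k + 1] + [u] + knots[k + 1:], new_ctrl


# Illustrative quadratic curve (made-up data).
degree = 2
knots = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0]
ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
new_knots, new_ctrl = insert_knot(knots, ctrl, degree, 0.25)

# Largest pointwise difference between the original and refined curves.
max_diff = max(
    abs(p - q)
    for s in range(21)
    for p, q in zip(de_boor(knots, ctrl, degree, s / 20),
                    de_boor(new_knots, new_ctrl, degree, s / 20))
)
```

After insertion the curve has one more control point to act as a design variable, while `max_diff` stays at round-off level, which is why the design space can be enlarged without disturbing the current shape.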
Procedia PDF Downloads 110
26346 Data Integration with Geographic Information System Tools for Rural Environmental Monitoring
Authors: Tamas Jancso, Andrea Podor, Eva Nagyne Hajnal, Peter Udvardy, Gabor Nagy, Attila Varga, Meng Qingyan
Abstract:
The paper deals with the conditions and circumstances of integrating remotely sensed data for rural environmental monitoring purposes. The main task is to make decisions during the integration process when the data sources differ in resolution, location, spectral channels, and dimension. In order to have exact knowledge of the integration and data fusion possibilities, it is necessary to know the properties (metadata) that characterize the data. The paper explains the joining of these data sources using their attribute data through a sample project. The resulting product will be used for rural environmental analysis.
Keywords: remote sensing, GIS, metadata, integration, environmental analysis
Procedia PDF Downloads 120
26345 Deep Learning-Based Classification of 3D CT Scans with Real Clinical Data; Impact of Image format
Authors: Maryam Fallahpoor, Biswajeet Pradhan
Abstract:
Background: Artificial intelligence (AI) serves as a valuable tool in mitigating the scarcity of human resources required for the evaluation and categorization of vast quantities of medical imaging data. When AI operates with optimal precision, it minimizes the demand for human interpretations and, thereby, reduces the burden on radiologists. Among various AI approaches, deep learning (DL) stands out as it obviates the need for feature extraction, a process that can impede classification, especially with intricate datasets. The advent of DL models has ushered in a new era in medical imaging, particularly in the context of COVID-19 detection. Traditional 2D imaging techniques exhibit limitations when applied to volumetric data, such as Computed Tomography (CT) scans. Medical images predominantly exist in one of two formats: neuroimaging informatics technology initiative (NIfTI) and digital imaging and communications in medicine (DICOM). Purpose: This study aims to employ DL for the classification of COVID-19-infected pulmonary patients and normal cases based on 3D CT scans while investigating the impact of image format. Material and Methods: The dataset used for model training and testing consisted of 1245 patients from IranMehr Hospital. All scans shared a matrix size of 512 × 512, although they exhibited varying slice numbers. Consequently, after loading the DICOM CT scans, image resampling and interpolation were performed to standardize the slice count. All images underwent cropping and resampling, resulting in uniform dimensions of 128 × 128 × 60. Resolution uniformity was achieved through resampling to 1 mm × 1 mm × 1 mm, and image intensities were confined to the range of (−1000, 400) Hounsfield units (HU). For classification purposes, positive pulmonary COVID-19 involvement was designated as 1, while normal images were assigned a value of 0. Subsequently, a U-net-based lung segmentation module was applied to obtain 3D segmented lung regions. 
The pre-processing stage included normalization, zero-centering, and shuffling. Four distinct 3D CNN models (ResNet152, ResNet50, DenseNet169, and DenseNet201) were employed in this study. Results: The findings revealed that the segmentation technique yielded superior results for DICOM images, which could be attributed to the potential loss of information during the conversion of original DICOM images to NIfTI format. Notably, ResNet152 and ResNet50 exhibited the highest accuracy at 90.0%, and the same models achieved the best F1 score at 87%. ResNet152 also secured the highest Area under the Curve (AUC) at 0.932. Regarding sensitivity and specificity, DenseNet201 achieved the highest values at 93% and 96%, respectively. Conclusion: This study underscores the capacity of deep learning to classify COVID-19 pulmonary involvement using real 3D hospital data. The results underscore the significance of employing DICOM format 3D CT images alongside appropriate pre-processing techniques when training DL models for COVID-19 detection. This approach enhances the accuracy and reliability of diagnostic systems for COVID-19 detection.
Keywords: deep learning, COVID-19 detection, NIfTI format, DICOM format
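The intensity steps described in this abstract (confining values to the (−1000, 400) HU window, normalization, zero-centering) can be illustrated on a handful of voxel values. The abstract does not say how normalization or zero-centering was done, so the sketch below assumes min-max scaling over the HU window and subtraction of the per-volume mean; both are assumptions, and the function name and sample values are hypothetical.

```python
HU_MIN, HU_MAX = -1000.0, 400.0  # intensity window stated in the abstract


def preprocess(voxels):
    """Clip to the HU window, scale to [0, 1], then subtract the mean."""
    if not voxels:
        raise ValueError("empty volume")
    clipped = [min(max(v, HU_MIN), HU_MAX) for v in voxels]
    # Assumed normalization: min-max scaling over the window.
    scaled = [(v - HU_MIN) / (HU_MAX - HU_MIN) for v in clipped]
    # Assumed zero-centering: per-volume mean (a dataset mean is also common).
    mean = sum(scaled) / len(scaled)
    return [v - mean for v in scaled]


# Made-up voxel intensities, two of them outside the window.
sample = [-2000.0, -1000.0, -300.0, 400.0, 1200.0]
out = preprocess(sample)
```

A real pipeline would apply the same arithmetic to the full 128 × 128 × 60 array rather than a flat list.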
Procedia PDF Downloads 88
26344 Model Development for Real-Time Human Sitting Posture Detection Using a Camera
Authors: Jheanel E. Estrada, Larry A. Vea
Abstract:
This study developed a model to detect proper and improper sitting posture using a built-in web camera, which detects the locations of and distances between upper-body points (chin, manubrium and acromion process). It also established relationships between human body frames and proper sitting posture. The models were developed by training well-known classifiers such as KNN, SVM, MLP, and Decision Tree on data collected from 60 students of different body frames. The Decision Tree classifier gave the most promising model performance, with an accuracy of 95.35% and a kappa of 0.907 for head and shoulder posture. Results also showed relationships between body frame and posture through Body Mass Index.
Keywords: posture, spinal points, gyroscope, image processing, ergonomics
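The abstract reports both accuracy and a kappa statistic. As a reminder of how the two relate, the sketch below computes accuracy and Cohen's kappa from paired label lists. The labels are made up for illustration and are not the study's data; the function name is hypothetical.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Agreement expected by chance from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0
    return p_o, kappa


# Made-up example: 10 frames, one "proper" frame misclassified.
y_true = ["proper"] * 5 + ["improper"] * 5
y_pred = ["proper"] * 4 + ["improper"] * 6
acc, kappa = accuracy_and_kappa(y_true, y_pred)
```

Kappa discounts the agreement a classifier would reach by chance, which is why it is lower than the raw accuracy in this example.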
Procedia PDF Downloads 329
26343 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data
Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin
Abstract:
Cities are complex systems of diverse and intertwined activities. These activities and their complex interrelationships create diverse urban phenomena, which in turn have considerable influence on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable a better understanding of urban activities and, from that, better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that analysis, a weighted data network for each urban dataset was provided to users. It is expected that the weights obtained will provide insights into cities and show how diverse urban activities influence each other and induce feedback.
Keywords: big data, machine learning, ontology model, urban data model
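The two-step idea in this abstract (correlation analysis, then a weighted network) can be sketched with plain Pearson correlation. The dataset names, values and the 0.5 threshold below are hypothetical, and the abstract does not state which correlation measure or cut-off the authors used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no linear association
    return cov / (sx * sy)


def correlation_network(series, threshold=0.5):
    """Weighted edges {(name_a, name_b): r} for pairs with |r| >= threshold."""
    edges = {}
    for a, b in combinations(sorted(series), 2):
        r = pearson(series[a], series[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical urban datasets sampled over the same five districts.
data = {
    "traffic": [1, 2, 3, 4, 5],
    "noise": [2, 4, 6, 8, 10],
    "parks": [5, 3, 4, 1, 2],
    "rain": [1, -1, 0, -1, 1],
}
edges = correlation_network(data)
```

For a data-recommendation use case, the neighbours of a dataset in `edges`, ranked by absolute weight, would be the candidates to suggest. The weights capture linear association only, not the direction of cause and effect.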
Procedia PDF Downloads 418
26342 Determines the Continuity of Void in Underground Mine Tunnel Using Ground Penetrating Radar
Authors: Farid Adisaputra Gumilang
Abstract:
Kucing Liar Underground Mine is a future mine of PT Freeport Indonesia (PTFI) that is currently being developed. During development, problems arose when blasting the tunnels: overbreak and voids occurred, caused by geological contacts or poor rock conditions. Geotechnical engineers must not only evaluate the remnant capacity of ground support systems but also investigate the depth of rock mass yield within pillars. To prevent the potential hazard caused by void zones, geotechnical engineers must ensure the planned drift is mined in the best location, where people can work safely. Ground penetrating radar (GPR) is a geophysical method that can image the subsurface. This non-destructive method uses electromagnetic radiation and detects the signals reflected from subsurface structures. The GPR survey measurements were conducted along 48 meters of a drift with poor ground conditions, using a 150 MHz antenna at several angles (roof, wall, and floor). Ground of concern is identified by the continuity of reflectors or low reflectors in the radargram section. In this paper, the data are processed using instantaneous amplitude to identify the void zone. For a sound interpretation, the results are combined with geological information and borehole camera data, so that the calibrated GPR data allow the geotechnical engineer to determine a safe location to which the drift can be moved.
Keywords: underground mine, ground penetrating radar, reflectivity, borehole camera
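Instantaneous amplitude, the attribute used in this abstract, is conventionally the magnitude of the analytic signal obtained with a Hilbert transform. The sketch below computes it for a single trace with a plain O(n²) DFT so that it needs only the standard library; the synthetic cosine trace is made up, and the abstract does not describe the authors' actual processing software or parameters.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (adequate for short illustrative traces)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def instantaneous_amplitude(trace):
    """Envelope of a real trace: |analytic signal| via a frequency-domain Hilbert transform."""
    n = len(trace)
    spectrum = dft(trace)
    # Keep DC (and Nyquist), double positive frequencies, zero negative ones.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        for k in range(1, n // 2):
            weights[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    analytic = dft([s * w for s, w in zip(spectrum, weights)], inverse=True)
    return [abs(z) for z in analytic]


# Synthetic trace: a unit-amplitude cosine with 4 cycles over 64 samples.
trace = [math.cos(2 * math.pi * 4 * t / 64) for t in range(64)]
envelope = instantaneous_amplitude(trace)
```

The envelope removes the oscillation of the wavelet and leaves reflection strength, which makes strong or unusually weak reflectors easier to follow along a radargram.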
Procedia PDF Downloads 83
26341 Medical Image Augmentation Using Spatial Transformations for Convolutional Neural Network
Authors: Trupti Chavan, Ramachandra Guda, Kameshwar Rao
Abstract:
The lack of data is a persistent problem in medical image analysis using a convolutional neural network (CNN). This work uses various spatial transformation techniques to address medical image augmentation for knee detection and localization using an enhanced single shot detector (SSD) network. Spatial transforms such as the negative, histogram equalization, power law, sharpening, averaging, and Gaussian blurring help to generate more samples, serve as pre-processing methods, and highlight the features of interest. The experimentation is done on the OpenKnee dataset, a collection of knee images from openly available online sources. The enhanced SSD is used to detect and localize the knee joint in a given X-ray image. It is an enhanced version of the well-known SSD network, modified to reduce the number of prediction boxes at the output. It consists of a classification network (VGGNET) and an auxiliary detection network. Performance is measured in mean average precision (mAP), and 99.96% mAP is achieved using the proposed enhanced SSD with spatial transformations. The localization boundary is also comparatively more refined and closer to the ground truth with spatial augmentation, which gives better detection and localization of knee joints.
Keywords: data augmentation, enhanced SSD, knee detection and localization, medical image analysis, openKnee, Spatial transformations
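Three of the transforms listed in this abstract (negative, power law, histogram equalization) are simple enough to show on a tiny 8-bit grayscale image stored as nested lists. This is a generic illustration of those operations, not the authors' augmentation pipeline; the 2 × 2 image and the gamma value are made up.

```python
def negative(img):
    """Invert an 8-bit grayscale image."""
    return [[255 - p for p in row] for row in img]


def power_law(img, gamma):
    """Gamma (power-law) transform; gamma < 1 brightens, gamma > 1 darkens."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]


def equalize(img):
    """Histogram equalization using the cumulative distribution of gray levels."""
    flat = [p for row in img for p in row]
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant image: nothing to spread out
        return [row[:] for row in img]
    lut = [max(0, round((c - cdf_min) * 255 / (n - cdf_min))) for c in cdf]
    return [[lut[p] for p in row] for row in img]


# Made-up 2 x 2 image; each transform yields one extra training sample.
image = [[50, 50], [100, 200]]
augmented = {
    "negative": negative(image),
    "brighter": power_law(image, 0.5),
    "equalized": equalize(image),
}
```

Because none of these operations moves pixels, the bounding-box annotations of the original image remain valid for every augmented copy.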
Procedia PDF Downloads 154
26340 Detection and Classification of Myocardial Infarction Using New Extracted Features from Standard 12-Lead ECG Signals
Authors: Naser Safdarian, Nader Jafarnia Dabanloo
Abstract:
In this paper, we use four features, namely the Q-wave integral, QRS complex integral, T-wave integral and total integral, extracted from normal and patient ECG signals to detect and localize myocardial infarction (MI) in the left ventricle of the heart. Our research focuses on detection and localization of MI in the standard ECG. We use the Q-wave integral and T-wave integral because these features are important indicators in the detection of MI. We use pattern recognition methods such as the Artificial Neural Network (ANN) to detect and localize MI, because these methods classify normal and abnormal signals with good accuracy. We use a type of Radial Basis Function (RBF) network called the Probabilistic Neural Network (PNN) because of its nonlinearity, along with other classifiers such as k-Nearest Neighbors (KNN), Multilayer Perceptron (MLP) and Naive Bayes. We use the PhysioNet database for training and test data. We reach over 80% accuracy on test data for localization and over 95% for detection of MI. The main advantages of our method are its simplicity and good accuracy, and classification accuracy can be improved further by adding more features. In summary, a simple method based on only four features extracted from the standard ECG is presented, which has good accuracy in MI localization.
Keywords: ECG signal processing, myocardial infarction, features extraction, pattern recognition
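The four integral features named in this abstract can be sketched as areas under segments of a single heartbeat. The code below assumes the wave boundaries are already known as sample-index pairs and uses a signed trapezoidal integral; the abstract specifies neither the delineation method nor whether signed or absolute area was used, so both are assumptions, and the beat samples are made up.

```python
def trapezoid(samples, fs):
    """Signed area under a sampled segment (trapezoidal rule), in amplitude x seconds."""
    dt = 1.0 / fs
    return sum(0.5 * (a + b) * dt for a, b in zip(samples, samples[1:]))


def integral_features(beat, fs, q, qrs, t):
    """Four integral features; q, qrs and t are (start, end) index pairs, end exclusive."""
    def segment(bounds):
        return beat[bounds[0]:bounds[1]]

    return {
        "q_integral": trapezoid(segment(q), fs),
        "qrs_integral": trapezoid(segment(qrs), fs),
        "t_integral": trapezoid(segment(t), fs),
        "total_integral": trapezoid(beat, fs),
    }


# Made-up beat sampled at 100 Hz with hypothetical wave boundaries.
beat = [0, 0, -1, -1, 0, 2, 4, 2, 0, 0, 1, 1, 1, 0]
features = integral_features(beat, fs=100, q=(1, 5), qrs=(1, 9), t=(9, 14))
```

A feature vector like `features`, computed per lead, is the kind of low-dimensional input that classifiers such as KNN or a PNN can be trained on.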
Procedia PDF Downloads 456
26339 Investigation of the Unbiased Characteristic of Doppler Frequency to Different Antenna Array Geometries
Authors: Somayeh Komeylian
Abstract:
Array signal processing techniques have recently been developed for a variety of applications to enhance receiver performance by suppressing the power of jamming and interference signals. In this scenario, biases induced in the antenna array receiver significantly degrade the accuracy of carrier phase estimation. Because the integral of frequency yields the carrier phase, we obtain the unbiased Doppler frequency for high-precision estimation of the carrier phase. The unbiased character of the Doppler frequency with respect to jamming power and other interference signals allows highly accurate estimation of the carrier phase. In this study, we rigorously investigate the unbiased character of the Doppler frequency under variation of the antenna array geometry. The simulation results verify that the Doppler frequency also remains unbiased and accurate as the antenna array geometry varies.
Keywords: array signal processing, unbiased Doppler frequency, GNSS, carrier phase, slowly fluctuating point target
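The relation this abstract relies on, that carrier phase is the time integral of frequency, can be written as phase(t) = phase(0) + 2π ∫ f_D dt. The sketch below accumulates that integral numerically for a sampled Doppler series; the constant 10 Hz Doppler, the sampling step and the function name are made up for illustration, and no receiver or array model from the paper is implied.

```python
import math


def carrier_phase(doppler_hz, dt, phase0=0.0):
    """Accumulated carrier phase (radians) from Doppler samples via the trapezoidal rule."""
    phases = [phase0]
    for f0, f1 in zip(doppler_hz, doppler_hz[1:]):
        phases.append(phases[-1] + 2 * math.pi * 0.5 * (f0 + f1) * dt)
    return phases


# Made-up input: constant 10 Hz Doppler sampled every 10 ms for 1 s.
doppler = [10.0] * 101
phase = carrier_phase(doppler, dt=0.01)
```

Because the phase is an accumulation, any constant bias in the Doppler estimate grows linearly in the phase, which is why an unbiased frequency estimate matters for precise carrier phase.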
Procedia PDF Downloads 159
26338 Data-driven Decision-Making in Digital Entrepreneurship
Authors: Abeba Nigussie Turi, Xiangming Samuel Li
Abstract:
Data-driven business models are more typical of established businesses than of early-stage startups that strive to penetrate a market. This paper provides an extensive discussion of the principles of data analytics for early-stage digital entrepreneurial businesses. We develop a data-driven decision-making (DDDM) framework that applies to startups facing multifaceted barriers such as poor data access and technical and financial constraints, to name a few. The startup DDDM framework proposed in this paper is novel in that it encompasses startup data analytics enablers and metrics aligned with startups' business models, ranging from customer-centric product development to servitization, which is the future of modern digital entrepreneurship.
Keywords: startup data analytics, data-driven decision-making, data acquisition, data generation, digital entrepreneurship
Procedia PDF Downloads 329