Search results for: manually annotated dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1471

Search results for: manually annotated dataset

1111 Effective Survey Designing for Conducting Opinion Survey to Follow Participatory Approach in a Study of Transport Infrastructure Projects: A Case Study of the City of Kolkata

Authors: Jayanti De

Abstract:

Users of any urban road infrastructure may be classified into various categories. The current paper intends to see whether the opinions on different environmental and transportation criteria vary significantly among different types of road users or not. The paper addresses this issue by using a unique survey data that has been collected from Kolkata, a highly populated city in India. Multiple criteria should be taken into account while planning on infrastructure development programs. Given limited resources, a welfare maximizing government typically resorts to public opinion by designing surveys for prioritization of one project over another. Designing such surveys can be challenging and costly. Deciding upon whom to include in a survey and how to represent each group of consumers/road-users depend crucially on how opinion for different criteria vary across consumer groups. A unique dataset has been collected from various parts of Kolkata to statistically test (using Kolmogorov-Smirnov test) whether assigning of weights to rank the transportation criteria like congestion, air pollution, noise pollution, and morning/evening delay vary significantly across the various groups of users of such infrastructure. The different consumer/user groups in the dataset include pedestrian, private car owner, para-transit (taxi /auto rickshaw) user, public transport (bus) user and freight transporter among others. Very little evidence has been found that ranking of different criteria among these groups vary significantly. This also supports the hypothesis that road- users/consumers form their opinion by using their long-run rather than immediate experience. As a policy prescription, this implies that under-representation or over-representation of a specific consumer group in a survey may not necessarily distort the overall opinion, since opinions across different consumer groups are highly correlated as evident from this particular case study.

Keywords: multi criteria analysis, project-prioritization, road- users, survey designing

Procedia PDF Downloads 259
1110 Edge Detection and Morphological Image for Estimating Gestational Age Based on Fetus Length Automatically

Authors: Retno Supriyanti, Ahmad Chuzaeri, Yogi Ramadhani, A. Haris Budi Widodo

Abstract:

The use of ultrasonography in the medical world has been very popular including the diagnosis of pregnancy. In determining pregnancy, ultrasonography has many roles, such as to check the position of the fetus, abnormal pregnancy, fetal age and others. Unfortunately, all these things still need to analyze the role of the obstetrician in the sense of image raised by ultrasonography. One of the most striking is the determination of gestational age. Usually, it is done by measuring the length of the fetus manually by obstetricians. In this study, we developed a computer-aided diagnosis for the determination of gestational age by measuring the length of the fetus automatically using edge detection method and image morphology. Results showed that the system is sufficiently accurate in determining the gestational age based image processing.

Keywords: computer aided diagnosis, gestational age, and diameter of uterus, length of fetus, edge detection method, morphology image

Procedia PDF Downloads 274
1109 Comparison Of Virtual Non-Contrast To True Non-Contrast Images Using Dual Layer Spectral Computed Tomography

Authors: O’Day Luke

Abstract:

Purpose: To validate virtual non-contrast reconstructions generated from dual-layer spectral computed tomography (DL-CT) data as an alternative for the acquisition of a dedicated true non-contrast dataset during multiphase contrast studies. Material and methods: Thirty-three patients underwent a routine multiphase clinical CT examination, using Dual-Layer Spectral CT, from March to August 2021. True non-contrast (TNC) and virtual non-contrast (VNC) datasets, generated from both portal venous and arterial phase imaging were evaluated. For every patient in both true and virtual non-contrast datasets, a region-of-interest (ROI) was defined in aorta, liver, fluid (i.e. gallbladder, urinary bladder), kidney, muscle, fat and spongious bone, resulting in 693 ROIs. Differences in attenuation for VNC and TNV images were compared, both separately and combined. Consistency between VNC reconstructions obtained from the arterial and portal venous phase was evaluated. Results: Comparison of CT density (HU) on the VNC and TNC images showed a high correlation. The mean difference between TNC and VNC images (excluding bone results) was 5.5 ± 9.1 HU and > 90% of all comparisons showed a difference of less than 15 HU. For all tissues but spongious bone, the mean absolute difference between TNC and VNC images was below 10 HU. VNC images derived from the arterial and the portal venous phase showed a good correlation in most tissue types. The aortic attenuation was somewhat dependent however on which dataset was used for reconstruction. Bone evaluation with VNC datasets continues to be a problem, as spectral CT algorithms are currently poor in differentiating bone and iodine. Conclusion: Given the increasing availability of DL-CT and proven accuracy of virtual non-contrast processing, VNC is a promising tool for generating additional data during routine contrast-enhanced studies. This study shows the utility of virtual non-contrast scans as an alternative for true non-contrast studies during multiphase CT, with potential for dose reduction, without loss of diagnostic information.

Keywords: dual-layer spectral computed tomography, virtual non-contrast, true non-contrast, clinical comparison

Procedia PDF Downloads 115
1108 Smart Irrigation System

Authors: Levent Seyfi, Ertan Akman, Tuğrul C. Topak

Abstract:

In this study, irrigation automation with electronic sensors and its control with smartphones were aimed. In this context, temperature and soil humidity measurements of the area irrigated were obtained by temperature and humidity sensors. A micro controller (Arduino) was utilized for accessing values of these parameters and controlling the proposed irrigation system. The irrigation system could automatically be worked according to obtained measurement values. Besides, a GSM module used together with Arduino provided that the irrigation system was in connection to smartphones. Thus, the irrigation system can be remotely controlled. Not only can we observe whether the irrigation system is working or not via developed special android application but also we can see temperature and humidity measurement values. In addition to this, if desired, the irrigation system can be remotely and manually started or stopped regardless of measured sensor vales thanks to the developed android application. In addition to smartphones, the irrigation system can be alternatively controlled via the designed website (www.sulamadenetim.com).

Keywords: smartphone, Android Operating System, sensors, irrigation System, arduino

Procedia PDF Downloads 591
1107 Comprehensive Evaluation of COVID-19 Through Chest Images

Authors: Parisa Mansour

Abstract:

The coronavirus disease 2019 (COVID-19) was discovered and rapidly spread to various countries around the world since the end of 2019. Computed tomography (CT) images have been used as an important alternative to the time-consuming RT. PCR test. However, manual segmentation of CT images alone is a major challenge as the number of suspected cases increases. Thus, accurate and automatic segmentation of COVID-19 infections is urgently needed. Because the imaging features of the COVID-19 infection are different and similar to the background, existing medical image segmentation methods cannot achieve satisfactory performance. In this work, we try to build a deep convolutional neural network adapted for the segmentation of chest CT images with COVID-19 infections. First, we maintain a large and novel chest CT image database containing 165,667 annotated chest CT images from 861 patients with confirmed COVID-19. Inspired by the observation that the boundary of an infected lung can be improved by global intensity adjustment, we introduce a feature variable block into the proposed deep CNN, which adjusts the global features of features to segment the COVID-19 infection. The proposed PV array can effectively and adaptively improve the performance of functions in different cases. We combine features of different scales by proposing a progressive atrocious space pyramid fusion scheme to deal with advanced infection regions with various aspects and shapes. We conducted experiments on data collected in China and Germany and showed that the proposed deep CNN can effectively produce impressive performance.

Keywords: chest, COVID-19, chest Image, coronavirus, CT image, chest CT

Procedia PDF Downloads 29
1106 Off-Line Text-Independent Arabic Writer Identification Using Optimum Codebooks

Authors: Ahmed Abdullah Ahmed

Abstract:

The task of recognizing the writer of a handwritten text has been an attractive research problem in the document analysis and recognition community with applications in handwriting forensics, paleography, document examination and handwriting recognition. This research presents an automatic method for writer recognition from digitized images of unconstrained writings. Although a great effort has been made by previous studies to come out with various methods, their performances, especially in terms of accuracy, are fallen short, and room for improvements is still wide open. The proposed technique employs optimal codebook based writer characterization where each writing sample is represented by a set of features computed from two codebooks, beginning and ending. Unlike most of the classical codebook based approaches which segment the writing into graphemes, this study is based on fragmenting a particular area of writing which are beginning and ending strokes. The proposed method starting with contour detection to extract significant information from the handwriting and the curve fragmentation is then employed to categorize the handwriting into Beginning and Ending zones into small fragments. The similar fragments of beginning strokes are grouped together to create Beginning cluster, and similarly, the ending strokes are grouped to create the ending cluster. These two clusters lead to the development of two codebooks (beginning and ending) by choosing the center of every similar fragments group. Writings under study are then represented by computing the probability of occurrence of codebook patterns. The probability distribution is used to characterize each writer. Two writings are then compared by computing distances between their respective probability distribution. The evaluations carried out on ICFHR standard dataset of 206 writers using Beginning and Ending codebooks separately. Finally, the Ending codebook achieved the highest identification rate of 98.23%, which is the best result so far on ICFHR dataset.

Keywords: off-line text-independent writer identification, feature extraction, codebook, fragments

Procedia PDF Downloads 488
1105 Census and Mapping of Oil Palms Over Satellite Dataset Using Deep Learning Model

Authors: Gholba Niranjan Dilip, Anil Kumar

Abstract:

Conduct of accurate reliable mapping of oil palm plantations and census of individual palm trees is a huge challenge. This study addresses this challenge and developed an optimized solution implemented deep learning techniques on remote sensing data. The oil palm is a very important tropical crop. To improve its productivity and land management, it is imperative to have accurate census over large areas. Since, manual census is costly and prone to approximations, a methodology for automated census using panchromatic images from Cartosat-2, SkySat and World View-3 satellites is demonstrated. It is selected two different study sites in Indonesia. The customized set of training data and ground-truth data are created for this study from Cartosat-2 images. The pre-trained model of Single Shot MultiBox Detector (SSD) Lite MobileNet V2 Convolutional Neural Network (CNN) from the TensorFlow Object Detection API is subjected to transfer learning on this customized dataset. The SSD model is able to generate the bounding boxes for each oil palm and also do the counting of palms with good accuracy on the panchromatic images. The detection yielded an F-Score of 83.16 % on seven different images. The detections are buffered and dissolved to generate polygons demarcating the boundaries of the oil palm plantations. This provided the area under the plantations and also gave maps of their location, thereby completing the automated census, with a fairly high accuracy (≈100%). The trained CNN was found competent enough to detect oil palm crowns from images obtained from multiple satellite sensors and of varying temporal vintage. It helped to estimate the increase in oil palm plantations from 2014 to 2021 in the study area. The study proved that high-resolution panchromatic satellite image can successfully be used to undertake census of oil palm plantations using CNNs.

Keywords: object detection, oil palm tree census, panchromatic images, single shot multibox detector

Procedia PDF Downloads 141
1104 Integrating Radar Sensors with an Autonomous Vehicle Simulator for an Enhanced Smart Parking Management System

Authors: Mohamed Gazzeh, Bradley Null, Fethi Tlili, Hichem Besbes

Abstract:

The burgeoning global ownership of personal vehicles has posed a significant strain on urban infrastructure, notably parking facilities, leading to traffic congestion and environmental concerns. Effective parking management systems (PMS) are indispensable for optimizing urban traffic flow and reducing emissions. The most commonly deployed systems nowadays rely on computer vision technology. This paper explores the integration of radar sensors and simulation in the context of smart parking management. We concentrate on radar sensors due to their versatility and utility in automotive applications, which extends to PMS. Additionally, radar sensors play a crucial role in driver assistance systems and autonomous vehicle development. However, the resource-intensive nature of radar data collection for algorithm development and testing necessitates innovative solutions. Simulation, particularly the monoDrive simulator, an internal development tool used by NI the Test and Measurement division of Emerson, offers a practical means to overcome this challenge. The primary objectives of this study encompass simulating radar sensors to generate a substantial dataset for algorithm development, testing, and, critically, assessing the transferability of models between simulated and real radar data. We focus on occupancy detection in parking as a practical use case, categorizing each parking space as vacant or occupied. The simulation approach using monoDrive enables algorithm validation and reliability assessment for virtual radar sensors. It meticulously designed various parking scenarios, involving manual measurements of parking spot coordinates, orientations, and the utilization of TI AWR1843 radar. To create a diverse dataset, we generated 4950 scenarios, comprising a total of 455,400 parking spots. This extensive dataset encompasses radar configuration details, ground truth occupancy information, radar detections, and associated object attributes such as range, azimuth, elevation, radar cross-section, and velocity data. The paper also addresses the intricacies and challenges of real-world radar data collection, highlighting the advantages of simulation in producing radar data for parking lot applications. We developed classification models based on Support Vector Machines (SVM) and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), exclusively trained and evaluated on simulated data. Subsequently, we applied these models to real-world data, comparing their performance against the monoDrive dataset. The study demonstrates the feasibility of transferring models from a simulated environment to real-world applications, achieving an impressive accuracy score of 92% using only one radar sensor. This finding underscores the potential of radar sensors and simulation in the development of smart parking management systems, offering significant benefits for improving urban mobility and reducing environmental impact. The integration of radar sensors and simulation represents a promising avenue for enhancing smart parking management systems, addressing the challenges posed by the exponential growth in personal vehicle ownership. This research contributes valuable insights into the practicality of using simulated radar data in real-world applications and underscores the role of radar technology in advancing urban sustainability.

Keywords: autonomous vehicle simulator, FMCW radar sensors, occupancy detection, smart parking management, transferability of models

Procedia PDF Downloads 55
1103 Biodiversity of Platyhelminthes Parasites on Batoids (Elasmobranchii) Fishes from the Algerian Coasts: First Annotated Inventory

Authors: Fadila Tazerouti, Affaf Boukadoum, Kamilia Gharbi, Karima Benmeslem

Abstract:

Parasites are recognized as an important component of biodiversity because of their crucial role in providing valuable information on host populations and in the functioning and balance of natural ecosystems. Although the knowledge about these pathogen organisms' diversity has increased these last years, many species still need to be identified and more investigations should be performed. Batoid fishes represent a significant biological resource, especially among populations of the Mediterranean basin. However, the data on their parasitic fauna, particularly in Algeria, remains unknown and still incomplete. Therefore, the aim of this study is to survey and provide data on the biodiversity of Platyhelminthes parasites of Elasmobranches fishes from Algerian coasts. 3217 specimens of Batoids belonging to 4 families, Topedinidae, Rajdae, Dasyatidae and Myliobatidae, caught in several sites on the Algerian coasts, were examined for their parasites. 47 taxa, including 7 new for science and belonging to 2 classes Monogenea and Cestoda, have been identified. Monogeneans presented the highest richness with 24 taxa and 5 new species for science: 4 Amphibdelloides species and one Calicotyle species. Cestodes are represented by 23 taxa and 3 new species: 2 Acanthobothrium and 1 species Echinobothrium. This study allowed us to establish for the first time in Algeria an inventory of Platyhelminthes parasites of this group of Chondrichthyes fish, as well as an invaluable contribution to the knowledge about the parasitic fauna of Algerian and Mediterranean Elasmobranch fishes.

Keywords: parasitic platyhelminthes, biodiversity, elasmobranches, algerian coasts, inventory

Procedia PDF Downloads 49
1102 Development of Locally Fabricated Honey Extracting Machine

Authors: Akinfiresoye W. A., Olarewaju O. O., Okunola, Okunola I. O.

Abstract:

An indigenous honey-extracting machine was designed, fabricated and evaluated at the workshop of the department of Agricultural Technology, Federal Polytechnic, Ile-Oluji, Nigeria using locally available materials. It has the extraction unit, the presser, the honey collector and the frame. The harvested honeycomb is placed inside the cylindrical extraction unit with perforated holes. The press plate was then placed on the comb while the hydraulic press of 3 tons was placed on it, supported by the frame. The hydraulic press, which is manually operated, forces the oil out of the extraction chamber through the perforated holes into the honey collector positioned at the lowest part of the extraction chamber. The honey-extracting machine has an average throughput of 2.59 kg/min and an efficiency of about 91%. The cost of producing the honey extracting machine is NGN 31, 700: 00, thirty-one thousand and seven hundred nairas only or $70 at NGN 452.8 to a dollar. This cost is affordable to beekeepers and would-be honey entrepreneurs. The honey-extracting machine is easy to operate and maintain without any complex technical know-how.

Keywords: honey, extractor, cost, efficiency

Procedia PDF Downloads 51
1101 Efficient Subsurface Mapping: Automatic Integration of Ground Penetrating Radar with Geographic Information Systems

Authors: Rauf R. Hussein, Devon M. Ramey

Abstract:

Integrating Ground Penetrating Radar (GPR) with Geographic Information Systems (GIS) can provide valuable insights for various applications, such as archaeology, transportation, and utility locating. Although there has been progress toward automating the integration of GPR data with GIS, fully automatic integration has not been achieved yet. Additionally, manually integrating GPR data with GIS can be a time-consuming and error-prone process. In this study, actual, real-world GPR applications are presented, and a software named GPR-GIS 10 is created to interactively extract subsurface targets from GPR radargrams and automatically integrate them into GIS. With this software, it is possible to quickly and reliably integrate the two techniques to create informative subsurface maps. The results indicated that automatic integration of GPR with GIS can be an efficient tool to map and view any subsurface targets in their appropriate location in a 3D space with the needed precision. The findings of this study could help GPR-GIS integrators save time and reduce errors in many GPR-GIS applications.

Keywords: GPR, GIS, GPR-GIS 10, drone technology, automation

Procedia PDF Downloads 60
1100 A Cost Effective Solar Powered Water Pump Solution for Household Application in the Rural Area of Bangladesh

Authors: Khosru M. Salim, Md. Jasim Uddin, Mohammad Rejwan Uddin

Abstract:

Developing countries like Bangladesh has huge population lives in the rural areas out of electricity. They are using manually operated tube well for collecting underground water to meet their daily demand. A human labour is required to lift water from tube well. Sometimes, it is impossible for a elementary school going child to operate a tube well in the school. Solar powered water pump could be a sustainable water pumping solution in the rural area of Bangladesh. To minimize the cost, a 0.5 horse power solar water pump is designed considering the requirement of water for a typical house hold in this research. A prototype of the 0.5 hp capacity system is implemented and tested in the rooftop of the university lab to validate the performances. Based on the experimental data, the performance of the system is analyzed and presented in this paper.

Keywords: water pump, solar photovoltaic module, performance analysis, feasibility study

Procedia PDF Downloads 282
1099 A Study on the Application of Machine Learning and Deep Learning Techniques for Skin Cancer Detection

Authors: Hritwik Ghosh, Irfan Sadiq Rahat, Sachi Nandan Mohanty, J. V. R. Ravindra

Abstract:

In the rapidly evolving landscape of medical diagnostics, the early detection and accurate classification of skin cancer remain paramount for effective treatment outcomes. This research delves into the transformative potential of Artificial Intelligence (AI), specifically Deep Learning (DL), as a tool for discerning and categorizing various skin conditions. Utilizing a diverse dataset of 3,000 images representing nine distinct skin conditions, we confront the inherent challenge of class imbalance. This imbalance, where conditions like melanomas are over-represented, is addressed by incorporating class weights during the model training phase, ensuring an equitable representation of all conditions in the learning process. Our pioneering approach introduces a hybrid model, amalgamating the strengths of two renowned Convolutional Neural Networks (CNNs), VGG16 and ResNet50. These networks, pre-trained on the ImageNet dataset, are adept at extracting intricate features from images. By synergizing these models, our research aims to capture a holistic set of features, thereby bolstering classification performance. Preliminary findings underscore the hybrid model's superiority over individual models, showcasing its prowess in feature extraction and classification. Moreover, the research emphasizes the significance of rigorous data pre-processing, including image resizing, color normalization, and segmentation, in ensuring data quality and model reliability. In essence, this study illuminates the promising role of AI and DL in revolutionizing skin cancer diagnostics, offering insights into its potential applications in broader medical domains.

Keywords: artificial intelligence, machine learning, deep learning, skin cancer, dermatology, convolutional neural networks, image classification, computer vision, healthcare technology, cancer detection, medical imaging

Procedia PDF Downloads 52
1098 PsyVBot: Chatbot for Accurate Depression Diagnosis using Long Short-Term Memory and NLP

Authors: Thaveesha Dheerasekera, Dileeka Sandamali Alwis

Abstract:

The escalating prevalence of mental health issues, such as depression and suicidal ideation, is a matter of significant global concern. It is plausible that a variety of factors, such as life events, social isolation, and preexisting physiological or psychological health conditions, could instigate or exacerbate these conditions. Traditional approaches to diagnosing depression entail a considerable amount of time and necessitate the involvement of adept practitioners. This underscores the necessity for automated systems capable of promptly detecting and diagnosing symptoms of depression. The PsyVBot system employs sophisticated natural language processing and machine learning methodologies, including the use of the NLTK toolkit for dataset preprocessing and the utilization of a Long Short-Term Memory (LSTM) model. The PsyVBot exhibits a remarkable ability to diagnose depression with a 94% accuracy rate through the analysis of user input. Consequently, this resource proves to be efficacious for individuals, particularly those enrolled in academic institutions, who may encounter challenges pertaining to their psychological well-being. The PsyVBot employs a Long Short-Term Memory (LSTM) model that comprises a total of three layers, namely an embedding layer, an LSTM layer, and a dense layer. The stratification of these layers facilitates a precise examination of linguistic patterns that are associated with the condition of depression. The PsyVBot has the capability to accurately assess an individual's level of depression through the identification of linguistic and contextual cues. The task is achieved via a rigorous training regimen, which is executed by utilizing a dataset comprising information sourced from the subreddit r/SuicideWatch. The diverse data present in the dataset ensures precise and delicate identification of symptoms linked with depression, thereby guaranteeing accuracy. PsyVBot not only possesses diagnostic capabilities but also enhances the user experience through the utilization of audio outputs. This feature enables users to engage in more captivating and interactive interactions. The PsyVBot platform offers individuals the opportunity to conveniently diagnose mental health challenges through a confidential and user-friendly interface. Regarding the advancement of PsyVBot, maintaining user confidentiality and upholding ethical principles are of paramount significance. It is imperative to note that diligent efforts are undertaken to adhere to ethical standards, thereby safeguarding the confidentiality of user information and ensuring its security. Moreover, the chatbot fosters a conducive atmosphere that is supportive and compassionate, thereby promoting psychological welfare. In brief, PsyVBot is an automated conversational agent that utilizes an LSTM model to assess the level of depression in accordance with the input provided by the user. The demonstrated accuracy rate of 94% serves as a promising indication of the potential efficacy of employing natural language processing and machine learning techniques in tackling challenges associated with mental health. The reliability of PsyVBot is further improved by the fact that it makes use of the Reddit dataset and incorporates Natural Language Toolkit (NLTK) for preprocessing. PsyVBot represents a pioneering and user-centric solution that furnishes an easily accessible and confidential medium for seeking assistance. The present platform is offered as a modality to tackle the pervasive issue of depression and the contemplation of suicide.

Keywords: chatbot, depression diagnosis, LSTM model, natural language process

Procedia PDF Downloads 37
1097 Adversary Emulation: Implementation of Automated Countermeasure in CALDERA Framework

Authors: Yinan Cao, Francine Herrmann

Abstract:

Adversary emulation is a very effective concrete way to evaluate the defense of an information system or network. It is about building an emulator, which depending on the vulnerability of a target system, will allow to detect and execute a set of identified attacks. However, emulating an adversary is very costly in terms of time and resources. Verifying the information of each technique and building up the countermeasures in the middle of the test is also needed to be accomplished manually. In this article, a synthesis of previous MITRE research on the creation of the ATT&CK matrix will be as the knowledge base of the known techniques and a well-designed adversary emulation software CALDERA based on ATT&CK Matrix will be used as our platform. Inspired and guided by the previous study, a plugin in CALDERA called Tinker will be implemented, which is aiming to help the tester to get more information and also the mitigation of each technique used in the previous operation. Furthermore, the optional countermeasures for some techniques are also implemented and preset in Tinker in order to facilitate and fasten the process of the defense improvement of the tested system.

Keywords: automation, adversary emulation, CALDERA, countermeasures, MITRE ATT&CK

Procedia PDF Downloads 180
1096 Gene Names Identity Recognition Using Siamese Network for Biomedical Publications

Authors: Micheal Olaolu Arowolo, Muhammad Azam, Fei He, Mihail Popescu, Dong Xu

Abstract:

As the quantity of biological articles rises, so does the number of biological route figures. Each route figure shows gene names and relationships. Annotating pathway diagrams manually is time-consuming. Advanced image understanding models could speed up curation, but they must be more precise. There is rich information in biological pathway figures. The first step to performing image understanding of these figures is to recognize gene names automatically. Classical optical character recognition methods have been employed for gene name recognition, but they are not optimized for literature mining data. This study devised a method to recognize an image bounding box of gene name as a photo using deep Siamese neural network models to outperform the existing methods using ResNet, DenseNet and Inception architectures, the results obtained about 84% accuracy.

Keywords: biological pathway, gene identification, object detection, Siamese network

Procedia PDF Downloads 252
1095 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of sequence characters which allow describing a text path. Usually, in clinical research, regular expressions are manually created by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm named REX was designed, and then, implemented as a simplified method to create regexes to classify Spanish text automatically. In order to classify ambiguous cases, such as, when multiple labels are assigned to a testing example, REX includes an information gain method Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statically better regarding accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 295
1094 A Performance Comparison between Conventional and Flexible Box Erecting Machines Using Dispatching Rules

Authors: Min Kyu Kim, Eun Young Lee, Dong Woo Son, Yoon Seok Chang

Abstract:

In this paper, we introduce a flexible box erecting machine (BEM) that swiftly and automatically transforms cardboard into a three dimensional box. Recently, the parcel service and home-shopping industries have grown rapidly, and there is an increasing need for various box types to ship various products. However, workers cannot fold thousands of boxes manually in a day. As such, automatic BEMs are garnering greater attention. This study takes equipment operation into consideration as well as mechanical improvements in order to design a BEM that is able to outperform its conventional counterparts. We analyzed six dispatching rules – First In First Out (FIFO), Shortest Processing Time (SPT), Earliest Due Date (EDD), Setup Avoidance, EDD + SPT, and EDD + Setup Avoidance – to determine which one was most suitable for BEM operation. Consequently, SPT and Setup Avoidance were found to be the most critical rules, followed by EDD + Setup Avoidance, EDD + SPT, EDD, and FIFO. This hierarchy was valid for both our conventional BEM and our new flexible BEM from the viewpoint of processing time. We believe that this research can contribute to flexible BEM management, which has the potential to increase productivity and convenience.

Keywords: automation, box erecting machine, dispatching rule, setup time

Procedia PDF Downloads 336
1093 Evaluation of Video Quality Metrics and Performance Comparison on Contents Taken from Most Commonly Used Devices

Authors: Pratik Dhabal Deo, Manoj P.

Abstract:

With the increasing number of social media users, the amount of video content available has also significantly increased. Currently, the number of smartphone users is at its peak, and many are increasingly using their smartphones as their main photography and recording devices. There have been a lot of developments in the field of Video Quality Assessment (VQA) and metrics like VMAF, SSIM etc. are said to be some of the best performing metrics, but the evaluation of these metrics is dominantly done on professionally taken video contents using professional tools, lighting conditions etc. No study particularly pinpointing the performance of the metrics on the contents taken by users on very commonly available devices has been done. Datasets that contain a huge number of videos from different high-end devices make it difficult to analyze the performance of the metrics on the content from most used devices even if they contain contents taken in poor lighting conditions using lower-end devices. These devices face a lot of distortions due to various factors since the spectrum of contents recorded on these devices is huge. In this paper, we have presented an analysis of the objective VQA metrics on contents taken only from most used devices and their performance on them, focusing on full-reference metrics. To carry out this research, we created a custom dataset containing a total of 90 videos that have been taken from three most commonly used devices, and android smartphone, an IOS smartphone and a DSLR. On the videos taken on each of these devices, the six most common types of distortions that users face have been applied on addition to already existing H.264 compression based on four reference videos. These six applied distortions have three levels of degradation each. A total of the five most popular VQA metrics have been evaluated on this dataset and the highest values and the lowest values of each of the metrics on the distortions have been recorded. Finally, it is found that blur is the artifact on which most of the metrics didn’t perform well. Thus, in order to understand the results better the amount of blur in the data set has been calculated and an additional evaluation of the metrics was done using HEVC codec, which is the next version of H.264 compression, on the camera that proved to be the sharpest among the devices. The results have shown that as the resolution increases, the performance of the metrics tends to become more accurate and the best performing metric among them is VQM with very few inconsistencies and inaccurate results when the compression applied is H.264, but when the compression is applied is HEVC, SSIM and VMAF have performed significantly better.

Keywords: distortion, metrics, performance, resolution, video quality assessment

Procedia PDF Downloads 182
1092 A Framework for Chinese Domain-Specific Distant Supervised Named Entity Recognition

Authors: Qin Long, Li Xiaoge

Abstract:

The Knowledge Graphs have now become a new form of knowledge representation. However, there is no consensus in regard to a plausible and definition of entities and relationships in the domain-specific knowledge graph. Further, in conjunction with several limitations and deficiencies, various domain-specific entities and relationships recognition approaches are far from perfect. Specifically, named entity recognition in Chinese domain is a critical task for the natural language process applications. However, a bottleneck problem with Chinese named entity recognition in new domains is the lack of annotated data. To address this challenge, a domain distant supervised named entity recognition framework is proposed. The framework is divided into two stages: first, the distant supervised corpus is generated based on the entity linking model of graph attention neural network; secondly, the generated corpus is trained as the input of the distant supervised named entity recognition model to train to obtain named entities. The link model is verified in the ccks2019 entity link corpus, and the F1 value is 2% higher than that of the benchmark method. The re-pre-trained BERT language model is added to the benchmark method, and the results show that it is more suitable for distant supervised named entity recognition tasks. Finally, it is applied in the computer field, and the results show that this framework can obtain domain named entities.

Keywords: distant named entity recognition, entity linking, knowledge graph, graph attention neural network

Procedia PDF Downloads 71
1091 Proactive WPA/WPA2 Security Using DD-WRT Firmware

Authors: Mustafa Kamoona, Mohamed El-Sharkawy

Abstract:

Although the latest Wireless Local Area Network technology Wi-Fi 802.11i standard addresses many of the security weaknesses of the antecedent Wired Equivalent Privacy (WEP) protocol, there are still scenarios where the network security are still vulnerable. The first security model that 802.11i offers is the Personal model which is very cheap and simple to install and maintain, yet it uses a Pre Shared Key (PSK) and thus has a low to medium security level. The second model that 802.11i provide is the Enterprise model which is highly secured but much more expensive and difficult to install/maintain and requires the installation and maintenance of an authentication server that will handle the authentication and key management for the wireless network. A central issue with the personal model is that the PSK needs to be shared with all the devices that are connected to the specific Wi-Fi network. This pre-shared key, unless changed regularly, can be cracked using offline dictionary attacks within a matter of hours. The key is burdensome to change in all the connected devices manually unless there is some kind of algorithm that coordinate this PSK update. The key idea of this paper is to propose a new algorithm that proactively and effectively coordinates the pre-shared key generation, management, and distribution in the cheap WPA/WPA2 personal security model using only a DD-WRT router.

Keywords: Wi-Fi, WPS, TLS, DD-WRT

Procedia PDF Downloads 208
1090 Automatic Multi-Label Image Annotation System Guided by Firefly Algorithm and Bayesian Method

Authors: Saad M. Darwish, Mohamed A. El-Iskandarani, Guitar M. Shawkat

Abstract:

Nowadays, the amount of available multimedia data is continuously on the rise. The need to find a required image for an ordinary user is a challenging task. Content based image retrieval (CBIR) computes relevance based on the visual similarity of low-level image features such as color, textures, etc. However, there is a gap between low-level visual features and semantic meanings required by applications. The typical method of bridging the semantic gap is through the automatic image annotation (AIA) that extracts semantic features using machine learning techniques. In this paper, a multi-label image annotation system guided by Firefly and Bayesian method is proposed. Firstly, images are segmented using the maximum variance intra cluster and Firefly algorithm, which is a swarm-based approach with high convergence speed, less computation rate and search for the optimal multiple threshold. Feature extraction techniques based on color features and region properties are applied to obtain the representative features. After that, the images are annotated using translation model based on the Net Bayes system, which is efficient for multi-label learning with high precision and less complexity. Experiments are performed using Corel Database. The results show that the proposed system is better than traditional ones for automatic image annotation and retrieval.

Keywords: feature extraction, feature selection, image annotation, classification

Procedia PDF Downloads 562
1089 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly now a day. People are using different algorithms with different combinations to generate best results. There are six basic emotions which are being studied in this area. Author tried to recognize the facial expressions using object detector algorithms instead of traditional algorithms. Two object detection algorithms were chosen which are Faster R-CNN and YOLO. For pre-processing we used image rotation and batch normalization. The dataset I have chosen for the experiments is Static Facial Expression in Wild (SFEW). Our approach worked well but there is still a lot of room to improve it, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 165
1088 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms

Authors: Naina Mahajan, Bikram Pal Kaur

Abstract:

The purpose of traffic accident analysis is to find the possible causes of an accident. Road accidents cannot be totally prevented but by suitable traffic engineering and management the accident rate can be reduced to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA Data mining tool. These techniques use on the NH (National highway) dataset. With the C4.5 and ID3 technique it gives best results and high accuracy with less computation time and error rate.

Keywords: C4.5, ID3, NH(National highway), WEKA data mining tool

Procedia PDF Downloads 308
1087 Analysis of Citation Rate and Data Reuse for Openly Accessible Biodiversity Datasets on Global Biodiversity Information Facility

Authors: Nushrat Khan, Mike Thelwall, Kayvan Kousha

Abstract:

Making research data openly accessible has been mandated by most funders over the last 5 years as it promotes reproducibility in science and reduces duplication of effort to collect the same data. There are evidence that articles that publicly share research data have higher citation rates in biological and social sciences. However, how and whether shared data is being reused is not always intuitive as such information is not easily accessible from the majority of research data repositories. This study aims to understand the practice of data citation and how data is being reused over the years focusing on biodiversity since research data is frequently reused in this field. Metadata of 38,878 datasets including citation counts were collected through the Global Biodiversity Information Facility (GBIF) API for this purpose. GBIF was used as a data source since it provides citation count for datasets, not a commonly available feature for most repositories. Analysis of dataset types, citation counts, creation and update time of datasets suggests that citation rate varies for different types of datasets, where occurrence datasets that have more granular information have higher citation rates than checklist and metadata-only datasets. Another finding is that biodiversity datasets on GBIF are frequently updated, which is unique to this field. Majority of the datasets from the earliest year of 2007 were updated after 11 years, with no dataset that was not updated since creation. For each year between 2007 and 2017, we compared the correlations between update time and citation rate of four different types of datasets. While recent datasets do not show any correlations, 3 to 4 years old datasets show weak correlation where datasets that were updated more recently received high citations. The results are suggestive that it takes several years to cumulate citations for research datasets. However, this investigation found that when searched on Google Scholar or Scopus databases for the same datasets, the number of citations is often not the same as GBIF. Hence future aim is to further explore the citation count system adopted by GBIF to evaluate its reliability and whether it can be applicable to other fields of studies as well.

Keywords: data citation, data reuse, research data sharing, webometrics

Procedia PDF Downloads 151
1086 Statistical Models and Time Series Forecasting on Crime Data in Nepal

Authors: Dila Ram Bhandari

Abstract:

Throughout the 20th century, new governments were created where identities such as ethnic, religious, linguistic, caste, communal, tribal, and others played a part in the development of constitutions and the legal system of victim and criminal justice. Acute issues with extremism, poverty, environmental degradation, cybercrimes, human rights violations, crime against, and victimization of both individuals and groups have recently plagued South Asian nations. Everyday massive number of crimes are steadfast, these frequent crimes have made the lives of common citizens restless. Crimes are one of the major threats to society and also for civilization. Crime is a bone of contention that can create a societal disturbance. The old-style crime solving practices are unable to live up to the requirement of existing crime situations. Crime analysis is one of the most important activities of the majority of intelligent and law enforcement organizations all over the world. The South Asia region lacks such a regional coordination mechanism, unlike central Asia of Asia Pacific regions, to facilitate criminal intelligence sharing and operational coordination related to organized crime, including illicit drug trafficking and money laundering. There have been numerous conversations in recent years about using data mining technology to combat crime and terrorism. The Data Detective program from Sentient as a software company, uses data mining techniques to support the police (Sentient, 2017). The goals of this internship are to test out several predictive model solutions and choose the most effective and promising one. First, extensive literature reviews on data mining, crime analysis, and crime data mining were conducted. Sentient offered a 7-year archive of crime statistics that were daily aggregated to produce a univariate dataset. Moreover, a daily incidence type aggregation was performed to produce a multivariate dataset. Each solution's forecast period lasted seven days. Statistical models and neural network models were the two main groups into which the experiments were split. For the crime data, neural networks fared better than statistical models. This study gives a general review of the applied statistics and neural network models. A detailed image of each model's performance on the available data and generalizability is provided by a comparative analysis of all the models on a comparable dataset. Obviously, the studies demonstrated that, in comparison to other models, Gated Recurrent Units (GRU) produced greater prediction. The crime records of 2005-2019 which was collected from Nepal Police headquarter and analysed by R programming. In conclusion, gated recurrent unit implementation could give benefit to police in predicting crime. Hence, time series analysis using GRU could be a prospective additional feature in Data Detective.

Keywords: time series analysis, forecasting, ARIMA, machine learning

Procedia PDF Downloads 139
1085 Physics Informed Deep Residual Networks Based Type-A Aortic Dissection Prediction

Authors: Joy Cao, Min Zhou

Abstract:

Purpose: Acute Type A aortic dissection is a well-known cause of extremely high mortality rate. A highly accurate and cost-effective non-invasive predictor is critically needed so that the patient can be treated at earlier stage. Although various CFD approaches have been tried to establish some prediction frameworks, they are sensitive to uncertainty in both image segmentation and boundary conditions. Tedious pre-processing and demanding calibration procedures requirement further compound the issue, thus hampering their clinical applicability. Using the latest physics informed deep learning methods to establish an accurate and cost-effective predictor framework are amongst the main goals for a better Type A aortic dissection treatment. Methods: Via training a novel physics-informed deep residual network, with non-invasive 4D MRI displacement vectors as inputs, the trained model can cost-effectively calculate all these biomarkers: aortic blood pressure, WSS, and OSI, which are used to predict potential type A aortic dissection to avoid the high mortality events down the road. Results: The proposed deep learning method has been successfully trained and tested with both synthetic 3D aneurysm dataset and a clinical dataset in the aortic dissection context using Google colab environment. In both cases, the model has generated aortic blood pressure, WSS, and OSI results matching the expected patient’s health status. Conclusion: The proposed novel physics-informed deep residual network shows great potential to create a cost-effective, non-invasive predictor framework. Additional physics-based de-noising algorithm will be added to make the model more robust to clinical data noises. Further studies will be conducted in collaboration with big institutions such as Cleveland Clinic with more clinical samples to further improve the model’s clinical applicability.

Keywords: type-a aortic dissection, deep residual networks, blood flow modeling, data-driven modeling, non-invasive diagnostics, deep learning, artificial intelligence.

Procedia PDF Downloads 62
1084 Constructing a Semi-Supervised Model for Network Intrusion Detection

Authors: Tigabu Dagne Akal

Abstract:

While advances in computer and communications technology have made the network ubiquitous, they have also rendered networked systems vulnerable to malicious attacks devised from a distance. These attacks or intrusions start with attackers infiltrating a network through a vulnerable host and then launching further attacks on the local network or Intranet. Nowadays, system administrators and network professionals can attempt to prevent such attacks by developing intrusion detection tools and systems using data mining technology. In this study, the experiments were conducted following the Knowledge Discovery in Database Process Model. The Knowledge Discovery in Database Process Model starts from selection of the datasets. The dataset used in this study has been taken from Massachusetts Institute of Technology Lincoln Laboratory. After taking the data, it has been pre-processed. The major pre-processing activities include fill in missed values, remove outliers; resolve inconsistencies, integration of data that contains both labelled and unlabelled datasets, dimensionality reduction, size reduction and data transformation activity like discretization tasks were done for this study. A total of 21,533 intrusion records are used for training the models. For validating the performance of the selected model a separate 3,397 records are used as a testing set. For building a predictive model for intrusion detection J48 decision tree and the Naïve Bayes algorithms have been tested as a classification approach for both with and without feature selection approaches. The model that was created using 10-fold cross validation using the J48 decision tree algorithm with the default parameter values showed the best classification accuracy. The model has a prediction accuracy of 96.11% on the training datasets and 93.2% on the test dataset to classify the new instances as normal, DOS, U2R, R2L and probe classes. The findings of this study have shown that the data mining methods generates interesting rules that are crucial for intrusion detection and prevention in the networking industry. Future research directions are forwarded to come up an applicable system in the area of the study.

Keywords: intrusion detection, data mining, computer science, data mining

Procedia PDF Downloads 270
1083 Desktop High-Speed Aerodynamics by Shallow Water Analogy in a Tin Box for Engineering Students

Authors: Etsuo Morishita

Abstract:

In this paper, we show shallow water in a tin box as an analogous simulation tool for high-speed aerodynamics education and research. It is customary that we use a water tank to create shallow water flow. While a flow in a water tank is not necessarily uniform and is sometimes wavy, we can visualize a clear supercritical flow even when we move a body manually in stationary water in a simple shallow tin box. We can visualize a blunt shock wave around a moving circular cylinder together with a shock pattern around a diamond airfoil. Another interesting analogous experiment is a hydrodynamic shock tube with water and tea. We observe the contact surface clearly due to color difference of the two liquids those are invisible in the real gas dynamics experiment. We first revisit the similarities between high-speed aerodynamics and shallow water hydraulics. Several educational and research experiments are then introduced for engineering students. Shallow water experiments in a tin box simulate properly the high-speed flows.

Keywords: aerodynamics compressible flow, gas dynamics, hydraulics, shock wave

Procedia PDF Downloads 276
1082 Multi-Subpopulation Genetic Algorithm with Estimation of Distribution Algorithm for Textile Batch Dyeing Scheduling Problem

Authors: Nhat-To Huynh, Chen-Fu Chien

Abstract:

Textile batch dyeing scheduling problem is complicated which includes batch formation, batch assignment on machines, batch sequencing with sequence-dependent setup time. Most manufacturers schedule their orders manually that are time consuming and inefficient. More power methods are needed to improve the solution. Motivated by the real needs, this study aims to propose approaches in which genetic algorithm is developed with multi-subpopulation and hybridised with estimation of distribution algorithm to solve the constructed problem for minimising the makespan. A heuristic algorithm is designed and embedded into the proposed algorithms to improve the ability to get out of the local optima. In addition, an empirical study is conducted in a textile company in Taiwan to validate the proposed approaches. The results have showed that proposed approaches are more efficient than simulated annealing algorithm.

Keywords: estimation of distribution algorithm, genetic algorithm, multi-subpopulation, scheduling, textile dyeing

Procedia PDF Downloads 276