Search results for: automatic image colorization
2680 Numerical Implementation and Testing of Fractioning Estimator Method for the Box-Counting Dimension of Fractal Objects
Authors: Abraham Terán Salcedo, Didier Samayoa Ochoa
Abstract:
This work presents a numerical implementation of a method, named the fractioning estimator, for estimating the box-counting dimension of self-avoiding curves in a planar space (fractal objects captured in digital images). Classical digital image processing methods, such as noise filtering, contrast manipulation, and thresholding, among others, are used to obtain binary images suitable for the computations that the fractioning estimator requires. A user interface is developed for performing the image processing operations and testing the fractioning estimator on captured images of real-life fractal objects. To analyze the results, the estimates obtained with the fractioning estimator are compared with those obtained through other methods already implemented in available software for computing and estimating the box-counting dimension.
Keywords: box-counting, digital image processing, fractal dimension, numerical method
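For orientation, the quantity being estimated can be illustrated with the classical box-counting procedure. This is a minimal sketch of that baseline only, not the fractioning estimator, whose details the abstract does not give. The function names, the choice of box sizes, and the test images are assumptions made for illustration.

```python
import math

def box_count(pixels, size):
    """Number of size-by-size grid boxes containing at least one foreground pixel."""
    return len({(r // size, c // size) for r, c in pixels})

def box_counting_dimension(image, sizes=(1, 2, 4, 8, 16)):
    """Least-squares slope of log N(s) against log(1/s) for a binary image.

    `image` is a list of rows of 0/1 values; `sizes` are assumed box side lengths.
    """
    pixels = [(r, c) for r, row in enumerate(image)
              for c, value in enumerate(row) if value]
    if not pixels:
        raise ValueError("image has no foreground pixels")
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Two sanity-check shapes whose dimensions are known in advance.
filled_square = [[1] * 32 for _ in range(32)]                 # a 2-D region
straight_line = [[1] * 32] + [[0] * 32 for _ in range(31)]    # a 1-D curve
```

On a thresholded photograph the slope would normally be fitted only over the range of box sizes where the log-log plot is close to linear, which is a judgment the sketch leaves to the caller.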
Procedia PDF Downloads 83
2679 Automatic Teller Machine System Security by Using Mobile SMS Code
Authors: Husnain Mushtaq, Mary Anjum, Muhammad Aleem
Abstract:
The main objective of this paper is to develop a high-security Automatic Teller Machine (ATM) system. In this system, bankers collect mobile numbers from customers and then send a code to those numbers. In most countries, existing ATMs use a magnetic card reader. The customer is identified by inserting an ATM card with a magnetic strip that holds unique information such as the card number and some security limitations. By entering a personal identification number (PIN), the customer is first authenticated and then gains access to the bank account in order to withdraw cash or use other services provided by the bank. Card fraud is another problem: once the user’s bank card is lost and the password is stolen, or a criminal simply steals a customer’s card and PIN, the criminal can withdraw all the cash in a very short time, which brings great financial losses to the customer. This type of fraud has increased worldwide. To resolve this problem, we provide a solution that uses a “Mobile SMS code” together with the ATM “PIN code” in order to improve the verification and security of customers using the ATM system and to increase confidence in the banking area.
Keywords: PIN, inquiry, biometric, magnetic strip, iris recognition, face recognition
Procedia PDF Downloads 365
2678 A Primary Care Diagnosis of Middle-Aged Men with Oral Cancer Who Underwent Extensive Resection and Flap Repair: A Case Report
Authors: Ching-Yi Huang, Pi-Fen Cheng, Hui-Zhu Chen, Shi Ting Huang, Heng-Hua Wang
Abstract:
This is a case report of a patient with oral cancer who underwent extensive resection and modified right lateral neck lymph node dissection followed by reconstruction with a skin flap. The nursing period lasted from September 25 to October 3, 2017. Through observation, interview, physical assessment, and medical record review, the author identified the following nursing problems: acute pain, impaired oral mucous membrane, and body image change. During the nursing period, the author provided individual and overall nursing care and established mutual trust through the use of empathy. The author listened to the patient and eased his physical discomfort, such as wound pain, using medications and acupuncture massage to relieve pain. For the oral mucosa changes caused by surgery, the author provided continuous and complete oral care and oral exercise training to improve oral mucosal healing and restore swallowing function. For the body image change, the author guided the patient to express his feelings, enhanced support from the family, and encouraged him to attend a head and neck cancer survivor alliance, which allowed the patient to accept the altered body image and reaffirm his self-worth. It is hoped that sharing this nursing experience will help improve the quality of nursing care for oral cancer patients after extensive resection and modified right lateral neck lymph node dissection followed by reconstruction with a skin flap.
Keywords: oral cancer, acute pain, impaired oral mucous membrane, body image change
Procedia PDF Downloads 187
2677 Improved Processing Speed for Text Watermarking Algorithm in Color Images
Authors: Hamza A. Al-Sewadi, Akram N. A. Aldakari
Abstract:
Copyright protection and ownership proof of digital multimedia are nowadays achieved by digital watermarking techniques. A text watermarking algorithm for protecting the property rights and determining the ownership of color images is proposed in this paper. Embedding is achieved by inserting text elements randomly into the color image as noise. The YIQ image processing model is found to be faster than other image processing methods and is hence adopted for the embedding process. Optionally encrypting the text watermark before embedding is also suggested (in case some applications require it); the text can be encrypted using any enciphering technique, adding more difficulty for hackers. Experiments showed an embedding speed more than double that of the other systems considered (such as the least significant bit method and separate color code methods) and a fairly acceptable peak signal-to-noise ratio (PSNR) with low mean square error values for watermarking purposes.
Keywords: steganography, watermarking, time complexity measurements, private keys
Procedia PDF Downloads 143
2676 Impact of Green Marketing Mix Strategy and CSR on Organizational Performance: An Empirical Study of Manufacturing Sector of Pakistan
Authors: Syeda Shawana Mahasan, Muhammad Farooq Akhtar
Abstract:
The objective of this study is to analyze the influence of the green marketing mix strategy and corporate social responsibility (CSR) on the performance of an organization, taking into account the mediating effect of corporate image. The impact of frugal innovation and corporate activism is being examined. The data was gathered from executives at various levels of management, including top, middle, and lower-level managers, from a total of 550 manufacturing enterprises of different sizes, ranging from small to medium to large. The collected replies are processed and analyzed using SMART PLS version 4.0.0.0. The application of PLS-SEM demonstrates that the green marketing mix strategy and corporate social responsibility have a significant impact on organizational performance. Therefore, it is imperative for organizations to effectively adopt environmentally sustainable and socially conscious methods within their operations. The results indicate that the corporate image has a key role in mediating the relationship between the green marketing mix strategy, corporate social responsibility, and organizational performance. This demonstrates the imperative for organizations to actively enhance their favorable reputation among stakeholders. The combination of frugal innovation and corporate activism enhances the connection between corporate image and organizational performance. The current study assists managers in recognizing the significance of these particular constructs in maintaining the long-term performance of the organization.Keywords: green marketing mix strategy, CSR, corporate image, organizational performance, frugal innovation, corporate activism
Procedia PDF Downloads 39
2675 Drugstore Control System Design and Realization Based on Programmable Logic Controller (PLC)
Authors: Muhammad Faheem Khakhi, Jian Yu Wang, Salman Muhammad, Muhammad Faisal Shabir
Abstract:
Population growth and China's two-child policy will boost the pharmaceutical market, and growth is expected to continue for some time; the traditional pharmacy dispensary can no longer meet people's growing medical needs. With strong support from national policy, the automation of traditional pharmacies is the trend of the times, and new intelligent pharmacy systems will continue to promote the development of the pharmaceutical industry. Against this background, the paper proposes a PLC-based intelligent storage and automatic drug delivery system; the complete design of the lower computer's control system and the host computer's software system is presented. The system can be applied to dispensing both Chinese herbal medicines and Western medicines. First, the essentials of an intelligent control system for pharmacies are discussed. After the requirements analysis, the overall scheme of the system design is presented. Second, the software and hardware design of the lower computer's control system is introduced, including the selection of the PLC and of the motion control system; the problems of the human-computer interaction module and of communication between the PC and the PLC are solved, and the program design and development of the PLC control system are completed. The design of the upper computer's software management system is described in detail. The database is built by analyzing the E-R diagram, the communication protocol between the systems is customized, and C++ Builder is adopted to realize the interface module, supply module, main control module, etc. The paper also gives the implementations of the multi-threaded system and the communication method. Lastly, each module of the lower computer's control system is tested. Then, after a test environment is built, the function test of the upper computer's software management system is completed. On this basis, the entire control system undergoes the overall test.
Keywords: automatic pharmacy, PLC, control system, management system, communication
Procedia PDF Downloads 310
2674 A Comprehensive Study and Evaluation on Image Fashion Features Extraction
Authors: Yuanchao Sang, Zhihao Gong, Longsheng Chen, Long Chen
Abstract:
Clothing fashion represents a human’s aesthetic appreciation towards everyday outfits and appetite for fashion, and it reflects the development of status in society, humanity, and economics. However, modelling fashion by machine is extremely challenging because fashion is too abstract to be efficiently described by machines. Even human beings can hardly reach a consensus about fashion. In this paper, we are dedicated to answering a fundamental fashion-related problem: what image feature best describes clothing fashion? To address this issue, we have designed and evaluated various image features, ranging from traditional low-level hand-crafted features to mid-level style awareness features to various current popular deep neural network-based features, which have shown state-of-the-art performance in various vision tasks. In summary, we tested the following 9 feature representations: color, texture, shape, style, convolutional neural networks (CNNs), CNNs with distance metric learning (CNNs&DML), AutoEncoder, CNNs with multiple layer combination (CNNs&MLC) and CNNs with dynamic feature clustering (CNNs&DFC). Finally, we validated the performance of these features on two publicly available datasets. Quantitative and qualitative experimental results on both intra-domain and inter-domain fashion clothing image retrieval showed that deep learning based feature representations far outweigh traditional hand-crafted feature representation. Additionally, among all deep learning based methods, CNNs with explicit feature clustering performs best, which shows feature clustering is essential for discriminative fashion feature representation.Keywords: convolutional neural network, feature representation, image processing, machine modelling
Procedia PDF Downloads 139
2673 Assessing the Current State of Wheelchair Accessibility in Shopping Centers and Stores in Saudi Arabia
Authors: Majed M. Mustafa, Abdulrahman A. Altassan
Abstract:
In recent years, ensuring accessibility for all individuals, particularly those with mobility impairments, has gained significant attention in Saudi Arabia. This research aims to evaluate wheelchair accessibility in shopping centers, malls, and stores across the kingdom, highlighting its critical role in promoting inclusivity and equal access. The study will focus on the availability and quality of ramps, automatic doors, lifts, accessible restrooms, and overall ease of navigation for wheelchair users. Utilizing a mixed-methods approach, the research will employ site assessments, user surveys, and interviews with facility managers to gather comprehensive data. Preliminary findings indicate that while some facilities have made strides in accessibility, there are still numerous areas requiring improvement. The study will provide targeted recommendations to enhance accessibility, ensuring that all users can navigate shopping environments with ease and dignity. Conclusively, this research underscores the need for continuous efforts and policy enhancements to achieve universal design standards in public spaces within Saudi Arabia.Keywords: automatic doors, equal access, ramp quality, wheelchair accessibility
Procedia PDF Downloads 36
2672 Measurement of Susceptibility Users Using Email Phishing Attack
Authors: Cindy Sahera, Sarwono Sutikno
Abstract:
Rapid technological developments also have negative impacts, namely the increase in technology-based crime, or cybercrime. One technique that can be used to conduct cybercrime attacks is email phishing. The question is whether users are aware that email can be misused by others in ways that harm them. This research was conducted to measure the susceptibility of selected targets to email abuse. The objectives of this research are to measure the targets’ susceptibility and to find vulnerabilities in email recipients. The research has three steps: (1) the information gathering phase, (2) the design phase, and (3) the execution phase. The first step is collecting the information necessary to carry out an attack on a target. The next step is designing an attack against the target. The last step is sending phishing emails to the target. There are three levels of susceptibility: level 1 indicates low susceptibility, level 2 indicates intermediate susceptibility, and level 3 indicates high susceptibility. The results showed that more users are at level 1 and level 2 than at level 3, which means the users are not too careless. However, this does not mean the users are safe. Vulnerabilities remain, such as automatic location detection when opening emails and malware that downloads automatically when the user clicks a link in the email.
Keywords: cybercrime, email phishing, susceptibility, vulnerability
Procedia PDF Downloads 289
2671 Myanmar Consonants Recognition System Based on Lip Movements Using Active Contour Model
Authors: T. Thein, S. Kalyar Myo
Abstract:
Human uses visual information for understanding the speech contents in noisy conditions or in situations where the audio signal is not available. The primary advantage of visual information is that it is not affected by the acoustic noise and cross talk among speakers. Using visual information from the lip movements can improve the accuracy and robustness of automatic speech recognition. However, a major challenge with most automatic lip reading system is to find a robust and efficient method for extracting the linguistically relevant speech information from a lip image sequence. This is a difficult task due to variation caused by different speakers, illumination, camera setting and the inherent low luminance and chrominance contrast between lip and non-lip region. Several researchers have been developing methods to overcome these problems; the one is lip reading. Moreover, it is well known that visual information about speech through lip reading is very useful for human speech recognition system. Lip reading is the technique of a comprehensive understanding of underlying speech by processing on the movement of lips. Therefore, lip reading system is one of the different supportive technologies for hearing impaired or elderly people, and it is an active research area. The need for lip reading system is ever increasing for every language. This research aims to develop a visual teaching method system for the hearing impaired persons in Myanmar, how to pronounce words precisely by identifying the features of lip movement. The proposed research will work a lip reading system for Myanmar Consonants, one syllable consonants (င (Nga)၊ ည (Nya)၊ မ (Ma)၊ လ (La)၊ ၀ (Wa)၊ သ (Tha)၊ ဟ (Ha)၊ အ (Ah) ) and two syllable consonants ( က(Ka Gyi)၊ ခ (Kha Gway)၊ ဂ (Ga Nge)၊ ဃ (Ga Gyi)၊ စ (Sa Lone)၊ ဆ (Sa Lain)၊ ဇ (Za Gwe) ၊ ဒ (Da Dway)၊ ဏ (Na Gyi)၊ န (Na Nge)၊ ပ (Pa Saug)၊ ဘ (Ba Gone)၊ ရ (Ya Gaug)၊ ဠ (La Gyi) ). 
In the proposed system, there are three subsystems, the first one is the lip localization system, which localizes the lips in the digital inputs. The next one is the feature extraction system, which extracts features of lip movement suitable for visual speech recognition. And the final one is the classification system. In the proposed research, Two Dimensional Discrete Cosine Transform (2D-DCT) and Linear Discriminant Analysis (LDA) with Active Contour Model (ACM) will be used for lip movement features extraction. Support Vector Machine (SVM) classifier is used for finding class parameter and class number in training set and testing set. Then, experiments will be carried out for the recognition accuracy of Myanmar consonants using the only visual information on lip movements which are useful for visual speech of Myanmar languages. The result will show the effectiveness of the lip movement recognition for Myanmar Consonants. This system will help the hearing impaired persons to use as the language learning application. This system can also be useful for normal hearing persons in noisy environments or conditions where they can find out what was said by other people without hearing voice.Keywords: feature extraction, lip reading, lip localization, Active Contour Model (ACM), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), Two Dimensional Discrete Cosine Transform (2D-DCT)
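The 2D-DCT feature step named in the abstract above can be sketched as follows. This is only an illustration of the transform, assuming an orthonormal DCT-II applied to a small grayscale lip patch and a hypothetical rule that keeps the top-left block of low-frequency coefficients; the abstract does not say which coefficients are kept, and the ACM, LDA, and SVM stages are not shown.

```python
import math

def dct2(block):
    """Orthonormal 2D DCT-II of a rectangular block given as a list of rows."""
    n, m = len(block), len(block[0])

    def alpha(k, size):
        # Scaling that makes the transform orthonormal.
        return math.sqrt((1.0 if k == 0 else 2.0) / size)

    out = [[0.0] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            total = 0.0
            for x in range(n):
                for y in range(m):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * m)))
            out[u][v] = alpha(u, n) * alpha(v, m) * total
    return out

def low_frequency_features(block, keep=2):
    """Hypothetical feature vector: the keep-by-keep low-frequency corner."""
    coeffs = dct2(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]

# A flat patch puts all of its energy in the DC coefficient.
flat_patch = [[1.0] * 4 for _ in range(4)]
features = low_frequency_features(flat_patch)
```

The direct quadruple loop is written for readability; for real frame sizes a library FFT-based DCT would be the practical choice.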
Procedia PDF Downloads 286
2670 Algorithm for Automatic Real-Time Electrooculographic Artifact Correction
Authors: Norman Sinnigen, Igor Izyurov, Marina Krylova, Hamidreza Jamalabadi, Sarah Alizadeh, Martin Walter
Abstract:
Background: EEG is a non-invasive brain activity recording technique with a high temporal resolution that allows the use of real-time applications, such as neurofeedback. However, EEG data are susceptible to electrooculographic (EOG) and electromyography (EMG) artifacts (i.e., jaw clenching, teeth squeezing and forehead movements). Due to their non-stationary nature, these artifacts greatly obscure the information and power spectrum of EEG signals. Many EEG artifact correction methods are too time-consuming when applied to low-density EEG and have been focusing on offline processing or handling one single type of EEG artifact. A software-only real-time method for correcting multiple types of EEG artifacts of high-density EEG remains a significant challenge. Methods: We demonstrate an improved approach for automatic real-time EEG artifact correction of EOG and EMG artifacts. The method was tested on three healthy subjects using 64 EEG channels (Brain Products GmbH) and a sampling rate of 1,000 Hz. Captured EEG signals were imported in MATLAB with the lab streaming layer interface allowing buffering of EEG data. EMG artifacts were detected by channel variance and adaptive thresholding and corrected by using channel interpolation. Real-time independent component analysis (ICA) was applied for correcting EOG artifacts. Results: Our results demonstrate that the algorithm effectively reduces EMG artifacts, such as jaw clenching, teeth squeezing and forehead movements, and EOG artifacts (horizontal and vertical eye movements) of high-density EEG while preserving brain neuronal activity information. The average computation time of EOG and EMG artifact correction for 80 s (80,000 data points) 64-channel data is 300 – 700 ms depending on the convergence of ICA and the type and intensity of the artifact. 
Conclusion: An automatic EEG artifact correction algorithm based on channel variance, adaptive thresholding, and ICA improves high-density EEG recordings contaminated with EOG and EMG artifacts in real-time.Keywords: EEG, muscle artifacts, ocular artifacts, real-time artifact correction, real-time ICA
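The EMG step described above (detection by channel variance with an adaptive threshold, then correction by channel interpolation) might look roughly like the sketch below. The threshold rule (median plus k scaled median absolute deviations across channels) and the interpolation by averaging the clean channels are assumptions chosen for illustration; the abstract does not specify either, and a real system would interpolate from neighbouring electrodes. The ICA-based EOG step is not shown.

```python
import statistics

def flag_noisy_channels(window, k=3.0):
    """Indices of channels whose variance exceeds an adaptive threshold.

    `window` is a list of channels, each a list of samples from one buffer.
    The threshold adapts to the buffer: median + k * 1.4826 * MAD of the
    per-channel variances (an assumed rule, not the authors').
    """
    variances = [statistics.pvariance(channel) for channel in window]
    median_var = statistics.median(variances)
    mad = statistics.median([abs(v - median_var) for v in variances])
    threshold = median_var + k * 1.4826 * mad
    return [i for i, v in enumerate(variances) if v > threshold]

def interpolate_channels(window, bad):
    """Replace flagged channels with the sample-wise mean of the clean ones."""
    good = [ch for i, ch in enumerate(window) if i not in bad]
    if not good:
        raise ValueError("no clean channels to interpolate from")
    n_samples = len(window[0])
    mean_signal = [sum(ch[t] for ch in good) / len(good) for t in range(n_samples)]
    return [list(mean_signal) if i in bad else list(ch)
            for i, ch in enumerate(window)]

# Toy buffer: channel 3 carries a large burst, the others are quiet.
window = [[0, 1, 0, 1], [0, 2, 0, 2], [0, 1, 0, 1], [0, 50, 0, 50], [0, 1.5, 0, 1.5]]
bad = flag_noisy_channels(window)
cleaned = interpolate_channels(window, bad)
```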
Procedia PDF Downloads 180
2669 Classification of Digital Chest Radiographs Using Image Processing Techniques to Aid in Diagnosis of Pulmonary Tuberculosis
Authors: A. J. S. P. Nileema, S. Kulatunga , S. H. Palihawadana
Abstract:
Computer aided detection (CAD) system was developed for the diagnosis of pulmonary tuberculosis using digital chest X-rays with MATLAB image processing techniques using a statistical approach. The study comprised of 200 digital chest radiographs collected from the National Hospital for Respiratory Diseases - Welisara, Sri Lanka. Pre-processing was done to remove identification details. Lung fields were segmented and then divided into four quadrants; right upper quadrant, left upper quadrant, right lower quadrant, and left lower quadrant using the image processing techniques in MATLAB. Contrast, correlation, homogeneity, energy, entropy, and maximum probability texture features were extracted using the gray level co-occurrence matrix method. Descriptive statistics and normal distribution analysis were performed using SPSS. Depending on the radiologists’ interpretation, chest radiographs were classified manually into PTB - positive (PTBP) and PTB - negative (PTBN) classes. Features with standard normal distribution were analyzed using an independent sample T-test for PTBP and PTBN chest radiographs. Among the six features tested, contrast, correlation, energy, entropy, and maximum probability features showed a statistically significant difference between the two classes at 95% confidence interval; therefore, could be used in the classification of chest radiograph for PTB diagnosis. 
With the resulting value ranges of the five texture features with normal distribution, a classification algorithm was then defined to recognize and classify the quadrant images; if the texture feature values of the quadrant image being tested falls within the defined region, it will be identified as a PTBP – abnormal quadrant and will be labeled as ‘Abnormal’ in red color with its border being highlighted in red color whereas if the texture feature values of the quadrant image being tested falls outside of the defined value range, it will be identified as PTBN–normal and labeled as ‘Normal’ in blue color but there will be no changes to the image outline. The developed classification algorithm has shown a high sensitivity of 92% which makes it an efficient CAD system and with a modest specificity of 70%.Keywords: chest radiographs, computer aided detection, image processing, pulmonary tuberculosis
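The texture features listed in the abstract above come from a gray level co-occurrence matrix (GLCM). The sketch below shows, in Python rather than the MATLAB used in the study, how such a matrix and five of the six named features can be computed for one pixel offset. The offset, the number of gray levels, the non-symmetric counting, and the function names are assumptions; correlation is omitted for brevity, and the value ranges used for classification are not given in the abstract.

```python
import math

def glcm(image, levels, dr=0, dc=1):
    """Normalised co-occurrence matrix for pixel pairs separated by (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[n / total for n in row] for row in counts]

def texture_features(p):
    """Contrast, energy, homogeneity, entropy, and maximum probability of a GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "max_probability": max(v for _, _, v in cells),
    }

# Two tiny two-level images: horizontal stripes and vertical stripes.
horizontal = texture_features(glcm([[0, 0], [1, 1]], levels=2))
vertical = texture_features(glcm([[0, 1], [0, 1]], levels=2))
```

With a horizontal offset, the horizontally striped image has zero contrast because neighbouring pixels always match, while the vertically striped image has contrast 1.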
Procedia PDF Downloads 126
2668 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System
Authors: Tadesse Anberbir, Felix Bankole, Tomio Takara, Girma Mamo
Abstract:
In the development of a text-to-speech synthesizer, automatic derivation of correct pronunciation from the grapheme form of a text is a central problem. Particularly deriving phonological features which are not shown in orthography is challenging. In the Amharic language, geminates and epenthetic vowels are very crucial for proper pronunciation but neither is shown in orthography. In this paper, we proposed and integrated a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminates and epenthetic vowel positions, and prepared a duration modeling method. Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by sentence listening test and we achieved an average Mean Opinion Score (MOS) 3.4 (68%) which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowel, we are able to synthesize good quality speech. Our system is mainly suitable to be customized for other Ethiopian languages with limited resources.Keywords: Amharic, gemination, speech synthesis, morphology, epenthesis
Procedia PDF Downloads 87
2667 Toward Subtle Change Detection and Quantification in Magnetic Resonance Neuroimaging
Authors: Mohammad Esmaeilpour
Abstract:
One of the important open problems in medical image processing is the detection and quantification of small changes. In this poster, we investigate how algebraic decomposition techniques can be used for semiautomatically detecting and quantifying subtle changes in Magnetic Resonance (MR) neuroimaging volumes. We mostly focus on the low-rank values of the matrices obtained from decomposing MR image pairs acquired over a period of time. In addition, a skilled neuroradiologist helps the algorithm distinguish between noise and small changes.
Keywords: magnetic resonance neuroimaging, subtle change detection and quantification, algebraic decomposition, basis functions
Procedia PDF Downloads 474
2666 Small Text Extraction from Documents and Chart Images
Authors: Rominkumar Busa, Shahira K. C., Lijiya A.
Abstract:
Text recognition is an important area in computer vision that deals with detecting and recognising text in an image. Optical Character Recognition (OCR) is a saturated area these days, with very good text recognition accuracy. However, when the same OCR methods are applied to text with small font sizes, such as the text in chart images, the recognition rate is less than 30%. This work aims to extract small text in images using a deep learning model, CRNN with CTC loss. The text recognition accuracy is found to improve when image enhancement by super resolution is applied before the CRNN model. We also observe that the text recognition rate increases by a further 18% with the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method shows that further pre-processing of chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images.
Keywords: small text extraction, OCR, scene text recognition, CRNN
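A model trained with CTC loss emits one score vector per time step, including a blank symbol, and the text is read off by collapsing that sequence. The sketch below shows standard greedy (best-path) CTC decoding as one way this readout can be done; it is not taken from the paper, and the alphabet, the blank position, and the toy frame scores are assumptions for illustration.

```python
def ctc_greedy_decode(frames, alphabet, blank_index=0):
    """Best-path CTC decoding.

    `frames` is a list of per-time-step score lists, one score per symbol in
    `alphabet`. The decoder takes the arg-max symbol at each step, merges
    consecutive repeats, and then drops blanks.
    """
    best_path = [max(range(len(scores)), key=scores.__getitem__) for scores in frames]
    decoded = []
    previous = None
    for index in best_path:
        if index != previous and index != blank_index:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)

# Toy output over the assumed alphabet "-ab", where "-" is the blank.
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank, separates the two a's
    [0.2, 0.7, 0.1],    # a
    [0.1, 0.1, 0.8],    # b
]
text = ctc_greedy_decode(frames, "-ab")
```

The blank between the second and fourth frames is what allows a doubled letter to survive the merge step.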
Procedia PDF Downloads 125
2665 Advances of Image Processing in Precision Agriculture: Using Deep Learning Convolution Neural Network for Soil Nutrient Classification
Authors: Halimatu S. Abdullahi, Ray E. Sheriff, Fatima Mahieddine
Abstract:
Agriculture is essential to the continued existence of human life, as people depend directly on it for the production of food. The exponential rise in population calls for a rapid increase in food production, with technology applied to reduce laborious work and maximize output. Technology can improve agriculture in several ways, from pre-planning to post-harvest, through computer vision and image processing: determining soil nutrient composition; applying farm inputs such as fertilizers, herbicides, and water in the right amount, at the right time, and in the right place; detecting weeds; and detecting pests and diseases early. This is precision agriculture, which is thought to be the solution required to achieve our goals. There has been significant improvement in image processing and data processing, which had been a major challenge. A database of images is collected through remote sensing and analyzed, and a model is developed to determine the right treatment plans for different crop types and different regions. Features of vegetation images need to be extracted, classified, segmented, and finally fed into the model. Different techniques have been applied to these processes, including neural networks, support vector machines, and fuzzy logic; most recently, the most effective approach, generating excellent results, is the deep learning approach of convolution neural networks for image classification. A deep convolution neural network is used to determine the soil nutrients required in a plantation for maximum production. Experiments on the developed model yielded an average accuracy of 99.58%.
Keywords: convolution, feature extraction, image analysis, validation, precision agriculture
Procedia PDF Downloads 316
2664 3D Microscopy, Image Processing, and Analysis of Lymphangiogenesis in Biological Models
Authors: Thomas Louis, Irina Primac, Florent Morfoisse, Tania Durre, Silvia Blacher, Agnes Noel
Abstract:
In vitro and in vivo lymphangiogenesis assays are essential for the identification of potential lymphangiogenic agents and the screening of pharmacological inhibitors. In the present study, we analyse three biological models: in vitro lymphatic endothelial cell spheroids, in vivo ear sponge assay, and in vivo lymph node colonisation by tumour cells. These assays provide suitable 3D models to test pro- and anti-lymphangiogenic factors or drugs. 3D images were acquired by confocal laser scanning and light sheet fluorescence microscopy. Virtual scan microscopy followed by 3D reconstruction by image aligning methods was also used to obtain 3D images of whole large sponge and ganglion samples. 3D reconstruction, image segmentation, skeletonisation, and other image processing algorithms are described. Fixed and time-lapse imaging techniques are used to analyse lymphatic endothelial cell spheroids behaviour. The study of cell spatial distribution in spheroid models enables to detect interactions between cells and to identify invasion hierarchy and guidance patterns. Global measurements such as volume, length, and density of lymphatic vessels are measured in both in vivo models. Branching density and tortuosity evaluation are also proposed to determine structure complexity. Those properties combined with vessel spatial distribution are evaluated in order to determine lymphangiogenesis extent. Lymphatic endothelial cell invasion and lymphangiogenesis were evaluated under various experimental conditions. The comparison of these conditions enables to identify lymphangiogenic agents and to better comprehend their roles in the lymphangiogenesis process. The proposed methodology is validated by its application on the three presented models.Keywords: 3D image segmentation, 3D image skeletonisation, cell invasion, confocal microscopy, ear sponges, light sheet microscopy, lymph nodes, lymphangiogenesis, spheroids
Procedia PDF Downloads 378
2663 Optimizing Super Resolution Generative Adversarial Networks for Resource-Efficient Single-Image Super-Resolution via Knowledge Distillation and Weight Pruning
Authors: Hussain Sajid, Jung-Hun Shin, Kum-Won Cho
Abstract:
Image super-resolution is one of the most common computer vision problems, with many important applications. Generative adversarial networks (GANs) have promoted remarkable advances in single-image super-resolution (SR) by recovering photo-realistic images. However, the high memory requirements of GAN-based SR models (mainly the generators) lead to performance degradation and increased energy consumption, making them difficult to deploy on resource-constrained devices. To relieve this problem, we introduce an optimized and highly efficient architecture for the SR-GAN generator model by utilizing model compression techniques, namely knowledge distillation and pruning, which work together to reduce the storage requirement of the model while also improving its performance. Our method begins by distilling the knowledge from a large pre-trained model into a lightweight model using different loss functions. Then, iterative weight pruning is applied to the distilled model to remove less significant weights based on their magnitude, resulting in a sparser network. Knowledge distillation reduces the model size by 40%; pruning then reduces it further by 18%. To accelerate the learning process, we employ the Horovod framework for distributed training on a cluster of 2 nodes, each with 8 GPUs, resulting in improved training performance and faster convergence. Experimental results on various benchmarks demonstrate that the proposed compressed model significantly outperforms state-of-the-art methods in terms of peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and image quality for x4 super-resolution tasks.
Keywords: single-image super-resolution, generative adversarial networks, knowledge distillation, pruning
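The pruning stage described above removes the weights with the smallest magnitude. A minimal sketch of one such pruning step on a flat list of weights is given below; the function name, the per-call sparsity fraction, and the toy weights are assumptions, and the fine-tuning that normally follows each pruning round in an iterative schedule is not shown.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_drop = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_drop])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

# One pruning step at 50% sparsity on toy weights.
weights = [0.5, -0.1, 0.05, -2.0]
pruned = magnitude_prune(weights, 0.5)
```

In practice the zeroed weights only save storage once the tensor is stored in a sparse format or the corresponding structures are removed.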
Procedia PDF Downloads 96
2662 A Comprehensive Methodology for Voice Segmentation of Large Sets of Speech Files Recorded in Naturalistic Environments
Authors: Ana Londral, Burcu Demiray, Marcus Cheetham
Abstract:
Speech recording is a methodology used in many different studies related to cognitive and behaviour research. Modern advances in digital equipment have made it possible to continuously record hours of speech in naturalistic environments and to build rich sets of sound files. Speech analysis can then extract multiple features from these files for different scopes of research in Language and Communication. However, tools for analysing a large set of sound files and automatically extracting relevant features from them are often inaccessible to researchers who are not familiar with programming languages. Manual analysis is a common alternative, with a high time and efficiency cost. In the analysis of long sound files, the first step is voice segmentation, i.e. detecting and labelling segments containing speech. We present a comprehensive methodology aiming to support researchers in voice segmentation, as the first step for data analysis of a big set of sound files. Praat, an open-source software package, is suggested as a tool to run a voice detection algorithm, label segments and files, and extract other quantitative features on a structure of folders containing a large number of sound files. We present the validation of our methodology with a set of 5000 sound files that were collected in the daily life of a group of voluntary participants aged over 65. A smartphone was used to collect sound using the Electronically Activated Recorder (EAR): an app programmed to record 30-second sound samples that were randomly distributed throughout the day. Results demonstrated that automatic segmentation and labelling of files containing speech segments was 74% faster than a manual analysis performed by two independent coders. Furthermore, the methodology presented allows manual adjustment of voiced segments with visualisation of the sound signal, and the automatic extraction of quantitative information on speech.
In conclusion, we propose a comprehensive methodology for voice segmentation, to be used by researchers who have to work with large sets of sound files and are not familiar with programming tools.
Keywords: automatic speech analysis, behavior analysis, naturalistic environments, voice segmentation
Procedia PDF Downloads 281
2661 Adversarial Attacks and Defenses on Deep Neural Networks
Authors: Jonathan Sohn
Abstract:
Deep neural networks (DNNs) have shown state-of-the-art performance for many applications, including computer vision, natural language processing, and speech recognition. Recently, adversarial attacks, which aim to alter the results of deep neural networks by modifying the inputs slightly, have been studied in this context. For example, an adversarial attack on a DNN used for object detection can cause the DNN to miss certain objects. As a result, the reliability of DNNs is undermined by their lack of robustness against adversarial attacks, raising concerns about their use in safety-critical applications such as autonomous driving. In this paper, we focus on studying adversarial attacks and defenses on DNNs for image classification. Two types of adversarial attack are studied: the fast gradient sign method (FGSM) attack and the projected gradient descent (PGD) attack. A DNN forms decision boundaries that separate the input images into different categories. The adversarial attack slightly alters the image to move it over the decision boundary, causing the DNN to misclassify the image. The FGSM attack obtains the gradient with respect to the image and updates the image once based on the gradient to cross the decision boundary. The PGD attack, instead of taking one big step, repeatedly modifies the input image with multiple small steps. There is also another type of attack called the targeted attack, which is designed to make the machine classify an image into a class chosen by the attacker. We can defend against adversarial attacks by incorporating adversarial examples in training. Specifically, instead of training the neural network with clean examples, we can explicitly let the neural network learn from adversarial examples. In our experiments, the digit recognition accuracy on the MNIST dataset drops from 97.81% to 39.50% and 34.01% when the DNN is attacked by FGSM and PGD attacks, respectively.
If we utilize FGSM training as a defense method, the classification accuracy greatly improves from 39.50% to 92.31% for FGSM attacks and from 34.01% to 75.63% for PGD attacks. To further improve the classification accuracy under adversarial attacks, we can also use a stronger PGD training method. PGD training improves the accuracy by 2.7% under FGSM attacks and 18.4% under PGD attacks over FGSM training. It is worth mentioning that neither FGSM nor PGD training affects the accuracy on clean images. In summary, we find that PGD attacks can greatly degrade the performance of DNNs, and PGD training is a very effective way to defend against such attacks. PGD attacks and defence are overall significantly more effective than FGSM methods.
Keywords: deep neural network, adversarial attack, adversarial defense, adversarial machine learning
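The difference between the single-step FGSM attack and the multi-step PGD attack described in this abstract can be sketched on a toy model. The example below is an illustration only: it swaps the DNN for a logistic-regression classifier so that the input gradient has a closed form, and the weights, the perturbation budget `eps`, and the step size `alpha` are made-up values, not those of the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict(x, w, b):
    """Predicted class (0 or 1) of the toy logistic model."""
    return int(sigmoid(w @ x + b) >= 0.5)


def input_gradient(x, y, w, b):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    return (sigmoid(w @ x + b) - y) * w


def fgsm(x, y, w, b, eps):
    """One step of size eps along the sign of the input gradient."""
    x_adv = x + eps * np.sign(input_gradient(x, y, w, b))
    return np.clip(x_adv, 0.0, 1.0)  # keep pixel values in the valid range


def pgd(x, y, w, b, eps, alpha, steps):
    """Several small steps, each projected back into the eps-ball around x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(input_gradient(x_adv, y, w, b))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # L-infinity projection
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv


if __name__ == "__main__":
    w, b = np.array([2.0, -3.0, 1.0]), 0.0  # hypothetical model parameters
    x, y = np.array([0.6, 0.3, 0.5]), 1     # clean input and its true label
    print(predict(x, w, b))
    print(predict(fgsm(x, y, w, b, eps=0.2), w, b))
    print(predict(pgd(x, y, w, b, eps=0.2, alpha=0.05, steps=10), w, b))
```

On a linear model the two attacks end at the same point, because the gradient sign never changes; on a DNN the gradient is recomputed at each PGD step, which is why the iterative attack is typically the stronger one. Adversarial training, as described above, would add such perturbed inputs to the training batches.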
Procedia PDF Downloads 195
2660 Normalized Compression Distance Based Scene Alteration Analysis of a Video
Authors: Lakshay Kharbanda, Aabhas Chauhan
Abstract:
In this paper, an application of Normalized Compression Distance (NCD) to detect notable scene alterations occurring in videos is presented. Several research groups have been developing methods to perform image classification using NCD, a computable approximation to Normalized Information Distance (NID), by studying the degree of similarity in images. The timeframes where significant aberrations between the frames of a video have occurred are identified by obtaining a threshold NCD value using two compressors, LZMA and BZIP2, and by defining scene alterations using Pixel Difference Percentage metrics.
Keywords: image compression, Kolmogorov complexity, normalized compression distance, root mean square error
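As a rough illustration of the idea, the sketch below computes NCD with the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length, using Python's built-in `lzma` and `bz2` modules. The abstract does not state the exact formula, frame encoding, or threshold that were used, so the `scene_changes` helper, the representation of frames as raw byte strings, and the threshold value are assumptions.

```python
import bz2
import lzma


def ncd(x: bytes, y: bytes, compress=lzma.compress) -> float:
    """Normalized Compression Distance between two byte strings."""
    cx = len(compress(x))
    cy = len(compress(y))
    cxy = len(compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


def scene_changes(frames, threshold, compress=lzma.compress):
    """Indices i where frame i differs strongly from frame i - 1.

    `frames` is a sequence of raw frame buffers (bytes); `threshold` is a
    hypothetical cut-off that would have to be tuned per compressor.
    """
    return [
        i
        for i in range(1, len(frames))
        if ncd(frames[i - 1], frames[i], compress) > threshold
    ]


if __name__ == "__main__":
    flat = bytes(range(256)) * 40          # a highly regular "frame"
    for name, comp in (("LZMA", lzma.compress), ("BZIP2", bz2.compress)):
        print(name, ncd(flat, flat, comp))
```

Values near 0 indicate that one frame adds little information beyond the other, and values near 1 indicate unrelated content; real compressors add header overhead, so identical inputs do not score exactly 0.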
Procedia PDF Downloads 340
2659 An Artificial Intelligence Supported QUAL2K Model for the Simulation of Various Physiochemical Parameters of Water
Authors: Mehvish Bilal, Navneet Singh, Jasir Mushtaq
Abstract:
Water pollution puts people's health at risk, and it can also impact the ecology. For practitioners of integrated water resources management (IWRM), water quality modelling may be useful for informing decisions about pollution control (such as discharge permitting) or demand management (such as abstraction permitting). Mathematical simulation, which relates pollutant sources, the movement of contaminant loads, and water quality, is regarded as one of the best estimating tools for comprehending the current pollutant load. The current study involves the QUAL2K model, which includes manual simulation of the various physiochemical characteristics of water. To this end, various sensors could be installed for the automatic simulation of these characteristics. An artificial intelligence model has been proposed for the automatic simulation of water quality parameters. Models of water quality have become an effective tool for identifying worldwide water contamination, as well as the ultimate fate and behavior of contaminants in the water environment. Water quality model research is primarily conducted in Europe and other industrialized countries in the first world, where theoretical underpinnings and practical research are prioritized.
Keywords: artificial intelligence, QUAL2K, simulation, physiochemical parameters
Procedia PDF Downloads 105
2658 Advances in Machine Learning and Deep Learning Techniques for Image Classification and Clustering
Authors: R. Nandhini, Gaurab Mudbhari
Abstract:
From health care to self-driving cars, machine learning and deep learning algorithms have revolutionized fields that make proper use of images and visually oriented data. Segmentation, regression, classification, clustering, dimensionality reduction, etc., are some of the machine learning tasks that have helped machine learning and deep learning models become state-of-the-art in fields where images are the key datasets. Among these tasks, classification and clustering are essential but difficult because of the intricate and high-dimensional characteristics of image data. This study examines and assesses advanced techniques in supervised classification and unsupervised clustering for image datasets, emphasizing the relative efficiency of Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), Deep Embedded Clustering (DEC), and self-supervised learning approaches. Because of the distinctive structural attributes present in images, conventional methods often fail to capture spatial patterns effectively, which has led to the development of models that use more advanced architectures and attention mechanisms. In image classification, we investigated both CNNs and ViTs. The CNN, one of the most promising models and well known for its ability to detect spatial hierarchies, serves as a core model in our study. The ViT also serves as a core model, reflecting a modern classification method based on a self-attention mechanism; this mechanism makes ViTs more robust because it allows them to learn global dependencies in images without relying on convolutional layers. This paper evaluates the performance of these two architectures based on accuracy, precision, recall, and F1-score across different image datasets, analyzing their appropriateness for various categories of images.
In the domain of clustering, we assess DEC, Variational Autoencoders (VAEs), and conventional clustering techniques like k-means, which are used on embeddings derived from CNN models. DEC, a prominent model in the field of clustering, has gained the attention of many ML engineers because of its ability to combine feature learning and clustering into a single framework; its main goal is to improve clustering quality through better feature representation. VAEs, on the other hand, are well known for using latent embeddings to group similar images without requiring prior labels, by means of a probabilistic clustering method.
Keywords: machine learning, deep learning, image classification, image clustering
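The conventional baseline mentioned in this abstract, k-means applied to embeddings taken from a CNN, can be sketched as follows. This is a plain Lloyd's-algorithm illustration under assumptions: the `embeddings` array stands in for features that a CNN would produce, and the random data in the demo is synthetic rather than taken from the study.

```python
import numpy as np


def kmeans(embeddings: np.ndarray, k: int, iters: int = 50, seed: int = 0):
    """Cluster row vectors with Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct embeddings chosen at random.
    centers = embeddings[rng.choice(len(embeddings), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every embedding to every center, shape (n, k).
        dists = np.linalg.norm(
            embeddings[:, None, :] - centers[None, :, :], axis=2
        )
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            embeddings[labels == j].mean(axis=0) if np.any(labels == j)
            else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in for CNN embeddings of two visually distinct classes.
    fake_embeddings = np.vstack([
        rng.normal(0.0, 1.0, size=(20, 4)),
        rng.normal(50.0, 1.0, size=(20, 4)),
    ])
    labels, _ = kmeans(fake_embeddings, k=2)
    print(labels)
```

Unlike DEC, this baseline keeps the embeddings fixed while clustering; DEC, as the abstract notes, learns the features and the cluster assignments jointly.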
Procedia PDF Downloads 10
2657 Quality Analysis of Vegetables Through Image Processing
Authors: Abdul Khalique Baloch, Ali Okatan
Abstract:
Quality analysis of food and vegetables from images is a hot topic nowadays, with researchers improving on previous findings through different techniques and methods. In this research, we reviewed the literature, identified gaps in it, proposed an improved approach, designed the algorithm, and developed software to measure quality from images; the accuracy obtained shows better results, and we compare the results with previous work done so far. The application uses an open-source dataset and the Python language with the TensorFlow Lite framework. In this research, we focus on sorting food and vegetables from images: the application can sort and grade them after processing the images, and it could make fewer errors than human sorting by manual grading. Digital picture datasets were created, and the collected images were arranged by class. The classification accuracy of the system was about 94%. As fruits and vegetables play a main role in day-to-day life, their quality is important in evaluating agricultural produce, and customers always want to buy good-quality fruits and vegetables. This document is about quality detection of fruits and vegetables using images. Most customers suffer from unhealthy foods and vegetables from suppliers, and no proper quality measurement level is followed by hotel managements. We have developed software to measure the quality of fruits and vegetables using images; it tells you whether your fruits and vegetables are fresh or rotten. The algorithms reviewed in this thesis include digital images, ResNet, VGG16, CNN, and transfer learning for grading and feature extraction. This application uses an open-source dataset of images and the Python language, and designs a framework for the system.
Keywords: deep learning, computer vision, image processing, rotten fruit detection, fruits quality criteria, vegetables quality criteria
Procedia PDF Downloads 70
2656 'Low Electronic Noise' Detector Technology in Computed Tomography
Authors: A. Ikhlef
Abstract:
Image noise in computed tomography is mainly caused by statistical noise, system noise, and reconstruction algorithm filters. Over the last few years, low-dose x-ray imaging has become increasingly desired and is regarded as a differentiating technology among CT manufacturers. In order to achieve this goal, several technologies and techniques are being investigated, including both hardware (integrated electronics and photon counting) and software (artificial intelligence and machine learning) based solutions. From a hardware point of view, electronic noise could indeed be a potential driver for low and ultra-low dose imaging. We demonstrated that the reduction or elimination of this term could lead to a reduction of dose without affecting image quality. Also, in this study, we show that this goal can be achieved using conventional electronics (a low-cost and affordable technology), designed carefully and optimized for maximum detective quantum efficiency. We conducted the tests using large imaging objects such as 30 cm water and 43 cm polyethylene phantoms. We compared the image quality with conventional imaging protocols with radiation as low as 10 mAs (<< 1 mGy). Clinical validation of such results has been performed as well.
Keywords: computed tomography, electronic noise, scintillation detector, x-ray detector
Procedia PDF Downloads 126
2655 The Algorithm of Semi-Automatic Thai Spoonerism Words for Bi-Syllable
Authors: Nutthapat Kaewrattanapat, Wannarat Bunchongkien
Abstract:
The purposes of this research are to study and develop an algorithm for Thai spoonerism words using a semi-automatic computer program: for data input, the syllables are already separated, and for the spoonerism itself, the developed algorithm is applied. The algorithm establishes rules and mechanisms for Thai spoonerism of bi-syllable words by analysing the elements of the syllables, namely the cluster consonant, vowel, intonation mark and final consonant. The study finds that bi-syllable Thai spoonerism has one spoonerism mechanism, namely transposing the vowel, intonation mark and final consonant of the two syllables while keeping the initial consonant and cluster (if any). The rules and mechanisms were then applied to develop Thai spoonerism software using PHP. A performance test of the software found that the program performs bi-syllable Thai spoonerism correctly for 99% of all words used in the test, with faults in 1%, because the words obtained from spoonerism may not be spelled in conformity with Thai grammar and a Thai spoonerism may have more than one answer.
Keywords: algorithm, spoonerism, computational linguistics, Thai spoonerism
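The single mechanism described in this abstract can be sketched as a swap of syllable parts. The sketch below is an interpretation, not the authors' PHP software: it assumes that the vowel, intonation mark and final consonant are exchanged while each syllable keeps its initial consonant or cluster, and the `Syllable` type, its field names, and the romanized demo values are hypothetical. It also does not handle the spelling adjustments that the abstract reports as the source of the 1% of faults.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Syllable:
    initial: str  # initial consonant or consonant cluster (kept in place)
    vowel: str    # vowel (swapped)
    tone: str     # intonation mark, "" if none (swapped)
    final: str    # final consonant, "" if none (swapped)


def spoonerize(first: Syllable, second: Syllable):
    """Exchange vowel, intonation mark and final consonant of two syllables."""
    new_first = replace(
        first, vowel=second.vowel, tone=second.tone, final=second.final
    )
    new_second = replace(
        second, vowel=first.vowel, tone=first.tone, final=first.final
    )
    return new_first, new_second


if __name__ == "__main__":
    # Placeholder romanized syllables, used only to show the data flow.
    a = Syllable(initial="kr", vowel="a", tone="1", final="p")
    b = Syllable(initial="m", vowel="i", tone="", final="n")
    print(spoonerize(a, b))
```

Because the input is already split into syllables and their elements, as in the semi-automatic setting described above, the transformation itself is a pure field swap, and applying it twice returns the original pair.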
Procedia PDF Downloads 236
2654 3D Remote Sensing Images Parallax Refining Based On HTML5
Authors: Qian Pei, Hengjian Tong, Weitao Chen, Hai Wang, Yanrong Feng
Abstract:
Horizontal parallax is the foundation of stereoscopic viewing. However, the human eye will feel uncomfortable and diplopia will occur if the horizontal parallax is larger than the eye separation. Therefore, parallax refining is needed before conducting stereoscopic observation. Although some scholars have devoted themselves to online remote sensing refining, the main work of image refining is completed on the server side, which causes a significant delay when multiple users access the server at the same time. The emergence of HTML5 technology in recent years makes it possible to develop rich browser web applications. We perform the image parallax refining on the browser side based on HTML5, while the server side only needs to transfer the image data and parallax file to the browser according to the browser’s request. In this way, we can greatly reduce the server CPU load, allow a large number of users to access the server in parallel, and respond to users’ requests quickly.
Keywords: 3D remote sensing images, parallax, online refining, rich browser web application, HTML5
Procedia PDF Downloads 461
2653 Velocity Distribution in Open Channels with Sand: An Experimental Study
Authors: E. Keramaris
Abstract:
In this study, laboratory experiments on open channel flows over a sand bed were conducted. A porous bed (sand bed) with a porosity of ε=0.70 and a porous thickness of s΄=3 cm was tested. Vertical distributions of velocity were evaluated using two-dimensional (2D) Particle Image Velocimetry (PIV). Velocity profiles were measured above the impermeable bed and above the sand bed for the same total water heights (h = 6, 8, 10 and 12 cm) and for the same slope S=1.5. Measurements of mean velocity indicate the effects of the bed material used (sand bed) on the flow characteristics (velocity distribution and Reynolds number) in comparison with those above the impermeable bed.
Keywords: particle image velocimetry, sand bed, velocity distribution, Reynolds number
Procedia PDF Downloads 374
2652 Building Information Modeling-Based Approach for Automatic Quantity Take-off and Cost Estimation
Authors: Lo Kar Yin, Law Ka Mei
Abstract:
Architectural, engineering, construction and operations (AECO) industry practitioners have adapted well to the dynamic construction market, building on the fundamental training of their disciplines. Triggered further by the pandemic since 2019, great steps have been taken in the virtual environment, and project teams strive for the best collaboration without boundaries. Adopting a Building Information Modeling-based approach and qualitative analysis, this paper reviews the quantity take-off and cost estimation process through modeling techniques, in liaison with suppliers, fabricators, subcontractors, contractors, designers, consultants and services providers in the construction industry value chain. The review covers automatic project cost budgeting, project cost control and cost evaluation of design options for in-situ reinforced-concrete construction and Modular Integrated Construction (MiC) at the design stage, and variation of works and cash flow/spending analysis at the construction stage as far as practicable, with a view to sharing the findings to enhance mutual trust and co-operation among AECO industry practitioners. It aims to foster development through a common prototype of the design and build project delivery method in NEC Engineering and Construction Contract (ECC) Options A and C.
Keywords: building information modeling, cost estimation, quantity take-off, modeling techniques
Procedia PDF Downloads 188
2651 Investigation of Martensitic Transformation Zone at the Crack Tip of NiTi under Mode-I Loading Using Microscopic Image Correlation
Authors: Nima Shafaghi, Gunay Anlaş, C. Can Aydiner
Abstract:
A realistic understanding of martensitic phase transition under complex stress states is key for accurately describing the mechanical behavior of shape memory alloys (SMAs). Particularly regarding the sharply changing stress fields at the tip of a crack, the size, nature and shape of transformed zones are of great interest. There is significant variation among analytical models in their predictions of the size and shape of the transformation zone. As the fully transformed region remains inside a very small boundary at the tip of the crack, experimental validation requires microscopic resolution. Here, the crack tip vicinity of a NiTi compact tension specimen has been monitored in situ using microscopic image correlation at 20x magnification. With nominal 15 micrometer grains and 0.2 micrometer per pixel optical resolution, the strains at the crack tip are mapped with intra-grain detail. The transformation regions are then deduced using an equivalent strain formulation.
Keywords: digital image correlation, fracture, martensitic phase transition, mode I, NiTi, transformation zone
Procedia PDF Downloads 353