Search results for: features extraction
5256 A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction
Authors: Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar
Abstract:
Representing information as tables is a compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easy and widely practiced; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from an OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to the corresponding cell in a dataframe, which can be stored as comma-separated values, a database, Excel, and multiple other usable formats.
Keywords: table extraction, optical character recognition, image processing, text extraction, morphological transformation
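The final mapping step described in this abstract (recognized text to table cells) might be sketched roughly as follows. This is a minimal illustration and not the authors' algorithm: it assumes the row and column boundaries of one table have already been detected, and that each OCR word comes with the pixel coordinates of its centre. The names `words_to_grid` and `grid_to_csv` and the input layout are hypothetical.

```python
import bisect
import csv
import io

def words_to_grid(words, row_edges, col_edges):
    """Place OCR words into table cells.

    words: list of (text, x, y) tuples, where (x, y) is the word centre (assumed input).
    row_edges / col_edges: sorted pixel positions of the detected cell borders (assumed input).
    """
    n_rows, n_cols = len(row_edges) - 1, len(col_edges) - 1
    grid = [[[] for _ in range(n_cols)] for _ in range(n_rows)]
    # Visit words top-to-bottom, then left-to-right, so cell text keeps reading order.
    for text, x, y in sorted(words, key=lambda w: (w[2], w[1])):
        r = bisect.bisect_right(row_edges, y) - 1
        c = bisect.bisect_right(col_edges, x) - 1
        if 0 <= r < n_rows and 0 <= c < n_cols:  # ignore words outside the table
            grid[r][c].append(text)
    return [[" ".join(cell) for cell in row] for row in grid]

def grid_to_csv(grid):
    """Serialize the cell grid as comma-separated values."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()

words = [("Name", 5, 5), ("Age", 55, 5), ("Ada", 5, 25), ("36", 55, 25)]
grid = words_to_grid(words, row_edges=[0, 20, 40], col_edges=[0, 50, 100])
```

Detecting the borders themselves (for example with morphological transformations, as the keywords suggest) is outside the scope of this sketch.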
Procedia PDF Downloads 145
5255 Automatic Staging and Subtype Determination for Non-Small Cell Lung Carcinoma Using PET Image Texture Analysis
Authors: Seyhan Karaçavuş, Bülent Yılmaz, Ömer Kayaaltı, Semra İçer, Arzu Taşdemir, Oğuzhan Ayyıldız, Kübra Eset, Eser Kaya
Abstract:
In this study, our goal was to perform tumor staging and subtype determination automatically using different texture analysis approaches for a very common cancer type, non-small cell lung carcinoma (NSCLC). In particular, we introduced a texture analysis approach, Laws' texture filters, to be used in this context for the first time. The 18F-FDG PET images of 42 patients with NSCLC were evaluated. The number of patients for each tumor stage, i.e., I-II, III or IV, was 14. Approximately 45% of the patients had adenocarcinoma (ADC) and approximately 55% had squamous cell carcinoma (SqCC). The MATLAB technical computing language was used to extract 51 features with first order statistics (FOS), gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), and Laws' texture filters. Sequential forward selection (SFS) was used for feature selection. The selected textural features were used in automatic classification by k-nearest neighbors (k-NN) and support vector machines (SVM). In the automatic classification of tumor stage, the accuracy was approximately 59.5% with the k-NN classifier (k=3) and 69% with SVM (one-versus-one paradigm), using 5 features. In the automatic classification of tumor subtype, the accuracy was around 92.7% with one-versus-one SVM. Texture analysis of FDG-PET images might be used, in addition to metabolic parameters, as an objective tool to assess tumor histopathological characteristics and in the automatic classification of tumor stage and subtype.
Keywords: cancer stage, cancer cell type, non-small cell lung carcinoma, PET, texture analysis
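To make one of the feature families named above concrete, the following is a minimal sketch of a gray-level co-occurrence matrix (GLCM) and one texture feature derived from it (contrast). The study itself used MATLAB; this Python version, the function names, the single horizontal offset and the toy two-level image are all illustrative assumptions.

```python
def glcm(image, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix for gray-level pairs at pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]

def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared gray-level difference."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))

# Toy image already quantized to two gray levels (0 and 1).
toy_image = [[0, 0, 1],
             [1, 1, 0]]
p = glcm(toy_image, levels=2)
```

In practice a PET region of interest would first be quantized to a small number of gray levels, and several offsets would be combined before features are computed.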
Procedia PDF Downloads 326
5254 Extractive Desulfurization of Fuels Using Choline Chloride-Based Deep Eutectic Solvents
Authors: T. Zaki, Fathi S. Soliman
Abstract:
A desulfurization process is required by most, if not all, refineries to achieve ultra-low-sulfur fuel containing less than 10 ppm sulfur. Many technologies have been studied to achieve deep desulfurization under moderate reaction conditions, such as adsorption desulfurization (ADS), oxidative desulfurization (ODS), biodesulfurization and extraction desulfurization (EDS). Extraction desulfurization using deep eutectic solvents (DESs) is considered a simple, cheap, highly efficient and environmentally friendly process. In this work, four DESs were designed and synthesized. Choline chloride (ChCl) was selected as a typical hydrogen bond acceptor (HBA), and ethylene glycol (EG), glycerol (Gl), urea (Ur) and thiourea (Tu) were selected as hydrogen bond donors (HBD), from which a series of deep eutectic solvents were synthesized. The experimental data showed that the synthesized DESs had desulfurization affinities towards the thiophene species in cyclohexane solvent. Ethylene glycol molecules showed more affinity to form hydrogen bonds with thiophene instead of choline chloride. Accordingly, the ethylene glycol-choline chloride DES has the highest extraction efficiency.
Keywords: DES, desulfurization, green solvent, extraction
Procedia PDF Downloads 288
5253 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis
Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze
Abstract:
The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems, Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. This paper presents some modifications to the original MFCC method. In our work, the effectiveness of the proposed changes to MFCC, called Modified Function Cepstral Coefficients (MODFCC), was tested and compared against the original MFCC and RASTA-MFCC features. Prosodic features such as jitter and shimmer are added to the baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within the AURORA databases.
Keywords: auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter
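As a rough illustration of the mel-frequency warping that underlies MFCC-style features, the sketch below converts between hertz and mel and places filter centres evenly on the mel scale. The constants 2595 and 700 are a widely used form of the mel formula; the function names, the 26-filter count and the 0 to 4000 Hz range are illustrative assumptions, not values from the paper, and the proposed MODFCC modifications are not reproduced here.

```python
import math

def hz_to_mel(f_hz):
    """Map a frequency in hertz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filter_centres(n_filters, f_min, f_max):
    """Centre frequencies (Hz) of n_filters spaced evenly in mel between f_min and f_max."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

# Assumed example settings: 26 filters over 0 to 4000 Hz.
centres = mel_filter_centres(26, 0.0, 4000.0)
```

The centres come out densely packed at low frequencies and sparse at high frequencies, which is the perceptual motivation for the mel filterbank.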
Procedia PDF Downloads 425
5252 Enterprise Information Portal Features: Results of Content Analysis Literature Review
Authors: Michal Krčál
Abstract:
Since their introduction in the 1990s, Enterprise Information Portals (EIPs) have been investigated from different perspectives (e.g. project management, technology acceptance, IS success). However, no systematic literature review has been produced to systematize both the research efforts and the technology itself. This paper reports the first results of an extensive systematic literature review focused on research on EIPs and its categorization; specifically, it reports a conceptual model of EIP features. The previous attempt to categorize EIP features was published in 2002. For the purpose of the literature review, the content of 89 articles was analyzed in order to identify and categorize features of EIPs. The methodology of the literature review was as follows. First, search queries were run in major indexing databases (Web of Science and SCOPUS). The query results were assessed for their usability for the goal of the study. Then, full texts were coded in Atlas.ti according to a previously established coding scheme. The codes were categorized and the conceptual model of EIP features was created.
Keywords: enterprise information portal, content analysis, features, systematic literature review
Procedia PDF Downloads 298
5251 Ontology Expansion via Synthetic Dataset Generation and Transformer-Based Concept Extraction
Authors: Andrey Khalov
Abstract:
The rapid proliferation of unstructured data in IT infrastructure management demands innovative approaches for extracting actionable knowledge. This paper presents a framework for ontology-based knowledge extraction that combines relational graph neural networks (R-GNN) with large language models (LLMs). The proposed method leverages the DOLCE framework as the foundational ontology, extending it with concepts from ITSMO for domain-specific applications in IT service management and outsourcing. A key component of this research is the use of transformer-based models, such as DeBERTa-v3-large, for automatic entity and relationship extraction from unstructured texts. Furthermore, the paper explores how transfer learning techniques can be applied to fine-tune large language models (LLaMA) to generate synthetic datasets that improve precision in BERT-based entity recognition and ontology alignment. The resulting IT Ontology (ITO) serves as a comprehensive knowledge base that integrates domain-specific insights from ITIL processes, enabling more efficient decision-making. Experimental results demonstrate significant improvements in knowledge extraction and relationship mapping, offering a cutting-edge solution for enhancing cognitive computing in IT service environments.
Keywords: ontology expansion, synthetic dataset, transformer fine-tuning, concept extraction, DOLCE, BERT, taxonomy, LLM, NER
Procedia PDF Downloads 14
5250 Content-Based Image Retrieval Using HSV Color Space Features
Authors: Hamed Qazanfari, Hamid Hassanpour, Kazem Qazanfari
Abstract:
In this paper, a method is provided for content-based image retrieval. A content-based image retrieval system searches an image database for images similar to a query image, based on the visual content of that image. With the aim of simulating the sensitivity of the human visual system to an image's edges and color features, the concept of the color difference histogram (CDH) is used. The CDH captures the perceptual color difference between two neighboring pixels with regard to colors and edge orientations. Since the HSV color space is close to the human visual system, the CDH is calculated in this color space. In addition, to improve the color features, the color histogram in HSV color space is also used as a feature. Among the extracted features, efficient features are selected using entropy and correlation criteria. The final features capture the content of images most efficiently. The proposed method has been evaluated on three standard databases: Corel 5k, Corel 10k and UKBench. Experimental results show that the accuracy of the proposed image retrieval method is significantly improved compared to recently developed methods.
Keywords: content-based image retrieval, color difference histogram, efficient features selection, entropy, correlation
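The plain HSV color histogram mentioned above (not the color difference histogram itself) could be sketched as below, together with histogram intersection as one possible similarity measure. The 8 x 3 x 3 quantization, the function names and the choice of histogram intersection are assumptions made for illustration; the abstract does not state which quantization or distance the authors used.

```python
import colorsys

def hsv_histogram(pixels, h_bins=8, s_bins=3, v_bins=3):
    """Normalized HSV color histogram of a list of (R, G, B) pixels with 0-255 channels."""
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Quantize each channel; clamp so that a value of exactly 1.0 falls in the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    n = len(pixels)
    return [count / n for count in hist]

def histogram_intersection(a, b):
    """Similarity in [0, 1] between two normalized histograms (1 means identical)."""
    return sum(min(x, y) for x, y in zip(a, b))

red_image = [(255, 0, 0)] * 4
mixed_image = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2
h_red = hsv_histogram(red_image)
h_mixed = hsv_histogram(mixed_image)
similarity = histogram_intersection(h_red, h_mixed)
```

A retrieval system would compute such a vector for every database image once, then rank images by similarity to the query vector.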
Procedia PDF Downloads 250
5249 Prevalence of Lower Third Molar Impactions and Angulations Among Yemeni Population
Authors: Khawlah Al-Khalidi
Abstract:
The purpose of this study was to look into the prevalence of lower third molars in a sample of patients from Ibb University Affiliated Hospital, as well as to study and categorise their position by using Pell and Gregory classification, and to look into a possible correlation between their position and the indication for extraction. Materials and methods: This is a retrospective, observational study in which a sample of 200 patients from Ibb University Affiliated Hospital were studied, including patient record validation and orthopantomography performed in screening appointments in people aged 16 to 21. Results and discussion: Males make up 63% of the sample, while people aged 19 to 20 make up 41.2%. Lower third molars were found in 365 of the 365 instances examined, accounting for 91% of the sample under study. According to Pell and Gregory's categorisation, the most common position is IIB, with 37%, followed by IIA with 21%; less common classes are IIIA, IC, and IIIC, with 1%, 3%, and 3%, respectively. It was feasible to determine that 56% of the lower third molars in the sample were recommended for extraction during the screening consultation. Finally, there are differences in third molar location and angulation. There was, however, a link between the available space for third molar eruption and the need for tooth extraction.
Keywords: lower third molar, extraction, Pell and Gregory classification, lower third molar impaction
Procedia PDF Downloads 55
5248 Recycling of Spent Mo-Co Catalyst for the Recovery of Molybdenum Using Cyphos IL 104
Authors: Harshit Mahandra, Rashmi Singh, Bina Gupta
Abstract:
Molybdenum is widely used in thermocouples, in the anticathodes of X-ray tubes and in the production of steel alloys. Molybdenum compounds are extensively used as catalysts in petroleum-refining industries for hydrodesulphurization. The activity of the catalysts decreases gradually with time, and they are dumped as hazardous waste due to contamination with toxic materials during the process. These spent catalysts can serve as a secondary source for metal recovery and help to resolve environmental and economic issues. In the present study, the extraction and separation of molybdenum from a Mo-Co spent catalyst leach liquor containing 0.870 g L⁻¹ Mo, 0.341 g L⁻¹ Co, 0.422 ×10⁻¹ g L⁻¹ Fe and 0.508 g L⁻¹ Al in 3 mol L⁻¹ HCl has been investigated using a solvent extraction technique. The extracted molybdenum was finally recovered as molybdenum trioxide. The leaching conditions were 3 mol L⁻¹ HCl, a temperature of 90°C, a solid to liquid ratio (w/v) of 1.25% and a reaction time of 60 minutes; 96.45% of the molybdenum was leached under these conditions. For the extraction of molybdenum from the leach liquor, Cyphos IL 104 [trihexyl(tetradecyl)phosphonium bis(2,4,4-trimethylpentyl)phosphinate] in toluene was used as the extractant. Around 91% of the molybdenum was extracted with 0.02 mol L⁻¹ Cyphos IL 104, and 75% of the molybdenum was stripped from the loaded organic phase with 2 mol L⁻¹ HNO₃ at A/O=1/1. McCabe-Thiele diagrams were drawn to determine the number of stages required for the extraction and stripping of molybdenum. According to the McCabe-Thiele plots, two stages are required for both extraction and stripping of molybdenum at A/O=1/1, which was also confirmed by countercurrent simulation studies. Around 98% of the molybdenum was extracted in two countercurrent extraction stages with no co-extraction of cobalt and aluminum. Iron was removed from the loaded organic phase by scrubbing with 0.01 mol L⁻¹ HCl. Quantitative recovery of molybdenum is achieved in three countercurrent stripping stages at A/O=1/1.
Molybdenum trioxide was obtained from the strip solution and was characterized by XRD, FE-SEM and EDX techniques. Owing to its distinctive electrochromic, thermochromic and photochromic properties, molybdenum trioxide is used as a smart material for sensors, lubricants, and Li-ion batteries. It also finds application in various processes such as methanol oxidation, metathesis, propane oxidation and hydrodesulphurization, and can be used as a precursor for the synthesis of MoS₂ and MoSe₂.
Keywords: Cyphos IL 104, molybdenum, spent Mo-Co catalyst, recovery
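The staged-extraction figures quoted above can be sanity-checked with a simple idealized model. The sketch below uses the Kremser relation for countercurrent extraction with fresh solvent, which assumes a constant distribution ratio across stages; that assumption, the function names, and the back-calculation of the extraction factor from the single-contact value of about 91% are ours, not the authors'. It is not a substitute for the McCabe-Thiele construction they report.

```python
def extraction_factor_from_single_stage(fraction_extracted_once):
    """Extraction factor (distribution ratio times O/A) implied by one equilibrium contact."""
    return fraction_extracted_once / (1.0 - fraction_extracted_once)

def fraction_extracted(extraction_factor, n_stages):
    """Kremser relation: fraction of solute extracted in n ideal countercurrent stages."""
    e = extraction_factor
    if abs(e - 1.0) < 1e-12:
        remaining = 1.0 / (n_stages + 1)  # limiting case when the factor equals 1
    else:
        remaining = (e - 1.0) / (e ** (n_stages + 1) - 1.0)
    return 1.0 - remaining

# About 91% single-contact extraction at A/O = 1/1, as stated in the abstract.
factor = extraction_factor_from_single_stage(0.91)
two_stage = fraction_extracted(factor, 2)
```

Under this constant-ratio idealization, two stages give roughly 99% extraction, which is broadly in line with (though not identical to) the approximately 98% reported; real isotherms are rarely linear, so some difference is expected.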
Procedia PDF Downloads 207
5247 From Binary Solutions to Real Bio-Oils: A Multi-Step Extraction Story of Phenolic Compounds with Ionic Liquid
Authors: L. Cesari, L. Canabady-Rochelle, F. Mutelet
Abstract:
The thermal conversion of lignin produces bio-oils that contain many compounds with high added value, such as phenolic compounds. In order to extract these compounds efficiently, the possible use of the ionic liquid choline bis(trifluoromethylsulfonyl)imide [Choline][NTf2] was explored. To this end, a multistep approach was implemented. First, binary (phenolic compound and solvent) and ternary (phenolic compound, solvent and ionic liquid) solutions were investigated. Eight binary systems of phenolic compound and water were investigated at atmospheric pressure. These systems were quantified using the turbidity method and UV spectroscopy. Ternary systems (phenolic compound, water and [Choline][NTf2]) were investigated at room temperature and atmospheric pressure. After stirring, the solutions were left to settle, and a sample of each phase was collected. The phases were analyzed using gas chromatography with an internal standard. These results were used to determine the interaction parameters of thermodynamic models. Then, extractions were performed on synthetic solutions to determine the influence of several operating conditions (temperature, kinetics, amount of [Choline][NTf2]). With this knowledge, it has been possible to design and simulate an extraction process composed of one extraction column and one flash. Finally, the extraction efficiency of [Choline][NTf2] was quantified with real bio-oils from lignin pyrolysis. Qualitative and quantitative analyses were performed using gas chromatography coupled with mass spectrometry and a flame ionization detector. The experimental measurements show that the extraction of phenolic compounds is efficient at room temperature, is quick, and does not require a large amount of [Choline][NTf2]. Moreover, the simulations of the extraction process demonstrate that the [Choline][NTf2] process requires less energy than an organic-solvent one.
Finally, the efficiency of [Choline][NTf2] was confirmed in real situations with the experiments on lignin pyrolysis bio-oils.
Keywords: bio-oils, extraction, lignin, phenolic compounds
Procedia PDF Downloads 110
5246 Intelligent Rheumatoid Arthritis Identification System Based Image Processing and Neural Classifier
Authors: Abdulkader Helwan
Abstract:
Rheumatoid arthritis is characterized as a chronic inflammatory disorder that affects the joints by damaging body tissues. Therefore, there is an urgent need for an effective intelligent system for identifying rheumatoid arthritis of the knee, especially in its early stages. This paper develops a new intelligent system for the identification of rheumatoid arthritis of the knee using image processing techniques and a neural classifier. The system involves two principal stages. The first is the image processing stage, in which the images are processed using techniques such as RGB to grayscale conversion, rescaling, median filtering, background extraction, image subtraction, segmentation using Canny edge detection, and feature extraction using pattern averaging. The extracted features are then used as inputs to the neural network, which classifies the X-ray knee images as normal or abnormal (arthritic) based on a backpropagation learning algorithm; the network was trained on 400 normal and abnormal X-ray knee images. The system was tested on 400 X-ray images, and the network showed good performance during that phase, with an identification rate of 97%.
Keywords: rheumatoid arthritis, intelligent identification, neural classifier, segmentation, backpropagation
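Pattern averaging, the feature extraction step named above, is commonly understood as replacing each non-overlapping block of pixels with its mean, which shrinks the image into a short feature vector for the classifier. The sketch below follows that reading; the block size, the function names and the toy image are assumptions, since the abstract does not give the dimensions used.

```python
def pattern_average(image, block):
    """Average each non-overlapping block x block window of a 2-D grayscale image."""
    rows, cols = len(image), len(image[0])
    if rows % block or cols % block:
        raise ValueError("image dimensions must be multiples of the block size")
    averaged = []
    for r in range(0, rows, block):
        averaged_row = []
        for c in range(0, cols, block):
            total = sum(image[r + i][c + j] for i in range(block) for j in range(block))
            averaged_row.append(total / (block * block))
        averaged.append(averaged_row)
    return averaged

def to_feature_vector(matrix):
    """Flatten the averaged image row by row into a classifier input vector."""
    return [value for row in matrix for value in row]

toy_image = [[1, 3, 5, 7],
             [1, 3, 5, 7],
             [2, 2, 6, 6],
             [2, 2, 6, 6]]
features = to_feature_vector(pattern_average(toy_image, block=2))
```

Here a 4 x 4 image becomes a 4-element vector; the same reduction applied to a segmented X-ray keeps the network input small.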
Procedia PDF Downloads 533
5245 Microwave-Assisted Alginate Extraction from Portuguese Saccorhiza polyschides – Influence of Acid Pretreatment
Authors: Mário Silva, Filipa Gomes, Filipa Oliveira, Simone Morais, Cristina Delerue-Matos
Abstract:
Brown seaweeds are abundant on the Portuguese coastline and represent an almost unexploited marine economic resource. One of the most common species, easily available for harvesting on the northwest coast, is Saccorhiza polyschides, which grows on the lowest shore and coastal rocky reefs. It is almost exclusively used by local farmers as a natural fertilizer, but it contains a substantial amount of valuable compounds, particularly alginates, natural biopolymers of high interest for many industrial applications. Alginates are natural polysaccharides present in the cell walls of brown seaweed; they are highly biocompatible and have particular properties that make them of high interest for the food, biotechnology, cosmetics and pharmaceutical industries. Conventional extraction processes are based on thermal treatment. They are lengthy and consume large amounts of energy and solvents. In recent years, microwave-assisted extraction (MAE) has shown enormous potential to overcome the major drawbacks of conventional (thermal and/or solvent-based) plant material extraction techniques, and it has also been successfully applied to the extraction of agar, fucoidans and alginates. In the present study, the acid pretreatment of the brown seaweed Saccorhiza polyschides for subsequent microwave-assisted extraction of alginate was optimized. Seaweeds were collected in northwest Portuguese coastal waters of the Atlantic Ocean between May and August 2014. Experimental design was used to assess the effect of temperature and acid pretreatment time on alginate extraction. Response surface methodology allowed the determination of the optimum MAE conditions: 40 mL of 0.1 M HCl per g of dried seaweed with constant stirring at 20ºC for 14 h.
The optimal acid pretreatment conditions significantly enhanced the MAE of alginates from Saccorhiza polyschides, thus contributing to the development of a viable, more environmentally friendly alternative to conventional processes.
Keywords: acid pretreatment, alginate, brown seaweed, microwave-assisted extraction, response surface methodology
Procedia PDF Downloads 382
5244 Use RP-HPLC To Investigate Factors Influencing Sorghum Protein Extraction
Authors: Khaled Khaladi, Rafika Bibi, Hind Mokrane, Boubekeur Nadjemi
Abstract:
Sorghum (Sorghum bicolor (L.) Moench) is an important cereal crop grown in the semi-arid tropics of Africa and Asia due to its drought tolerance. Sorghum grain has a protein content varying from 6 to 18%, with an average of 11%. Sorghum proteins can be broadly classified into prolamin and non-prolamin proteins. Kafirins, the major storage proteins, are classified as prolamins, and as such, they contain high levels of proline and glutamine and are soluble in non-polar solvents such as aqueous alcohols. Kafirins account for 77 to 82% of the protein in the endosperm, whereas non-prolamin proteins (namely, albumins, globulins, and glutelins) make up about 30% of the proteins. To optimize the extraction of sorghum proteins, several variables were examined: detergent type and concentration, reducing agent type and concentration, and buffer pH and concentration. Samples were quantified and characterized by RP-HPLC.
Keywords: sorghum, protein extraction, detergent, food science
Procedia PDF Downloads 320
5243 Investigating the Stylistic Features of Advertising: Ad Design and Creation
Authors: Asma Ben Abdallah
Abstract:
Language has a powerful influence over people and their actions. The language of advertising has a very great impact on the consumer. It makes use of different features from the linguistic continuum. The present paper attempts to apply the theories of stylistics to the analysis of advertising texts. In order to decipher the stylistic features of the advertising discourse, 30 advertising text samples designed by MA Business students have been selected. These samples have been analyzed at the level of design and content. The study brings insights into the use of stylistic devices in advertising, and it reveals that both linguistic and non-linguistic features of advertisements are frequently employed to develop a well-thought-out design and content. The practical significance of the study is to highlight the specificities of the advertising genre so that people interested in the language of advertising (Business students and ESP teachers) will have a better understanding of the nature of the language used and the techniques of writing and designing ads. Similarly, those working in the advertising sphere (ad designers) will appreciate the specificities of the advertising discourse.
Keywords: the language of advertising, advertising discourse, ad design, stylistic features
Procedia PDF Downloads 238
5242 Recovery of Copper and Gold by Delamination of Printed Circuit Boards Followed by Leaching and Solvent Extraction Process
Authors: Kamalesh Kumar Singh
Abstract:
With the increasing volume of electronic waste, especially ICT-related gadgets, green recycling remains a major challenge. This article presents a two-stage, eco-friendly hydrometallurgical route for the recovery of gold from the delaminated metallic layers of waste mobile phone Printed Circuit Boards (PCBs). Initially, mobile phone PCBs are downsized (1x1 cm²) and treated with the organic solvent dimethylacetamide (DMA) to separate the metallic fraction from the non-metallic glass fiber. In the first stage, the liberated metallic sheets are used for the selective dissolution of copper in an aqueous leaching reagent. Parameters such as the type of leaching reagent, the concentration of the solution, temperature, time and pulp density are optimized for effective leaching (almost 100%) of copper. Results have shown that 3M nitric acid is a suitable reagent for copper leaching at room temperature and that, owing to its chemical properties, gold remains in the solid residue. In the second stage, the separated residue is used for the recovery of gold using sulphuric acid in combination with a halide salt. In this halide leaching, Cl₂ or Br₂ is generated as an in-situ oxidant to improve the leaching of gold. Results have shown that almost 92% of the gold is recovered at the optimized parameters.
Keywords: printed circuit boards, delamination, leaching, solvent extraction, recovery
Procedia PDF Downloads 57
5241 Effect of Honey on Rate of Healing of Socket after Tooth Extraction in Rabbits
Authors: Deependra Prasad Sarraf, Ashish Shrestha, Mehul Rajesh Jaisani, Gajendra Prasad Rauniar
Abstract:
Background: Honey is the world's oldest known wound dressing. Its wound healing properties are still not fully established. Concerns about antibiotic resistance and a renewed interest in natural remedies have prompted a resurgence of attention to the antimicrobial and wound healing properties of honey. Evidence from animal studies and some trials has suggested that honey may accelerate wound healing in burns, infected wounds and open wounds. None of these reports has documented the effect of honey on the healing of the socket after tooth extraction. Therefore, the present experimental study was planned to evaluate the efficacy of honey on the healing of the socket after tooth extraction in rabbits. Materials and Methods: An experimental study was conducted in six New Zealand White rabbits. The first premolar tooth on both sides of the lower jaw was extracted under anesthesia produced by ketamine and xylazine, followed by application of honey to one socket (test group) and normal saline to the opposite socket (control group). The intervention was continued for two more days. On the 7th day, a biopsy was taken from the extraction site, and histopathological examination was done. Student's t-test was used for comparison between the groups, and differences were considered statistically significant at a p-value less than 0.05. Results: There was a significant difference between the control group and the test group in terms of fibroblast proliferation (p = 0.0019) and bony trabeculae formation (p = 0.0003). Inflammatory cells were also observed in both groups, and the difference was not significant (p = 1.0). The overlying epithelium was hyperplastic in both groups. Conclusion: The study showed that local application of honey promoted rapid healing, particularly by increasing fibroblast proliferation and bony trabeculae formation.
Keywords: honey, extraction wound, Nepal, healing
Procedia PDF Downloads 293
5240 Kinetic and Removable of Amoxicillin Using Aliquat336 as a Carrier via a HFSLM
Authors: Teerapon Pirom, Ura Pancharoen
Abstract:
Amoxicillin is an antibiotic which is widely used to treat various infections in both human beings and animals. However, amoxicillin released into the environment is a major problem: it causes bacterial resistance to these drugs and failure of antibiotic treatment. Liquid membranes are of great interest as a promising method for the separation and recovery of target ions from aqueous solutions, because the use of carriers in the transport mechanism results in high selectivity and rapid transport of the desired metal ions. The simultaneous extraction and stripping in a single unit operation of a liquid membrane system is very attractive. It is therefore practical to apply liquid membranes, particularly the hollow fiber supported liquid membrane (HFSLM), in industrial applications, as HFSLM has proved to be a separation process with lower capital and operating costs, low energy consumption, a long extractant lifetime, high selectivity and high fluxes compared with solid membranes. Its simple design is amenable to scaling up for industrial applications. The extraction and recovery of amoxicillin through the HFSLM using aliquat336 as a carrier were explored with experimental data. The important variables affecting the transport of amoxicillin, viz. extractant concentration and operating time, were investigated. The highest amoxicillin extraction of 85.35% and stripping of 80.04% were achieved under the best conditions of 6 mmol/L aliquat336 and an operating time of 100 min. The extraction reaction order (n) and the extraction reaction rate constant (kf) were found to be 1.00 and 0.0344 min⁻¹, respectively.
Keywords: aliquat336, amoxicillin, HFSLM, kinetic
Procedia PDF Downloads 275
5239 Laser Data Based Automatic Generation of Lane-Level Road Map for Intelligent Vehicles
Authors: Zehai Yu, Hui Zhu, Linglong Lin, Huawei Liang, Biao Yu, Weixin Huang
Abstract:
With the development of intelligent vehicle systems, a high-precision road map is increasingly needed in many aspects. Automatic lane line extraction and modeling are the most essential steps in generating a precise lane-level road map. In this paper, an automatic lane-level road map generation system is proposed. To extract the road markings on the ground, the multi-region Otsu thresholding method is applied, which calculates the intensity value of laser data that maximizes the variance between background and road markings. The extracted road marking points are then projected onto a raster image and clustered using a two-stage clustering algorithm. Lane lines are subsequently recognized from these clusters by the shape features of their minimum bounding rectangles. To ensure the storage efficiency of the map, the lane lines are approximated by cubic polynomial curves using a Bayesian estimation approach. The proposed lane-level road map generation system has been tested in urban and expressway conditions in Hefei, China. The experimental results on the datasets show that our method achieves excellent extraction and clustering results, and the fitted lines reach a high position accuracy with an error of less than 10 cm.
Keywords: curve fitting, lane-level road map, line recognition, multi-thresholding, two-stage clustering
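The thresholding idea described above (choose the intensity that maximizes the variance between background and road markings) is the classic Otsu criterion. A minimal single-region version is sketched below; the paper applies a multi-region variant to laser intensity data, which is not reproduced here, and the function name, the 256-level range and the toy intensities are assumptions.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    values: integer intensities in [0, levels). Class 0 is values <= t, class 1 is values > t.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue  # no points in the lower class yet
        if weight1 == 0:
            break  # every point is now in the lower class
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = weight0 * weight1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Toy data: dark road surface returns near 10, bright marking returns near 200.
intensities = [10] * 50 + [12] * 10 + [200] * 20 + [198] * 5
threshold = otsu_threshold(intensities)
marking_points = [v for v in intensities if v > threshold]
```

Applying the same function separately to each region of a scan is one straightforward way to approximate a multi-region scheme, though the authors' exact partitioning is not described in the abstract.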
Procedia PDF Downloads 128
5238 TARF: Web Toolkit for Annotating RNA-Related Genomic Features
Abstract:
Genomic features, i.e., genome-based coordinates, are commonly used to represent biological features such as genes, RNA transcripts and transcription factor binding sites. For the analysis of RNA-related genomic features, such as RNA modification sites, a common task is to correlate these features with transcript components (5'UTR, CDS, 3'UTR) to explore their distribution characteristics in terms of transcriptomic coordinates, e.g., to examine whether a specific type of biological feature is enriched near transcription start sites. Existing approaches for performing these tasks involve the manipulation of a gene database, conversion from genome-based coordinates to transcript-based coordinates, and visualization methods that are capable of showing RNA transcript components and the distribution of the features. These steps are complicated and time-consuming, especially for researchers who are not familiar with the relevant tools. To overcome this obstacle, we developed a dedicated web app, TARF, which stands for web Toolkit for Annotating RNA-related genomic Features. TARF provides a web-based way to easily annotate and visualize RNA-related genomic features. Once a user has uploaded the features in BED format and specified a built-in transcript database or uploaded a customized gene database in GTF format, the tool can perform its three main functions. First, it adds annotation on gene and RNA transcript components. For every feature provided by the user, the overlaps with RNA transcript components are identified, and the information is combined in one table, which is available for copy and download. Summary statistics about ambiguous assignments are also computed. Second, the tool provides a convenient visualization of the features at the single gene/transcript level. For the selected gene, the tool shows the features with the gene model in a genome-based view, and also maps the features to transcript-based coordinates and shows their distribution along a single spliced RNA transcript. Third, a global transcriptomic view of the genomic features is generated using the Guitar R/Bioconductor package. The distribution of features on RNA transcripts is normalized with respect to RNA transcript landmarks, and the enrichment of the features on different RNA transcript components is shown. We tested the newly developed TARF toolkit with 3 different types of genomic features related to chromatin H3K4me3, RNA N6-methyladenosine (m6A) and RNA 5-methylcytosine (m5C), which were obtained from ChIP-Seq, MeRIP-Seq and RNA BS-Seq data, respectively. TARF successfully revealed their respective distribution characteristics, i.e., H3K4me3, m6A and m5C are enriched near transcription start sites, stop codons and 5'UTRs, respectively. Overall, TARF is a useful web toolkit for annotation and visualization of RNA-related genomic features, and should help simplify the analysis of various RNA-related genomic features, especially those related to RNA modifications.
Keywords: RNA-related genomic features, annotation, visualization, web server
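The genome-to-transcript coordinate conversion mentioned in the abstract above can be sketched as follows. This is not TARF's code: the function names, the 0-based half-open exon intervals (BED-style), and the CDS bounds given in transcript coordinates are all assumptions made for illustration.

```python
def genome_to_transcript(pos, exons, strand="+"):
    """Map a 0-based genomic position to a 0-based spliced-transcript position.

    exons: list of (start, end) half-open genomic intervals.
    Returns None when pos falls outside every exon (e.g. in an intron).
    """
    # Walk exons in transcription order: ascending for "+", descending for "-".
    ordered = sorted(exons, reverse=(strand == "-"))
    offset = 0  # transcript length consumed by the exons already passed
    for start, end in ordered:
        if start <= pos < end:
            within = pos - start if strand == "+" else end - 1 - pos
            return offset + within
        offset += end - start
    return None


def transcript_region(tpos, cds_start, cds_end):
    """Label a transcript position given the CDS span [cds_start, cds_end)."""
    if tpos < cds_start:
        return "5'UTR"
    if tpos < cds_end:
        return "CDS"
    return "3'UTR"
```

For example, with exons `[(100, 200), (300, 400)]` on the plus strand, genomic position 310 lies 10 bases into the second exon, so it maps to transcript position 110.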
Procedia PDF Downloads 208
5237 Enhanced Multi-Scale Feature Extraction Using a DCNN by Proposing Dynamic Soft Margin SoftMax for Face Emotion Detection
Authors: Armin Nabaei, M. Omair Ahmad, M. N. S. Swamy
Abstract:
Many facial expression and emotion recognition methods based on traditional approaches such as LDA, PCA, and EBGM have been proposed. In recent years, deep learning models have provided a unique platform by automatically extracting the features for the detection of facial expressions and emotions. However, deep networks require large training datasets to extract features effectively. In this work, we propose an efficient emotion detection algorithm using face images when only small datasets are available for training. We design a deep network whose feature extraction capability is enhanced by several parallel modules between the input and output of the network, each focusing on the extraction of different types of coarse features with fine-grained details, to break the symmetry of the produced information. In doing so, we leverage long-range dependencies, the capture of which is one of the main weaknesses of CNNs. We extend this work by introducing a Dynamic Soft-Margin SoftMax. The conventional SoftMax tends to fit the gold labels very quickly, which leads the model to over-fit, because it cannot determine adequately discriminative feature vectors for some variant class labels. We reduce the risk of over-fitting by using a dynamic rather than static shape of the input tensor in the SoftMax layer, together with a specified soft margin. In effect, the margin acts as a controller of how hard the model should work to push dissimilar embedding vectors apart. The proposed categorical loss has the objective of compacting the same class labels and separating different class labels in the normalized log domain. We penalize predictions with high divergence from the ground-truth labels. Thus, we shorten correct feature vectors and enlarge false prediction tensors; that is, we assign more weight to classes that are in conjunction with each other (namely, “hard labels to learn”). By doing so, we constrain the model to generate more discriminative feature vectors for variant class labels. Finally, for the proposed optimizer, our focus is on solving the weak convergence of the Adam optimizer on non-convex problems. Our optimizer works by an alternative gradient updating procedure with an exponentially weighted moving average function for faster convergence, and exploits a weight decay method to help drastically reduce the learning rate near optima to reach the dominant local minimum. We demonstrate the superiority of our proposed work by surpassing the first-ranked results on three widely used Facial Expression Recognition datasets, with 93.30% on FER-2013 and a 16% improvement compared to the first rank after 10 years, reaching 90.73% on RAF-DB, and 100% k-fold average accuracy on the CK+ dataset. It is also shown to provide performance superior to that of other networks that require much larger training datasets.
Keywords: computer vision, facial expression recognition, machine learning, algorithms, deep learning, neural networks
Procedia PDF Downloads 74
5236 Morphological Features Fusion for Identifying INBREAST-Database Masses Using Neural Networks and Support Vector Machines
Authors: Nadia el Atlas, Mohammed el Aroussi, Mohammed Wahbi
Abstract:
In this paper, a novel technique for mass characterization based on robust feature fusion is presented. The proposed method consists of mainly four stages: (a) the first phase involves segmenting the masses using edge information. (b) The second phase is to calculate and fuse the most relevant morphological features. (c) The last phase is the classification step, which allows us to classify the images into benign and malignant masses. In this step, we implemented Support Vector Machines (SVM) and Artificial Neural Networks (ANN), which were evaluated with the following performance criteria: confusion matrix, accuracy, sensitivity, specificity, receiver operating characteristic (ROC), and error histogram. The effectiveness of this new approach was evaluated on a recently developed database, the INBREAST database. The fusion of the most appropriate morphological features provided very good results. The SVM gives an accuracy of 64.3%, whereas the ANN classifier gives better results with an accuracy of 97.5%.
Keywords: breast cancer, mammography, CAD system, features, fusion
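Three of the performance criteria listed in the abstract above (accuracy, sensitivity, specificity) follow directly from the confusion matrix. The sketch below shows the standard definitions for a binary benign/malignant task; the function name and the label convention (1 = malignant, 0 = benign) are assumptions for illustration, and both classes are assumed to be present.

```python
def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and the derived rates for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    return {
        "confusion": {"tp": tp, "fn": fn, "tn": tn, "fp": fp},
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }
```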
Procedia PDF Downloads 599
5235 Recognition of Tifinagh Characters with Missing Parts Using Neural Network
Authors: El Mahdi Barrah, Said Safi, Abdessamad Malaoui
Abstract:
In this paper, we present an algorithm for reconstructing Tifinagh characters from incomplete 2D scans. The algorithm is based on the correlation between the lost block and its neighbors. The proposed system contains three main parts: pre-processing, feature extraction and recognition. In the first step, we construct a database of Tifinagh characters. In the second step, we apply a “shape analysis algorithm”. In the classification part, we use a neural network. The simulation results demonstrate that the proposed method gives good results.
Keywords: Tifinagh character recognition, neural networks, local cost computation, ANN
Procedia PDF Downloads 334
5234 Using Priority Order of Basic Features for Circumscribed Masses Detection in Mammograms
Authors: Minh Dong Le, Viet Dung Nguyen, Do Huu Viet, Nguyen Huu Tu
Abstract:
In this paper, we present a new method for detecting circumscribed masses in mammograms. Our method is evaluated on 23 mammographic images of circumscribed masses and 20 normal mammograms from the public Mini-MIAS database. The method is quite promising, with a sensitivity (SE) of 95% and only about 1 false positive per image (FPpI). To achieve these results, we proceed as follows: first, the input images are preprocessed to enhance key information about circumscribed masses; next, we calculate and statistically evaluate basic features of abnormal regions on the training database; then, mammograms in the testing database are divided into equal blocks, for which the corresponding features are calculated. Finally, the priority order of the basic features is used to classify blocks as abnormal or normal regions.
Keywords: mammograms, circumscribed masses, evaluated statistically, priority order of basic features
Procedia PDF Downloads 334
5233 A Method to Evaluate and Compare Web Information Extractors
Authors: Patricia Jiménez, Rafael Corchuelo, Hassan A. Sleiman
Abstract:
Web mining is gaining importance at an increasing pace. Currently, there are many complementary research topics under this umbrella. Their common theme is that they all focus on applying knowledge discovery techniques to data that is gathered from the Web. Sometimes, these data are relatively easy to gather, chiefly when they come from server logs. Unfortunately, there are cases in which the data to be mined is the data that is displayed on a web document. In such cases, it is necessary to apply a pre-processing step to first extract the information of interest from the web documents. Such pre-processing steps are performed using so-called information extractors, which are software components that are typically configured by means of rules tailored to extracting the information of interest from a web page and structuring it according to a pre-defined schema. Paramount to getting good mining results is that the technique used to extract the source information is exact, which requires evaluating and comparing the different proposals in the literature from an empirical point of view. According to Google Scholar, about 4,200 papers on information extraction have been published during the last decade. Unfortunately, they were not evaluated within a homogeneous framework, which makes it difficult to compare them empirically. In this paper, we report on an original information extraction evaluation method. Our contribution is three-fold: a) this is the first attempt to provide an evaluation method for proposals that work on semi-structured documents; the little existing work on this topic focuses on proposals that work on free text, which has little to do with extracting information from semi-structured documents; b) it provides a method that relies on statistically sound tests to support the conclusions drawn; the previous work does not provide clear guidelines or recommend statistically sound tests, but rather a survey that collects many features to take into account as well as related work; c) we provide a novel method to compute the performance measures regarding unsupervised proposals; otherwise they would require the intervention of a user to compute them by using the annotations on the evaluation sets and the information extracted. Our contributions will definitely help researchers in this area make sure that they have advanced the state of the art not only conceptually, but also from an empirical point of view; they will also help practitioners make informed decisions on which proposal is the most adequate for a particular problem. This conference is a good forum to discuss our ideas so that we can spread them to help improve the evaluation of information extraction proposals and gather valuable feedback from other researchers.
Keywords: web information extractors, information extraction evaluation method, Google Scholar, web
Procedia PDF Downloads 248
5232 A Technique for Image Segmentation Using K-Means Clustering Classification
Authors: Sadia Basar, Naila Habib, Awais Adnan
Abstract:
The paper presents a technique for image segmentation using K-means clustering classification. Previously presented algorithms were specific; however, they missed neighboring information and required high-speed computers to run the segmentation algorithms. Clustering is the process of partitioning a group of data points into a small number of clusters. The proposed method is a content-aware feature extraction method that is able to run on low-end computers, uses a simple algorithm, requires only low-quality streaming, is efficient, and can be used for security purposes. It is capable of highlighting the boundary and the object. At first, the user enters the data as input. In the next step, the digital image is converted into groups of clusters. Clusters are divided into many regions. Clusters of the same category with the same features are assembled within a group, and different clusters are placed in other groups. Finally, the clusters are combined with respect to similar features and then represented in the form of segments. The clustered image gives a clear representation of the digital image in order to highlight the regions and boundaries of the image. At last, the final image is presented in the form of segments. All colors of the image are separated into clusters.
Keywords: clustering, image segmentation, K-means function, local and global minimum, region
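The idea of separating the colors of an image into clusters can be sketched with a plain K-means (Lloyd's algorithm) over pixel colors. This is a generic illustration assuming NumPy, not the authors' implementation; the function name `kmeans_segment` and the deterministic brightness-based initialization are choices made here for reproducibility.

```python
import numpy as np

def kmeans_segment(image, k=3, iters=20):
    """Cluster the pixels of an H x W x C image by color.

    Returns (labels, centers): an H x W array of cluster indices and the
    k cluster colors.
    """
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)

    # Assumed initialization: k pixels evenly spaced in brightness order.
    order = np.argsort(pixels.sum(axis=1))
    picks = np.linspace(0, len(pixels) - 1, k).astype(int)
    centers = pixels[order[picks]].copy()

    for _ in range(iters):
        # Assignment step: nearest center for every pixel.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Update step: move each center to the mean of its members.
        new_centers = centers.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return labels.reshape(h, w), centers
```

Each distinct label in the returned map corresponds to one segment; region boundaries are the places where neighboring labels differ.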
Procedia PDF Downloads 376
5231 Segmentation of Arabic Handwritten Numeral Strings Based on Watershed Approach
Authors: Nidal F. Shilbayeh, Remah W. Al-Khatib, Sameer A. Nooh
Abstract:
Arabic offline handwriting recognition is considered one of the most challenging topics. Arabic handwritten numeral strings are used to automate systems that deal with numbers, such as postal codes, bank account numbers and numbers on car plates. Segmentation of connected numerals is the main bottleneck in handwritten numeral recognition systems; solving it can in turn increase the speed and efficiency of the recognition system. In this paper, we propose algorithms for automatic segmentation and feature extraction of Arabic handwritten numeral strings based on the Watershed approach. The algorithms have been designed and implemented to achieve the main goal of segmenting and extracting the string of numeral digits written by hand, especially in the courtesy amount of bank checks. The segmentation algorithm partitions the string into multiple regions that can be associated with the properties of one or more criteria. The numeral extraction algorithm extracts the numeral string digits into separate individual digits. Both algorithms for segmentation and feature extraction have been tested successfully and efficiently for all types of numerals.
Keywords: handwritten numerals, segmentation, courtesy amount, feature extraction, numeral recognition
Procedia PDF Downloads 382
5230 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services
Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme
Abstract:
Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, and user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.
Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing
Procedia PDF Downloads 113
5229 1D Convolutional Networks to Compute Mel-Spectrogram, Chromagram, and Cochleogram for Audio Networks
Authors: Elias Nemer, Greg Vines
Abstract:
Time-frequency transformation and spectral representations of audio signals are commonly used in various machine learning applications. Training networks on frequency features such as the Mel-Spectrogram or Cochleogram has been proven more effective and convenient than training on time samples. In practical realizations, these features are created on a different processor and/or pre-computed and stored on disk, requiring additional effort and making it difficult to experiment with different features. In this paper, we provide a PyTorch framework for creating various spectral features as well as time-frequency transformations and time-domain filter-banks using the built-in trainable conv1d() layer. This allows computing these features on the fly as part of a larger network and enables easier experimentation with various combinations and parameters. Our work extends the work in the literature developed for that end: first, by adding more of these features, and also by allowing the possibility of either starting from initialized kernels or training them from random values. The code is written as a template of classes and scripts that users may integrate into their own PyTorch classes or simply use as is and add more layers for various applications.
Keywords: neural networks, Mel-Spectrogram, chromagram, cochleogram, discrete Fourier transform, PyTorch conv1d()
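The idea of obtaining a spectral representation from a 1D convolution can be sketched without the paper's code. The example below uses NumPy only, as a stand-in for a conv1d layer: a strided correlation of the signal with fixed cosine and sine kernels yields the real and imaginary parts of a short-time DFT. The function names, window length, and hop size are assumptions; in a trainable setting the same kernels would serve as initial weights.

```python
import numpy as np

def dft_kernels(win):
    """Build cosine and negative-sine kernels, one row per frequency bin."""
    n = np.arange(win)
    k = np.arange(win // 2 + 1)[:, None]
    angle = 2 * np.pi * k * n / win
    return np.cos(angle), -np.sin(angle)

def conv_spectrogram(signal, win=64, hop=32):
    """Magnitude spectrogram computed as a strided 1D correlation.

    Each output row is one frame; each column is one DFT bin. Sliding the
    kernels with stride `hop` mirrors what a conv1d layer with
    kernel_size=win and stride=hop would compute.
    """
    cos_k, sin_k = dft_kernels(win)
    frames = np.lib.stride_tricks.sliding_window_view(signal, win)[::hop]
    real = frames @ cos_k.T
    imag = frames @ sin_k.T
    return np.sqrt(real ** 2 + imag ** 2)
```

Applying a mel or chroma filter matrix to the output would be one further matrix product, which is why the whole chain can sit inside a network as ordinary layers.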
Procedia PDF Downloads 234
5228 Morphological Properties in Ndre Mjeda's Works
Authors: Shyhrete Morina
Abstract:
This paper deals with morphological features in Mjeda's works. To make such a distinction, these features will be compared to the standard Albanian language, considering the linguistic structure in the morphological field, which represents an all-important segment of the Albanian language. Therefore, the study will focus mainly on the description and construction of these paradigms, which will give a linguistic insight into the entire work of Mjeda as an author who wrote in the dialect of northwestern Geg. We have thus tried to distinguish different parts of the author's language, as well as the distinctive features or even the similarities of these paradigms that arise in the literary work of Mjeda. By constructing the corpus of this phonetic and grammatical segment from the whole of Mjeda's work, we have seen that in these fields he has built a variety of grammatical structures, which are of special importance for the history of Albanian, and in the full version of the work we will point out all the distinctive features as far as we can investigate them. Therefore, our study aims to highlight the linguistic features, namely the author's deep knowledge of the language, the authenticity of its use, and his mutual relationship with it.
Keywords: distinctive morphology, nouns, adjectives, pronouns, Albanian standard language
Procedia PDF Downloads 161
5227 Zonal and Sequential Extraction Design for Large Flat Space to Achieve Perpetual Tenability
Authors: Mingjun Xu, Man Pun Wan
Abstract:
This study proposes an effective smoke control strategy for large flat spaces with low ceilings to achieve the requirement of perpetual tenability. For a large flat space with a low ceiling, the depth of the smoke reservoir is very shallow, and it is difficult to perpetually constrain the smoke within a limited space. A series of numerical tests were conducted to determine the smoke control strategy. A zonal design, i.e., the fire zone and two adjacent zones, was proposed and validated to be effective in controlling smoke. Once a fire happens in a compartment space, the Engineered Smoke Control (ESC) system is activated in three zones, i.e., the fire zone, in which the fire happened, and two adjacent zones. The smoke can then be perpetually constrained within the three smoke zones. To further improve the extraction efficiency, sequential activation of the ESC system within the 3 zones turned out to be more efficient than simultaneous activation. Additionally, the proposed zonal and sequential extraction design can reduce the mechanical extraction flow rate by up to 40.7% compared to the conventional method, making it much more economical.
Keywords: performance-based design, perpetual tenability, smoke control, fire plume
Procedia PDF Downloads 74