Search results for: open information extraction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 14430

14430 Visual Template Detection and Compositional Automatic Regular Expression Generation for Business Invoice Extraction

Authors: Anthony Proschka, Deepak Mishra, Merlyn Ramanan, Zurab Baratashvili

Abstract:

Small and medium-sized businesses receive over 160 billion invoices every year. Since these documents exhibit many subtle differences in layout and text, extracting structured fields such as sender name, amount, and VAT rate from them automatically is an open research question. In this paper, existing work in template-based document extraction is extended, and a system is devised that is able to reliably extract all required fields for up to 70% of all documents in the data set, more than any other previously reported method. The approaches are described for 1) detecting through visual features which template a given document belongs to, 2) automatically generating extraction rules for a given new template by composing regular expressions from multiple components, and 3) computing confidence scores that indicate the accuracy of the automatic extractions. The system can generate templates with as little as one training sample and only requires the ground truth field values instead of detailed annotations such as bounding boxes that are hard to obtain. The system is deployed and used inside a commercial accounting software.
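
The second step in the abstract, composing regular expressions from multiple components, can be illustrated with a minimal sketch. The component names, the patterns, and the `extract_amount` helper are hypothetical and are not taken from the paper; a real system would generate such rules per template from the ground-truth field values.

```python
import re

# Hypothetical building blocks; the paper's actual components are not
# described in the abstract.
COMPONENTS = {
    "label": r"(?:Total|Amount due)",
    "sep": r"\s*:?\s*",
    "currency": r"(?:EUR|USD|€|\$)?\s*",
    "amount": r"(?P<amount>\d{1,3}(?:[.,]\d{3})*[.,]\d{2})",
}


def compose(parts):
    """Concatenate the named components into one compiled regular expression."""
    return re.compile("".join(COMPONENTS[p] for p in parts), re.IGNORECASE)


def extract_amount(text):
    """Return the first amount that follows a known label, or None."""
    rule = compose(["label", "sep", "currency", "amount"])
    match = rule.search(text)
    return match.group("amount") if match else None
```

Keeping each component separate means a new template only needs a different ordering or a swapped label pattern, rather than a hand-written rule for the whole line.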

Keywords: data mining, information retrieval, business, feature extraction, layout, business data processing, document handling, end-user trained information extraction, document archiving, scanned business documents, automated document processing, F1-measure, commercial accounting software

Procedia PDF Downloads 99
14429 Event Extraction, Analysis, and Event Linking

Authors: Anam Alam, Rahim Jamaluddin Kanji

Abstract:

With the rapid growth of events everywhere, event extraction has become an important means of retrieving information from unstructured data. One of the challenging problems is extracting the events themselves from such data. An event is an observable occurrence of interaction among entities. The paper investigates the event extraction capabilities of three software tools: Wandora, Nitro, and SPSS. We applied the standard text mining techniques of these tools to three data sets: (i) the Afghan War Diaries (AWD collection), (ii) MUC4, and (iii) WebKB. Information retrieval measures such as precision and recall were computed over an extensive set of event extraction experiments. The experimental study analyzes the difference between the events extracted by the software and those extracted by humans. This approach helps to construct an algorithm that will be applied with different machine learning methods.
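
As a rough illustration of the evaluation measures mentioned above, precision and recall can be computed by comparing the set of events a tool extracts with a human-annotated reference set. The function name and the representation of events as hashable items are assumptions made for this sketch.

```python
def precision_recall(extracted, reference):
    """Compare tool output with human annotations; both are collections of events."""
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    # Precision: share of extracted events that the annotators also marked.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of annotated events that the tool managed to extract.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall
```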

Keywords: event extraction, Wandora, nitro, SPSS, event analysis, extraction method, AFG, Afghan War Diaries, MUC4, 4 universities, dataset, algorithm, precision, recall, evaluation

Procedia PDF Downloads 556
14428 The Role of Named Entity Recognition for Information Extraction

Authors: Girma Yohannis Bade, Olga Kolesnikova, Grigori Sidorov

Abstract:

Named entity recognition (NER) is a building block for information extraction. Though the information extraction process has been automated using a variety of techniques to find and extract relevant information from unstructured documents, the discovery of targeted knowledge still poses a number of research difficulties because of the variability and lack of structure in Web data. NER, a subtask of information extraction (IE), emerged to ease this difficulty. It deals with finding the proper names (named entities) in a document, such as persons, countries, locations, organizations, dates, and events, and assigning them predetermined labels, which is an initial step in IE tasks. This survey paper presents the roles and importance of NER to IE from the perspective of different algorithms and application domains. It summarizes how researchers have implemented NER in particular application areas such as finance, medicine, defense, business, food science, and archeology. It also outlines three types of sequence labeling algorithms for NER: feature-based, neural network-based, and rule-based. Finally, the state of the art and the evaluation metrics of NER are presented.
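
To make the link between sequence labeling and extracted entities concrete, the sketch below turns per-token tags into labelled entity spans. It assumes the common BIO tagging scheme, which the abstract does not name, and the function name is hypothetical.

```python
def bio_to_spans(tokens, tags):
    """Group tokens tagged B-X / I-X into (text, label) entity spans."""
    spans = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" or an ill-formed I- tag: close the open entity, skip the token.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans
```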

Keywords: the role of NER, named entity recognition, information extraction, sequence labeling algorithms, named entity application area

Procedia PDF Downloads 49
14427 Phishing Attacks Facilitated by Open Source Intelligence

Authors: Urva Maryam

Abstract:

Information has become an important asset in today's world. Globally, various tactics are being observed to confine the spread of information, as it makes people vulnerable to security attacks. Open Source Intelligence (OSINT) is a publicly available source that disseminates information about users, websites, companies, and various organizations. This paper focuses on a quantitative method of exploring various OSINT tools that reveal public information about individuals. This information could further facilitate phishing attacks. Phishing attacks can be launched through email addresses, open ports, and unsecured web surfing. This study analyzes the information retrieved from OSINT tools, namely theHarvester and Maltego, that can be used to send phishing attacks to individuals.

Keywords: e-mail spoofing, Maltego, OSINT, phishing, spear phishing, theHarvester

Procedia PDF Downloads 111
14426 Mechanisms of Ginger Bioactive Compounds Extract Using Soxhlet and Accelerated Water Extraction

Authors: M. N. Azian, A. N. Ilia Anisa, Y. Iwai

Abstract:

The mechanism of extracting bioactive compounds from a plant matrix is essential for optimizing the extraction process. As a benchmark technique, Soxhlet extraction was used to discuss the mechanism and was compared with accelerated water extraction. The trends of both techniques show that the process involves extraction and degradation. The highest yields of 6-, 8-, 10-gingerols and 6-shogaol in Soxhlet extraction were 13.948, 7.12, 10.312 and 2.306 mg/g, respectively. The optimum yields of 6-, 8-, 10-gingerols and 6-shogaol extracted by accelerated water extraction at 140 °C were 68.97±3.95 mg/g at 3 min, 18.98±3.04 mg/g at 5 min, 5.167±2.35 mg/g at 3 min and 14.57±6.27 mg/g at 3 min, respectively. The effect of temperature at 3 min shows that the concentration of 6-shogaol increased rapidly as the recovery of 6-gingerol decreased.

Keywords: mechanism, ginger bioactive compounds, soxhlet extraction, accelerated water extraction

Procedia PDF Downloads 394
14425 Phishing Attacks Facilitated by Open Source Intelligence

Authors: Urva Maryam

Abstract:

Information has become an important asset in today's world. Globally, various tactics are being observed to confine the spread of information, as it makes people vulnerable to security attacks. Open Source Intelligence (OSINT) is a publicly available source that disseminates information about users, websites, companies, and various organizations. This paper focuses on a quantitative method of exploring various OSINT tools that reveal public information about individuals. This information could further facilitate phishing attacks. Phishing attacks can be launched through email addresses, open ports, and unsecured web surfing. This study analyzes the information retrieved from OSINT tools, namely theHarvester and Maltego, that can be used to send phishing attacks to individuals.

Keywords: OSINT, phishing, spear phishing, email spoofing, theHarvester, Maltego

Procedia PDF Downloads 48
14424 Open Minds but Closed Access: Why Are There so Few Gold Open Access LIS Journals And Why Are so Many Librarians Unwilling to Unlock Their Scholarship?

Authors: Sarah Baker, Jayati Chaudhuri

Abstract:

Librarians have embraced the open access movement in all disciplines but their own. They are strong advocates on college campuses and curate institutional repositories, yet there are surprisingly few open access LIS journals. Presenters evaluated the open access availability of library and information science literature. After analyzing the top 100 library science journals (the top 50 journals from Scimago and JCR) and finding very few gold open access journals, they then investigated the availability of open access articles from the top 10 closed access journals. Presenters would like to generate a conversation on what type of proactive approach librarians can take to increase open access to literature within our discipline. Librarians like their colleagues in other disciplines are not motivated to submit their articles to their institutional repositories. Presenters have found a similar reluctance from their fellow colleagues regarding open access initiatives on campus. Presenters will describe Open Access Week activities as part of a campus-wide initiative and share some faculty comments, concerns, and misconceptions that came up as a part of this dialog. Presenters will discuss their personal experiences providing access to faculty publications through the California State University Los Angeles institutional repository.

Keywords: faculty scholarship, institutional repositories, library and information science journals, open access

Procedia PDF Downloads 305
14423 Arabic Light Stemmer for Better Search Accuracy

Authors: Sahar Khedr, Dina Sayed, Ayman Hanafy

Abstract:

Arabic is one of the most ancient and important languages in the world. It has more than 250 million native speakers, and more than twenty countries have Arabic as one of their official languages. In the past decade, we have witnessed a rapid evolution in smart devices, social networks, and the technology sector, which has led to a need for tools and libraries that properly handle the Arabic language in different domains. Stemming is one of the most crucial linguistic fundamentals. It is used in many applications, especially in the fields of information extraction and text mining. The motivation behind this work is to enhance the Arabic light stemmer to serve the data mining industry and to make it available to the open source community. The presented implementation enhances the Arabic light stemmer by extending an existing algorithm with a new set of rules and patterns, accompanied by an adjusted procedure. The study shows a significant enhancement in search accuracy, with an average improvement of 10% in comparison with previous works.
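
For readers unfamiliar with light stemming, the sketch below shows the general idea: strip a small set of common prefixes and suffixes without attempting full root extraction. The affix lists, the minimum stem length, and the one-prefix-one-suffix policy are illustrative assumptions; the paper's enhanced rules and patterns are not given in the abstract.

```python
# Illustrative affix lists only; these are not the paper's rule set.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي"]
MIN_STEM_LENGTH = 2  # assumed lower bound on what must remain after stripping


def light_stem(word):
    """Strip at most one listed prefix and one listed suffix from a word."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LENGTH:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            word = word[:-len(suffix)]
            break
    return word
```

Longer affixes are listed before shorter ones so that, for example, a conjunction plus definite article is removed as a unit rather than leaving a stray letter behind.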

Keywords: Arabic data mining, Arabic Information extraction, Arabic Light stemmer, Arabic stemmer

Procedia PDF Downloads 281
14422 Keypoints Extraction for Markerless Tracking in Augmented Reality Applications: A Case Study in Dar As-Saraya Museum

Authors: Jafar W. Al-Badarneh, Abdalkareem R. Al-Hawary, Abdulmalik M. Morghem, Mostafa Z. Ali, Rami S. Al-Gharaibeh

Abstract:

Archeological heritage is at the heart of each country’s national glory. Moreover, it could develop into a source of national income. Heritage management requires socially responsible marketing that achieves high visitor satisfaction while maintaining high site conservation. We have developed an Augmented Reality (AR) experience for heritage and cultural preservation at Dar As-Saraya museum in Jordan. Our application of this notion relies on a markerless tracking approach. This approach uses a keypoints extraction technique in which features of the environment are identified and defined in the system as keypoints. A set of these keypoints forms a tracker for an augmented object to be displayed and overlaid on a real scene at Dar As-Saraya museum. We tested and compared several techniques for markerless tracking and then applied the best technique to complete a mosaic artifact with AR content. The successful results from our application open the door for applications in open archeological sites, where markerless tracking is most needed.

Keywords: augmented reality, cultural heritage, keypoints extraction, virtual recreation

Procedia PDF Downloads 303
14421 Automatic Extraction of Water Bodies Using Whole-R Method

Authors: Nikhat Nawaz, S. Srinivasulu, P. Kesava Rao

Abstract:

Feature extraction plays an important role in many remote sensing applications. Automatic extraction of water bodies is of great significance in many remote sensing applications like change detection, image retrieval etc. This paper presents a procedure for automatic extraction of water information from remote sensing images. The algorithm uses the relative location of R-colour component of the chromaticity diagram. This method is then integrated with the effectiveness of the spatial scale transformation of whole method. The whole method is based on water index fitted from spectral library. Experimental results demonstrate the improved accuracy and effectiveness of the integrated method for automatic extraction of water bodies.
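
The R component of chromaticity mentioned above is the red share of a pixel's total intensity. The sketch below computes it and flags pixels with a low red share as water candidates. The threshold value and the low-red rule are assumptions for illustration only; the abstract does not state the decision rule or how it is combined with the water index.

```python
R_THRESHOLD = 0.25  # hypothetical cut-off, not taken from the paper


def r_chromaticity(red, green, blue):
    """Return the red share r = R / (R + G + B) of a pixel."""
    total = red + green + blue
    return red / total if total else 0.0


def water_mask(pixels, threshold=R_THRESHOLD):
    """Flag pixels whose red chromaticity falls below the threshold."""
    return [r_chromaticity(*pixel) < threshold for pixel in pixels]
```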

Keywords: feature extraction, remote sensing, image retrieval, chromaticity, water index, spectral library, integrated method

Procedia PDF Downloads 344
14420 Author Self-Archiving in Open Access Institutional Repositories for Awareness Creation in Universities

Authors: Kwame Kodua-Ntim

Abstract:

The study explored author self-archiving as a means of creating awareness of open-access institutional repositories in universities. The qualitative approach of the study was informed by the interpretive paradigm as well as the case research design. The target population for the study was all twelve (12) open-access institutional repository managers and administrators, purposively selected from the five (5) universities in Ghana. The universities were chosen since they were the only ones listed in the Directory of Open Access Repositories. Interviews were conducted using a semi-structured interview guide, and data were analyzed using thematic analysis. The study revealed that academics had some information about self-archiving in open-access institutional repositories and that university libraries with open-access institutional repositories were using DSpace software. Managers and administrators of open-access institutional repositories mediated the content uploaded and believed that author self-archiving could improve awareness of open-access institutional repositories. The study recommended that universities should fully implement the author self-archiving protocol and that academics should be trained to upload research works onto open-access institutional repositories. Furthermore, the university and university library should provide rigorous policies on, and incentives for, author self-archiving in the open-access institutional repositories.

Keywords: author, awareness, institutional repositories, open access, open archive, self-archiving

Procedia PDF Downloads 53
14419 Victims and Violators: Open Source Information, Admissibility Standards, and War Crimes Investigations in Iraq and Syria

Authors: Genevieve Zingg

Abstract:

Modern technology and social media platforms have fundamentally altered the nature of war crimes investigations by providing new forms of data, evidence, and documentation, and pose a unique opportunity to expand the efficacy of international law. However, much of the open source information available is deemed inadmissible in subsequent legal proceedings and fails to function as evidence largely due to issues of reliability and verifiability. Focusing on current judicial investigations related to ongoing conflicts in Syria and Iraq, this paper will examine key challenges and opportunities for the effective use of open source information in securing justice. This paper will consider strategies and approaches that can be used to ensure that information collected by affected populations meets basic admissibility standards. This paper argues that the critical failure to equip civilian populations in conflict zones with knowledge and information regarding established admissibility standards and guidelines both jeopardizes the potential of open source information and compromises the ability of victims to participate effectively in justice and accountability processes. The ultimate purpose of this paper is, therefore, to examine how to maximize the value of open source information based on the rules of evidence in international, regional, and national courts, and how to maximize the participation of affected populations in holding their abusers to account.

Keywords: human rights, international criminal law, international justice, international law, Iraq, open source information, social media, Syria, transitional justice, war crimes

Procedia PDF Downloads 318
14418 Biomedical Definition Extraction Using Machine Learning with Synonymous Feature

Authors: Jian Qu, Akira Shimazu

Abstract:

OOV (Out Of Vocabulary) terms are terms that cannot be found in many dictionaries. Although it is possible to translate such OOV terms, the translations do not provide any real information for a user. We present an OOV term definition extraction method that uses information available from the Internet. We use features such as the occurrence of synonyms and location distances. We apply a machine learning method to find the correct definitions for OOV terms. We tested our method on both biomedical-type and name-type OOV terms; our work outperforms existing work with an accuracy of 86.5%.

Keywords: information retrieval, definition retrieval, OOV (out of vocabulary), biomedical information retrieval

Procedia PDF Downloads 466
14417 Analytical Study of Cobalt(II) and Nickel(II) Extraction with Salicylidene O-, M-, and P-Toluidine in Chloroform

Authors: Sana Almi, Djamel Barkat

Abstract:

The solvent extraction of cobalt (II) and nickel (II) from aqueous sulfate solutions was investigated by slope analysis using salicylidene aniline and the three isomeric o-, m- and p-salicylidene toluidines diluted with chloroform at 25°C. By a statistical analysis of the extraction data, it was concluded that the extracted species are CoL2 with CoL2(HL) and NiL2 (HL denotes HSA, HSOT, HSMT, and HSPT). The extraction efficiency of Co(II) was higher than that of Ni(II). This tendency is confirmed by the numerical extraction constants for each metal cation. The best extraction followed the order HSMT > HSPT > HSOT > HSA for both Co2+ and Ni2+.

Keywords: solvent extraction, nickel(II), cobalt(II), salicylidene aniline, o-, m-, and p-salicylidene toluidine

Procedia PDF Downloads 455
14416 Determination of Safe Ore Extraction Methodology beneath Permanent Extraction in a Lead Zinc Mine with the Help of FLAC3D Numerical Model

Authors: Ayan Giri, Lukaranjan Phukan, Shantanu Karmakar

Abstract:

Structure and tectonics play a vital role in ore genesis and deposition. The existence of a swelling structure below the current level of a mine led to the discovery of ore below some permanent developments of the mine. The discovery and extraction of the ore body are critical to sustaining the business requirements of the mine. The challenge was to extract the ore without hampering the global stability of the mine. To do so, different mining options were considered and analysed by numerical modelling in FLAC3D software. The constitutive model used for this simulation is the improved unified constitutive model (IUCM), which can better and more accurately predict the stress-strain relationships in a continuum model. The IUCM employs the Hoek-Brown criterion to determine the instantaneous Mohr-Coulomb parameters, cohesion (c) and friction (ɸ), at each level of confining stress. The extra swelled part measures 50 m in north-south strike width and 50 m in east-west strike width. On the north side, a stope (P1) with a north-south width of 25 m has already been excavated. The options considered were: (a) open stoping of the southern part (P0) of 50 m to the full extent; (b) extraction of the southern 25 m, then filling of both primaries and extraction of the 25 m secondary (S0) in between; and (c) complete extraction of the southern part (P0), preceded by backfill, with the design of the secondary (S0) modified for the overall stability of the permanent excavation above the stoping.

Keywords: extraction, IUCM, FLAC 3D, stoping, tectonics

Procedia PDF Downloads 191
14415 Technology Planning with Internal and External Resource for Open Innovation

Authors: Jeonghwan Jeon

Abstract:

Technology planning that draws on both internal capacity and external resources is necessary for successful open innovation. Much research has been conducted on this issue. However, technology planning for open innovation at the national level has not been researched sufficiently. This study proposes an open roadmap for open innovation at the national level. The proposed open roadmap can systematically manage the inflow and outflow of open innovation. Six types of open roadmap are classified with respect to the innovation direction and characteristics. The proposed open roadmap is applied to the open innovation cases of the Roman period. The proposed open roadmap is expected to be a helpful tool for technology policy planning at the national level.

Keywords: technology planning, open innovation, internal resource, external resource, technology management

Procedia PDF Downloads 459
14414 Extraction of M. paradisiaca L. Inflorescences Using Compressed Propane

Authors: Michele C. Mesomo, Madeline de Souza Correa, Roberta L. Kruger, Luis R. S. Kanda, Marcos L. Corazza

Abstract:

Natural extracts of plants have been used for many years for different purposes, and recently they have been screened for their potential use as alternative remedies and food preservatives. Inflorescences of M. paradisiaca L., also known as the heart of the banana, have great economic interest due to its fruit. All parts of the banana are used for many different purposes, including use in folk medicine. The use of extraction via supercritical technology has grown in recent years, though it is still necessary to obtain experimental information for the construction of industrial plants. This work reports the extraction of Musa paradisiaca L. using compressed propane as solvent. The effects of the supercritical extraction conditions, pressure and temperature, on the yield were evaluated. The raw material, banana inflorescences, was dried at 313.15 K and milled. The particle size distribution used for packing the extraction cell was 12 mesh (23.5%), 16 mesh (23.5%), 32 mesh (34.5%), and 48 mesh (18.5%). The extractions were performed in a laboratory-scale unit at pressures of 3.0 MPa, 6.5 MPa and 10.0 MPa and at temperatures of 308.15 K, 323.15 K and 338.15 K. The operating conditions tested achieved a maximum yield of 2.94 wt% for the CO2 extraction at 10.0 MPa and 338.15 K, the higher pressure and temperature. The lower yield, 2.29 wt%, was obtained at the lower pressure and higher temperature. Temperature had a significant and positive effect on the extraction yield with supercritical CO2, while pressure had no effect on the yield. The overall extraction curves showed the typical behavior obtained for the supercritical extraction procedure and reached a constant extraction rate after about 80 to 100 min. The largest amount of extract was obtained at the beginning of the process, within 10 to 60 min.

Keywords: banana, natural products, supercritical extraction, temperature

Procedia PDF Downloads 577
14413 Information Extraction Based on Search Engine Results

Authors: Mohammed R. Elkobaisi, Abdelsalam Maatuk

Abstract:

Search engines are large-scale tools for retrieving information from the Web and are currently freely available to all. This paper explains how to convert the raw result counts returned by search engines into useful information. This represents a new method of data gathering compared with traditional methods. Submitting queries for many keywords one at a time takes considerable time and effort; hence we developed a user interface program that searches automatically, taking multiple keywords at once and collecting the wanted data without manual intervention. The collected raw data are processed using mathematical and statistical theories to eliminate unwanted data and convert the remainder into usable data.

Keywords: search engines, information extraction, agent system

Procedia PDF Downloads 399
14412 Extraction of Essential Oil From Orange Peels

Authors: Aayush Bhisikar, Neha Rajas, Aditya Bhingare, Samarth Bhandare, Amruta Amrurkar

Abstract:

Orange peels are currently thrown away as garbage in India after the edible components of the fruit are consumed. However, the nation depends on important essential oils for use in companies that produce goods including food, beverages, cosmetics, and medicines. This study was conducted to show how to use the peels effectively. By using various extraction techniques, orange peel is used in the creation of essential oils. Steam distillation, water distillation, and solvent extraction were the techniques taken into consideration in this paper. Due to its relative prevalence among the extraction techniques, Design Expert 7.0 was used to plan an experimental run for solvent extraction. The oil was examined after extraction to ascertain its physical and chemical characteristics. It was determined from the outcomes that the orange peels.

Keywords: orange peels, extraction, essential oil, distillation

Procedia PDF Downloads 46
14411 Extraction of Essential Oil from Orange Peels

Authors: Neha Rajas, Aayush Bhisikar, Samarth Bhandare, Aditya Bhingare, Amruta Amrutkar

Abstract:

Orange peels are currently thrown away as garbage in India after the edible components of the fruit are consumed. However, the nation depends on important essential oils for use in companies that produce goods including food, beverages, cosmetics, and medicines. This study was conducted to show how to use the peels effectively. By using various extraction techniques, orange peel is used in the creation of essential oils. Steam distillation, water distillation, and solvent extraction were the techniques taken into consideration in this paper. Due to its relative prevalence among the extraction techniques, Design Expert 7.0 was used to plan an experimental run for solvent extraction. The oil was examined after extraction to ascertain its physical and chemical characteristics. It was determined from the outcomes that the orange peels.

Keywords: orange peels, extraction, distillation, essential oil

Procedia PDF Downloads 43
14410 On Exploring Search Heuristics for improving the efficiency in Web Information Extraction

Authors: Patricia Jiménez, Rafael Corchuelo

Abstract:

Nowadays the World Wide Web is the most popular source of information, comprising billions of on-line documents. Web mining is used to crawl through these documents, collect the information of interest, and process it by applying data mining tools so that the gathered information can be used in the best interest of a business, which enables companies to promote theirs. Unfortunately, it is not easy to automatically extract the information a web site provides when it lacks an API that transforms the user-friendly data in web documents into a structured, machine-readable format. Rule-based information extractors are tools intended to extract the information of interest automatically and offer it in a structured format that allows mining tools to process it. However, the performance of an information extractor strongly depends on the search heuristic employed, since bad choices regarding how to learn a rule may easily result in a loss of effectiveness and/or efficiency. Improving the efficiency of search heuristics is of utmost importance in the field of Web Information Extraction since typical datasets are very large. In this paper, we employ an information extractor based on a classical top-down algorithm that uses the so-called Information Gain heuristic introduced by Quinlan and Cameron-Jones. Unfortunately, Information Gain suffers from some well-known problems, so we analyse an intuitive alternative, Termini, that is clearly more efficient; we also analyse other proposals in the literature and conclude that none of them outperforms the previous alternative.
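
For context, the Information Gain heuristic used in FOIL-style top-down rule learning is commonly written as t × (log2(p1 / (p1 + n1)) − log2(p0 / (p0 + n0))), where p and n count the positive and negative examples covered before (0) and after (1) adding a condition to a rule. The sketch below implements that commonly cited form and approximates t, the positives still covered, by p1. It is an assumption that the paper uses exactly this formulation, and the Termini alternative is not reproduced here because the abstract does not define it.

```python
import math


def foil_gain(p0, n0, p1, n1):
    """FOIL-style information gain of refining a rule with one extra condition.

    p0, n0: positives and negatives covered before the refinement (p0 > 0).
    p1, n1: positives and negatives covered after the refinement.
    """
    if p1 == 0:
        return 0.0  # a refinement that covers no positives is worthless
    info_before = -math.log2(p0 / (p0 + n0))
    info_after = -math.log2(p1 / (p1 + n1))
    # Simplification: every positive covered afterwards was covered before.
    return p1 * (info_before - info_after)
```

A top-down learner would evaluate this score for every candidate condition and keep the best one, which is why the cost of the heuristic matters on large datasets.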

Keywords: information extraction, search heuristics, semi-structured documents, web mining

Procedia PDF Downloads 308
14409 Machine Learning Approach for Yield Prediction in Semiconductor Production

Authors: Heramb Somthankar, Anujoy Chakraborty

Abstract:

This paper presents a classification study on yield prediction in semiconductor production using machine learning approaches. A complicated semiconductor production process is generally monitored continuously by signals acquired from sensors and measurement sites. A monitoring system contains a variety of signals, which together carry useful information, irrelevant information, and noise. When each signal is treated as a feature, feature selection is used to find the most relevant signals. The open-source UCI SECOM Dataset provides 1567 such samples, out of which 104 fail in quality assurance. Feature extraction and selection were performed on the dataset, and the useful signals were retained for further study. Afterward, common machine learning algorithms were employed to predict whether the yield is a pass or a fail. The most relevant algorithm is selected for prediction based on the accuracy and loss of the ML model.
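
Two points in this abstract lend themselves to a small worked example. With 104 failures in 1567 samples, a model that always predicts "pass" already reaches about 93.4% accuracy, so accuracy alone says little about how well failures are detected. Separately, the simplest form of feature selection is to drop signals that never vary. Both helpers below are hypothetical sketches; the abstract does not say which selection method the authors used.

```python
from statistics import pvariance


def majority_baseline(total, failures):
    """Accuracy of a classifier that always predicts the more frequent class."""
    passes = total - failures
    return max(passes, failures) / total


def informative_columns(rows, threshold=0.0):
    """Return indices of signals whose variance across samples exceeds threshold."""
    columns = list(zip(*rows))  # transpose: one tuple of readings per signal
    return [i for i, column in enumerate(columns) if pvariance(column) > threshold]
```

For the SECOM figures quoted above, `majority_baseline(1567, 104)` evaluates to 1463 / 1567, roughly 0.934.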

Keywords: deep learning, feature extraction, feature selection, machine learning classification algorithms, semiconductor production monitoring, signal processing, time-series analysis

Procedia PDF Downloads 80
14408 Microwave-Assisted Extraction of Lycopene from Gac Arils (Momordica cochinchinensis (Lour.) Spreng)

Authors: Yardfon Tanongkankit, Kanjana Narkprasom, Nukrob Narkprasom, Khwanruthai Saiupparat, Phatthareeya Siriwat

Abstract:

Gac fruit (Momordica cochinchinensis (Lour.) Spreng) has high potential as a health food because of its high lycopene content. The objective of this study was to optimize the extraction of lycopene from gac arils using the microwave extraction method. The response surface method was used to find the conditions that optimize the extraction of lycopene from gac arils. The extraction parameters used in this study were extraction time (120-600 seconds), solvent-to-sample ratio (10:1, 20:1, 30:1, 40:1 and 50:1 mL/g) and microwave power setting (100-800 watts). The results suggest an extraction time of 360 seconds, a solvent-to-sample ratio of 30:1 mL/g and a microwave power of 450 watts, since this condition gave the highest lycopene content, 9.86 mg/gDW. It was also observed that the lycopene contents extracted from gac arils by the microwave method were higher than those obtained by the conventional method.

Keywords: conventional extraction, Gac arils, microwave-assisted extraction, Lycopene

Procedia PDF Downloads 357
14407 Critical Review of Web Content Mining Extraction Mechanisms

Authors: Rabia Bashir, Sajjad Akbar

Abstract:

Demand for web mining is inevitable given the rapid growth of information on the Internet, but the striking variety of web structures has made retrieving the required content a difficult task. To counter this issue, Web Content Mining (WCM) has emerged as a potential candidate that extracts and integrates suitable data resources for users. In the past few years, research has been done on several extraction techniques for WCM, i.e., agent-based, template-based, assumption-based, statistic-based, wrapper-based, and machine learning. However, it is still unclear whether these approaches efficiently tackle the significant challenges of WCM. To answer this question, this paper identifies these challenges, such as language independence, structure flexibility, performance, automation, dynamicity, redundancy handling, intelligence, relevant content retrieval, and privacy. Further, these challenges are mapped to existing extraction mechanisms, which helps in adopting the most suitable WCM approach given the conditions and characteristics at hand.

Keywords: content mining challenges, web content mining, web content extraction approaches, web information retrieval

Procedia PDF Downloads 509
14406 Evolving Knowledge Extraction from Online Resources

Authors: Zhibo Xiao, Tharini Nayanika de Silva, Kezhi Mao

Abstract:

In this paper, we present an evolving knowledge extraction system named AKEOS (Automatic Knowledge Extraction from Online Sources). AKEOS consists of two modules: a one-time learning module and an evolving learning module. The one-time learning module takes a user query as input and automatically harvests knowledge from online unstructured resources in an unsupervised way. The output of the one-time learning is a structured vector representing the harvested knowledge. The evolving learning module automatically schedules and performs repeated one-time learning to extract the newest information and track the development of an event. In addition, the evolving learning module summarizes the knowledge learned at different time points to produce a final knowledge vector about the event. With the evolving learning, we are able to visualize the key information of the event, discover the trends, and track the development of an event.
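The two-module structure described above can be sketched as follows. This is a toy illustration under stated assumptions, not the AKEOS implementation: the abstract does not specify how the knowledge vector is built or summarized, so `one_time_learning` here is a hypothetical stand-in that counts terms co-occurring with the query, and `evolving_learning` simply sums the per-time-point vectors. The example documents are made up.

```python
from collections import Counter

def one_time_learning(query, documents):
    """Hypothetical stand-in: term counts from documents that mention the query."""
    q = query.lower()
    vector = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        if q in tokens:
            # Count every token except the query term itself.
            vector.update(t for t in tokens if t != q)
    return vector

def evolving_learning(query, snapshots):
    """Repeat one-time learning per time point, then sum into a final vector."""
    history = [one_time_learning(query, docs) for docs in snapshots]
    final = Counter()
    for vector in history:
        final.update(vector)
    return history, final

# Made-up document snapshots at two time points.
snapshots = [
    ["storm approaches coast", "storm warning issued"],
    ["storm damage reported", "cleanup begins"],
]
history, final = evolving_learning("storm", snapshots)
print(history)
print(final)
```

Keeping the per-time-point vectors in `history` alongside the summed `final` vector mirrors the abstract's distinction between tracking an event's development and producing one final summary.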

Keywords: evolving learning, knowledge extraction, knowledge graph, text mining

Procedia PDF Downloads 435
14405 Solvent extraction of molybdenum (VI) with two organophosphorus reagents TBP and D2EHPA under microwave irradiations

Authors: Ahmed Boucherit, Hussein Khalaf, Eduardo Paredes, José Luis Todolí

Abstract:

Solvent extraction studies of molybdenum (VI) with two organophosphorus reagents, namely TBP and D2EHPA, have been carried out from aqueous acidic solutions of HCl, H2SO4 and H3PO4 under microwave irradiation. The extraction efficiencies of the investigated extractants in the extraction of molybdenum (VI) were compared. Extraction yield was unchanged when microwave power varied in the range 20-100 watts for extraction from H2SO4 or H3PO4, but it decreased in the range 20-60 watts and increased in the range 60-100 watts when TBP was used to extract molybdenum (VI) from 1 M HCl solutions. Extraction yield of molybdenum (VI) was higher with TBP for HCl molarities greater than 1 M than with D2EHPA for H3PO4 molarities lower than 1 M. Extraction yield increased with HCl molarity in the range 0.50 - 1.80 M, but it decreased with increasing H2SO4 and H3PO4 molarities in the ranges 0.05 - 1 M and 0.50 - 1 M, respectively.

Keywords: extraction, molybdenum, microwave, solvent

Procedia PDF Downloads 612
14404 Optimization of Extraction Conditions for Phenolic Compounds from Deverra Scoparia Coss and Dur

Authors: Roukia Hammoudi, Chabrouk Farid, Dehak Karima, Mahfoud Hadj Mahammed, Mohamed Didi Ouldelhadj

Abstract:

The objective of this study was to optimise the extraction conditions for phenolic compounds from Deverra scoparia Coss. and Dur. (Apiaceae) by ultrasound-assisted extraction (UAE). The effects of solvent type (acetone, ethanol and methanol), solvent concentration (%), extraction time (min) and extraction temperature (°C) on total phenolic content (TPC) were determined. The optimum extraction conditions were found to be an acetone concentration of 80%, an extraction time of 25 min and an extraction temperature of 25°C. Under the optimized conditions, the value for TPC was 9.68 ± 1.05 mg GAE/g of extract. The antioxidant power of these oils was assessed by the DPPH method. The results showed that the antioxidant activity of the Deverra scoparia essential oil was higher than that of ascorbic acid and trolox.

Keywords: Deverra scoparia, phenolic compounds, ultrasound assisted extraction, total phenolic content, antioxidant activity

Procedia PDF Downloads 571
14403 Optimization of Extraction Conditions for Phenolic Compounds from Deverra scoparia Coss. and Dur

Authors: Roukia Hammoudi, Dehak Karima, Chabrouk Farid, Mahfoud Hadj Mahammed, Mohamed Didi Ouldelhadj

Abstract:

The objective of this study was to optimise the extraction conditions for phenolic compounds from Deverra scoparia Coss. and Dur. (Apiaceae) by ultrasound-assisted extraction (UAE). The effects of solvent type (acetone, ethanol and methanol), solvent concentration (%), extraction time (min) and extraction temperature (°C) on total phenolic content (TPC) were determined. The optimum extraction conditions were found to be an acetone concentration of 80%, an extraction time of 25 min and an extraction temperature of 25°C. Under the optimized conditions, the value for TPC was 9.68 ± 1.05 mg GAE/g of extract. The antioxidant power of these oils was assessed by the DPPH method. The results showed that the antioxidant activity of the Deverra scoparia essential oil was higher than that of ascorbic acid and trolox.

Keywords: Deverra scoparia, phenolic compounds, ultrasound assisted extraction, total phenolic content, antioxidant activity

Procedia PDF Downloads 567
14402 Change of Flavor Characteristics of Flavor Oil Made Using Sarcodon aspratus (Sarcodon aspratus Berk. S. Ito) According to Extraction Temperature and Extraction Time

Authors: Gyeong-Suk Jo, Soo-Hyun Ji, You-Seok Lee, Jeong-Hwa Kang

Abstract:

To develop a flavor oil using Sarcodon aspratus (Sarcodon aspratus Berk. S. Ito), an infiltration extraction method was used to add the dried mushroom flavor of Sarcodon aspratus to base olive oil. The edible base oil used during infiltration extraction was pressed olive oil, and infiltration extraction was done while varying the extraction temperature (20, 30, 40 and 50 ℃) and the extraction time (24, 48 and 72 hours). The amount of Sarcodon aspratus added was 20% relative to the base oil (taken as 100%). Production yield of Sarcodon aspratus flavor oil decreased with increasing extraction frequency. Aroma intensity was 2195~2447 (A.U./1㎖), and it increased with increasing extraction temperature and extraction time. The Sarcodon aspratus flavor oil was bright pale yellow in color, with a pH of 4.5, a sugar content of 71~72 (°Brix), and the highest average turbidity, 16.74 (Haze %), in the 40℃ group. In the aromatic evaluation, increasing extraction temperature and extraction time increased the cheese aroma, savory sweet aroma and beef jerky aroma, as well as a spicy taste comprising slight bitter, savory and slight acrid tastes, yielding an aromatic oil with a unique flavor.

Keywords: Flavor Characteristics, Flavor Oil, Infiltration extraction method, mushroom, Sarcodon aspratus (Sarcodon aspratus Berk. S. Ito)

Procedia PDF Downloads 345
14401 An Automatic Feature Extraction Technique for 2D Punch Shapes

Authors: Awais Ahmad Khan, Emad Abouel Nasr, H. M. A. Hussein, Abdulrahman Al-Ahmari

Abstract:

Sheet-metal parts have been widely applied in the electronics, communication and mechanical industries in recent decades; however, advances in sheet-metal part design and manufacturing still lag behind the increasing importance of sheet-metal parts in modern industry. This paper presents a methodology for automatic extraction of some common 2D internal sheet-metal features. The features used in this study are taken from the Unipunch™ catalogue. The extraction process starts with data extraction from a STEP file using an object-oriented approach. Suitable algorithms and rules are then applied so that all features contained in the catalogue are extracted automatically. Since the extracted features include geometry and engineering information, they will be effective for downstream applications such as feature rebuilding and process planning.

Keywords: feature extraction, internal features, punch shapes, sheet metal

Procedia PDF Downloads 587