Search results for: key information documents
11402 Semantic Indexing Improvement for Textual Documents: Contribution of Classification by Fuzzy Association Rules
Authors: Mohsen Maraoui
Abstract:
In the aim of natural language processing applications improvement, such as information retrieval, machine translation, lexical disambiguation, we focus on statistical approach to semantic indexing for multilingual text documents based on conceptual network formalism. We propose to use this formalism as an indexing language to represent the descriptive concepts and their weighting. These concepts represent the content of the document. Our contribution is based on two steps. In the first step, we propose the extraction of index terms using the multilingual lexical resource Euro WordNet (EWN). In the second step, we pass from the representation of index terms to the representation of index concepts through conceptual network formalism. This network is generated using the EWN resource and pass by a classification step based on association rules model (in attempt to discover the non-taxonomic relations or contextual relations between the concepts of a document). These relations are latent relations buried in the text and carried by the semantic context of the co-occurrence of concepts in the document. Our proposed indexing approach can be applied to text documents in various languages because it is based on a linguistic method adapted to the language through a multilingual thesaurus. Next, we apply the same statistical process regardless of the language in order to extract the significant concepts and their associated weights. We prove that the proposed indexing approach provides encouraging results.Keywords: concept extraction, conceptual network formalism, fuzzy association rules, multilingual thesaurus, semantic indexing
Procedia PDF Downloads 14111401 Digital Preservation: Requirement of 21st Century
Authors: Gaurav Kumar, Shilpa
Abstract:
Digital libraries have been established all over the world to create, maintain and to preserve the digital materials. This paper focuses on operational digital preservation systems specifically in educational organizations in India. It considers the broad range of digital objects including e-journals, technical reports, e-records, project documents, scientific data, etc. This paper describes the main objectives, process and technological issues involved in preservation of digital materials. Digital preservation refers to the various methods of keeping digital materials alive for the future. It includes everything from electronic publications on CD-ROM to Online database and collections of experimental data in digital format maintains the ability to display, retrieve and use digital collections in the face of rapidly changing technological and organizational infrastructures elements. This paper exhibits the importance and objectives of digital preservation. The necessities of preservation are hardware and software technology to interpret the digital documents and discuss various aspects of digital preservation.Keywords: preservation, digital preservation, digital dark age, conservation, archive, repository, document, information technology, hardware, software, organization, machine readable format
Procedia PDF Downloads 45811400 Management Software for the Elaboration of an Electronic File in the Pharmaceutical Industry Following Mexican Regulations
Authors: M. Peña Aguilar Juan, Ríos Hernández Ezequiel, R. Valencia Luis
Abstract:
For certification, certain goods of public interest, such as medicines and food, it is required the preparation and delivery of a dossier. For its elaboration, legal and administrative knowledge must be taken, as well as organization of the documents of the process, and an order that allows the file verification. Therefore, a virtual platform was developed to support the process of management and elaboration of the dossier, providing accessibility to the information and interfaces that allow the user to know the status of projects. The development of dossier system on the cloud allows the inclusion of the technical requirements for the software management, including the validation and the manufacturing in the field industry. The platform guides and facilitates the dossier elaboration (report, file or history), considering Mexican legislation and regulations, it also has auxiliary tools for its management. This technological alternative provides organization support for documents and accessibility to the information required to specify the successful development of a dossier. The platform divides into the following modules: System control, catalog, dossier and enterprise management. The modules are designed per the structure required in a dossier in those areas. However, the structure allows for flexibility, as its goal is to become a tool that facilitates and does not obstruct processes. The architecture and development of the software allows flexibility for future work expansion to other fields, this would imply feeding the system with new regulations.Keywords: electronic dossier, cloud management software, pharmaceutical industry, sanitary registration
Procedia PDF Downloads 29811399 Mapping of Adrenal Gland Diseases Research in Middle East Countries: A Scientometric Analysis, 2007-2013
Authors: Zahra Emami, Mohammad Ebrahim Khamseh, Nahid Hashemi Madani, Iman Kermani
Abstract:
The aim of the study was to map scientific research on adrenal gland diseases in the Middle East countries through the Web of Science database using scientometric analysis. Data were analyzed with Excel software; and HistCite was used for mapping of the scientific texts. In this study, from a total of 268 retrieved records, 1125 authors from 328 institutions published their texts in 138 journals. Among 17 Middle East countries, Turkey ranked first with 164 documents (61.19%), Israel ranked second with 47 documents (15.53%) and Iran came in the third place with 26 documents. Most of the publications (185 documents, 69.2%) were articles. Among the universities of the Middle East, Istanbul University had the highest science production rate (9.7%). The Journal of Clinical Endocrinology & Metabolism had the highest TGCS (243 citations). In the scientific mapping, 7 clusters were formed based on TLCS (Total Local Citation Score) & TGCS (Total Global Citation Score). considering the study results, establishment of scientific connections and collaboration with other countries and use of publications on adrenal gland diseases from high ranking universities can help in the development of this field and promote the medical practice in this regard. Moreover, investigation of the formed clusters in relation to Congenital Hyperplasia and puberty related disorders can be research priorities for investigators.Keywords: mapping, scientific research, adrenal gland diseases, scientometric
Procedia PDF Downloads 27411398 The Role of Named Entity Recognition for Information Extraction
Authors: Girma Yohannis Bade, Olga Kolesnikova, Grigori Sidorov
Abstract:
Named entity recognition (NER) is a building block for information extraction. Though the information extraction process has been automated using a variety of techniques to find and extract a piece of relevant information from unstructured documents, the discovery of targeted knowledge still poses a number of research difficulties because of the variability and lack of structure in Web data. NER, a subtask of information extraction (IE), came to exist to smooth such difficulty. It deals with finding the proper names (named entities), such as the name of the person, country, location, organization, dates, and event in a document, and categorizing them as predetermined labels, which is an initial step in IE tasks. This survey paper presents the roles and importance of NER to IE from the perspective of different algorithms and application area domains. Thus, this paper well summarizes how researchers implemented NER in particular application areas like finance, medicine, defense, business, food science, archeology, and so on. It also outlines the three types of sequence labeling algorithms for NER such as feature-based, neural network-based, and rule-based. Finally, the state-of-the-art and evaluation metrics of NER were presented.Keywords: the role of NER, named entity recognition, information extraction, sequence labeling algorithms, named entity application area
Procedia PDF Downloads 8111397 Lexical Based Method for Opinion Detection on Tripadvisor Collection
Authors: Faiza Belbachir, Thibault Schienhinski
Abstract:
The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinion, it is interesting to extract and interpret these information for different domains, e.g., product and service benchmarking, politic, system of recommendation. This is why opinion detection is one of the most important research tasks. It consists on differentiating between opinion data and factual data. The difficulty of this task is to determine an approach which returns opinionated document. Generally, there are two approaches used for opinion detection i.e. Lexical based approaches and Machine Learning based approaches. In Lexical based approaches, a dictionary of sentimental words is used, words are associated with weights. The opinion score of document is derived by the occurrence of words from this dictionary. In Machine learning approaches, usually a classifier is trained using a set of annotated document containing sentiment, and features such as n-grams of words, part-of-speech tags, and logical forms. Majority of these works are based on documents text to determine opinion score but dont take into account if these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we will develop a new way to consider the opinion score. We introduce the notion of trust score. We determine opinionated documents but also if these opinions are really trustable information in relation with topics. For that we use lexical SentiWordNet to calculate opinion and trust scores, we compute different features about users like (numbers of their comments, numbers of their useful comments, Average useful review). After that, we combine opinion score and trust score to obtain a final score. We applied our method to detect trust opinions in TRIPADVISOR collection. Our experimental results report that the combination between opinion score and trust score improves opinion detection.Keywords: Tripadvisor, opinion detection, SentiWordNet, trust score
Procedia PDF Downloads 20011396 A Methodology for Automatic Diversification of Document Categories
Authors: Dasom Kim, Chen Liu, Myungsu Lim, Su-Hyeon Jeon, ByeoungKug Jeon, Kee-Young Kwahk, Namgyu Kim
Abstract:
Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, in the case of manual categorization, not only can the accuracy of the categorization be not guaranteed but the categorization also requires a large amount of time and huge costs. Many studies have been conducted towards the automatic creation of categories to solve the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorizing complex documents with multiple topics because the methods work by assuming that one document can be categorized into one category only. In order to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process involves training using a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To overcome the limitation of the requirement of a multi-categorized training set by traditional multi-categorization algorithms, we previously proposed a new methodology that can extend a category of a single-categorized document to multiple categorizes by analyzing relationships among categories, topics, and documents. In this paper, we design a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.Keywords: big data analysis, document classification, multi-category, text mining, topic analysis
Procedia PDF Downloads 27311395 Deployment of Matrix Transpose in Digital Image Encryption
Authors: Okike Benjamin, Garba E J. D.
Abstract:
Encryption is used to conceal information from prying eyes. Presently, information and data encryption are common due to the volume of data and information in transit across the globe on daily basis. Image encryption is yet to receive the attention of the researchers as deserved. In other words, video and multimedia documents are exposed to unauthorized accessors. The authors propose image encryption using matrix transpose. An algorithm that would allow image encryption is developed. In this proposed image encryption technique, the image to be encrypted is split into parts based on the image size. Each part is encrypted separately using matrix transpose. The actual encryption is on the picture elements (pixel) that make up the image. After encrypting each part of the image, the positions of the encrypted images are swapped before transmission of the image can take place. Swapping the positions of the images is carried out to make the encrypted image more robust for any cryptanalyst to decrypt.Keywords: image encryption, matrices, pixel, matrix transpose
Procedia PDF Downloads 42111394 The KAPSARC Energy Policy Database: Introducing a Quantified Library of China's Energy Policies
Authors: Philipp Galkin
Abstract:
Government policy is a critical factor in the understanding of energy markets. Regardless, it is rarely approached systematically from a research perspective. Gaining a precise understanding of what policies exist, their intended outcomes, geographical extent, duration, evolution, etc. would enable the research community to answer a variety of questions that, for now, are either oversimplified or ignored. Policy, on its surface, also seems a rather unstructured and qualitative undertaking. There may be quantitative components, but incorporating the concept of policy analysis into quantitative analysis remains a challenge. The KAPSARC Energy Policy Database (KEPD) is intended to address these two energy policy research limitations. Our approach is to represent policies within a quantitative library of the specific policy measures contained within a set of legal documents. Each of these measures is recorded into the database as a single entry characterized by a set of qualitative and quantitative attributes. Initially, we have focused on the major laws at the national level that regulate coal in China. However, KAPSARC is engaged in various efforts to apply this methodology to other energy policy domains. To ensure scalability and sustainability of our project, we are exploring semantic processing using automated computer algorithms. Automated coding can provide a more convenient input data for human coders and serve as a quality control option. Our initial findings suggest that the methodology utilized in KEPD could be applied to any set of energy policies. It also provides a convenient tool to facilitate understanding in the energy policy realm enabling the researcher to quickly identify, summarize, and digest policy documents and specific policy measures. The KEPD captures a wide range of information about each individual policy contained within a single policy document. This enables a variety of analyses, such as structural comparison of policy documents, tracing policy evolution, stakeholder analysis, and exploring interdependencies of policies and their attributes with exogenous datasets using statistical tools. The usability and broad range of research implications suggest a need for the continued expansion of the KEPD to encompass a larger scope of policy documents across geographies and energy sectors.Keywords: China, energy policy, policy analysis, policy database
Procedia PDF Downloads 32311393 Design Criteria for an Internal Information Technology Cost Allocation to Support Business Information Technology Alignment
Authors: Andrea Schnabl, Mario Bernhart
Abstract:
The controlling instrument of an internal cost allocation (IT chargeback) is commonly used to make IT costs transparent and controllable. Information Technology (IT) became, especially for information industries, a central competitive factor. Consequently, the focus is not on minimizing IT costs but on the strategic aligned application of IT. Hence, an internal IT cost allocation should be designed to enhance the business-IT alignment (strategic alignment of IT) in order to support the effective application of IT from a company’s point of view. To identify design criteria for an internal cost allocation to support business alignment a case study analysis at a typical medium-sized firm in information industry is performed. Documents, Key Performance Indicators, and cost accounting data over a period of 10 years are analyzed and interviews are performed. The derived design criteria are evaluated by 6 heads of IT departments from 6 different companies, which have an internal IT cost allocation at use. By applying these design criteria an internal cost allocation serves not only for cost controlling but also as an instrument in strategic IT management.Keywords: accounting for IT services, Business IT Alignment, internal cost allocation, IT controlling, IT governance, strategic IT management
Procedia PDF Downloads 15511392 Project Design Deliverables Sequence (PDD)
Authors: Nahed Al-Hajeri
Abstract:
There are several reasons which lead to a delay in project completion, out of all, one main reason is the delay in deliverable processing, i.e. submission and review of documents. Most of the project cycles start with a list of deliverables but without a sequence of submission of the same, means without a direction to move, leading to overlapping of activities and more interdependencies. Hence Project Design Deliverables (PDD) is developed as a solution to Organize Transmittals (Documents/Drawings) received from contractors/consultants during different phases of an EPC (Engineering, Procurement, and Construction) projects, which gives proper direction to the stakeholders from the beginning, to reduce inter-discipline dependency, avoid overlapping of activities, provide a list of deliverables, sequence of activities, etc. PDD attempts to provide a list and sequencing of the engineering documents/drawings required during different phases of a Project which will benefit both client and Contractor in performing planned activities through timely submission and review of deliverables. This helps in ensuring improved quality and completion of Project in time. The successful implementation begins with a detailed understanding the specific challenges and requirements of the project. PDD will help to learn about vendor document submissions including general workflow, sequence and monitor the submission and review of the deliverables from the early stages of Project. This will provide an overview for the Submission of deliverables by the concerned during the projects in proper sequence. The goal of PDD is also to hold responsible and accountability of all stakeholders during complete project cycle. We believe that successful implementation of PDD with a detailed list of documents and their sequence will help organizations to achieve the project target.Keywords: EPC (Engineering, Procurement, and Construction), project design deliverables (PDD), econometrics sciences, management sciences
Procedia PDF Downloads 40011391 Digital Preservation: A Need of Tomorrow
Authors: Gaurav Kumar
Abstract:
Digital libraries have been established all over the world to create, maintain and to preserve the digital materials. This paper exhibits the importance and objectives of digital preservation. The necessities of preservation are hardware and software technology to interpret the digital documents and discuss various aspects of digital preservation.Keywords: preservation, digital preservation, conservation, archive, repository, document, information technology, hardware, software, organization, machine readable format
Procedia PDF Downloads 59011390 Logistics Information Systems in the Distribution of Flour in Nigeria
Authors: Cornelius Femi Popoola
Abstract:
This study investigated logistics information systems in the distribution of flour in Nigeria. A case study design was used and 50 staff of Honeywell Flour Mill was sampled for the study. Data generated through a questionnaire were analysed using correlation and regression analysis. The findings of the study revealed that logistic information systems such as e-commerce, interactive telephone systems and electronic data interchange positively correlated with the distribution of flour in Honeywell Flour Mill. Finding also deduced that e-commerce, interactive telephone systems and electronic data interchange jointly and positively contribute to the distribution of flour in Honeywell Flour Mill in Nigeria (R = .935; Adj. R2 = .642; F (3,47) = 14.739; p < .05). The study therefore recommended that Honeywell Flour Mill should upgrade their logistic information systems to computer-to-computer communication of business transactions and documents, as well adopt new technology such as, tracking-and-tracing systems (barcode scanning for packages and palettes), tracking vehicles with Global Positioning System (GPS), measuring vehicle performance with ‘black boxes’ (containing logistic data), and Automatic Equipment Identification (AEI) into their systems.Keywords: e-commerce, electronic data interchange, flour distribution, information system, interactive telephone systems
Procedia PDF Downloads 55611389 Experimental Analysis of Tools Used for Doxing and Proposed New Transforms to Help Organizations Protect against Doxing Attacks
Authors: Parul Khanna, Pavol Zavarsky, Dale Lindskog
Abstract:
Doxing is a term derived from documents, and hence consists of collecting information on an organization or individual through social media websites, search engines, password cracking methods, social engineering tools and other sources of publicly displayed information. The main purpose of doxing attacks is to threaten, embarrass, harass and humiliate the organization or individual. Various tools are used to perform doxing. Tools such as Maltego visualize organization’s architecture which helps in determining weak links within the organization. This paper discusses limitations of Maltego Chlorine CE 3.6.0 and suggests measures as to how organizations can use these tools to protect themselves from doxing attacks.Keywords: advanced persistent threat, FOCA, OSINT, PII
Procedia PDF Downloads 25711388 A Tool to Provide Advanced Secure Exchange of Electronic Documents through Europe
Authors: Jesus Carretero, Mario Vasile, Javier Garcia-Blas, Felix Garcia-Carballeira
Abstract:
Supporting cross-border secure and reliable exchange of data and documents and to promote data interoperability is critical for Europe to enhance sector (like eFinance, eJustice and eHealth). This work presents the status and results of the European Project MADE, a Research Project funded by Connecting Europe facility Programme, to provide secure e-invoicing and e-document exchange systems among Europe countries in compliance with the eIDAS Regulation (Regulation EU 910/2014 on electronic identification and trust services). The main goal of MADE is to develop six new AS4 Access Points and SMP in Europe to provide secure document exchanges using the eDelivery DSI (Digital Service Infrastructure) amongst both private and public entities. Moreover, the project demonstrates the feasibility and interest of the solution provided by providing several months of interoperability among the providers of the six partners in different EU countries. To achieve those goals, we have followed a methodology setting first a common background for requirements in the partner countries and the European regulations. Then, the partners have implemented access points in each country, including their service metadata publisher (SMP), to allow the access to their clients to the pan-European network. Finally, we have setup interoperability tests with the other access points of the consortium. The tests will include the use of each entity production-ready Information Systems that process the data to confirm all steps of the data exchange. For the access points, we have chosen AS4 instead of other existing alternatives because it supports multiple payloads, native web services, pulling facilities, lightweight client implementations, modern crypto algorithms, and more authentication types, like username-password and X.509 authentication and SAML authentication. The main contribution of MADE project is to open the path for European companies to use eDelivery services with cross-border exchange of electronic documents following PEPPOL (Pan-European Public Procurement Online) based on the e-SENS AS4 Profile. It also includes the development/integration of new components, integration of new and existing logging and traceability solutions and maintenance tool support for PKI. Moreover, we have found that most companies are still not ready to support those profiles. Thus further efforts will be needed to promote this technology into the companies. The consortium includes the following 9 partners. From them, 2 are research institutions: University Carlos III of Madrid (Coordinator), and Universidad Politecnica de Valencia. The other 7 (EDICOM, BIZbrains, Officient, Aksesspunkt Norge, eConnect, LMT group, Unimaze) are private entities specialized in secure delivery of electronic documents and information integration brokerage in their respective countries. To achieve cross-border operativity, they will include AS4 and SMP services in their platforms according to the EU Core Service Platform. Made project is instrumental to test the feasibility of cross-border documents eDelivery in Europe. If successful, not only einvoices, but many other types of documents will be securely exchanged through Europe. It will be the base to extend the network to the whole Europe. This project has been funded under the Connecting Europe Facility Agreement number: INEA/CEF/ICT/A2016/1278042. Action No: 2016-EU-IA-0063.Keywords: security, e-delivery, e-invoicing, e-delivery, e-document exchange, trust
Procedia PDF Downloads 26711387 A Comparative Study of the Proposed Models for the Components of the National Health Information System
Authors: M. Ahmadi, Sh. Damanabi, F. Sadoughi
Abstract:
National Health Information System plays an important role in ensuring timely and reliable access to Health information which is essential for strategic and operational decisions that improve health, quality and effectiveness of health care. In other words, by using the National Health information system you can improve the quality of health data, information and knowledge used to support decision making at all levels and areas of the health sector. Since full identification of the components of this system for better planning and management influential factors of performance seems necessary, therefore, in this study, different attitudes towards components of this system are explored comparatively. Methods: This is a descriptive and comparative kind of study. The society includes printed and electronic documents containing components of the national health information system in three parts: input, process, and output. In this context, search for information using library resources and internet search were conducted and data analysis was expressed using comparative tables and qualitative data. Results: The findings showed that there are three different perspectives presenting the components of national health information system, Lippeveld, Sauerborn, and Bodart Model in 2000, Health Metrics Network (HMN) model from World Health Organization in 2008 and Gattini’s 2009 model. All three models outlined above in the input (resources and structure) require components of management and leadership, planning and design programs, supply of staff, software and hardware facilities, and equipment. In addition, in the ‘process’ section from three models, we pointed up the actions ensuring the quality of health information system and in output section, except Lippeveld Model, two other models consider information products, usage and distribution of information as components of the national health information system. Conclusion: The results showed that all the three models have had a brief discussion about the components of health information in input section. However, Lippeveld model has overlooked the components of national health information in process and output sections. Therefore, it seems that the health measurement model of network has a comprehensive presentation for the components of health system in all three sections-input, process, and output.Keywords: National Health Information System, components of the NHIS, Lippeveld Model
Procedia PDF Downloads 42211386 The Second Column of Origen’s Hexapla and the Transcription of BGDKPT Consonants: A Confrontation with Transliterated Hebrew Names in Greek Documents
Authors: Isabella Maurizio
Abstract:
This research analyses the pronunciation of Hebrew consonants 'bgdkpt' in II- III C. E. in Palestine, through the confrontation of two kinds of data: the fragments of transliteration of Old Testament in the Greek alphabet, from the second column of Origen’s synopsis, called Hexapla, and Hebrew names transliterated in Greek documents, especially epigraphs. Origen is a very important author, not only for his bgdkpt theological and exegetic works: the Hexapla, synoptic six columns for a critical edition of Septuaginta, has a relevant role in attempting to reconstruct the pronunciation of Hebrew language before Masoretic punctuation. For this reason, at the beginning, it is important to analyze the column in order to study phonetic and linguistic phenomena. Among the most problematic data, there is the evidence from bgdkpt consonants, always represented as Greek aspirated graphemes. This transcription raised the question if their pronunciation was the only spirant, and consequently, the double one, that is, the stop/spirant contrast, was introduced by Masoretes. However, the phonetic and linguistic examination of the column alone is not enough to establish a real pronunciation of language: this paper is significant because a confrontation between the second column’s transliteration and Hebrew names found in Greek documents epigraphic ones mainly, is achieved. Palestine in II - III was a bilingual country: Greek and Aramaic language lived together, the first one like the official language, the second one as the principal mean of communication between people. For this reason, Hebrew names are often found in Greek documents of the same geographical area: a deep examination of bgdkpt’s transliteration can help to understand better which the real pronunciation of these consonants was, or at least it allows to evidence a phonetic tendency. As a consequence, the research considers the contemporary documents to Origen and the previous ones: the first ones testify a specific stadium of pronunciation, the second ones reflect phonemes’ evolution. Alexandrian documents are also examined: Origen was from there, and the influence of Greek language, spoken in his native country, must be considered. The epigraphs have another implication: they are totally free from morphological criteria, probably used by Origen in his column, because of their popular origin. Thus, a confrontation between the hexaplaric transliteration and Hebrew names is absolutely required, in Hexapla’s studies: first of all, it can be the second clue of a pronunciation already noted in the column; then because, for documents’ specific nature, it has more probabilities to be real, reflecting a daily use of language. The examination of data shows a general tendency to employ the aspirated graphemes for bgdkpt consonants’ transliteration. This probably means that they were closer to Greek aspirated consonants rather than to the plosive ones. The exceptions are linked to a particular status of the name, i.e. its history and origin. In this way, this paper gives its contribution to onomastic studies, too: indeed, the research may contribute to verify the diffusion and the treatment of Jewish names in Hellenized world and in the koinè language.Keywords: bgdkpt consonants, Greek epigraphs, Jewish names, origen's Hexapla
Procedia PDF Downloads 14111385 Deep Learning-Based Approach to Automatic Abstractive Summarization of Patent Documents
Authors: Sakshi V. Tantak, Vishap K. Malik, Neelanjney Pilarisetty
Abstract:
A patent is an exclusive right granted for an invention. It can be a product or a process that provides an innovative method of doing something, or offers a new technical perspective or solution to a problem. A patent can be obtained by making the technical information and details about the invention publicly available. The patent owner has exclusive rights to prevent or stop anyone from using the patented invention for commercial uses. Any commercial usage, distribution, import or export of a patented invention or product requires the patent owner’s consent. It has been observed that the central and important parts of patents are scripted in idiosyncratic and complex linguistic structures that can be difficult to read, comprehend or interpret for the masses. The abstracts of these patents tend to obfuscate the precise nature of the patent instead of clarifying it via direct and simple linguistic constructs. This makes it necessary to have an efficient access to this knowledge via concise and transparent summaries. However, as mentioned above, due to complex and repetitive linguistic constructs and extremely long sentences, common extraction-oriented automatic text summarization methods should not be expected to show a remarkable performance when applied to patent documents. Other, more content-oriented or abstractive summarization techniques are able to perform much better and generate more concise summaries. This paper proposes an efficient summarization system for patents using artificial intelligence, natural language processing and deep learning techniques to condense the knowledge and essential information from a patent document into a single summary that is easier to understand without any redundant formatting and difficult jargon.Keywords: abstractive summarization, deep learning, natural language Processing, patent document
Procedia PDF Downloads 12311384 The History of the Birth of Tunisian Higher Accounting Education
Authors: Rim Khemiri, Mariam Dammak
Abstract:
The aim of this study is to trace the historical evolution of Tunisian higher accounting education and to understand and highlight the circumstances of its birth and its development. A documentary study (archival documents, official documents, public speeches, etc.), as well as semi-directive interviews with key actors, were carried out as part of this research work. These interviews aim to fill a lack of information on this subject and to confirm events addressed by other sources, but for which it lacks the elements necessary for a good understanding. After having put forward the specificities of the Tunisian context, we will, first of all, proceed to a review of the literature related to our theme in various contexts of the world. Then, we will present the evolution of the accounting curriculum by highlighting the circumstances of its birth and those of the successive reforms led by the Tunisian government. The study of higher accounting education in Tunisia and its evolution has several interests. The first lies in understanding the circumstances of its birth and its evolution in relation to the historical, socio-economic, and political context of the country. The second is to propose a reading grid that allows an understanding of the reforms that led to the university accountancy accounting course as we know it today. And, the third, aims to complete the literature on the processes of evolution of higher education accounting, by treating a different context, in order to provide additional knowledge necessary to compare experiences in this area around the world.Keywords: accounting history, higher accounting education, socio-economic and political context, Tunisian context
Procedia PDF Downloads 13611383 Statistical Discrimination of Blue Ballpoint Pen Inks by Diamond Attenuated Total Reflectance (ATR) FTIR
Authors: Mohamed Izzharif Abdul Halim, Niamh Nic Daeid
Abstract:
Determining the source of pen inks used on a variety of documents is impartial for forensic document examiners. The examination of inks is often performed to differentiate between inks in order to evaluate the authenticity of a document. A ballpoint pen ink consists of synthetic dyes in (acidic and/or basic), pigments (organic and/or inorganic) and a range of additives. Inks of similar color may consist of different composition and are frequently the subjects of forensic examinations. This study emphasizes on blue ballpoint pen inks available in the market because it is reported that approximately 80% of questioned documents analysis involving ballpoint pen ink. Analytical techniques such as thin layer chromatography, high-performance liquid chromatography, UV-vis spectroscopy, luminescence spectroscopy and infrared spectroscopy have been used in the analysis of ink samples. In this study, application of Diamond Attenuated Total Reflectance (ATR) FTIR is straightforward but preferable in forensic science as it offers no sample preparation and minimal analysis time. The data obtained from these techniques were further analyzed using multivariate chemometric methods which enable extraction of more information based on the similarities and differences among samples in a dataset. It was indicated that some pens from the same manufactures can be similar in composition, however, discrete types can be significantly different.Keywords: ATR FTIR, ballpoint, multivariate chemometric, PCA
Procedia PDF Downloads 45811382 Information Literacy Initiatives in India in Present Era Age
Authors: Darshan Lal
Abstract:
The paper describes the concept of Information literacy. It is a critical component of this information age. Information literacy is the vital process in modern changing world. Information Literacy initiatives in India was also discussed. Paper also discussed Information literacy programmes for LIS professionals. Information literacy makes person capable to recognize when information is needed and how to locate, evaluate and use effectively of the needed information.Keywords: information literacy, information communication technology (ICT), information literacy programmes
Procedia PDF Downloads 37211381 Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation
Authors: Mario Kubek, Herwig Unger
Abstract:
Centroid terms are single words that semantically and topically characterise text documents and so may serve as their very compact representation in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. Thus, a new graphbased paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first, promising experimental results demonstrate the usefulness of the centroid-based search procedure. It is shown that especially the routing of search queries in interactive and decentralised search systems can be greatly improved by applying this approach. A detailed discussion on further fields of its application completes this contribution.Keywords: search algorithm, centroid, query, keyword, co-occurrence, categorisation
Procedia PDF Downloads 28311380 The Safety Profile of Vilazodone: A Study on Post-Marketing Surveillance
Authors: Humraaz Kaja, Kofi Mensah, Frasia Oosthuizen
Abstract:
Background and Aim: Vilazodone was approved in 2011 as an antidepressant to treat the major depressive disorder. As a relatively new drug, it is not clear if all adverse effects have been identified. The aim of this study was to review the adverse effects reported to the WHO Programme for International Drug Monitoring (PIDM) in order to add to the knowledge about the safety profile and adverse effects caused by vilazodone. Method: Data on adverse effects reported for vilazodone was obtained from the database VigiAccess managed by PIDM. Data was extracted from VigiAccess using Excel® and analyzed using descriptive statistics. The data collected was compared to the patient information leaflet (PIL) of Viibryd® and the FDA documents to determine adverse drug reactions reported post-marketing. Results: A total of 9708 adverse events had been recorded on VigiAccess, of which 6054 were not recorded on the PIL and the FDA approval document. Most of the reports were received from the Americas and were for adult women aged 45-64 years (24%, n=1059). The highest number of adverse events reported were for psychiatric events (19%; n=1889), followed by gastro-intestinal effects (18%; n=1839). Specific psychiatric disorders recorded included anxiety (316), depression (208), hallucination (168) and agitation (142). The systematic review confirmed several psychiatric adverse effects associated with the use of vilazodone. The findings of this study suggested that these common psychiatric adverse effects associated with the use of vilazodone were not known during the time of FDA approval of the drug and is not currently recorded in the patient information leaflet (PIL). Conclusions: In summary, this study found several adverse drug reactions not recorded in documents emanating from clinical trials pre-marketing. This highlights the importance of continued post-marketing surveillance of a drug, as well as the need for further studies on the psychiatric adverse events associated with vilazodone in order to improve the safety profile.Keywords: adverse drug reactions, pharmacovigilance, post-marketing surveillance, vilazodone
Procedia PDF Downloads 11611379 A Methodology for Developing New Technology Ideas to Avoid Patent Infringement: F-Term Based Patent Analysis
Authors: Kisik Song, Sungjoo Lee
Abstract:
With the growing importance of intangible assets recently, the impact of patent infringement on the business of a company has become more evident. Accordingly, it is essential for firms to estimate the risk of patent infringement risk before developing a technology and create new technology ideas to avoid the risk. Recognizing the needs, several attempts have been made to help develop new technology opportunities and most of them have focused on identifying emerging vacant technologies from patent analysis. In these studies, the IPC (International Patent Classification) system or keywords from text-mining application to patent documents was generally used to define vacant technologies. Unlike those studies, this study adopted F-term, which classifies patent documents according to the technical features of the inventions described in them. Since the technical features are analyzed by various perspectives by F-term, F-term provides more detailed information about technologies compared to IPC while more systematic information compared to keywords. Therefore, if well utilized, it can be a useful guideline to create a new technology idea. Recognizing the potential of F-term, this paper aims to suggest a novel approach to developing new technology ideas to avoid patent infringement based on F-term. For this purpose, we firstly collected data about F-term and then applied text-mining to the descriptions about classification criteria and attributes. From the text-mining results, we could identify other technologies with similar technical features of the existing one, the patented technology. Finally, we compare the technologies and extract the technical features that are commonly used in other technologies but have not been used in the existing one. These features are presented in terms of “purpose”, “function”, “structure”, “material”, “method”, “processing and operation procedure” and “control means” and so are useful for creating new technology ideas that help avoid infringing patent rights of other companies. Theoretically, this is one of the earliest attempts to adopt F-term to patent analysis; the proposed methodology can show how to best take advantage of F-term with the wealth of technical information. In practice, the proposed methodology can be valuable in the ideation process for successful product and service innovation without infringing the patents of other companies.Keywords: patent infringement, new technology ideas, patent analysis, F-term
Procedia PDF Downloads 27011378 Behavior of Printing Inks on Historical Documents Subjected to Cold RF Plasma Discharges
Authors: Dorina Rusu, Emil Ghiocel Ioanid, Marta Ursescu, Ana Maria Vlad, Mihaela Popescu
Abstract:
During the last decades the cold plasma discharges made the subject of numerous studies concerning the applications in the cultural heritage field, especially concentrated on ecological and non-invasive aspect of these conservation procedures. The conservation treatment using cold plasma is based, on the one hand, on the well-known property of plasma discharges to inactivate the contaminant biological species and, on the other hand, on the surface cleaning effect. Moreover the plasma discharge produces the functionalization of the treated surface, allowing subsequent deposition of protective layers. The paper presents the behavior of printing inks on historical documents treated in cold RF plasma. Two types of printing inks were studied, namely red and black ink, used on a religious book published in 19 century. SEM-EDX analysis results in the identification of the two inks as carbon black ink (C presence in the EDX spectrum) and cinnabar based red ink (Hg and S lines in the spectrum), result confirmed by XRF analysis. The experiments have been performed on paper samples written with laboratory- made inks, of similar composition with the inks identified on historical documents. The samples were subjected to RF plasma discharge, operating in nitrogen gaseous medium, at 1.2 MHz frequency and low-pressure (0.5 mbar), performed in a self-designed equipment for the application of conservation treatments on naturally aged paper supports. The impact of plasma discharge on the inks has been evaluated by SEM, XRD and color analysis. The color analysis revealed a slight discoloration of cinnabar ink on the historical document. SEM and XRD analyses have been carried out in an attempt to elucidate the process responsable for color modification.Keywords: RF plasma, printing inks, historical documents, surface cleaning effect
Procedia PDF Downloads 44111377 Popularization of the Communist Manifesto in 19th Century Europe
Authors: Xuanyu Bai
Abstract:
“The Communist Manifesto”, written by Karl Marx and Friedrich Engels, is one of the most significant documents throughout the whole history which covers across different fields including Economic, Politic, Sociology and Philosophy. Instead of discussing the Communist ideas presented in the Communist Manifesto, the essay focuses on exploring the reasons that contributed to the popularization of the document and its influence on political revolutions in 19th century Europe by concentrating on the document itself along with other primary and secondary sources and temporal artwork. Combining the details from the Communist Manifesto and other documents, Marx’s writing style and word choice, his convincible notions about a new society dominated by proletariats, and the revolutionary idea of class destruction has led to the popularization of the Communist Manifesto and influenced the latter political revolutions.Keywords: communist manifesto, Marx, Engels, capitalism
Procedia PDF Downloads 13111376 Exploring Social Impact of Emerging Technologies from Futuristic Data
Authors: Heeyeul Kwon, Yongtae Park
Abstract:
Despite the highly touted benefits, emerging technologies have unleashed pervasive concerns regarding unintended and unforeseen social impacts. Thus, those wishing to create safe and socially acceptable products need to identify such side effects and mitigate them prior to the market proliferation. Various methodologies in the field of technology assessment (TA), namely Delphi, impact assessment, and scenario planning, have been widely incorporated in such a circumstance. However, literatures face a major limitation in terms of sole reliance on participatory workshop activities. They unfortunately missed out the availability of a massive untapped data source of futuristic information flooding through the Internet. This research thus seeks to gain insights into utilization of futuristic data, future-oriented documents from the Internet, as a supplementary method to generate social impact scenarios whilst capturing perspectives of experts from a wide variety of disciplines. To this end, network analysis is conducted based on the social keywords extracted from the futuristic documents by text mining, which is then used as a guide to produce a comprehensive set of detailed scenarios. Our proposed approach facilitates harmonized depictions of possible hazardous consequences of emerging technologies and thereby makes decision makers more aware of, and responsive to, broad qualitative uncertainties.Keywords: emerging technologies, futuristic data, scenario, text mining
Procedia PDF Downloads 49211375 Modified Active (MA) Algorithm to Generate Semantic Web Related Clustered Hierarchy for Keyword Search
Authors: G. Leena Giri, Archana Mathur, S. H. Manjula, K. R. Venugopal, L. M. Patnaik
Abstract:
Keyword search in XML documents is based on the notion of lowest common ancestors in the labelled trees model of XML documents and has recently gained a lot of research interest in the database community. In this paper, we propose the Modified Active (MA) algorithm which is an improvement over the active clustering algorithm by taking into consideration the entity aspect of the nodes to find the level of the node pertaining to a particular keyword input by the user. A portion of the bibliography database is used to experimentally evaluate the modified active algorithm and results show that it performs better than the active algorithm. Our modification improves the response time of the system and thereby increases the efficiency of the system.Keywords: keyword matching patterns, MA algorithm, semantic search, knowledge management
Procedia PDF Downloads 41511374 Computer Fraud from the Perspective of Iran's Law and International Documents
Authors: Babak Pourghahramani
Abstract:
One of the modern crimes against property and ownership in the cyber-space is the computer fraud. Despite being modern, the aforementioned crime has its roots in the principles of religious jurisprudence. In some cases, this crime is compatible with the traditional regulations and that is when the computer is considered as a crime commitment device and also some computer frauds that take place in the context of electronic exchanges are considered as crime based on the E-commerce Law (approved in 2003) but the aforementioned regulations are flawed and until recent years there was no comprehensive law in this regard; yet after some years the Computer Crime Act was approved in 2009/26/5 and partly solved the problem of legal vacuum. The present study intends to investigate the computer fraud according to Iran's Computer Crime Act and by taking into consideration the international documents.Keywords: fraud, cyber fraud, computer fraud, classic fraud, computer crime
Procedia PDF Downloads 33211373 Enhancing Large Language Models' Data Analysis Capability with Planning-and-Execution and Code Generation Agents: A Use Case for Southeast Asia Real Estate Market Analytics
Authors: Kien Vu, Jien Min Soh, Mohamed Jahangir Abubacker, Piyawut Pattamanon, Soojin Lee, Suvro Banerjee
Abstract:
Recent advances in Generative Artificial Intelligence (GenAI), in particular Large Language Models (LLMs) have shown promise to disrupt multiple industries at scale. However, LLMs also present unique challenges, notably, these so-called "hallucination" which is the generation of outputs that are not grounded in the input data that hinders its adoption into production. Common practice to mitigate hallucination problem is utilizing Retrieval Agmented Generation (RAG) system to ground LLMs'response to ground truth. RAG converts the grounding documents into embeddings, retrieve the relevant parts with vector similarity between user's query and documents, then generates a response that is not only based on its pre-trained knowledge but also on the specific information from the retrieved documents. However, the RAG system is not suitable for tabular data and subsequent data analysis tasks due to multiple reasons such as information loss, data format, and retrieval mechanism. In this study, we have explored a novel methodology that combines planning-and-execution and code generation agents to enhance LLMs' data analysis capabilities. The approach enables LLMs to autonomously dissect a complex analytical task into simpler sub-tasks and requirements, then convert them into executable segments of code. In the final step, it generates the complete response from output of the executed code. When deployed beta version on DataSense, the property insight tool of PropertyGuru, the approach yielded promising results, as it was able to provide market insights and data visualization needs with high accuracy and extensive coverage by abstracting the complexities for real-estate agents and developers from non-programming background. In essence, the methodology not only refines the analytical process but also serves as a strategic tool for real estate professionals, aiding in market understanding and enhancement without the need for programming skills. The implication extends beyond immediate analytics, paving the way for a new era in the real estate industry characterized by efficiency and advanced data utilization.Keywords: large language model, reasoning, planning and execution, code generation, natural language processing, prompt engineering, data analysis, real estate, data sense, PropertyGuru
Procedia PDF Downloads 88