Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12

Search results for: Parser

12 Automatic Intelligent Analysis of Malware Behaviour

Authors: H. Dornhackl, K. Kadletz, R. Luh, P. Tavolato

Abstract:

In this paper, we describe the use of formal methods to model malware behaviour. The modelling of harmful behaviour rests upon syntactic structures that represent malicious procedures inside malware. The malicious activities are modelled by a formal grammar, where API calls’ components are the terminals and the set of API calls used in combination to achieve a goal are designated non-terminals. The combination of different non-terminals in various ways and tiers make up the attack vectors that are used by harmful software. Based on these syntactic structures a parser can be generated which takes execution traces as input for pattern recognition.

Keywords: Modelling, Search, parsing, Pattern Matching, malware behaviour

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1079
11 BugCatcher.Net: Detecting Bugs and Proposing Corrective Solutions

Authors: Sheetal Chavan, P. J. Kulkarni, Vivek Shanbhag

Abstract:

Although achieving zero-defect software release is practically impossible, software industries should take maximum care to detect defects/bugs well ahead in time allowing only bare minimums to creep into released version. This is a clear indicator of time playing an important role in the bug detection. In addition to this, software quality is the major factor in software engineering process. Moreover, early detection can be achieved only through static code analysis as opposed to conventional testing. BugCatcher.Net is a static analysis tool, which detects bugs in .NET® languages through MSIL (Microsoft Intermediate Language) inspection. The tool utilizes a Parser based on Finite State Automata to carry out bug detection. After being detected, bugs need to be corrected immediately. BugCatcher.Net facilitates correction, by proposing a corrective solution for reported warnings/bugs to end users with minimum side effects. Moreover, the tool is also capable of analyzing the bug trend of a program under inspection.

Keywords: Grammar, dependence, Early solution, Finite State Automata, Late solution, Parser State Transition Diagram, StaticProgram Analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1220
10 Syntax Sensitive and Language Independent Detection of Code Clones

Authors: Kazuaki Maeda

Abstract:

This paper proposes a new technique to detect code clones from the lexical and syntactic point of view, which is based on PALEX source code representation. The PALEX code contains the recorded parsing actions and also lexical formatting information including white spaces and comments. We can record a list of parsing actions (shift, reduce, and reading a token) during a compiling process after a compiler finishes analyzing the source code. The proposed technique has advantages for syntax sensitive approach and language independency.

Keywords: xml, parser, Code Clones, Source Code Representation, Parser Generator

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1134
9 A Thai to English Machine Translation System Using Thai LFG Tree Structure as Interlingua

Authors: Tawee Chimsuk, Surapong Auwatanamongkol

Abstract:

Machine Translation (MT) between the Thai and English languages has been a challenging research topic in natural language processing. Most research has been done on English to Thai machine translation, but not the other way around. This paper presents a Thai to English Machine Translation System that translates a Thai sentence into interlingua of a Thai LFG tree using LFG grammar and a bottom up parser. The Thai LFG tree is then transformed into the corresponding English LFG tree by pattern matching and node transformation. Finally, an equivalent English sentence is created using structural information prescribed by the English LFG tree. Based on results of experiments designed to evaluate the performance of the proposed system, it can be stated that the system has been proven to be effective in providing a useful translation from Thai to English.

Keywords: Machine Translation, Pattern Matching, Interlingua, LFG grammar

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1884
8 Pattern Matching Based on Regular Tree Grammars

Authors: Riad S. Jabri

Abstract:

Pattern matching based on regular tree grammars have been widely used in many areas of computer science. In this paper, we propose a pattern matcher within the framework of code generation, based on a generic and a formalized approach. According to this approach, parsers for regular tree grammars are adapted to a general pattern matching solution, rather than adapting the pattern matching according to their parsing behavior. Hence, we first formalize the construction of the pattern matches respective to input trees drawn from a regular tree grammar in a form of the so-called match trees. Then, we adopt a recently developed generic parser and tightly couple its parsing behavior with such construction. In addition to its generality, the resulting pattern matcher is characterized by its soundness and efficient implementation. This is demonstrated by the proposed theory and by the derived algorithms for its implementation. A comparison with similar and well-known approaches, such as the ones based on tree automata and LR parsers, has shown that our pattern matcher can be applied to a broader class of grammars, and achieves better approximation of pattern matches in one pass. Furthermore, its use as a machine code selector is characterized by a minimized overhead, due to the balanced distribution of the cost computations into static ones, during parser generation time, and into dynamic ones, during parsing time.

Keywords: Pattern Matching, Bottom-up automata, Code selection, Regular tree grammars, Match trees

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 870
7 ReSeT : Reverse Engineering System Requirements Tool

Authors: Rosziati Ibrahim, Tiu Kian Yong

Abstract:

Reverse Engineering is a very important process in Software Engineering. It can be performed backwards from system development life cycle (SDLC) in order to get back the source data or representations of a system through analysis of its structure, function and operation. We use reverse engineering to introduce an automatic tool to generate system requirements from its program source codes. The tool is able to accept the Cµ programming source codes, scan the source codes line by line and parse the codes to parser. Then, the engine of the tool will be able to generate system requirements for that specific program to facilitate reuse and enhancement of the program. The purpose of producing the tool is to help recovering the system requirements of any system when the system requirements document (SRD) does not exist due to undocumented support of the system.

Keywords: Reverse Engineering, system requirements, SourceCodes

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1247
6 Thematic Role Extraction Using Shallow Parsing

Authors: Mehrnoush Shamsfard, Maryam Sadr Mousavi

Abstract:

Extracting thematic (semantic) roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a rule-based approach to extract semantic roles from Persian sentences. The system exploits a twophase architecture to (1) identify the arguments and (2) label them for each predicate. For the first phase we developed a rule based shallow parser to chunk Persian sentences and for the second phase we developed a knowledge-based system to assign 16 selected thematic roles to the chunks. The experimental results of testing each phase are shown at the end of the paper.

Keywords: natural language processing, thematic roles, Semantic RoleLabeling, Shallow parsing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1663
5 Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model

Authors: Selvam M, Natarajan. A M, Thangarajan R

Abstract:

Parsing is important in Linguistics and Natural Language Processing to understand the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of the problems like ambiguity and inefficiency. Also the interpretation of natural language text depends on context based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics thereby increasing accuracy and efficiency of the parser. Tamil language has some inherent features which are more challenging. In order to obtain the solutions, lexicalized and statistical approach is to be applied in the parsing with the aid of a language model. Statistical models mainly focus on semantics of the language which are suitable for large vocabulary tasks where as structural methods focus on syntax which models small vocabulary tasks. A statistical language model based on Trigram for Tamil language with medium vocabulary of 5000 words has been built. Though statistical parsing gives better performance through tri-gram probabilities and large vocabulary size, it has some disadvantages like focus on semantics rather than syntax, lack of support in free ordering of words and long term relationship. To overcome the disadvantages a structural component is to be incorporated in statistical language models which leads to the implementation of hybrid language models. This paper has attempted to build phrase structured hybrid language model which resolves above mentioned disadvantages. In the development of hybrid language model, new part of speech tag set for Tamil language has been developed with more than 500 tags which have the wider coverage. A phrase structured Treebank has been developed with 326 Tamil sentences which covers more than 5000 words. A hybrid language model has been trained with the phrase structured Treebank using immediate head parsing technique. Lexicalized and statistical parser which employs this hybrid language model and immediate head parsing technique gives better results than pure grammar and trigram based model.

Keywords: natural language processing, parts of speech, Hybrid Language Model, Immediate Head Parsing, Lexicalized and Statistical Parsing, Probabilistic Context Free Grammar, Tamil Language, Tree Bank

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3169
4 Use XML Format like a Model of Data Backup

Authors: Souleymane Oumtanaga, Kadjo Tanon Lambert, Koné Tiémoman, Tety Pierre, Dowa N’sreke Florent

Abstract:

Nowadays data backup format doesn-t cease to appear raising so the anxiety on their accessibility and their perpetuity. XML is one of the most promising formats to guarantee the integrity of data. This article suggests while showing one thing man can do with XML. Indeed XML will help to create a data backup model. The main task will consist in defining an application in JAVA able to convert information of a database in XML format and restore them later.

Keywords: Backup, parser, Proprietary format, syntactic tree

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
3 System of Programs for Rapid Development and Execution of Palm OS Applications

Authors: Mihai Ciocarlie, Marcela-Simona Atanasoae, Horia Ciocarlie

Abstract:

We present the development of a system of programs designed for the compilation and execution of applications for handheld computers. In introduction we describe the purpose of the project and its components. The next two paragraphs present the first two components of the project (the scanner and parser generators). Then we describe the Object Pascal compiler and the virtual machines for Windows and Palm OS. In conclusion we emphasize the ways in which the project can be extended.

Keywords: Virtual machine, Compiler Design, Palm OS applications, rapid application development

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1075
2 PIELG: A Protein Interaction Extraction Systemusing a Link Grammar Parser from Biomedical Abstracts

Authors: Rania A. Abul Seoud, Nahed H. Solouma, Abou-Baker M. Youssef, Yasser M. Kadah

Abstract:

Due to the ever growing amount of publications about protein-protein interactions, information extraction from text is increasingly recognized as one of crucial technologies in bioinformatics. This paper presents a Protein Interaction Extraction System using a Link Grammar Parser from biomedical abstracts (PIELG). PIELG uses linkage given by the Link Grammar Parser to start a case based analysis of contents of various syntactic roles as well as their linguistically significant and meaningful combinations. The system uses phrasal-prepositional verbs patterns to overcome preposition combinations problems. The recall and precision are 74.4% and 62.65%, respectively. Experimental evaluations with two other state-of-the-art extraction systems indicate that PIELG system achieves better performance. For further evaluation, the system is augmented with a graphical package (Cytoscape) for extracting protein interaction information from sequence databases. The result shows that the performance is remarkably promising.

Keywords: natural language processing, Protein-Protein Interaction, Link Grammar Parser, Interaction extraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1875
1 An Embedded System for Artificial Intelligence Applications

Authors: Ioannis P. Panagopoulos, Christos C. Pavlatos, George K. Papakonstantinou

Abstract:

Conventional approaches in the implementation of logic programming applications on embedded systems are solely of software nature. As a consequence, a compiler is needed that transforms the initial declarative logic program to its equivalent procedural one, to be programmed to the microprocessor. This approach increases the complexity of the final implementation and reduces the overall system's performance. On the contrary, presenting hardware implementations which are only capable of supporting logic programs prevents their use in applications where logic programs need to be intertwined with traditional procedural ones, for a specific application. We exploit HW/SW codesign methods to present a microprocessor, capable of supporting hybrid applications using both programming approaches. We take advantage of the close relationship between attribute grammar (AG) evaluation and knowledge engineering methods to present a programmable hardware parser that performs logic derivations and combine it with an extension of a conventional RISC microprocessor that performs the unification process to report the success or failure of those derivations. The extended RISC microprocessor is still capable of executing conventional procedural programs, thus hybrid applications can be implemented. The presented implementation is programmable, supports the execution of hybrid applications, increases the performance of logic derivations (experimental analysis yields an approximate 1000% increase in performance) and reduces the complexity of the final implemented code. The proposed hardware design is supported by a proposed extended C-language called C-AG.

Keywords: logic programming, Attribute Grammars, RISC microprocessor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4743