Search results for: Unstructured%20TextDocuments

82 Clustering Unstructured Text Documents Using Fading Function

Authors: Pallav Roxy, Durga Toshniwal

Abstract:

Clustering unstructured text documents is an important issue in data mining community and has a number of applications such as document archive filtering, document organization and topic detection and subject tracing. In the real world, some of the already clustered documents may not be of importance while new documents of more significance may evolve. Most of the work done so far in clustering unstructured text documents overlooks this aspect of clustering. This paper, addresses this issue by using the Fading Function. The unstructured text documents are clustered. And for each cluster a statistics structure called Cluster Profile (CP) is implemented. The cluster profile incorporates the Fading Function. This Fading Function keeps an account of the time-dependent importance of the cluster. The work proposes a novel algorithm Clustering n-ary Merge Algorithm (CnMA) for unstructured text documents, that uses Cluster Profile and Fading Function. Experimental results illustrating the effectiveness of the proposed technique are also included.

Keywords: Clustering, Text Mining, Unstructured TextDocuments, Fading Function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1938

81 Exploiting Query Feedback for Efficient Query Routing in Unstructured Peer-to-peer Networks

Authors: Iskandar Ishak, Naomie Salim

Abstract:

Unstructured peer-to-peer networks are popular due to its robustness and scalability. Query schemes that are being used in unstructured peer-to-peer such as the flooding and interest-based shortcuts suffer various problems such as using large communication overhead long delay response. The use of routing indices has been a popular approach for peer-to-peer query routing. It helps the query routing processes to learn the routing based on the feedbacks collected. In an unstructured network where there is no global information available, efficient and low cost routing approach is needed for routing efficiency. In this paper, we propose a novel mechanism for query-feedback oriented routing indices to achieve routing efficiency in unstructured network at a minimal cost. The approach also applied information retrieval technique to make sure the content of the query is understandable and will make the routing process not just based to the query hits but also related to the query content. Experiments have shown that the proposed mechanism performs more efficient than flood-based routing.

Keywords: Unstructured peer-to-peer, Searching, Retrieval, Internet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1488

80 Discovery and Capture of Organizational Knowledge from Unstructured Information

Authors: J. Gu, W.B. Lee, C.F. Cheung, E. Tsui, W.M. Wang

Abstract:

Knowledge of an organization does not merely reside in structured form of information and data; it is also embedded in unstructured form. The discovery of such knowledge is particularly difficult as the characteristic is dynamic, scattered, massive and multiplying at high speed. Conventional methods of managing unstructured information are considered too resource demanding and time consuming to cope with the rapid information growth. In this paper, a Multi-faceted and Automatic Knowledge Elicitation System (MAKES) is introduced for the purpose of discovery and capture of organizational knowledge. A trial implementation has been conducted in a public organization to achieve the objective of decision capture and navigation from a number of meeting minutes which are autonomously organized, classified and presented in a multi-faceted taxonomy map in both document and content level. Key concepts such as critical decision made, key knowledge workers, knowledge flow and the relationship among them are elicited and displayed in predefined knowledge model and maps. Hence, the structured knowledge can be retained, shared and reused. Conducting Knowledge Management with MAKES reduces work in searching and retrieving the target decision, saves a great deal of time and manpower, and also enables an organization to keep pace with the knowledge life cycle. This is particularly important when the amount of unstructured information and data grows extremely quickly. This system approach of knowledge management can accelerate value extraction and creation cycles of organizations.

Keywords: Knowledge-Based System, Knowledge Elicitation, Knowledge Management, Taxonomy, Unstructured Information Management

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1795

79 An Unstructured Finite-volume Technique for Shallow-water Flows with Wetting and Drying Fronts

Authors: Rajendra K. Ray, Kim Dan Nguyen

Abstract:

An unstructured finite volume numerical model is presented here for simulating shallow-water flows with wetting and drying fronts. The model is based on the Green-s theorem in combination with Chorin-s projection method. A 2nd-order upwind scheme coupled with a Least Square technique is used to handle convection terms. An Wetting and drying treatment is used in the present model to ensures the total mass conservation. To test it-s capacity and reliability, the present model is used to solve the Parabolic Bowl problem. We compare our numerical solutions with the corresponding analytical and existing standard numerical results. Excellent agreements are found in all the cases.

Keywords: Finite volume method, Projection method, Shallow water, Unstructured grid, wetting/drying fronts.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1551

78 Opinion Mining Framework in the Education Domain

Authors: A. M. H. Elyasir, K. S. M. Anbananthen

Abstract:

The internet is growing larger and becoming the most popular platform for the people to share their opinion in different interests. We choose the education domain specifically comparing some Malaysian universities against each other. This comparison produces benchmark based on different criteria shared by the online users in various online resources including Twitter, Facebook and web pages. The comparison is accomplished using opinion mining framework to extract, process the unstructured text and classify the result to positive, negative or neutral (polarity). Hence, we divide our framework to three main stages; opinion collection (extraction), unstructured text processing and polarity classification. The extraction stage includes web crawling, HTML parsing, Sentence segmentation for punctuation classification, Part of Speech (POS) tagging, the second stage processes the unstructured text with stemming and stop words removal and finally prepare the raw text for classification using Named Entity Recognition (NER). Last phase is to classify the polarity and present overall result for the comparison among the Malaysian universities. The final result is useful for those who are interested to study in Malaysia, in which our final output declares clear winners based on the public opinions all over the web.

Keywords: Entity Recognition, Education Domain, Opinion Mining, Unstructured Text.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2918

77 Delaunay Triangulations Efficiency for Conduction-Convection Problems

Authors: Bashar Albaalbaki, Roger E. Khayat

Abstract:

This work is a comparative study on the effect of Delaunay triangulation algorithms on discretization error for conduction-convection conservation problems. A structured triangulation and many unstructured Delaunay triangulations using three popular algorithms for node placement strategies are used. The numerical method employed is the vertex-centered finite volume method. It is found that when the computational domain can be meshed using a structured triangulation, the discretization error is lower for structured triangulations compared to unstructured ones for only low Peclet number values, i.e. when conduction is dominant. However, as the Peclet number is increased and convection becomes more significant, the unstructured triangulations reduce the discretization error. Also, no statistical correlation between triangulation angle extremums and the discretization error is found using 200 samples of randomly generated Delaunay and non-Delaunay triangulations. Thus, the angle extremums cannot be an indicator of the discretization error on their own and need to be combined with other triangulation quality measures, which is the subject of further studies.

Keywords: Conduction-convection problems, Delaunay triangulation, discretization error, finite volume method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 75

76 Mining Association Rules from Unstructured Documents

Authors: Hany Mahgoub

Abstract:

This paper presents a system for discovering association rules from collections of unstructured documents called EART (Extract Association Rules from Text). The EART system treats texts only not images or figures. EART discovers association rules amongst keywords labeling the collection of textual documents. The main characteristic of EART is that the system integrates XML technology (to transform unstructured documents into structured documents) with Information Retrieval scheme (TF-IDF) and Data Mining technique for association rules extraction. EART depends on word feature to extract association rules. It consists of four phases: structure phase, index phase, text mining phase and visualization phase. Our work depends on the analysis of the keywords in the extracted association rules through the co-occurrence of the keywords in one sentence in the original text and the existing of the keywords in one sentence without co-occurrence. Experiments applied on a collection of scientific documents selected from MEDLINE that are related to the outbreak of H5N1 avian influenza virus.

Keywords: Association rules, information retrieval, knowledgediscovery in text, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2392

75 Kinetic Studies on Microbial Production of Tannase Using Redgram Husk

Authors: S. K. Mohan, T. Viruthagiri, C. Arunkumar

Abstract:

Tannase (tannin acyl hydrolase, E.C.3.1.1.20) is an important hydrolysable enzyme with innumerable applications and industrial potential. In the present study, a kinetic model has been developed for the batch fermentation used for the production of tannase by A.flavus MTCC 3783. Maximum tannase activity of 143.30 U/ml was obtained at 96 hours under optimum operating conditions at 35oC, an initial pH of 5.5 and with an inducer tannic acid concentration of 3% (w/v) for a fermentation period of 120 hours. The biomass concentration reaches a maximum of 6.62 g/l at 96 hours and further there was no increase in biomass concentration till the end of the fermentation. Various unstructured kinetic models were analyzed to simulate the experimental values of microbial growth, tannase activity and substrate concentration. The Logistic model for microbial growth , Luedeking - Piret model for production of tannase and Substrate utilization kinetic model for utilization of substrate were capable of predicting the fermentation profile with high coefficient of determination (R2) values of 0.980, 0.942 and 0.983 respectively. The results indicated that the unstructured models were able to describe the fermentation kinetics more effectively.

Keywords: Aspergillus flavus, Batch fermentation, Kinetic model, Tannase, Unstructured models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1516

74 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. Therefore, NOSQL appears to handle the problem of unstructured data. Association rules mining is one of the popular techniques of data mining to extract hidden relationship from transactional databases. The algorithm for finding association dependencies is well-solved with Map Reduce. The goal of our work is to reduce the time of generating of frequent itemsets by using Map Reduce and NOSQL database oriented document. A comparative study is given to evaluate the performances of our algorithm with the classical algorithm Apriori.

Keywords: Apriori, Association rules mining, Big Data, data mining, Hadoop, Map Reduce, MongoDB, NoSQL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 617

73 Dynamic Variational Multiscale LES of Bluff Body Flows on Unstructured Grids

Authors: Carine Moussaed, Stephen Wornom, Bruno Koobus, Maria Vittoria Salvetti, Alain Dervieux,

Abstract:

The effects of dynamic subgrid scale (SGS) models are investigated in variational multiscale (VMS) LES simulations of bluff body flows. The spatial discretization is based on a mixed finite element/finite volume formulation on unstructured grids. In the VMS approach used in this work, the separation between the largest and the smallest resolved scales is obtained through a variational projection operator and a finite volume cell agglomeration. The dynamic version of Smagorinsky and WALE SGS models are used to account for the effects of the unresolved scales. In the VMS approach, these effects are only modeled in the smallest resolved scales. The dynamic VMS-LES approach is applied to the simulation of the flow around a circular cylinder at Reynolds numbers 3900 and 20000 and to the flow around a square cylinder at Reynolds numbers 22000 and 175000. It is observed as in previous studies that the dynamic SGS procedure has a smaller impact on the results within the VMS approach than in LES. But improvements are demonstrated for important feature like recirculating part of the flow. The global prediction is improved for a small computational extra cost.

Keywords: variational multiscale LES, dynamic SGS model, unstructured grids, circular cylinder, square cylinder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1778

72 A Finite Volume Procedure on Unstructured Meshes for Fluid-Structure Interaction Problems

Authors: P I Jagad, B P Puranik, A W Date

Abstract:

Flow through micro and mini channels requires relatively high driving pressure due to the large fluid pressure drop through these channels. Consequently the forces acting on the walls of the channel due to the fluid pressure are also large. Due to these forces there are displacement fields set up in the solid substrate containing the channels. If the movement of the substrate is constrained at some points, then stress fields are established in the substrate. On the other hand, if the deformation of the channel shape is sufficiently large then its effect on the fluid flow is important to be calculated. Such coupled fluid-solid systems form a class of problems known as fluidstructure interactions. In the present work a co-located finite volume discretization procedure on unstructured meshes is described for solving fluid-structure interaction type of problems. A linear elastic solid is assumed for which the effect of the channel deformation on the flow is neglected. Thus the governing equations for the fluid and the solid are decoupled and are solved separately. The procedure is validated by solving two benchmark problems, one from fluid mechanics and another from solid mechanics. A fluid-structure interaction problem of flow through a U-shaped channel embedded in a plate is solved.

Keywords: Finite volume method, flow induced stresses, fluidstructureinteraction, unstructured meshes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1839

71 Finite Volume Method for Flow Prediction Using Unstructured Meshes

Authors: Juhee Lee, Yongjun Lee

Abstract:

In designing a low-energy-consuming buildings, the heat transfer through a large glass or wall becomes critical. Multiple layers of the window glasses and walls are employed for the high insulation. The gravity driven air flow between window glasses or wall layers is a natural heat convection phenomenon being a key of the heat transfer. For the first step of the natural heat transfer analysis, in this study the development and application of a finite volume method for the numerical computation of viscous incompressible flows is presented. It will become a part of the natural convection analysis with high-order scheme, multi-grid method, and dual-time step in the future. A finite volume method based on a fully-implicit second-order is used to discretize and solve the fluid flow on unstructured grids composed of arbitrary-shaped cells. The integrations of the governing equation are discretised in the finite volume manner using a collocated arrangement of variables. The convergence of the SIMPLE segregated algorithm for the solution of the coupled nonlinear algebraic equations is accelerated by using a sparse matrix solver such as BiCGSTAB. The method used in the present study is verified by applying it to some flows for which either the numerical solution is known or the solution can be obtained using another numerical technique available in the other researches. The accuracy of the method is assessed through the grid refinement.

Keywords: Finite volume method, fluid flow, laminar flow, unstructured grid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1781

70 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 626

69 Robust Stability in Multivariable Neural Network Control using Harmonic Analysis

Authors: J. Fernandez de Canete, S. Gonzalez-Perez, P. del Saz-Orozco, I. Garcia-Moral

Abstract:

Robust stability and performance are the two most basic features of feedback control systems. The harmonic balance analysis technique enables to analyze the stability of limit cycles arising from a neural network control based system operating over nonlinear plants. In this work a robust stability analysis based on the harmonic balance is presented and applied to a neural based control of a non-linear binary distillation column with unstructured uncertainty. We develop ways to describe uncertainty in the form of neglected nonlinear dynamics and high harmonics for the plant and controller respectively. Finally, conclusions about the performance of the neural control system are discussed using the Nyquist stability margin together with the structured singular values of the uncertainty as a robustness measure.

Keywords: Robust stability, neural network control, unstructured uncertainty, singular values, distillation column.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1582

68 Semi-Lagrangian Method for Advection Equation on GPU in Unstructured R3 Mesh for Fluid Dynamics Application

Authors: Irakli V. Gugushvili, Nickolay M. Evstigneev

Abstract:

Numerical integration of initial boundary problem for advection equation in 3 ℜ is considered. The method used is conditionally stable semi-Lagrangian advection scheme with high order interpolation on unstructured mesh. In order to increase time step integration the BFECC method with limiter TVD correction is used. The method is adopted on parallel graphic processor unit environment using NVIDIA CUDA and applied in Navier-Stokes solver. It is shown that the calculation on NVIDIA GeForce 8800 GPU is 184 times faster than on one processor AMDX2 4800+ CPU. The method is extended to the incompressible fluid dynamics solver. Flow over a Cylinder for 3D case is compared to the experimental data.

Keywords: Advection equations, CUDA technology, Flow overthe 3D Cylinder, Incompressible Pressure Projection Solver, Parallel computation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2782

67 Use of Bayesian Network in Information Extraction from Unstructured Data Sources

Authors: Quratulain N. Rajput, Sajjad Haider

Abstract:

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, and information extraction and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology, presented in this paper, first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising as the methodology achieves high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.

Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2172

66 System for Monitoring Marine Turtles Using Unstructured Supplementary Service Data

Authors: Luís Pina

Abstract:

The conservation of marine biodiversity keeps ecosystems in balance and ensures the sustainable use of resources. In this context, technological resources have been used for monitoring marine species to allow biologists to obtain data in real-time. There are different mobile applications developed for data collection for monitoring purposes, but these systems are designed to be utilized only on third-generation (3G) phones or smartphones with Internet access and in rural parts of the developing countries, Internet services and smartphones are scarce. Thus, the objective of this work is to develop a system to monitor marine turtles using Unstructured Supplementary Service Data (USSD), which users can access through basic mobile phones. The system aims to improve the data collection mechanism and enhance the effectiveness of current systems in monitoring sea turtles using any type of mobile device without Internet access. The system will be able to report information related to the biological activities of marine turtles. Also, it will be used as a platform to assist marine conservation entities to receive reports of illegal sales of sea turtles. The system can also be utilized as an educational tool for communities, providing knowledge and allowing the inclusion of communities in the process of monitoring marine turtles. Therefore, this work may contribute with information to decision-making and implementation of contingency plans for marine conservation programs.

Keywords: GSM, marine biology, marine turtles, USSD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 870

65 Web Content Mining: A Solution to Consumer's Product Hunt

Authors: Syed Salman Ahmed, Zahid Halim, Rauf Baig, Shariq Bashir

Abstract:

With the rapid growth in business size, today's businesses orient towards electronic technologies. Amazon.com and e-bay.com are some of the major stakeholders in this regard. Unfortunately the enormous size and hugely unstructured data on the web, even for a single commodity, has become a cause of ambiguity for consumers. Extracting valuable information from such an everincreasing data is an extremely tedious task and is fast becoming critical towards the success of businesses. Web content mining can play a major role in solving these issues. It involves using efficient algorithmic techniques to search and retrieve the desired information from a seemingly impossible to search unstructured data on the Internet. Application of web content mining can be very encouraging in the areas of Customer Relations Modeling, billing records, logistics investigations, product cataloguing and quality management. In this paper we present a review of some very interesting, efficient yet implementable techniques from the field of web content mining and study their impact in the area specific to business user needs focusing both on the customer as well as the producer. The techniques we would be reviewing include, mining by developing a knowledge-base repository of the domain, iterative refinement of user queries for personalized search, using a graphbased approach for the development of a web-crawler and filtering information for personalized search using website captions. These techniques have been analyzed and compared on the basis of their execution time and relevance of the result they produced against a particular search.

Keywords: Data mining, web mining, search engines, knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2004

64 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the classification of an unstructured format description for identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format indentification method employs file format classifier and associated configurations to support digital preservation experts with an estimation of required file format. Our goal is to make use of a format specification knowledge base aggregated from a different Web sources in order to select file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert, the file format for his institution. The proposed methods facilitate the selection of file format and the quality of a digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises most common terms that are characteristic for all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results including probabilities are presented in the evaluation section.

Keywords: Data mining, digital libraries, digital preservation, file format.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1611

63 Current Status and Future Trends of Mechanized Fruit Thinning Devices and Sensor Technology

Authors: Marco Lopes, Pedro D. Gaspar, Maria P. Simões

Abstract:

This paper reviews the different concepts that have been investigated concerning the mechanization of fruit thinning as well as multiple working principles and solutions that have been developed for feature extraction of horticultural products, both in the field and industrial environments. The research should be committed towards selective methods, which inevitably need to incorporate some kinds of sensor technology. Computer vision often comes out as an obvious solution for unstructured detection problems, although leaves despite the chosen point of view frequently occlude fruits. Further research on non-traditional sensors that are capable of object differentiation is needed. Ultrasonic and Near Infrared (NIR) technologies have been investigated for applications related to horticultural produce and show a potential to satisfy this need while simultaneously providing spatial information as time of flight sensors. Light Detection and Ranging (LIDAR) technology also shows a huge potential but it implies much greater costs and the related equipment is usually much larger, making it less suitable for portable devices, which may serve a purpose on smaller unstructured orchards. Portable devices may serve a purpose on these types of orchards. In what concerns sensor methods, on-tree fruit detection, major challenge is to overcome the problem of fruits’ occlusion by leaves and branches. Hence, nontraditional sensors capable of providing some type of differentiation should be investigated.

Keywords: Fruit thinning, horticultural field, portable devices, sensor technologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 928

62 Continuous FAQ Updating for Service Incident Ticket Resolution

Authors: Kohtaroh Miyamoto

Abstract:

As enterprise computing becomes more and more complex, the costs and technical challenges of IT system maintenance and support are increasing rapidly. One popular approach to managing IT system maintenance is to prepare and use a FAQ (Frequently Asked Questions) system to manage and reuse systems knowledge. Such a FAQ system can help reduce the resolution time for each service incident ticket. However, there is a major problem where over time the knowledge in such FAQs tends to become outdated. Much of the knowledge captured in the FAQ requires periodic updates in response to new insights or new trends in the problems addressed in order to maintain its usefulness for problem resolution. These updates require a systematic approach to define the exact portion of the FAQ and its content. Therefore, we are working on a novel method to hierarchically structure the FAQ and automate the updates of its structure and content. We use structured information and the unstructured text information with the timelines of the information in the service incident tickets. We cluster the tickets by structured category information, by keywords, and by keyword modifiers for the unstructured text information. We also calculate an urgency score based on trends, resolution times, and priorities. We carefully studied the tickets of one of our projects over a 2.5-year time period. After the first 6 months we started to create FAQs and confirmed they improved the resolution times. We continued observing over the next 2 years to assess the ongoing effectiveness of our method for the automatic FAQ updates. We improved the ratio of tickets covered by the FAQ from 32.3% to 68.9% during this time. Also, the average time reduction of ticket resolution was between 31.6% and 43.9%. Subjective analysis showed more than 75% reported that the FAQ system was useful in reducing ticket resolution times.

Keywords: FAQ System, Resolution Time, Service Incident Tickets, IT System Maintenance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2439

61 Flood Modeling in Urban Area Using a Well-Balanced Discontinuous Galerkin Scheme on Unstructured Triangular Grids

Authors: Rabih Ghostine, Craig Kapfer, Viswanathan Kannan, Ibrahim Hoteit

Abstract:

Urban flooding resulting from a sudden release of water due to dam-break or excessive rainfall is a serious threatening environment hazard, which causes loss of human life and large economic losses. Anticipating floods before they occur could minimize human and economic losses through the implementation of appropriate protection, provision, and rescue plans. This work reports on the numerical modelling of flash flood propagation in urban areas after an excessive rainfall event or dam-break. A two-dimensional (2D) depth-averaged shallow water model is used with a refined unstructured grid of triangles for representing the urban area topography. The 2D shallow water equations are solved using a second-order well-balanced discontinuous Galerkin scheme. Theoretical test case and three flood events are described to demonstrate the potential benefits of the scheme: (i) wetting and drying in a parabolic basin (ii) flash flood over a physical model of the urbanized Toce River valley in Italy; (iii) wave propagation on the Reyran river valley in consequence of the Malpasset dam-break in 1959 (France); and (iv) dam-break flood in October 1982 at the town of Sumacarcel (Spain). The capability of the scheme is also verified against alternative models. Computational results compare well with recorded data and show that the scheme is at least as efficient as comparable second-order finite volume schemes, with notable efficiency speedup due to parallelization.

Keywords: Flood modeling, dam-break, shallow water equations, Discontinuous Galerkin scheme, MUSCL scheme.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 884

60 Application of Unstructured Mesh Modeling in Evolving SGE of an Airport at the Confluence of Multiple Rivers in a Macro Tidal Region

Authors: A. A. Purohit, M. M. Vaidya, M. D. Kudale

Abstract:

Among the various developing countries in the world like China, Malaysia, Korea etc., India is also developing its infrastructures in the form of Road/Rail/Airports and Waterborne facilities at an exponential rate. Mumbai, the financial epicenter of India is overcrowded and to relieve the pressure of congestion, Navi Mumbai suburb is being developed on the east bank of Thane creek near Mumbai. The government due to limited space at existing Mumbai Airports (domestic and international) to cater for the future demand of airborne traffic, proposes to build a new international airport near Panvel at Navi Mumbai. Considering the precedence of extreme rainfall on 26^th July 2005 and nearby townships being in a low-lying area, wherein new airport is proposed, it is inevitable to study this complex confluence area from a hydrodynamic consideration under both tidal and extreme events (predicted discharge hydrographs), to avoid inundation of the surrounding due to the proposed airport reclamation (1160 hectares) and to determine the safe grade elevation (SGE). The model studies conducted using the application of unstructured mesh to simulate the Panvel estuarine area (93 km²), calibration, validation of a model for hydraulic field measurements and determine the maxima water levels around the airport for various extreme hydrodynamic events, namely the simultaneous occurrence of highest tide from the Arabian Sea and peak flood discharges (Probable Maximum Precipitation and 26^th July 2005) from five rivers, the Gadhi, Kalundri, Taloja, Kasadi and Ulwe, meeting at the proposed airport area revealed that: (a) The Ulwe River flowing beneath the proposed airport needs to be diverted. The 120m wide proposed Ulwe diversion channel having a wider base width of 200 m at SH-54 Bridge on the Ulwe River along with the removal of the existing bund in Moha Creek is inevitable to keep the SGE of the airport to a minimum. (b) The clear waterway of 80 m at SH-54 Bridge (Ulwe River) and 120 m at Amra Marg Bridge near Moha Creek is also essential for the Ulwe diversion and (c) The river bank protection works on the right bank of Gadhi River between the NH-4B and SH-54 bridges as well as upstream of the Ulwe River diversion channel are essential to avoid inundation of low lying areas. The maxima water levels predicted around the airport keeps SGE to a minimum of 11m with respect to Chart datum of Ulwe Bundar and thus development is not only technologically-economically feasible but also sustainable. The unstructured mesh modeling is a promising tool to simulate complex extreme hydrodynamic events and provides a reliable solution to evolve optimal SGE of airport.

Keywords: Airport, hydrodynamics, hydrographs, safe grade elevation, tides.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 955

59 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSql), and gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2560

58 A Unified Framework for a Robust Conflict-Free Robot Navigation

Authors: S. Veera Ragavan, V. Ganapathy

Abstract:

Many environment specific methods and systems for Robot Navigation exist. However vast strides in the evolution of navigation technologies and system techniques create the need for a general unified framework that is scalable, modular and dynamic. In this paper a Unified Framework for a Robust Conflict-free Robot Navigation System that can be used for either a structured or unstructured and indoor or outdoor environments has been proposed. The fundamental design aspects and implementation issues encountered during the development of the module are discussed. The results of the deployment of three major peripheral modules of the framework namely the GSM based communication module, GIS Module and GPS module are reported in this paper.

Keywords: Localization, Sensor Fusion, Mapping, GIS, GPS, and Autonomous Mobile Robot Navigation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1902

57 Numerical Analysis of the Turbulent Flow around DTMB 4119 Marine Propeller

Authors: K. Boumediene, S. E. Belhenniche

Abstract:

This article presents a numerical analysis of a turbulent flow past DTMB 4119 marine propeller by the means of RANS approach; the propeller designed at David Taylor Model Basin in USA. The purpose of this study is to predict the hydrodynamic performance of the marine propeller, it aims also to compare the results obtained with the experiment carried out in open water tests; a periodical computational domain was created to reduce the unstructured mesh size generated. The standard kw turbulence model for the simulation is selected; the results were in a good agreement. Therefore, the errors were estimated respectively to 1.3% and 5.9% for KT and KQ.

Keywords: propeller flow, CFD simulation, hydrodynamic performance, RANS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2824

56 Chinese Event Detection Technique Based on Dependency Parsing and Rule Matching

Authors: Weitao Lin

Abstract:

To quickly extract adequate information from large-scale unstructured text data, this paper studies the representation of events in Chinese scenarios and performs the regularized abstraction. It proposes a Chinese event detection technique based on dependency parsing and rule matching. The method first performs dependency parsing on the original utterance, then performs pattern matching at the word or phrase granularity based on the results of dependent syntactic analysis, filters out the utterances with prominent non-event characteristics, and obtains the final results. The experimental results show the effectiveness of the method.

Keywords: Natural Language Processing, Chinese event detection, rules matching, dependency parsing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 105

55 Unified Structured Process for Health Analytics

Authors: Supunmali Ahangama, Danny Chiang Choon Poo

Abstract:

Health analytics (HA) is used in healthcare systems for effective decision making, management and planning of healthcare and related activities. However, user resistances, unique position of medical data content and structure (including heterogeneous and unstructured data) and impromptu HA projects have held up the progress in HA applications. Notably, the accuracy of outcomes depends on the skills and the domain knowledge of the data analyst working on the healthcare data. Success of HA depends on having a sound process model, effective project management and availability of supporting tools. Thus, to overcome these challenges through an effective process model, we propose a HA process model with features from rational unified process (RUP) model and agile methodology.

Keywords: Agile methodology, health analytics, unified process model, UML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2287

54 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, there have been a lot of efforts and studies are carried out in growing proficient tools for performing various tasks in big data. Recently big data have gotten a lot of publicity for their good reasons. Due to the large and complex collection of datasets it is difficult to process on traditional data processing applications. This concern turns to be further mandatory for producing various tools in big data. Moreover, the main aim of big data analytics is to utilize the advanced analytic techniques besides very huge, different datasets which contain diverse sizes from terabytes to zettabytes and diverse types such as structured or unstructured and batch or streaming. Big data is useful for data sets where their size or type is away from the capability of traditional relational databases for capturing, managing and processing the data with low-latency. Thus the out coming challenges tend to the occurrence of powerful big data tools. In this survey, a various collection of big data tools are illustrated and also compared with the salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3726

53 Developing OMS in IHL

Authors: Suzana Basaruddin, Haryani Haron, Siti Arpah Noodin

Abstract:

Managing knowledge of research is one way to ensure just in time information and knowledge to support research strategist and activities. Unfortunately researcher found the vital research knowledge in IHL (Institutions of Higher Learning) are scattered, unstructured and unorganized. Aiming on lay aside conceptual foundations for understanding and developing OMS (Organizational Memory System) to facilitate research in IHL, this research revealed ten factors contributed to the needs of research in the IHL and seven internal challenges of IHL in promoting research to their academic members. This study then suggested a comprehensive support of managing research knowledge using Organizational Memory System (OMS). Eight OMS characteristics to support research were identified. Finally the initial work in designing OMS was projected using knowledge taxonomy. All analysis is derived from pertinent research paper related to research in IHL and OMS. Further study can be conducted to validate and verify results presented.

Keywords: corporate memory, Institutions of Higher Learning, organizational memory system, research

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2044