Search results for: vectorization
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8

8 Cache Analysis and Software Optimizations for Faster on-Chip Network Simulations

Authors: Khyamling Parane, B. M. Prabhu Prasad, Basavaraj Talawar

Abstract:

Fast simulations are critical in reducing time to market for CMPs and SoCs. Several simulators have been used to evaluate the performance and power consumption of Networks-on-Chip. Researchers and designers rely upon these simulators for design space exploration of NoC architectures. Our experiments show that simulating large NoC topologies takes hours to several days to complete. To speed up the simulations, it is necessary to investigate and optimize the hotspots in the simulator source code. Among the several simulators available, we chose Booksim2.0, as it is extensively used in the NoC community. In this paper, we analyze the cache and memory system behaviour of Booksim2.0 to accurately monitor input-dependent performance bottlenecks. Our measurements show that cache and memory usage patterns vary widely based on the input parameters given to Booksim2.0. Based on these measurements, the cache configuration with the fewest misses was identified. To further reduce cache misses, we use software optimization techniques such as removal of unused functions, loop interchange, and replacement of the post-increment operator with the pre-increment operator for non-primitive data types. These techniques reduced cache misses by 18.52%, 5.34%, and 3.91%, respectively. We also employ thread parallelization and vectorization to improve the overall performance of Booksim2.0. The OpenMP programming model and SIMD instructions are used to parallelize and vectorize the most time-consuming portions of Booksim2.0. Speedups of 2.93x and 3.97x were observed for the Mesh topology with a 30 × 30 network size by employing thread parallelization and vectorization, respectively.
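
Booksim2.0 itself is written in C++, but the cache behaviour behind the loop-interchange optimization is language-independent. The NumPy sketch below (an illustration under our own assumptions, not the paper's code) traverses the same row-major array first along contiguous memory and then along a strided axis; the second loop touches a new cache line on almost every access and runs several times slower on typical hardware.

    import time
    import numpy as np

    N = 4000
    a = np.zeros((N, N))  # NumPy arrays are row-major (C order) by default

    start = time.perf_counter()
    for i in range(N):        # cache-friendly: inner axis walks contiguous memory
        a[i, :] += 1.0
    print(f"row-wise:    {time.perf_counter() - start:.3f} s")

    start = time.perf_counter()
    for j in range(N):        # cache-hostile: each access jumps N * 8 bytes
        a[:, j] += 1.0
    print(f"column-wise: {time.perf_counter() - start:.3f} s")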

Keywords: cache behaviour, network-on-chip, performance profiling, vectorization

Procedia PDF Downloads 163
7 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. Countless data are generated on social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data technology into the tourism industry will allow companies to determine where their customers have been and what they like. This information can then be used by businesses such as those managing visitor centers or hotels, and tourists can get a clear idea of places before visiting. On the technical side, natural language is processed by analysing the sentiment features of online reviews from tourists, and we supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction from travel reviews. For experimental validation, we constructed a web review database using a crawler and web scraping techniques to evaluate the effectiveness of our methodology. The sentences were first classified with the VADER and RoBERTa models to obtain the polarity of the reviews. In this paper, we study feature extraction methods such as Count Vectorization and TF-IDF Vectorization and implement a Convolutional Neural Network (CNN) classifier for sentiment analysis, deciding whether a tourist's attitude towards a destination is positive, negative, or simply neutral based on the review text posted online. The results demonstrated that, after pre-processing and cleaning the dataset, the CNN classifier achieved an accuracy of 96.12% for positive and negative sentiment analysis.
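
For readers unfamiliar with the two feature-extraction methods compared above, a minimal scikit-learn sketch follows; the example reviews are invented placeholders, not data from the paper's crawled database.

    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    reviews = [
        "The beach was beautiful and the hotel staff were friendly",
        "Terrible experience, the room was dirty and noisy",
        "An average stay, nothing special but nothing bad either",
    ]

    # Count Vectorization: raw term frequencies per review
    X_counts = CountVectorizer().fit_transform(reviews)

    # TF-IDF Vectorization: term frequencies down-weighted by document frequency
    X_tfidf = TfidfVectorizer().fit_transform(reviews)

    print(X_counts.shape, X_tfidf.shape)  # both: (3, vocabulary size)

Either sparse matrix can then be fed to a downstream classifier such as the CNN described in the paper.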

Keywords: count vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis

Procedia PDF Downloads 47
6 Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models

Authors: Danielle Shackley, Yetunde Folajimi

Abstract:

As more people turn to the internet seeking health-related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores to text, classifying it as positive, neutral, or negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing, reliably labeled health article headlines that were supplemented with health information collected about COVID-19 from social media sources. We started with data preprocessing and tested various vectorization methods such as Count and TF-IDF vectorization. We implemented three Naive Bayes classifier models: Bernoulli, Multinomial, and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis, then reproduced the same models with the feature added. We evaluated the models using precision and accuracy scores. The initial Bernoulli model performed with 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels performed with 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, the precision score improved by a margin of 1.9% with the Complement model. Future expansion of this work could include replicating the experiment and substituting a deep learning neural network model for the Naive Bayes classifiers.
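
A minimal sketch of the benchmark setup, assuming scikit-learn; the headlines and polarity scores below are invented placeholders, not the paper's dataset.

    import numpy as np
    from scipy.sparse import csr_matrix, hstack
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import BernoulliNB, MultinomialNB, ComplementNB
    from sklearn.metrics import accuracy_score, precision_score

    headlines = [
        "Miracle cure reverses COVID-19 overnight",
        "Clinical trial shows vaccine reduces severe illness",
        "Doctors hide this one weird trick from patients",
        "Health agency updates mask guidance for hospitals",
    ]
    labels = np.array([1, 0, 1, 0])                       # 1 = fake, 0 = reliable
    sentiment = csr_matrix([[0.8], [0.1], [0.9], [0.0]])  # e.g. polarity scores

    X_text = TfidfVectorizer().fit_transform(headlines)   # benchmark features
    X_full = hstack([X_text, sentiment])                  # + sentiment feature

    for model in (BernoulliNB(), MultinomialNB(), ComplementNB()):
        for name, X in (("baseline", X_text), ("with sentiment", X_full)):
            pred = model.fit(X, labels).predict(X)
            print(type(model).__name__, name,
                  "precision:", precision_score(labels, pred),
                  "accuracy:", accuracy_score(labels, pred))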

Keywords: sentiment analysis, Naive Bayes model, natural language processing, topic analysis, fake health news classification model

Procedia PDF Downloads 62
5 Benchmarking BERT-Based Low-Resource Language: Case Uzbek NLP Models

Authors: Jamshid Qodirov, Sirojiddin Komolov, Ravilov Mirahmad, Olimjon Mirzayev

Abstract:

Nowadays, natural language processing tools play a crucial role in our daily lives, covering various text processing techniques. Very advanced models exist for widely supported languages such as English and Russian, but for languages such as Uzbek, NLP models have been developed only recently, and only a few NLP models exist for the Uzbek language. Moreover, no existing work shows how the Uzbek NLP models behave in different situations and when to use each of them. This work tries to close this gap by comparing the Uzbek NLP models available at the time this article was written. We compare the models in two different scenarios, sentiment analysis and sentence similarity, which are implementations of the two most common problems in industry: classification and similarity. Another outcome of this work is two datasets, for classification and sentence similarity in the Uzbek language, which we generated ourselves and which can be useful in both industry and academia.
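
The sentence-similarity scenario can be sketched with Hugging Face transformers and mean pooling; "some-org/uzbek-bert" below is a hypothetical placeholder, not one of the models actually benchmarked in the paper.

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("some-org/uzbek-bert")  # placeholder name
    model = AutoModel.from_pretrained("some-org/uzbek-bert")

    def embed(sentence: str) -> torch.Tensor:
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state   # (1, tokens, dim)
        return hidden.mean(dim=1).squeeze(0)             # mean-pooled sentence embedding

    a = embed("Bugun havo juda yaxshi.")   # "The weather is very good today."
    b = embed("Ob-havo ajoyib.")           # "The weather is wonderful."
    print(f"cosine similarity: {torch.cosine_similarity(a, b, dim=0).item():.3f}")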

Keywords: NLP, benchmark, BERT, vectorization

Procedia PDF Downloads 16
4 Accessibility Analysis of Urban Green Space in Zadar Settlement, Croatia

Authors: Silvija Šiljeg, Ivan Marić, Ante Šiljeg

Abstract:

The accessibility of urban green spaces (UGS) is an integral element of quality of life. Due to rapid urbanization, UGS studies have become a key element in urban planning. The potential benefits of such spaces for city inhabitants are frequently analysed. A functional transport network system and the optimal spatial distribution of urban green surfaces are the prerequisites for maintaining the environmental equilibrium of the urban landscape. An accessibility analysis was conducted as part of the Urban Green Belts Project (UGB). The development of a GIS database for Zadar was the first step in generating the UGS accessibility indicator. Data were collected using supervised classification of multispectral LANDSAT images and manual vectorization of digital orthophoto images (DOF). In the first phase of research, UGS accessibility was analysed according to the ANGSt standard. The accessibility indicator was generated on the basis of seven objective measurements, which included average UGS surface per capita and accessibility according to six functional levels of green surfaces. The generated indicator was compared with subjective measurements obtained through a survey (718 respondents) within statistical units. The collected data reflected individual assessments and subjective evaluations of UGS accessibility. This study highlighted the importance of using objective and subjective measures in the process of understanding the accessibility of urban green surfaces. It may be concluded that when evaluating UGS accessibility, residents emphasize the immediate residential environment, ignoring higher UGS functional levels. It was also concluded that large areas of UGS within a city do not necessarily generate similar satisfaction with accessibility. The heterogeneity of the output results may serve as guidelines for the further development of a functional UGS city network.
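
As a sketch only: a buffer-based accessibility measure of the ANGSt type (share of residents within 300 m of any green space) can be computed in a few lines of GeoPandas, assuming hypothetical layers ugs.shp and dwellings.shp in a projected, metric CRS.

    import geopandas as gpd

    ugs = gpd.read_file("ugs.shp")          # green-space polygons (metric CRS assumed)
    homes = gpd.read_file("dwellings.shp")  # residential locations

    service_area = ugs.buffer(300).unary_union           # 300 m zone around any UGS
    served = homes[homes.geometry.within(service_area)]
    print(f"dwellings within 300 m of UGS: {len(served) / len(homes):.1%}")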

Keywords: urban green spaces (UGS), accessibility indicator, subjective and objective measurements, Zadar

Procedia PDF Downloads 210
3 Land Use Dynamics of Ikere Forest Reserve, Nigeria Using Geographic Information System

Authors: Akintunde Alo

Abstract:

The incessant encroachment into the forest ecosystem by farmers and local contractors constitutes a major threat to the conservation of genetic resources and biodiversity in Nigeria. To propose a viable monitoring system, this study employed Geographic Information System (GIS) technology to assess the changes that occurred over a period of five years (between 2011 and 2016) in Ikere forest reserve. Landsat imagery of the forest reserve was obtained, and ground-truth coordinates of benchmark places within the reserve were used to geo-reference the acquired satellite imagery. Supervised classification, image processing, vectorization, and map production were carried out using ArcGIS. Various land use systems within the forest ecosystem were digitized into polygons of different types and colours for 2011 and 2016, while roads were represented with lines of different thicknesses and colours. Of the six land-use classes delineated, grassland increased from 26.50% of the total land area in 2011 to 45.53% in 2016, a percentage change of 71.81%. Plantations of Gmelina arborea and Tectona grandis, on the other hand, decreased from 62.16% in 2011 to 27.41% in 2016. Farmland and degraded land recorded percentage changes of about 176.80% and 8.70%, respectively, from 2011 to 2016. Overall, the rate of deforestation in the study area is increasing and becoming severe. About 72.59% of the total land area has been converted to non-forestry uses, while the remaining 27.41% is occupied by plantations of Gmelina arborea and Tectona grandis. Interestingly, over 55% of the 2011 plantation area had changed to grassland or been converted to farmland and degraded land by 2016. The rate of change over time was about 9.79% annually. Based on these results, rapid action to prevail on the encroachers to stop deforestation and to encourage re-afforestation in the study area is recommended.
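
The reported figures follow the standard percentage-change formula, change = (share_2016 - share_2011) / share_2011 * 100, which a few lines of Python confirm for the shares given above:

    def pct_change(share_2011: float, share_2016: float) -> float:
        return (share_2016 - share_2011) / share_2011 * 100

    print(f"grassland:  {pct_change(26.50, 45.53):+.2f} %")  # +71.81 %, as reported
    print(f"plantation: {pct_change(62.16, 27.41):+.2f} %")  # -55.90 %, the 'over 55%' lost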

Keywords: land use change, forest reserve, satellite imagery, geographical information system

Procedia PDF Downloads 322
2 Development of Academic Software for Medial Axis Determination of Porous Media from High-Resolution X-Ray Microtomography Data

Authors: S. Jurado, E. Pazmino

Abstract:

Determination of the medial axis of a porous media sample is a non-trivial problem of interest for several disciplines, e.g., hydrology, fluid dynamics, contaminant transport, filtration, oil extraction, etc. However, the computational tools available to researchers are limited and restricted. The primary aim of this work was to develop a series of algorithms to extract porosity, medial axis structure, and pore-throat size distributions from porous media domains. A complementary objective was to provide the algorithms as free computational software available to the academic community, comprising researchers and students interested in 3D data processing. The burn algorithm was tested on porous media data obtained from High-Resolution X-Ray Microtomography (HRXMT) and on idealized computer-generated domains. The real data and idealized domains were discretized into voxel domains of 550³ elements and binarized to denote solid and void regions in order to determine porosity. Subsequently, the algorithm identifies the layer of void voxels next to the solid boundaries. An iterative process removes, or 'burns', void voxels layer by layer until all the void space is characterized. Multiple strategies were tested to optimize the execution time and use of computer memory, i.e., segmentation of the overall domain into subdomains, vectorization of operations, and extraction of single burn layer data during the iterative process. The medial axis was determined by identifying regions where burnt layers collide. The final medial axis structure was refined to avoid concave-grain effects and was used to determine the pore-throat size distribution. A graphical user interface was developed to encompass all these algorithms, including the generation of idealized porous media domains. The software accepts HRXMT data as input, calculates porosity, medial axis, and pore-throat size distribution, and provides output in tabular and graphical formats. Preliminary tests of the software developed during this study achieved medial axis, pore-throat size distribution, and porosity determination of 100³, 320³, and 550³ voxel porous media domains in 2, 22, and 45 minutes, respectively, on a personal computer (Intel i7 processor, 16 GB RAM). These results indicate that the software is a practical and accessible tool for postprocessing HRXMT data in the academic community.
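
A minimal NumPy/SciPy sketch of the burn idea described above, run on a toy random domain rather than HRXMT data: void voxels are peeled layer by layer from the solid boundary, and the layer index at which each voxel burns approximates its distance to the nearest solid. Voxels where opposing burn fronts collide form a rough, unrefined medial-axis candidate set.

    import numpy as np
    from scipy.ndimage import binary_erosion

    rng = np.random.default_rng(0)
    solid = rng.random((64, 64, 64)) < 0.4   # toy porous domain, True = solid
    void = ~solid
    print(f"porosity: {void.mean():.3f}")    # void fraction of the binarized domain

    burn = np.zeros(void.shape, dtype=int)   # layer number at which each voxel burns
    remaining = void.copy()
    layer = 0
    while remaining.any():
        layer += 1
        interior = binary_erosion(remaining)  # peel the outermost void layer
        burn[remaining & ~interior] = layer
        remaining = interior
    print(f"layers burned: {layer}")          # max burn number ~ largest pore radius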

Keywords: medial axis, pore-throat distribution, porosity, porous media

Procedia PDF Downloads 86
1 Development of Building Information Modeling in Property Industry: Beginning with Building Information Modeling Construction

Authors: B. Godefroy, D. Beladjine, K. Beddiar

Abstract:

In France, construction BIM actors commonly evoke the gains of BIM for building operation through integration of the life cycle of a building; standardization at level 7 of development would achieve this stage of the digital model. The owners include local public authorities, social landlords, public institutions (health and education), enterprises, and facilities management companies. They have a dual role: owner and manager of their housing complex. In a context of financial constraint, operational BIM aims to control costs, support long-term investment choices, renew the portfolio, and enable environmental standards to be met. It assumes knowledge of the existing buildings, marked by their size and complexity. The information sought must be synthetic and structured; in general, it concerns a real estate complex. We conducted a study with professionals about their concerns and their ways of using BIM, to see how owners could benefit from this development. Keeping in mind the recurring questions from project management about the needs of operators, we tested the following stages: 1) instil a minimal BIM culture in the operator's multidisciplinary teams, then by business line; 2) learn, through BIM tools, how to adapt each trade to operations; 3) understand the role and creation of a graphic and technical database management system, and determine the components of its library according to the operator's needs; 4) identify the cross-functional interventions of its managers by business line (operations, technical, information system, purchasing, and legal aspects); 5) set an internal protocol and define the impact of BIM on the digital strategy. In addition, continuity of management through the integration of construction models in the operation phase raises the question of interoperability: controlling the production of IFC files from the operator's proprietary format, and the export and import processes, a solution rivalled by the traditional method of vectorizing paper plans. Companies that digitize housing complexes, and those in facilities management, produce IFC files directly according to their needs, without recourse to the construction model; they produce business models for operation and standardize the components and equipment that are useful for coding. We observed the consequences of using BIM in the property industry and made the following observations: a) the value of data prevails over graphics, and 3D is little used; b) the owner must, through his organization, promote the feedback of technical management information during the design phase; c) the operator's reflection on outsourcing concerns the acquisition of its information system and related services, weighing the risks and costs of internal or external development. This study allows us to highlight: i) the need for an internal organization of operators prior to responding to construction management; ii) the evolution towards automated methods for creating models dedicated to operation, for which a specialization would be required; iii) a review of project management communication: since management continuity does not revolve around the building model alone, it must take the operator's environment into account and reflect on its scope of action.

Keywords: information system, interoperability, models for exploitation, property industry

Procedia PDF Downloads 118