Search results for: documents clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1468

1258 Hybrid Hierarchical Routing Protocol for WSN Lifetime Maximization

Authors: H. Aoudia, Y. Touati, E. H. Teguig, A. Ali Cherif

Abstract:

Conceiving and developing routing protocols for wireless sensor networks requires consideration of constraints such as network lifetime and energy consumption. In this paper, we propose a hybrid hierarchical routing protocol named HHRP that combines a clustering mechanism with multipath optimization, taking into account residual energy and RSSI measures. HHRP dynamically classifies nodes into clusters in which coordinator nodes with extra privileges are able to manipulate messages, aggregate data and ensure transmission between nodes according to TDMA and CDMA schedules. The reconfiguration of the network is carried out dynamically based on a threshold value associated with the number of nodes belonging to the smallest cluster. To show the effectiveness of the proposed approach, a comparative study of HHRP and the LEACH protocol is illustrated in simulations.

Keywords: routing protocol, optimization, clustering, WSN

Procedia PDF Downloads 437
1257 Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation

Authors: Mario Kubek, Herwig Unger

Abstract:

Centroid terms are single words that semantically and topically characterise text documents and so may serve as their very compact representation in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. Thus, a new graph-based paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first, promising experimental results demonstrate the usefulness of the centroid-based search procedure. It is shown that especially the routing of search queries in interactive and decentralised search systems can be greatly improved by applying this approach. A detailed discussion on further fields of its application completes this contribution.

Keywords: search algorithm, centroid, query, keyword, co-occurrence, categorisation

Procedia PDF Downloads 260
1256 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

The increasing amount of collected data has limited the performance of current analysis algorithms. Thus, developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has raised significant interest. In this paper, a modified k-means-based algorithm is developed and tested. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee convergence. Experiments conducted on a real data set show its high performance when compared with the original k-means version.
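The abstract does not spell out the modified algorithm or its stop criterion, so the following is only a minimal sketch of a k-means-style loop that assigns points by City Block (Manhattan) distance. The mean-based centroid update and the stop rule (total centroid shift below a tolerance) are assumptions made for illustration, and the names `manhattan` and `kmeans_manhattan` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def manhattan(a: Point, b: Point) -> float:
    """City Block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans_manhattan(
    points: List[Point],
    initial_centroids: List[Point],
    max_iter: int = 100,
    tol: float = 1e-9,
) -> Tuple[List[int], List[List[float]]]:
    """Cluster points, assigning each one to its nearest centroid under Manhattan distance.

    Assumed stop rule: halt when the total centroid movement is at most `tol`.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Manhattan distance.
        labels = [
            min(range(len(centroids)), key=lambda j: manhattan(p, centroids[j]))
            for p in points
        ]
        # Update step: coordinate-wise mean of each cluster (empty clusters keep their centroid).
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)
        shift = sum(manhattan(o, n) for o, n in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return labels, centroids
```

A coordinate-wise median update (k-medians) is the exact minimizer of the summed Manhattan distance and may be a better fit in practice; the mean is used here only to stay close to the k-means wording of the abstract.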

Keywords: pattern recognition, global terrorism database, Manhattan distance, k-means clustering, terrorism data analysis

Procedia PDF Downloads 361
1255 Behavior of Printing Inks on Historical Documents Subjected to Cold RF Plasma Discharges

Authors: Dorina Rusu, Emil Ghiocel Ioanid, Marta Ursescu, Ana Maria Vlad, Mihaela Popescu

Abstract:

During the last decades, cold plasma discharges have been the subject of numerous studies concerning applications in the cultural heritage field, concentrating especially on the ecological and non-invasive aspects of these conservation procedures. The conservation treatment using cold plasma is based, on the one hand, on the well-known property of plasma discharges to inactivate contaminant biological species and, on the other hand, on the surface cleaning effect. Moreover, the plasma discharge functionalizes the treated surface, allowing subsequent deposition of protective layers. The paper presents the behavior of printing inks on historical documents treated in cold RF plasma. Two types of printing inks were studied, namely red and black ink, used in a religious book published in the 19th century. SEM-EDX analysis identified the two inks as carbon black ink (C presence in the EDX spectrum) and cinnabar-based red ink (Hg and S lines in the spectrum), a result confirmed by XRF analysis. The experiments were performed on paper samples written with laboratory-made inks of similar composition to the inks identified on historical documents. The samples were subjected to RF plasma discharge, operating in a nitrogen gaseous medium at 1.2 MHz frequency and low pressure (0.5 mbar), in self-designed equipment for the application of conservation treatments on naturally aged paper supports. The impact of plasma discharge on the inks was evaluated by SEM, XRD and color analysis. The color analysis revealed a slight discoloration of the cinnabar ink on the historical document. SEM and XRD analyses were carried out in an attempt to elucidate the process responsible for the color modification.

Keywords: RF plasma, printing inks, historical documents, surface cleaning effect

Procedia PDF Downloads 418
1254 Popularization of the Communist Manifesto in 19th Century Europe

Authors: Xuanyu Bai

Abstract:

“The Communist Manifesto”, written by Karl Marx and Friedrich Engels, is one of the most significant documents in history, spanning fields including economics, politics, sociology and philosophy. Instead of discussing the Communist ideas presented in the Communist Manifesto, the essay explores the reasons that contributed to the popularization of the document and its influence on political revolutions in 19th century Europe, concentrating on the document itself along with other primary and secondary sources and temporal artwork. Combining details from the Communist Manifesto and other documents, the essay argues that Marx’s writing style and word choice, his convincing notions about a new society dominated by the proletariat, and the revolutionary idea of class destruction led to the popularization of the Communist Manifesto and influenced later political revolutions.

Keywords: communist manifesto, Marx, Engels, capitalism

Procedia PDF Downloads 110
1253 Altered Network Organization in Mild Alzheimer's Disease Compared to Mild Cognitive Impairment Using Resting-State EEG

Authors: Chia-Feng Lu, Yuh-Jen Wang, Shin Teng, Yu-Te Wu, Sui-Hing Yan

Abstract:

Brain functional networks based on resting-state EEG data were compared between patients with mild Alzheimer’s disease (mAD) and matched patients with the amnestic subtype of mild cognitive impairment (aMCI). We integrated the time–frequency cross mutual information (TFCMI) method, used to estimate the EEG functional connectivity between cortical regions, with network analysis based on graph theory to further investigate the alterations of functional networks in the mAD group compared with the aMCI group. We aimed to investigate the changes in network integrity, local clustering, information processing efficiency, and fault tolerance in mAD brain networks for different frequency bands based on several topological properties, including degree, strength, clustering coefficient, shortest path length, and efficiency. Results showed disruptions of network integrity and reductions of network efficiency in mAD, characterized by lower degree, decreased clustering coefficient, higher shortest path length, and reduced global and local efficiencies in the delta, theta, beta2, and gamma bands. These significant changes in network organization can assist in discriminating mAD from aMCI in clinical practice.
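To make the topological properties named in the abstract concrete, here is a minimal sketch that computes degree, clustering coefficient, characteristic shortest path length, and global efficiency. It assumes an unweighted, undirected graph (for example, a thresholded connectivity matrix); the study itself also uses strength, which needs weighted edges and is not covered here. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[int, Set[int]]


def build_graph(n: int, edges: Iterable[Tuple[int, int]]) -> Graph:
    """Adjacency sets for an undirected, unweighted graph on nodes 0..n-1."""
    adj: Graph = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def clustering_coefficient(adj: Graph, v: int) -> float:
    """Fraction of pairs of v's neighbours that are themselves connected."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def bfs_distances(adj: Graph, src: int) -> Dict[int, int]:
    """Hop distances from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def path_length_and_efficiency(adj: Graph) -> Tuple[float, float]:
    """Mean shortest path over reachable pairs, and global efficiency (mean of 1/d)."""
    n = len(adj)
    total_d, total_inv, reachable = 0, 0.0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total_d += d
                total_inv += 1.0 / d
                reachable += 1
    mean_path = total_d / reachable if reachable else float("inf")
    efficiency = total_inv / (n * (n - 1)) if n > 1 else 0.0
    return mean_path, efficiency
```

Unreachable pairs contribute zero to the efficiency sum, which is why efficiency is often preferred over path length for sparse or fragmented networks.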

Keywords: EEG, functional connectivity, graph theory, TFCMI

Procedia PDF Downloads 404
1252 Computer Fraud from the Perspective of Iran's Law and International Documents

Authors: Babak Pourghahramani

Abstract:

Computer fraud is one of the modern crimes against property and ownership in cyberspace. Despite being modern, this crime has its roots in the principles of religious jurisprudence. In some cases, namely when the computer is considered a device for committing the crime, it is compatible with traditional regulations. In addition, some computer frauds that take place in the context of electronic exchanges are considered crimes under the E-commerce Law (approved in 2003). However, these regulations are flawed, and until recent years there was no comprehensive law in this regard. The Computer Crime Act, approved on 2009/26/5, partly filled this legal vacuum. The present study investigates computer fraud according to Iran's Computer Crime Act, taking international documents into consideration.

Keywords: fraud, cyber fraud, computer fraud, classic fraud, computer crime

Procedia PDF Downloads 308
1251 Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

Authors: Paulo Fernando Silva Filho, Elcio Hideiti Shiguemori

Abstract:

The selection of specific landmarks for an Unmanned Aerial Vehicle’s visual navigation system based on Automatic Landmark Recognition has a significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which would also result in poor precision. This work aims to develop an automatic landmark selection method that takes the image of the flight area and identifies the best landmarks to be recognized by the Visual Navigation Landmark Recognition System. The criterion for selecting a landmark is based on features detected by ORB or AKAZE and on edge information for each possible landmark. Results have shown that the disposition of possible landmarks is quite different from human perception.

Keywords: clustering, edges, feature points, landmark selection, X-means

Procedia PDF Downloads 255
1250 Clustering Based and Centralized Routing Table Topology of Control Protocol in Mobile Wireless Sensor Networks

Authors: Mbida Mohamed, Ezzati Abdellah

Abstract:

A major challenge in wireless sensor networks (WSN) is to save energy and achieve a long network lifetime without a high rate of information loss. Topology control (TC) protocols are designed so that the network is divided and has a standard system for exchanging packets between nodes. In this article, we propose a clustering-based and centralized routing table TC protocol (CBCRT), which delegates a leader node that encapsulates a single routing table in every cluster. Hence, if a node wants to send packets to the sink, it requests the routing table information of the current cluster from the leader node in order to route the packet.

Keywords: mobile wireless sensor networks, routing, topology of control, protocols

Procedia PDF Downloads 245
1249 Analysis of State Documents on Environmental Awareness Aspects in Kazakhstan

Authors: Y. A. Kumar

Abstract:

Environmental awareness issues in Kazakhstan are one of the most undermined topics both among the public community and in terms of state rhetoric. In the context of official state documents, so far only two official environmental codes and national programs called Zhasyl Kazakhstan were introduced in the country in 2021. On the one hand, the Environmental Code was introduced to modernize, frame and enlist the main legislative aspects of various sectors of environmental law in Kazakhstan. On the other hand, the Zhasyl Kazakhstan Program has been implemented as a state program to address, through numerous environmental projects, various environmental issues ranging from air pollution to waste management, as well as aspects related to ecological education and low environmental awareness. In this regard, the main goal of this paper is to critically analyze the main content of both of these documents, with a particular focus on sections related to raising environmental awareness. To that end, this paper applied a subjective-based content analysis in order to identify interesting insights on regulatory legal aspects, future research streams, and improved legislative frameworks in the context of environmental awareness. Apart from that, five open-ended questions were sent to the Ministry of Ecology, Geology and Natural Resources to obtain primary data on the state’s view with regard to current, previous, recent and future aspects of environmental awareness issues in the country.

Keywords: Kazakhstan, environmental awareness, environmental code, Zhasyl Kazakhstan, content analysis

Procedia PDF Downloads 68
1248 Evaluation of Security and Performance of Master Node Protocol in the Bitcoin Peer-To-Peer Network

Authors: Muntadher Sallal, Gareth Owenson, Mo Adda, Safa Shubbar

Abstract:

Bitcoin is a digital currency based on a peer-to-peer network to propagate and verify transactions. Bitcoin is gaining wider adoption than any previous crypto-currency. However, the mechanism of peers randomly choosing logical neighbors without any knowledge of the underlying physical topology can cause a delay overhead in information propagation, which makes the system vulnerable to double-spend attacks. Aiming to alleviate the propagation delay problem, this paper introduces proximity-aware extensions to the current Bitcoin protocol, named Master Node Based Clustering (MNBC). The ultimate purpose of the proposed protocol, which is based on how clusters are formed and how nodes can define their membership, is to improve the information propagation delay in the Bitcoin network. In the MNBC protocol, physical internet connectivity increases and the number of hops between nodes decreases by assigning nodes to be responsible for maintaining clusters based on physical internet proximity. We show, through simulations, that the proposed protocol defines better clustering structures that optimize the performance of transaction propagation over the Bitcoin protocol. Partition attacks in both the MNBC protocol and the Bitcoin network are also evaluated in this paper. Evaluation results show that even though the Bitcoin network is more resistant to the partitioning attack than the MNBC protocol, more resources need to be spent to split the network in the MNBC protocol, especially with a higher number of nodes.

Keywords: Bitcoin network, propagation delay, clustering, scalability

Procedia PDF Downloads 98
1247 Visual Template Detection and Compositional Automatic Regular Expression Generation for Business Invoice Extraction

Authors: Anthony Proschka, Deepak Mishra, Merlyn Ramanan, Zurab Baratashvili

Abstract:

Small and medium-sized businesses receive over 160 billion invoices every year. Since these documents exhibit many subtle differences in layout and text, extracting structured fields such as sender name, amount, and VAT rate from them automatically is an open research question. In this paper, existing work in template-based document extraction is extended, and a system is devised that is able to reliably extract all required fields for up to 70% of all documents in the data set, more than any other previously reported method. The approaches are described for 1) detecting through visual features which template a given document belongs to, 2) automatically generating extraction rules for a given new template by composing regular expressions from multiple components, and 3) computing confidence scores that indicate the accuracy of the automatic extractions. The system can generate templates with as few as one training sample and only requires the ground truth field values instead of detailed annotations such as bounding boxes that are hard to obtain. The system is deployed and used inside commercial accounting software.
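As a rough illustration of step 2, composing an extraction rule from reusable regular-expression components, consider the sketch below. The component patterns, the `COMPONENTS` table, and the `compose_rule` helper are hypothetical and are not taken from the paper; the actual system generates its rules automatically from ground truth field values.

```python
import re
from typing import Pattern

# Hypothetical reusable components; a real system would hold many more variants.
COMPONENTS = {
    # Monetary amount with optional thousands separators and two decimals, e.g. 1,234.56
    "amount": r"(?P<amount>\d{1,3}(?:[.,]\d{3})*[.,]\d{2})",
    # VAT rate as a percentage, e.g. 19 % or 7.5%
    "vat_rate": r"(?P<vat_rate>\d{1,2}(?:[.,]\d+)?)\s*%",
}


def compose_rule(label: str, component: str, gap: str = r"\s*:?\s*") -> Pattern[str]:
    """Compose a rule as: literal label + flexible gap + value component."""
    return re.compile(re.escape(label) + gap + COMPONENTS[component], re.IGNORECASE)


if __name__ == "__main__":
    text = "Invoice 17\nVAT 19 %\nTotal: 1,234.56 EUR"
    total = compose_rule("Total", "amount").search(text)
    vat = compose_rule("VAT", "vat_rate").search(text)
    print(total.group("amount") if total else None)
    print(vat.group("vat_rate") if vat else None)
```

Keeping the label, gap, and value as separate pieces means a new template only needs a new combination of existing parts rather than a hand-written pattern.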

Keywords: data mining, information retrieval, business, feature extraction, layout, business data processing, document handling, end-user trained information extraction, document archiving, scanned business documents, automated document processing, F1-measure, commercial accounting software

Procedia PDF Downloads 103
1246 A Transformer-Based Question Answering Framework for Software Contract Risk Assessment

Authors: Qisheng Hu, Jianglei Han, Yue Yang, My Hoa Ha

Abstract:

When a company is considering purchasing software for commercial use, contract risk assessment is critical to identify risks to mitigate the potential adverse business impact, e.g., security, financial and regulatory risks. Contract risk assessment requires reviewers with specialized knowledge and time to evaluate the legal documents manually. Specifically, validating contracts for a software vendor requires the following steps: manual screening, interpreting legal documents, and extracting risk-prone segments. To automate the process, we proposed a framework to assist legal contract document risk identification, leveraging pre-trained deep learning models and natural language processing techniques. Given a set of pre-defined risk evaluation problems, our framework utilizes the pre-trained transformer-based models for question-answering to identify risk-prone sections in a contract. Furthermore, the question-answering model encodes the concatenated question-contract text and predicts the start and end position for clause extraction. Due to the limited labelled dataset for training, we leveraged transfer learning by fine-tuning the models with the CUAD dataset to enhance the model. On a dataset comprising 287 contract documents and 2000 labelled samples, our best model achieved an F1 score of 0.687.
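The start and end position prediction mentioned above is usually decoded by picking the highest-scoring valid span. The sketch below shows that decoding step only, assuming the model has already produced one start score and one end score per token; the `best_span` helper, its `max_len` limit, and the additive scoring are illustrative assumptions rather than details from the paper.

```python
from typing import List, Tuple


def best_span(
    start_scores: List[float],
    end_scores: List[float],
    max_len: int,
) -> Tuple[int, int, float]:
    """Return (start, end, score) maximizing start_scores[i] + end_scores[j].

    Only spans with i <= j and at most max_len tokens are considered.
    """
    best = (0, 0, float("-inf"))
    for i, s in enumerate(start_scores):
        last = min(len(end_scores), i + max_len)
        for j in range(i, last):
            score = s + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


if __name__ == "__main__":
    tokens = ["The", "vendor", "may", "terminate"]
    i, j, _ = best_span([0.1, 2.0, 0.3, 0.0], [0.0, 0.5, 1.5, 3.0], max_len=2)
    print(tokens[i : j + 1])
```

The length cap keeps the decoder from returning an implausibly long clause just because a distant token has a high end score.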

Keywords: contract risk assessment, NLP, transfer learning, question answering

Procedia PDF Downloads 105
1245 Structural Challenges of Social Integration of Immigrants in Iran: Investigating the Status of Providing Citizenship and Social Services

Authors: Iman Shabanzadeh

Abstract:

In terms of its geopolitical position, Iran has been one of the main centers of migration movements in the world in recent decades. However, the policy makers' lack of preparation in completing the cycle of social integration of these immigrants, especially the second and third generation, has caused these people to always be prone to leave the country and immigrate to developed and industrialized countries. In this research, the issue of integration of immigrants in Iran from the perspective of four indicators, "Identity Documents", "Access to Banking Services", "Access to Health and Treatment Services" and "Obtaining a Driver's License" will be analyzed. The research method is descriptive-analytical. To collect information, library and document sources in the field of laws and regulations related to immigrants' rights in Iran, semi-structured interviews with experts have been used. The investigations of this study show that none of the residence documents of immigrants in Iran guarantee the full enjoyment of basic citizenship rights for them. In fact, the function of many of these identity documents, such as the census card, educational support card, etc., is only to prevent crossing the border, and none of them guarantee the basic rights of citizenship. Therefore, for many immigrants, the difference between legality and illegality is only in the risk of crossing the border, and this has led to the spread of the habit of illegal presence for them. Despite this, it seems that there is no clear and coherent policy framework around the issue of foreign immigrants in the country. This policy incoherence can be clearly seen in the diversity and plurality of identity and legal documents of the citizens present in the country and the policy maker's lack of planning to integrate and organize the identity of this huge group. 
Examining the differences and socioeconomic inequalities between immigrants and the native Iranian population shows that immigrants have been poorly integrated into the structures of Iranian society from an economic and social point of view.

Keywords: immigrants, social integration, citizen services, structural inequality

Procedia PDF Downloads 28
1244 BIM-Based Tool for Sustainability Assessment and Certification Documents Provision

Authors: Taki Eddine Seghier, Mohd Hamdan Ahmad, Yaik-Wah Lim, Samuel Opeyemi Williams

Abstract:

The assessment of building sustainability to achieve a specific green benchmark and the preparation of the documents required to receive a green building certification are both considered major challenges for a green building design team. However, this labor-intensive and time-consuming process can take advantage of the available Building Information Modeling (BIM) features such as material take-off and scheduling. Furthermore, the workflow can be automated in order to track potentially achievable credit points and provide rating feedback for several design options by using integrated Visual Programming (VP) to handle the parameters stored within the BIM model. Hence, this study proposes a BIM-based tool that uses Green Building Index (GBI) rating system requirements as a unique input case to evaluate building sustainability in the design stage of the building project life cycle. The tool covers two key models: first, a model for data extraction, calculation and the classification of achievable credit points in a green template; second, a model for the generation of the documents required for green building certification. The tool was validated on a BIM model of a residential building, and it serves as a proof of concept that building sustainability assessment for GBI certification can be automatically evaluated and documented through BIM.

Keywords: green building rating system, GBRS, building information modeling, BIM, visual programming, VP, sustainability assessment

Procedia PDF Downloads 304
1243 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay and bandwidth are the prime issues of wireless sensor networks (WSN). Energy usage optimization and efficient bandwidth utilization are important issues in WSN. Event-triggered data aggregation facilitates such optimization for the event-affected area in a WSN. Reliable delivery of critical information to the sink node is also a major challenge of WSN. To tackle these issues, we propose an event-driven dynamic clustering and data aggregation scheme for WSN that enhances the lifetime of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever an event is triggered, the event-triggered node selects the cluster head. (2) The cluster head gathers data from sensor nodes within the cluster. (3) The cluster head node identifies and classifies the events in the collected data using a Bayesian classifier. (4) Data is aggregated using a statistical method. (5) The cluster head discovers the paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data is critical, the cluster head sends the aggregated data over multiple paths for reliable data communication. (7) Otherwise, the aggregated data is transmitted towards the sink node over the single path that has the most bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.
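Steps (5) to (7) of the scheme amount to a routing decision at the cluster head. The sketch below illustrates that decision under stated assumptions: the `Path` record, the `select_paths` helper, and in particular the ranking rule (bandwidth times residual energy, with shorter distance as a tie-breaker) are hypothetical, since the abstract does not say how the three quantities are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    hops: List[str]          # node identifiers from cluster head to sink
    residual_energy: float   # assumed: lowest residual energy of any node on the path
    bandwidth: float         # assumed: bottleneck bandwidth of the path
    distance: float          # total path distance


def select_paths(paths: List[Path], critical: bool) -> List[Path]:
    """Choose the paths on which aggregated data is sent.

    Critical data goes over all discovered paths (multipath) for reliability;
    otherwise only the single best-ranked path is used.
    """
    if not paths:
        return []
    if critical:
        return list(paths)
    # Assumed ranking: larger bandwidth * residual energy wins, shorter distance breaks ties.
    best = max(paths, key=lambda p: (p.bandwidth * p.residual_energy, -p.distance))
    return [best]
```

Sending critical data over every path trades extra energy for delivery reliability, which matches the intent stated in step (6).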

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 422
1242 A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction

Authors: Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Abstract:

Representing information as tables is a compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from an OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to the corresponding cell in a dataframe, which can be stored as comma-separated values, a database, Excel, and multiple other usable formats.

Keywords: table extraction, optical character recognition, image processing, text extraction, morphological transformation

Procedia PDF Downloads 121
1241 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is one of the most fundamental tasks in almost all natural language processing. In natural language processing, the need for large amounts of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the limited availability of tagged corpora is a bottleneck for natural language processing, especially for POS tagging. A promising direction for tackling this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for the Amharic language using K-Means clustering, since a large amount of data is produced in day-to-day activities. The development of the tagger follows four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is chunked into sentences and then into words. The second phase is feature extraction, which includes word frequency and the syntactic and morphological features of a word. The third phase is clustering. Among different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. This study identifies two features that are capable of distinguishing one part-of-speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging. 
In order to increase the performance of the unsupervised part-of-speech tagger, other features that are not included in this study, such as semantic information, need to be incorporated. Finally, the experimental results show that the system achieves a maximum accuracy of 81%.
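The fourth phase (mapping) can be sketched as a majority vote per cluster. The example below assumes that cluster assignments already exist and that a small sample of words carries reference tags; the `map_clusters_to_tags` helper and the toy tags are hypothetical and are not drawn from the study's data.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def map_clusters_to_tags(cluster_ids: List[int], tags: List[str]) -> Dict[int, str]:
    """Assign to each cluster the most common tag among its inspected members."""
    votes: Dict[int, Counter] = defaultdict(Counter)
    for cluster, tag in zip(cluster_ids, tags):
        votes[cluster][tag] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


def tag_words(cluster_ids: List[int], mapping: Dict[int, str]) -> List[str]:
    """Label every word with the tag assigned to its cluster."""
    return [mapping[c] for c in cluster_ids]
```

Every word in a cluster receives the cluster's majority tag, so words that sit in a cluster dominated by another part of speech are mislabelled; this is one reason richer features can raise accuracy.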

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 420
1240 Approach Based on Fuzzy C-Means for Band Selection in Hyperspectral Images

Authors: Diego Saqui, José H. Saito, José R. Campos, Lúcio A. de C. Jorge

Abstract:

Hyperspectral images and remote sensing are important for many applications. A problem in the use of these images is the high volume of data to be processed, stored and transferred. Dimensionality reduction techniques can be used to reduce the volume of data. In this paper, an approach to band selection based on clustering algorithms is presented. The proposed structure is based on the Fuzzy C-Means (or K-Means) and NWHFC algorithms. Attributes that are new in relation to other studies in the literature, such as kurtosis and low correlation, are also considered. A comparison of the results of the approach using Fuzzy C-Means and K-Means with different attributes is performed. Both algorithms show similarly good results, particularly when the variance and kurtosis attributes are used in the clustering process, and both are applicable to hyperspectral images.

Keywords: band selection, fuzzy c-means, k-means, hyperspectral image

Procedia PDF Downloads 377
1239 Privacy Preserving Data Publishing Based on Sensitivity in Context of Big Data Using Hive

Authors: P. Srinivasa Rao, K. Venkatesh Sharma, G. Sadhya Devi, V. Nagesh

Abstract:

Privacy Preserving Data Publication is a main concern at present because the amount of data being published through the internet is increasing day by day. This huge amount of data is called Big Data because of its size. This project deals with privacy preservation in the context of Big Data using a data warehousing solution called Hive. We implemented Nearest Similarity Based Clustering (NSB) with bottom-up generalization to achieve (v,l)-anonymity. (v,l)-Anonymity deals with sensitivity vulnerabilities and ensures individual privacy. We also calculate the sensitivity levels by a simple comparison method using the index values, classifying the different levels of sensitivity. The experiments were carried out in the Hive environment to verify the efficiency of the algorithms with Big Data. This framework also supports the execution of existing algorithms without any changes. The model in the paper outperforms existing models.

Keywords: sensitivity, sensitive level, clustering, Privacy Preserving Data Publication (PPDP), bottom-up generalization, Big Data

Procedia PDF Downloads 266
1238 Improving the Performance of Requisition Document Online System for Royal Thai Army by Using Time Series Model

Authors: D. Prangchumpol

Abstract:

This research presents a method for forecasting requisition document demand for military units, using exponential smoothing methods to analyze the data. The data used in the forecast are actual requisition document data from the Adjutant General Department. The results show that Holt–Winters’ trend and seasonality method with α=0.1, β=0, γ=0 is appropriate for forecasting document requisitions. In addition, the researcher has developed an online requisition system to improve the performance of document requisition at the Adjutant General Department and to ensure that the operation can be checked.
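For readers unfamiliar with the method, the following is a minimal sketch of Holt–Winters smoothing with trend and seasonality. The abstract does not say whether the additive or multiplicative form was used, so the additive form is assumed here, together with a simple initialization from the first two seasons; the function name `holt_winters_additive` and the sample series are hypothetical.

```python
from typing import List


def holt_winters_additive(
    series: List[float],
    m: int,
    alpha: float,
    beta: float,
    gamma: float,
    horizon: int,
) -> List[float]:
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    m is the season length; the series must cover at least two full seasons.
    """
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    # Assumed initialization: level and seasonals from season 1, trend from seasons 1 and 2.
    level = sum(series[:m]) / m
    trend = (sum(series[m : 2 * m]) / m - level) / m
    seasonals = [series[i] - level for i in range(m)]
    for t in range(m, len(series)):
        season = seasonals[t % m]
        prev_level = level
        level = alpha * (series[t] - season) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (series[t] - level) + (1 - gamma) * season
    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % m] for h in range(1, horizon + 1)]


if __name__ == "__main__":
    monthly_demand = [10, 20, 30, 10, 20, 30, 10, 20, 30]  # made-up numbers
    print(holt_winters_additive(monthly_demand, m=3, alpha=0.1, beta=0.0, gamma=0.0, horizon=3))
```

With β=0 and γ=0, as reported in the abstract, the trend and seasonal components never move from their initial estimates, so only the level adapts to new observations.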

Keywords: requisition, holt–winters, time series, royal thai army

Procedia PDF Downloads 284
1237 Identification of Nonlinear Systems Using Radial Basis Function Neural Network

Authors: C. Pislaru, A. Shebani

Abstract:

This paper uses the radial basis function neural network (RBFNN) for system identification of nonlinear systems. Five nonlinear systems are used to examine the behavior of the RBFNN in system modeling: a dual tank system, a single tank system, a DC motor system, and two academic models. The feed-forward method is considered in this work for modelling the nonlinear dynamic models. The K-Means clustering algorithm is used to select the centers of the radial basis function network because it is reliable, offers fast convergence and can handle large data sets. The least mean square method is used to adjust the weights to the output layer, and the Euclidean distance method is used to measure the width of the Gaussian function.
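The pieces described above (Gaussian units, a distance-based width, and least-mean-square weight updates) can be sketched for a one-dimensional input as follows. The centers are assumed to have been chosen beforehand, for instance by K-Means, which is not implemented here. The width heuristic (largest distance between centers divided by the square root of twice the number of centers) is a common convention and an assumption, not necessarily the paper's rule, and all function names are hypothetical.

```python
import math
from typing import List, Tuple


def width_from_centers(centers: List[float]) -> float:
    """Assumed heuristic: sigma = d_max / sqrt(2k), with d_max the largest center distance."""
    d_max = max(abs(a - b) for a in centers for b in centers)
    return d_max / math.sqrt(2 * len(centers))


def rbf_features(x: float, centers: List[float], sigma: float) -> List[float]:
    """Bias term followed by one Gaussian activation per center."""
    return [1.0] + [math.exp(-((x - c) ** 2) / (2 * sigma**2)) for c in centers]


def predict(x: float, centers: List[float], sigma: float, weights: List[float]) -> float:
    return sum(w * p for w, p in zip(weights, rbf_features(x, centers, sigma)))


def train_lms(
    samples: List[Tuple[float, float]],
    centers: List[float],
    sigma: float,
    lr: float = 0.1,
    epochs: int = 200,
) -> List[float]:
    """Least-mean-square training of the output-layer weights only."""
    weights = [0.0] * (len(centers) + 1)
    for _ in range(epochs):
        for x, y in samples:
            phi = rbf_features(x, centers, sigma)
            err = y - sum(w * p for w, p in zip(weights, phi))
            weights = [w + lr * err * p for w, p in zip(weights, phi)]
    return weights


def mse(samples, centers, sigma, weights) -> float:
    return sum((y - predict(x, centers, sigma, weights)) ** 2 for x, y in samples) / len(samples)
```

Only the output weights are trained; the centers and the width stay fixed, which is what keeps the learning problem linear and the LMS rule applicable.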

Keywords: system identification, nonlinear systems, neural networks, radial basis function, K-means clustering algorithm

Procedia PDF Downloads 448
1236 Discriminating Between Energy Drinks and Sports Drinks Based on Their Chemical Properties Using Chemometric Methods

Authors: Robert Cazar, Nathaly Maza

Abstract:

Energy drinks and sports drinks are quite popular among young adults and teenagers worldwide. Some concerns regarding their health effects, particularly those of the energy drinks, have been raised based on scientific findings. Differentiating between these two types of drinks by means of their chemical properties seems to be an instructive task, and chemometrics provides an appropriate strategy to do so. In this study, a discrimination analysis of energy and sports drinks has been carried out by applying chemometric methods. A set of eleven samples of commercially available brands (seven energy drinks and four sports drinks) was collected. Each sample was characterized by eight chemical variables (carbohydrates, energy, sugar, sodium, pH, degrees Brix, density, and citric acid). The data set was standardized and examined by exploratory chemometric techniques such as clustering and principal component analysis. As a preliminary step, a variable selection was carried out by inspecting the variable correlation matrix. Some variables were found to be redundant and could be safely removed, leaving five variables that are sufficient for this analysis: sugar, sodium, pH, density, and citric acid. Then, a hierarchical clustering employing the average-linkage criterion and the Euclidean distance metric was performed. It perfectly separates the two types of drinks, since the resulting dendrogram, cut at the 25% similarity level, sorts the samples into two well-defined groups, one containing the energy drinks and the other the sports drinks. Further assurance of the complete discrimination is provided by the principal component analysis. The projection of the data set onto the first two principal components, which retain 71% of the data information, makes it possible to visualize the distribution of the samples in the two groups identified in the clustering stage. Since the first principal component is the discriminating one, inspection of its loadings makes it possible to characterize these groups. The energy drinks group has medium to high values of density, citric acid, and sugar. The sports drinks group, on the other hand, exhibits low values of those variables. In conclusion, the application of chemometric methods to a data set that features some chemical properties of a number of energy and sports drinks provides an accurate, dependable way to discriminate between these two types of beverages.
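The clustering step described in this abstract (standardization followed by average-linkage agglomeration on Euclidean distances) can be sketched in plain Python as below. The function names are hypothetical, and stopping at a fixed number of clusters is a simplification of this sketch; the study itself cuts a dendrogram at a similarity level and uses its own measured data, which are not reproduced here.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (len(c) - 1))
            for c, mu in zip(cols, means)]
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)]
            for row in rows]


def average_linkage(points, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    The distance between two clusters is the mean Euclidean distance over
    all cross-cluster pairs of points (average linkage). Returns a list of
    clusters, each a list of row indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Standardizing first matters because variables such as pH and density live on very different scales; without it, the Euclidean distance would be dominated by whichever variable has the largest numeric range.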

Keywords: chemometrics, clustering, energy drinks, principal component analysis, sports drinks

Procedia PDF Downloads 85
1235 Parallel Genetic Algorithms Clustering for Handling Recruitment Problem

Authors: Walid Moudani, Ahmad Shahin

Abstract:

This research presents a study of a recruitment services system. It aims to enhance a business intelligence system by embedding data mining in its core engine and to facilitate the link between job searchers and recruiting companies. The purpose of this study is to present an intelligent management system for supporting recruitment services based on data mining methods. The approach consists of applying segmentation to the extracted job postings offered by the different recruiters. The details of the job postings are associated with a set of relevant features that are extracted from the web and based on critical criteria in order to define consistent clusters. Thereafter, we assign each job searcher to the best cluster while providing a ranking of the job postings in the selected cluster. The performance of the proposed model is analyzed on a real case study, using several metrics on the clustered job postings data set and the classified job searchers data set.

Keywords: job postings, job searchers, clustering, genetic algorithms, business intelligence

Procedia PDF Downloads 310
1234 Slovenian Spatial Legislation over Time and Its Issues

Authors: Andreja Benko

Abstract:

The article presents a short overview of the architect’s profession over time and outlines the work of architectural theoreticians. It then describes the former affiliations of Slovenia and the spatial planning documents that were in use until Slovenia joined Yugoslavia (the last part in 1919). This legislation from the former Austro-Hungarian monarchy remained valid almost until 1950, and in some parts of Yugoslavia even longer. Some valid Slovenian spatial documents are then discussed and compared with German legislation. The number of architects and spatial planners in Slovenia, and their distribution across Slovenian regions, is analysed. On that basis, figures from the Statistical Office of Slovenia on the number of buildings between 2007 and 2012 are given, and the collapse of the major construction companies in Slovenia and its consequences are described. Finally, the article outlines morality and ethics in spatial interventions, the lack of an architectural law in Slovenia, and the problem of minimal collaboration between the Ministry of Infrastructure and Spatial Planning and the profession.

Keywords: architect, history, legislation, Slovenia

Procedia PDF Downloads 340
1233 A Model Based Metaheuristic for Hybrid Hierarchical Community Structure in Social Networks

Authors: Radhia Toujani, Jalel Akaichi

Abstract:

In recent years, the study of community detection in social networks has received great attention. The hierarchical structure of the network tends to lead to convergence to a locally optimal community structure. In this paper, we aim to avoid this local optimum in the introduced hybrid hierarchical method. To this end, we present an objective function that incorporates a modularity based on structural and semantic similarity, and we use a metaheuristic, namely the bee colony algorithm, to optimize this objective function at both hierarchical levels, divisive and agglomerative. In order to assess the efficiency and accuracy of the introduced hybrid bee colony model, we perform an extensive experimental evaluation on both synthetic and real networks.
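For readers unfamiliar with modularity, the quantity at the core of the objective function mentioned above, here is a minimal sketch of the standard structural (Newman) modularity of a partition of an undirected, unweighted graph. It covers only the structural part; the semantic similarity term and the bee colony optimization from the paper are not represented, and the function name and input format are assumptions.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    Computed as the sum over communities of L_c / m - (d_c / 2m)^2, where
    L_c is the number of edges inside community c, d_c the total degree of
    its nodes, and m the number of edges in the graph.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2.0 * m)) ** 2
               for c in degree_sum)
```

A hierarchical method, whether agglomerative or divisive, can evaluate this score at each merge or split; a metaheuristic is one way to avoid committing to the first locally best move.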

Keywords: social network, community detection, agglomerative hierarchical clustering, divisive hierarchical clustering, similarity, modularity, metaheuristic, bee colony

Procedia PDF Downloads 359
1232 Evaluation of Environmental, Social, and Governance Factors by U.S. Tolling Authorities in Bond Issuance Disclosures

Authors: Nicolas D. Norboge

Abstract:

Purchasers of municipal bonds in primary and secondary markets are increasingly expecting issuers to disclose environmental, social, and governance (ESG) factors in issuance and continuing disclosure documents. U.S. tolling authorities are slowly catching up with other transportation sectors, such as public transit, in integrating ESG factors into their bond disclosure documents. A systematic mixed-methods evaluation of publicly available bond disclosure documents from 2010-2022 suggests that only a small number of U.S. tolling authorities disclosed all ESG factors; however, the pace has accelerated significantly from 2020-2022. Because many tolling authorities have a direct financial stake in the growth of passenger vehicle miles traveled on their toll facilities, and in turn the burning of more climate-warming fossil fuels, one crucial question that remains is how bond purchasers will view increased ESG transparency. Recent moves by large institutional investors, credit rating agencies, and regulators suggest an expectation of ESG disclosure is a trend likely to endure. This research suggests tolling authorities will need to proactively consider these emerging trends and carefully adapt their disclosure practices where possible. Building on these findings, this research also provides a basic sketch framework for how issuers can responsibly position themselves within the changing global municipal debt marketplace.

Keywords: debt policy, ESG, municipal bonds, public-private partnerships, public tolling authorities, transportation finance, and policy

Procedia PDF Downloads 155
1231 CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education

Authors: Eman AbuKhousa, Marwan Z. Bataineh

Abstract:

Higher education in the 21st century and globalization challenge new faculty members to build effective professional networks and partnerships with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects individuals with similar career aspirations who are socially influenced to join and engage in a process of acquiring domain-related knowledge and practice. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.

Keywords: clustering analysis, community of practice, data mining, higher education, new faculty challenges, social network, social influence, professional development

Procedia PDF Downloads 162
1230 A Minimum Spanning Tree-Based Method for Initializing the K-Means Clustering Algorithm

Authors: J. Yang, Y. Ma, X. Zhang, S. Li, Y. Zhang

Abstract:

The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima because it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of a minimum spanning tree (MST) is presented. The set of vertices in the MST with the same degree is regarded as a whole, which is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points that takes both degree and Euclidean distance into account is presented. Finally, an MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.
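The two building blocks the abstract relies on, a Euclidean minimum spanning tree and the degree of each vertex in it, can be sketched as follows. This is only an illustration under assumed names (`prim_mst`, `vertex_degrees`); the authors' skeleton-point selection and their combined degree and distance measure are not specified in the abstract and are not reproduced here.

```python
import math


def prim_mst(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns the MST as a list of (parent, child) index pairs. Runs in
    O(n^2) time, which is adequate for small data sets.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the tree
    parent = [-1] * n       # tree vertex that realises that cost
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        # Relax the connection costs of the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def vertex_degrees(n, edges):
    """Number of MST edges incident to each of the n vertices."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg
```

Grouping vertices by the returned degrees gives the "same degree" sets the abstract refers to; leaves (degree 1) typically lie on the periphery of the data, while higher-degree vertices tend to sit in denser regions.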

Keywords: degree, initial cluster center, k-means, minimum spanning tree

Procedia PDF Downloads 383
1229 Proposing a Boundary Coverage Algorithm for Underwater Sensor Network

Authors: Seyed Mohsen Jameii

Abstract:

Wireless underwater sensor networks are a type of sensor network located in underwater environments and linked together by acoustic waves. The applications of these networks include monitoring of pollutants (chemical, biological, and nuclear), oil field detection, prediction of the likelihood of a tsunami in coastal areas, the use of wireless sensor nodes to monitor passing submarines, and determination of appropriate locations for anchoring ships. This paper proposes a boundary coverage algorithm for intrusion detection in underwater sensor networks. In the first phase of the proposed algorithm, the nodes are optimally deployed in the water. In the second phase, after the nodes have been deployed at the proper depth, clustering is executed to reduce the exchange of messages between the sensors. In the third phase, the "divide and conquer" algorithm is used to save energy and increase network efficiency. The simulation results demonstrate the efficiency of the proposed algorithm.

Keywords: boundary coverage, clustering, divide and conquer, underwater sensor nodes

Procedia PDF Downloads 314