Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 40610

Search results for: topological data analysis

40460 Charge Transport of Individual Thermoelectric Bi₂Te₃ Core-Poly(3,4-Ethylenedioxythiophene):Polystyrenesulfonate Shell Nanowires Determined Using Conductive Atomic Force Microscopy and Spectroscopy

Authors: W. Thongkham, K. Sinthiptharakoon, K. Tantisantisom, A. Klamchuen, P. Khanchaitit, K. Jiramitmongkon, C. Lertsatitthanakorn, M. Liangruksa

Abstract:

Due to demands of sustainable energy, thermoelectricity converting waste heat into electrical energy has become one of the intensive fields of worldwide research. However, such harvesting technology has shown low device performance in the temperature range below 150℃. In this work, a hybrid nanowire of inorganic bismuth telluride (Bi₂Te₃) and organic poly(3,4-ethylenedioxythiophene):polystyrenesulfonate (PEDOT:PSS) synthesized using a simple in-situ one-pot synthesis, enhancing efficiency of the nanowire-incorporated PEDOT:PSS-based thermoelectric converter is highlighted. Since the improvement is ascribed to the increased electrical conductivity of the thermoelectric host material, the individual hybrid nanowires are investigated using voltage-dependent conductive atomic force microscopy (CAFM) and spectroscopy (CAFS) considering that the electrical transport measurement can be performed either on insulating or conducting areas of the sample. Correlated with detailed chemical information on the crystalline structure and compositional profile of the nanowire core-shell structure, an electrical transporting pathway through the nanowire and the corresponding electronic-band structure have been determined, in which the native oxide layer on the Bi₂Te₃ surface is not considered, and charge conduction on the topological surface states of Bi₂Te₃ is suggested. Analyzing the core-shell nanowire synthesized using the conventional mixing of as-prepared Bi₂Te₃ nanowire with PEDOT:PSS for comparison, the oxide-removal effect of the in-situ encapsulating polymeric layer is further supported. The finding not only provides a structural information for mechanistic determination of the thermoelectricity, but it also encourages new approach toward more appropriate encapsulation and consequently higher efficiency of the nanowire-based thermoelectric generation.

Keywords: electrical transport measurement, hybrid Bi₂Te₃-PEDOT:PSS nanowire, nanoencapsulation, thermoelectricity, topological insulator

Procedia PDF Downloads 178

40459 Solitons and Universes with Acceleration Driven by Bulk Particles

Authors: A. C. Amaro de Faria Jr, A. M. Canone

Abstract:

Considering a scenario where our universe is taken as a 3d domain wall embedded in a 5d dimensional Minkowski space-time, we explore the existence of a richer class of solitonic solutions and their consequences for accelerating universes driven by collisions of bulk particle excitations with the walls. In particular it is shown that some of these solutions should play a fundamental role at the beginning of the expansion process. We present some of these solutions in cosmological scenarios that can be applied to models that describe the inflationary period of the Universe.

Keywords: solitons, topological defects, branes, kinks, accelerating universes in brane scenarios

Procedia PDF Downloads 106

40458 Algebras over an Integral Domain and Immediate Neighbors

Authors: Shai Sarussi

Abstract:

Let S be an integral domain with field of fractions F and let A be an F-algebra. An S-subalgebra R of A is called S-nice if R∩F = S and the localization of R with respect to S \{0} is A. Denoting by W the set of all S-nice subalgebras of A, and defining a notion of open sets on W, one can view W as a T0-Alexandroff space. A characterization of the property of immediate neighbors in an Alexandroff topological space is given, in terms of closed and open subsets of appropriate subspaces. Moreover, two special subspaces of W are introduced, and a way in which their closed and open subsets induce W is presented.

Keywords: integral domains, Alexandroff topology, immediate neighbors, valuation domains

Procedia PDF Downloads 145

40457 Heritage and Tourism in the Era of Big Data: Analysis of Chinese Cultural Tourism in Catalonia

Authors: Xinge Liao, Francesc Xavier Roige Ventura, Dolores Sanchez Aguilera

Abstract:

With the development of the Internet, the study of tourism behavior has rapidly expanded from the traditional physical market to the online market. Data on the Internet is characterized by dynamic changes, and new data appear all the time. In recent years the generation of a large volume of data was characterized, such as forums, blogs, and other sources, which have expanded over time and space, together they constitute large-scale Internet data, known as Big Data. This data of technological origin that derives from the use of devices and the activity of multiple users is becoming a source of great importance for the study of geography and the behavior of tourists. The study will focus on cultural heritage tourist practices in the context of Big Data. The research will focus on exploring the characteristics and behavior of Chinese tourists in relation to the cultural heritage of Catalonia. Geographical information, target image, perceptions in user-generated content will be studied through data analysis from Weibo -the largest social networks of blogs in China. Through the analysis of the behavior of heritage tourists in the Big Data environment, this study will understand the practices (activities, motivations, perceptions) of cultural tourists and then understand the needs and preferences of tourists in order to better guide the sustainable development of tourism in heritage sites.

Keywords: Barcelona, Big Data, Catalonia, cultural heritage, Chinese tourism market, tourists’ behavior

Procedia PDF Downloads 110

40456 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: big data, big data analytics, Hadoop, cloud

Procedia PDF Downloads 278

40455 New Approach for Constructing a Secure Biometric Database

Authors: A. Kebbeb, M. Mostefai, F. Benmerzoug, Y. Chahir

Abstract:

The multimodal biometric identification is the combination of several biometric systems. The challenge of this combination is to reduce some limitations of systems based on a single modality while significantly improving performance. In this paper, we propose a new approach to the construction and the protection of a multimodal biometric database dedicated to an identification system. We use a topological watermarking to hide the relation between face image and the registered descriptors extracted from other modalities of the same person for more secure user identification.

Keywords: biometric databases, multimodal biometrics, security authentication, digital watermarking

Procedia PDF Downloads 344

40454 GeoWeb at the Service of Household Waste Collection in Urban Areas

Authors: Abdessalam Hijab, Eric Henry, Hafida Boulekbache

Abstract:

The complexity of the city makes sustainable management of the urban environment more difficult. Managers are required to make significant human and technical investments, particularly in household waste collection (focus of our research). The aim of this communication is to propose a collaborative geographic multi-actor device (MGCD) based on the link between information and communication technologies (ICT) and geo-web tools in order to involve urban residents in household waste collection processes. Our method is based on a collaborative/motivational concept between the city and its residents. It is a geographic collaboration dedicated to the general public (citizens, residents, and any other participant), based on real-time allocation and geographic location of topological, geographic, and multimedia data in the form of local geo-alerts (location-specific problems) related to household waste in an urban environment. This contribution allows us to understand the extent to which residents can assist and contribute to the development of household waste collection processes for a better protected urban environment. This suggestion provides a good idea of how residents can contribute to the data bank for future uses. Moreover, it will contribute to the transformation of the population into a smart inhabitant as an essential component of a smart city. The proposed model will be tested in the Lamkansa sampling district in Casablanca, Morocco.

Keywords: information and communication technologies, ICTs, GeoWeb, geo-collaboration, city, inhabitant, waste, collection, environment

Procedia PDF Downloads 92

40453 Using SNAP and RADTRAD to Establish the Analysis Model for Maanshan PWR Plant

Authors: J. R. Wang, H. C. Chen, C. Shih, S. W. Chen, J. H. Yang, Y. Chiang

Abstract:

In this study, we focus on the establishment of the analysis model for Maanshan PWR nuclear power plant (NPP) by using RADTRAD and SNAP codes with the FSAR, manuals, and other data. In order to evaluate the cumulative dose at the Exclusion Area Boundary (EAB) and Low Population Zone (LPZ) outer boundary, Maanshan NPP RADTRAD/SNAP model was used to perform the analysis of the DBA LOCA case. The analysis results of RADTRAD were similar to FSAR data. These analysis results were lower than the failure criteria of 10 CFR 100.11 (a total radiation dose to the whole body, 250 mSv; a total radiation dose to the thyroid from iodine exposure, 3000 mSv).

Keywords: RADionuclide, transport, removal, and dose estimation (RADTRAD), symbolic nuclear analysis package (SNAP), dose, PWR

Procedia PDF Downloads 430

40452 Efficiency of the Slovak Commercial Banks Applying the DEA Window Analysis

Authors: Iveta Řepková

Abstract:

The aim of this paper is to estimate the efficiency of the Slovak commercial banks employing the Data Envelopment Analysis (DEA) window analysis approach during the period 2003-2012. The research is based on unbalanced panel data of the Slovak commercial banks. Undesirable output was included into analysis of banking efficiency. It was found that most efficient banks were Postovabanka, UniCredit Bank and Istrobanka in CCR model and the most efficient banks were Slovenskasporitelna, Istrobanka and UniCredit Bank in BCC model. On contrary, the lowest efficient banks were found Privatbanka and CitiBank. We found that the largest banks in the Slovak banking market were lower efficient than medium-size and small banks. Results of the paper is that during the period 2003-2008 the average efficiency was increasing and then during the period 2010-2011 the average efficiency decreased as a result of financial crisis.

Keywords: data envelopment analysis, efficiency, Slovak banking sector, window analysis

Procedia PDF Downloads 330

40451 Data Stream Association Rule Mining with Cloud Computing

Authors: B. Suraj Aravind, M. H. M. Krishna Prasad

Abstract:

There exist emerging applications of data streams that require association rule mining, such as network traffic monitoring, web click streams analysis, sensor data, data from satellites etc. Data streams typically arrive continuously in high speed with huge amount and changing data distribution. This raises new issues that need to be considered when developing association rule mining techniques for stream data. This paper proposes to introduce an improved data stream association rule mining algorithm by eliminating the limitation of resources. For this, the concept of cloud computing is used. Inclusion of this may lead to additional unknown problems which needs further research.

Keywords: data stream, association rule mining, cloud computing, frequent itemsets

Procedia PDF Downloads 469

40450 Vector-Based Analysis in Cognitive Linguistics

Authors: Chuluundorj Begz

Abstract:

This paper presents the dynamic, psycho-cognitive approach to study of human verbal thinking on the basis of typologically different languages /as a Mongolian, English and Russian/. Topological equivalence in verbal communication serves as a basis of Universality of mental structures and therefore deep structures. Mechanism of verbal thinking consisted at the deep level of basic concepts, rules for integration and classification, neural networks of vocabulary. In neuro cognitive study of language, neural architecture and neuro psychological mechanism of verbal cognition are basis of a vector-based modeling. Verbal perception and interpretation of the infinite set of meanings and propositions in mental continuum can be modeled by applying tensor methods. Euclidean and non-Euclidean spaces are applied for a description of human semantic vocabulary and high order structures.

Keywords: Euclidean spaces, isomorphism and homomorphism, mental lexicon, mental mapping, semantic memory, verbal cognition, vector space

Procedia PDF Downloads 490

40449 Spatial Variability of Brahmaputra River Flow Characteristics

Authors: Hemant Kumar

Abstract:

Brahmaputra River is known according to the Hindu mythology the son of the Lord Brahma. According to this name, the river Brahmaputra creates mass destruction during the monsoon season in Assam, India. It is a state situated in North-East part of India. This is one of the essential states out of the seven countries of eastern India, where almost all entire Brahmaputra flow carried out. The other states carry their tributaries. In the present case study, the spatial analysis performed in this specific case the number of MODIS data are acquired. In the method of detecting the change, the spray content was found during heavy rainfall and in the flooded monsoon season. By this method, particularly the analysis over the Brahmaputra outflow determines the flooded season. The charged particle-associated in aerosol content genuinely verifies the heavy water content below the ground surface, which is validated by trend analysis through rainfall spectrum data. This is confirmed by in-situ sampled view data from a different position of Brahmaputra River. Further, a Hyperion Hyperspectral 30 m resolution data were used to scan the sediment deposits, which is also confirmed by in-situ sampled view data from a different position.

Keywords: aerosol, change detection, spatial analysis, trend analysis

Procedia PDF Downloads 122

40448 Frequent Itemset Mining Using Rough-Sets

Authors: Usman Qamar, Younus Javed

Abstract:

Frequent pattern mining is the process of finding a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set. It was proposed in the context of frequent itemsets and association rule mining. Frequent pattern mining is used to find inherent regularities in data. What products were often purchased together? Its applications include basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click stream) analysis, and DNA sequence analysis. However, one of the bottlenecks of frequent itemset mining is that as the data increase the amount of time and resources required to mining the data increases at an exponential rate. In this investigation a new algorithm is proposed which can be uses as a pre-processor for frequent itemset mining. FASTER (FeAture SelecTion using Entropy and Rough sets) is a hybrid pre-processor algorithm which utilizes entropy and rough-sets to carry out record reduction and feature (attribute) selection respectively. FASTER for frequent itemset mining can produce a speed up of 3.1 times when compared to original algorithm while maintaining an accuracy of 71%.

Keywords: rough-sets, classification, feature selection, entropy, outliers, frequent itemset mining

Procedia PDF Downloads 406

40447 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization

Authors: Hironori Karachi, Haruka Yamashita

Abstract:

Recently, the system of quick response (QR) code is getting popular. Many companies introduce new QR code payment services and the services are competing with each other to increase the number of users. For increasing the number of users, we should grasp the difference of feature of the demographic information, usage information, and value of users between services. In this study, we conduct an analysis of real-world data provided by Nomura Research Institute including the demographic data of users and information of users’ usages of two services; LINE Pay, and PayPay. For analyzing such data and interpret the feature of them, Nonnegative Matrix Factorization (NMF) is widely used; however, in case of the target data, there is a problem of the missing data. EM-algorithm NMF (EMNMF) to complete unknown values for understanding the feature of the given data presented by matrix shape. Moreover, for comparing the result of the NMF analysis of two matrices, there is Discriminant NMF (DNMF) shows the difference of users features between two matrices. In this study, we combine EMNMF and DNMF and also analyze the target data. As the interpretation, we show the difference of the features of users between LINE Pay and Paypay.

Keywords: data science, non-negative matrix factorization, missing data, quality of services

Procedia PDF Downloads 98

40446 Passenger Flow Characteristics of Seoul Metropolitan Subway Network

Authors: Kang Won Lee, Jung Won Lee

Abstract:

Characterizing the network flow is of fundamental importance to understand the complex dynamics of networks. And passenger flow characteristics of the subway network are very relevant for an effective transportation management in urban cities. In this study, passenger flow of Seoul metropolitan subway network is investigated and characterized through statistical analysis. Traditional betweenness centrality measure considers only topological structure of the network and ignores the transportation factors. This paper proposes a weighted betweenness centrality measure that incorporates monthly passenger flow volume. We apply the proposed measure on the Seoul metropolitan subway network involving 493 stations and 16 lines. Several interesting insights about the network are derived from the new measures. Using Kolmogorov-Smirnov test, we also find out that monthly passenger flow between any two stations follows a power-law distribution and other traffic characteristics such as congestion level and throughflow traffic follow exponential distribution.

Keywords: betweenness centrality, correlation coefficient, power-law distribution, Korea traffic DB

Procedia PDF Downloads 261

40445 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 16

40444 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data to be ready for data mining exploration take up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected from an actual harmonic monitoring system in a distribution system in Australia for three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes to better understanding of the data have been presented. Underlying classes in this data has then been identified using clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 218

40443 Beyond Geometry: The Importance of Surface Properties in Space Syntax Research

Authors: Christoph Opperer

Abstract:

Space syntax is a theory and method for analyzing the spatial layout of buildings and urban environments to understand how they can influence patterns of human movement, social interaction, and behavior. While direct visibility is a key factor in space syntax research, important visual information such as light, color, texture, etc., are typically not considered, even though psychological studies have shown a strong correlation to the human perceptual experience within physical space – with light and color, for example, playing a crucial role in shaping the perception of spaciousness. Furthermore, these surface properties are often the visual features that are most salient and responsible for drawing attention to certain elements within the environment. This paper explores the potential of integrating these factors into general space syntax methods and visibility-based analysis of space, particularly for architectural spatial layouts. To this end, we use a combination of geometric (isovist) and topological (visibility graph) approaches together with image-based methods, allowing a comprehensive exploration of the relationship between spatial geometry, visual aesthetics, and human experience. Custom-coded ray-tracing techniques are employed to generate spherical panorama images, encoding three-dimensional spatial data in the form of two-dimensional images. These images are then processed through computer vision algorithms to generate saliency-maps, which serve as a visual representation of areas most likely to attract human attention based on their visual properties. The maps are subsequently used to weight the vertices of isovists and the visibility graph, placing greater emphasis on areas with high saliency. Compared to traditional methods, our weighted visibility analysis introduces an additional layer of information density by assigning different weights or importance levels to various aspects within the field of view. This extends general space syntax measures to provide a more nuanced understanding of visibility patterns that better reflect the dynamics of human attention and perception. Furthermore, by drawing parallels to traditional isovist and VGA analysis, our weighted approach emphasizes a crucial distinction, which has been pointed out by Ervin and Steinitz: the difference between what is possible to see and what is likely to be seen. Therefore, this paper emphasizes the importance of including surface properties in visibility-based analysis to gain deeper insights into how people interact with their surroundings and to establish a stronger connection with human attention and perception.

Keywords: space syntax, visibility analysis, isovist, visibility graph, visual features, human perception, saliency detection, raytracing, spherical images

Procedia PDF Downloads 39

40442 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.

Keywords: simulation data, data summarization, spatial histograms, exploration, visualization

Procedia PDF Downloads 155

40441 Analysis of Lead Time Delays in Supply Chain: A Case Study

Authors: Abdel-Aziz M. Mohamed, Nermeen Coutry

Abstract:

Lead time is an important measure of supply chain performance. It impacts both customer satisfactions as well as the total cost of inventory. This paper presents the result of a study on the analysis of the customer order lead-time for a multinational company. In the study, the lead time was divided into three stages: order entry, order fulfillment, and order delivery. A sample of size 2,425 order lines from the company records were considered for this study. The sample data includes information regarding customer orders from the time of order entry until order delivery. Data regarding the lead time of each sage for different orders were also provided. Summary statistics on lead time data reveals that about 30% of the orders were delivered after the scheduled due date. The result of the multiple linear regression analysis technique revealed that component type, logistics parameter, order size and the customer type have significant impact on lead time. Data analysis on the stages of lead time indicates that stage 2 consumes over 50% of the lead time. Pareto analysis was made to study the reasons for the customer order delay in each of the 3 stages. Recommendation was given to resolve the problem.

Keywords: lead time reduction, customer satisfaction, service quality, statistical analysis

Procedia PDF Downloads 696

40440 Differentiation between Different Rangeland Sites Using Principal Component Analysis in Semi-Arid Areas of Sudan

Authors: Nancy Ibrahim Abdalla, Abdelaziz Karamalla Gaiballa

Abstract:

Rangelands in semi-arid areas provide a good source for feeding huge numbers of animals and serving environmental, economic and social importance; therefore, these areas are considered economically very important for the pastoral sector in Sudan. This paper investigates the means of differentiating between different rangelands sites according to soil types using principal component analysis to assist in monitoring and assessment purposes. Three rangeland sites were identified in the study area as flat sandy sites, sand dune site, and hard clay site. Principal component analysis (PCA) was used to reduce the number of factors needed to distinguish between rangeland sites and produce a new set of data including the most useful spectral information to run satellite image processing. It was performed using selected types of data (two vegetation indices, topographic data and vegetation surface reflectance within the three bands of MODIS data). Analysis with PCA indicated that there is a relatively high correspondence between vegetation and soil of the total variance in the data set. The results showed that the use of the principal component analysis (PCA) with the selected variables showed a high difference, reflected in the variance and eigenvalues and it can be used for differentiation between different range sites.

Keywords: principal component analysis, PCA, rangeland sites, semi-arid areas, soil types

Procedia PDF Downloads 147

40439 Interpretation and Clustering Framework for Analyzing ECG Survey Data

Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif

Abstract:

As Indo-Pak has been the victim of heart diseases since many decades. Many surveys showed that percentage of cardiac patients is increasing in Pakistan day by day, and special attention is needed to pay on this issue. The framework is proposed for performing detailed analysis of ECG survey data which is conducted for measuring prevalence of heart diseases statistics in Pakistan. The ECG survey data is evaluated or filtered by using automated Minnesota codes and only those ECGs are used for further analysis which is fulfilling the standardized conditions mentioned in the Minnesota codes. Then feature selection is performed by applying proposed algorithm based on discernibility matrix, for selecting relevant features from the database. Clustering is performed for exposing natural clusters from the ECG survey data by applying spectral clustering algorithm using fuzzy c means algorithm. The hidden patterns and interesting relationships which have been exposed after this analysis are useful for further detailed analysis and for many other multiple purposes.

Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix

Procedia PDF Downloads 441

40438 Prediction of Marine Ecosystem Changes Based on the Integrated Analysis of Multivariate Data Sets

Authors: Prozorkevitch D., Mishurov A., Sokolov K., Karsakov L., Pestrikova L.

Abstract:

The current body of knowledge about the marine environment and the dynamics of marine ecosystems includes a huge amount of heterogeneous data collected over decades. It generally includes a wide range of hydrological, biological and fishery data. Marine researchers collect these data and analyze how and why the ecosystem changes from past to present. Based on these historical records and linkages between the processes it is possible to predict future changes. Multivariate analysis of trends and their interconnection in the marine ecosystem may be used as an instrument for predicting further ecosystem evolution. A wide range of information about the components of the marine ecosystem for more than 50 years needs to be used to investigate how these arrays can help to predict the future.

Keywords: barents sea ecosystem, abiotic, biotic, data sets, trends, prediction

Procedia PDF Downloads 81

40437 Combining Diffusion Maps and Diffusion Models for Enhanced Data Analysis

Authors: Meng Su

Abstract:

High-dimensional data analysis often presents challenges in capturing the complex, nonlinear relationships and manifold structures inherent to the data. This article presents a novel approach that leverages the strengths of two powerful techniques, Diffusion Maps and Diffusion Probabilistic Models (DPMs), to address these challenges. By integrating the dimensionality reduction capability of Diffusion Maps with the data modeling ability of DPMs, the proposed method aims to provide a comprehensive solution for analyzing and generating high-dimensional data. The Diffusion Map technique preserves the nonlinear relationships and manifold structure of the data by mapping it to a lower-dimensional space using the eigenvectors of the graph Laplacian matrix. Meanwhile, DPMs capture the dependencies within the data, enabling effective modeling and generation of new data points in the low-dimensional space. The generated data points can then be mapped back to the original high-dimensional space, ensuring consistency with the underlying manifold structure. Through a detailed example implementation, the article demonstrates the potential of the proposed hybrid approach to achieve more accurate and effective modeling and generation of complex, high-dimensional data. Furthermore, it discusses possible applications in various domains, such as image synthesis, time-series forecasting, and anomaly detection, and outlines future research directions for enhancing the scalability, performance, and integration with other machine learning techniques. By combining the strengths of Diffusion Maps and DPMs, this work paves the way for more advanced and robust data analysis methods.

Keywords: diffusion maps, diffusion probabilistic models (DPMs), manifold learning, high-dimensional data analysis

Procedia PDF Downloads 67

40436 A Study on Sentiment Analysis Using Various ML/NLP Models on Historical Data of Indian Leaders

Authors: Sarthak Deshpande, Akshay Patil, Pradip Pandhare, Nikhil Wankhede, Rushali Deshmukh

Abstract:

Among the highly significant duties for any language most effective is the sentiment analysis, which is also a key area of NLP, that recently made impressive strides. There are several models and datasets available for those tasks in popular and commonly used languages like English, Russian, and Spanish. While sentiment analysis research is performed extensively, however it is lagging behind for the regional languages having few resources such as Hindi, Marathi. Marathi is one of the languages that included in the Indian Constitution’s 8th schedule and is the third most widely spoken language in the country and primarily spoken in the Deccan region, which encompasses Maharashtra and Goa. There isn’t sufficient study on sentiment analysis methods based on Marathi text due to lack of available resources, information. Therefore, this project proposes the use of different ML/NLP models for the analysis of Marathi data from the comments below YouTube content, tweets or Instagram posts. We aim to achieve a short and precise analysis and summary of the related data using our dataset (Dates, names, root words) and lexicons to locate exact information.

Keywords: multilingual sentiment analysis, Marathi, natural language processing, text summarization, lexicon-based approaches

Procedia PDF Downloads 37

40435 Industrial Process Mining Based on Data Pattern Modeling and Nonlinear Analysis

Authors: Hyun-Woo Cho

Abstract:

Unexpected events may occur with serious impacts on industrial process. This work utilizes a data representation technique to model and to analyze process data pattern for the purpose of diagnosis. In this work, the use of triangular representation of process data is evaluated using simulation process. Furthermore, the effect of using different pre-treatment techniques based on such as linear or nonlinear reduced spaces was compared. This work extracted the fault pattern in the reduced space, not in the original data space. The results have shown that the non-linear technique based diagnosis method produced more reliable results and outperforms linear method.

Keywords: process monitoring, data analysis, pattern modeling, fault, nonlinear techniques

Procedia PDF Downloads 359

40434 Cyber-Social Networks in Preventing Terrorism: Topological Scope

Authors: Alessandra Rossodivita, Alexei Tikhomirov, Andrey Trufanov, Nikolay Kinash, Olga Berestneva, Svetlana Nikitina, Fabio Casati, Alessandro Visconti, Tommaso Saporito

Abstract:

It is well known that world and national societies are exposed to diverse threats: anthropogenic, technological, and natural. Anthropogenic ones are of greater risks and, thus, attract special interest to researchers within wide spectrum of disciplines in efforts to lower the pertinent risks. Some researchers showed by means of multilayered, complex network models how media promotes the prevention of disease spread. To go further, not only are mass-media sources included in scope the paper suggests but also personificated social bots (socbots) linked according to reflexive theory. The novel scope considers information spread over conscious and unconscious agents while counteracting both natural and man-made threats, i.e., infections and terrorist hazards. Contrary to numerous publications on misinformation disseminated by ‘bad’ bots within social networks, this study focuses on ‘good’ bots, which should be mobilized to counter the former ones. These social bots deployed mixture with real social actors that are engaged in concerted actions at spreading, receiving and analyzing information. All the contemporary complex network platforms (multiplexes, interdependent networks, combined stem networks et al.) are comprised to describe and test socbots activities within competing information sharing tools, namely mass-media hubs, social networks, messengers, and e-mail at all phases of disasters. The scope and concomitant techniques present evidence that embedding such socbots into information sharing process crucially change the network topology of actor interactions. The change might improve or impair robustness of social network environment: it depends on who and how controls the socbots. It is demonstrated that the topological approach elucidates techno-social processes within the field and outline the roadmap to a safer world.

Keywords: complex network platform, counterterrorism, information sharing topology, social bots

Procedia PDF Downloads 135

40433 Longitudinal Analysis of Internet Speed Data in the Gulf Cooperation Council Region

Authors: Musab Isah

Abstract:

This paper presents a longitudinal analysis of Internet speed data in the Gulf Cooperation Council (GCC) region, focusing on the most populous cities of each of the six countries – Riyadh, Saudi Arabia; Dubai, UAE; Kuwait City, Kuwait; Doha, Qatar; Manama, Bahrain; and Muscat, Oman. The study utilizes data collected from the Measurement Lab (M-Lab) infrastructure over a five-year period from January 1, 2019, to December 31, 2023. The analysis includes downstream and upstream throughput data for the cities, covering significant events such as the launch of 5G networks in 2019, COVID-19-induced lockdowns in 2020 and 2021, and the subsequent recovery period and return to normalcy. The results showcase substantial increases in Internet speeds across the cities, highlighting improvements in both download and upload throughput over the years. All the GCC countries have achieved above-average Internet speeds that can conveniently support various online activities and applications with excellent user experience.

Keywords: internet data science, internet performance measurement, throughput analysis, internet speed, measurement lab, network diagnostic tool

Procedia PDF Downloads 19

40432 Extreme Temperature Forecast in Mbonge, Cameroon Through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution

Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph

Abstract:

In this paper, temperature extremes are forecast by employing the block maxima method of the generalized extreme value (GEV) distribution to analyse temperature data from the Cameroon Development Corporation (CDC). By considering two sets of data (raw data and simulated data) and two (stationary and non-stationary) models of the GEV distribution, return levels analysis is carried out and it was found that in the stationary model, the return values are constant over time with the raw data, while in the simulated data the return values show an increasing trend with an upper bound. In the non-stationary model, the return levels of both the raw data and simulated data show an increasing trend with an upper bound. This clearly shows that although temperatures in the tropics show a sign of increase in the future, there is a maximum temperature at which there is no exceedance. The results of this paper are very vital in agricultural and environmental research.

Keywords: forecasting, generalized extreme value (GEV), meteorology, return level

Procedia PDF Downloads 424

40431 Analysis of Spatial and Temporal Data Using Remote Sensing Technology

Authors: Kapil Pandey, Vishnu Goyal

Abstract:

Spatial and temporal data analysis is very well known in the field of satellite image processing. When spatial data are correlated with time, series analysis it gives the significant results in change detection studies. In this paper the GIS and Remote sensing techniques has been used to find the change detection using time series satellite imagery of Uttarakhand state during the years of 1990-2010. Natural vegetation, urban area, forest cover etc. were chosen as main landuse classes to study. Landuse/ landcover classes within several years were prepared using satellite images. Maximum likelihood supervised classification technique was adopted in this work and finally landuse change index has been generated and graphical models were used to present the changes.

Keywords: GIS, landuse/landcover, spatial and temporal data, remote sensing

Procedia PDF Downloads 402

‹
1
2
3
4
5
6
7
8
9
10
...
1353
1354
›