Search results for: hierarchical data format
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24728

24548 Computer-Assisted Management of Building Climate and Microgrid with Model Predictive Control

Authors: Vinko Lešić, Mario Vašak, Anita Martinčević, Marko Gulin, Antonio Starčić, Hrvoje Novak

Abstract:

Accounting for 40% of total world energy consumption, building systems are developing into technically complex, large energy consumers suitable for sophisticated power management approaches that can largely increase energy efficiency and even make them active energy market participants. A centralized control system for building heating and cooling, managed by economically optimal model predictive control, shows promising results, with an estimated 30% increase in energy efficiency. The research focuses on implementing such a method in a case study performed on two floors of our faculty building, with corresponding wireless sensor data acquisition, remote heating/cooling units and a central climate controller. Building walls are mathematically modeled with corresponding material types, surface shapes and sizes. The models are then exploited to predict thermal characteristics and changes in different building zones. Exterior influences such as environmental conditions and the weather forecast, people's behavior and comfort demands are all taken into account in deriving price-optimal climate control. Finally, a DC microgrid with photovoltaics, a wind turbine, a supercapacitor, batteries and fuel cell stacks is added to make the building a unit capable of active participation in a price-varying energy market. The computational burden of applying model predictive control to such a complex system is relaxed through a hierarchical decomposition of the microgrid and climate control, where the former is designed as the higher hierarchical level with pre-calculated price-optimal power flow control, and the latter is designed as the lower-level control responsible for ensuring thermal comfort and exploiting the optimal supply conditions enabled by microgrid energy flow management. Such an approach is expected to enable the inclusion of more complex building subsystems in order to further increase energy efficiency.

Keywords: price-optimal building climate control, Microgrid power flow optimisation, hierarchical model predictive control, energy efficient buildings, energy market participation

Procedia PDF Downloads 438
24547 Understanding the Influence of Cross-National Distances on Tourist Expenditure

Authors: Wei-Ting Hung

Abstract:

Inbound tourist expenditure might not only be influenced by individual tourist characteristics but may also be affected by nationality characteristics. The effects of cross-national distance on tourist consumption behavior should be incorporated in the analytical framework. Additionally, the often-used factor analysis, cluster analysis and regression analysis overlook the hierarchical structure of tourist consumption data and may lead to misleading results. The objectives of the present study were twofold. First, we propose a multilevel model that takes individual and cross-national differences into account under a hierarchical framework. Second, we further sought to determine the types of cross-national differences affecting tourist expenditure. Thus, this study incorporates individual tourist effects and cross-national distance effects simultaneously and uses data from the 2010 Annual Survey Report on Visitors’ Expenditure and Trends in Taiwan to investigate the determinants of inbound tourist expenditure. Multilevel analysis was used to investigate the influence of individual tourist effects and cross-national distance effects on inbound tourist expenditure. The empirical results show that cross-national distance plays a crucial role in tourist consumption behavior. Our findings also indicate that age and income have a positive influence on tourism expenditure, whereas education and gender do not have a significant impact. Regarding macro-level factors, geographic and cultural differences exhibited significant positive relationships with tourism expenditure, while economic differences did not. Based on the above empirical results, it is suggested that tour operators should take tourists’ individual attributes, particularly their income and age, into consideration when arranging tours. In addition, nationality holds sway over tourists’ consumption behavior, of which geographic and cultural differences are the two major factors at play.
The empirical results of this study serve as practical suggestions for tourism marketing strategies and policy implications for government policies.

Keywords: cross national distance, inbound tourist, multilevel analysis, tourist expenditure

Procedia PDF Downloads 327
24546 Hierarchical Control Structure to Control the Power Distribution System Components in Building Systems

Authors: Hamed Sarbazy, Zohre Gholipour Haftkhani, Ali Safari, Pejman Hosseiniun

Abstract:

Scientific and industrial progress over the past two decades has produced energy distribution systems based on power electronics, which can be considered an enabling technology in various industries and in building management systems. Classifying and standardizing power electronics modules and using them in a distributed control system is a strategy for overcoming the limitations of such systems. The purpose of this paper is to investigate strategies for the scheduling and control structure of standard power electronics modules. The paper introduces classical control methods and discusses their disadvantages, then explains hierarchical control as a mechanism for a distributed control structure of the classified modules. The different levels of control and the communication between these levels are fully introduced. The standardization of the software for the distribution system control structure is also discussed. Finally, as an example, the control structure is presented for a DC distribution system.

Keywords: application management, hardware management, power electronics, building blocks

Procedia PDF Downloads 488
24545 Seafloor and Sea Surface Modelling in the East Coast Region of North America

Authors: Magdalena Idzikowska, Katarzyna Pająk, Kamil Kowalczyk

Abstract:

Seafloor topography is a fundamental issue in geological, geophysical, and oceanographic studies. Single-beam or multibeam sonars attached to the hulls of ships emit a hydroacoustic signal from transducers and reproduce the topography of the seabed. This solution provides relevant accuracy and spatial resolution. Bathymetric data from ship surveys are provided by the National Centers for Environmental Information of the National Oceanic and Atmospheric Administration. Unfortunately, most of the seabed is still unidentified, as there are still many gaps to be explored between ship survey tracks. Moreover, such measurements are very expensive and time-consuming. One solution is the raster bathymetric models shared by the General Bathymetric Chart of the Oceans. The offered products are a compilation of different sets of data, raw or processed. Measurements of gravity anomalies also serve as indirect data for the development of bathymetric models. Some forms of seafloor relief (e.g. seamounts) increase the force of the Earth's pull, leading to changes in the sea surface. Based on satellite altimetry data, Sea Surface Height and marine gravity anomalies can be estimated, and based on the anomalies, it is possible to infer the structure of the seabed. The main goal of the work is to create regional bathymetric models and models of the sea surface in the area of the east coast of North America, a region of seamounts and undulating seafloor. The research includes an analysis of the methods and techniques used, an evaluation of the interpolation algorithms used, model thickening, and the creation of grid models. The data obtained are raster bathymetric models in NetCDF format, survey data from multibeam soundings in MB-System format, and satellite altimetry data from the Copernicus Marine Environment Monitoring Service. The methodology includes data extraction, processing, mapping, and spatial analysis.
Visualization of the obtained results was carried out with Geographic Information System tools. The result is an extension of the state of knowledge about the quality and usefulness of the data used for seabed and sea surface modeling and about the accuracy of the generated models. Sea level is averaged over time and space (excluding waves, tides, etc.). Its changes, along with knowledge of the topography of the ocean floor, inform us indirectly about the volume of water in the entire ocean. The true shape of the ocean surface is further varied by such phenomena as tides, differences in atmospheric pressure, wind systems, thermal expansion of water, or phases of ocean circulation. Depending on the location of the point, the greater the depth, the lower the trend of sea level change. Studies show that combining data sets from different sources with different accuracies can affect the quality of sea surface and seafloor topography models.

Keywords: seafloor, sea surface height, bathymetry, satellite altimetry

Procedia PDF Downloads 53
24544 Publishing Formats of Scientific Journals in the XXI Century: the Case of Small Publishing Market

Authors: Arūnas Gudinavičius, Andrius Šuminas

Abstract:

The analysis of scholarly journal formats is fragmented and needs to be studied from the point of view of scientific communication. While PDF is, to the best of the authors’ knowledge, probably the most popular digital format of the XXI century, more formats are available: HTML, EPUB, etc. Our aim is to analyze how important these formats are to readers and what their contribution to scientific communication is. We want to investigate how popular printed journals still are among scholars and whether different formats are preferred in different fields of science. In most cases, the publishing of scientific journals is examined from the narrow perspective of a particular university's science affairs administrators or a research funding institution. We believe that more data on the formats used in scholarly periodicals currently published in Lithuania, as well as in Eastern Europe, are needed. Science communication is often analyzed as a directed chain of information in the author-publisher-reader cycle. The paper focuses on the publishing part of this chain. A distinction is made between formal and informal forms of scientific communication, which is relevant in today's context, when both forms of communication intertwine and complement each other. In our research, we will analyze formal documentary communication (the formats in which scientific articles are published): scientific information recorded in a certain medium and formatted in a certain format (printed, PDF, HTML, EPUB, etc.). We will also analyze the stage of publication of research results in scientific journals and their dissemination through specific publication formats. The paper aims to systematize and analyze the various types of formats of scientific journals published in the XXI century in Lithuania (a small publishing market). The research analyses the case of a small European country and presents the characteristics of the publishing formats of scientific periodicals.

Keywords: scientific communication, scientific journals, publishing formats, reading

Procedia PDF Downloads 58
24543 A Comparative Analysis of Clustering Approaches for Understanding Patterns in Health Insurance Uptake: Evidence from Sociodemographic Kenyan Data

Authors: Nelson Kimeli Kemboi Yego, Juma Kasozi, Joseph Nkruzinza, Francis Kipkogei

Abstract:

The study investigated the low uptake of health insurance in Kenya despite efforts to achieve universal health coverage through various health insurance schemes. Unsupervised machine learning techniques were employed to identify patterns in health insurance uptake based on sociodemographic factors among Kenyan households. The aim was to identify key demographic groups that are underinsured and to provide insights for the development of effective policies and outreach programs. Using the 2021 FinAccess Survey, the study clustered Kenyan households based on their health insurance uptake and sociodemographic features to reveal patterns in health insurance uptake across the country. The effectiveness of k-prototypes clustering, hierarchical clustering, and agglomerative hierarchical clustering in clustering based on sociodemographic factors was compared. The k-prototypes approach was found to be the most effective at uncovering distinct and well-separated clusters in the Kenyan sociodemographic data related to health insurance uptake based on silhouette, Calinski-Harabasz, Davies-Bouldin, and Rand indices. Hence, it was utilized in uncovering the patterns in uptake. The results of the analysis indicate that inclusivity in health insurance is greatly related to affordability. The findings suggest that targeted policy interventions and outreach programs are necessary to increase health insurance uptake in Kenya, with the ultimate goal of achieving universal health coverage. The study provides important insights for policymakers and stakeholders in the health insurance sector to address the low uptake of health insurance and to ensure that healthcare services are accessible and affordable to all Kenyans, regardless of their socio-demographic status. The study highlights the potential of unsupervised machine learning techniques to provide insights into complex health policy issues and improve decision-making in the health sector.
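
Of the four indices named in the abstract above, the Rand index is the simplest to state: it is the fraction of item pairs on which two clusterings agree (both place the pair together, or both place it apart). The following is a minimal sketch in plain Python; the function name and the toy label lists are illustrative assumptions, not taken from the study.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Return the fraction of item pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]  # pair grouped together in clustering A?
        same_b = labels_b[i] == labels_b[j]  # pair grouped together in clustering B?
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Toy labels (hypothetical): the same partition under different cluster names.
identical = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels (hypothetical): two partitions that cut across each other.
crossed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The index depends only on which items share a cluster, so renaming clusters does not change it.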

Keywords: health insurance, unsupervised learning, clustering algorithms, machine learning

Procedia PDF Downloads 85
24542 Identification of Watershed Landscape Character Types in Middle Yangtze River within Wuhan Metropolitan Area

Authors: Huijie Wang, Bin Zhang

Abstract:

In China, the middle reaches of the Yangtze River are well-developed, boasting a wealth of different types of watershed landscape. In this regard, landscape character assessment (LCA) can serve as a basis for the protection, management and planning of trans-regional watershed landscape types. For this study, we chose the middle reaches of the Yangtze River in the Wuhan metropolitan area as our study site, where the water system comprises a rich variety of landscape types. We analyzed trans-regional data to cluster and identify types of landscape characteristics at two levels. A total of 55 basins were analyzed, with topography, land cover and river system features as variables, in order to identify the watershed landscape character types. For watershed landscape, drainage density and degree of curvature were specified as special variables to directly reflect the regional differences in river system features. Then, we used principal component analysis (PCA) and a hierarchical clustering algorithm, based on the geographic information system (GIS) and statistical products and services solution (SPSS), to obtain clusters of watershed landscape, which were divided into 8 characteristic groups. These groups highlighted the watershed landscape characteristics of different river systems as well as key landscape characteristics that can serve as a basis for targeted protection of watershed landscape characteristics, thus helping to rationally develop multi-value landscape resources and promote coordinated trans-regional development.

Keywords: GIS, hierarchical clustering, landscape character, landscape typology, principal component analysis, watershed

Procedia PDF Downloads 178
24541 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 86
24540 Real-Time Big-Data Warehouse a Next-Generation Enterprise Data Warehouse and Analysis Framework

Authors: Abbas Raza Ali

Abstract:

Big Data technology is gradually becoming a dire need of large enterprises. These enterprises generate massive amounts of off-line and streaming data in both structured and unstructured formats on a daily basis. It is a challenging task to effectively extract useful insights from such large-scale datasets, and sometimes it even becomes a technology constraint to manage transactional data history of more than a few months. This paper presents a framework to efficiently manage massively large and complex datasets. The framework has been tested on a communication service provider producing massively large, complex streaming data in binary format. The communication industry is bound by the regulators to manage the history of its subscribers’ call records, where every action of a subscriber generates a record. Also, managing and analyzing transactional data allows service providers to better understand their customers’ behavior; for example, deep packet inspection requires transactional internet usage data to explain the internet usage behaviour of the subscribers. However, current relational database systems limit service providers to maintaining history only at the semantic level, which is aggregated at subscriber level. The framework addresses these challenges by leveraging Big Data technology, which optimally manages and allows deep analysis of complex datasets. The framework has been applied to offload the existing Intelligent Network Mediation and relational Data Warehouse of the service provider onto Big Data. The service provider has a subscriber base of 50+ million with yearly growth of 7-10%. The end-to-end process takes no more than 10 minutes and involves binary-to-ASCII decoding of call detail records, stitching of all the interrogations against a call (transformations) and aggregation of all the call records of a subscriber.
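
The three processing stages named at the end of the abstract above (decoding binary call detail records, stitching the interrogations of a call, aggregating per subscriber) can be sketched on a toy scale as follows. This is only an illustration in plain Python, not the framework's implementation: the fixed-width record layout (subscriber id, call id, seconds) and all function names are assumptions made for the example.

```python
import struct
from collections import defaultdict

# Hypothetical record layout: uint32 subscriber id, uint32 call id, uint16 seconds.
RECORD = struct.Struct("<IIH")


def decode(blob):
    """Decode back-to-back fixed-width binary records into tuples."""
    return [RECORD.unpack_from(blob, off) for off in range(0, len(blob), RECORD.size)]


def stitch(records):
    """Merge all interrogations that belong to the same call into one entry."""
    calls = defaultdict(int)
    for subscriber, call, seconds in records:
        calls[(subscriber, call)] += seconds
    return calls


def aggregate(calls):
    """Roll the stitched calls up to one summary row per subscriber."""
    totals = defaultdict(lambda: {"calls": 0, "seconds": 0})
    for (subscriber, _call), seconds in calls.items():
        totals[subscriber]["calls"] += 1
        totals[subscriber]["seconds"] += seconds
    return dict(totals)


# Made-up sample: call 10 of subscriber 1 arrives as two interrogations.
sample = [(1, 10, 30), (1, 10, 45), (1, 11, 5), (2, 20, 60)]
blob = b"".join(RECORD.pack(*row) for row in sample)
summary = aggregate(stitch(decode(blob)))
```

In a streaming setting the same three steps would run per batch or per window; the sketch processes a single in-memory buffer for brevity.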

Keywords: big data, communication service providers, enterprise data warehouse, stream computing, Telco IN Mediation

Procedia PDF Downloads 147
24539 Important Factors for Successful Solution of Emotional Situations: Empirical Study on Young People

Authors: R. Lekaviciene, D. Antiniene

Abstract:

Attempts to split the construct of emotional intelligence (EI) into separate components (the ability to understand one’s own and others’ emotions and the ability to control one’s own and others’ emotions) may be more meaningful theoretically than practically. In real life, a person encounters various emotional situations that require the exhibition of complex EI to solve them. Emotional situation solution tests enable measurement of such undivided EI. The object of the present study is to determine sociodemographic and other factors that are important for emotional situation solutions. The study involved 1,430 participants from various regions of Lithuania. The age of participants varied from 17 years to 27 years. The emotional social and interpersonal situation scale EI-DARL-V2 was used. Each situation had two mandatory answering formats: the first format contained assignments associated with hypothetical theoretical knowledge of how the situation should be solved, while the second format included the question of how the participant would personally resolve the given situation in reality. A questionnaire that contained various sociodemographic data of subjects was also presented. Factors statistically significant for emotional situation solution have been determined: gender, family structure, the subject’s relation with his or her mother, mother’s occupation, subjectively assessed financial situation of the family, level of education of the subject and his or her parents, academic achievement, etc. The best solvers of emotional situations are women with high academic achievements. According to their chosen study profile/acquired profession, they are related to fields in the social sciences and humanities. The worst solvers of emotional situations are men raised in foster homes. They are/were bad students and mostly choose blue-collar professions.

Keywords: emotional intelligence, emotional situations, solution of situation, young people

Procedia PDF Downloads 146
24538 Re-Analyzing Energy-Conscious Design

Authors: Svetlana Pushkar, Oleg Verbitsky

Abstract:

An energy-conscious design for a classroom in a hot-humid climate is reanalyzed. The hypothesis of this study is that the use of photovoltaic (PV) electricity generation in building operation energy consumption will lead to re-analysis of the energy-conscious design. Therefore, the objective of this study is to reanalyze the energy-conscious design by evaluating the environmental impact of operational energy with PV electrical generation. Using the hierarchical design structure of Eco-indicator 99, the alternatives for energy-conscious variables are statistically evaluated by applying a two-stage nested (hierarchical) ANOVA. The recommendations for the preferred solutions for application of glazing types, wall insulation, roof insulation, window size, roof mass, and window shading design alternatives were changed (for example, glazing type recommendations were changed from low-emissivity glazing, green, and double-glazed windows to low-emissivity glazing only), whereas the recommendations for the lighting control system and infiltration were not changed. Such analysis of operational energy can be defined as environment-conscious analysis.

Keywords: ANOVA, Eco-Indicator 99, energy-conscious design, hot–humid climate, photovoltaic

Procedia PDF Downloads 160
24537 Prioritization Assessment of Housing Development Risk Factors: A Fuzzy Hierarchical Process-Based Approach

Authors: Yusuf Garba Baba

Abstract:

The construction industry and housing subsector are fraught with risks that have the potential to negatively impact the achievement of project objectives. The success or otherwise of most construction projects depends to a large extent on how well these risks have been managed. The recent paradigm shift by the subsector to the use of a formal risk management approach, in contrast to hitherto developed rules of thumb, means that risks must not only be identified but also properly assessed and responded to in a systematic manner. The study focused on identifying risks associated with housing development projects and on a prioritisation assessment of the identified risks in order to provide a basis for informed decisions. The study used a three-step identification framework (review of literature for similar projects, expert consultation and a questionnaire-based survey) to identify potential risk factors. The Delphi survey method was employed in carrying out the relative prioritization assessment of the risk factors using computer-based Analytical Hierarchical Process (AHP) software. The results show that 19 out of the 50 risks significantly impact housing development projects. The study concludes that although a significant number of risk factors have been identified as being relevant to and impacting housing construction projects, the economic risk group and, in particular, ‘changes in demand for houses’ is prioritised by most developers as posing a threat to the achievement of their housing development objectives. Unless these risks are carefully managed, their effects will continue to impede success in these projects. The study recommends the adoption and use of the combination of the multi-technique identification framework and the AHP prioritization assessment methodology as a suitable model for the assessment of risks in housing development projects.
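
To illustrate the kind of calculation AHP software performs, the sketch below derives priority weights from a pairwise comparison matrix and checks its consistency. It is a crisp (non-fuzzy) sketch under stated assumptions: the row geometric mean is used as a common approximation to the principal eigenvector, the comparison values are invented, and two of the three risk names are hypothetical; only ‘changes in demand for houses’ appears in the abstract.

```python
import math

# Saaty's random consistency index for matrix sizes 3..9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priority weights with normalised row geometric means."""
    n = len(matrix)
    gmeans = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(gmeans)
    return [g / total for g in gmeans]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI; values below about 0.1 are usually accepted."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    lam = sum(
        sum(matrix[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ) / n
    ci = (lam - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Illustrative data only: entry [i][j] says how much more important risk i is than risk j.
risks = ["changes in demand for houses", "cost overrun", "design changes"]
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(comparisons)
cr = consistency_ratio(comparisons, weights)
ranking = sorted(zip(risks, weights), key=lambda pair: pair[1], reverse=True)
```

The example matrix is perfectly consistent, so the weights come out in the ratio 4:2:1 and the consistency ratio is zero; real expert judgements normally give a small positive ratio.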

Keywords: risk management, risk identification, risk analysis, analytic hierarchical process

Procedia PDF Downloads 87
24536 Observatory of Sustainability of the Algarve Region for Tourism: Proposal for Environmental and Sociocultural Indicators

Authors: Miguel José Oliveira, Fátima Farinha, Elisa M. J. da Silva, Rui Lança, Manuel Duarte Pinheiro, Cátia Miguel

Abstract:

The Observatory of Sustainability of the Algarve Region for Tourism (OBSERVE) will be a valuable tool to assess the sustainability of this region. The OBSERVE tool is designed to provide data and maintain an up-to-date, consistent set of indicators defined to describe the region in the environmental, sociocultural, economic and institutional domains. This ongoing two-year project has the active participation of the Algarve’s stakeholders, since they were consulted and asked to participate in the discussion of the indicators proposal. The environmental and sociocultural indicators chosen must indicate the characteristics of the region and should be in alignment with other global systems used to monitor sustainability. This paper presents a review of sustainability indicator systems that supports the first proposal for the environmental and sociocultural indicators. Other constraints are discussed, namely the existing data and the data available on digital platforms in a format suitable for automatic import into the OBSERVE platform. It is intended that OBSERVE will be a valuable tool to assess the sustainability of the Algarve region.

Keywords: Algarve, development, environmental indicators, observatory, sociocultural indicators, sustainability, tourism

Procedia PDF Downloads 140
24535 Graph Neural Network-Based Classification for Disease Prediction in Health Care Heterogeneous Data Structures of Electronic Health Record

Authors: Raghavi C. Janaswamy

Abstract:

In the healthcare sector, heterogeneous data elements such as patients, diagnoses, symptoms, conditions, observation text from physician notes, and prescriptions form the essentials of the Electronic Health Record (EHR). The data, in the form of clear text and images, are stored or processed in a relational format in most systems. However, the intrinsic structure restrictions and complex joins of relational databases limit their widespread utility. In this regard, the design and development of realistic mapping and deep connections as real-time objects offer unparalleled advantages. Herein, a graph neural network-based classification of EHR data has been developed. Patient conditions have been predicted as a node classification task using graph-based open source EHR data, the Synthea Database, stored in TigerGraph. The Synthea DB dataset is leveraged because it closely represents real-time data and is voluminous. The graph model is built from the heterogeneous EHR data using Python modules, namely pyTigerGraph to get nodes and edges from the TigerGraph database, PyTorch to tensorize the nodes and edges, and PyTorch-Geometric (PyG) to train the Graph Neural Network (GNN), adopt self-supervised learning techniques with AutoEncoders to generate the node embeddings, and eventually perform the node classification using the node embeddings. The model predicts patient conditions ranging from common to rare situations. The outcome is deemed to open up opportunities for data querying toward better predictions and accuracy.

Keywords: electronic health record, graph neural network, heterogeneous data, prediction

Procedia PDF Downloads 62
24534 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data is then pre-processed using data pre-processing techniques. The processed data is fed into the proposed algorithm and the obtained result is analyzed. Some of the algorithms used in satellite imagery classification are U-Net, Random Forest, DeepLabv3, CNN, ANN, ResNet, etc. In this project, we are using the DeepLabv3 (atrous convolution) algorithm for land cover classification. The dataset used is the DeepGlobe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.
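
The core idea of atrous (dilated) convolution, as used in DeepLabv3, is that the kernel taps are spaced `rate` samples apart, so the receptive field grows without adding weights. The sketch below shows this in one dimension with plain Python; DeepLabv3 itself applies the operation to 2-D feature maps, and the function name, signal and kernel here are illustrative assumptions.

```python
def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1-D atrous convolution (cross-correlation form, as in deep learning).

    Kernel taps are placed `rate` samples apart; rate=1 is an ordinary convolution.
    """
    if rate < 1:
        raise ValueError("rate must be a positive integer")
    span = (len(kernel) - 1) * rate + 1  # receptive field of one output value
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(k * signal[start + i * rate] for i, k in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # simple difference filter

dense = atrous_conv1d(signal, kernel, rate=1)    # each output spans 3 samples
dilated = atrous_conv1d(signal, kernel, rate=2)  # each output spans 5 samples
```

With the same three weights, the rate-2 filter compares samples four positions apart instead of two, which is how several rates applied in parallel capture context at several scales.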

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 115
24533 An Evaluation Model for Automatic Map Generalization

Authors: Quynhan Tran, Hong Fan, Quockhanh Pham

Abstract:

Automatic map generalization is a well-known problem in cartography. The development of map generalization research accompanied the development of cartography. The traditional map is plotted manually by cartographic experts. The paper studies none-scale automation generalization of resident polygons and house marker symbols, and proposes a methodology to evaluate the result maps based on the minimal spanning tree. In this paper, the minimal spanning trees before and after map generalization are compared to evaluate whether the generalization result maintains the geographical distribution of features. The minimal spanning tree in vector format is first converted into a raster format with a grid size of 2 mm (distance on the map). The number of matching grid cells before and after map generalization and the ratio of overlapping grid cells to the total number of grid cells are then calculated. Evaluation experiments are conducted to verify the results. Experiments show that this methodology can give an objective evaluation of the feature distribution and give specialists a hand while they visually evaluate result maps of none-scale automation generalization.
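
The evaluation procedure described above (build a minimal spanning tree, rasterize it on a 2 mm grid, compare cells before and after generalization) can be sketched as follows. This is a rough illustration under several assumptions: features are reduced to points, edges are rasterized by dense sampling rather than an exact line-drawing algorithm, and the overlap ratio is taken as shared cells divided by all cells touched by either tree, which is one possible reading of "the ratio of overlapping grid to the total grids". All names and coordinates are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns index pairs."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def rasterize(points, edges, cell=2.0):
    """Approximate the set of grid cells crossed by the tree edges (cell size in map mm)."""
    cells = set()
    for a, b in edges:
        (x0, y0), (x1, y1) = points[a], points[b]
        steps = max(1, math.ceil(4 * math.dist(points[a], points[b]) / cell))
        for s in range(steps + 1):
            t = s / steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            cells.add((math.floor(x / cell), math.floor(y / cell)))
    return cells


def overlap_ratio(cells_before, cells_after):
    """Shared cells divided by all cells touched by either tree (assumed definition)."""
    union = cells_before | cells_after
    return len(cells_before & cells_after) / len(union) if union else 1.0


# Hypothetical map coordinates in millimetres.
before = [(1, 1), (5, 1), (9, 1), (9, 5)]
kept_shape = [(1, 1), (9, 1), (9, 5)]   # a collinear point removed
lost_shape = [(1, 1), (9, 5)]           # the corner removed as well

cells_before = rasterize(before, mst_edges(before))
ratio_kept = overlap_ratio(cells_before, rasterize(kept_shape, mst_edges(kept_shape)))
ratio_lost = overlap_ratio(cells_before, rasterize(lost_shape, mst_edges(lost_shape)))
```

Removing a point that lies on an existing tree edge leaves the rasterized tree unchanged, whereas removing the corner point changes which cells are covered and lowers the ratio.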

Keywords: automatic cartography generalization, evaluation model, geographic feature distribution, minimal spanning tree

Procedia PDF Downloads 607
24532 Anomaly Detection in Financial Markets Using Tucker Decomposition

Authors: Salma Krafessi

Abstract:

The financial markets have a multifaceted, intricate environment, and enormous volumes of data are produced every day. To find investment possibilities, possible fraudulent activity, and market oddities, accurate anomaly identification in this data is essential. Conventional methods for detecting anomalies frequently fail to capture the complex organization of financial data. In order to improve the identification of abnormalities in financial time series data, this study presents Tucker Decomposition as a reliable multi-way analysis approach. We start by gathering closing prices for the S&P 500 index across a number of decades. The information is converted to a three-dimensional tensor format, which contains internal characteristics and temporal sequences in a sliding window structure. The tensor is then broken down using Tucker Decomposition into a core tensor and matching factor matrices, allowing latent patterns and relationships in the data to be captured. A possible sign of abnormalities is the reconstruction error from Tucker's Decomposition. We are able to identify large deviations that indicate unusual behavior by setting a statistical threshold. A thorough examination that contrasts the Tucker-based method with traditional anomaly detection approaches validates our methodology. The outcomes demonstrate the superiority of Tucker's Decomposition in identifying intricate and subtle abnormalities that are otherwise missed. This work opens the door for more research into multi-way data analysis approaches across a range of disciplines and emphasizes the value of tensor-based methods in financial analysis.
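
Two of the steps described above can be shown without any tensor library: cutting a price series into sliding windows, and flagging windows whose reconstruction error exceeds a statistical threshold. The sketch below deliberately leaves out the Tucker decomposition itself and assumes the per-window reconstruction errors are already available. The threshold rule (mean plus k standard deviations) is an assumption, since the abstract does not state which statistic is used, and the numbers are made up.

```python
import statistics


def sliding_windows(series, width):
    """Return all consecutive windows of the given width."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]


def flag_anomalies(errors, k=3.0):
    """Indices whose error exceeds mean + k * standard deviation (assumed rule)."""
    mu = statistics.fmean(errors)
    sigma = statistics.pstdev(errors)
    threshold = mu + k * sigma
    return [i for i, e in enumerate(errors) if e > threshold]


# Made-up closing prices and made-up reconstruction errors, for illustration only.
windows = sliding_windows([100.0, 101.0, 99.5, 102.0], width=2)
errors = [0.1] * 20 + [5.0]
anomalies = flag_anomalies(errors)
```

In a full pipeline the windows would be stacked into a tensor, decomposed, reconstructed, and the per-window error fed into `flag_anomalies`.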

Keywords: tucker decomposition, financial markets, financial engineering, artificial intelligence, decomposition models

Procedia PDF Downloads 22
24531 Integration Process and Analytic Interface of different Environmental Open Data Sets with Java/Oracle and R

Authors: Pavel H. Llamocca, Victoria Lopez

Abstract:

The main objective of our work is the comparative analysis of environmental data from Open Data bases belonging to different governments. This requires integrating data from many different sources. Nowadays, many governments intend to publish thousands of data sets for people and organizations to use, and the number of applications based on Open Data is increasing accordingly. However, each government has its own procedures for publishing its data, which leads to a variety of data set formats because there are no international standards specifying the formats of data sets from Open Data bases. Because of this variety, we must build a data integration process that is able to bring together all kinds of formats. Some software tools have been developed to support the integration process, e.g. Data Tamer and Data Wrangler. The problem with these tools is that they need a data scientist to take part in the integration process as a final step. In our case, we do not want to depend on a data scientist, because environmental data are usually similar and these processes can be automated by programming. The main idea of our tool is to build Hadoop procedures adapted to the data sources of each government in order to achieve automated integration. Our work focuses on environmental data such as temperature, energy consumption, air quality, solar radiation, wind speed, etc. For the past two years, the government of Madrid has been publishing its Open Data bases on environmental indicators in real time. Similarly, other governments (such as Andalucia or Bilbao) have published Open Data sets on the environment. All of these data sets have different formats, and our solution is able to integrate all of them; furthermore, it allows the user to run and visualize analyses over the real-time data.
Once the integration task is done, all the data from any government have the same format, and the analysis can proceed in a computationally more efficient way. The tool presented in this work therefore has two goals: 1. an integration process; and 2. a graphic and analytic interface. As a first approach, the integration process was developed using Java and Oracle, and the graphic and analytic interface with Java (JSP). However, in order to open up our software tool, as a second approach we also developed an implementation in the R language, a mature open source technology. R is a powerful open source programming language that allows us to process and analyze large amounts of data with high performance. There are also R libraries for building a graphic interface, such as shiny. A performance comparison between both implementations was made and no significant differences were found. In addition, our work provides developers with an Official Real-Time Integrated Data Set about Environment Data in Spain so that they can build their own applications.

Keywords: open data, R language, data integration, environmental data

Procedia PDF Downloads 286
24530 Methodologies for Deriving Semantic Technical Information Using an Unstructured Patent Text Data

Authors: Jaehyung An, Sungjoo Lee

Abstract:

Patent documents constitute an up-to-date and reliable source of knowledge reflecting technological advances, so patent analysis has been widely used to identify technological trends and formulate technology strategies. However, identifying technological information from patent data has limitations such as high cost, complexity, and inconsistency, because it relies on expert knowledge. To overcome these limitations, researchers have applied quantitative analysis based on the keyword technique. With this method, one can capture the technological implications of particular patent documents or extract keywords that indicate the important content. However, it uses only simple counting of keyword frequency, so it cannot take into account the semantic relationships between keywords or semantic information such as how the technologies are used in their technology area and how they affect other technologies. To automatically analyze unstructured technological information in patents and extract semantic information, the text should be transformed into an abstracted form that includes the key technological concepts. The specific sentence structure ‘SAO’ (subject, action, object) has recently emerged as a way of representing ‘key concepts’ and can be extracted by NLP (natural language processing). An SAO structure can be organized in a problem-solution format if the action-object (AO) states the problem and the subject (S) forms the solution. In this paper, we propose a new methodology that extracts SAO structures through rules for extracting technical elements. Although sentence structures in patent text have a unique format, prior studies have depended on general NLP applied to common documents such as newspapers, research papers, and Twitter mentions, so they cannot take into account the specific sentence structure types of patent documents.
To overcome this limitation, we identified the unique form of patent sentences and defined the SAO structures in patent text data. There are four types of technical elements: technology adoption purpose, application area, tool for technology, and technical components. Each of these four types of sentence structure in patents has its own specific word structure, determined by the location or sequence of the parts of speech in each sentence. Finally, we developed algorithms for extracting SAOs, and the results offer insight into the technology innovation process by providing different perspectives on technology.

Keywords: NLP, patent analysis, SAO, semantic-analysis

Procedia PDF Downloads 242
24529 Hierarchical Cluster Analysis of Raw Milk Samples Obtained from Organic and Conventional Dairy Farming in Autonomous Province of Vojvodina, Serbia

Authors: Lidija Jevrić, Denis Kučević, Sanja Podunavac-Kuzmanović, Strahinja Kovačević, Milica Karadžić

Abstract:

In the present study, Hierarchical Cluster Analysis (HCA) was applied in order to determine the differences between milk samples originating from a conventional dairy farm (CF) and an organic dairy farm (OF) in AP Vojvodina, Republic of Serbia. The clustering was based on the average values of saturated fatty acid (SFA) content and unsaturated fatty acid (UFA) content obtained for every season. The HCA therefore included the annual SFA and UFA content values. The clustering procedure was carried out on the basis of Euclidean distances and the single linkage algorithm. The obtained dendrograms indicated that the clustering of UFA in OF was much more uniform than the clustering of UFA in CF. In OF, spring stands out from the other seasons of the year. The same can be noticed for CF, where winter is separated from the other seasons. These results could be expected because the fatty acid composition is greatly influenced by the season and by the nutrition of the dairy cows during the year.
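The clustering procedure named in this abstract (Euclidean distance with single linkage) can be illustrated with the short sketch below. It is a generic textbook version, not the authors' code; the function name `single_linkage` and the layout of the input (one tuple of values per season) are assumptions.

```python
import math


def single_linkage(samples):
    """Agglomerative clustering with Euclidean distance and single linkage.

    samples: dict mapping a label (e.g. a season) to a tuple of numeric values.
    Returns the merge history as a list of (cluster_a, cluster_b, distance),
    in the order in which a dendrogram would join the clusters.
    """
    clusters = [frozenset([label]) for label in samples]
    merges = []

    def linkage(a, b):
        # Single linkage: distance between the two closest members.
        return min(math.dist(samples[p], samples[q]) for p in a for q in b)

    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((set(clusters[i]), set(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

A season that "stands out" in the dendrogram corresponds to a label that is merged last and at a comparatively large distance in the returned history.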

Keywords: chemometrics, clustering, food engineering, milk quality

Procedia PDF Downloads 252
24528 Hierarchical Queue-Based Task Scheduling with CloudSim

Authors: Wanqing You, Kai Qian, Ying Qian

Abstract:

The concepts of Cloud Computing provide users with infrastructure, platform, and software as a service, making those services more accessible to people via the Internet. To better analyze the performance of Cloud Computing provisioning policies and resource allocation strategies, a toolkit named CloudSim was proposed. With CloudSim, a Cloud Computing environment can easily be constructed by modelling and simulating cloud computing components such as datacenters, hosts, and virtual machines. A good scheduling strategy is the key to achieving load balancing among different machines and to improving the utilization of basic resources. Existing scheduling algorithms may work well in some presumptive cases on a single machine; however, they are unable to make the best decision for the unforeseen future. In a real-world scenario, there would be a large number of tasks as well as several virtual machines working in parallel. Based on the concept of multi-queue, this paper presents a new scheduling algorithm that schedules tasks with CloudSim by taking into account several parameters: the machines’ capacity, the priority of tasks, and the history log.
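To make the multi-queue idea concrete, the sketch below shows one possible reading of it. It is not the authors' algorithm and does not use the CloudSim API (CloudSim is a Java toolkit); the classes `Task` and `Machine`, the function `schedule`, the three priority levels, and the earliest-finish-time rule are all hypothetical. The three inputs mirror the parameters named in the abstract: machine capacity, task priority, and a history log.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    length: float   # amount of work, in arbitrary units
    priority: int   # 0 is the highest priority


@dataclass
class Machine:
    name: str
    capacity: float                 # work units processed per second
    busy_until: float = 0.0         # time at which the machine becomes free
    history: list = field(default_factory=list)  # log of assigned task names


def schedule(tasks, machines, levels=3):
    """Assign tasks to machines, draining higher-priority queues first.

    Each task goes to the machine on which it would finish earliest,
    given that machine's capacity and the work already logged on it.
    Returns a dict mapping task name to machine name.
    """
    queues = [deque() for _ in range(levels)]
    for task in tasks:
        level = max(0, min(task.priority, levels - 1))  # clamp to a valid queue
        queues[level].append(task)

    assignment = {}
    for queue in queues:
        while queue:
            task = queue.popleft()
            best = min(
                machines,
                key=lambda m: m.busy_until + task.length / m.capacity,
            )
            best.busy_until += task.length / best.capacity
            best.history.append(task.name)
            assignment[task.name] = best.name
    return assignment
```

In this reading, the history log is simply the list of tasks already placed on each machine; a fuller version could use past execution times to refine the finish-time estimate.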

Keywords: hierarchical queue, load balancing, CloudSim, information technology

Procedia PDF Downloads 393
24527 Linguistic Codes: Food as a Class Indicator

Authors: Elena Valeryevna Pozhidaeva

Abstract:

This linguistic case study is based on the interaction between social position and foodways. In every culture there is a social hierarchical system with means to express and identify the social status of a person. Food serves as a class indicator. The British, being a verbal nation, use words as a preferred medium for signalling and recognising social status. The linguistic analysis reflects a symbolic hierarchy determined by social groups in the UK. The linguistic class indicators of the British hierarchical system are detectable directly, in speech acts. They are articulated in every aspect of a national identity’s life, from food preferences and the choice of what to call the food to the names of the meals. The linguistic class indicators can also be detected indirectly, through symbolic meaning or via the choice of mealtime, its class (e.g. the classes of tea or marmalade), the place to buy food (the class of the supermarket), and the place to consume it (the places for eating out and the frequency of such practices). The study analyses not only food items and their names but also such categories as cutlery as a class indicator and the act of eating together as a practice of social significance and a class indicator. Current social changes and economic developments are considered, along with their influence on the appearance and transformation of class indicators.

Keywords: linguistic, class, social indicator, English, food class

Procedia PDF Downloads 371
24526 Fuzzy Inference-Assisted Saliency-Aware Convolution Neural Networks for Multi-View Summarization

Authors: Tanveer Hussain, Khan Muhammad, Amin Ullah, Mi Young Lee, Sung Wook Baik

Abstract:

The Big Data generated by distributed vision sensors installed on a large scale in smart cities creates hurdles for its efficient and beneficial exploration for browsing, retrieval, and indexing. This paper presents a three-fold framework for effective video summarization of such data and provides a compact and representative format of Big Video Data. In the first fold, the framework acquires input video data from the installed cameras and collects clues such as the type and count of objects and the clarity of the view from a chunk of a pre-defined number of frames of each view. In the second fold, the representative view for a particular interval is selected by a fuzzy inference system, which uses the known clues to reach a precise, human-like decision. In the third fold, the selected view frames are forwarded to the summary generation mechanism, which is supported by a saliency-aware convolution neural network (CNN) model. The combination of fuzzy rules for view selection followed by a CNN architecture for saliency computation makes the multi-view video summarization (MVS) framework a suitable candidate for real-world practice in smart cities.

Keywords: big video data analysis, fuzzy logic, multi-view video summarization, saliency detection

Procedia PDF Downloads 160
24525 Nanoarchitectures Cu2S Functions as Effective Surface-Enhanced Raman Scattering Substrates for Molecular Detection Application

Authors: Yu-Kuei Hsu, Ying-Chu Chen, Yan-Gu Lin

Abstract:

The hierarchical Cu2S nanostructured film is successfully fabricated using an electroplated ZnO nanorod array as a template, followed by a chemical solution process for the growth of Cu2S, for application in surface-enhanced Raman scattering (SERS) detection. The as-grown Cu2S nanostructures were thermally treated at temperatures of 150-300 °C under a nitrogen atmosphere to improve the crystal quality, which unexpectedly induced Cu nanoparticles on the surface of the Cu2S. The structure and composition of the thermally treated Cu2S nanostructures were carefully analyzed by SEM, XRD, XPS, and XAS. Using 4-aminothiophenol (4-ATP) as the probe molecule, the SERS experiments showed that the thermally treated Cu2S nanostructures exhibit excellent detection performance and could be used as an active and cost-effective SERS substrate for ultrasensitive detection. Additionally, these novel hierarchical SERS substrates show good reproducibility and a linear dependence between analyte concentration and intensity, revealing the advantage of this method for easily scaled-up production.

Keywords: cuprous sulfide, copper, nanostructures, surface-enhanced raman scattering

Procedia PDF Downloads 380
24524 Scene Classification Using Hierarchy Neural Network, Directed Acyclic Graph Structure, and Label Relations

Authors: Po-Jen Chen, Jian-Jiun Ding, Hung-Wei Hsu, Chien-Yao Wang, Jia-Ching Wang

Abstract:

A more accurate scene classification algorithm using label relations and the hierarchy neural network was developed in this work. Many classification algorithms assume that the labels are mutually exclusive. This assumption holds in some specific problems; for scene classification, however, it is not reasonable. Because there is a variety of objects within a photo image, it is more practical to assign multiple labels to an image. In this paper, two label relations, the exclusive relation and the hierarchical relation, were adopted in the classification process to achieve more accurate multiple-label classification results. Moreover, the hierarchy neural network (hierarchy NN) is applied to classify the image, and the directed acyclic graph structure is used to predict a more reasonable result that obeys the exclusive and hierarchical relations. Simulations show that, with these techniques, a much more accurate scene classification result can be achieved.
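The two label relations described in this abstract can be pictured as constraints on which sets of labels may be switched on together. The sketch below checks a candidate label set against both relations; it is only an illustration of the constraints, not the authors' network or inference procedure, and the function name `is_consistent` and the example labels are invented.

```python
def is_consistent(active, parents, exclusive):
    """Check a multi-label prediction against hierarchical and exclusive relations.

    active:    set of labels predicted for the image.
    parents:   dict mapping a label to the set of its parent labels
               (edges of the directed acyclic graph).
    exclusive: iterable of (a, b) label pairs that must not occur together.
    """
    # Hierarchical relation: an active label requires all of its parents.
    for label in active:
        if not parents.get(label, set()) <= active:
            return False
    # Exclusive relation: no forbidden pair may be active at the same time.
    return not any(a in active and b in active for a, b in exclusive)
```

For example, with the hypothetical relations `parents = {"beach": {"outdoor"}}` and `exclusive = [("indoor", "outdoor")]`, the set `{"beach", "outdoor"}` is accepted, while `{"beach"}` alone and `{"beach", "outdoor", "indoor"}` are both rejected.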

Keywords: convolutional neural network, label relation, hierarchy neural network, scene classification

Procedia PDF Downloads 423
24523 Analysis of Airborne Data Using Range Migration Algorithm for the Spotlight Mode of Synthetic Aperture Radar

Authors: Peter Joseph Basil Morris, Chhabi Nigam, S. Ramakrishnan, P. Radhakrishna

Abstract:

This paper presents an analysis of airborne Synthetic Aperture Radar (SAR) data using the Range Migration Algorithm (RMA) for the spotlight mode of operation. Unlike the polar format algorithm (PFA), RMA mitigates space-variant defocusing and geometric distortion effects because it does not assume that the illuminating wavefronts are planar. This facilitates the use of RMA for imaging scenarios involving severe differential range curvatures, enabling the imaging of larger scenes at fine resolution and at shorter ranges with low center frequencies. The RMA for the spotlight mode of SAR is analyzed in this paper using the airborne data. Pre-processing operations, viz. range de-skew and motion compensation to a line, are performed on the raw data before they are fed to the RMA component. The various stages of the RMA, viz. 2D matched filtering, along-track Fourier transform, and Stolt interpolation, are analyzed to find the performance limits and the dependence of the resolution of the final image on the imaging geometry. The ability of RMA to compensate for severe differential range curvatures in the two-dimensional spatial frequency domain is also illustrated in this paper.

Keywords: range migration algorithm, spotlight SAR, synthetic aperture radar, matched filtering, Stolt interpolation

Procedia PDF Downloads 211
24522 Estimating the Probability of Winning the Best Actor/Actress Award Conditional on the Best Picture Nomination with Bayesian Hierarchical Models

Authors: Svetlana K. Eden

Abstract:

Movies and TV shows have long become part of modern culture. We all have our preferred genre, story, actors, and actresses. However, can we objectively discern good acting from the bad? As laymen, we are probably not objective, but what about the Oscar academy members? Are their votes based on objective measures? Oscar academy members are probably also biased due to many factors, including their professional affiliations or advertisement exposure. Heavily advertised films bring more publicity to their cast and are likely to have bigger budgets. Because a bigger budget may also help earn a Best Picture (BP) nomination, we hypothesize that best actor/actress (BA) nominees from BP-nominated movies would have higher chances of winning the award than those BA nominees from non-BP-nominated films. To test this hypothesis, three Bayesian hierarchical models are proposed, and their performance is evaluated. The results from all three models largely support our hypothesis. Depending on the proportion of BP nominations among BA nominees, the odds ratios (estimated over expected) of winning the BA award conditional on BP nomination vary from 2.8 [0.8, 7.0] to 4.3 [2.0, 15.8] for actors and from 1.5 [0.0, 12.2] to 5.4 [2.7, 14.2] for actresses.

Keywords: Oscar, best picture, best actor/actress, bias

Procedia PDF Downloads 196
24521 Mobile Application for Construction Sites Management

Authors: A. Khelifi, M. Al Kaabi, B. Al Rawashdeh

Abstract:

Infrastructure is one of the most important pillars of the UAE, which spends millions of dollars on investments in the construction sector. Research by Kuwait Finance House (KFH) Research showed clearly that UAE investments in the construction sector exceeded 30 billion dollars in 2013. There are many construction companies in the UAE, each responsible for building different infrastructure. Large-scale construction projects involve many human activities, which can affect the efficiency and productivity of running projects. The Construction Administration System was developed to increase efficiency and productivity at construction sites. It runs on two platforms, web server and mobile phone, and supports two main users: the mobile user and the institution employee. With the Construction Administration Mobile Application, the user can manage and control several projects, create reports and send them in Portable Document Format (PDF) by email, view the physical location of each project, and capture and save photos. An institution employee can use the system to view all existing workers and projects, send emails, and view the progress of each project.

Keywords: construction sites, management, mobile application, Portable Document Format (PDF)

Procedia PDF Downloads 349
24520 The Role of Demographics and Service Quality in the Adoption and Diffusion of E-Government Services: A Study in India

Authors: Sayantan Khanra, Rojers P. Joseph

Abstract:

Background and Significance: This study is aimed at analyzing the role of demographic and service quality variables in the adoption and diffusion of e-government services among the users in India. The study proposes to examine the users' perception about e-Government services and investigate the key variables that are most salient to the Indian populace. Description of the Basic Methodologies: The methodology to be adopted in this study is Hierarchical Regression Analysis, which will help in exploring the impact of the demographic variables and the quality dimensions on the willingness to use e-government services in two steps. First, the impact of demographic variables on the willingness to use e-government services is to be examined. In the second step, quality dimensions would be used as inputs to the model for explaining variance in excess of prior contribution by the demographic variables. Present Status: Our study is in the data collection stage in collaboration with a highly reliable, authentic and adequate source of user data. Assuming that the population of the study comprises all the Internet users in India, a massive sample size of more than 10,000 random respondents is being approached. Data is being collected using an online survey questionnaire. A pilot survey has already been carried out to refine the questionnaire with inputs from an expert in management information systems and a small group of users of e-government services in India. The first three questions in the survey pertain to the Internet usage pattern of a respondent and probe whether the person has used e-government services. If the respondent confirms that he/she has used e-government services, then an aggregate of 15 indicators are used to measure the quality dimensions under consideration and the willingness of the respondent to use e-government services, on a five-point Likert scale. 
If the respondent reports that he/she has not used e-government services, then a few optional questions are asked to understand the reason(s) behind the same. The last four questions in the survey are dedicated to collecting data related to the demographic variables. An Indication of the Major Findings: Based on the extensive literature review carried out to develop several propositions, a research model is prescribed to start with. A major outcome expected at the completion of the study is the development of a research model that would help to understand the relationship between the demographic variables and service quality dimensions on the one hand and the willingness to adopt e-government services on the other, particularly in an emerging economy like India. Concluding Statement: Governments of emerging economies and other relevant agencies can use the findings from the study in designing, updating, and promoting e-government services to enhance public participation, which in turn would help to improve efficiency, convenience, engagement, and transparency in implementing these services.

Keywords: adoption and diffusion of e-government services, demographic variables, hierarchical regression analysis, service quality dimensions

Procedia PDF Downloads 233
24519 A Heart Arrhythmia Prediction Using Machine Learning’s Classification Approach and the Concept of Data Mining

Authors: Roshani S. Golhar, Neerajkumar S. Sathawane, Snehal Dongre

Abstract:

Background and objectives: Cardiovascular illnesses are increasing and becoming a cause of mortality worldwide, killing many people each year. Arrhythmia is a type of cardiac illness characterized by a change in the linearity of the heartbeat. The goal of this study is to develop novel deep learning algorithms for successfully interpreting arrhythmia using a single second segment. Because the ECG signal indicates unique electrical heart activity across time, considerable changes between time intervals are detected. Such variances, as well as the limited amount of learning data available for each arrhythmia, make standard learning methods difficult, and so impede its exaggeration. Conclusions: The proposed method was able to outperform several state-of-the-art methods. The proposed technique is also an effective and convenient deep learning approach to heartbeat interpretation that could be used in real-time healthcare monitoring systems.

Keywords: electrocardiogram, ECG classification, neural networks, convolutional neural networks, portable document format

Procedia PDF Downloads 45