Search results for: data clustering
24749 Leverage Effect for Volatility with Generalized Laplace Error
Authors: Farrukh Javed, Krzysztof Podgórski
Abstract:
We propose a new model that accounts for the asymmetric response of volatility to positive ('good news') and negative ('bad news') shocks in economic time series, the so-called leverage effect. In the past, asymmetric powers of errors in conditionally heteroskedastic models have been used to capture this effect. Our model uses the gamma difference representation of the generalized Laplace distributions, which efficiently models the asymmetry. It has one additional natural parameter, the shape, which is used instead of the power in the asymmetric power models to capture the strength of a long-lasting effect of shocks. Some fundamental properties of the model are provided, including the formula for covariances and an explicit form for the conditional distribution of the 'bad' and 'good' news processes given the past, a property that is important for the statistical fitting of the model. Relevant features of volatility models are illustrated using S&P 500 historical data.
Keywords: heavy tails, volatility clustering, generalized asymmetric Laplace distribution, leverage effect, conditional heteroskedasticity, asymmetric power volatility, GARCH models
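The gamma difference representation mentioned in the abstract can be illustrated with a small sketch. A generalized asymmetric Laplace variable can be written as the difference of two independent gamma variables that share a shape parameter but have different scales. The function and parameter names below (`scale_good`, `scale_bad`) are illustrative assumptions, and the sketch only samples the error term; it is not the authors' volatility model.

```python
import random


def sample_gamma_difference(shape, scale_good, scale_bad, rng):
    """Draw one error term as a 'good news' gamma minus a 'bad news' gamma."""
    good = rng.gammavariate(shape, scale_good)  # positive shock component
    bad = rng.gammavariate(shape, scale_bad)    # negative shock component
    return good - bad


def simulate_errors(n, shape, scale_good, scale_bad, seed=0):
    """Simulate n independent error terms with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [sample_gamma_difference(shape, scale_good, scale_bad, rng)
            for _ in range(n)]


def theoretical_mean(shape, scale_good, scale_bad):
    """Mean of the difference: each gamma has mean shape * scale."""
    return shape * (scale_good - scale_bad)


if __name__ == "__main__":
    errors = simulate_errors(20000, shape=2.0, scale_good=1.0, scale_bad=1.5)
    print(sum(errors) / len(errors), theoretical_mean(2.0, 1.0, 1.5))
```

Unequal scales make the distribution asymmetric, while the shared shape parameter plays the role the abstract assigns to the additional "shape" parameter.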
Procedia PDF Downloads 381
24748 Intelligent Software Architecture and Automatic Re-Architecting Based on Machine Learning
Authors: Gebremeskel Hagos Gebremedhin, Feng Chong, Heyan Huang
Abstract:
A software system is a combination of architecture and organized components that accomplishes a specific function or set of functions. A good software architecture facilitates application system development, promotes achievement of functional requirements, and supports system reconfiguration. We describe three studies demonstrating the utility of our architecture in the subdomain of mobile office robots and identify software engineering principles embodied in the architecture. The main aim of this paper is to analyze and improve architecture design and automatic re-architecting using machine learning. Intelligent software architecture with automatic re-architecting reorganizes the organizational structure of the software into a more suitable one, using a user access dataset to establish relationships among the components of the system. A 3-step data mining approach was used to analyze effective recovery, transformation and implementation with the use of a clustering algorithm. Automatic re-architecting without changing the source code can therefore address the software complexity problem and support system software reuse.
Keywords: intelligence, software architecture, re-architecting, software reuse, High level design
Procedia PDF Downloads 114
24747 Application of a Model-Free Artificial Neural Networks Approach for Structural Health Monitoring of the Old Lidingö Bridge
Authors: Ana Neves, John Leander, Ignacio Gonzalez, Raid Karoumi
Abstract:
Systematic monitoring and inspection are needed to assess the present state of a structure and predict its future condition. If an irregularity is noticed, repair actions may take place, and an adequate intervention will most probably reduce future maintenance costs, minimize downtime and increase safety by avoiding the failure of the structure as a whole or of one of its structural parts. For this to be possible, decisions must be made at the right time, which implies using systems that can detect abnormalities at an early stage. In this sense, Structural Health Monitoring (SHM) is seen as an effective tool for improving the safety and reliability of infrastructures. This paper explores the decision-making problem in SHM regarding the maintenance of civil engineering structures. The aim is to assess the present condition of a bridge based exclusively on measurements, using the method suggested in this paper, such that action is taken coherently with the information made available by the monitoring system. Artificial Neural Networks are trained and their ability to predict structural behavior is evaluated in the light of a case study where acceleration measurements are acquired from a bridge located in Stockholm, Sweden. This relatively old bridge is still in operation despite experiencing obvious problems already reported in previous inspections. The prediction errors provide a measure of the accuracy of the algorithm and are subjected to further investigation, which comprises concepts like clustering analysis and statistical hypothesis testing. These make it possible to interpret the obtained prediction errors, draw conclusions about the state of the structure and thus support decision making regarding its maintenance.
Keywords: artificial neural networks, clustering analysis, model-free damage detection, statistical hypothesis testing, structural health monitoring
Procedia PDF Downloads 203
24746 Heterogeneity of Soil Moisture and Its Impacts on the Mountainous Watershed Hydrology in Northwest China
Authors: Chansheng He, Zhongfu Wang, Xiao Bai, Jie Tian, Xin Jin
Abstract:
Heterogeneity of soil hydraulic properties directly affects hydrological processes at different scales. Understanding the heterogeneity of soil hydraulic properties such as soil moisture is therefore essential for modeling watershed ecohydrological processes, particularly in hard-to-access, topographically complex mountainous watersheds. This study maps spatial variations of soil moisture using an in situ observation network that consists of sampling points, zones, and tributaries, and monitors the corresponding hydrological variables of air and soil temperatures, evapotranspiration, infiltration, and runoff in the Upper Reach of the Heihe River Watershed, the second largest inland river (terminal lake), with a drainage area of over 128,000 km², in Northwest China. Subsequently, the study uses a hydrological model, SWAT (Soil and Water Assessment Tool), to simulate the effects of heterogeneity of soil moisture on watershed hydrological processes. The spatial clustering method Full-Order-CLK was employed to derive five soil heterogeneous zones (Configuration 97, 80, 65, 40, and 20) for soil input to SWAT. Results show that the SWAT simulations using the spatially clustered soil hydraulic information from the field sampling data represented the soil heterogeneity much better and performed more accurately than the model using the average soil property values for each soil type derived from the coarse soil datasets. Thus, incorporating detailed field sampling soil heterogeneity data greatly improves performance in hydrologic modeling.
Keywords: heterogeneity, soil moisture, SWAT, up-scaling
Procedia PDF Downloads 343
24745 A Quantitative Analysis of Rural to Urban Migration in Morocco
Authors: Donald Wright
Abstract:
The ultimate goal of this study is to reinvigorate the philosophical underpinnings of the study of urbanization with scientific data, with the goal of circumventing what seems an inevitable future clash between rural and urban populations. To that end, urban infrastructure must be sustainable economically, politically and ecologically over the course of several generations as cities continue to grow with the incorporation of climate refugees. Our research will provide data concerning the projected increase in population over the coming two decades in Morocco, and how the population will shift from rural areas to urban centers during that period of time. As a result, urban infrastructure will need to be adapted, developed or built to fit the demand of future internal migrations from rural to urban centers in Morocco. This paper will also examine how past experiences of internally displaced people give insight into the challenges faced by future migrants and, beyond the gathering of data, how people react to internal migration. This study employs four different sets of research tools. First, a large part of this study is archival, which involves compiling the relevant literature on the topic and its complex history. This step also includes gathering data about migrations in Morocco from public data sources. Once the datasets are collected, the next part of the project involves populating the attribute fields and preprocessing the data to make it understandable and usable by machine learning algorithms. In tandem with the mathematical interpretation of data and projected migrations, this study benefits from a theoretical understanding of the critical apparatus existing around urban development of the 20th and 21st centuries, which gives us insight into past infrastructure development and the rationale behind it.
Once the data is ready to be analyzed, different machine learning algorithms (k-clustering, support vector regression, random forest analysis) will be tried and the results compared for visualization of the data. The final computational part of this study involves analyzing the data and determining what we can learn from it. This paper helps us to understand future trends of population movements within and between regions of North Africa, which will have an impact on various sectors such as urban development, food distribution and water purification, not to mention the creation of public policy in the countries of this region. One of the strengths of this project is its multi-pronged and cross-disciplinary approach to the research question, which enables an interchange of knowledge and experiences to facilitate innovative solutions to this complex problem. Multiple and diverse intersecting viewpoints allow an exchange of methodological models that provide fresh and informed interpretations of otherwise objective data.
Keywords: climate change, machine learning, migration, Morocco, urban development
Procedia PDF Downloads 145
24744 Associations between Sharing Bike Usage and Characteristics of Urban Street Built Environment in Wuhan, China
Authors: Miao Li, Mengyuan Xu
Abstract:
As a low-carbon travel mode, bicycling has drawn increasing political interest in the contemporary Chinese urban context, and public sharing bikes have become the most popular form of bike usage in China. This research aims to explore the spatial-temporal relationship between sharing bike usage and different characteristics of the urban street built environment. In the research, street segments, defined by street intersections, were used as the analytic unit of the street built environment. The sharing bike usage data include a total of 2.64 million samples, which are the entire sharing bike distribution data recorded over two days in 2018 within a neighborhood of 185.4 hectares in the city of Wuhan, China. These data are assigned to the 97 urban street segments in this area based on their geographic location. The built environment variables used in this research are categorized into three sections: 1) street design characteristics, such as street width, street greenery, and types of bicycle lanes; 2) condition of other public transportation, such as the availability of a metro station; 3) street function characteristics, which are described by the categories and density of the points of interest (POI) along the segments. Spatial Lag Models (SLM) were used to reveal the relationships between specific urban street built environment characteristics and the likelihood of sharing bike usage, both overall and in different periods of the day.
The results show: 1) there is spatial autocorrelation among the sharing bike usage of urban streets in the case area overall, on non-working days, on working days and in each period of the day, which presents a clustering pattern in the street space; 2) there is a statistically strong association between bike sharing usage and several built environment characteristics, such as POI density, types of bicycle lanes and street width; 3) the way bike sharing usage is influenced by built environment characteristics depends on the period of the day. These findings could help policymakers and urban designers better understand the factors affecting bike sharing systems and thus propose guidance and strategies for urban street planning and design that promote the use of sharing bikes.
Keywords: big data, sharing bike usage, spatial statistics, urban street built environment
Procedia PDF Downloads 141
24743 Uplift Segmentation Approach for Targeting Customers in a Churn Prediction Model
Authors: Shivahari Revathi Venkateswaran
Abstract:
Segmenting customers plays a significant role in churn prediction. It helps the marketing team with proactive and reactive customer retention. For reactive retention, the retention team reaches out with special offers to customers who have already shown intent to disconnect. For proactive retention, the marketing team uses a churn prediction model, which ranks each customer from 1 to 100, where rank 1 carries the highest risk of churn/disconnect (high ranks have a high propensity to churn). The churn prediction model is built using an XGBoost model. However, with the churn rank, the marketing team can only reach out to customers based on their individual ranks. Profiling different groups of customers and framing different marketing strategies for targeted groups of customers is not possible with the churn ranks alone. For this, the customers must be grouped into different segments based on their profiles, such as demographics and other non-controllable attributes. This helps the marketing team to frame different offer groups for the targeted audience and prevent them from disconnecting (proactive retention). For segmentation, machine learning approaches like k-means clustering will not form unique customer segments whose customers share the same attributes. This paper presents an alternative approach that finds all combinations of unique segments that can be formed from the user attributes and then finds the segments that have uplift (churn rate higher than the baseline churn rate). For this, search algorithms like fast search and recursive search are used. Further, for each segment, all customers can be targeted using individual churn ranks from the churn prediction model.
Finally, a UI (User Interface) is developed for the marketing team to interactively search for the meaningful segments that are formed, target the right audience in future marketing campaigns and prevent them from disconnecting.
Keywords: churn prediction modeling, XGBoost model, uplift segments, proactive marketing, search algorithms, retention, k-means clustering
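The segment search described in the abstract (form every combination of attribute values, then keep the segments whose churn rate exceeds the baseline) can be sketched as follows. This is a plain exhaustive enumeration used as a stand-in; the abstract's "fast search" and "recursive search" algorithms are not specified, and the field names (`region`, `plan`, `churned`) and sample records are hypothetical.

```python
from itertools import combinations

# Hypothetical customer records; 'churned' is 1 if the customer disconnected.
SAMPLE_CUSTOMERS = [
    {"region": "north", "plan": "basic", "churned": 1},
    {"region": "north", "plan": "basic", "churned": 1},
    {"region": "north", "plan": "premium", "churned": 0},
    {"region": "south", "plan": "basic", "churned": 0},
    {"region": "south", "plan": "premium", "churned": 0},
    {"region": "south", "plan": "premium", "churned": 0},
]


def find_uplift_segments(customers, attributes, min_size=1):
    """Return (baseline churn rate, {segment: churn rate}) for uplift segments.

    A segment is a tuple of (attribute, value) pairs. It is kept when it has
    at least min_size customers and a churn rate above the baseline.
    """
    baseline = sum(c["churned"] for c in customers) / len(customers)
    uplift = {}
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            groups = {}
            for customer in customers:
                key = tuple((a, customer[a]) for a in attrs)
                groups.setdefault(key, []).append(customer["churned"])
            for key, flags in groups.items():
                rate = sum(flags) / len(flags)
                if len(flags) >= min_size and rate > baseline:
                    uplift[key] = rate
    return baseline, uplift


if __name__ == "__main__":
    base, segments = find_uplift_segments(SAMPLE_CUSTOMERS, ["region", "plan"])
    print(f"baseline churn rate: {base:.2f}")
    for segment, rate in segments.items():
        print(segment, f"{rate:.2f}")
```

Unlike k-means, every customer in a returned segment shares exactly the same values for the chosen attributes, which is the property the abstract asks for. The number of attribute combinations grows exponentially, so a real system would need pruning or a minimum segment size.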
Procedia PDF Downloads 68
24742 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach
Authors: Theertha Chandroth
Abstract:
This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating, comparing and integrating JSON data against XML data and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.
Keywords: XML, JSON, data comparison, integration testing, Python, SQL
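One way to check JSON data against XML data, sketched here with Python's standard library, is to normalize both documents into nested dictionaries of strings and compare them. The mapping is an assumption made for illustration (one XML element per JSON key, leaf values compared as text); XML attributes, repeated tags and JSON arrays are deliberately not handled, and the paper's own techniques are not described in the abstract.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Convert an XML element into nested dicts; leaves become stripped text."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    return {child.tag: xml_to_dict(child) for child in children}


def json_to_comparable(value):
    """Convert parsed JSON into nested dicts whose leaves are strings."""
    if isinstance(value, dict):
        return {key: json_to_comparable(item) for key, item in value.items()}
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def json_matches_xml(json_text, xml_text):
    """Return True when the JSON object and the XML tree carry the same data."""
    root = ET.fromstring(xml_text)
    xml_data = {root.tag: xml_to_dict(root)}
    return json_to_comparable(json.loads(json_text)) == xml_data


if __name__ == "__main__":
    json_doc = '{"order": {"id": 42, "customer": {"name": "Ada"}}}'
    xml_doc = "<order><id>42</id><customer><name>Ada</name></customer></order>"
    print(json_matches_xml(json_doc, xml_doc))
```

Comparing leaves as text sidesteps the fact that XML has no native number type, at the cost of treating `1` and `1.0` as different values.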
Procedia PDF Downloads 130
24741 Using Machine Learning Techniques to Extract Useful Information from Dark Data
Authors: Nigar Hussain
Abstract:
Dark data is a subset of big data: data that we fail to use for future decisions. Existing work leaves many issues open, and some of them call for powerful tools for utilizing dark data. Adequate techniques are needed to deal with dark data, techniques that let users benefit from their quality, adaptability, speed, low time consumption, performance, and accessibility. Another issue is how to utilize dark data to extract helpful information and make better decisions. In this paper, we propose improved strategies to remove the dark side from dark data. Using a supervised model and machine learning techniques, we utilized dark data and achieved an F1 score of 89.48%.
Keywords: big data, dark data, machine learning, heatmap, random forest
Procedia PDF Downloads 15
24740 Multi-Source Data Fusion for Urban Comprehensive Management
Authors: Bolin Hua
Abstract:
In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflect different aspects of people, events and activities. Data generated by various systems differ in form, and their sources differ because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need to be fused together. Data from different sources collected in different ways raise several issues that need to be resolved. Problems of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data and space data. Metadata must be referenced and read whenever an application needs to access, manipulate and display the data. Uniform metadata management ensures the effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.
Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data
Procedia PDF Downloads 386
24739 Reviewing Privacy Preserving Distributed Data Mining
Authors: Sajjad Baghernezhad, Saeideh Baghernezhad
Abstract:
Nowadays, given people's involvement in the ever-increasing production of data, methods such as data mining for extracting knowledge are unavoidable. One topic in data mining is the inherent distribution of the data: the databases that create or receive such data usually belong to corporate or non-corporate persons who do not give their information freely to others. Yet there is no guarantee that someone can mine particular data without intruding on the owner’s privacy. How data is sent and then gathered by vertical or horizontal software depends on the type of preservation used and is carried out so as to improve data privacy. This study attempts a comprehensive comparison of data preserving methods; general methods such as random data and coding are also examined, along with the strong and weak points of each.
Keywords: data mining, distributed data mining, privacy protection, privacy preserving
Procedia PDF Downloads 521
24738 The Right to Data Portability and Its Influence on the Development of Digital Services
Authors: Roman Bieda
Abstract:
The General Data Protection Regulation (GDPR), which will come into force on 25 May 2018, will create a new legal framework for the protection of personal data in the European Union. Article 20 of GDPR introduces a right to data portability. This right allows data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating the transfer of personal data between IT environments (e.g. applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.
Keywords: data portability, digital market, GDPR, personal data
Procedia PDF Downloads 470
24737 Wireless Sensor Networks Optimization by Using 2-Stage Algorithm Based on Imperialist Competitive Algorithm
Authors: Hamid R. Lashgarian Azad, Seyed N. Shetab Boushehri
Abstract:
Wireless sensor networks (WSN) have become progressively popular due to their wide range of applications. A wireless sensor network is made of numerous tiny sensor nodes that are battery-powered. Maximizing the lifetime of wireless sensor networks is a very significant problem. In this paper, we propose a two-stage protocol based on an imperialist competitive algorithm (2S-ICA) to solve a sensor network optimization problem. The energy of the sensors can be greatly reduced, and the lifetime of the network shortened, by long communication distances between the sensors and the sink. By connecting sensors into a series of independent clusters using 2S-ICA, we can considerably minimize the overall communication distance, thereby extending the lifetime of the network. Comparison of the proposed protocol with the LEACH protocol, which is commonly used to solve WSN problems, shows that our protocol performs better in terms of improving network life and increasing the amount of transmitted data.
Keywords: wireless sensor network, imperialist competitive algorithm, LEACH protocol, k-means clustering
Procedia PDF Downloads 98
24736 The Role of Artificial Intelligence Algorithms in Psychiatry: Advancing Diagnosis and Treatment
Authors: Netanel Stern
Abstract:
Artificial intelligence (AI) algorithms have emerged as powerful tools in the field of psychiatry, offering new possibilities for enhancing diagnosis and treatment outcomes. This article explores the utilization of AI algorithms in psychiatry, highlighting their potential to revolutionize patient care. Various AI algorithms, including machine learning, natural language processing (NLP), reinforcement learning, clustering, and Bayesian networks, are discussed in detail. Moreover, ethical considerations and future directions for research and implementation are addressed.
Keywords: AI, software engineering, psychiatry, neuroimaging
Procedia PDF Downloads 107
24735 Recent Advances in Data Warehouse
Authors: Fahad Hanash Alzahrani
Abstract:
This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.
Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing
Procedia PDF Downloads 397
24734 How to Use Big Data in Logistics Issues
Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy
Abstract:
Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their rivals. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.
Keywords: big data, logistics, operational efficiency, risk management
Procedia PDF Downloads 638
24733 Investigating the Characteristics of Correlated Parking-Charging Behaviors for Electric Vehicles: A Data-Driven Approach
Authors: Xizhen Zhou, Yanjie Ji
Abstract:
To advance the management of integrated electric vehicle (EV) parking-charging behaviors, this study uses Changshu City in Suzhou as a case study to establish a data association mechanism for parking-charging platforms and to develop a database of EV parking-charging behaviors. Key indicators, such as charging start time, initial state of charge, final state of charge, and parking-charging time difference, are considered. Using the K-S test, the paper examines the heterogeneity of parking-charging behavior preferences between pure EV and non-pure EV users. The K-means clustering method is employed to analyze the characteristics of parking-charging behaviors for both user groups, thereby enhancing the overall understanding of these behaviors. The findings reveal that, using a classification model, the parking-charging behaviors of pure EVs can be classified into five distinct groups, while those of non-pure EVs can be separated into four groups. Both types of EV users exhibit groups with low range anxiety for complete charging with special journeys, complete charging at destination, and partial charging. Additionally, both types have a group with high range anxiety, with pure EV users displaying a preference for complete charging with specific journeys, while non-pure EV users exhibit a preference for complete charging. Notably, pure EV users also display a significant group engaging in nocturnal complete charging. The findings can provide technical support for the scientific and rational layout and management of integrated parking and charging facilities for EVs.
Keywords: traffic engineering, potential preferences, cluster analysis, EV, parking-charging behavior
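The K-S comparison of two user groups mentioned in the abstract can be illustrated with the two-sample Kolmogorov-Smirnov statistic, which is the largest gap between the two empirical distribution functions. The sketch below computes only the statistic, not a p-value, and the state-of-charge values are made up for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


if __name__ == "__main__":
    # Hypothetical initial state of charge (%) for two user groups.
    pure_ev = [22, 30, 35, 41, 48, 55]
    non_pure_ev = [40, 52, 58, 63, 70, 77]
    print(ks_statistic(pure_ev, non_pure_ev))
```

A statistic near 0 suggests the two groups charge under similar conditions, while a value near 1 indicates almost non-overlapping distributions. Deciding significance would additionally require the K-S critical value or a p-value for the given sample sizes.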
Procedia PDF Downloads 72
24732 GBKMeans: A Genetic Based K-Means Applied to the Capacitated Planning of Reading Units
Authors: Anderson S. Fonseca, Italo F. S. Da Silva, Robert D. A. Santos, Mayara G. Da Silva, Pedro H. C. Vieira, Antonio M. S. Sobrinho, Victor H. B. Lemos, Petterson S. Diniz, Anselmo C. Paiva, Eliana M. G. Monteiro
Abstract:
In Brazil, the National Electric Energy Agency (ANEEL) establishes that electrical energy companies are responsible for measuring and billing their customers. Among these regulations, it is defined that a company must bill its customers within 27-33 days. If a relocation or a change of period is required, the consumer must be notified in writing, in advance of a billing period. To make it easier to organize a workday’s measurements, these companies create a reading plan. These plans consist of grouping customers into reading groups, which are visited by an employee responsible for measuring consumption and billing. Creating a plan efficiently and optimally is a capacitated clustering problem with constraints related to homogeneity and compactness, that is, the employee’s working load and the geographical position of the consuming unit. This process is done manually by several experts who have experience in the geographic formation of the region; it takes a large number of days to complete the final planning, and because it is a human activity, there is no guarantee of finding the best plan. This paper presents GBKMeans, a technique based on K-Means and genetic algorithms for creating capacitated clusters that respect the established constraints in an efficient and balanced manner and that minimize the cost of relocating consumer units and the time required to create the final plan. The results obtained by the presented method are compared with the current planning of a real city, showing an improvement of 54.71% in the standard deviation of working load and 11.97% in the compactness of the groups.
Keywords: capacitated clustering, k-means, genetic algorithm, districting problems
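To make the capacitated clustering constraint concrete, the sketch below assigns each consumer unit to its nearest group center that still has spare capacity, then reports the standard deviation of group workloads, the balance metric quoted in the abstract. It is a simple greedy heuristic with fixed centers, offered only as an illustration; it is not GBKMeans and has no genetic component, and all names and data are hypothetical.

```python
import math
import statistics


def capacitated_assign(points, loads, centers, capacity):
    """Greedily assign each point to the nearest center with enough capacity.

    Heavier units are placed first so that they are not left without room.
    Raises ValueError when a unit fits in no group.
    """
    remaining = [capacity] * len(centers)
    assignment = [None] * len(points)
    for i in sorted(range(len(points)), key=lambda idx: -loads[idx]):
        by_distance = sorted(range(len(centers)),
                             key=lambda k: math.dist(points[i], centers[k]))
        for k in by_distance:
            if remaining[k] >= loads[i]:
                assignment[i] = k
                remaining[k] -= loads[i]
                break
        else:
            raise ValueError(f"no group has capacity for unit {i}")
    return assignment


def workload_std(assignment, loads, n_groups):
    """Population standard deviation of the total load per group."""
    totals = [0.0] * n_groups
    for unit, group in enumerate(assignment):
        totals[group] += loads[unit]
    return statistics.pstdev(totals)


if __name__ == "__main__":
    units = [(0, 0), (0, 1), (1, 0), (10, 10)]
    work = [2, 2, 2, 2]
    plan = capacitated_assign(units, work, [(0, 0), (10, 10)], capacity=4)
    print(plan, workload_std(plan, work, 2))
```

In the example, the third unit is closest to the first center but is pushed to the second group because the first is already full, which is exactly the trade-off between compactness and workload balance that the abstract describes.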
Procedia PDF Downloads 193
24731 Assessing Functional Structure in European Marine Ecosystems Using a Vector-Autoregressive Spatio-Temporal Model
Authors: Katyana A. Vert-Pre, James T. Thorson, Thomas Trancart, Eric Feunteun
Abstract:
In marine ecosystems, spatial and temporal species structure is an important component of ecosystems’ response to anthropogenic and environmental factors. Although spatial distribution patterns and fish temporal series of abundance have been studied in the past, little research has been devoted to the joint dynamic spatio-temporal functional patterns in marine ecosystems and their use in multispecies management and conservation. Each species represents a function in the ecosystem, and the distribution of these species might not be random. A heterogeneous functional distribution will lead to an ecosystem that is more resilient to external factors. Applying a Vector-Autoregressive Spatio-Temporal (VAST) model for count data, we estimate the spatio-temporal distribution, shift in time, and abundance of 140 species of the Eastern English Channel, Bay of Biscay and Mediterranean Sea. From the model outputs, we determined spatio-temporal clusters, calculating p-values for hierarchical clustering via multiscale bootstrap resampling. Then, we designed a functional map given the defined clusters. We found that the species distribution within the ecosystem was not random. Indeed, species evolved in space and time in clusters. Moreover, these clusters remained similar over time because species of the same cluster often shifted in sync, keeping the overall structure of the ecosystem similar over time. Knowing the co-existing species within these clusters could help with predicting the distribution and abundance of data-poor species. Further analysis is being performed to assess the ecological functions represented in each cluster.
Keywords: cluster distribution shift, European marine ecosystems, functional distribution, spatio-temporal model
Procedia PDF Downloads 189
24730 Impact Location From Instrumented Mouthguard Kinematic Data In Rugby
Authors: Jazim Sohail, Filipe Teixeira-Dias
Abstract:
Mild traumatic brain injury (mTBI) within non-helmeted contact sports is a growing concern due to the serious risk of potential injury. Extensive research is being conducted looking into head kinematics in non-helmeted contact sports utilizing instrumented mouthguards that allow researchers to record accelerations and velocities of the head during and after an impact. This does not, however, allow the location of the impact on the head, and its magnitude and orientation, to be determined. This research proposes and validates two methods to quantify impact locations from instrumented mouthguard kinematic data, one using rigid body dynamics, the other utilizing machine learning. The rigid body dynamics technique focuses on establishing and matching moments from Euler’s and torque equations in order to find the impact location on the head. The methodology is validated with impact data collected from a lab test with the dummy head fitted with an instrumented mouthguard. Additionally, a Hybrid III Dummy head finite element model was utilized to create synthetic kinematic data sets for impacts from varying locations to validate the impact location algorithm. The algorithm calculates accurate impact locations; however, it will require preprocessing of live data, which is currently being done by cross-referencing data timestamps to video footage. The machine learning technique focuses on eliminating the preprocessing aspect by establishing trends within time-series signals from instrumented mouthguards to determine the impact location on the head. An unsupervised learning technique is used to cluster together impacts within similar regions from an entire time-series signal. The kinematic signals established from mouthguards are converted to the frequency domain before using a clustering algorithm to cluster together similar signals within a time series that may span the length of a game. Impacts are clustered within predetermined location bins. 
The same Hybrid III Dummy finite element model is used to create impacts that closely replicate on-field impacts in order to create synthetic time-series datasets consisting of impacts in varying locations. These time-series datasets are used to validate the machine learning technique. The rigid body dynamics technique provides a good method for establishing the accurate impact location of impact signals that have already been labeled as true impacts and filtered out of the entire time series. The machine learning technique, by contrast, can be applied to long time-series signal data but provides the impact location only within predetermined regions on the head. Additionally, the machine learning technique can be used to eliminate false impacts captured by sensors, saving additional time for data scientists using instrumented mouthguard kinematic data, as validating true impacts with video footage would not be required.
Keywords: head impacts, impact location, instrumented mouthguard, machine learning, mTBI
Procedia PDF Downloads 213
24729 Formulation of Optimal Shifting Sequence for Multi-Speed Automatic Transmission
Authors: Sireesha Tamada, Debraj Bhattacharjee, Pranab K. Dan, Prabha Bhola
Abstract:
The most important component in an automotive transmission system is the gearbox, which controls the speed of the vehicle. In an automatic transmission, the right positioning of actuators ensures efficient transmission mechanism embodiment, wherein the challenge lies in formulating the number of actuators associated with modelling a gearbox. Data on actuation and gear shifting sequences has been retrieved from the available literature, including patent documents, and has been used in the proposed heuristics-based methodology for modelling the actuation sequence in a gearbox. This paper presents a methodological approach to designing a gearbox for the purpose of obtaining an optimal shifting sequence. The computational model takes the number of stages and the number of gear teeth as input parameters, since these two determine the gear ratios in an epicyclic gear train. The proposed transmission schematic, or stick diagram, aids in developing the gearbox layout design. This approach reduces the number of iterations and the development time required to design a gearbox layout.
Keywords: automatic transmission, gear-shifting, multi-stage planetary gearbox, rank ordered clustering
Procedia PDF Downloads 321
24728 Bag of Local Features for Person Re-Identification on Large-Scale Datasets
Authors: Yixiu Liu, Yunzhou Zhang, Jianning Chi, Hao Chu, Rui Zheng, Libo Sun, Guanghao Chen, Fangtong Zhou
Abstract:
In the last few years, large-scale person re-identification has attracted a lot of attention in video surveillance since it has potential applications in public safety management. However, it remains a challenging task given the variation in human pose, the changing illumination conditions, and the lack of paired samples. Although accuracy has improved significantly, training still depends heavily on the data. To tackle this problem, a new strategy for designing the feature representation is proposed based on the bag of visual words (BoVW) model, which has been widely used in the field of image retrieval. The local features are extracted, and a more discriminative feature representation is obtained by cross-view dictionary learning (CDL); the assignment map is then obtained through k-means clustering. Finally, the BoVW histograms are formed, which encode the images with the statistics of the feature classes in the assignment map. Experiments conducted on the CUHK03, Market1501 and MARS datasets show that the proposed method performs favorably against existing approaches.
Keywords: bag of visual words, cross-view dictionary learning, person re-identification, reranking
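The final step described in this abstract, turning an assignment of local features to visual words into a histogram, can be sketched as follows. This is a generic illustration of BoVW encoding, not the authors' implementation: the names `nearest_codeword` and `bovw_histogram` are hypothetical, the codebook is assumed to have been learned already (for example by k-means), and the descriptors are made-up 2-D points rather than real image features.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[j])),
    )


def bovw_histogram(descriptors, codebook):
    """Normalized histogram of visual-word assignments for one image."""
    counts = [0.0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1.0
    total = sum(counts)
    # An image with no descriptors yields an all-zero histogram.
    return [c / total for c in counts] if total else counts


# Assumed toy data: a two-word codebook and four local descriptors.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (0.0, 0.0)]
print(bovw_histogram(descriptors, codebook))  # [0.75, 0.25]
```

Normalizing by the number of descriptors makes histograms comparable between images that yield different numbers of local features.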
Procedia PDF Downloads 191
24727 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles
Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis
Abstract:
Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver integrated (big) data-related services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.
Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review
Procedia PDF Downloads 157
24726 A Parallel Implementation of k-Means in MATLAB
Authors: Dimitris Varsamis, Christos Talagkozis, Alkiviadis Tsimpiris, Paris Mastorocostas
Abstract:
The aim of this work is the parallel implementation of k-means in MATLAB in order to reduce the execution time. Specifically, a new MATLAB function for the serial k-means algorithm is developed, which meets all the requirements for conversion to a MATLAB function with parallel computations. Additionally, two different variants for defining the initial values are presented. The parallel approach is then presented. Finally, performance tests of the computation times with respect to the numbers of features and classes are illustrated.
Keywords: K-means algorithm, clustering, parallel computations, Matlab
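For readers unfamiliar with the serial algorithm this abstract starts from, the following is a minimal sketch of standard Lloyd-style k-means. It is written in Python rather than MATLAB and is not the authors' function: the name `kmeans`, the random choice of initial centroids, and the toy points are all assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Serial Lloyd-style k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point is handled independently of the others.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(sorted(centroids), labels)
```

Because the assignment step treats every point independently, it is the part of the loop that is usually the natural candidate for parallel execution; how the paper distributes the work in MATLAB is not stated in the abstract.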
Procedia PDF Downloads 380
24725 Government Big Data Ecosystem: A Systematic Literature Review
Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis
Abstract:
Data that is high in volume, velocity, and veracity and comes from a variety of sources is generated in all sectors, including the government sector. Globally, public administrations are pursuing (big) data as a new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can lead to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of a government (big) data ecosystem. The essential aspects of the government (big) data ecosystem include its definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on the government (big) data ecosystem and therefore focused our research on various relevant areas such as humanitarian data, open government data, scientific research data, and industry data.
Keywords: applications of big data, big data, big data types, big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review
Procedia PDF Downloads 224
24724 A Mixture Vine Copula Structures Model for Dependence Wind Speed among Wind Farms and Its Application in Reactive Power Optimization
Authors: Yibin Qiu, Yubo Ouyang, Shihan Li, Guorui Zhang, Qi Li, Weirong Chen
Abstract:
This paper explores the impact of high-dimensional dependencies of wind speed among wind farms on probabilistic optimal power flow. To perform reactive power optimization faster and more accurately, a mixture vine copula structure model combining K-means clustering, the C-vine copula, and the D-vine copula is proposed, through which a more accurate correlation model can be obtained. Moreover, a Modified Backtracking Search Algorithm (MBSA) and the three-point estimate method are applied to probabilistic optimal power flow. The validity of the mixture vine copula structure model and of the MBSA is tested on the IEEE 30-node system with measured data from 3 adjacent wind farms in a certain area, and the results indicate the effectiveness of these methods.
Keywords: mixture vine copula structure model, three-point estimate method, the probability integral transform, modified backtracking search algorithm, reactive power optimization
Procedia PDF Downloads 246
24723 A Machine Learning Decision Support Framework for Industrial Engineering Purposes
Authors: Anli Du Preez, James Bekker
Abstract:
Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, only about 1% of generated data is ever actually analyzed for value creation. There is a data gap in which data is not explored due to the lack of data analytics infrastructure and of the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which are a subset of data analytics. The developed framework is designed to assist data analysts with little experience in choosing the appropriate machine learning algorithm given the purpose of their application.
Keywords: Data analytics, Industrial engineering, Machine learning, Value creation
Procedia PDF Downloads 165
24722 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm
Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima
Abstract:
In our present world, we are generating a lot of data, and we need a specific device to store all of it. Generally, we store data on pen drives, hard drives, etc. Sometimes we may lose the data due to corruption of these devices. To overcome these issues, we implemented a cloud space for storing the data, which also provides more security for the data. The data can be accessed over the internet from anywhere in the world. We implemented all of this in Java using the NetBeans IDE. Once a user uploads data, they no longer have any rights to change it. Users' uploaded files are stored in the cloud with the system time as the file name, and a directory is created with some random words. The cloud accepts a file only if its size is less than 2 MB.
Keywords: cloud space, AES, FTP, NetBeans IDE
Procedia PDF Downloads 203
24721 Conjunctive Management of Surface and Groundwater Resources under Uncertainty: A Retrospective Optimization Approach
Authors: Julius M. Ndambuki, Gislar E. Kifanyi, Samuel N. Odai, Charles Gyamfi
Abstract:
Conjunctive management of surface and groundwater resources is a challenging task due to the spatial and temporal variability of the hydrology and hydrogeology of the water storage systems. Surface water-groundwater hydrogeology is highly uncertain; thus, it is imperative that this uncertainty is explicitly accounted for when managing water resources. Various methodologies have been developed and applied by researchers in an attempt to account for the uncertainty. For example, simulation-optimization models are often used for conjunctive water resources management. However, direct application of such an approach, in which all realizations are considered at each iteration of the optimization process, leads to a very expensive optimization in terms of computational time, particularly when the number of realizations is large. The aim of this paper, therefore, is to introduce and apply an efficient approach referred to as Retrospective Optimization Approximation (ROA) that can be used for optimizing conjunctive use of surface water and groundwater over multiple hydrogeological model simulations. This work is based on a stochastic simulation-optimization framework using the recently emerged technique of sample average approximation (SAA), a sampling-based method implemented within the ROA approach. The ROA approach solves and evaluates a sequence of generated optimization sub-problems with an increasing number of realizations (sample size). The response matrix technique was used to link the simulation model with the optimization procedure. The k-means clustering sampling technique was used to map the realizations. The methodology is demonstrated through application to a hypothetical example. In the example, the optimization sub-problems generated were solved and analysed using the “Active-Set” core optimizer implemented in the MATLAB 2014a environment.
Through the k-means clustering sampling technique, the ROA – Active Set procedure was able to arrive at a (nearly) converged maximum expected total optimal conjunctive water use withdrawal rate within relatively few iterations (6 to 7). Results indicate that the ROA approach is a promising technique for optimizing conjunctive use of surface water and groundwater withdrawal rates under hydrogeological uncertainty.
Keywords: conjunctive water management, retrospective optimization approximation approach, sample average approximation, uncertainty
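The idea of solving a sequence of sample-average sub-problems with growing sample size can be sketched on a deliberately simple toy problem. Everything specific here is an assumption for illustration and has nothing to do with the paper's groundwater model: the objective is to minimize the expected value of (x - ξ)², the random input ξ is Gaussian, the sub-problem solver is plain gradient descent instead of an active-set method, and the function names are hypothetical. The only feature carried over from the abstract is that each sub-problem uses more realizations and starts from the previous solution.

```python
import random


def solve_saa(samples, x0, steps=50, lr=0.25):
    """Minimize the sample average of (x - xi)^2 by gradient descent from x0."""
    mean = sum(samples) / len(samples)
    x = x0
    for _ in range(steps):
        # Gradient of the sample-average objective is 2 * (x - mean).
        x -= lr * 2.0 * (x - mean)
    return x


def retrospective_optimization(draw, sizes, x0=0.0):
    """Solve sub-problems of increasing sample size, warm-starting each one."""
    x = x0
    history = []
    for n in sizes:
        samples = [draw() for _ in range(n)]  # fresh realizations for this sub-problem
        x = solve_saa(samples, x)             # previous solution is the starting point
        history.append((n, x))
    return history


rng = random.Random(1)
history = retrospective_optimization(lambda: rng.gauss(5.0, 1.0), [10, 100, 1000])
for n, x in history:
    print(n, round(x, 3))
```

Early sub-problems are cheap and give a rough solution; later ones refine it with more realizations, which is why the full sample does not need to be evaluated at every iteration.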
Procedia PDF Downloads 229
24720 Biochemical and Pomological Variability among 14 Moroccan and Foreign Cultivars of Prunus dulcis
Authors: H. Hanine, H. H'ssaini, M. Ibno Alaoui, A. Nablousi, H. Zahir, S. Ennahli, H. Latrache, H. Zine Abidine
Abstract:
Biochemical and pomological variability among 14 cultivars of Prunus dulcis planted in a germplasm collection site in Morocco was evaluated. Almond samples from six local and eight foreign cultivars (France, Italy, Spain, and USA) were characterized. Biochemical and pomological data revealed significant genetic variability among the 14 cultivars; local cultivars exhibited higher total polyphenol content. Oil content ranged from 35 to 57% among cultivars; both the Texas and Toundout genotypes recorded the highest oil content. Total protein concentration in select cultivars ranged from 50 mg/g in Ferraduel to 105 mg/g in Rizlane1. Antioxidant activity of almond samples was examined by a DPPH (1,1-diphenyl-2-picrylhydrazyl) radical-scavenging assay; the antioxidant activity varied significantly among the cultivars, with IC50 (the half-maximal inhibitory concentration) values ranging from 2.25 to 20 mg/ml. Autochthonous cultivars originating from the Oujda region exhibited higher tegument total polyphenol and amino acid content than the others. The genotype Rizlane2 recorded the highest flavonoid content. Pomological traits revealed large variability within the almond germplasm. The hierarchical clustering analysis of all the data regarding pomological traits distinguished two groups, with some particular genotypes as distinct cultivars and groups of cultivars as polyclone varieties. These results strongly indicate the potential of Moroccan almonds as clones for future selection, given their nutritional value and pomological traits compared to well-established cultivars.
Keywords: antioxidant activity, DPPH, Moroccan almonds, Prunus dulcis
Procedia PDF Downloads 234