Search results for: categorical data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24301

Search results for: categorical data

24061 An Empirical Study of the Impacts of Big Data on Firm Performance

Authors: Thuan Nguyen

Abstract:

In the present time, data to a data-driven knowledge-based economy is the same as oil to the industrial age hundreds of years ago. Data is everywhere in vast volumes! Big data analytics is expected to help firms not only efficiently improve performance but also completely transform how they should run their business. However, employing the emergent technology successfully is not easy, and assessing the roles of big data in improving firm performance is even much harder. There was a lack of studies that have examined the impacts of big data analytics on organizational performance. This study aimed to fill the gap. The present study suggested using firms’ intellectual capital as a proxy for big data in evaluating its impact on organizational performance. The present study employed the Value Added Intellectual Coefficient method to measure firm intellectual capital, via its three main components: human capital efficiency, structural capital efficiency, and capital employed efficiency, and then used the structural equation modeling technique to model the data and test the models. The financial fundamental and market data of 100 randomly selected publicly listed firms were collected. The results of the tests showed that only human capital efficiency had a significant positive impact on firm profitability, which highlighted the prominent human role in the impact of big data technology.

Keywords: big data, big data analytics, intellectual capital, organizational performance, value added intellectual coefficient

Procedia PDF Downloads 216
24060 Automated Test Data Generation For some types of Algorithm

Authors: Hitesh Tahbildar

Abstract:

The cost of test data generation for a program is computationally very high. In general case, no algorithm to generate test data for all types of algorithms has been found. The cost of generating test data for different types of algorithm is different. Till date, people are emphasizing the need to generate test data for different types of programming constructs rather than different types of algorithms. The test data generation methods have been implemented to find heuristics for different types of algorithms. Some algorithms that includes divide and conquer, backtracking, greedy approach, dynamic programming to find the minimum cost of test data generation have been tested. Our experimental results say that some of these types of algorithm can be used as a necessary condition for selecting heuristics and programming constructs are sufficient condition for selecting our heuristics. Finally we recommend the different heuristics for test data generation to be selected for different types of algorithms.

Keywords: ongest path, saturation point, lmax, kL, kS

Procedia PDF Downloads 376
24059 The Perspective on Data Collection Instruments for Younger Learners

Authors: Hatice Kübra Koç

Abstract:

For academia, collecting reliable and valid data is one of the most significant issues for researchers. However, it is not the same procedure for all different target groups; meanwhile, during data collection from teenagers, young adults, or adults, researchers can use common data collection tools such as questionnaires, interviews, and semi-structured interviews; yet, for young learners and very young ones, these reliable and valid data collection tools cannot be easily designed or applied by the researchers. In this study, firstly, common data collection tools are examined for ‘very young’ and ‘young learners’ participant groups since it is thought that the quality and efficiency of an academic study is mainly based on its valid and correct data collection and data analysis procedure. Secondly, two different data collection instruments for very young and young learners are stated as discussing the efficacy of them. Finally, a suggested data collection tool – a performance-based questionnaire- which is specifically developed for ‘very young’ and ‘young learners’ participant groups in the field of teaching English to young learners as a foreign language is presented in this current study. The designing procedure and suggested items/factors for the suggested data collection tool are accordingly revealed at the end of the study to help researchers have studied with young and very learners.

Keywords: data collection instruments, performance-based questionnaire, young learners, very young learners

Procedia PDF Downloads 58
24058 The Predictive Utility of Subjective Cognitive Decline Using Item Level Data from the Everyday Cognition (ECog) Scales

Authors: J. Fox, J. Randhawa, M. Chan, L. Campbell, A. Weakely, D. J. Harvey, S. Tomaszewski Farias

Abstract:

Early identification of individuals at risk for conversion to dementia provides an opportunity for preventative treatment. Many older adults (30-60%) report specific subjective cognitive decline (SCD); however, previous research is inconsistent in terms of what types of complaints predict future cognitive decline. The purpose of this study is to identify which specific complaints from the Everyday Cognition Scales (ECog) scales, a measure of self-reported concerns for everyday abilities across six cognitive domains, are associated with: 1) conversion from a clinical diagnosis of normal to either MCI or dementia (categorical variable) and 2) progressive cognitive decline in memory and executive function (continuous variables). 415 cognitively normal older adults were monitored annually for an average of 5 years. Cox proportional hazards models were used to assess associations between self-reported ECog items and progression to impairment (MCI or dementia). A total of 114 individuals progressed to impairment; the mean time to progression was 4.9 years (SD=3.4 years, range=0.8-13.8). Follow-up models were run controlling for depression. A subset of individuals (n=352) underwent repeat cognitive assessments for an average of 5.3 years. For those individuals, mixed effects models with random intercepts and slopes were used to assess associations between ECog items and change in neuropsychological measures of episodic memory or executive function. Prior to controlling for depression, subjective concerns on five of the eight Everyday Memory items, three of the nine Everyday Language items, one of the seven Everyday Visuospatial items, two of the five Everyday Planning items, and one of the six Everyday Organization items were associated with subsequent diagnostic conversion (HR=1.25 to 1.59, p=0.003 to 0.03). However, after controlling for depression, only two specific complaints of remembering appointments, meetings, and engagements and understanding spoken directions and instructions were associated with subsequent diagnostic conversion. Episodic memory in individuals reporting no concern on ECog items did not significantly change over time (p>0.4). More complaints on seven of the eight Everyday Memory items, three of the nine Everyday Language items, and three of the seven Everyday Visuospatial items were associated with a decline in episodic memory (Interaction estimate=-0.055 to 0.001, p=0.003 to 0.04). Executive function in those reporting no concern on ECog items declined slightly (p <0.001 to 0.06). More complaints on three of the eight Everyday Memory items and three of the nine Everyday Language items were associated with a decline in executive function (Interaction estimate=-0.021 to -0.012, p=0.002 to 0.04). These findings suggest that specific complaints across several cognitive domains are associated with diagnostic conversion. Specific complaints in the domains of Everyday Memory and Language are associated with a decline in both episodic memory and executive function. Increased monitoring and treatment of individuals with these specific SCD may be warranted.

Keywords: alzheimer’s disease, dementia, memory complaints, mild cognitive impairment, risk factors, subjective cognitive decline

Procedia PDF Downloads 55
24057 Generation of Quasi-Measurement Data for On-Line Process Data Analysis

Authors: Hyun-Woo Cho

Abstract:

For ensuring the safety of a manufacturing process one should quickly identify an assignable cause of a fault in an on-line basis. To this end, many statistical techniques including linear and nonlinear methods have been frequently utilized. However, such methods possessed a major problem of small sample size, which is mostly attributed to the characteristics of empirical models used for reference models. This work presents a new method to overcome the insufficiency of measurement data in the monitoring and diagnosis tasks. Some quasi-measurement data are generated from existing data based on the two indices of similarity and importance. The performance of the method is demonstrated using a real data set. The results turn out that the presented methods are able to handle the insufficiency problem successfully. In addition, it is shown to be quite efficient in terms of computational speed and memory usage, and thus on-line implementation of the method is straightforward for monitoring and diagnosis purposes.

Keywords: data analysis, diagnosis, monitoring, process data, quality control

Procedia PDF Downloads 456
24056 Emerging Technology for Business Intelligence Applications

Authors: Hsien-Tsen Wang

Abstract:

Business Intelligence (BI) has long helped organizations make informed decisions based on data-driven insights and gain competitive advantages in the marketplace. In the past two decades, businesses witnessed not only the dramatically increasing volume and heterogeneity of business data but also the emergence of new technologies, such as Artificial Intelligence (AI), Semantic Web (SW), Cloud Computing, and Big Data. It is plausible that the convergence of these technologies would bring more value out of business data by establishing linked data frameworks and connecting in ways that enable advanced analytics and improved data utilization. In this paper, we first review and summarize current BI applications and methodology. Emerging technologies that can be integrated into BI applications are then discussed. Finally, we conclude with a proposed synergy framework that aims at achieving a more flexible, scalable, and intelligent BI solution.

Keywords: business intelligence, artificial intelligence, semantic web, big data, cloud computing

Procedia PDF Downloads 72
24055 Using Equipment Telemetry Data for Condition-Based maintenance decisions

Authors: John Q. Todd

Abstract:

Given that modern equipment can provide comprehensive health, status, and error condition data via built-in sensors, maintenance organizations have a new and valuable source of insight to take advantage of. This presentation will expose what these data payloads might look like and how they can be filtered, visualized, calculated into metrics, used for machine learning, and generate alerts for further action.

Keywords: condition based maintenance, equipment data, metrics, alerts

Procedia PDF Downloads 162
24054 Ethics Can Enable Open Source Data Research

Authors: Dragana Calic

Abstract:

The openness, availability and the sheer volume of big data have provided, what some regard as, an invaluable and rich dataset. Researchers, businesses, advertising agencies, medical institutions, to name only a few, collect, share, and analyze this data to enable their processes and decision making. However, there are important ethical considerations associated with the use of big data. The rapidly evolving nature of online technologies has overtaken the many legislative, privacy, and ethical frameworks and principles that exist. For example, should we obtain consent to use people’s online data, and under what circumstances can privacy considerations be overridden? Current guidance on how to appropriately and ethically handle big data is inconsistent. Consequently, this paper focuses on two quite distinct but related ethical considerations that are at the core of the use of big data for research purposes. They include empowering the producers of data and empowering researchers who want to study big data. The first consideration focuses on informed consent which is at the core of empowering producers of data. In this paper, we discuss some of the complexities associated with informed consent and consider studies of producers’ perceptions to inform research ethics guidelines and practice. The second consideration focuses on the researcher. Similarly, we explore studies that focus on researchers’ perceptions and experiences.

Keywords: big data, ethics, producers’ perceptions, researchers’ perceptions

Procedia PDF Downloads 265
24053 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 439
24052 Seismic Data Scaling: Uncertainties, Potential and Applications in Workstation Interpretation

Authors: Ankur Mundhra, Shubhadeep Chakraborty, Y. R. Singh, Vishal Das

Abstract:

Seismic data scaling affects the dynamic range of a data and with present day lower costs of storage and higher reliability of Hard Disk data, scaling is not suggested. However, in dealing with data of different vintages, which perhaps were processed in 16 bits or even 8 bits and are need to be processed with 32 bit available data, scaling is performed. Also, scaling amplifies low amplitude events in deeper region which disappear due to high amplitude shallow events that saturate amplitude scale. We have focused on significance of scaling data to aid interpretation. This study elucidates a proper seismic loading procedure in workstations without using default preset parameters as available in most software suites. Differences and distribution of amplitude values at different depth for seismic data are probed in this exercise. Proper loading parameters are identified and associated steps are explained that needs to be taken care of while loading data. Finally, the exercise interprets the un-certainties which might arise when correlating scaled and unscaled versions of seismic data with synthetics. As, seismic well tie correlates the seismic reflection events with well markers, for our study it is used to identify regions which are enhanced and/or affected by scaling parameter(s).

Keywords: clipping, compression, resolution, seismic scaling

Procedia PDF Downloads 446
24051 Association of Social Data as a Tool to Support Government Decision Making

Authors: Diego Rodrigues, Marcelo Lisboa, Elismar Batista, Marcos Dias

Abstract:

Based on data on child labor, this work arises questions about how to understand and locate the factors that make up the child labor rates, and which properties are important to analyze these cases. Using data mining techniques to discover valid patterns on Brazilian social databases were evaluated data of child labor in the State of Tocantins (located north of Brazil with a territory of 277000 km2 and comprises 139 counties). This work aims to detect factors that are deterministic for the practice of child labor and their relationships with financial indicators, educational, regional and social, generating information that is not explicit in the government database, thus enabling better monitoring and updating policies for this purpose.

Keywords: social data, government decision making, association of social data, data mining

Procedia PDF Downloads 342
24050 Outlier Detection in Stock Market Data using Tukey Method and Wavelet Transform

Authors: Sadam Alwadi

Abstract:

Outlier values become a problem that frequently occurs in the data observation or recording process. Thus, the need for data imputation has become an essential matter. In this work, it will make use of the methods described in the prior work to detect the outlier values based on a collection of stock market data. In order to implement the detection and find some solutions that maybe helpful for investors, real closed price data were obtained from the Amman Stock Exchange (ASE). Tukey and Maximum Overlapping Discrete Wavelet Transform (MODWT) methods will be used to impute the detect the outlier values.

Keywords: outlier values, imputation, stock market data, detecting, estimation

Procedia PDF Downloads 61
24049 PEINS: A Generic Compression Scheme Using Probabilistic Encoding and Irrational Number Storage

Authors: P. Jayashree, S. Rajkumar

Abstract:

With social networks and smart devices generating a multitude of data, effective data management is the need of the hour for networks and cloud applications. Some applications need effective storage while some other applications need effective communication over networks and data reduction comes as a handy solution to meet out both requirements. Most of the data compression techniques are based on data statistics and may result in either lossy or lossless data reductions. Though lossy reductions produce better compression ratios compared to lossless methods, many applications require data accuracy and miniature details to be preserved. A variety of data compression algorithms does exist in the literature for different forms of data like text, image, and multimedia data. In the proposed work, a generic progressive compression algorithm, based on probabilistic encoding, called PEINS is projected as an enhancement over irrational number stored coding technique to cater to storage issues of increasing data volumes as a cost effective solution, which also offers data security as a secondary outcome to some extent. The proposed work reveals cost effectiveness in terms of better compression ratio with no deterioration in compression time.

Keywords: compression ratio, generic compression, irrational number storage, probabilistic encoding

Procedia PDF Downloads 263
24048 Iot Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Omobayo Esan, Muienge Mbodila, Patrick Bowe

Abstract:

This paper focused on cost effective storage architecture using fog and cloud data storage gateway and presented the design of the framework for the data privacy model and data analytics framework on a real-time analysis when using machine learning method. The paper began with the system analysis, system architecture and its component design, as well as the overall system operations. The several results obtained from this study on data privacy model shows that when two or more data privacy model is combined we tend to have a more stronger privacy to our data, and when fog storage gateway have several advantages over using the traditional cloud storage, from our result shows fog has reduced latency/delay, low bandwidth consumption, and energy usage when been compare with cloud storage, therefore, fog storage will help to lessen excessive cost. This paper dwelt more on the system descriptions, the researchers focused on the research design and framework design for the data privacy model, data storage, and real-time analytics. This paper also shows the major system components and their framework specification. And lastly, the overall research system architecture was shown, its structure, and its interrelationships.

Keywords: IoT, fog, cloud, data analysis, data privacy

Procedia PDF Downloads 71
24047 Comparison of Selected Pier-Scour Equations for Wide Piers Using Field Data

Authors: Nordila Ahmad, Thamer Mohammad, Bruce W. Melville, Zuliziana Suif

Abstract:

Current methods for predicting local scour at wide bridge piers, were developed on the basis of laboratory studies and very limited scour prediction were tested with field data. Laboratory wide pier scour equation from previous findings with field data were presented. A wide range of field data were used and it consists of both live-bed and clear-water scour. A method for assessing the quality of the data was developed and applied to the data set. Three other wide pier-scour equations from the literature were used to compare the performance of each predictive method. The best-performing scour equation were analyzed using statistical analysis. Comparisons of computed and observed scour depths indicate that the equation from the previous publication produced the smallest discrepancy ratio and RMSE value when compared with the large amount of laboratory and field data.

Keywords: field data, local scour, scour equation, wide piers

Procedia PDF Downloads 373
24046 The Effectiveness of Psychosocial Interventions for Survivors of Natural Disasters: A Systematic Review

Authors: Santhani M. Selveindran

Abstract:

Background: Natural disasters are traumatic global events that are becoming increasing more common, with significant psychosocial impact on survivors. This impact results not only in psychosocial distress but, for many, can lead to psychosocial disorders and chronic psychopathology. While there are currently available interventions that seek to prevent and treat these psychosocial sequelae, their effectiveness is uncertain. The evidence-base is emerging with more primary studies evaluating the effectiveness of various psychosocial interventions for survivors of natural disasters, which remains to be synthesized. Aim of Review: To identify, critically appraise and synthesize the current evidence-base on the effectiveness of psychosocial interventions in preventing or treating Post-Traumatic Stress Disorder (PTSD), Major Depressive Disorder (MDD) and/or Generalized Anxiety Disorder (GAD) in adults and children who are survivors of natural disasters. Methods: A protocol was developed as a guide to carry out this review. A systematic search was conducted in eight international electronic databases, three grey literature databases, one dissertation and thesis repository, websites of six humanitarian and non-governmental organizations renowned for their work on natural disasters, as well as bibliographic and citation searching for eligible articles. Papers meeting the specific inclusion criteria underwent quality assessment using the Downs and Black checklist. Data were extracted from the included papers and analysed by way of narrative synthesis. Results: Database and website searching returned 3777 papers where 31 met the criteria for inclusion. Additional 2 papers were obtained through bibliographic and citation searching. Methodological quality of most papers was fair. Twenty-five studies evaluated psychological interventions, five, social interventions whereas three studies evaluated ‘mixed’ psychological and social interventions. All studies, irrespective of methodological quality, reported post-intervention reductions in symptom scores for PTSD, depression and/or anxiety and where assessed, reduced diagnosis of PTSD and MDD, and produced improvements in self-efficacy and quality of life. Statistically significant results were seen in 27 studies. However, three studies demonstrated that the evaluated interventions may not have been very beneficial. Conclusions: The overall positive results suggest that any psychosocial interventions are favourable and should be delivered to all natural disaster survivors, irrespective of age, country, and phase of disaster. Yet, heterogeneity and methodological shortcomings of the current evidence-base makes it difficult to draw definite conclusions needed to formulate categorical guidance or frameworks. Further, rigorously conducted research is needed in this area, although the feasibility of such, given the context and nature of the problem, is also recognized.

Keywords: psychosocial interventions, natural disasters, survivors, effectiveness

Procedia PDF Downloads 131
24045 The Maximum Throughput Analysis of UAV Datalink 802.11b Protocol

Authors: Inkyu Kim, SangMan Moon

Abstract:

This IEEE 802.11b protocol provides up to 11Mbps data rate, whereas aerospace industry wants to seek higher data rate COTS data link system in the UAV. The Total Maximum Throughput (TMT) and delay time are studied on many researchers in the past years This paper provides theoretical data throughput performance of UAV formation flight data link using the existing 802.11b performance theory. We operate the UAV formation flight with more than 30 quad copters with 802.11b protocol. We may be predicting that UAV formation flight numbers have to bound data link protocol performance limitations.

Keywords: UAV datalink, UAV formation flight datalink, UAV WLAN datalink application, UAV IEEE 802.11b datalink application

Procedia PDF Downloads 361
24044 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used to classify into groups. The result of the analysis is the pattern which can be used for identification of data set without the need to obtain input data used for creation of this pattern. An important requirement in this process is careful data preparation validation of model used and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of the genetic diversity. In case of missing pedigree information, other methods can be used for traceability of animal´s origin. Genetic diversity written in genetic data is holding relatively useful information to identify animals originated from individual countries. We can conclude that the application of data mining for molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.

Keywords: genetic data, Pinzgau cattle, supervised learning, machine learning

Procedia PDF Downloads 521
24043 Router 1X3 - RTL Design and Verification

Authors: Nidhi Gopal

Abstract:

Routing is the process of moving a packet of data from source to destination and enables messages to pass from one computer to another and eventually reach the target machine. A router is a networking device that forwards data packets between computer networks. It is connected to two or more data lines from different networks (as opposed to a network switch, which connects data lines from one single network). This paper mainly emphasizes upon the study of router device, its top level architecture, and how various sub-modules of router i.e. Register, FIFO, FSM and Synchronizer are synthesized, and simulated and finally connected to its top module.

Keywords: data packets, networking, router, routing

Procedia PDF Downloads 768
24042 Noise Reduction in Web Data: A Learning Approach Based on Dynamic User Interests

Authors: Julius Onyancha, Valentina Plekhanova

Abstract:

One of the significant issues facing web users is the amount of noise in web data which hinders the process of finding useful information in relation to their dynamic interests. Current research works consider noise as any data that does not form part of the main web page and propose noise web data reduction tools which mainly focus on eliminating noise in relation to the content and layout of web data. This paper argues that not all data that form part of the main web page is of a user interest and not all noise data is actually noise to a given user. Therefore, learning of noise web data allocated to the user requests ensures not only reduction of noisiness level in a web user profile, but also a decrease in the loss of useful information hence improves the quality of a web user profile. Noise Web Data Learning (NWDL) tool/algorithm capable of learning noise web data in web user profile is proposed. The proposed work considers elimination of noise data in relation to dynamic user interest. In order to validate the performance of the proposed work, an experimental design setup is presented. The results obtained are compared with the current algorithms applied in noise web data reduction process. The experimental results show that the proposed work considers the dynamic change of user interest prior to elimination of noise data. The proposed work contributes towards improving the quality of a web user profile by reducing the amount of useful information eliminated as noise.

Keywords: web log data, web user profile, user interest, noise web data learning, machine learning

Procedia PDF Downloads 240
24041 Data Mining and Knowledge Management Application to Enhance Business Operations: An Exploratory Study

Authors: Zeba Mahmood

Abstract:

The modern business organizations are adopting technological advancement to achieve competitive edge and satisfy their consumer. The development in the field of Information technology systems has changed the way of conducting business today. Business operations today rely more on the data they obtained and this data is continuously increasing in volume. The data stored in different locations is difficult to find and use without the effective implementation of Data mining and Knowledge management techniques. Organizations who smartly identify, obtain and then convert data in useful formats for their decision making and operational improvements create additional value for their customers and enhance their operational capabilities. Marketers and Customer relationship departments of firm use Data mining techniques to make relevant decisions, this paper emphasizes on the identification of different data mining and Knowledge management techniques that are applied to different business industries. The challenges and issues of execution of these techniques are also discussed and critically analyzed in this paper.

Keywords: knowledge, knowledge management, knowledge discovery in databases, business, operational, information, data mining

Procedia PDF Downloads 509
24040 Indexing and Incremental Approach Using Map Reduce Bipartite Graph (MRBG) for Mining Evolving Big Data

Authors: Adarsh Shroff

Abstract:

Big data is a collection of dataset so large and complex that it becomes difficult to process using data base management tools. To perform operations like search, analysis, visualization on big data by using data mining; which is the process of extraction of patterns or knowledge from large data set. In recent years, the data mining applications become stale and obsolete over time. Incremental processing is a promising approach to refreshing mining results. It utilizes previously saved states to avoid the expense of re-computation from scratch. This project uses i2MapReduce, an incremental processing extension to Map Reduce, the most widely used framework for mining big data. I2MapReduce performs key-value pair level incremental processing rather than task level re-computation, supports not only one-step computation but also more sophisticated iterative computation, which is widely used in data mining applications, and incorporates a set of novel techniques to reduce I/O overhead for accessing preserved fine-grain computation states. To optimize the mining results, evaluate i2MapReduce using a one-step algorithm and three iterative algorithms with diverse computation characteristics for efficient mining.

Keywords: big data, map reduce, incremental processing, iterative computation

Procedia PDF Downloads 322
24039 Analyzing Large Scale Recurrent Event Data with a Divide-And-Conquer Approach

Authors: Jerry Q. Cheng

Abstract:

Currently, in analyzing large-scale recurrent event data, there are many challenges such as memory limitations, unscalable computing time, etc. In this research, a divide-and-conquer method is proposed using parametric frailty models. Specifically, the data is randomly divided into many subsets, and the maximum likelihood estimator from each individual data set is obtained. Then a weighted method is proposed to combine these individual estimators as the final estimator. It is shown that this divide-and-conquer estimator is asymptotically equivalent to the estimator based on the full data. Simulation studies are conducted to demonstrate the performance of this proposed method. This approach is applied to a large real dataset of repeated heart failure hospitalizations.

Keywords: big data analytics, divide-and-conquer, recurrent event data, statistical computing

Procedia PDF Downloads 137
24038 Adoption of Big Data by Global Chemical Industries

Authors: Ashiff Khan, A. Seetharaman, Abhijit Dasgupta

Abstract:

The new era of big data (BD) is influencing chemical industries tremendously, providing several opportunities to reshape the way they operate and help them shift towards intelligent manufacturing. Given the availability of free software and the large amount of real-time data generated and stored in process plants, chemical industries are still in the early stages of big data adoption. The industry is just starting to realize the importance of the large amount of data it owns to make the right decisions and support its strategies. This article explores the importance of professional competencies and data science that influence BD in chemical industries to help it move towards intelligent manufacturing fast and reliable. This article utilizes a literature review and identifies potential applications in the chemical industry to move from conventional methods to a data-driven approach. The scope of this document is limited to the adoption of BD in chemical industries and the variables identified in this article. To achieve this objective, government, academia, and industry must work together to overcome all present and future challenges.

Keywords: chemical engineering, big data analytics, industrial revolution, professional competence, data science

Procedia PDF Downloads 57
24037 Brokerage and Value-Creation: Trading Practices in the English Market of 20th-Century Maps

Authors: Shaun Lim

Abstract:

This paper presents a 9-month ethnographic case study of the value creating strategies employed by an Oxford market-trader of 20th-century maps. Maps are usually valued and sold as either antique objets d’art or useful navigational tools, with 20th-century maps precariously lying between the boundary of the aesthetic and utilitarian value-regimes. Here, the brokerage practices involved in the framing of outdated, lowly valued maps into vintage commodities will be examined. Ethnographic material of the unstudied market of old maps is introduced and situated in the second-hand, antique and collectible spheres of exchange. The map-trader as a broker is the ethnographic and methodological starting point of this paper. Brokerage is understood through the activity of framing that defines and brackets the value-regimes of commodities with the aid of market and framing devices. The trader’s activities will be examined in three parts. (1) The post-sourcing industry: the altering, mounting and tagging of maps before putting them into market circulation. Mounts, frames and tags are seen as market devices that authenticates and frames maps with aesthetic and symbolic values along with the disentanglement of its use value. (2) The market-display: the constitution of space that encourages the relations of looking at maps as aesthetic objects, while the categorical arrangement of the display contributes to legitimising of the collectability of maps. (3) The salesmanship strategies of the trader: the match-making of customers with maps of meaningful value, and the mediating of knowledge through the verbal articulation of the map’s symbolic values. Ultimately, value is not created in an accumulative sense, but is layered and superimposed to cater to a wide spectrum of patrons. The trader creates demand for his goods by mediating and articulating value-regimes already coherent to potential patrons.

Keywords: art and material culture, brokerage, commodification, framing, markets, value

Procedia PDF Downloads 122
24036 Secure Multiparty Computations for Privacy Preserving Classifiers

Authors: M. Sumana, K. S. Hareesha

Abstract:

Secure computations are essential while performing privacy preserving data mining. Distributed privacy preserving data mining involve two to more sites that cannot pool in their data to a third party due to the violation of law regarding the individual. Hence in order to model the private data without compromising privacy and information loss, secure multiparty computations are used. Secure computations of product, mean, variance, dot product, sigmoid function using the additive and multiplicative homomorphic property is discussed. The computations are performed on vertically partitioned data with a single site holding the class value.

Keywords: homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data

Procedia PDF Downloads 393
24035 Prediction of Outcome after Endovascular Thrombectomy for Anterior and Posterior Ischemic Stroke: ASPECTS on CT

Authors: Angela T. H. Kwan, Wenjun Liang, Jack Wellington, Mohammad Mofatteh, Thanh N. Nguyen, Pingzhong Fu, Juanmei Chen, Zile Yan, Weijuan Wu, Yongting Zhou, Shuiquan Yang, Sijie Zhou, Yimin Chen

Abstract:

Background: Endovascular Therapy (EVT)—in the form of mechanical thrombectomy—following intravenous thrombolysis is the standard gold treatment for patients with acute ischemic stroke (AIS) due to large vessel occlusion (LVO). It is well established that an ASPECTS ≥ 7 is associated with an increased likelihood of positive post-EVT outcomes, as compared to an ASPECTS < 7. There is also prognostic utility in coupling posterior circulation ASPECTS (pc-ASPECTS) with magnetic resonance imaging for evaluating the post-EVT functional outcome. However, the value of pc-ASPECTS applied to CT must be explored further to determine its usefulness in predicting functional outcomes following EVT. Objective: In this study, we aimed to determine whether pc-ASPECTS on CT can predict post-EVT functional outcomes among patients with AIS due to LVO. Methods: A total of 247 consecutive patients aged 18 and over receiving EVT for LVO-related AIS were recruited into a prospective database. The data were retrospectively analyzed between March 2019 to February 2022 from two comprehensive tertiary care stroke centers: Foshan Sanshui District People’s Hospital and First People's Hospital of Foshan in China. Patient parameters included EVT within 24hrs of symptom onset, premorbid modified Rankin Scale (mRS) ≤ 2, presence of distal and terminal cerebral blood vessel occlusion, and subsequent 24–72-hour post-stroke onset CT scan. Univariate comparisons were performed using the Fisher exact test or χ2 test for categorical variables and the Mann–Whitney U test for continuous variables. A p-value of ≤ 0.05 was statistically significant. Results: A total of 247 patients met the inclusion criteria; however, 3 were excluded due to the absence of post-CTs and 8 for pre-EVT ASPECTS < 7. Overall, 236 individuals were examined: 196 anterior circulation ischemic strokes and 40 posterior strokes of basilar artery occlusion. We found that both baseline post- and pc-ASPECTS ≥ 7 serve as strong positive markers of favorable outcomes at 90 days post-EVT. Moreover, lower rates of inpatient mortality/hospice discharge, 90-day mortality, and 90-day poor outcome were observed. Moreover, patients in the post-ASPECTS ≥ 7 anterior circulation group had shorter door-to-recanalization time (DRT), puncture-to-recanalization time (PRT), and last known normal-to-puncture-time (LKNPT). Conclusion: Patients of anterior and posterior circulation ischemic strokes with baseline post- and pc-ASPECTS ≥ 7 may benefit from EVT.

Keywords: endovascular therapy, thrombectomy, large vessel occlusion, cerebral ischemic stroke, ASPECTS

Procedia PDF Downloads 71
24034 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created by using the source code, processed metrics from the same or previous version of code and related fault data. Some company do not store and keep track of all artifacts which are required for software fault prediction. To construct fault prediction model for such company, the training data from the other projects can be one potential solution. The earlier we predict the fault the less cost it requires to correct. The training data consists of metrics data and related fault data at function/module level. This paper investigates fault predictions at early stage using the cross-project data focusing on the design metrics. In this study, empirical analysis is carried out to validate design metrics for cross project fault prediction. The machine learning techniques used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as initial guideline for the projects where no previous fault data is available. We analyze seven data sets from NASA Metrics Data Program which offer design as well as code metrics. Overall, the results of cross project is comparable to the within company data learning.

Keywords: software metrics, fault prediction, cross project, within project.

Procedia PDF Downloads 314
24033 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: emotion recognition, facial recognition, signal processing, machine learning

Procedia PDF Downloads 292
24032 Cryptosystems in Asymmetric Cryptography for Securing Data on Cloud at Various Critical Levels

Authors: Sartaj Singh, Amar Singh, Ashok Sharma, Sandeep Kaur

Abstract:

With upcoming threats in a digital world, we need to work continuously in the area of security in all aspects, from hardware to software as well as data modelling. The rise in social media activities and hunger for data by various entities leads to cybercrime and more attack on the privacy and security of persons. Cryptography has always been employed to avoid access to important data by using many processes. Symmetric key and asymmetric key cryptography have been used for keeping data secrets at rest as well in transmission mode. Various cryptosystems have evolved from time to time to make the data more secure. In this research article, we are studying various cryptosystems in asymmetric cryptography and their application with usefulness, and much emphasis is given to Elliptic curve cryptography involving algebraic mathematics.

Keywords: cryptography, symmetric key cryptography, asymmetric key cryptography

Procedia PDF Downloads 94