Search results for: data transfer
25798 Big Data Analysis with Rhipe
Authors: Byung Ho Jung, Ji Eun Shin, Dong Hoon Lim
Abstract:
Rhipe, which integrates R with the Hadoop environment, makes it possible to process and analyze massive amounts of data in a distributed processing environment. In this paper, we implemented multiple regression analysis using Rhipe on real data sets of various sizes. Experiments comparing the performance of Rhipe with the stats package and with the biglm package available on bigmemory showed that Rhipe was faster than the other packages, owing to parallel processing in which the number of map tasks increases as the size of the data increases. We also compared the computing speeds of the pseudo-distributed and fully-distributed modes for configuring a Hadoop cluster. The results showed that fully-distributed mode was faster than pseudo-distributed mode, and that fully-distributed mode became faster as the number of data nodes increased.
Keywords: big data, Hadoop, Parallel regression analysis, R, Rhipe
Procedia PDF Downloads 495
25797 Mixed Tetravalent Cs₂RuₘPt₁-ₘX₆ (X = Cl-, Br-) Based Vacancy-Ordered Halide Double Perovskites for Enhanced Solar Water Oxidation
Authors: Jigar Shaileshumar Halpati, Aravind Kumar Chandiran
Abstract:
Vacancy-ordered double perovskites (VOPs) have attracted significant research attention due to their diverse chemical structures and interesting optoelectronic properties. Some VOPs have recently been reported to be suitable photoelectrodes for photoelectrochemical water-splitting reactions due to their high stability and panchromatic absorption. In this work, we systematically synthesized mixed tetravalent VOPs based on Cs₂RuₘPt₁-ₘX₆ (X = Cl-, Br-) and report their structural, optical, electrochemical and photoelectrochemical properties. The structural characterization confirms that the mixed tetravalent site intermediates formed their own phases. The parent materials, as well as their intermediates, were found to be stable in ambient conditions for over 1 year and also showed remarkable stability in harsh pH media ranging from pH 1 to pH 11. Moreover, these materials showed panchromatic absorption with an onset up to 1000 nm, depending on the mixture stoichiometry. The extraordinary stability and excellent absorption properties make them suitable materials for photoelectrochemical (PEC) water-splitting applications. PEC studies of this series of materials showed a high water oxidation photocurrent of 0.56 mA cm⁻² for Cs₂Ru₀.₅Pt₀.₅Cl₆. Fundamental investigation of the photoelectrochemical reactions revealed that the intrinsic ruthenium-based VOP showed enhanced hole transfer to the electrolyte, while the intrinsic platinum-based VOP showed higher photovoltage. Mixing these end members at the tetravalent site produced a synergistic effect of reduced charge transfer resistance from the material to the electrolyte and increased photovoltage, which led to increased PEC performance of the intermediate materials.
Keywords: solar water splitting, photo electrochemistry, photo absorbers, material characterization, device characterization, green hydrogen
Procedia PDF Downloads 74
25796 Performance Analysis of a Planar Membrane Humidifier for PEM Fuel Cell
Authors: Yu-Hsuan Chang, Jian-Hao Su, Chen-Yu Chen, Wei-Mon Yan
Abstract:
In this work, experimental measurements were used to examine the effects of membrane type and flow field design on the performance of a planar membrane humidifier. The performance indexes used to evaluate the humidifier include the dew point approach temperature (DPAT), water recovery ratio (WRR), water flux (J) and pressure loss (P). The experiments consist of three main parts. In the first part, a single membrane humidifier was tested using different flow fields under different dry-inlet temperatures. The measured results show that the dew point approach temperature decreases as the depth of the flow channel increases at the same channel width. However, the WRR and J decrease with an increase in the dry air-inlet temperature. The pressure loss tests indicate that pressure loss decreases with increasing hydraulic diameter of the flow channel, resulting from an increase in Darcy friction. Based on the comparison of humidifier performances and pressure losses, the flow channel of width W=1 and height H=1.5 was selected as the channel design of the multi-membrane humidifier in the second part of the experiment. In the second part, the multi-membrane humidifier was used to evaluate the humidification performance under different relative humidities and flow rates. The measurement results indicate that the humidifier has a higher DPAT but lower J and WRR when both the temperature and the relative humidity of the inlet dry air are lower. In addition, the counter flow approach has better mass and heat transfer performance than the parallel flow approach. Moreover, the effects of dry air temperature, relative humidity and humidification approach on the pressure loss in the planar membrane humidifier are not significant. In the third part, different membranes were tested in order to find out which kind of membrane is appropriate for the humidifier.
Keywords: water management, planar membrane humidifier, heat and mass transfer, pressure loss, PEM fuel cell
Procedia PDF Downloads 205
25795 Survival Data with Incomplete Missing Categorical Covariates
Authors: Madaki Umar Yusuf, Mohd Rizam B. Abubakar
Abstract:
Censored survival data with incomplete covariate data are a common occurrence in many studies in which the outcome is survival time. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights. The survival outcome is modelled within the class of generalized linear models, and this method requires the estimation of the parameters of the distribution of the covariates. In this paper, we consider some clinical trials with five covariates, four of which have some missing values, which clearly show that they were fully censored data.
Keywords: EM algorithm, incomplete categorical covariates, ignorable missing data, missing at random (MAR), Weibull Distribution
Procedia PDF Downloads 403
25794 mKDNAD: A Network Flow Anomaly Detection Method Based On Multi-teacher Knowledge Distillation
Abstract:
Machine-learning-based anomaly detection models for network flow perform poorly when the training data are extremely unbalanced, and they suffer from slow detection speed and high resource consumption when deployed on network edge devices. Embedding multi-teacher knowledge distillation (mKD) in anomaly detection can transfer knowledge from multiple teacher models to a single model. Inspired by this, we propose a state-of-the-art model, mKDNAD, to improve detection performance. mKDNAD mines and integrates the knowledge of the one-dimensional sequences and two-dimensional images implicit in network flow to improve the detection accuracy for small-sample classes. The multi-teacher knowledge distillation method guides the training of the student model, thus increasing the model's detection speed and reducing the number of model parameters. Experiments on the CICIDS2017 dataset verify the improvements of our method in detection speed and in detection accuracy for small-sample classes.
Keywords: network flow anomaly detection (NAD), multi-teacher knowledge distillation, machine learning, deep learning
Procedia PDF Downloads 120
25793 A Study of Blockchain Oracles
Authors: Abdeljalil Beniiche
Abstract:
A limitation of smart contracts is that they cannot access external data that might be required to control the execution of business logic. Oracles can be used to provide external data to smart contracts. An oracle is an interface that delivers data from sources outside the blockchain for a smart contract to consume. Oracles can deliver different types of data depending on the industry and requirements. In this paper, we study and describe the widely used blockchain oracles. Then, we elaborate on their potential role, technical architecture, and design patterns. Finally, we discuss the human oracle and its key role in solving the truth problem by reaching a consensus about a certain inquiry and tasks.
Keywords: blockchain, oracles, oracles design, human oracles
Procedia PDF Downloads 134
25792 Growth Performance Of fresh Water Microalgae Chlorella sp. Exposed to Carbon Dioxide
Authors: Titin Handayani, Adi Mulyanto, Fajar Eko Priyanto
Abstract:
It is generally recognized that algae could be an interesting option for reducing CO₂ emissions. Using light and CO₂, algae can be used for the production of various economically interesting products. Current algae cultivation techniques, however, still present a number of limitations. Efficient feeding of CO₂, especially on a large scale, is one of them. Current methods for feeding CO₂ to algae cultures rely on sparging pure CO₂ or flue gas directly. The limiting factor in this system is the solubility of CO₂ in water, which demands a considerable amount of energy for effective gas-to-liquid transfer and leads to losses to the atmosphere. Because current methods for introducing CO₂ into algae ponds are ineffective, very large surface areas would be required for enough ponds to capture a considerable amount of the CO₂. The purpose of this study is to assess a technology for capturing carbon dioxide (CO₂) emissions generated by industry using the microalgae Chlorella sp. The microalgae were cultivated in a raceway-type bioreactor culture pond. The result is expected to be useful in mitigating the effects of greenhouse gases by reducing CO₂ emissions. The research activities include: (1) characterization of boiler flue gas, (2) operation of the culture pond, (3) sampling and sample analysis. The results of this study showed that the initial assessment of flue gas absorption by microalgae using a 1000 L raceway pond equipped with a heat exchanger was quite promising. The transfer of CO₂ into the pond culture system ran well. This was shown by the successful cooling of the boiler flue gas from a temperature of about 200 °C to below ambient temperature. Besides the temperature, the gas bubbles entering the culture medium were quite fine. Therefore, the contact between the gas and the medium was good. The efficiency of CO₂ absorption by Chlorella sp. reached 6.68 % with an average CO₂ loading of 0.29 g/L/day.
Keywords: Chlorella sp., CO2 emission, heat exchange, microalgae, milk industry, raceway pond
Procedia PDF Downloads 215
25791 Multi Data Management Systems in a Cluster Randomized Trial in Poor Resource Setting: The Pneumococcal Vaccine Schedules Trial
Authors: Abdoullah Nyassi, Golam Sarwar, Sarra Baldeh, Mamadou S. K. Jallow, Bai Lamin Dondeh, Isaac Osei, Grant A. Mackenzie
Abstract:
A randomized controlled trial is the "gold standard" for evaluating the efficacy of an intervention. Large-scale, cluster-randomized trials are expensive and difficult to conduct, though. To guarantee the validity and generalizability of findings, high-quality, dependable, and accurate data management systems are necessary. Robust data management systems are crucial for optimizing and validating the quality, accuracy, and dependability of trial data. There is a scarcity of literature on the difficulties of data gathering in clinical trials in low-resource areas, which may raise concerns. Effective data management systems and implementation goals should be part of trial procedures. Publicizing the creative clinical data management techniques used in clinical trials should boost public confidence in the study's conclusions and encourage further replication. This report details the development and deployment of multi-data management systems and methodologies in the ongoing pneumococcal vaccine schedule study in rural Gambia. We implemented six different data management, synchronization, and reporting systems using Microsoft Access, RedCap, SQL, Visual Basic, Ruby, and ASP.NET. Additionally, data synchronization tools were developed to integrate data from these systems into the central server for reporting systems. Clinician, lab, and field data validation systems and methodologies are the main topics of this report. Our process development efforts across all domains were driven by the complexity of research project data collected in real time, online reporting, data synchronization, and methods for cleaning and verifying data. Consequently, we effectively used multi-data management systems, demonstrating the value of creative approaches in enhancing the consistency, accuracy, and reporting of trial data in a poor resource setting.
Keywords: data management, data collection, data cleaning, cluster-randomized trial
Procedia PDF Downloads 25
25790 Comparative Assessment of Geocell and Geogrid Reinforcement for Flexible Pavement: Numerical Parametric Study
Authors: Anjana R. Menon, Anjana Bhasi
Abstract:
The development of highways and railways plays a crucial role in a nation’s economic growth. While rigid concrete pavements are durable with high load-bearing characteristics, growing economies mostly rely on flexible pavements, which are easier to construct and more economical. The strength of a flexible pavement depends on the strength of the subgrade and the load distribution characteristics of the intermediate granular layers. In this scenario, to simultaneously meet economy and strength criteria, it is imperative to strengthen and stabilize the load-transferring layers, namely the subbase and base. Geosynthetic reinforcement in planar and cellular forms has been proven effective in improving soil stiffness and providing a stable load transfer platform. Studies have proven the relative superiority of the cellular form (geocells) over planar geosynthetic forms like geogrid, owing to the additional confinement of infill material and the pocket effect arising from vertical deformation. Hence, the present study investigates the efficiency of geocells relative to single- and multiple-layer geogrid reinforcements through a series of three-dimensional model analyses of a flexible pavement section under a standard repetitive wheel load. The stress transfer mechanism and deformation profiles under various reinforcement configurations are also studied. Geocell reinforcement is observed to take up a higher proportion of the stress caused by traffic loads than single- and double-layer geogrid reinforcements. The efficiency of single geogrid reinforcement reduces with an increase in embedment depth. The contribution of the lower geogrid is insignificant in the double-geogrid reinforced system.
Keywords: Geocell, Geogrid, Flexible Pavement, Repetitive Wheel Load, Numerical Analysis
Procedia PDF Downloads 74
25789 Analysis of Landscape Pattern Evolution in Banan District, Chongqing, Based on GIS and FRAGSTATS
Authors: Wenyang Wan
Abstract:
The study of urban land use and landscape pattern is a current hotspot in the fields of planning and design, ecology, etc., and is of great significance for building the overall humanistic ecosystem of the city and optimizing the urban spatial structure. Banan District, as the main part of the eastern eco-city planning of Chongqing Municipality, is a new high ground for highlighting the ecological characteristics of Chongqing, realizing effective transformation of ecological value, and promoting the integrated development of urban and rural areas. The land use transfer matrix (GIS) and landscape pattern index (Fragstats) methods were used to study the characteristics and laws of the evolution of the land use landscape pattern in Banan District from 2000 to 2020, which provides some reference value for Banan District in alleviating the ecological contradictions of its landscape. The results of the study show that: ① Banan District is rich in land use types, of which cultivated land still accounted for 57.15% of the total landscape area in 2020, giving it an absolute advantage in the land use structure of Banan District; ② from 2000 to 2020, land use conversion in Banan District ranked as: cropland > woodland > grassland > shrubland > built-up land > water bodies > wetlands, with the conversion of cropland to built-up land being the largest; ③ from 2000 to 2020, the landscape elements of Banan District were distributed in a balanced way and the landscape types were rich and diversified, but, owing to human interference, the shapes of the landscape elements tended to be irregular, the dominant patches were scattered, and the patches had poor connectivity. It is recommended that, in future regional ecological construction, the layout be rationally optimized, the relationship between landscape components be coordinated, the connectivity between landscape patches be strengthened, and the degree of landscape fragmentation be reduced.
Keywords: land use transfer, landscape pattern evolution, GIS and FRAGSTATS, Banan District
Procedia PDF Downloads 78
25788 Finding Bicluster on Gene Expression Data of Lymphoma Based on Singular Value Decomposition and Hierarchical Clustering
Authors: Alhadi Bustaman, Soeganda Formalidin, Titin Siswantining
Abstract:
DNA microarray technology is used to analyze thousands of gene expression data simultaneously, a very important task for drug development and testing, function annotation, and cancer diagnosis. Various clustering methods have been used for analyzing gene expression data. However, when analyzing very large and heterogeneous collections of gene expression data, conventional clustering methods often cannot produce a satisfactory solution. Biclustering algorithms have been used as an alternative approach to identifying structures in gene expression data. In this paper, we introduce a transform technique based on singular value decomposition to obtain a normalized matrix of gene expression data, followed by the Mixed-Clustering algorithm and the Lift algorithm, inspired by the node-deletion and node-addition phases proposed by Cheng and Church and based on Agglomerative Hierarchical Clustering (AHC). An experimental study on standard datasets demonstrated the effectiveness of the algorithm on gene expression data.
Keywords: agglomerative hierarchical clustering (AHC), biclustering, gene expression data, lymphoma, singular value decomposition (SVD)
Procedia PDF Downloads 276
25787 An Efficient Traceability Mechanism in the Audited Cloud Data Storage
Authors: Ramya P, Lino Abraham Varghese, S. Bose
Abstract:
With cloud storage services, data can be stored in the cloud and shared across multiple users. Unexpected hardware or software failures and human errors can easily cause data stored in the cloud to be lost or corrupted, compromising the integrity of the data. Some mechanisms have been designed to allow both data owners and public verifiers to efficiently audit cloud data integrity without retrieving the entire data from the cloud server. However, public auditing of the integrity of shared data with the existing mechanisms unavoidably reveals confidential information, such as the identity of the person, to public verifiers. Here, a privacy-preserving mechanism is proposed to support public auditing of shared data stored in the cloud. It uses group signatures to compute the verification metadata needed to audit the correctness of shared data. The identity of the signer of each block in the shared data is kept confidential from public verifiers, who can still verify shared data integrity without retrieving the entire file. On demand, however, the signer of each block is revealed to the owner alone. The group private key is generated once by the owner in a static group, whereas in a dynamic group the group private key is changed when users are revoked from the group. When users leave the group, the blocks they have already signed are re-signed by the cloud service provider instead of the owner, which is handled efficiently by a proxy re-signature scheme.
Keywords: data integrity, dynamic group, group signature, public auditing
Procedia PDF Downloads 391
25786 Securing Health Monitoring in Internet of Things with Blockchain-Based Proxy Re-Encryption
Authors: Jerlin George, R. Chitra
Abstract:
Devices with sensors that can monitor your temperature, heart rate, and other vital signs and link to the internet, known as the Internet of Things (IoT), have completely transformed the way we manage health. By providing real-time health data, these sensors improve diagnostics and treatment outcomes. Security and privacy matter when IoT comes into play in healthcare. Cyberattacks on centralized database systems are also a problem. To address these challenges, the study uses blockchain technology coupled with proxy re-encryption to secure health data. The ThingSpeak IoT cloud analyzes the collected data and turns them into blockchain transactions, which are kept safely on the DriveHQ cloud. Transparency and data integrity are ensured by blockchain, and secure data sharing among authorized users is made possible by proxy re-encryption. This results in a health monitoring system that preserves the accuracy and confidentiality of data while reducing the safety risks of IoT-driven healthcare applications.
Keywords: internet of things, healthcare, sensors, electronic health records, blockchain, proxy re-encryption, data privacy, data security
Procedia PDF Downloads 13
25785 Rodriguez Diego, Del Valle Martin, Hargreaves Matias, Riveros Jose Luis
Authors: Nathainail Bashir, Neil Anderson
Abstract:
The objective of this study was to investigate the current state of practice with regard to karst detection methods and to recommend the best method and array pattern for acquiring the desired results. Proper site investigation in karst-prone regions is extremely valuable in determining the location of possible voids. Two geophysical techniques were employed: multichannel analysis of surface waves (MASW) and electric resistivity tomography (ERT). The MASW data were acquired at each test location using different array lengths and different array orientations (to increase the probability of getting interpretable data in karst terrain). The ERT data were acquired using a dipole-dipole array consisting of 168 electrodes. The MASW data were interpreted (re: estimated depth to physical top of rock) and used to constrain and verify the interpretation of the ERT data. The ERT data indicate that poorer-quality MASW data were acquired in areas where there was significant local variation in the depth to top of rock.
Keywords: dipole-dipole, ERT, Karst terrains, MASW
Procedia PDF Downloads 314
25784 Data Science in Military Decision-Making: A Semi-Systematic Literature Review
Authors: H. W. Meerveld, R. H. A. Lindelauf
Abstract:
In contemporary warfare, data science is crucial for the military in achieving information superiority. Yet, to the authors’ knowledge, no extensive literature survey on data science in military decision-making has been conducted so far. In this study, 156 peer-reviewed articles were analysed through an integrative, semi-systematic literature review to gain an overview of the topic. The study examined to what extent literature is focussed on the opportunities or risks of data science in military decision-making, differentiated per level of war (i.e. strategic, operational, and tactical level). A relatively large focus on the risks of data science was observed in social science literature, implying that political and military policymakers are disproportionally influenced by a pessimistic view on the application of data science in the military domain. The perceived risks of data science are, however, hardly addressed in formal science literature. This means that the concerns on the military application of data science are not addressed to the audience that can actually develop and enhance data science models and algorithms. Cross-disciplinary research on both the opportunities and risks of military data science can address the observed research gaps. Considering the levels of war, relatively low attention for the operational level compared to the other two levels was observed, suggesting a research gap with reference to military operational data science. Opportunities for military data science mostly arise at the tactical level. On the contrary, studies examining strategic issues mostly emphasise the risks of military data science. Consequently, domain-specific requirements for military strategic data science applications are hardly expressed. Lacking such applications may ultimately lead to a suboptimal strategic decision in today’s warfare.
Keywords: data science, decision-making, information superiority, literature review, military
Procedia PDF Downloads 165
25783 Legal Regulation of Personal Information Data Transmission Risk Assessment: A Case Study of the EU’s DPIA
Authors: Cai Qianyi
Abstract:
In the midst of the global digital revolution, the flow of data poses security threats that call China's existing legislative framework for protecting personal information into question. As a preliminary procedure for risk analysis and prevention, the risk assessment of personal data transmission lacks detailed guidelines for support. Existing provisions reveal unclear responsibilities for network operators and weakened rights for data subjects. Furthermore, the regulatory system's weak operability and a lack of industry self-regulation heighten data transmission hazards. This paper aims to compare the regulatory pathways for data information transmission risks between China and Europe from a legal framework and content perspective. It draws on the “Data Protection Impact Assessment Guidelines” to empower multiple stakeholders, including data processors, controllers, and subjects, while also defining obligations. In conclusion, this paper intends to solve China's digital security shortcomings by developing a more mature regulatory framework and industry self-regulation mechanisms, resulting in a win-win situation for personal data protection and the development of the digital economy.
Keywords: personal information data transmission, risk assessment, DPIA, internet service provider
Procedia PDF Downloads 58
25782 Wavelets Contribution on Textual Data Analysis
Authors: Habiba Ben Abdessalem
Abstract:
The emergence of giant sets of textual data has pushed researchers to invest in this field. The purpose of textual data analysis methods is to facilitate access to this type of data by providing various graphic visualizations. Applying these methods requires a corpus pretreatment step, whose standards are set according to the objective of the problem studied. This step determines the list of forms contained in the contingency table by keeping only those that carry information. This step may, however, lead to noisy contingency tables, hence the use of a wavelet denoising function. The validity of the proposed approach is tested on a text database covering economic and political events in Tunisia over a well-defined period.
Keywords: textual data, wavelet, denoising, contingency table
Procedia PDF Downloads 276
25781 A Study to Examine the Use of Traditional Agricultural Practices to Fight the Effects of Climate Change
Authors: Rushva Parihar, Anushka Barua
Abstract:
The negative repercussions of a warming planet are already visible, with biodiversity loss, water scarcity, and extreme weather events becoming ever more frequent. The agriculture sector is perhaps the most impacted, and modern agriculture has failed to defend farmers from the effects of climate change. This, coupled with the added pressure of higher demand for food production caused by population growth, has only compounded the impact. Traditional agricultural practices that are rooted in indigenous knowledge have long safeguarded the delicate balance of the ecosystem through sustainable production techniques. This paper uses secondary data to explore these traditional processes (like Beejamrita, Jeevamrita, sheep penning, earthen bunding, and others) from around the world that have been developed over centuries and focuses on how they can be used to tackle contemporary issues arising from climate change (such as nutrient and water loss, soil degradation, and increased incidence of pests). Finally, the resulting framework has been applied to the context of Indian agriculture as a means to combat climate change and improve food security, all while encouraging documentation and transfer of local knowledge as a shared resource among farmers.
Keywords: sustainable food systems, traditional agricultural practices, climate smart agriculture, climate change, indigenous knowledge
Procedia PDF Downloads 125
25780 Customer Churn Analysis in Telecommunication Industry Using Data Mining Approach
Authors: Burcu Oralhan, Zeki Oralhan, Nilsun Sariyer, Kumru Uyar
Abstract:
Data mining has become increasingly important and has found a wide range of applications in recent years. Data mining is the process of finding hidden and unknown patterns in big data. One of the applied fields of data mining is Customer Relationship Management. Understanding the relationships between products and customers is crucial for every business. Customer Relationship Management is an approach that focuses on developing customer relationships, retaining customers, and increasing customer satisfaction. In this study, we applied data mining methods to customer relationship management in telecommunications. This study aims to determine the profile of customers who are likely to leave the system and to develop marketing strategies and customized campaigns for customers. The data are grouped by applying classification techniques in order to identify the churners. As a result of this study, we will obtain knowledge from the international telecommunication industry. We will contribute to the understanding and development of this subject in Customer Relationship Management.
Keywords: customer churn analysis, customer relationship management, data mining, telecommunication industry
Procedia PDF Downloads 316
25779 On Pooling Different Levels of Data in Estimating Parameters of Continuous Meta-Analysis
Authors: N. R. N. Idris, S. Baharom
Abstract:
A meta-analysis may be performed using aggregate data (AD) or individual patient data (IPD). In practice, studies may be available at both the IPD and AD levels. In this situation, both the IPD and AD should be utilised in order to maximize the available information. The statistical advantages of combining studies from different levels have not been fully explored. This study aims to quantify the statistical benefits of including available IPD when conducting a conventional summary-level meta-analysis. Simulated meta-analyses were used to assess the influence of the levels of data on overall meta-analysis estimates based on IPD only, AD only and the combination of IPD and AD (mixed data, MD), under different study scenarios. The percentage relative bias (PRB), root mean-square-error (RMSE) and coverage probability were used to assess the efficiency of the overall estimates. The results demonstrate that available IPD should always be included in a conventional meta-analysis using summary-level data, as they significantly increase the accuracy of the estimates. On the other hand, if more than 80% of the available data are at the IPD level, including the AD does not make a significant difference to the accuracy of the estimates. Additionally, combining the IPD and AD has a moderating effect on the bias of the treatment effect estimates, as the IPD tend to overestimate the treatment effects while the AD tend to underestimate them. These results may provide some guidance in deciding whether a significant benefit is gained by pooling the two levels of data when conducting a meta-analysis.
Keywords: aggregate data, combined-level data, individual patient data, meta-analysis
Procedia PDF Downloads 37325778 Analyzing On-Line Process Data for Industrial Production Quality Control
Authors: Hyun-Woo Cho
Abstract:
Industrial production quality must be monitored so that unusual operating conditions trigger an early warning. Furthermore, identifying their assignable causes is necessary for quality control. Many multivariate statistical techniques have been applied to such tasks and have been shown to be quite effective tools. This work presents a process data-based monitoring scheme for production processes. For more reliable results, additional steps of noise filtering and preprocessing are considered; these may enhance performance by eliminating unwanted variation in the data. Performance is evaluated using data sets from test processes. The proposed method is shown to provide reliable quality control results and is thus more effective for quality monitoring in the example. For practical implementation of the method, an on-line data system must be available to gather historical and on-line data. Large amounts of data are now collected on-line in most processes, so implementation of the current scheme is feasible and places no additional burden on users.
Keywords: detection, filtering, monitoring, process data
Procedia PDF Downloads 557
25777 A Review of Travel Data Collection Methods
Authors: Muhammad Awais Shafique, Eiji Hato
Abstract:
Household trip data are of crucial importance for managing present transportation infrastructure as well as for planning and designing future facilities. They also provide the basis for new policies implemented under Transportation Demand Management. The methods used for household trip data collection have changed with the passage of time, starting with conventional face-to-face or paper-and-pencil interviews and reaching the recent approach of employing smartphones. This study summarizes the step-wise evolution of travel data collection methods. It provides a comprehensive review of the topic for readers interested in the changing trends in the data collection field.
Keywords: computer, smartphone, telephone, travel survey
Procedia PDF Downloads 311
25776 A Business-to-Business Collaboration System That Promotes Data Utilization While Encrypting Information on the Blockchain
Authors: Hiroaki Nasu, Ryota Miyamoto, Yuta Kodera, Yasuyuki Nogami
Abstract:
To promote initiatives such as Industry 4.0 and Society 5.0, it is important to connect and share data in a way that every member can trust. Blockchain (BC) technology is currently attracting attention as the most advanced tool for this purpose and has been used in the financial field, among others. However, data collaboration using BC has not progressed sufficiently among companies in the manufacturing supply chain, which handle sensitive data such as product quality and manufacturing conditions. There are two main reasons why data utilization is not sufficiently advanced in the industrial supply chain. The first is that manufacturing information is top secret and a source of profit for companies, so it is difficult to disclose data even between companies that trade with each other in the supply chain. In blockchain mechanisms such as Bitcoin that use PKI (Public Key Infrastructure), the plaintext must be shared between companies in order to confirm the identity of the company that sent the data. The second is that the merits (scenarios) of data collaboration between companies have not been concretely specified for the industrial supply chain. To address these problems, this paper proposes a business-to-business (B2B) collaboration system using homomorphic encryption and BC techniques. Using the proposed system, each company in the supply chain can exchange confidential information as encrypted data and utilize the data for its own business. In addition, this paper considers a scenario focusing on quality data, on which collaboration has been difficult because the data are top secret. In this scenario, we show an implementation scheme and a concrete benefit of data collaboration by proposing a comparison protocol that can detect changes in quality while hiding the numerical values of the quality data.
Keywords: business to business data collaboration, industrial supply chain, blockchain, homomorphic encryption
Procedia PDF Downloads 135
25775 Multivariate Assessment of Mathematics Test Scores of Students in Qatar
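To make the idea of comparing quality values without revealing them more concrete, the sketch below uses a toy Paillier cryptosystem, which is additively homomorphic. This is an assumption chosen for illustration: the abstract does not say which homomorphic scheme or comparison protocol the authors use, the function names are hypothetical, and the tiny primes are for demonstration only and offer no security.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair with generator g = n + 1 (demo primes, insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid shortcut because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m * n, so no large exponentiation is needed.
    return ((1 + m * n) * pow(r, n, n2)) % n2


def decrypt(n, secret, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = secret
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def encrypted_difference(n, c_new, c_old):
    """Ciphertext of (new - old) mod n, computed without decrypting either input."""
    n2 = n * n
    return (c_new * pow(c_old, -1, n2)) % n2


def to_signed(n, m):
    """Map a residue mod n back to a signed integer."""
    return m - n if m > n // 2 else m


if __name__ == "__main__":
    n, secret = keygen()
    c_old = encrypt(n, 120)  # hypothetical quality value from an earlier lot
    c_new = encrypt(n, 113)  # hypothetical quality value from a later lot
    c_diff = encrypted_difference(n, c_new, c_old)
    print(to_signed(n, decrypt(n, secret, c_diff)))
```

In this sketch the party that computes `encrypted_difference` sees only ciphertexts, and the key holder who decrypts the result learns the change in quality but neither of the two underlying values.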
Authors: Ali Rashash Alzahrani, Elizabeth Stojanovski
Abstract:
Data on various aspects of education are collected regularly at the institutional and government levels. In Australia, for example, students at various levels of schooling undertake examinations in numeracy and literacy as part of NAPLAN testing, enabling longitudinal assessment of such data as well as comparisons between schools and states within Australia. Another source of educational data, collected internationally, is the PISA study, which collects data from several countries when students are approximately 15 years of age and enables comparisons of performance in science, mathematics and English between countries, as well as a ranking of countries based on performance in these standardised tests. In addition to student and school outcomes based on the tests taken as part of the PISA study, a wealth of other data is collected in the study, including parental demographic data and data on the teaching strategies used by educators. Overall, an abundance of educational data is available that has the potential to be used to help improve educational attainment and the teaching of content in order to improve learning outcomes. A multivariate assessment of such data enables multiple variables to be considered simultaneously and will be used in the present study to help develop profiles of students based on performance in mathematics, using data obtained from the PISA study.
Keywords: cluster analysis, education, mathematics, profiles
Procedia PDF Downloads 123
25774 Induced Thermo-Osmotic Convection for Heat and Mass Transfer
Authors: Francisco J. Arias
Abstract:
Consideration is given to a mechanism of heat and mass transport in solutions that is similar to natural convection but with one important difference. Here the mechanism is not promoted by density differences in the fluid arising from temperature gradients (coefficient of thermal expansion) but rather by solubility differences due to the thermal dependence of the solubility (coefficient of thermal solubility). Using a simplified physical model, it is shown that, with a proper choice of the concentration of a given solution, convection might be induced by the alternating precipitation of the solute when the solution becomes supersaturated and its subsequent recombination when the temperature changes. The spontaneous change in the Gibbs free energy during mixing is the driving force for the mechanism. The maximum extractable energy from this new type of thermal convection is derived. Experimental data obtained from a closed-loop circuit demonstrate the feasibility of continuous separation and recombination of the solution. This type of heat and mass transport, which does not depend on gravity, might be of interest for downward heat and mass transport (as from solar-roof collectors into homes), horizontal transport (e.g., microelectronic applications), and transport in microgravity (space technology). Also, because the coefficient of thermal solubility can be positive or negative, the investigated thermo-osmotic convection can be used for either heating or cooling.
Keywords: natural convection, thermal gradient, solubility, osmotic pressure
Procedia PDF Downloads 291
25773 Analysis of the Evolution of Landscape Spatial Patterns in Banan District, Chongqing, China
Authors: Wenyang Wan
Abstract:
The study of urban land use and landscape patterns is a current hotspot in fields such as planning and design and ecology, and it is of great significance for building the city's overall humanistic ecosystem and optimizing the urban spatial structure. Banan District, as the main part of the eastern eco-city planning of Chongqing Municipality, is a key area for highlighting the ecological characteristics of Chongqing, realizing the effective transformation of ecological value, and promoting the integrated development of urban and rural areas. The land use transfer matrix (GIS) and landscape pattern index (Fragstats) methods were used to study the characteristics and laws of the evolution of the land use landscape pattern in Banan District from 2000 to 2020, providing a reference for Banan District in alleviating the ecological contradictions of its landscape. The results of the study show that: ① Banan District is rich in land use types, of which cultivated land still accounted for 57.15% of the total landscape area in 2020, giving it an absolute advantage in the land use structure of Banan District; ② from 2000 to 2020, land use conversion in Banan District ranked as cropland > woodland > grassland > shrubland > built-up land > water bodies > wetlands, with the conversion of cropland to built-up land being the largest; ③ from 2000 to 2020, the landscape elements of Banan District were distributed in a balanced way and the landscape types were rich and diversified, but owing to human interference, the shapes of the landscape elements tended to be irregular, the dominant patches were scattered, and the patches had poor connectivity.
It is recommended that in future regional ecological construction, the layout should be rationally optimized, the relationship between landscape components should be coordinated, the connectivity between landscape patches should be strengthened, and the degree of landscape fragmentation should be reduced.
Keywords: land use transfer, landscape pattern evolution, GIS and Fragstats, Banan district
Procedia PDF Downloads 70
25772 Changing Misconceptions in Heat Transfer: A Problem Based Learning Approach for Engineering Students
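A land use transfer matrix of the kind mentioned in this abstract can be sketched in a few lines. The example assumes the two land use maps have already been flattened into equal-length lists of per-cell class labels; the function name and the sample labels are hypothetical, and a real study would read the rasters with GIS software rather than hard-code them.

```python
from collections import Counter


def transfer_matrix(labels_start, labels_end):
    """Count cells moving from each class at the start year to each class at the end year.

    Returns the sorted class list and a square matrix in which
    matrix[i][j] is the number of cells that changed from classes[i] to classes[j].
    """
    if len(labels_start) != len(labels_end):
        raise ValueError("both maps must cover the same cells")
    counts = Counter(zip(labels_start, labels_end))
    classes = sorted(set(labels_start) | set(labels_end))
    matrix = [[counts[(a, b)] for b in classes] for a in classes]
    return classes, matrix


if __name__ == "__main__":
    # Four made-up cells: two cropland cells become built-up land.
    start = ["cropland", "cropland", "woodland", "cropland"]
    end = ["cropland", "built-up", "woodland", "built-up"]
    classes, matrix = transfer_matrix(start, end)
    for name, row in zip(classes, matrix):
        print(name, row)
```

The diagonal of the matrix holds the cells that kept their class, and the off-diagonal entries are the conversions; multiplying each count by the cell area would give converted area per class pair.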
Authors: Paola Utreras, Yazmina Olmos, Loreto Sanhueza
Abstract:
The purpose of this work is to study and incorporate Problem Based Learning (PBL) for engineering students through the analysis of thermal images of dwellings located at different geographical points of the Region de los Ríos, Chile. The students analyze how heat is transferred into and out of the houses and how heat transfer relates to the climatic conditions that affect each zone. As a result of this activity, students acquire significant learning in the unit on heat and temperature and manage to reverse previous conceptual errors related to energy, temperature and heat. In addition, students are able to generate prototype solutions to increase thermal efficiency using low-cost materials. Students publish their results in a report that follows scientific writing standards and present them at a science fair open to the entire university community. Previous conceptual errors were measured by applying, before the unit, diagnostic tests with everyday questions involving the concepts of heat, temperature, work and energy. After the unit, the same evaluation is repeated so that the students themselves can see the evolution in their construction of knowledge. We found that in the initial test 90% of the students showed deficiencies in the concepts mentioned above, and in the subsequent test 47% showed deficiencies; these percentages differ between students taking the course for the first time and those who had previously taken it in a traditional way. Significant learning was measured by comparing results in subsequent thermodynamics courses between students who received problem-based learning and those who received traditional training.
We have observed that learning becomes meaningful when it is applied to students' daily lives, promoting the internalization of knowledge and understanding through critical thinking.
Keywords: engineering students, heat flow, problem-based learning, thermal images
Procedia PDF Downloads 231
25771 Dataset Quality Index: Development of Composite Indicator Based on Standard Data Quality Indicators
Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros
Abstract:
Nowadays, poor data quality is considered one of the major costs of a data project. A data project with data quality awareness spends a large share of its time on data quality processes, while a data project without data quality awareness suffers negative impacts on financial resources, efficiency, productivity, and credibility. One of the processes that takes a long time is defining the expectations and measurements of data quality, because the expectations differ according to the purpose of each data project. This is especially true of big data projects, which may involve many datasets and stakeholders and therefore need a long time to discuss and define quality expectations and measurements. Therefore, this study aimed to develop meaningful indicators that describe the overall data quality of each dataset for quick comparison and prioritization. The objectives of this study were to: (1) develop practical data quality indicators and measurements, (2) develop data quality dimensions based on statistical characteristics, and (3) develop a composite indicator that describes the overall data quality of each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After the datasets were collected, the Dataset Quality Index (SDQI) was developed in five steps. First, we defined standard data quality expectations. Second, we identified indicators that can be measured directly on the data within datasets. Third, the indicators were aggregated into dimensions using factor analysis. Fourth, the indicators and dimensions were weighted by the effort required for the data preparation process and by usability. Finally, the dimensions were aggregated into the composite indicator. The results showed that: (1) the developed indicators and measurements comprised ten indicators; (2) for the data quality dimensions based on statistical characteristics, the ten indicators could be reduced to four dimensions; (3) the SDQI can describe the overall quality of each dataset and can be separated into three levels: Good Quality, Acceptable Quality, and Poor Quality. In conclusion, the SDQI provides an overall description of data quality within datasets together with a meaningful composition. The SDQI can be used to assess all data in a data project, for effort estimation and for prioritization. The SDQI also works well with the Agile method: it can be used for assessment in the first sprint, and after a dataset passes the initial evaluation, more specific data quality indicators can be added in the next sprint.
Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis
Procedia PDF Downloads 138
25770 Observations of Magnetospheric ULF Waves in Connection to the Kelvin-Helmholtz Instability at Mercury
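The final aggregation step described in this abstract, combining weighted dimensions into one index and mapping it to a quality level, might look like the sketch below. The dimension names, weights, score scale (0 to 1) and level thresholds are all hypothetical placeholders; the abstract does not give the authors' actual dimensions, weights, or cut-offs, and the factor-analysis step is not reproduced here.

```python
def composite_index(dimension_scores, weights):
    """Weighted average of dimension scores (scores assumed to lie in [0, 1])."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(dimension_scores[name] * w for name, w in weights.items())
    return weighted_sum / total_weight


def quality_level(index, good=0.8, acceptable=0.5):
    """Map an index to one of three levels; the thresholds are illustrative only."""
    if index >= good:
        return "Good Quality"
    if index >= acceptable:
        return "Acceptable Quality"
    return "Poor Quality"


if __name__ == "__main__":
    # Four placeholder dimensions standing in for the four found by factor analysis.
    scores = {"dim1": 1.0, "dim2": 0.5, "dim3": 0.5, "dim4": 0.0}
    weights = {"dim1": 2, "dim2": 1, "dim3": 1, "dim4": 1}
    index = composite_index(scores, weights)
    print(index, quality_level(index))
```

Normalizing by the total weight keeps the index on the same scale as the dimension scores, which makes the level thresholds easier to interpret across datasets.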
Authors: Elisabet Liljeblad, Tomas Karlsson, Torbjorn Sundberg, Anita Kullen
Abstract:
The magnetospheric magnetic field data from the MESSENGER spacecraft are investigated to establish the presence of ultra-low frequency (ULF) waves in connection with 131 previously observed nonlinear Kelvin-Helmholtz waves (KHWs) at Mercury. Distinct ULF signatures are detected in 44 of the 131 magnetospheric traversals prior to or after observing a KHW. In particular, 39 of these 44 ULF events are highly coherent at the frequency of maximum power spectral density. The waves observed at the dayside, which appear mainly at the duskside, naturally following the KHW occurrence asymmetry, are significantly different from the events behind the dawn-dusk terminator and have the following distinct wave characteristics: they oscillate clearly in the perpendicular (azimuthal) direction to the mean magnetic field with a wave normal angle more in the parallel than the perpendicular direction, increase in absolute ellipticity with distance from noon, are almost exclusively right-hand polarized, and are observed mainly at frequencies in the range 0.02-0.04 Hz. These results indicate that the dayside ULF waves are likely to be shear Alfvén waves driven by KHWs at the magnetopause, which in turn demonstrates the importance of the Kelvin-Helmholtz instability for mass transport throughout the Mercury magnetosphere.
Keywords: ultra-low frequency waves, Kelvin-Helmholtz instability, magnetospheric processes, Mercury, MESSENGER, energy and momentum transfer in planetary environments
Procedia PDF Downloads 238
25769 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm
Authors: Ameur Abdelkader, Abed Bouarfa Hafida
Abstract:
Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves applying a set of data processing techniques and algorithms to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and derive the unknown outcome. With the advent of big data, many studies have suggested using predictive analytics to process and analyze big data. Nevertheless, they have been curbed by the limits of classical predictive analysis methods when the amount of data is large. In fact, because of its volume, its nature (semi-structured or unstructured) and its variety, big data cannot be analyzed efficiently with classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of computation. In this paper, we propose to extend the predictive analysis algorithm Classification And Regression Trees (CART) in order to adapt it to big data analysis. The major changes to this algorithm are presented, and a version of the extended algorithm is then defined to make it applicable to a huge quantity of data.
Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm
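One reason a CART-style split search can be distributed is that the Gini impurity of a candidate split depends only on class counts, which can be computed per data partition and then summed. The sketch below illustrates that general idea for a single numeric feature; it is not the authors' extended algorithm, which the abstract does not describe in detail, and the function names are hypothetical.

```python
from collections import Counter


def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def partition_counts(rows, threshold):
    """Class counts on each side of `x <= threshold` for one data partition."""
    left, right = Counter(), Counter()
    for x, label in rows:
        if x <= threshold:
            left[label] += 1
        else:
            right[label] += 1
    return left, right


def split_impurity(partitions, threshold):
    """Weighted Gini of a split, merging counts computed separately per partition."""
    left, right = Counter(), Counter()
    for rows in partitions:  # each call could run on a different worker
        part_left, part_right = partition_counts(rows, threshold)
        left += part_left
        right += part_right
    n_left, n_right = sum(left.values()), sum(right.values())
    return (n_left * gini(left) + n_right * gini(right)) / (n_left + n_right)


def best_threshold(partitions):
    """Candidate threshold (taken from observed values) with the lowest weighted Gini."""
    candidates = sorted({x for rows in partitions for x, _ in rows})
    return min(candidates, key=lambda t: split_impurity(partitions, t))


if __name__ == "__main__":
    # Two made-up partitions of (feature value, class label) rows.
    parts = [[(1, "a"), (2, "a")], [(3, "b"), (4, "b")]]
    print(best_threshold(parts))
```

Here the per-partition loop runs sequentially for simplicity; in a distributed setting only the small count tables, not the rows themselves, would need to be sent back and merged.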
Procedia PDF Downloads 139