Search results for: maximal data sets
25074 Development of Sustainable Building Environmental Model (SBEM) in Hong Kong
Authors: Kwok W. Mui, Ling T. Wong, F. Xiao, Chin T. Cheung, Ho C. Yu
Abstract:
This study addresses a concept of the Sustainable Building Environmental Model (SBEM) developed to optimize energy consumption in air conditioning and ventilation (ACV) systems without any deterioration of indoor environmental quality (IEQ). The SBEM incorporates two main components: an adaptive comfort temperature control module (ACT) and a new carbon dioxide demand control module (nDCV). These two modules take an innovative approach to maintain satisfaction of the Indoor Environmental Quality (IEQ) with optimum energy consumption, they provide a rational basis of effective control. A total of 2133 sets of measurement data of indoor air temperature (Ta), relative humidity (Rh) and carbon dioxide concentration (CO2) were conducted in some Hong Kong offices to investigate the potential of integrating the SBEM. A simulation was used to evaluate the dynamic performance of the energy and air conditioning system with the integration of the SBEM in an air-conditioned building. It allows us make a clear picture of the control strategies and performed any pre-tuned of controllers before utilized in real systems. With the integration of SBEM, it was able to save up to 12.3% in simulation and 15% in field measurement of overall electricity consumption, and maintain the average carbon dioxide concentration within 1000ppm and occupant dissatisfaction in 20%.Keywords: sustainable building environmental model (SBEM), adaptive comfort temperature (ACT), new demand control ventilation (nDCV), energy saving
Procedia PDF Downloads 63625073 Application of Artificial Neural Network Technique for Diagnosing Asthma
Authors: Azadeh Bashiri
Abstract:
Introduction: Lack of proper diagnosis and inadequate treatment of asthma leads to physical and financial complications. This study aimed to use data mining techniques and creating a neural network intelligent system for diagnosis of asthma. Methods: The study population is the patients who had visited one of the Lung Clinics in Tehran. Data were analyzed using the SPSS statistical tool and the chi-square Pearson's coefficient was the basis of decision making for data ranking. The considered neural network is trained using back propagation learning technique. Results: According to the analysis performed by means of SPSS to select the top factors, 13 effective factors were selected, in different performances, data was mixed in various forms, so the different models were made for training the data and testing networks and in all different modes, the network was able to predict correctly 100% of all cases. Conclusion: Using data mining methods before the design structure of system, aimed to reduce the data dimension and the optimum choice of the data, will lead to a more accurate system. Therefore, considering the data mining approaches due to the nature of medical data is necessary.Keywords: asthma, data mining, Artificial Neural Network, intelligent system
Procedia PDF Downloads 27325072 Interpreting Privacy Harms from a Non-Economic Perspective
Authors: Christopher Muhawe, Masooda Bashir
Abstract:
With increased Internet Communication Technology(ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has been in tandem with an increase in data misuse and data breach. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in the United States courts for the failure of proof of direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance negates the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical harm perspective negates the fact that data insecurity may result into harms which run counter the functions of privacy in our lives. The promotion of liberty, selfhood, autonomy, promotion of human social relations and the furtherance of the existence of a free society. There is no economic value that can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.Keywords: data breach and misuse, economic harms, privacy harms, psychological harms
Procedia PDF Downloads 19525071 Union-Primes and Immediate Neighbors
Authors: Shai Sarussi
Abstract:
The union of a nonempty chain of prime ideals in a noncommutative ring is not necessarily a prime ideal. An ideal is called union-prime if it is a union of a nonempty chain of prime ideals but is not a prime. In this paper, some relations between chains of prime ideals and the induced chains of union-prime ideals are shown; in particular, the cardinality of such chains and the cardinality of the sets of cuts of such chains are discussed. For a ring R and a nonempty full chain of prime ideals C of R, several characterizations for the property of immediate neighbors in C are given.Keywords: prime ideals, union-prime ideals, immediate neighbors, Kaplansky's conjecture
Procedia PDF Downloads 13025070 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course
Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu
Abstract:
This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN
Procedia PDF Downloads 4425069 Impact of Climate Change on Flow Regime in Himalayan Basins, Nepal
Authors: Tirtha Raj Adhikari, Lochan Prasad Devkota
Abstract:
This research studied the hydrological regime of three glacierized river basins in Khumbu, Langtang and Annapurna regions of Nepal using the Hydraologiska Byrans Vattenbalansavde (HBV), HVB-light 3.0 model. Future scenario of discharge is also studied using downscaled climate data derived from statistical downscaling method. General Circulation Models (GCMs) successfully simulate future climate variability and climate change on a global scale; however, poor spatial resolution constrains their application for impact studies at a regional or a local level. The dynamically downscaled precipitation and temperature data from Coupled Global Circulation Model 3 (CGCM3) was used for the climate projection, under A2 and A1B SRES scenarios. In addition, the observed historical temperature, precipitation and discharge data were collected from 14 different hydro-metrological locations for the implementation of this study, which include watershed and hydro-meteorological characteristics, trends analysis and water balance computation. The simulated precipitation and temperature were corrected for bias before implementing in the HVB-light 3.0 conceptual rainfall-runoff model to predict the flow regime, in which Groups Algorithms Programming (GAP) optimization approach and then calibration were used to obtain several parameter sets which were finally reproduced as observed stream flow. Except in summer, the analysis showed that the increasing trends in annual as well as seasonal precipitations during the period 2001 - 2060 for both A2 and A1B scenarios over three basins under investigation. In these river basins, the model projected warmer days in every seasons of entire period from 2001 to 2060 for both A1B and A2 scenarios. These warming trends are higher in maximum than in minimum temperatures throughout the year, indicating increasing trend of daily temperature range due to recent global warming phenomenon. Furthermore, there are decreasing trends in summer discharge in Langtang Khola (Langtang region) which is increasing in Modi Khola (Annapurna region) as well as Dudh Koshi (Khumbu region) river basin. The flow regime is more pronounced during later parts of the future decades than during earlier parts in all basins. The annual water surplus of 1419 mm, 177 mm and 49 mm are observed in Annapurna, Langtang and Khumbu region, respectively.Keywords: temperature, precipitation, water discharge, water balance, global warming
Procedia PDF Downloads 34425068 An Overview of Posterior Fossa Associated Pathologies and Segmentation
Authors: Samuel J. Ahmad, Michael Zhu, Andrew J. Kobets
Abstract:
Segmentation tools continue to advance, evolving from manual methods to automated contouring technologies utilizing convolutional neural networks. These techniques have evaluated ventricular and hemorrhagic volumes in the past but may be applied in novel ways to assess posterior fossa-associated pathologies such as Chiari malformations. Herein, we summarize literature pertaining to segmentation in the context of this and other posterior fossa-based diseases such as trigeminal neuralgia, hemifacial spasm, and posterior fossa syndrome. A literature search for volumetric analysis of the posterior fossa identified 27 papers where semi-automated, automated, manual segmentation, linear measurement-based formulas, and the Cavalieri estimator were utilized. These studies produced superior data than older methods utilizing formulas for rough volumetric estimations. The most commonly used segmentation technique was semi-automated segmentation (12 studies). Manual segmentation was the second most common technique (7 studies). Automated segmentation techniques (4 studies) and the Cavalieri estimator (3 studies), a point-counting method that uses a grid of points to estimate the volume of a region, were the next most commonly used techniques. The least commonly utilized segmentation technique was linear measurement-based formulas (1 study). Semi-automated segmentation produced accurate, reproducible results. However, it is apparent that there does not exist a single semi-automated software, open source or otherwise, that has been widely applied to the posterior fossa. Fully-automated segmentation via such open source software as FSL and Freesurfer produced highly accurate posterior fossa segmentations. Various forms of segmentation have been used to assess posterior fossa pathologies and each has its advantages and disadvantages. According to our results, semi-automated segmentation is the predominant method. However, atlas-based automated segmentation is an extremely promising method that produces accurate results. Future evolution of segmentation technologies will undoubtedly yield superior results, which may be applied to posterior fossa related pathologies. Medical professionals will save time and effort analyzing large sets of data due to these advances.Keywords: chiari, posterior fossa, segmentation, volumetric
Procedia PDF Downloads 10625067 Data Access, AI Intensity, and Scale Advantages
Authors: Chuping Lo
Abstract:
This paper presents a simple model demonstrating that ceteris paribus countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.Keywords: digital intensity, digital divide, international trade, scale of economics
Procedia PDF Downloads 6825066 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data
Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju
Abstract:
Nowadays the multimedia data are used to store some secure information. All previous methods allocate a space in image for data embedding purpose after encryption. In this paper, we propose a novel method by reserving space in image with a boundary surrounded before encryption with a traditional RDH algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency by ten times compared to other processes as discussed.Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding
Procedia PDF Downloads 41225065 Identity Verification Using k-NN Classifiers and Autistic Genetic Data
Authors: Fuad M. Alkoot
Abstract:
DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).Keywords: biometrics, genetic data, identity verification, k nearest neighbor
Procedia PDF Downloads 25725064 A Review on Intelligent Systems for Geoscience
Authors: R Palson Kennedy, P.Kiran Sai
Abstract:
This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and geosciences. This article presents a review from the data life cycle perspective to meet that need. Numerous facets of geosciences present unique difficulties for the study of intelligent systems. Geosciences data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science’s essential concepts and theoretical underpinnings, while the second section contains key themes and sharing experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science
Procedia PDF Downloads 13525063 Demographic Characteristics of the Atlas Barbary Sheep in Amassine Nature Reserve, Atlas Range, Morocco: Implications For Conservation and Management
Authors: Hakim Bachiri, Mohammed Znari, Moulay Abdeljalil Ait Baamranne
Abstract:
Population characteristics of Atlas Barbary sheep (Ammotragus lervia lervia) were investigated 20 years following the 1999 introduction of 10 individuals into the fenced nature reserve of Amassine, High Atlas range, Morocco, for promoting wildlife watching and tourism. Population age-sex structure and density were determined in late winter-early spring during four consecutive years (2016-2019) by direct observation before the dispersal of the herd. In this latter case, the line transect distance sampling was successfully applied. Population size increased from 37 to 62 animals during the four-year study period; the maximal population size being 82 individuals recorded in 2006. An estimated population density ranged from 0.25 to 0.41 Barbary sheep/ha during the study period. The adult sex ratio varied from 91 to 67 per 100 females. The apparent birth rate was 14 to 73/100 females. Juveniles and subadults comprised 27-43% of the population, adult males 26-31% and adult females 29-45%. The survival rate from birth to 1 year of age approximated 35%, for adult males was estimated to average 69%/year. The obtained results would be helpful for developing sustainable population management and habitat restoration plan and assessing the feasibility of potential reintroduction/restocking in other areas of the Atlas range.Keywords: atlas mountains, barbary sheep, demography, management
Procedia PDF Downloads 46925062 Duration Patterns of English by Native British Speakers and Mandarin ESL Speakers
Authors: Chen Bingru
Abstract:
This study is intended to describe and analyze the effects of polysyllabic shortening and word or phrase boundary on the duration patterns of spoken utterances by Mandarin learners of English in comparison with native speakers of English. To investigate the relative contribution of these effects, two production experiments were conducted. The study included 11 native British English speakers and 20 Mandarin learners of English who were asked to produce four sets of tokens consisting of a mono-syllabic base form, disyllabic, and trisyllabic words derived from the base by the addition of suffixes, and a set of short sentences with a particular combination of phrase size, stress pattern, and boundary location. The duration of words and segments was measured, and results from the data analysis suggest that the amount of polysyllabic shortening and the effect of word or phrase position are likely to affect a Chinese accent for Mandarin ESL speakers. This study sheds light on research on the duration patterns of language by demonstrating the effect of duration-related factors on the foreign accent of Mandarin ESL speakers. It can also benefit both L2 learners and language teachers by increasing their sensitivity to the duration differences and difficulties experienced by L2 learners of English. An understanding of the amount of polysyllabic shortening and the effect of position in words and phrase on syllable duration can also facilitate L2 teachers to establish priorities for teaching pronunciation to ESL learners.Keywords: duration patterns, Chinese accent, Mandarin ESL speakers, polysyllabic shortening
Procedia PDF Downloads 13925061 Observation and Experience of Using Mechanically Activated Fly Ash in Concrete
Authors: Rudolf Hela, Lenka Bodnarova
Abstract:
Paper focuses on experimental testing of possibilities of mechanical activation of fly ash and observation of influence of specific surface and granulometry on final properties of fresh and hardened concrete. Mechanical grinding prepared various fineness of fly ash, which was classed by specific surface in accordance with Blain and their granulometry was determined by means of laser granulometer. Then, sets of testing specimens were made from mix designs of identical composition with 25% or Portland cement CEM I 42.5 R replaced with fly ash with various specific surface and granulometry. Mix design with only Portland cement was used as reference. Mix designs were tested on consistency of fresh concrete and compressive strength after 7, 28, 60, and 90 days.Keywords: concrete, fly ash, latent hydraulicity, mechanically activated fly ash
Procedia PDF Downloads 21225060 General Network with Four Nodes and Four Activities with Triangular Fuzzy Number as Activity Times
Authors: Rashmi Tamhankar, Madhav Bapat
Abstract:
In many projects, we have to use human judgment for determining the duration of the activities which may vary from person to person. Hence, there is vagueness about the time duration for activities in network planning. Fuzzy sets can handle such vague or imprecise concepts and has an application to such network. The vague activity times can be represented by triangular fuzzy numbers. In this paper, a general network with fuzzy activity times is considered and conditions for the critical path are obtained also we compute total float time of each activity. Several numerical examples are discussed.Keywords: PERT, CPM, triangular fuzzy numbers, fuzzy activity times
Procedia PDF Downloads 47325059 The Influence of Cognitive Load in the Acquisition of Words through Sentence or Essay Writing
Authors: Breno Barrreto Silva, Agnieszka Otwinowska, Katarzyna Kutylowska
Abstract:
Research comparing lexical learning following the writing of sentences and longer texts with keywords is limited and contradictory. One possibility is that the recursivity of writing may enhance processing and increase lexical learning; another possibility is that the higher cognitive load of complex-text writing (e.g., essays), at least when timed, may hinder the learning of words. In our study, we selected 2 sets of 10 academic keywords matched for part of speech, length (number of characters), frequency (SUBTLEXus), and concreteness, and we asked 90 L1-Polish advanced-level English majors to use the keywords when writing sentences, timed (60 minutes) or untimed essays. First, all participants wrote a timed Control essay (60 minutes) without keywords. Then different groups produced Timed essays (60 minutes; n=33), Untimed essays (n=24), or Sentences (n=33) using the two sets of glossed keywords (counterbalanced). The comparability of the participants in the three groups was ensured by matching them for proficiency in English (LexTALE), and for few measures derived from the control essay: VocD (assessing productive lexical diversity), normed errors (assessing productive accuracy), words per minute (assessing productive written fluency), and holistic scores (assessing overall quality of production). We measured lexical learning (depth and breadth) via an adapted Vocabulary Knowledge Scale (VKS) and a free association test. Cognitive load was measured in the three essays (Control, Timed, Untimed) using normed number of errors and holistic scores (TOEFL criteria). The number of errors and essay scores were obtained from two raters (interrater reliability Pearson’s r=.78-91). Generalized linear mixed models showed no difference in the breadth and depth of keyword knowledge after writing Sentences, Timed essays, and Untimed essays. The task-based measurements found that Control and Timed essays had similar holistic scores, but that Untimed essay had better quality than Timed essay. Also, Untimed essay was the most accurate, and Timed essay the most error prone. Concluding, using keywords in Timed, but not Untimed, essays increased cognitive load, leading to more errors and lower quality. Still, writing sentences and essays yielded similar lexical learning, and differences in the cognitive load between Timed and Untimed essays did not affect lexical acquisition.Keywords: learning academic words, writing essays, cognitive load, english as an L2
Procedia PDF Downloads 7325058 Factors Affecting Employee Decision Making in an AI Environment
Authors: Yogesh C. Sharma, A. Seetharaman
Abstract:
The decision-making process in humans is a complicated system influenced by a variety of intrinsic and extrinsic factors. Human decisions have a ripple effect on subsequent decisions. In this study, the scope of human decision making is limited to employees. In an organisation, a person makes a variety of decisions from the time they are hired to the time they retire. The goal of this research is to identify various elements that influence decision-making. In addition, the environment in which a decision is made is a significant aspect of the decision-making process. Employees in today's workplace use artificial intelligence (AI) systems for automation and decision augmentation. The impact of AI systems on the decision-making process is examined in this study. This research is designed based on a systematic literature review. Based on gaps in the literature, limitations and the scope of future research have been identified. Based on these findings, a research framework has been designed to identify various factors affecting employee decision making. Employee decision making is influenced by technological advancement, data-driven culture, human trust, decision automation-augmentation, and workplace motivation. Hybrid human-AI systems require the development of new skill sets and organisational design. Employee psychological safety and supportive leadership influences overall job satisfaction.Keywords: employee decision making, artificial intelligence (AI) environment, human trust, technology innovation, psychological safety
Procedia PDF Downloads 10825057 The Effect of Seated Distance on Muscle Activation and Joint Kinematics during Seated Strengthening in Patients with Stroke with Extensor Synergy Pattern in the Lower Limbs
Authors: Y. H. Chen, P. Y. Chiang, T. Sugiarto, I. Karsuna, Y. J. Lin, C. C. Chang, W. C. Hsu
Abstract:
Task-specific training with intense practice of functional tasks has been emphasized for the approaches in motor rehabilitation in patients with hemiplegic strokes. Although reciprocal actions which may increase demands on motor control during seated stepping exercise, motor control is not explicitly trained with emphasis and instruction focused on traditional strengthening. Apart from cycling and treadmill, various forms of seated exerciser are becoming available for the lower extremity exercise. The benefit of seated exerciser has been focused on the effect on the cardiopulmonary system. Thus, the aim of current study is to investigate the effect of seated distance on muscle activation during seated strengthening in patients with stroke with extensor synergy pattern in the lower extremities. Electrodes were placed on the surface of lower limbs muscles, including rectus femoris (RF), vastus lateralis (VL), biceps femoris (BF) and gastrocnemius (GT) of both sides. Maximal voluntary contraction (MVC) of the muscles were obtained to normalize the EMG amplitude obtained during dynamic trials with analog raw data digitized with a sampling frequency of 2000 Hz, fully rectified and the linear enveloped. Movement cycle was separated into two phases by pushing (PP) and Return (RP). Integral EMG (iEMG) is then used to quantify level of activation during each of the phases. Subjects performed strengthening with moderate resistance with speed of 60 rpm in two different distances (D1, short) and (D2, long). The results showed greater iEMG in RF and smaller iEMG in VL and BF with obvious increase range of motion of hip flexion in D1 condition. On the contrary, no significant involvement of RF while greater level of muscular activation in VL and BF during RP was found during PP in D2 condition. In addition, greater hip internal rotation was observed in D2 condition. In patients with stroke with abnormal tone revealed by extensor synergy in the lower extremities, shorter seated distance is suggested to facilitate hip flexor muscle activation while avoid inducing hyper extensor tone which may prevent a smooth repetitive motion. Repetitive muscular contraction exercise of hip flexor may be helpful for further gait training as it may assist hip flexion during swing phase of the walking.Keywords: seated strengthening, patients with stroke, electromyography, synergy pattern
Procedia PDF Downloads 21325056 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh
Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila
Abstract:
Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by allowing better insights into business as an effect of better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring, and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This will help in identifying and addressing data quality problems in quick time, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.Keywords: data culture, data-driven organization, data mesh, data quality for business success
Procedia PDF Downloads 13525055 Evaluation of the Power Generation Effect Obtained by Inserting a Piezoelectric Sheet in the Backlash Clearance of a Circular Arc Helical Gear
Authors: Barenten Suciu, Yuya Nakamoto
Abstract:
Power generation effect, obtained by inserting a piezo- electric sheet in the backlash clearance of a circular arc helical gear, is evaluated. Such type of screw gear is preferred since, in comparison with the involute tooth profile, the circular arc profile leads to reduced stress-concentration effects, and improved life of the piezoelectric film. Firstly, geometry of the circular arc helical gear, and properties of the piezoelectric sheet are presented. Then, description of the test-rig, consisted of a right-hand thread gear meshing with a left-hand thread gear, and the voltage measurement procedure are given. After creating the tridimensional (3D) model of the meshing gears in SolidWorks, they are 3D-printed in acrylonitrile butadiene styrene (ABS) resin. Variation of the generated voltage versus time, during a meshing cycle of the circular arc helical gear, is measured for various values of the center distance. Then, the change of the maximal, minimal, and peak-to-peak voltage versus the center distance is illustrated. Optimal center distance of the gear, to achieve voltage maximization, is found and its significance is discussed. Such results prove that the contact pressure of the meshing gears can be measured, and also, the electrical power can be generated by employing the proposed technique.Keywords: circular arc helical gear, contact problem, optimal center distance, piezoelectric sheet, power generation
Procedia PDF Downloads 16725054 Improving Taint Analysis of Android Applications Using Finite State Machines
Authors: Assad Maalouf, Lunjin Lu, James Lynott
Abstract:
We present a taint analysis that can automatically detect when string operations result in a string that is free of taints, where all the tainted patterns have been removed. This is an improvement on the conservative behavior of previous taint analyzers, where a string operation on a tainted string always leads to a tainted string unless the operation is manually marked as a sanitizer. The taint analysis is built on top of a string analysis that uses finite state automata to approximate the sets of values that string variables can take during the execution of a program. The proposed approach has been implemented as an extension of FlowDroid and experimental results show that the resulting taint analyzer is much more precise than the original FlowDroid.Keywords: android, static analysis, string analysis, taint analysis
Procedia PDF Downloads 18025053 Big Data Analysis with RHadoop
Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim
Abstract:
It is almost impossible to store or analyze big data increasing exponentially with traditional technologies. Hadoop is a new technology to make that possible. R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop that integrates R and Hadoop environment, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed our RHadoop system was much faster as the number of data nodes increases. We also compared the performance of our RHadoop with lm function and big lm packages available on big memory. The results showed that our RHadoop was faster than other packages owing to paralleling processing with increasing the number of map tasks as the size of data increases.Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop
Procedia PDF Downloads 43725052 A Mutually Exclusive Task Generation Method Based on Data Augmentation
Authors: Haojie Wang, Xun Li, Rui Yin
Abstract:
In order to solve the memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method, that only extracts part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.Keywords: data augmentation, mutex task generation, meta-learning, text classification.
Procedia PDF Downloads 9325051 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network
Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan
Abstract:
Data aggregation is a helpful technique for reducing the data communication overhead in wireless sensor network. One of the important tasks of data aggregation is positioning of the aggregator points. There are a lot of works done on data aggregation. But, efficient positioning of the aggregators points is not focused so much. In this paper, authors are focusing on the positioning or the placement of the aggregation points in wireless sensor network. Authors proposed an algorithm to select the aggregators positions for a scenario where aggregator nodes are more powerful than sensor nodes.Keywords: aggregation point, data communication, data aggregation, wireless sensor network
Procedia PDF Downloads 15725050 Spatial Econometric Approaches for Count Data: An Overview and New Directions
Authors: Paula Simões, Isabel Natário
Abstract:
This paper reviews a number of theoretical aspects for implementing an explicit spatial perspective in econometrics for modelling non-continuous data, in general, and count data, in particular. It provides an overview of the several spatial econometric approaches that are available to model data that are collected with reference to location in space, from the classical spatial econometrics approaches to the recent developments on spatial econometrics to model count data, in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework, necessary for structural consistent spatial econometric count models, incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures for different assumptions, to the constrains and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for new possible directions on the processing of count data, in a spatial hierarchical Bayesian econometric context.Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data
Procedia PDF Downloads 59325049 A NoSQL Based Approach for Real-Time Managing of Robotics's Data
Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir
Abstract:
This paper deals with the secret of the continual progression data that new data management solutions have been emerged: The NoSQL databases. They crossed several areas like personalization, profile management, big data in real-time, content management, catalog, view of customers, mobile applications, internet of things, digital communication and fraud detection. Nowadays, these database management systems are increasing. These systems store data very well and with the trend of big data, a new challenge’s store demands new structures and methods for managing enterprise data. The new intelligent machine in the e-learning sector, thrives on more data, so smart machines can learn more and faster. The robotics are our use case to focus on our test. The implementation of NoSQL for Robotics wrestle all the data they acquire into usable form because with the ordinary type of robotics; we are facing very big limits to manage and find the exact information in real-time. Our original proposed approach was demonstrated by experimental studies and running example used as a use case.Keywords: NoSQL databases, database management systems, robotics, big data
Procedia PDF Downloads 35425048 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis
Authors: C. B. Le, V. N. Pham
Abstract:
In modern data analysis, multi-source data appears more and more in real applications. Multi-source data clustering has emerged as a important issue in the data mining and machine learning community. Different data sources provide information about different data. Therefore, multi-source data linking is essential to improve clustering performance. However, in practice multi-source data is often heterogeneous, uncertain, and large. This issue is considered a major challenge from multi-source data. Ensemble is a versatile machine learning model in which learning techniques can work in parallel, with big data. Clustering ensemble has been shown to outperform any standard clustering algorithm in terms of accuracy and robustness. However, most of the traditional clustering ensemble approaches are based on single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis. The fuzzy optimized multi-objective clustering ensemble method is called FOMOCE. Firstly, a clustering ensemble mathematical model based on the structure of multi-objective clustering function, multi-source data, and dark knowledge is introduced. Then, rules for extracting dark knowledge from the input data, clustering algorithms, and base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. The experiments were performed on the standard sample data set. The experimental results demonstrate the superior performance of the FOMOCE method compared to the existing clustering ensemble methods and multi-source clustering methods.Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering
Procedia PDF Downloads 18925047 Experimental and CFD Simulation of the Jet Pump for Air Bubbles Formation
Authors: L. Grinis, N. Lubashevsky, Y. Ostrovski
Abstract:
A jet pump is a type of pump that accelerates the flow of a secondary fluid (driven fluid) by introducing a motive fluid with high velocity into a converging-diverging nozzle. Jet pumps are also known as adductors or ejectors depending on the motivator phase. The ejector's motivator is of a gaseous nature, usually steam or air, while the educator's motivator is a liquid, usually water. Jet pumps are devices that use air bubbles and are widely used in wastewater treatment processes. In this work, we will discuss about the characteristics of the jet pump and the computational simulation of this device. To find the optimal angle and depth for the air pipe, so as to achieve the maximal air volumetric flow rate, an experimental apparatus was constructed to ascertain the best geometrical configuration for this new type of jet pump. By using 3D printing technology, a series of jet pumps was printed and tested whilst aspiring to maximize air flow rate dependent on angle and depth of the air pipe insertion. The experimental results show a major difference of up to 300% in performance between the different pumps (ratio of air flow rate to supplied power) where the optimal geometric model has an insertion angle of 600 and air pipe insertion depth ending at the center of the mixing chamber. The differences between the pumps were further explained by using CFD for better understanding the reasons that affect the airflow rate. The validity of the computational simulation and the corresponding assumptions have been proved experimentally. The present research showed high degree of congruence with the results of the laboratory tests. This study demonstrates the potential of using of the jet pump in many practical applications.Keywords: air bubbles, CFD simulation, jet pump, applications
Procedia PDF Downloads 24325046 Modeling Activity Pattern Using XGBoost for Mining Smart Card Data
Authors: Eui-Jin Kim, Hasik Lee, Su-Jin Park, Dong-Kyu Kim
Abstract:
Smart-card data are expected to provide information on activity pattern as an alternative to conventional person trip surveys. The focus of this study is to propose a method for training the person trip surveys to supplement the smart-card data that does not contain the purpose of each trip. We selected only available features from smart card data such as spatiotemporal information on the trip and geographic information system (GIS) data near the stations to train the survey data. XGboost, which is state-of-the-art tree-based ensemble classifier, was used to train data from multiple sources. This classifier uses a more regularized model formalization to control the over-fitting and show very fast execution time with well-performance. The validation results showed that proposed method efficiently estimated the trip purpose. GIS data of station and duration of stay at the destination were significant features in modeling trip purpose.Keywords: activity pattern, data fusion, smart-card, XGboost
Procedia PDF Downloads 24625045 Improving Carbon Dioxide Mass Transfer in Open Pond Raceway Systems for Improved Algal Productivity
Authors: William Middleton, Nodumo Zulu, Sue Harrison
Abstract:
Open raceway ponds are currently the most used system for the commercial cultivation of algal biomass, as it is a cost-effective means of production. However, raceway ponds suffer from lower algal productivity when compared to closed photobioreactors. This is due to poor gas exchange between the fluid and the atmosphere. Carbon dioxide (CO₂) mass transfer is a large concern in the production of algae in raceway pond systems. The utilization of atmospheric CO₂ does not support maximal growth; however, CO₂ supplementation in the form of flue gas or concentrated CO₂ is not cost-effective. The introduction of slopes into the raceway system presents a possible improvement to the mass transfer from the air, as seen in previous work conducted at CeBER. Slopes improve turbulence (decreasing the concentration gradient of dissolved CO₂) and can cause air entrainment (allowing for greater surface area and contact time between the air and water). This project tests the findings of previous studies conducted in an indoor lab-scale raceway on a larger scale under outdoor conditions. The addition of slopes resulted in slightly increased CO₂ mass transfer as well as algal growth rate and productivity. However, there were reductions in energy consumption and average fluid velocity in the system. These results indicate a potential to improve the economic feasibility of algal biomass production, but further economic assessment would need to be carried out.Keywords: algae, raceway ponds, mass transfer, algal culture, biotechnology, reactor design
Procedia PDF Downloads 99