Search results for: incomplete count data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25165

Search results for: incomplete count data

24415 Extreme Temperature Forecast in Mbonge, Cameroon Through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution

Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph

Abstract:

In this paper, temperature extremes are forecast by employing the block maxima method of the generalized extreme value (GEV) distribution to analyse temperature data from the Cameroon Development Corporation (CDC). By considering two sets of data (raw data and simulated data) and two (stationary and non-stationary) models of the GEV distribution, return levels analysis is carried out and it was found that in the stationary model, the return values are constant over time with the raw data, while in the simulated data the return values show an increasing trend with an upper bound. In the non-stationary model, the return levels of both the raw data and simulated data show an increasing trend with an upper bound. This clearly shows that although temperatures in the tropics show a sign of increase in the future, there is a maximum temperature at which there is no exceedance. The results of this paper are very vital in agricultural and environmental research.

Keywords: forecasting, generalized extreme value (GEV), meteorology, return level

Procedia PDF Downloads 462
24414 Impact of Stack Caches: Locality Awareness and Cost Effectiveness

Authors: Abdulrahman K. Alshegaifi, Chun-Hsi Huang

Abstract:

Treating data based on its location in memory has received much attention in recent years due to its different properties, which offer important aspects for cache utilization. Stack data and non-stack data may interfere with each other’s locality in the data cache. One of the important aspects of stack data is that it has high spatial and temporal locality. In this work, we simulate non-unified cache design that split data cache into stack and non-stack caches in order to maintain stack data and non-stack data separate in different caches. We observe that the overall hit rate of non-unified cache design is sensitive to the size of non-stack cache. Then, we investigate the appropriate size and associativity for stack cache to achieve high hit ratio especially when over 99% of accesses are directed to stack cache. The result shows that on average more than 99% of stack cache accuracy is achieved by using 2KB of capacity and 1-way associativity. Further, we analyze the improvement in hit rate when adding small, fixed, size of stack cache at level1 to unified cache architecture. The result shows that the overall hit rate of unified cache design with adding 1KB of stack cache is improved by approximately, on average, 3.9% for Rijndael benchmark. The stack cache is simulated by using SimpleScalar toolset.

Keywords: hit rate, locality of program, stack cache, stack data

Procedia PDF Downloads 293
24413 Autonomic Threat Avoidance and Self-Healing in Database Management System

Authors: Wajahat Munir, Muhammad Haseeb, Adeel Anjum, Basit Raza, Ahmad Kamran Malik

Abstract:

Databases are the key components of the software systems. Due to the exponential growth of data, it is the concern that the data should be accurate and available. The data in databases is vulnerable to internal and external threats, especially when it contains sensitive data like medical or military applications. Whenever the data is changed by malicious intent, data analysis result may lead to disastrous decisions. Autonomic self-healing is molded toward computer system after inspiring from the autonomic system of human body. In order to guarantee the accuracy and availability of data, we propose a technique which on a priority basis, tries to avoid any malicious transaction from execution and in case a malicious transaction affects the system, it heals the system in an isolated mode in such a way that the availability of system would not be compromised. Using this autonomic system, the management cost and time of DBAs can be minimized. In the end, we test our model and present the findings.

Keywords: autonomic computing, self-healing, threat avoidance, security

Procedia PDF Downloads 495
24412 Information Extraction Based on Search Engine Results

Authors: Mohammed R. Elkobaisi, Abdelsalam Maatuk

Abstract:

The search engines are the large scale information retrieval tools from the Web that are currently freely available to all. This paper explains how to convert the raw resulted number of search engines into useful information. This represents a new method for data gathering comparing with traditional methods. When a query is submitted for a multiple numbers of keywords, this take a long time and effort, hence we develop a user interface program to automatic search by taking multi-keywords at the same time and leave this program to collect wanted data automatically. The collected raw data is processed using mathematical and statistical theories to eliminate unwanted data and converting it to usable data.

Keywords: search engines, information extraction, agent system

Procedia PDF Downloads 418
24411 Implementation and Performance Analysis of Data Encryption Standard and RSA Algorithm with Image Steganography and Audio Steganography

Authors: S. C. Sharma, Ankit Gambhir, Rajeev Arya

Abstract:

In today’s era data security is an important concern and most demanding issues because it is essential for people using online banking, e-shopping, reservations etc. The two major techniques that are used for secure communication are Cryptography and Steganography. Cryptographic algorithms scramble the data so that intruder will not able to retrieve it; however steganography covers that data in some cover file so that presence of communication is hidden. This paper presents the implementation of Ron Rivest, Adi Shamir, and Leonard Adleman (RSA) Algorithm with Image and Audio Steganography and Data Encryption Standard (DES) Algorithm with Image and Audio Steganography. The coding for both the algorithms have been done using MATLAB and its observed that these techniques performed better than individual techniques. The risk of unauthorized access is alleviated up to a certain extent by using these techniques. These techniques could be used in Banks, RAW agencies etc, where highly confidential data is transferred. Finally, the comparisons of such two techniques are also given in tabular forms.

Keywords: audio steganography, data security, DES, image steganography, intruder, RSA, steganography

Procedia PDF Downloads 278
24410 Data Monetisation by E-commerce Companies: A Need for a Regulatory Framework in India

Authors: Anushtha Saxena

Abstract:

This paper examines the process of data monetisation bye-commerce companies operating in India. Data monetisation is collecting, storing, and analysing consumers’ data to use further the data that is generated for profits, revenue, etc. Data monetisation enables e-commerce companies to get better businesses opportunities, innovative products and services, a competitive edge over others to the consumers, and generate millions of revenues. This paper analyses the issues and challenges that are faced due to the process of data monetisation. Some of the issues highlighted in the paper pertain to the right to privacy, protection of data of e-commerce consumers. At the same time, data monetisation cannot be prohibited, but it can be regulated and monitored by stringent laws and regulations. The right to privacy isa fundamental right guaranteed to the citizens of India through Article 21 of The Constitution of India. The Supreme Court of India recognized the Right to Privacy as a fundamental right in the landmark judgment of Justice K.S. Puttaswamy (Retd) and Another v. Union of India . This paper highlights the legal issue of how e-commerce businesses violate individuals’ right to privacy by using the data collected, stored by them for economic gains and monetisation and protection of data. The researcher has mainly focused on e-commerce companies like online shopping websitesto analyse the legal issue of data monetisation. In the Internet of Things and the digital age, people have shifted to online shopping as it is convenient, easy, flexible, comfortable, time-consuming, etc. But at the same time, the e-commerce companies store the data of their consumers and use it by selling to the third party or generating more data from the data stored with them. This violatesindividuals’ right to privacy because the consumers do not know anything while giving their data online. Many times, data is collected without the consent of individuals also. Data can be structured, unstructured, etc., that is used by analytics to monetise. The Indian legislation like The Information Technology Act, 2000, etc., does not effectively protect the e-consumers concerning their data and how it is used by e-commerce businesses to monetise and generate revenues from that data. The paper also examines the draft Data Protection Bill, 2021, pending in the Parliament of India, and how this Bill can make a huge impact on data monetisation. This paper also aims to study the European Union General Data Protection Regulation and how this legislation can be helpful in the Indian scenarioconcerning e-commerce businesses with respect to data monetisation.

Keywords: data monetization, e-commerce companies, regulatory framework, GDPR

Procedia PDF Downloads 105
24409 Intensive Use of Software in Teaching and Learning Calculus

Authors: Nodelman V.

Abstract:

Despite serious difficulties in the assimilation of the conceptual system of Calculus, software in the educational process is used only occasionally, and even then, mainly for illustration purposes. The following are a few reasons: The non-trivial nature of the studied material, Lack of skills in working with software, Fear of losing time working with software, The variety of the software itself, the corresponding interface, syntax, and the methods of working with the software, The need to find suitable models, and familiarize yourself with working with them, Incomplete compatibility of the found models with the content and teaching methods of the studied material. This paper proposes an active use of the developed non-commercial software VusuMatica, which allows removing these restrictions through Broad support for the studied mathematical material (and not only Calculus). As a result - no need to select the right software, Emphasizing the unity of mathematics, its intrasubject and interdisciplinary relations, User-friendly interface, Absence of special syntax in defining mathematical objects, Ease of building models of the studied material and manipulating them, Unlimited flexibility of models thanks to the ability to redefine objects, which allows exploring objects characteristics, and considering examples and counterexamples of the concepts under study. The construction of models is based on an original approach to the analysis of the structure of the studied concepts. Thanks to the ease of construction, students are able not only to use ready-made models but also to create them on their own and explore the material studied with their help. The presentation includes examples of using VusuMatica in studying the concepts of limit and continuity of a function, its derivative, and integral.

Keywords: counterexamples, limitations and requirements, software, teaching and learning calculus, user-friendly interface and syntax

Procedia PDF Downloads 73
24408 Experiments on Weakly-Supervised Learning on Imperfect Data

Authors: Yan Cheng, Yijun Shao, James Rudolph, Charlene R. Weir, Beth Sahlmann, Qing Zeng-Treitler

Abstract:

Supervised predictive models require labeled data for training purposes. Complete and accurate labeled data, i.e., a ‘gold standard’, is not always available, and imperfectly labeled data may need to serve as an alternative. An important question is if the accuracy of the labeled data creates a performance ceiling for the trained model. In this study, we trained several models to recognize the presence of delirium in clinical documents using data with annotations that are not completely accurate (i.e., weakly-supervised learning). In the external evaluation, the support vector machine model with a linear kernel performed best, achieving an area under the curve of 89.3% and accuracy of 88%, surpassing the 80% accuracy of the training sample. We then generated a set of simulated data and carried out a series of experiments which demonstrated that models trained on imperfect data can (but do not always) outperform the accuracy of the training data, e.g., the area under the curve for some models is higher than 80% when trained on the data with an error rate of 40%. Our experiments also showed that the error resistance of linear modeling is associated with larger sample size, error type, and linearity of the data (all p-values < 0.001). In conclusion, this study sheds light on the usefulness of imperfect data in clinical research via weakly-supervised learning.

Keywords: weakly-supervised learning, support vector machine, prediction, delirium, simulation

Procedia PDF Downloads 183
24407 Operating Speed Models on Tangent Sections of Two-Lane Rural Roads

Authors: Dražen Cvitanić, Biljana Maljković

Abstract:

This paper presents models for predicting operating speeds on tangent sections of two-lane rural roads developed on continuous speed data. The data corresponds to 20 drivers of different ages and driving experiences, driving their own cars along an 18 km long section of a state road. The data were first used for determination of maximum operating speeds on tangents and their comparison with speeds in the middle of tangents i.e. speed data used in most of operating speed studies. Analysis of continuous speed data indicated that the spot speed data are not reliable indicators of relevant speeds. After that, operating speed models for tangent sections were developed. There was no significant difference between models developed using speed data in the middle of tangent sections and models developed using maximum operating speeds on tangent sections. All developed models have higher coefficient of determination then models developed on spot speed data. Thus, it can be concluded that the method of measuring has more significant impact on the quality of operating speed model than the location of measurement.

Keywords: operating speed, continuous speed data, tangent sections, spot speed, consistency

Procedia PDF Downloads 447
24406 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

By the evolvement in technology, the way of expressing opinions switched the direction to the digital world. The domain of politics as one of the hottest topics of opinion mining research merged together with the behavior analysis for affiliation determination in text which constitutes the subject of this paper. This study aims to classify the text in news/blogs either as Republican or Democrat with the minimum number of features. As an initial set, 68 features which 64 are constituted by Linguistic Inquiry and Word Count (LIWC) features are tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector reduced based on the 7 feature selection algorithms. The results show that Decision Tree, Rule Induction and M5 Rule classifiers when used with SVM and IGR feature selection algorithms performed the best up to 82.5% accuracy on a given dataset. Further tests on a single feature and the linguistic based feature sets showed the similar results. The feature “function” as an aggregate feature of the linguistic category, is obtained as the most differentiating feature among the 68 features with 81% accuracy by itself in classifying articles either as Republican or Democrat.

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 374
24405 The Effect That the Data Assimilation of Qinghai-Tibet Plateau Has on a Precipitation Forecast

Authors: Ruixia Liu

Abstract:

Qinghai-Tibet Plateau has an important influence on the precipitation of its lower reaches. Data from remote sensing has itself advantage and numerical prediction model which assimilates RS data will be better than other. We got the assimilation data of MHS and terrestrial and sounding from GSI, and introduced the result into WRF, then got the result of RH and precipitation forecast. We found that assimilating MHS and terrestrial and sounding made the forecast on precipitation, area and the center of the precipitation more accurate by comparing the result of 1h,6h,12h, and 24h. Analyzing the difference of the initial field, we knew that the data assimilating about Qinghai-Tibet Plateau influence its lower reaches forecast by affecting on initial temperature and RH.

Keywords: Qinghai-Tibet Plateau, precipitation, data assimilation, GSI

Procedia PDF Downloads 224
24404 Positive Affect, Negative Affect, Organizational and Motivational Factor on the Acceptance of Big Data Technologies

Authors: Sook Ching Yee, Angela Siew Hoong Lee

Abstract:

Big data technologies have become a trend to exploit business opportunities and provide valuable business insights through the analysis of big data. However, there are still many organizations that have yet to adopt big data technologies especially small and medium organizations (SME). This study uses the technology acceptance model (TAM) to look into several constructs in the TAM and other additional constructs which are positive affect, negative affect, organizational factor and motivational factor. The conceptual model proposed in the study will be tested on the relationship and influence of positive affect, negative affect, organizational factor and motivational factor towards the intention to use big data technologies to produce an outcome. Empirical research is used in this study by conducting a survey to collect data.

Keywords: big data technologies, motivational factor, negative affect, organizational factor, positive affect, technology acceptance model (TAM)

Procedia PDF Downloads 347
24403 Risk Assessments of Longest Dry Spells Phenomenon in Northern Tunisia

Authors: Majid Mathlouthi, Fethi Lebdi

Abstract:

Throughout the world, the extent and magnitude of droughts have economic, social and environmental consequences. Today climate change has become more and more felt; most likely they increase the frequency and duration of droughts. An analysis by event of dry event, from series of observations of the daily rainfall is carried out. A daily precipitation threshold value has been set. A catchment localized in Northern Tunisia where the average rainfall is about 600 mm has been studied. Rainfall events are defined as an uninterrupted series of rainfall days understanding at least a day having received a precipitation superior or equal to a fixed threshold. The dry events are constituted of a series of dry days framed by two successive rainfall events. A rainfall event is a vector of coordinates the duration, the rainfall depth per event and the duration of the dry event. The depth and duration are found to be correlated. So we use conditional probabilities to analyse the depth per event. The negative binomial distribution fits well the dry event. The duration of the rainfall event follows a geometric distribution. The length of the climatically cycle adjusts to the Incomplete Gamma. Results of this analysis was used to study of the effects of climate change on water resources and crops and to calibrate precipitation models with little rainfall records. In response to long droughts in the basin, the drought management system is based on three phases during each of the three phases; different measurements are applied and executed. The first is before drought, preparedness and early warning; the second is drought management, mitigation in the event of drought; and the last subsequent drought, when the drought is over.

Keywords: dry spell, precipitation threshold, climate vulnerability, adaptation measures

Procedia PDF Downloads 78
24402 Big Data Analysis with Rhipe

Authors: Byung Ho Jung, Ji Eun Shin, Dong Hoon Lim

Abstract:

Rhipe that integrates R and Hadoop environment made it possible to process and analyze massive amounts of data using a distributed processing environment. In this paper, we implemented multiple regression analysis using Rhipe with various data sizes of actual data. Experimental results for comparing the performance of our Rhipe with stats and biglm packages available on bigmemory, showed that our Rhipe was more fast than other packages owing to paralleling processing with increasing the number of map tasks as the size of data increases. We also compared the computing speeds of pseudo-distributed and fully-distributed modes for configuring Hadoop cluster. The results showed that fully-distributed mode was faster than pseudo-distributed mode, and computing speeds of fully-distributed mode were faster as the number of data nodes increases.

Keywords: big data, Hadoop, Parallel regression analysis, R, Rhipe

Procedia PDF Downloads 490
24401 Security in Resource Constraints Network Light Weight Encryption for Z-MAC

Authors: Mona Almansoori, Ahmed Mustafa, Ahmad Elshamy

Abstract:

Wireless sensor network was formed by a combination of nodes, systematically it transmitting the data to their base stations, this transmission data can be easily compromised if the limited processing power and the data consistency from these nodes are kept in mind; there is always a discussion to address the secure data transfer or transmission in actual time. This will present a mechanism to securely transmit the data over a chain of sensor nodes without compromising the throughput of the network by utilizing available battery resources available in the sensor node. Our methodology takes many different advantages of Z-MAC protocol for its efficiency, and it provides a unique key by sharing the mechanism using neighbor node MAC address. We present a light weighted data integrity layer which is embedded in the Z-MAC protocol to prove that our protocol performs well than Z-MAC when we introduce the different attack scenarios.

Keywords: hybrid MAC protocol, data integrity, lightweight encryption, neighbor based key sharing, sensor node dataprocessing, Z-MAC

Procedia PDF Downloads 133
24400 A Study of Blockchain Oracles

Authors: Abdeljalil Beniiche

Abstract:

The limitation with smart contracts is that they cannot access external data that might be required to control the execution of business logic. Oracles can be used to provide external data to smart contracts. An oracle is an interface that delivers data from external data outside the blockchain to a smart contract to consume. Oracle can deliver different types of data depending on the industry and requirements. In this paper, we study and describe the widely used blockchain oracles. Then, we elaborate on his potential role, technical architecture, and design patterns. Finally, we discuss the human oracle and its key role in solving the truth problem by reaching a consensus about a certain inquiry and tasks.

Keywords: blockchain, oracles, oracles design, human oracles

Procedia PDF Downloads 113
24399 Algorithm for Improved Tree Counting and Detection through Adaptive Machine Learning Approach with the Integration of Watershed Transformation and Local Maxima Analysis

Authors: Jigg Pelayo, Ricardo Villar

Abstract:

The Philippines is long considered as a valuable producer of high value crops globally. The country’s employment and economy have been dependent on agriculture, thus increasing its demand for the efficient agricultural mechanism. Remote sensing and geographic information technology have proven to effectively provide applications for precision agriculture through image-processing technique considering the development of the aerial scanning technology in the country. Accurate information concerning the spatial correlation within the field is very important for precision farming of high value crops, especially. The availability of height information and high spatial resolution images obtained from aerial scanning together with the development of new image analysis methods are offering relevant influence to precision agriculture techniques and applications. In this study, an algorithm was developed and implemented to detect and count high value crops simultaneously through adaptive scaling of support vector machine (SVM) algorithm subjected to object-oriented approach combining watershed transformation and local maxima filter in enhancing tree counting and detection. The methodology is compared to cutting-edge template matching algorithm procedures to demonstrate its effectiveness on a demanding tree is counting recognition and delineation problem. Since common data and image processing techniques are utilized, thus can be easily implemented in production processes to cover large agricultural areas. The algorithm is tested on high value crops like Palm, Mango and Coconut located in Misamis Oriental, Philippines - showing a good performance in particular for young adult and adult trees, significantly 90% above. The s inventories or database updating, allowing for the reduction of field work and manual interpretation tasks.

Keywords: high value crop, LiDAR, OBIA, precision agriculture

Procedia PDF Downloads 391
24398 Finding Bicluster on Gene Expression Data of Lymphoma Based on Singular Value Decomposition and Hierarchical Clustering

Authors: Alhadi Bustaman, Soeganda Formalidin, Titin Siswantining

Abstract:

DNA microarray technology is used to analyze thousand gene expression data simultaneously and a very important task for drug development and test, function annotation, and cancer diagnosis. Various clustering methods have been used for analyzing gene expression data. However, when analyzing very large and heterogeneous collections of gene expression data, conventional clustering methods often cannot produce a satisfactory solution. Biclustering algorithm has been used as an alternative approach to identifying structures from gene expression data. In this paper, we introduce a transform technique based on singular value decomposition to identify normalized matrix of gene expression data followed by Mixed-Clustering algorithm and the Lift algorithm, inspired in the node-deletion and node-addition phases proposed by Cheng and Church based on Agglomerative Hierarchical Clustering (AHC). Experimental study on standard datasets demonstrated the effectiveness of the algorithm in gene expression data.

Keywords: agglomerative hierarchical clustering (AHC), biclustering, gene expression data, lymphoma, singular value decomposition (SVD)

Procedia PDF Downloads 267
24397 An Efficient Traceability Mechanism in the Audited Cloud Data Storage

Authors: Ramya P, Lino Abraham Varghese, S. Bose

Abstract:

By cloud storage services, the data can be stored in the cloud, and can be shared across multiple users. Due to the unexpected hardware/software failures and human errors, which make the data stored in the cloud be lost or corrupted easily it affected the integrity of data in cloud. Some mechanisms have been designed to allow both data owners and public verifiers to efficiently audit cloud data integrity without retrieving the entire data from the cloud server. But public auditing on the integrity of shared data with the existing mechanisms will unavoidably reveal confidential information such as identity of the person, to public verifiers. Here a privacy-preserving mechanism is proposed to support public auditing on shared data stored in the cloud. It uses group signatures to compute verification metadata needed to audit the correctness of shared data. The identity of the signer on each block in shared data is kept confidential from public verifiers, who are easily verifying shared data integrity without retrieving the entire file. But on demand, the signer of the each block is reveal to the owner alone. Group private key is generated once by the owner in the static group, where as in the dynamic group, the group private key is change when the users revoke from the group. When the users leave from the group the already signed blocks are resigned by cloud service provider instead of owner is efficiently handled by efficient proxy re-signature scheme.

Keywords: data integrity, dynamic group, group signature, public auditing

Procedia PDF Downloads 379
24396 Rodriguez Diego, Del Valle Martin, Hargreaves Matias, Riveros Jose Luis

Authors: Nathainail Bashir, Neil Anderson

Abstract:

The objective of this study site was to investigate the current state of the practice with regards to karst detection methods and recommend the best method and pattern of arrays to acquire the desire results. Proper site investigation in karst prone regions is extremely valuable in determining the location of possible voids. Two geophysical techniques were employed: multichannel analysis of surface waves (MASW) and electric resistivity tomography (ERT).The MASW data was acquired at each test location using different array lengths and different array orientations (to increase the probability of getting interpretable data in karst terrain). The ERT data were acquired using a dipole-dipole array consisting of 168 electrodes. The MASW data was interpreted (re: estimated depth to physical top of rock) and used to constrain and verify the interpretation of the ERT data. The ERT data indicates poorer quality MASW data were acquired in areas where there was significant local variation in the depth to top of rock.

Keywords: dipole-dipole, ERT, Karst terrains, MASW

Procedia PDF Downloads 304
24395 Data Science in Military Decision-Making: A Semi-Systematic Literature Review

Authors: H. W. Meerveld, R. H. A. Lindelauf

Abstract:

In contemporary warfare, data science is crucial for the military in achieving information superiority. Yet, to the authors’ knowledge, no extensive literature survey on data science in military decision-making has been conducted so far. In this study, 156 peer-reviewed articles were analysed through an integrative, semi-systematic literature review to gain an overview of the topic. The study examined to what extent literature is focussed on the opportunities or risks of data science in military decision-making, differentiated per level of war (i.e. strategic, operational, and tactical level). A relatively large focus on the risks of data science was observed in social science literature, implying that political and military policymakers are disproportionally influenced by a pessimistic view on the application of data science in the military domain. The perceived risks of data science are, however, hardly addressed in formal science literature. This means that the concerns on the military application of data science are not addressed to the audience that can actually develop and enhance data science models and algorithms. Cross-disciplinary research on both the opportunities and risks of military data science can address the observed research gaps. Considering the levels of war, relatively low attention for the operational level compared to the other two levels was observed, suggesting a research gap with reference to military operational data science. Opportunities for military data science mostly arise at the tactical level. On the contrary, studies examining strategic issues mostly emphasise the risks of military data science. Consequently, domain-specific requirements for military strategic data science applications are hardly expressed. Lacking such applications may ultimately lead to a suboptimal strategic decision in today’s warfare.

Keywords: data science, decision-making, information superiority, literature review, military

Procedia PDF Downloads 148
24394 Gender Differences in the Prediction of Smartphone Use While Driving: Personal and Social Factors

Authors: Erez Kita, Gil Luria

Abstract:

This study examines gender as a boundary condition for the relationship between the psychological variable of mindfulness and the social variable of income with regards to the use of smartphones by young drivers. The use of smartphones while driving increases the likelihood of a car accident, endangering young drivers and other road users. The study sample included 186 young drivers who were legally permitted to drive without supervision. The subjects were first asked to complete questionnaires on mindfulness and income. Next, their smartphone use while driving was monitored over a one-month period. This study is unique as it used an objective smartphone monitoring application (rather than self-reporting) to count the number of times the young participants actually touched their smartphones while driving. The findings show that gender moderates the effects of social and personal factors (i.e., income and mindfulness) on the use of smartphones while driving. The pattern of moderation was similar for both social and personal factors. For men, mindfulness and income are negatively associated with the use of smartphones while driving. These factors are not related to the use of smartphones by women drivers. Mindfulness and income can be used to identify male populations that are at risk of using smartphones while driving. Interventions that improve mindfulness can be used to reduce the use of smartphones by male drivers.

Keywords: mindfulness, using smartphones while driving, income, gender, young drivers

Procedia PDF Downloads 161
24393 Legal Regulation of Personal Information Data Transmission Risk Assessment: A Case Study of the EU’s DPIA

Authors: Cai Qianyi

Abstract:

In the midst of global digital revolution, the flow of data poses security threats that call China's existing legislative framework for protecting personal information into question. As a preliminary procedure for risk analysis and prevention, the risk assessment of personal data transmission lacks detailed guidelines for support. Existing provisions reveal unclear responsibilities for network operators and weakened rights for data subjects. Furthermore, the regulatory system's weak operability and a lack of industry self-regulation heighten data transmission hazards. This paper aims to compare the regulatory pathways for data information transmission risks between China and Europe from a legal framework and content perspective. It draws on the “Data Protection Impact Assessment Guidelines” to empower multiple stakeholders, including data processors, controllers, and subjects, while also defining obligations. In conclusion, this paper intends to solve China's digital security shortcomings by developing a more mature regulatory framework and industry self-regulation mechanisms, resulting in a win-win situation for personal data protection and the development of the digital economy.

Keywords: personal information data transmission, risk assessment, DPIA, internet service provider, personal information data transimission, risk assessment

Procedia PDF Downloads 45
24392 Wavelets Contribution on Textual Data Analysis

Authors: Habiba Ben Abdessalem

Abstract:

The emergence of giant set of textual data was the push that has encouraged researchers to invest in this field. The purpose of textual data analysis methods is to facilitate access to such type of data by providing various graphic visualizations. Applying these methods requires a corpus pretreatment step, whose standards are set according to the objective of the problem studied. This step determines the forms list contained in contingency table by keeping only those information carriers. This step may, however, lead to noisy contingency tables, so the use of wavelet denoising function. The validity of the proposed approach is tested on a text database that offers economic and political events in Tunisia for a well definite period.

Keywords: textual data, wavelet, denoising, contingency table

Procedia PDF Downloads 271
24391 Fabrication of a Continuous Flow System for Biofilm Studies

Authors: Mohammed Jibrin Ndejiko

Abstract:

Modern and current models such as flow cell technology which enhances a non-destructive growth and inspection of the sessile microbial communities revealed a great understanding of biofilms. A continuous flow system was designed to evaluate possibility of biofilm formation by Escherichia coli DH5α on the stainless steel (type 304) under continuous nutrient supply. The result of the colony forming unit (CFU) count shows that bacterial attachment and subsequent biofilm formation on stainless steel coupons with average surface roughness of 1.5 ± 1.8 µm and 2.0 ± 0.09 µm were both significantly higher (p ≤ 0.05) than those of the stainless steel coupon with lower surface roughness of 0.38 ± 1.5 µm. These observations support the hypothesis that surface profile is one of the factors that influence biofilm formation on stainless steel surfaces. The SEM and FESEM micrographs of the stainless steel coupons also revealed the attached Escherichia coli DH5α biofilm and dehydrated extracellular polymeric substance on the stainless steel surfaces. Thus, the fabricated flow system represented a very useful tool to study biofilm formation under continuous nutrient supply.

Keywords: biofilm, flowcell, stainless steel, coupon

Procedia PDF Downloads 309
24390 The Proposal for a Framework to Face Opacity and Discrimination ‘Sins’ Caused by Consumer Creditworthiness Machines in the EU

Authors: Diogo José Morgado Rebelo, Francisco António Carneiro Pacheco de Andrade, Paulo Jorge Freitas de Oliveira Novais

Abstract:

Not everything in AI-power consumer credit scoring turns out to be a wonder. When using AI in Creditworthiness Assessment (CWA), opacity and unfairness ‘sins’ must be considered to the task be deemed Responsible. AI software is not always 100% accurate, which can lead to misclassification. Discrimination of some groups can be exponentiated. A hetero personalized identity can be imposed on the individual(s) affected. Also, autonomous CWA sometimes lacks transparency when using black box models. However, for this intended purpose, human analysts ‘on-the-loop’ might not be the best remedy consumers are looking for in credit. This study seeks to explore the legality of implementing a Multi-Agent System (MAS) framework in consumer CWA to ensure compliance with the regulation outlined in Article 14(4) of the Proposal for an Artificial Intelligence Act (AIA), dated 21 April 2021 (as per the last corrigendum by the European Parliament on 19 April 2024), Especially with the adoption of Art. 18(8)(9) of the EU Directive 2023/2225, of 18 October, which will go into effect on 20 November 2026, there should be more emphasis on the need for hybrid oversight in AI-driven scoring to ensure fairness and transparency. In fact, the range of EU regulations on AI-based consumer credit will soon impact the AI lending industry locally and globally, as shown by the broad territorial scope of AIA’s Art. 2. Consequently, engineering the law of consumer’s CWA is imperative. Generally, the proposed MAS framework consists of several layers arranged in a specific sequence, as follows: firstly, the Data Layer gathers legitimate predictor sets from traditional sources; then, the Decision Support System Layer, whose Neural Network model is trained using k-fold Cross Validation, provides recommendations based on the feeder data; the eXplainability (XAI) multi-structure comprises Three-Step-Agents; and, lastly, the Oversight Layer has a 'Bottom Stop' for analysts to intervene in a timely manner. From the analysis, one can assure a vital component of this software is the XAY layer. It appears as a transparent curtain covering the AI’s decision-making process, enabling comprehension, reflection, and further feasible oversight. Local Interpretable Model-agnostic Explanations (LIME) might act as a pillar by offering counterfactual insights. SHapley Additive exPlanation (SHAP), another agent in the XAI layer, could address potential discrimination issues, identifying the contribution of each feature to the prediction. Alternatively, for thin or no file consumers, the Suggestion Agent can promote financial inclusion. It uses lawful alternative sources such as the share of wallet, among others, to search for more advantageous solutions to incomplete evaluation appraisals based on genetic programming. Overall, this research aspires to bring the concept of Machine-Centered Anthropocentrism to the table of EU policymaking. It acknowledges that, when put into service, credit analysts no longer exert full control over the data-driven entities programmers have given ‘birth’ to. With similar explanatory agents under supervision, AI itself can become self-accountable, prioritizing human concerns and values. AI decisions should not be vilified inherently. The issue lies in how they are integrated into decision-making and whether they align with non-discrimination principles and transparency rules.

Keywords: creditworthiness assessment, hybrid oversight, machine-centered anthropocentrism, EU policymaking

Procedia PDF Downloads 25
24389 Customer Churn Analysis in Telecommunication Industry Using Data Mining Approach

Authors: Burcu Oralhan, Zeki Oralhan, Nilsun Sariyer, Kumru Uyar

Abstract:

Data mining has been becoming more and more important and a wide range of applications in recent years. Data mining is the process of find hidden and unknown patterns in big data. One of the applied fields of data mining is Customer Relationship Management. Understanding the relationships between products and customers is crucial for every business. Customer Relationship Management is an approach to focus on customer relationship development, retention and increase on customer satisfaction. In this study, we made an application of a data mining methods in telecommunication customer relationship management side. This study aims to determine the customers profile who likely to leave the system, develop marketing strategies, and customized campaigns for customers. Data are clustered by applying classification techniques for used to determine the churners. As a result of this study, we will obtain knowledge from international telecommunication industry. We will contribute to the understanding and development of this subject in Customer Relationship Management.

Keywords: customer churn analysis, customer relationship management, data mining, telecommunication industry

Procedia PDF Downloads 301
24388 On Pooling Different Levels of Data in Estimating Parameters of Continuous Meta-Analysis

Authors: N. R. N. Idris, S. Baharom

Abstract:

A meta-analysis may be performed using aggregate data (AD) or an individual patient data (IPD). In practice, studies may be available at both IPD and AD level. In this situation, both the IPD and AD should be utilised in order to maximize the available information. Statistical advantages of combining the studies from different level have not been fully explored. This study aims to quantify the statistical benefits of including available IPD when conducting a conventional summary-level meta-analysis. Simulated meta-analysis were used to assess the influence of the levels of data on overall meta-analysis estimates based on IPD-only, AD-only and the combination of IPD and AD (mixed data, MD), under different study scenario. The percentage relative bias (PRB), root mean-square-error (RMSE) and coverage probability were used to assess the efficiency of the overall estimates. The results demonstrate that available IPD should always be included in a conventional meta-analysis using summary level data as they would significantly increased the accuracy of the estimates. On the other hand, if more than 80% of the available data are at IPD level, including the AD does not provide significant differences in terms of accuracy of the estimates. Additionally, combining the IPD and AD has moderating effects on the biasness of the estimates of the treatment effects as the IPD tends to overestimate the treatment effects, while the AD has the tendency to produce underestimated effect estimates. These results may provide some guide in deciding if significant benefit is gained by pooling the two levels of data when conducting meta-analysis.

Keywords: aggregate data, combined-level data, individual patient data, meta-analysis

Procedia PDF Downloads 362
24387 Analyzing On-Line Process Data for Industrial Production Quality Control

Authors: Hyun-Woo Cho

Abstract:

The monitoring of industrial production quality has to be implemented to alarm early warning for unusual operating conditions. Furthermore, identification of their assignable causes is necessary for a quality control purpose. For such tasks many multivariate statistical techniques have been applied and shown to be quite effective tools. This work presents a process data-based monitoring scheme for production processes. For more reliable results some additional steps of noise filtering and preprocessing are considered. It may lead to enhanced performance by eliminating unwanted variation of the data. The performance evaluation is executed using data sets from test processes. The proposed method is shown to provide reliable quality control results, and thus is more effective in quality monitoring in the example. For practical implementation of the method, an on-line data system must be available to gather historical and on-line data. Recently large amounts of data are collected on-line in most processes and implementation of the current scheme is feasible and does not give additional burdens to users.

Keywords: detection, filtering, monitoring, process data

Procedia PDF Downloads 546
24386 Effect of 12 Weeks Pedometer-Based Workplace Program on Inflammation and Arterial Stiffness in Young Men with Cardiovascular Risks

Authors: Norsuhana Omar, Amilia Aminuddina Zaiton Zakaria, Raifana Rosa Mohamad Sattar, Kalaivani Chellappan, Mohd Alauddin Mohd Ali, Norizam Salamt, Zanariyah Asmawi, Norliza Saari, Aini Farzana Zulkefli, Nor Anita Megat Mohd. Nordin

Abstract:

Inflammation plays an important role in the pathogenesis of vascular dysfunction leading to arterial stiffness. Pulse wave velocity (PWV) and augmentation index (AS), as tools for the assessment of vascular damages are widely used and have been shown to predict cardiovascular disease (CVD). C-reactive protein (CRP) is a marker of inflammation. Several studies noted that regular exercise is associated with reduced arterial stiffness. The lack of exercise among Malaysians and the increasing CVD morbidity and mortality among young men are of concern. In Malaysia data on the workplace exercise intervention is scarce. A programme was designed to enable subjects to increase their level of walking as part of their daily work routine and self-monitored by using pedometers. The aim of this study to evaluate the reducing of inflammation by measuring CRP and improvement arterial stiffness measured by carotid femoral PWV (PWVCF) and AI. A total of 70 young men (20 - 40 years) who were sedentary, achieving less than 5,000 steps/day in casual walking with 2 or more cardiovascular risk factors were recruited in Institute of Vocational Skills for Youth (IKBN Hulu Langat). Subjects were randomly assigned to a control (CG) (n=34; no change in walking) and pedometer group (PG) (n=36; minimum target: 8,000 steps/day). The CRP was measured by using immunological method while PWVCF and AI were measured using Vicorder. All parameters were measured at baseline and after 12 weeks. Data for analysis was conducted using Statistical Package of Social Sciences Version 22 (SPSS Inc., Chicago, IL, USA). At post intervention, the CG step counts were similar (4983 ± 366vs 5697 ± 407steps/day). The PG increased step count from 4996 ± 805 to 10,128 ±511 steps/day (P<0.001). The PG showed significant improvement in anthropometric variables and lipid (time and group effect p<0.001). For vascular assessment, the PG showed significantly decreased for time and effect (p<0.001) for PWV (7.21± 0.83 to 6.42 ± 0.89) m/s; AI (11.88± 6.25 to 8.83 ± 3.7) % and CRP (pre= 2.28 ± 3.09, post=1.08± 1.37mg/L). However, no changes were seen in CG. As a conclusion, a pedometer-based walking programme may be an effective strategy for promoting increased daily physical activity which reduces cardiovascular risk markers and thus improve cardiovascular health in terms of inflammation and arterial stiffness. The community intervention for health maintenance has potential to adopt walking as an exercise and adopting vascular fitness index as the performance measuring tools.

Keywords: arterial stiffness, exercise, inflammation, pedometer

Procedia PDF Downloads 344