Search results for: R data science
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26780

24860 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity

Authors: Shaan Khosla, Jon Krohn

Abstract:

In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies to a candidate, and to compile a set of similar candidates for any given candidate. While these models are useful, validating their accuracy in a recommendation context is difficult due to data sparsity. In this report, we use network graph data to generate useful representations for candidates and vacancies. We treat candidates and vacancies as network nodes and designate a bi-directional link between them when the candidate has interviewed for the vacancy. After applying node2vec, the embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.
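
A minimal sketch of the node2vec idea described above, built with networkx and gensim rather than a dedicated node2vec package; the toy interview graph, walk lengths, and embedding dimensions are illustrative assumptions, not the authors' configuration.

```python
import random
import networkx as nx
from gensim.models import Word2Vec

# Toy bipartite interview graph: candidates and vacancies as nodes,
# an edge whenever a candidate interviewed for a vacancy (illustrative data).
G = nx.Graph()
G.add_edges_from([
    ("cand_1", "vac_A"), ("cand_1", "vac_B"),
    ("cand_2", "vac_A"), ("cand_3", "vac_C"),
])

def random_walk(graph, start, length=10):
    """Uniform random walk; node2vec proper biases this with p/q parameters."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = list(graph.neighbors(walk[-1]))
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return walk

# Generate walks from every node and learn skip-gram embeddings over them.
walks = [random_walk(G, n) for n in G.nodes() for _ in range(20)]
model = Word2Vec(walks, vector_size=32, window=5, min_count=1, sg=1)

# Rank the nodes most similar to a candidate by embedding similarity.
print(model.wv.most_similar("cand_1", topn=3))
```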

Keywords: AI, machine learning, NLP, recruiting

Procedia PDF Downloads 84
24859 A Web Service-Based Framework for Mining E-Learning Data

Authors: Felermino D. M. A. Ali, S. C. Ng

Abstract:

E-learning is an evolutionary form of distance learning and has improved over time as new technologies have emerged. Today, efforts are still being made to integrate E-learning systems with emerging technologies in order to make them better. Among these advancements, Educational Data Mining (EDM) is gaining increasing popularity due to its wide application in improving the teaching-learning process in online practice. However, even though EDM promises to bring many benefits to the educational industry in general and to E-learning environments in particular, its principal drawback is the lack of easy-to-use tools. Current EDM tools usually require users to have additional technical expertise to perform EDM tasks effectively. In response to these limitations, this study designs and implements an EDM application framework that aims at automating and simplifying the development of EDM in E-learning environments. The framework introduces a Service-Oriented Architecture (SOA) that hides the complexity of technical details and enables users to perform EDM in an automated fashion. It was designed around abstraction, extensibility, and interoperability principles, and its implementation is made up of three major modules. The first module provides an abstraction for data gathering, achieved by extending the Moodle LMS (Learning Management System) source code. The second module provides data mining methods and techniques as services, achieved by converting the Weka API into a set of Web services. The third module acts as an intermediary between the first two; it contains a user-friendly interface that allows users to dynamically locate data provider services and run knowledge discovery tasks on the data mining services. An experiment was conducted to evaluate the overhead of the proposed framework through a combination of simulation and implementation. The experiments showed that the overhead introduced by the SOA mechanism is relatively small; therefore, it is concluded that a service-oriented architecture can be effectively used to facilitate educational data mining in E-learning environments.
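
A toy sketch of the "data mining as a service" idea behind the second module: a classifier exposed behind an HTTP endpoint. The paper wraps the Java-based Weka API as Web services; here scikit-learn and Flask stand in purely for illustration, and the endpoint name, payload format, and dummy learner-activity features are assumptions.

```python
from flask import Flask, request, jsonify
from sklearn.tree import DecisionTreeClassifier

app = Flask(__name__)

# Train a throwaway model on dummy learner-activity data (features, label).
X = [[10, 2], [3, 0], [8, 5], [1, 1]]   # e.g. logins, forum posts
y = [1, 0, 1, 0]                        # e.g. pass / fail
model = DecisionTreeClassifier().fit(X, y)

@app.route("/classify", methods=["POST"])
def classify():
    # Expects JSON like {"features": [7, 3]} and returns the predicted class.
    features = request.get_json()["features"]
    prediction = model.predict([features])[0]
    return jsonify({"prediction": int(prediction)})

if __name__ == "__main__":
    app.run(port=5000)
```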

Keywords: educational data mining, e-learning, distributed data mining, moodle, service-oriented architecture, Weka

Procedia PDF Downloads 236
24858 Mathematics: Bridging Theory and Applications for a Data-Driven World

Authors: Zahid Ullah, Atlas Khan

Abstract:

In today's data-driven world, the role of mathematics in bridging the gap between theory and applications is becoming increasingly vital. This abstract highlights the significance of mathematics as a powerful tool for analyzing, interpreting, and extracting meaningful insights from vast amounts of data. By integrating mathematical principles with real-world applications, researchers can unlock the full potential of data-driven decision-making processes. This abstract delves into the various ways mathematics acts as a bridge connecting theoretical frameworks to practical applications. It explores the utilization of mathematical models, algorithms, and statistical techniques to uncover hidden patterns, trends, and correlations within complex datasets. Furthermore, it investigates the role of mathematics in enhancing predictive modeling, optimization, and risk assessment methodologies for improved decision-making in diverse fields such as finance, healthcare, engineering, and social sciences. The abstract also emphasizes the need for interdisciplinary collaboration between mathematicians, statisticians, computer scientists, and domain experts to tackle the challenges posed by the data-driven landscape. By fostering synergies between these disciplines, novel approaches can be developed to address complex problems and make data-driven insights accessible and actionable. Moreover, this abstract underscores the importance of robust mathematical foundations for ensuring the reliability and validity of data analysis. Rigorous mathematical frameworks not only provide a solid basis for understanding and interpreting results but also contribute to the development of innovative methodologies and techniques. In summary, this abstract advocates for the pivotal role of mathematics in bridging theory and applications in a data-driven world. By harnessing mathematical principles, researchers can unlock the transformative potential of data analysis, paving the way for evidence-based decision-making, optimized processes, and innovative solutions to the challenges of our rapidly evolving society.

Keywords: mathematics, bridging theory and applications, data-driven world, mathematical models

Procedia PDF Downloads 75
24857 AI-Enabled Smart Contracts for Reliable Traceability in Industry 4.0

Authors: Harris Niavis, Dimitra Politaki

Abstract:

The manufacturing industry has been collecting vast amounts of data for monitoring product quality, thanks to advances in the ICT sector, and dedicated IoT infrastructure is deployed to track and trace the production line. However, industries have not yet managed to unleash the full potential of these data due to defective data collection methods and untrusted data storage and sharing. Blockchain is gaining increasing ground as a key technology enabler for Industry 4.0 and the smart manufacturing domain, as it enables the secure storage and exchange of data between stakeholders. On the other hand, AI techniques are increasingly used to detect anomalies in batch and time-series data, enabling the identification of unusual behaviors. The proposed scheme is based on smart contracts to enable automation and transparency in the data exchange, coupled with anomaly detection algorithms to enable reliable data ingestion into the system. Before sensor measurements are fed to the blockchain component and the smart contracts, the anomaly detection mechanism uniquely combines artificial intelligence models to effectively detect unusual values such as outliers and extreme deviations in the incoming sensor data. Specifically, Autoregressive Integrated Moving Average (ARIMA), Long Short-Term Memory (LSTM) and dense autoencoder models, as well as Generative Adversarial Network (GAN) models, are used to detect both point and collective anomalies. Towards the goal of preserving the privacy of industries' information, the smart contracts employ techniques to ensure that only anonymized pointers to the actual data are stored on the ledger, while sensitive information remains off-chain. In the same spirit, blockchain technology guarantees the security of the data storage through strong cryptography, as well as the integrity of the data through the decentralization of the network and the execution of the smart contracts by the majority of the blockchain network actors. The blockchain component of the Data Traceability Software is based on the Hyperledger Fabric framework, which lays the ground for the deployment of smart contracts and APIs that expose the functionality to end-users. The results of this work demonstrate that such a system can increase the quality of the end-products and the trustworthiness of the monitoring process in the smart manufacturing domain. The proposed AI-enabled data traceability software can be employed by industries to accurately trace and verify quality records through the entire production chain and take advantage of the multitude of monitoring records in their databases.
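
A simplified stand-in for the anomaly screening step described above: sensor readings are checked before being referenced on the ledger. The paper uses ARIMA, LSTM and dense autoencoders, and GANs; this sketch uses a rolling median absolute deviation (MAD) filter purely to illustrate where such a gate sits in the ingestion pipeline, and the temperature stream is synthetic.

```python
import numpy as np

def is_point_anomaly(history, value, window=50, threshold=3.5):
    """Flag a reading that deviates strongly from the recent window."""
    recent = np.asarray(history[-window:])
    median = np.median(recent)
    mad = np.median(np.abs(recent - median)) or 1e-9
    robust_z = 0.6745 * (value - median) / mad
    return abs(robust_z) > threshold

history = list(np.random.normal(20.0, 0.5, size=200))  # e.g. a temperature stream
new_reading = 27.3

if is_point_anomaly(history, new_reading):
    print("reading rejected before blockchain ingestion")
else:
    history.append(new_reading)
    print("anonymized pointer to reading submitted to the smart contract")
```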

Keywords: blockchain, data quality, industry 4.0, product quality

Procedia PDF Downloads 189
24856 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction

Authors: Qais M. Yousef, Yasmeen A. Alshaer

Abstract:

Over the last few years, the amount of data available around the globe has increased rapidly. This has come with the emergence of recent concepts, such as big data and the Internet of Things, which have furnished a suitable solution for the availability of data all over the world. However, managing this massive amount of data remains a challenge due to the large variety of data types and their distribution. Therefore, locating the required file, particularly on the first trial, has become a difficult task, due to the large similarity between names of different files distributed on the web. Consequently, the accuracy and speed of search have been negatively affected. This work presents a method that uses electroencephalography signals to locate files based on their contents. Building on the concept of natural mind-wave processing, this work analyzes the mind-wave signals of different people, extracting their most appropriate features using a multi-objective metaheuristic algorithm, and then classifying them using an artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find files based on their contents using human thoughts only. Implementing this approach and testing it on real people proved its ability to find the desired files accurately within a noticeably shorter time and retrieve them as the first choice for the user.

Keywords: artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization

Procedia PDF Downloads 175
24855 IoT-Based Approach to a Healthcare System for a Quadriplegic Patient Using EEG

Authors: R. Gautam, P. Sastha Kanagasabai, G. N. Rathna

Abstract:

The proposed healthcare system enables quadriplegic patients and people with severe motor disabilities to send commands to electronic devices and monitor their vitals. The growth of the Brain-Computer Interface (BCI) has led to rapid development of 'assistive systems' for the disabled, called 'assistive domotics'. A Brain-Computer Interface is capable of reading the brainwaves of an individual and analysing them to obtain meaningful data. This processed data can be used to assist people with speech disorders, and sometimes people with limited locomotion, to communicate. In this project, the Emotiv EPOC headset is used to obtain the electroencephalogram (EEG). The obtained data is processed to communicate pre-defined commands over the internet to the desired mobile phone user. Other vital information such as heartbeat, blood pressure, ECG and body temperature is monitored and uploaded to the server. Data analytics enables physicians to scan databases for a specific illness. The data is processed on the Intel Edison system-on-chip (SoC), and patient metrics are displayed via the Intel IoT Analytics cloud service.

Keywords: brain computer interface, Intel Edison, Emotiv EPOC, IoT analytics, electroencephalogram

Procedia PDF Downloads 186
24854 Searchable Encryption in Cloud Storage

Authors: Ren Junn Hwang, Chung-Chien Lu, Jain-Shing Wu

Abstract:

Cloud outsourced storage is one of the important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving the target file exactly from among the encrypted files is difficult for the cloud server. This study proposes a protocol for performing multi-keyword searches over encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning the search keywords submitted by a cloud user. To reduce file transfer communication costs, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong letters does not exceed a given threshold, the user can still retrieve the target files from the cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.

Keywords: fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption

Procedia PDF Downloads 383
24853 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model

Authors: Fatemah A. Alqallaf, Debasis Kundu

Abstract:

The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions and non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and computing them involves solving a four-dimensional optimization problem. To avoid that, we propose an EM algorithm that involves solving only one non-linear equation at each 'E'-step. Hence, the implementation of the proposed EM algorithm is very straightforward in practice. Extensive simulation experiments have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data, and one data set has been analyzed to show the effectiveness of the proposed model.

Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators

Procedia PDF Downloads 143
24852 Blind Data Hiding Technique Using Interpolation of Subsampled Images

Authors: Singara Singh Kasana, Pankaj Garg

Abstract:

In this paper, a blind data hiding technique based on interpolation of subsampled versions of a cover image is proposed. A subsampled image is taken as a reference image, and an interpolated image is generated from this reference image. The difference between the original cover image and the interpolated image is then used to embed secret data. Comparisons with existing interpolation-based techniques show that the proposed technique provides higher embedding capacity and marked images of better visual quality. Moreover, the performance of the proposed technique is more stable across different images.
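
A toy sketch of the general interpolation-embedding idea described above, not the authors' exact scheme: subsample the cover image, interpolate it back to full size, and hide secret bits where the cover differs from the interpolated reference. The nearest-neighbour interpolation and the parity-of-difference encoding are assumptions made only to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(8, 8), dtype=np.int32)   # toy cover image
secret_bits = [1, 0, 1, 1, 0, 1]

# Reference image: every second pixel, repeated back up to full size.
reference = cover[::2, ::2]
interpolated = np.repeat(np.repeat(reference, 2, axis=0), 2, axis=1)

difference = cover - interpolated
stego = interpolated.copy()

bit_iter = iter(secret_bits)
for idx in zip(*np.nonzero(difference)):          # embeddable positions
    bit = next(bit_iter, None)
    if bit is None:
        stego[idx] = cover[idx]                   # no more bits: keep original pixel
        continue
    # Encode one bit in the parity of the difference magnitude.
    d = abs(difference[idx])
    if d % 2 != bit:
        d += 1
    stego[idx] = interpolated[idx] + np.sign(difference[idx]) * d

print(stego)
```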

Keywords: interpolation, image subsampling, PSNR, SIM

Procedia PDF Downloads 578
24851 Active Contours for Image Segmentation Based on Complex Domain Approach

Authors: Sajid Hussain

Abstract:

A complex domain approach for image segmentation based on active contours has been designed, in which the contour deforms step by step to partition an image into numerous expedient regions. A novel region-based trigonometric complex pressure force function is proposed, which propagates around the region of interest using image forces. The signed trigonometric force function controls the propagation of the active contour, and the active contour stops accurately on the exact edges of the object. The proposed model makes the level set function binary and uses a Gaussian smoothing kernel to adjust it and avoid the re-initialization procedure. The working principle of the proposed model is as follows: the real image data is transformed into complex data by multiplying the image data by iota (i), and the average of iota (i) times the horizontal and vertical components of the gradient of the image data is inserted into the proposed model to capture the complex gradient of the image data. A simple finite difference mathematical technique has been used to implement the proposed model. The efficiency and robustness of the proposed model have been verified and compared with other state-of-the-art models.

Keywords: image segmentation, active contour, level set, Mumford and Shah model

Procedia PDF Downloads 114
24850 Maramataka ki te Tiri o Te Moana (Maramataka in Antarctica): A Conceptual Maramataka in the Southwestern Ross Sea Region of Antarctica

Authors: Ayla Hoeta, Holly Winton

Abstract:

Maramataka is an ancestral lunar environmental knowledge system based on environmental tohu (signs, observations or indicators) that continues to impart maatauranga (knowledge) to tangata whenua, the people of the land, after thousands of years. Maramataka is the mauri (energy) flow between whenua (land), moana (water) and rangi (sky), experienced through tirotiro (observing), connecting and attuning to the natural environment. Tohu serve as guidance for practices of kaiawhina (protection), a key value driving Aotearoa New Zealand-led research in Antarctica. Recent developments recognise the importance of including and integrating indigenous knowledge and perspectives, such as maatauranga Maaori, which can provide insights into the conservation of Antarctica. We use an ancient kaupapa Maaori framework of weaving, wayfinding and attunement, the Hautu Waka, to navigate complexities. We investigate and weave together learnings from Antarctic and western science and indigenous Maaori maatauranga and tohu of moana, whenua and rangi to provide an indigenous perspective of the Antarctic taiao and Maramataka. Drawing on past and present knowledge of environmental calendars contained in maatauranga Maaori and paleoclimate knowledge bases, field observations, interviews and whakataukii (proverbs), we aim to provide a conceptual Maramataka of the southwestern Ross Sea region of Antarctica. A key area of interest is the tohu related to the marama, which are connected to all three interweaving spheres of moana, rangi and whenua. Maatauranga Maramataka in Aotearoa has been developed over millennia, and we acknowledge the mana and sacredness of this tupuna knowledge; this conceptual Maramataka serves as the starting point of a journey to shine light on indigenous perspectives using Maaori methods and frameworks in a dominant western science paradigm.

Keywords: Maramataka, Antarctica, Aotearoa, Maaori, tohu, moon, lunar calendar

Procedia PDF Downloads 77
24849 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules that can be applied to classification trees. In this research, we used an online social network dataset and all of its attributes (e.g., node features, labels, etc.) to determine what constitutes an above-average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we address overfitting and describe different approaches for evaluation and performance comparison of classification methods. In classification, the main objective is to categorize different items and assign them into different groups based on their properties and similarities. In data mining, recursive partitioning is utilized to probe the structure of a data set, which allows us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions with limited data; we do not know the densities, but we can estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the resulting classifications and utilized decision trees (with k-fold cross-validation to prune the tree). The decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.
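
The study works in R; the following is an equivalent minimal sketch in Python of the workflow described: inspect predictor correlations, then fit a decision tree and use k-fold cross-validation to choose a pruning level. The column names and tiny data frame are illustrative assumptions, not the UCI dataset used by the authors.

```python
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({
    "density":      [0.10, 0.25, 0.40, 0.15, 0.35, 0.50],
    "transitivity": [0.12, 0.30, 0.45, 0.18, 0.33, 0.55],
    "degree":       [3, 8, 15, 4, 10, 20],
    "divergent":    [0, 0, 1, 0, 1, 1],
})

# Correlation matrix of the predictors (e.g. density vs. transitivity).
print(df.drop(columns="divergent").corr())

X, y = df.drop(columns="divergent"), df["divergent"]

# Cross-validate trees with increasing cost-complexity pruning strength
# and keep the pruning level with the best mean accuracy.
best_alpha, best_score = None, -1.0
for alpha in [0.0, 0.01, 0.05, 0.1]:
    tree = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0)
    score = cross_val_score(tree, X, y, cv=3).mean()
    if score > best_score:
        best_alpha, best_score = alpha, score

print("chosen ccp_alpha:", best_alpha, "cv accuracy:", round(best_score, 3))
```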

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 158
24848 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network

Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu

Abstract:

Consider a database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. Models learned from these data to predict future traffic speeds would be beneficial for applications such as car navigation systems, but building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as the predictive model is that it does not require any explicit model building. On the other hand, k-NN takes a long time to make a prediction because it needs to search for the k nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched, without a significant sacrifice of prediction accuracy. The rationale behind this is that we may do better by looking only at recent data, because traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features, namely the current and past traffic speeds of the target link and of the neighboring links up- and downstream of it. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.
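
A minimal sketch of the idea described above: a k-NN regressor that predicts the next speed of a link from recent speed patterns, where restricting the neighbor search to only the most recent records trades some accuracy for prediction speed. The window size, k, and the synthetic speed series are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
speeds = 50 + 10 * np.sin(np.arange(5000) / 20) + rng.normal(0, 2, 5000)

# Features: the last 6 five-minute speeds of the link; target: the next speed.
lags = 6
X = np.array([speeds[i - lags:i] for i in range(lags, len(speeds))])
y = speeds[lags:]

def predict_latest(X, y, recent=None, k=5):
    """Fit k-NN on all records, or only on the most recent `recent` records,
    then predict the last (held-out) observation."""
    if recent is not None:
        X, y = X[-recent:], y[-recent:]
    model = KNeighborsRegressor(n_neighbors=k).fit(X[:-1], y[:-1])
    return model.predict(X[-1:])[0]

print("all data :", round(predict_latest(X, y), 2))
print("last 500 :", round(predict_latest(X, y, recent=500), 2))
```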

Keywords: big data, k-NN, machine learning, traffic speed prediction

Procedia PDF Downloads 363
24847 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University

Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang

Abstract:

Classification is one of the data mining techniques that aims to discover, from training data, a model that assigns records to the appropriate category or class. Data mining classification methods can be applied in education, for example, to determine the classification of non-active students at Indonesia Open University. This paper presents a comparison of three classification methods: Naïve Bayes, Bagging, and C4.5. The criteria used to evaluate the performance of the three classification methods are stratified cross-validation, the confusion matrix, the value of the area under the ROC curve (AUC), recall, precision, and F-measure. The data used for this paper are from the non-active Indonesia Open University students in the registration period from 2004.1 to 2012.2. The target analysis requires that non-active students be divided into 3 groups: C1, C2, and C3. A total of 4,173 student records were analyzed. Results of the study show: (1) the Bagging method gave a higher degree of classification accuracy than Naïve Bayes and C4.5; (2) the Bagging classification accuracy rate is 82.99%, while Naïve Bayes and C4.5 reach 80.04% and 82.74%, respectively; (3) the classification tree produced by the Bagging method has a large number of nodes, so it is quite difficult to use in decision making; (4) the characteristics of non-active Indonesia Open University students are therefore classified using the C4.5 algorithm; (5) based on the C4.5 algorithm, there are 5 interesting rules which describe the characteristics of non-active Indonesia Open University students.
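
A minimal sketch of the comparison described above using scikit-learn: Naïve Bayes, Bagging (whose default base learner is a decision tree), and a single decision tree (CART standing in for C4.5, which scikit-learn does not implement) evaluated with stratified cross-validation, accuracy, and AUC. The synthetic data is illustrative, not the Indonesia Open University registration records.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

models = {
    "Naive Bayes":          GaussianNB(),
    "Bagging":              BaggingClassifier(n_estimators=50, random_state=0),
    "CART (C4.5 stand-in)": DecisionTreeClassifier(random_state=0),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc_ovr").mean()
    print(f"{name:22s} accuracy={acc:.3f}  AUC={auc:.3f}")
```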

Keywords: comparative analysis, data mining, classification, Bagging, Naïve Bayes, C4.5, non-active students, Indonesia Open University

Procedia PDF Downloads 316
24846 A Study of the Adaptive Reuse for School Land Use Strategy: An Application of the Analytic Network Process and Big Data

Authors: Wann-Ming Wey

Abstract:

With today's popularity and progress of information technology, assembling and analyzing big data sets is no longer a major conundrum. We can now use relevant big data not only to analyze and simulate the possible status of urban development in the near future, but also to provide a more comprehensive and reasonable basis for policy implementation to government units or decision-makers via the results of such analysis and simulation. In this research, we set Taipei City as the research scope and use relevant big data variables (e.g., population, facility utilization and related social policy ratings) and the Analytic Network Process (ANP) approach to carry out an in-depth study of the possible reduction of land use in the primary and secondary schools of Taipei City. In addition to enhancing prosperous urban activities through better utilization of urban public facilities, the final results of this research could help improve the efficiency of urban land use in the future. Furthermore, the assessment model and research framework established in this research also provide a good reference for land use and adaptive reuse strategies for schools or other public facilities in the future.

Keywords: adaptive reuse, analytic network process, big data, land use strategy

Procedia PDF Downloads 203
24845 Interoperability Standard for Data Exchange in Educational Documents in Professional and Technological Education: A Comparative Study and Feasibility Analysis for the Brazilian Context

Authors: Giovana Nunes Inocêncio

Abstract:

Professional and technological education (EPT) plays a pivotal role in equipping students for specialized careers, and it is imperative to establish a framework for efficient data exchange among educational institutions. The primary focus of this article is to address the pressing need for document interoperability within the context of EPT. The challenges, motivations, and benefits of implementing interoperability standards for digital educational documents are thoroughly explored. These documents include EPT completion certificates, academic records, and curricula. It is evident that the intersection of IT governance and interoperability standards holds the key to transforming the landscape of technical education in Brazil. IT governance provides the strategic framework for effective data management, aligning with educational objectives, ensuring compliance, and managing risks. By adopting interoperability standards, the technical education sector in Brazil can facilitate data exchange, enhance data security, and promote international recognition of qualifications. The utilization of the XML (Extensible Markup Language) standard further strengthens the foundation for structured data exchange, fostering efficient communication, standardization of curricula, and enhancement of educational materials. IT governance, interoperability standards, and data management play a critical role in driving the quality, efficiency, and security of technical education. The adoption of these standards fosters transparency, stakeholder coordination, and regulatory compliance, ultimately empowering the technical education sector to meet the dynamic demands of the 21st century.

Keywords: interoperability, education, standards, governance

Procedia PDF Downloads 70
24844 A Critical Review and Bibliometric Analysis on Measures of Achievement Motivation

Authors: Kanupriya Rawat, Aleksandra Błachnio, Paweł Izdebski

Abstract:

Achievement motivation, which drives a person to strive for success, is an important construct in sports psychology. This systematic review aims to analyze the methods of measuring achievement motivation used in studies published over the past four decades and to find out which method is the most prevalent and the most effective, by thoroughly examining the measures of achievement motivation used in each study and by evaluating the most highly cited achievement motivation measures in sport. In order to understand this latent construct, thorough measurement is necessary; hence, a critical evaluation of measurement tools is required. The literature search was conducted in the following databases: EBSCO, MEDLINE, APA PsychARTICLES, Academic Search Ultimate, Open Dissertations, ERIC, ScienceDirect, Web of Science, and Wiley Online Library. A total of 26 articles met the inclusion criteria and were selected. From this review, it was found that the Achievement Goal Questionnaire-Sport (AGQ-Sport) and the Task and Ego Orientation in Sport Questionnaire (TEOSQ) were used in most of the research; however, the average weighted impact factor of the AGQ-Sport is the second highest and the most relevant in terms of research articles related to the sport psychology discipline. The TEOSQ is highly popular in cross-cultural adaptation but has the second-lowest average impact factor among the scales, due to the lower impact factors of most of the publishing journals. All measures of achievement motivation have Cronbach’s alpha values greater than .70, which is acceptable. The advantages and limitations of each measurement tool are discussed, and the distinction between using implicit and explicit measures of achievement motivation is explained. Overall, implicit and explicit measures have different conceptualizations of achievement motivation and are applicable at either the contextual or the situational level. The conceptualization and degree of applicability are perhaps the most crucial factors for researchers choosing a questionnaire, even though the instruments differ in their development, reliability, and use.

Keywords: achievement motivation, task and ego orientation, sports psychology, measures of achievement motivation

Procedia PDF Downloads 96
24843 Generating Real-Time Visual Summaries from Located Sensor-Based Data with Chorems

Authors: Z. Bouattou, R. Laurini, H. Belbachir

Abstract:

This paper describes a new approach for the automatic generation of visual summaries, combining cartographic visualization methods and real-time sensor data modeling. The concept of chorems seems an interesting candidate for visualizing real-time geographic database summaries. Chorems were defined by Roger Brunet (1980) as schematized visual representations of territories; however, time information is not yet handled in existing chorematic map approaches, an issue discussed in this paper. Our approach is based on spatial analysis: by interpolating the values recorded at the same time by the available sensors, we obtain a number of distributed observations over the study areas, and spatial interpolation methods are used to derive the concentration fields. From these fields, and by applying spatial data mining procedures on the fly, it is possible to extract important patterns as geographic rules. Those patterns are then visualized as chorems.
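
A minimal sketch of the spatial-interpolation step described above, using inverse distance weighting (IDW) as one common choice of interpolation method (the paper does not name a specific one): scattered sensor readings taken at the same timestamp are turned into a concentration field on a grid. The sensor locations, values, and power parameter are illustrative assumptions.

```python
import numpy as np

# (x, y) sensor positions and the values they report at one timestamp.
sensors = np.array([[0.1, 0.2], [0.8, 0.3], [0.5, 0.9], [0.2, 0.7]])
values = np.array([12.0, 30.0, 22.0, 18.0])

def idw(points, vals, grid_xy, power=2.0, eps=1e-12):
    """Inverse-distance-weighted estimate at each grid point."""
    d = np.linalg.norm(grid_xy[:, None, :] - points[None, :, :], axis=2)
    w = 1.0 / (d ** power + eps)
    return (w * vals).sum(axis=1) / w.sum(axis=1)

# Evaluate the concentration field on a 20 x 20 grid covering the unit square.
gx, gy = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
grid = np.column_stack([gx.ravel(), gy.ravel()])
field = idw(sensors, values, grid).reshape(20, 20)

print(field.round(1))
```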

Keywords: geovisualization, spatial analytics, real-time, geographic data streams, sensors, chorems

Procedia PDF Downloads 401
24842 Need for Privacy in the Technological Era: An Analysis in the Indian Perspective

Authors: Amrashaa Singh

Abstract:

In the digital age and the vast cyberspace, data protection and privacy have become major issues of this technological era. There was a time when social media and online shopping websites were treated as a blessing for the people. But now the tables have turned, and people have started to look at them with suspicion; they are becoming aware of the privacy implications, and they do not feel as safe as they initially did. When Edward Snowden informed the world about the snooping that United States security agencies had been doing, the picture became clear for the people. After the Cambridge Analytica case, where the data of Facebook users were stored without their consent, doubts arose in people's minds about how safe they actually are. In India, the case of the spyware Pegasus also raised a lot of concerns; it was used to snoop on many human rights activists and lawyers, and the company which invented the spyware claims that it only sells it to governments. This paper deals with privacy concerns from the Indian perspective using an analytical methodology. The Supreme Court of India recently declared the right to privacy a Fundamental Right under Article 21 of the Constitution of India. Further, the Government is also working on the Data Protection Bill. The point to note is that India is still a developing country, and with the bill, the government aims at data localization. But there are doubts in the minds of many people that the Government would actually be snooping on the data of individuals; it looks more like an attempt to curb dissenters ‘lawfully’. The focus of the paper is on these issues in India in light of the European Union (EU) General Data Protection Regulation (GDPR). The Indian Data Protection Bill is also said to be loosely based on the EU GDPR. But how helpful these laws would actually be is another concern, since the economic and social conditions in the two jurisdictions are very different. The paper aims at discussing these concerns, how good or bad the intention of the government behind the bill is, and how nations can act together and draft common regulations so that there is some uniformity in the laws and their application.

Keywords: Article 21, data protection, dissent, fundamental right, India, privacy

Procedia PDF Downloads 114
24841 Enhanced Model for Risk-Based Assessment of Employee Security with Bring Your Own Device Using Cyber Hygiene

Authors: Saidu I. R., Shittu S. S.

Abstract:

As the trend of personal devices accessing corporate data continues to rise through Bring Your Own Device (BYOD) practices, organizations recognize the potential for cost reduction and productivity gains. However, the associated security risks pose a significant threat to these benefits. Often, organizations adopt BYOD environments without fully considering the vulnerabilities introduced by human factors in this context. This study presents an enhanced assessment model that evaluates the security posture of employees in BYOD environments using cyber hygiene principles. The framework assesses users' adherence to best practices and guidelines for maintaining a secure computing environment, employing rating scales and the Euclidean distance formula. Using this measure, the study quantifies the distance between users' security practices and the organization's optimal security policies. To facilitate user evaluation, a simple and intuitive interface for automated assessment is developed. To validate the effectiveness of the proposed framework, design science research methods are employed, and empirical assessments are conducted using five artifacts to analyze user suitability in BYOD environments. By addressing human-factor vulnerabilities through the assessment of cyber hygiene practices, this study aims to enhance the overall security of BYOD environments and enable organizations to leverage the advantages of this evolving trend while mitigating potential risks.
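
A minimal sketch of the scoring idea described above: each employee's cyber hygiene practices are rated on simple scales, and the Euclidean distance to the organization's optimal policy vector gives a risk-style score (larger distance means weaker posture). The practice names and scale values are illustrative assumptions, not the framework's actual instrument.

```python
import numpy as np

practices = ["screen_lock", "os_updates", "mdm_enrolled", "public_wifi_vpn"]
optimal_policy = np.array([5, 5, 5, 5])          # best score on a 1-5 scale

employees = {
    "user_a": np.array([5, 4, 5, 3]),
    "user_b": np.array([2, 3, 1, 1]),
}

for name, scores in employees.items():
    # Euclidean distance between the user's practices and the optimal policy.
    distance = np.linalg.norm(scores - optimal_policy)
    print(f"{name}: distance from optimal policy = {distance:.2f}")
```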

Keywords: security, BYOD, vulnerability, risk, cyber hygiene

Procedia PDF Downloads 76
24840 An Online 3D Modeling Method Based on a Lossless Compression Algorithm

Authors: Jiankang Wang, Hongyang Yu

Abstract:

This paper proposes a portable online 3D modeling method. The method first utilizes a depth camera to collect data and compresses the depth data using a frame-by-frame lossless compression method, while the color image is encoded using the H.264 format. After the cloud receives the color and depth images, a 3D modeling method based on BundleFusion is used to complete the 3D reconstruction. The results of this study indicate that the method is portable, online, and highly efficient, and has a wide range of application prospects.
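
A minimal sketch of the frame-by-frame lossless depth compression step described above: each 16-bit depth frame is compressed independently with a general-purpose lossless codec before being sent to the cloud. The use of zlib is an assumption (the paper does not name its codec), and the random depth map is a placeholder.

```python
import zlib
import numpy as np

def compress_depth_frame(depth_frame: np.ndarray) -> bytes:
    """Losslessly compress one 16-bit depth frame."""
    return zlib.compress(depth_frame.astype(np.uint16).tobytes(), level=6)

def decompress_depth_frame(payload: bytes, shape) -> np.ndarray:
    """Recover the exact depth values on the cloud side."""
    return np.frombuffer(zlib.decompress(payload), dtype=np.uint16).reshape(shape)

frame = np.random.randint(0, 5000, size=(480, 640), dtype=np.uint16)  # toy depth map
payload = compress_depth_frame(frame)
restored = decompress_depth_frame(payload, frame.shape)

assert np.array_equal(frame, restored)          # lossless round trip
print(f"compression ratio: {frame.nbytes / len(payload):.2f}x")
```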

Keywords: 3D reconstruction, bundlefusion, lossless compression, depth image

Procedia PDF Downloads 82
24839 Theorizing Optimal Use of Numbers and Anecdotes: The Science of Storytelling in Newsrooms

Authors: Hai L. Tran

Abstract:

When covering events and issues, the news media often employ both personal accounts and facts and figures. However, the process of using numbers and narratives in the newsroom mostly operates through trial and error. There is a demonstrated need for the news industry to better understand the specific effects of storytelling and data-driven reporting on the audience, as well as the explanatory factors driving such effects. In the academic world, anecdotal evidence and statistical evidence have been studied in a mutually exclusive manner. Existing research tends to treat pertinent effects as though the use of one form precludes the other and as if a tradeoff is required. Meanwhile, narratives and statistical facts are often combined in various communication contexts, especially in news presentations. There is value in reconceptualizing and theorizing about both the relative and the collective impacts of numbers and narratives, as well as the mechanisms underlying such effects. The current undertaking seeks to link theory to practice by providing a complete picture of how and why people are influenced by information conveyed through quantitative and qualitative accounts. Specifically, cognitive-experiential theory is invoked to argue that humans employ two distinct systems to process information. The rational system requires the processing of logical evidence through effortful analytical cognitions, which are affect-free. The experiential system, meanwhile, is intuitive, rapid, automatic, and holistic, thereby demanding minimal cognitive resources and relating to the experience of affect. In certain situations, one system might dominate the other, but the rational and experiential modes of processing operate in parallel and at the same time. As such, anecdotes and quantified facts impact audience response differently, and a combination of data and narratives is more effective than either form of evidence alone. In addition, the present study identifies several media variables and human factors driving the effects of statistics and anecdotes. An integrative model is proposed to explain how message characteristics (modality, vividness, salience, congruency, position) and individual differences (involvement, numeracy skills, cognitive resources, cultural orientation) impact selective exposure, which in turn activates pertinent modes of processing and thereby induces corresponding responses. The present study represents a step toward bridging theoretical frameworks from various disciplines to better understand the specific effects and the conditions under which the use of anecdotal evidence and/or statistical evidence enhances or undermines information processing. In addition to theoretical contributions, this research helps inform news professionals about the benefits and pitfalls of incorporating quantitative and qualitative accounts in reporting. It proposes a typology of possible scenarios and appropriate strategies for journalists to use when presenting news with anecdotes and numbers.

Keywords: data, narrative, number, anecdote, storytelling, news

Procedia PDF Downloads 79
24838 H∞ Sampled-Data Control for Linear Systems with Time-Varying Delays: Application to Power Systems

Authors: Chang-Ho Lee, Seung-Hoon Lee, Myeong-Jin Park, Oh-Min Kwon

Abstract:

This paper investigates improved stability criteria for sampled-data control of linear systems with disturbances and time-varying delays. Based on Lyapunov-Krasovskii stability theory, delay-dependent conditions sufficient to ensure H∞ stability for the system are derived in the form of linear matrix inequalities (LMIs). The effectiveness of the proposed method is shown in numerical examples.

Keywords: sampled-data control system, Lyapunov-Krasovskii functional, time delay-dependent, LMI, H∞ control

Procedia PDF Downloads 320
24837 Exploring the Synergistic Effects of Aerobic Exercise and Cinnamon Extract on Metabolic Markers in Insulin-Resistant Rats through Advanced Machine Learning and Deep Learning Techniques

Authors: Masoomeh Alsadat Mirshafaei

Abstract:

The present study aims to explore the effect of an 8-week aerobic training regimen combined with cinnamon extract on serum irisin and leptin levels in insulin-resistant rats. Additionally, this research leverages various machine learning (ML) and deep learning (DL) algorithms to model the complex interdependencies between exercise, nutrition, and metabolic markers, offering a groundbreaking approach to obesity and diabetes research. Forty-eight Wistar rats were selected and randomly divided into four groups: control, training, cinnamon, and training cinnamon. The training protocol was conducted over 8 weeks, with sessions 5 days a week at 75-80% VO2 max. The cinnamon and training-cinnamon groups were injected with 200 ml/kg/day of cinnamon extract. Data analysis included serum data, dietary intake, exercise intensity, and metabolic response variables, with blood samples collected 72 hours after the final training session. The dataset was analyzed using one-way ANOVA (P<0.05) and fed into various ML and DL models, including Support Vector Machines (SVM), Random Forest (RF), and Convolutional Neural Networks (CNN). Traditional statistical methods indicated that aerobic training, with and without cinnamon extract, significantly increased serum irisin and decreased leptin levels. Among the algorithms, the CNN model provided superior performance in identifying specific interactions between cinnamon extract concentration and exercise intensity, optimizing the increase in irisin and the decrease in leptin. The CNN model achieved an accuracy of 92%, outperforming the SVM (85%) and RF (88%) models in predicting the optimal conditions for metabolic marker improvements. The study demonstrated that advanced ML and DL techniques could uncover nuanced relationships and potential cellular responses to exercise and dietary supplements, which is not evident through traditional methods. These findings advocate for the integration of advanced analytical techniques in nutritional science and exercise physiology, paving the way for personalized health interventions in managing obesity and diabetes.

Keywords: aerobic training, cinnamon extract, insulin resistance, irisin, leptin, convolutional neural networks, exercise physiology, support vector machines, random forest

Procedia PDF Downloads 38
24836 Logistics Information Systems in the Distribution of Flour in Nigeria

Authors: Cornelius Femi Popoola

Abstract:

This study investigated logistics information systems in the distribution of flour in Nigeria. A case study design was used, and 50 staff of Honeywell Flour Mill were sampled for the study. Data generated through a questionnaire were analysed using correlation and regression analysis. The findings of the study revealed that logistics information systems such as e-commerce, interactive telephone systems and electronic data interchange positively correlated with the distribution of flour at Honeywell Flour Mill. The findings also showed that e-commerce, interactive telephone systems and electronic data interchange jointly and positively contribute to the distribution of flour at Honeywell Flour Mill in Nigeria (R = .935; Adj. R² = .642; F(3,47) = 14.739; p < .05). The study therefore recommended that Honeywell Flour Mill upgrade their logistics information systems to computer-to-computer communication of business transactions and documents, as well as adopt new technologies such as tracking-and-tracing systems (barcode scanning for packages and pallets), tracking vehicles with the Global Positioning System (GPS), measuring vehicle performance with 'black boxes' (containing logistics data), and Automatic Equipment Identification (AEI).
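
A minimal sketch of the joint-contribution test reported above: a multiple regression of a distribution score on the three logistics-information-system predictors, from which R, adjusted R², and the F statistic are read off. The data frame below is synthetic and illustrative, not the questionnaire responses of the 50 Honeywell Flour Mill staff.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "e_commerce":        rng.integers(1, 6, 50),   # 1-5 Likert-style ratings
    "interactive_phone": rng.integers(1, 6, 50),
    "edi":               rng.integers(1, 6, 50),
})
df["distribution"] = (0.5 * df["e_commerce"] + 0.3 * df["interactive_phone"]
                      + 0.4 * df["edi"] + rng.normal(0, 0.5, 50))

X = sm.add_constant(df[["e_commerce", "interactive_phone", "edi"]])
model = sm.OLS(df["distribution"], X).fit()

print("R        :", round(np.sqrt(model.rsquared), 3))
print("Adj. R^2 :", round(model.rsquared_adj, 3))
print("F        :", round(model.fvalue, 3), "p =", round(model.f_pvalue, 4))
```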

Keywords: e-commerce, electronic data interchange, flour distribution, information system, interactive telephone systems

Procedia PDF Downloads 553
24835 Cascaded Neural Network for Internal Temperature Forecasting in Induction Motor

Authors: Hidir S. Nogay

Abstract:

In this study, two systems were created to predict the internal temperature of an induction motor. One consisted of a simple ANN model with two layers, ten input parameters and one output parameter. The other consisted of eight ANN models connected to each other in a cascade; the cascaded ANN system has 17 inputs. The main reason the cascaded system is used in this study is to achieve more accurate estimation by increasing the number of inputs to the ANN system. The cascaded ANN system is compared with the simple conventional ANN model to demonstrate these advantages. The dataset was obtained from experimental applications, and a small part of it was used to obtain more understandable graphs. The number of data points is 329, and 30% of the data was used for testing and validation. Test and validation data were determined for each ANN model separately, and the reliability of each model was tested. As a result of this study, it was found that the cascaded ANN system produced more accurate estimates than the conventional ANN model.
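
A minimal sketch of the cascading idea described above: the prediction of a first ANN is appended to the inputs of a second ANN, so later stages see more information than a single flat model. The two-stage topology and synthetic data are illustrative assumptions, not the paper's eight-model cascade or its experimental measurements.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(329, 10))                 # e.g. voltages, currents, ambient temp
temperature = X[:, :3].sum(axis=1) + rng.normal(0, 0.1, 329)

train, test = slice(0, 230), slice(230, 329)   # roughly 70% / 30% split

# Stage 1: predict the target from the raw inputs.
stage1 = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
stage1.fit(X[train], temperature[train])
stage1_out = stage1.predict(X).reshape(-1, 1)

# Stage 2: raw inputs plus the stage-1 estimate as an extra feature.
X_cascaded = np.hstack([X, stage1_out])
stage2 = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
stage2.fit(X_cascaded[train], temperature[train])

print("stage-1 test R^2:", round(stage1.score(X[test], temperature[test]), 3))
print("cascade test R^2:", round(stage2.score(X_cascaded[test], temperature[test]), 3))
```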

Keywords: cascaded neural network, internal temperature, inverter, three-phase induction motor

Procedia PDF Downloads 345
24834 Big Data and Health: An Australian Perspective Which Highlights the Importance of Data Linkage to Support Health Research at a National Level

Authors: James Semmens, James Boyd, Anna Ferrante, Katrina Spilsbury, Sean Randall, Adrian Brown

Abstract:

‘Big data’ is a relatively new concept that describes data so large and complex that they exceed the storage or computing capacity of most systems to perform timely and accurate analyses. Health services generate large amounts of data from a wide variety of sources such as administrative records, electronic health records, health insurance claims, and even smart phone health applications. Health data is viewed in Australia and internationally as highly sensitive. Strict ethical requirements must be met for the use of health data to support health research. These requirements differ markedly from those imposed on data use from industry or other government sectors and may have the effect of reducing the capacity of health data to be incorporated into the real-time demands of the big data environment. This ‘big data revolution’ is increasingly supported by national governments, who have invested significant funds in initiatives designed to develop and capitalize on big data and methods for data integration using record linkage. The benefits to health following research using linked administrative data are recognised internationally and by the Australian Government through the National Collaborative Research Infrastructure Strategy Roadmap, which outlined a multi-million dollar investment strategy to develop national record linkage capabilities. This led to the establishment of the Population Health Research Network (PHRN) to coordinate and champion this initiative. The purpose of the PHRN was to establish record linkage units in all Australian states, to support the implementation of secure data delivery and remote access laboratories for researchers, and to develop the Centre for Data Linkage for the linkage of national and cross-jurisdictional data. The Centre for Data Linkage has been established within Curtin University in Western Australia; it provides essential record linkage infrastructure necessary for large-scale, cross-jurisdictional linkage of health-related data in Australia and uses a best-practice ‘separation principle’ to support data privacy and security. Privacy-preserving record linkage technology is also being developed to link records without the use of names, to overcome important legal and privacy constraints. This paper will present the findings of the first ‘Proof of Concept’ project selected to demonstrate the effectiveness of increased record linkage capacity in supporting nationally significant health research. This project explored how cross-jurisdictional linkage can inform the nature and extent of cross-border hospital use and hospital-related deaths. The technical challenges associated with national record linkage, and the extent of cross-border population movements, were explored as part of this pioneering research project. Access to person-level data linked across jurisdictions identified geographical hot spots of cross-border hospital use and hospital-related deaths in Australia. This has implications for planning of health service delivery and for longitudinal follow-up studies, particularly those involving mobile populations.
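
A simplified illustration of the record linkage idea mentioned above: records from two jurisdictions are matched on a salted hash of normalized identifiers instead of on names. Production privacy-preserving linkage (and the separation principle the paper refers to) is considerably more sophisticated; this is only a sketch, and the field names and salt are assumptions.

```python
import hashlib

SALT = "shared-secret-between-linkage-parties"   # assumption for illustration

def match_key(name: str, dob: str) -> str:
    """Deterministic match key built from normalized identifiers, never stored in clear."""
    normalized = f"{name.strip().lower()}|{dob}"
    return hashlib.sha256((SALT + normalized).encode()).hexdigest()

hospital_a = [{"name": "Jane Doe", "dob": "1980-05-01", "episode": "A-17"}]
hospital_b = [{"name": "JANE DOE ", "dob": "1980-05-01", "death_record": "B-3"}]

index_a = {match_key(r["name"], r["dob"]): r for r in hospital_a}
for record in hospital_b:
    key = match_key(record["name"], record["dob"])
    if key in index_a:
        print("cross-border link found:",
              index_a[key]["episode"], "<->", record["death_record"])
```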

Keywords: data integration, data linkage, health planning, health services research

Procedia PDF Downloads 216
24833 Spatial Variability of Brahmaputra River Flow Characteristics

Authors: Hemant Kumar

Abstract:

The Brahmaputra River is known, according to Hindu mythology, as the son of Lord Brahma. True to this name, the river Brahmaputra causes mass destruction during the monsoon season in Assam, India, a state situated in the north-eastern part of the country. Assam is one of the essential states among the seven states of eastern India, and almost the entire Brahmaputra flow passes through it, while the other states carry its tributaries. In the present case study, spatial analysis was performed; for this specific case, a number of MODIS data sets were acquired. In the change detection method, the spray (aerosol) content was examined during heavy rainfall and the flooded monsoon season. By this method, the analysis of the Brahmaputra outflow in particular identifies the flooded season. The charged-particle-associated aerosol content indicates the heavy water content below the ground surface, which is validated by trend analysis of rainfall spectrum data. This is confirmed by in-situ sampled data from different positions along the Brahmaputra River. Further, Hyperion hyperspectral data at 30 m resolution were used to scan the sediment deposits, which were also confirmed by in-situ sampled data from different positions.

Keywords: aerosol, change detection, spatial analysis, trend analysis

Procedia PDF Downloads 147
24832 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death in most African countries, and Ethiopia is one of the seriously affected countries in sub-Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become a chronic but manageable disease. This study focused on a data mining technique to predict the future living status of HIV/AIDS patients at the time of drug regimen change, when the patients become toxic to the ART drug combination they are currently taking. The data are taken from the University of Gondar Hospital ART program database. A hybrid methodology is followed to explore the application of data mining to the ART program dataset. Data cleaning, handling of missing values, and data transformation were used for preprocessing the data. WEKA 3.7.9 data mining tools, classification algorithms, and domain expertise are utilized as means to address the research problem. Using four different classification algorithms (i.e., the J48 classifier, PART rule induction, Naïve Bayes and a neural network) and adjusting their parameters, thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performances of the models were evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model to predict the status of HIV patients with drug regimen substitution is a pruned J48 decision tree with a classification accuracy of 98.01%. This study extracts interesting attributes such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender to predict the status of drug regimen substitution. The outcome of this study can be used as an assistive tool for clinicians to help them make more appropriate drug regimen substitutions. Future research directions are forwarded to come up with an applicable system in the area of the study.

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 142
24831 Internal Cycles from Hydrometric Data and Variability Detected Through Hydrological Modelling Results, on the Niger River, over 1901-2020

Authors: Salif Koné

Abstract:

We analyze hydrometric data at the Koulikoro station on the Niger River; the basin upstream of this station drains 120,600 km² and covers three countries in West Africa: Guinea, Mali, and Ivory Coast. Two subsequent decadal cycles are highlighted (1925-1936 and 1929-1939) instead of the single decadal cycle presumed in the literature. Moreover, the observed hydrometric data show a multidecadal 40-year period, which is confirmed when graphing a spatial coefficient of variation of runoff over decades (starting with 1901-1910). Spatial runoff data are produced on 48 grid cells (0.5 degree by 0.5 degree) through semi-distributed versions of both the SimulHyd and GR2M models (variants of a French hydrological model, GR2M standing for Genie Rural with 2 parameters at a monthly time step). The two extreme decades in terms of runoff coefficient of variation are compared: 1951-1960 has the minimal coefficient of variation, and 1981-1990 shows the maximal value during the three months of high water level (August, September, and October). Mapping the relative variation between these two decadal situations supports the following hypothesis: the scale of variation between the two extreme situations could serve to fix boundary conditions for further simulations using climate scenario data.
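
A minimal sketch of the decadal statistic described above: for each decade, the spatial coefficient of variation (standard deviation divided by mean) of runoff across grid cells is computed for the high-water months. The runoff array is a random placeholder, not the SimulHyd/GR2M output used in the study.

```python
import numpy as np

rng = np.random.default_rng(42)
years = np.arange(1901, 2021)
n_cells = 48                                    # 0.5 x 0.5 degree grid cells

# Placeholder monthly runoff with shape (years, months, grid cells).
runoff = rng.gamma(shape=2.0, scale=30.0, size=(len(years), 12, n_cells))

high_water = [7, 8, 9]                          # August, September, October (0-based)

for start in range(1901, 2020, 10):
    decade = (years >= start) & (years < start + 10)
    # Mean runoff per cell over the decade's high-water months, then CV across cells.
    per_cell = runoff[decade][:, high_water, :].mean(axis=(0, 1))
    cv = per_cell.std() / per_cell.mean()
    print(f"{start}-{start + 9}: spatial CV = {cv:.3f}")
```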

Keywords: internal cycles, hydrometric data, Niger River, GR2M and SimulHyd framework, runoff coefficient of variation

Procedia PDF Downloads 95