Search results for: Big data analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25183

Search results for: Big data analytics

23983 Development of a Spatial Data for Renal Registry in Nigeria Health Sector

Authors: Adekunle Kolawole Ojo, Idowu Peter Adebayo, Egwuche Sylvester O.

Abstract:

Chronic Kidney Disease (CKD) is a significant cause of morbidity and mortality across developed and developing nations and is associated with increased risk. There are no existing electronic means of capturing and monitoring CKD in Nigeria. The work is aimed at developing a spatial data model that can be used to implement renal registries required for tracking and monitoring the spatial distribution of renal diseases by public health officers and patients. In this study, we have developed a spatial data model for a functional renal registry.

Keywords: renal registry, health informatics, chronic kidney disease, interface

Procedia PDF Downloads 212
23982 Environmental Evaluation of Two Kind of Drug Production (Syrup and Pomade Form) Using Life Cycle Assessment Methodology

Authors: H. Aksas, S. Boughrara, K. Louhab

Abstract:

The goal of this study was the use of life cycle assessment (LCA) methodology to assess the environmental impact of pharmaceutical product (four kinds of syrup form and tree kinds of pomade form), which are produced in one leader manufactory in Algeria town that is SAIDAL Company. The impacts generated have evaluated using SimpaPro7.1 with CML92 Method for syrup form and EPD 2007 for pomade form. All impacts evaluated have compared between them, with determination of the compound contributing to each impacts in each case. Data needed to conduct Life Cycle Inventory (LCI) came from this factory, by the collection of theoretical data near the responsible technicians and engineers of the company, the practical data are resulting from the assay of pharmaceutical liquid, obtained at the laboratories of the university. This data represent different raw material imported from European and Asian country necessarily to formulate the drug. Energy used is coming from Algerian resource for the input. Outputs are the result of effluent analysis of this factory with different form (liquid, solid and gas form). All this data (input and output) represent the ecobalance.

Keywords: pharmaceutical product, drug residues, LCA methodology, environmental impacts

Procedia PDF Downloads 246
23981 Multi Cloud Storage Systems for Resource Constrained Mobile Devices: Comparison and Analysis

Authors: Rajeev Kumar Bedi, Jaswinder Singh, Sunil Kumar Gupta

Abstract:

Cloud storage is a model of online data storage where data is stored in virtualized pool of servers hosted by third parties (CSPs) and located in different geographical locations. Cloud storage revolutionized the way how users access their data online anywhere, anytime and using any device as a tablet, mobile, laptop, etc. A lot of issues as vendor lock-in, frequent service outage, data loss and performance related issues exist in single cloud storage systems. So to evade these issues, the concept of multi cloud storage introduced. There are a lot of multi cloud storage systems exists in the market for mobile devices. In this article, we are providing comparison of four multi cloud storage systems for mobile devices Otixo, Unclouded, Cloud Fuze, and Clouds and evaluate their performance on the basis of CPU usage, battery consumption, time consumption and data usage parameters on three mobile phones Nexus 5, Moto G and Nexus 7 tablet and using Wi-Fi network. Finally, open research challenges and future scope are discussed.

Keywords: cloud storage, multi cloud storage, vendor lock-in, mobile devices, mobile cloud computing

Procedia PDF Downloads 407
23980 Preparation of Wireless Networks and Security; Challenges in Efficient Accession of Encrypted Data in Healthcare

Authors: M. Zayoud, S. Oueida, S. Ionescu, P. AbiChar

Abstract:

Background: Wireless sensor network is encompassed of diversified tools of information technology, which is widely applied in a range of domains, including military surveillance, weather forecasting, and earthquake forecasting. Strengthened grounds are always developed for wireless sensor networks, which usually emerges security issues during professional application. Thus, essential technological tools are necessary to be assessed for secure aggregation of data. Moreover, such practices have to be incorporated in the healthcare practices that shall be serving in the best of the mutual interest Objective: Aggregation of encrypted data has been assessed through homomorphic stream cipher to assure its effectiveness along with providing the optimum solutions to the field of healthcare. Methods: An experimental design has been incorporated, which utilized newly developed cipher along with CPU-constrained devices. Modular additions have also been employed to evaluate the nature of aggregated data. The processes of homomorphic stream cipher have been highlighted through different sensors and modular additions. Results: Homomorphic stream cipher has been recognized as simple and secure process, which has allowed efficient aggregation of encrypted data. In addition, the application has led its way to the improvisation of the healthcare practices. Statistical values can be easily computed through the aggregation on the basis of selected cipher. Sensed data in accordance with variance, mean, and standard deviation has also been computed through the selected tool. Conclusion: It can be concluded that homomorphic stream cipher can be an ideal tool for appropriate aggregation of data. Alongside, it shall also provide the best solutions to the healthcare sector.

Keywords: aggregation, cipher, homomorphic stream, encryption

Procedia PDF Downloads 260
23979 Design-Based Elements to Sustain Participant Activity in Massive Open Online Courses: A Case Study

Authors: C. Zimmermann, E. Lackner, M. Ebner

Abstract:

Massive Open Online Courses (MOOCs) are increasingly popular learning hubs that are boasting considerable participant numbers, innovative technical features, and a multitude of instructional resources. Still, there is a high level of evidence showing that almost all MOOCs suffer from a declining frequency of participant activity and fairly low completion rates. In this paper, we would like to share the lessons learned in implementing several design patterns that have been suggested in order to foster participant activity. Our conclusions are based on experiences with the ‘Dr. Internet’ MOOC, which was created as an xMOOC to raise awareness for a more critical approach to online health information: participants had to diagnose medical case studies. There is a growing body of recommendations (based on Learning Analytics results from earlier xMOOCs) as to how the decline in participant activity can be alleviated. One promising focus in this regard is instructional design patterns, since they have a tremendous influence on the learner’s motivation, which in turn is a crucial trigger of learning processes. Since Medieval Age storytelling, micro-learning units and specific comprehensible, narrative structures were chosen to animate the audience to follow narration. Hence, MOOC participants are not likely to abandon a course or information channel when their curiosity is kept at a continuously high level. Critical aspects that warrant consideration in this regard include shorter course duration, a narrative structure with suspense peaks (according to the ‘storytelling’ approach), and a course schedule that is diversified and stimulating, yet easy to follow. All of these criteria have been observed within the design of the Dr. Internet MOOC: 1) the standard eight week course duration was shortened down to six weeks, 2) all six case studies had a special quiz format and a corresponding resolution video which was made available in the subsequent week, 3) two out of six case studies were split up in serial video sequences to be presented over the span of two weeks, and 4) the videos were generally scheduled in a less predictable sequence. However, the statistical results from the first run of the MOOC do not indicate any strong influences on the retention rate, so we conclude with some suggestions as to why this might be and what aspects need further consideration.

Keywords: case study, Dr. internet, experience, MOOCs, design patterns

Procedia PDF Downloads 266
23978 The Relationship between Emotional Intelligence and Leadership Performance

Authors: Omar Al Ali

Abstract:

The current study was aimed to explore the relationships between emotional intelligence, cognitive ability, and leader's performance. Data were collected from 260 senior managers from UAE. The results showed that there are significant relationships between emotional intelligence and leadership performance as measured by the annual internal evaluations of each participant (r = .42, p < .01). Data from regression analysis revealed that both variables namely emotional intelligence (beta = .31, p < .01), and cognitive ability (beta = .29, p < .01), predicted leadership competencies, and together explained 26% of its variance. Data suggests that EI and cognitive ability are significantly correlated with leadership performance. In depth implications of the present findings for human resource development theory and practice are discussed.

Keywords: emotional intelligence, cognitive ability, leadership, performance

Procedia PDF Downloads 477
23977 Comparison of Irradiance Decomposition and Energy Production Methods in a Solar Photovoltaic System

Authors: Tisciane Perpetuo e Oliveira, Dante Inga Narvaez, Marcelo Gradella Villalva

Abstract:

Installations of solar photovoltaic systems have increased considerably in the last decade. Therefore, it has been noticed that monitoring of meteorological data (solar irradiance, air temperature, wind velocity, etc.) is important to predict the potential of a given geographical area in solar energy production. In this sense, the present work compares two computational tools that are capable of estimating the energy generation of a photovoltaic system through correlation analyzes of solar radiation data: PVsyst software and an algorithm based on the PVlib package implemented in MATLAB. In order to achieve the objective, it was necessary to obtain solar radiation data (measured and from a solarimetric database), analyze the decomposition of global solar irradiance in direct normal and horizontal diffuse components, as well as analyze the modeling of the devices of a photovoltaic system (solar modules and inverters) for energy production calculations. Simulated results were compared with experimental data in order to evaluate the performance of the studied methods. Errors in estimation of energy production were less than 30% for the MATLAB algorithm and less than 20% for the PVsyst software.

Keywords: energy production, meteorological data, irradiance decomposition, solar photovoltaic system

Procedia PDF Downloads 142
23976 Social Media Data Analysis for Personality Modelling and Learning Styles Prediction Using Educational Data Mining

Authors: Srushti Patil, Preethi Baligar, Gopalkrishna Joshi, Gururaj N. Bhadri

Abstract:

In designing learning environments, the instructional strategies can be tailored to suit the learning style of an individual to ensure effective learning. In this study, the information shared on social media like Facebook is being used to predict learning style of a learner. Previous research studies have shown that Facebook data can be used to predict user personality. Users with a particular personality exhibit an inherent pattern in their digital footprint on Facebook. The proposed work aims to correlate the user's’ personality, predicted from Facebook data to the learning styles, predicted through questionnaires. For Millennial learners, Facebook has become a primary means for information sharing and interaction with peers. Thus, it can serve as a rich bed for research and direct the design of learning environments. The authors have conducted this study in an undergraduate freshman engineering course. Data from 320 freshmen Facebook users was collected. The same users also participated in the learning style and personality prediction survey. The Kolb’s Learning style questionnaires and Big 5 personality Inventory were adopted for the survey. The users have agreed to participate in this research and have signed individual consent forms. A specific page was created on Facebook to collect user data like personal details, status updates, comments, demographic characteristics and egocentric network parameters. This data was captured by an application created using Python program. The data captured from Facebook was subjected to text analysis process using the Linguistic Inquiry and Word Count dictionary. An analysis of the data collected from the questionnaires performed reveals individual student personality and learning style. The results obtained from analysis of Facebook, learning style and personality data were then fed into an automatic classifier that was trained by using the data mining techniques like Rule-based classifiers and Decision trees. This helps to predict the user personality and learning styles by analysing the common patterns. Rule-based classifiers applied for text analysis helps to categorize Facebook data into positive, negative and neutral. There were totally two models trained, one to predict the personality from Facebook data; another one to predict the learning styles from the personalities. The results show that the classifier model has high accuracy which makes the proposed method to be a reliable one for predicting the user personality and learning styles.

Keywords: educational data mining, Facebook, learning styles, personality traits

Procedia PDF Downloads 231
23975 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity

Authors: Shaan Khosla, Jon Krohn

Abstract:

In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies for a candidate, and compile a set of similar candidates for any given candidate. While useful to create these models, validating their accuracy in a recommendation context is difficult due to a sparsity of data. In this report, we use network graph data to generate useful representations for candidates and vacancies. We use candidates and vacancies as network nodes and designate a bi-directional link between them based on the candidate interviewing for the vacancy. After using node2vec, the embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.

Keywords: AI, machine learning, NLP, recruiting

Procedia PDF Downloads 84
23974 A Web Service-Based Framework for Mining E-Learning Data

Authors: Felermino D. M. A. Ali, S. C. Ng

Abstract:

E-learning is an evolutionary form of distance learning and has become better over time as new technologies emerged. Today, efforts are still being made to embrace E-learning systems with emerging technologies in order to make them better. Among these advancements, Educational Data Mining (EDM) is one that is gaining a huge and increasing popularity due to its wide application for improving the teaching-learning process in online practices. However, even though EDM promises to bring many benefits to educational industry in general and E-learning environments in particular, its principal drawback is the lack of easy to use tools. The current EDM tools usually require users to have some additional technical expertise to effectively perform EDM tasks. Thus, in response to these limitations, this study intends to design and implement an EDM application framework which aims at automating and simplify the development of EDM in E-learning environment. The application framework introduces a Service-Oriented Architecture (SOA) that hides the complexity of technical details and enables users to perform EDM in an automated fashion. The framework was designed based on abstraction, extensibility, and interoperability principles. The framework implementation was made up of three major modules. The first module provides an abstraction for data gathering, which was done by extending Moodle LMS (Learning Management System) source code. The second module provides data mining methods and techniques as services; it was done by converting Weka API into a set of Web services. The third module acts as an intermediary between the first two modules, it contains a user-friendly interface that allows dynamically locating data provider services, and running knowledge discovery tasks on data mining services. An experiment was conducted to evaluate the overhead of the proposed framework through a combination of simulation and implementation. The experiments have shown that the overhead introduced by the SOA mechanism is relatively small, therefore, it has been concluded that a service-oriented architecture can be effectively used to facilitate educational data mining in E-learning environments.

Keywords: educational data mining, e-learning, distributed data mining, moodle, service-oriented architecture, Weka

Procedia PDF Downloads 236
23973 Mathematics Bridging Theory and Applications for a Data-Driven World

Authors: Zahid Ullah, Atlas Khan

Abstract:

In today's data-driven world, the role of mathematics in bridging the gap between theory and applications is becoming increasingly vital. This abstract highlights the significance of mathematics as a powerful tool for analyzing, interpreting, and extracting meaningful insights from vast amounts of data. By integrating mathematical principles with real-world applications, researchers can unlock the full potential of data-driven decision-making processes. This abstract delves into the various ways mathematics acts as a bridge connecting theoretical frameworks to practical applications. It explores the utilization of mathematical models, algorithms, and statistical techniques to uncover hidden patterns, trends, and correlations within complex datasets. Furthermore, it investigates the role of mathematics in enhancing predictive modeling, optimization, and risk assessment methodologies for improved decision-making in diverse fields such as finance, healthcare, engineering, and social sciences. The abstract also emphasizes the need for interdisciplinary collaboration between mathematicians, statisticians, computer scientists, and domain experts to tackle the challenges posed by the data-driven landscape. By fostering synergies between these disciplines, novel approaches can be developed to address complex problems and make data-driven insights accessible and actionable. Moreover, this abstract underscores the importance of robust mathematical foundations for ensuring the reliability and validity of data analysis. Rigorous mathematical frameworks not only provide a solid basis for understanding and interpreting results but also contribute to the development of innovative methodologies and techniques. In summary, this abstract advocates for the pivotal role of mathematics in bridging theory and applications in a data-driven world. By harnessing mathematical principles, researchers can unlock the transformative potential of data analysis, paving the way for evidence-based decision-making, optimized processes, and innovative solutions to the challenges of our rapidly evolving society.

Keywords: mathematics, bridging theory and applications, data-driven world, mathematical models

Procedia PDF Downloads 75
23972 AI-Enabled Smart Contracts for Reliable Traceability in the Industry 4.0

Authors: Harris Niavis, Dimitra Politaki

Abstract:

The manufacturing industry was collecting vast amounts of data for monitoring product quality thanks to the advances in the ICT sector and dedicated IoT infrastructure is deployed to track and trace the production line. However, industries have not yet managed to unleash the full potential of these data due to defective data collection methods and untrusted data storage and sharing. Blockchain is gaining increasing ground as a key technology enabler for Industry 4.0 and the smart manufacturing domain, as it enables the secure storage and exchange of data between stakeholders. On the other hand, AI techniques are more and more used to detect anomalies in batch and time-series data that enable the identification of unusual behaviors. The proposed scheme is based on smart contracts to enable automation and transparency in the data exchange, coupled with anomaly detection algorithms to enable reliable data ingestion in the system. Before sensor measurements are fed to the blockchain component and the smart contracts, the anomaly detection mechanism uniquely combines artificial intelligence models to effectively detect unusual values such as outliers and extreme deviations in data coming from them. Specifically, Autoregressive integrated moving average, Long short-term memory (LSTM) and Dense-based autoencoders, as well as Generative adversarial networks (GAN) models, are used to detect both point and collective anomalies. Towards the goal of preserving the privacy of industries' information, the smart contracts employ techniques to ensure that only anonymized pointers to the actual data are stored on the ledger while sensitive information remains off-chain. In the same spirit, blockchain technology guarantees the security of the data storage through strong cryptography as well as the integrity of the data through the decentralization of the network and the execution of the smart contracts by the majority of the blockchain network actors. The blockchain component of the Data Traceability Software is based on the Hyperledger Fabric framework, which lays the ground for the deployment of smart contracts and APIs to expose the functionality to the end-users. The results of this work demonstrate that such a system can increase the quality of the end-products and the trustworthiness of the monitoring process in the smart manufacturing domain. The proposed AI-enabled data traceability software can be employed by industries to accurately trace and verify records about quality through the entire production chain and take advantage of the multitude of monitoring records in their databases.

Keywords: blockchain, data quality, industry4.0, product quality

Procedia PDF Downloads 189
23971 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction

Authors: Qais M. Yousef, Yasmeen A. Alshaer

Abstract:

Over the last few years, the amount of data available on the globe has been increased rapidly. This came up with the emergence of recent concepts, such as the big data and the Internet of Things, which have furnished a suitable solution for the availability of data all over the world. However, managing this massive amount of data remains a challenge due to their large verity of types and distribution. Therefore, locating the required file particularly from the first trial turned to be a not easy task, due to the large similarities of names for different files distributed on the web. Consequently, the accuracy and speed of search have been negatively affected. This work presents a method using Electroencephalography signals to locate the files based on their contents. Giving the concept of natural mind waves processing, this work analyses the mind wave signals of different people, analyzing them and extracting their most appropriate features using multi-objective metaheuristic algorithm, and then classifying them using artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find the files based on their contents using human thoughts only. Implementing this approach and testing it on real people proved its ability to find the desired files accurately within noticeably shorter time and retrieve them as a first choice for the user.

Keywords: artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization

Procedia PDF Downloads 175
23970 Searchable Encryption in Cloud Storage

Authors: Ren Junn Hwang, Chung-Chien Lu, Jain-Shing Wu

Abstract:

Cloud outsource storage is one of important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving the target file from the encrypted files exactly is difficult for cloud server. This study proposes a protocol for performing multikeyword searches for encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong alphabet does not exceed a given threshold; the user still can retrieve the target files from cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.

Keywords: fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption

Procedia PDF Downloads 383
23969 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model

Authors: Fatemah A. Alqallaf, Debasis Kundu

Abstract:

The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions, and they have non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and it involves solving a four-dimensional optimization problem. To avoid that, we have proposed to use an EM algorithm, and it involves solving only one non-linear equation at each `E'-step. Hence, the implementation of the proposed EM algorithm is very straight forward in practice. Extensive simulation experiments and the analysis of one data set have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data. One data set has been analyzed to show the effectiveness of the proposed model.

Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators

Procedia PDF Downloads 143
23968 Blind Data Hiding Technique Using Interpolation of Subsampled Images

Authors: Singara Singh Kasana, Pankaj Garg

Abstract:

In this paper, a blind data hiding technique based on interpolation of sub sampled versions of a cover image is proposed. Sub sampled image is taken as a reference image and an interpolated image is generated from this reference image. Then difference between original cover image and interpolated image is used to embed secret data. Comparisons with the existing interpolation based techniques show that proposed technique provides higher embedding capacity and better visual quality marked images. Moreover, the performance of the proposed technique is more stable for different images.

Keywords: interpolation, image subsampling, PSNR, SIM

Procedia PDF Downloads 578
23967 Active Contours for Image Segmentation Based on Complex Domain Approach

Authors: Sajid Hussain

Abstract:

The complex domain approach for image segmentation based on active contour has been designed, which deforms step by step to partition an image into numerous expedient regions. A novel region-based trigonometric complex pressure force function is proposed, which propagates around the region of interest using image forces. The signed trigonometric force function controls the propagation of the active contour and the active contour stops on the exact edges of the object accurately. The proposed model makes the level set function binary and uses Gaussian smoothing kernel to adjust and escape the re-initialization procedure. The working principle of the proposed model is as follows: The real image data is transformed into complex data by iota (i) times of image data and the average iota (i) times of horizontal and vertical components of the gradient of image data is inserted in the proposed model to catch complex gradient of the image data. A simple finite difference mathematical technique has been used to implement the proposed model. The efficiency and robustness of the proposed model have been verified and compared with other state-of-the-art models.

Keywords: image segmentation, active contour, level set, Mumford and Shah model

Procedia PDF Downloads 114
23966 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules, which can be applied to classification trees. In this research, we used online social network dataset and all of its attributes (e.g., Node features, labels, etc.) to determine what constitutes an above average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we utilize overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them into different groups based on their properties and similarities. In data mining, recursive partitioning is being utilized to probe the structure of a data set, which allow us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions, with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the result classification methods and utilized decision trees (with k-fold cross validation to prune the tree). The method performed on this dataset is decision trees. Decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 157
23965 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network

Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu

Abstract:

A database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning from this data the models that can predict future traffic speed would be beneficial for the applications such as the car navigation system, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as predictive models is that it does not require any explicit model building. Instead, k-NN takes a long time to make a prediction because it needs to search for the k-nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched for without a significant sacrifice of prediction accuracy. The rationale behind this is that we had a better look at only the recent data because the traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features which are the current and past traffic speeds of the target link and the neighbor links in its up/down-stream. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.

Keywords: big data, k-NN, machine learning, traffic speed prediction

Procedia PDF Downloads 363
23964 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University

Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang

Abstract:

Classification is one of data mining techniques that aims to discover a model from training data that distinguishes records into the appropriate category or class. Data mining classification methods can be applied in education, for example, to determine the classification of non-active students in Indonesia Open University. This paper presents a comparison of three methods of classification: Naïve Bayes, Bagging, and C.45. The criteria used to evaluate the performance of three methods of classification are stratified cross-validation, confusion matrix, the value of the area under the ROC Curve (AUC), Recall, Precision, and F-measure. The data used for this paper are from the non-active Indonesia Open University students in registration period of 2004.1 to 2012.2. Target analysis requires that non-active students were divided into 3 groups: C1, C2, and C3. Data analyzed are as many as 4173 students. Results of the study show: (1) Bagging method gave a high degree of classification accuracy than Naïve Bayes and C.45, (2) the Bagging classification accuracy rate is 82.99 %, while the Naïve Bayes and C.45 are 80.04 % and 82.74 % respectively, (3) the result of Bagging classification tree method has a large number of nodes, so it is quite difficult in decision making, (4) classification of non-active Indonesia Open University student characteristics uses algorithms C.45, (5) based on the algorithm C.45, there are 5 interesting rules which can describe the characteristics of non-active Indonesia Open University students.

Keywords: comparative analysis, data mining, clasiffication, Bagging, Naïve Bayes, C.45, non-active students, Indonesia Open University

Procedia PDF Downloads 315
23963 A Study of the Adaptive Reuse for School Land Use Strategy: An Application of the Analytic Network Process and Big Data

Authors: Wann-Ming Wey

Abstract:

In today's popularity and progress of information technology, the big data set and its analysis are no longer a major conundrum. Now, we could not only use the relevant big data to analysis and emulate the possible status of urban development in the near future, but also provide more comprehensive and reasonable policy implementation basis for government units or decision-makers via the analysis and emulation results as mentioned above. In this research, we set Taipei City as the research scope, and use the relevant big data variables (e.g., population, facility utilization and related social policy ratings) and Analytic Network Process (ANP) approach to implement in-depth research and discussion for the possible reduction of land use in primary and secondary schools of Taipei City. In addition to enhance the prosperous urban activities for the urban public facility utilization, the final results of this research could help improve the efficiency of urban land use in the future. Furthermore, the assessment model and research framework established in this research also provide a good reference for schools or other public facilities land use and adaptive reuse strategies in the future.

Keywords: adaptive reuse, analytic network process, big data, land use strategy

Procedia PDF Downloads 203
23962 Interoperability Standard for Data Exchange in Educational Documents in Professional and Technological Education: A Comparative Study and Feasibility Analysis for the Brazilian Context

Authors: Giovana Nunes Inocêncio

Abstract:

The professional and technological education (EPT) plays a pivotal role in equipping students for specialized careers, and it is imperative to establish a framework for efficient data exchange among educational institutions. The primary focus of this article is to address the pressing need for document interoperability within the context of EPT. The challenges, motivations, and benefits of implementing interoperability standards for digital educational documents are thoroughly explored. These documents include EPT completion certificates, academic records, and curricula. In conjunction with the prior abstract, it is evident that the intersection of IT governance and interoperability standards holds the key to transforming the landscape of technical education in Brazil. IT governance provides the strategic framework for effective data management, aligning with educational objectives, ensuring compliance, and managing risks. By adopting interoperability standards, the technical education sector in Brazil can facilitate data exchange, enhance data security, and promote international recognition of qualifications. The utilization of the XML (Extensible Markup Language) standard further strengthens the foundation for structured data exchange, fostering efficient communication, standardization of curricula, and enhancing educational materials. The IT governance, interoperability standards, and data management critical role in driving the quality, efficiency, and security of technical education. The adoption of these standards fosters transparency, stakeholder coordination, and regulatory compliance, ultimately empowering the technical education sector to meet the dynamic demands of the 21st century.

Keywords: interoperability, education, standards, governance

Procedia PDF Downloads 70
23961 Need for Privacy in the Technological Era: An Analysis in the Indian Perspective

Authors: Amrashaa Singh

Abstract:

In the digital age and the large cyberspace, Data Protection and Privacy have become major issues in this technological era. There was a time when social media and online shopping websites were treated as a blessing for the people. But now the tables have turned, and the people have started to look at them with suspicion. They are getting aware of the privacy implications, and they do not feel as safe as they used to initially. When Edward Snowden informed the world about the snooping United States Security Agencies had been doing, that is when the picture became clear for the people. After the Cambridge Analytica case where the data of Facebook users were stored without their consent, the doubts arose in the minds of people about how safe they actually are. In India, the case of spyware Pegasus also raised a lot of concerns. It was used to snoop on a lot of human right activists and lawyers and the company which invented the spyware claims that it only sells it to the government. The paper will be dealing with the privacy concerns in the Indian perspective with an analytical methodology. The Supreme Court here had recently declared a right to privacy a Fundamental Right under Article 21 of the Constitution of India. Further, the Government is also working on the Data Protection Bill. The point to note is that India is still a developing country, and with the bill, the government aims at data localization. But there are doubts in the minds of many people that the Government would actually be snooping on the data of the individuals. It looks more like an attempt to curb dissenters ‘lawfully’. The focus of the paper would be on these issues in India in light of the European Union (EU) General Data Protection Regulation (GDPR). The Indian Data Protection Bill is also said to be loosely based on EU GDPR. But how helpful would these laws actually be is another concern since the economic and social conditions in both countries are very different? The paper aims at discussing these concerns, how good or bad is the intention of the government behind the bill, and how the nations can act together and draft common regulations so that there is some uniformity in the laws and their application.

Keywords: Article 21, data protection, dissent, fundamental right, India, privacy

Procedia PDF Downloads 114
23960 An Online 3D Modeling Method Based on a Lossless Compression Algorithm

Authors: Jiankang Wang, Hongyang Yu

Abstract:

This paper proposes a portable online 3D modeling method. The method first utilizes a depth camera to collect data and compresses the depth data using a frame-by-frame lossless data compression method. The color image is encoded using the H.264 encoding format. After the cloud obtains the color image and depth image, a 3D modeling method based on bundlefusion is used to complete the 3D modeling. The results of this study indicate that this method has the characteristics of portability, online, and high efficiency and has a wide range of application prospects.

Keywords: 3D reconstruction, bundlefusion, lossless compression, depth image

Procedia PDF Downloads 82
23959 H∞ Sampled-Data Control for Linear Systems Time-Varying Delays: Application to Power System

Authors: Chang-Ho Lee, Seung-Hoon Lee, Myeong-Jin Park, Oh-Min Kwon

Abstract:

This paper investigates improved stability criteria for sampled-data control of linear systems with disturbances and time-varying delays. Based on Lyapunov-Krasovskii stability theory, delay-dependent conditions sufficient to ensure H∞ stability for the system are derived in the form of linear matrix inequalities(LMI). The effectiveness of the proposed method will be shown in numerical examples.

Keywords: sampled-data control system, Lyapunov-Krasovskii functional, time delay-dependent, LMI, H∞ control

Procedia PDF Downloads 320
23958 Logistics Information Systems in the Distribution of Flour in Nigeria

Authors: Cornelius Femi Popoola

Abstract:

This study investigated logistics information systems in the distribution of flour in Nigeria. A case study design was used and 50 staff of Honeywell Flour Mill was sampled for the study. Data generated through a questionnaire were analysed using correlation and regression analysis. The findings of the study revealed that logistic information systems such as e-commerce, interactive telephone systems and electronic data interchange positively correlated with the distribution of flour in Honeywell Flour Mill. Finding also deduced that e-commerce, interactive telephone systems and electronic data interchange jointly and positively contribute to the distribution of flour in Honeywell Flour Mill in Nigeria (R = .935; Adj. R2 = .642; F (3,47) = 14.739; p < .05). The study therefore recommended that Honeywell Flour Mill should upgrade their logistic information systems to computer-to-computer communication of business transactions and documents, as well adopt new technology such as, tracking-and-tracing systems (barcode scanning for packages and palettes), tracking vehicles with Global Positioning System (GPS), measuring vehicle performance with ‘black boxes’ (containing logistic data), and Automatic Equipment Identification (AEI) into their systems.

Keywords: e-commerce, electronic data interchange, flour distribution, information system, interactive telephone systems

Procedia PDF Downloads 553
23957 Cascaded Neural Network for Internal Temperature Forecasting in Induction Motor

Authors: Hidir S. Nogay

Abstract:

In this study, two systems were created to predict interior temperature in induction motor. One of them consisted of a simple ANN model which has two layers, ten input parameters and one output parameter. The other one consisted of eight ANN models connected each other as cascaded. Cascaded ANN system has 17 inputs. Main reason of cascaded system being used in this study is to accomplish more accurate estimation by increasing inputs in the ANN system. Cascaded ANN system is compared with simple conventional ANN model to prove mentioned advantages. Dataset was obtained from experimental applications. Small part of the dataset was used to obtain more understandable graphs. Number of data is 329. 30% of the data was used for testing and validation. Test data and validation data were determined for each ANN model separately and reliability of each model was tested. As a result of this study, it has been understood that the cascaded ANN system produced more accurate estimates than conventional ANN model.

Keywords: cascaded neural network, internal temperature, inverter, three-phase induction motor

Procedia PDF Downloads 345
23956 Big Data and Health: An Australian Perspective Which Highlights the Importance of Data Linkage to Support Health Research at a National Level

Authors: James Semmens, James Boyd, Anna Ferrante, Katrina Spilsbury, Sean Randall, Adrian Brown

Abstract:

‘Big data’ is a relatively new concept that describes data so large and complex that it exceeds the storage or computing capacity of most systems to perform timely and accurate analyses. Health services generate large amounts of data from a wide variety of sources such as administrative records, electronic health records, health insurance claims, and even smart phone health applications. Health data is viewed in Australia and internationally as highly sensitive. Strict ethical requirements must be met for the use of health data to support health research. These requirements differ markedly from those imposed on data use from industry or other government sectors and may have the impact of reducing the capacity of health data to be incorporated into the real time demands of the Big Data environment. This ‘big data revolution’ is increasingly supported by national governments, who have invested significant funds into initiatives designed to develop and capitalize on big data and methods for data integration using record linkage. The benefits to health following research using linked administrative data are recognised internationally and by the Australian Government through the National Collaborative Research Infrastructure Strategy Roadmap, which outlined a multi-million dollar investment strategy to develop national record linkage capabilities. This led to the establishment of the Population Health Research Network (PHRN) to coordinate and champion this initiative. The purpose of the PHRN was to establish record linkage units in all Australian states, to support the implementation of secure data delivery and remote access laboratories for researchers, and to develop the Centre for Data Linkage for the linkage of national and cross-jurisdictional data. The Centre for Data Linkage has been established within Curtin University in Western Australia; it provides essential record linkage infrastructure necessary for large-scale, cross-jurisdictional linkage of health related data in Australia and uses a best practice ‘separation principle’ to support data privacy and security. Privacy preserving record linkage technology is also being developed to link records without the use of names to overcome important legal and privacy constraint. This paper will present the findings of the first ‘Proof of Concept’ project selected to demonstrate the effectiveness of increased record linkage capacity in supporting nationally significant health research. This project explored how cross-jurisdictional linkage can inform the nature and extent of cross-border hospital use and hospital-related deaths. The technical challenges associated with national record linkage, and the extent of cross-border population movements, were explored as part of this pioneering research project. Access to person-level data linked across jurisdictions identified geographical hot spots of cross border hospital use and hospital-related deaths in Australia. This has implications for planning of health service delivery and for longitudinal follow-up studies, particularly those involving mobile populations.

Keywords: data integration, data linkage, health planning, health services research

Procedia PDF Downloads 216
23955 Spatial Variability of Brahmaputra River Flow Characteristics

Authors: Hemant Kumar

Abstract:

Brahmaputra River is known according to the Hindu mythology the son of the Lord Brahma. According to this name, the river Brahmaputra creates mass destruction during the monsoon season in Assam, India. It is a state situated in North-East part of India. This is one of the essential states out of the seven countries of eastern India, where almost all entire Brahmaputra flow carried out. The other states carry their tributaries. In the present case study, the spatial analysis performed in this specific case the number of MODIS data are acquired. In the method of detecting the change, the spray content was found during heavy rainfall and in the flooded monsoon season. By this method, particularly the analysis over the Brahmaputra outflow determines the flooded season. The charged particle-associated in aerosol content genuinely verifies the heavy water content below the ground surface, which is validated by trend analysis through rainfall spectrum data. This is confirmed by in-situ sampled view data from a different position of Brahmaputra River. Further, a Hyperion Hyperspectral 30 m resolution data were used to scan the sediment deposits, which is also confirmed by in-situ sampled view data from a different position.

Keywords: aerosol, change detection, spatial analysis, trend analysis

Procedia PDF Downloads 147
23954 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death for most African countries. Ethiopia is one of the seriously affected countries in sub Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become chronic, but manageable disease. The study focused on a data mining technique to predict future living status of HIV/AIDS patients at the time of drug regimen change when the patients become toxic to the currently taking ART drug combination. The data is taken from University of Gondar Hospital ART program database. Hybrid methodology is followed to explore the application of data mining on ART program dataset. Data cleaning, handling missing values and data transformation were used for preprocessing the data. WEKA 3.7.9 data mining tools, classification algorithms, and expertise are utilized as means to address the research problem. By using four different classification algorithms, (i.e., J48 Classifier, PART rule induction, Naïve Bayes and Neural network) and by adjusting their parameters thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performances of the models were evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model to predict the status of HIV patients with drug regimen substitution is pruned J48 decision tree with a classification accuracy of 98.01%. This study extracts interesting attributes such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender so as to predict the status of drug regimen substitution. The outcome of this study can be used as an assistant tool for the clinician to help them make more appropriate drug regimen substitution. Future research directions are forwarded to come up with an applicable system in the area of the study.

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 142