Search results for: data intensive
24236 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity
Authors: Shaan Khosla, Jon Krohn
Abstract:
In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies to a candidate, and to compile a set of similar candidates for any given candidate. While these models are useful to create, validating their accuracy in a recommendation context is difficult due to data sparsity. In this report, we use network graph data to generate useful representations for candidates and vacancies. We use candidates and vacancies as network nodes and designate a bi-directional link between them when the candidate has interviewed for the vacancy. After applying node2vec, the resulting embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.
Keywords: AI, machine learning, NLP, recruiting
Procedia PDF Downloads 84
24235 A Web Service-Based Framework for Mining E-Learning Data
Authors: Felermino D. M. A. Ali, S. C. Ng
Abstract:
E-learning is an evolutionary form of distance learning and has improved over time as new technologies have emerged. Today, efforts are still being made to integrate emerging technologies into E-learning systems in order to make them better. Among these advancements, Educational Data Mining (EDM) is gaining increasing popularity due to its wide application for improving the teaching-learning process in online practices. However, even though EDM promises to bring many benefits to the educational industry in general and E-learning environments in particular, its principal drawback is the lack of easy-to-use tools. Current EDM tools usually require users to have additional technical expertise to perform EDM tasks effectively. In response to these limitations, this study designs and implements an EDM application framework that aims to automate and simplify the development of EDM in E-learning environments. The application framework introduces a Service-Oriented Architecture (SOA) that hides the complexity of technical details and enables users to perform EDM in an automated fashion. The framework was designed based on the principles of abstraction, extensibility, and interoperability. The framework implementation consists of three major modules. The first module provides an abstraction for data gathering, which was done by extending the Moodle LMS (Learning Management System) source code. The second module provides data mining methods and techniques as services; this was done by converting the Weka API into a set of Web services. The third module acts as an intermediary between the first two modules; it contains a user-friendly interface that allows users to dynamically locate data provider services and run knowledge discovery tasks on data mining services. An experiment was conducted to evaluate the overhead of the proposed framework through a combination of simulation and implementation. 
The experiments have shown that the overhead introduced by the SOA mechanism is relatively small; therefore, it has been concluded that a service-oriented architecture can be effectively used to facilitate educational data mining in E-learning environments.
Keywords: educational data mining, e-learning, distributed data mining, moodle, service-oriented architecture, Weka
Procedia PDF Downloads 236
24234 Mathematics Bridging Theory and Applications for a Data-Driven World
Authors: Zahid Ullah, Atlas Khan
Abstract:
In today's data-driven world, the role of mathematics in bridging the gap between theory and applications is becoming increasingly vital. This abstract highlights the significance of mathematics as a powerful tool for analyzing, interpreting, and extracting meaningful insights from vast amounts of data. By integrating mathematical principles with real-world applications, researchers can unlock the full potential of data-driven decision-making processes. This abstract delves into the various ways mathematics acts as a bridge connecting theoretical frameworks to practical applications. It explores the utilization of mathematical models, algorithms, and statistical techniques to uncover hidden patterns, trends, and correlations within complex datasets. Furthermore, it investigates the role of mathematics in enhancing predictive modeling, optimization, and risk assessment methodologies for improved decision-making in diverse fields such as finance, healthcare, engineering, and social sciences. The abstract also emphasizes the need for interdisciplinary collaboration between mathematicians, statisticians, computer scientists, and domain experts to tackle the challenges posed by the data-driven landscape. By fostering synergies between these disciplines, novel approaches can be developed to address complex problems and make data-driven insights accessible and actionable. Moreover, this abstract underscores the importance of robust mathematical foundations for ensuring the reliability and validity of data analysis. Rigorous mathematical frameworks not only provide a solid basis for understanding and interpreting results but also contribute to the development of innovative methodologies and techniques. In summary, this abstract advocates for the pivotal role of mathematics in bridging theory and applications in a data-driven world. 
By harnessing mathematical principles, researchers can unlock the transformative potential of data analysis, paving the way for evidence-based decision-making, optimized processes, and innovative solutions to the challenges of our rapidly evolving society.
Keywords: mathematics, bridging theory and applications, data-driven world, mathematical models
Procedia PDF Downloads 75
24233 AI-Enabled Smart Contracts for Reliable Traceability in the Industry 4.0
Authors: Harris Niavis, Dimitra Politaki
Abstract:
The manufacturing industry has been collecting vast amounts of data for monitoring product quality thanks to advances in the ICT sector, and dedicated IoT infrastructure has been deployed to track and trace the production line. However, industries have not yet managed to unleash the full potential of these data due to defective data collection methods and untrusted data storage and sharing. Blockchain is gaining increasing ground as a key technology enabler for Industry 4.0 and the smart manufacturing domain, as it enables the secure storage and exchange of data between stakeholders. On the other hand, AI techniques are increasingly used to detect anomalies in batch and time-series data, enabling the identification of unusual behaviors. The proposed scheme is based on smart contracts to enable automation and transparency in the data exchange, coupled with anomaly detection algorithms to enable reliable data ingestion in the system. Before sensor measurements are fed to the blockchain component and the smart contracts, the anomaly detection mechanism uniquely combines artificial intelligence models to effectively detect unusual values, such as outliers and extreme deviations, in the sensor data. Specifically, Autoregressive integrated moving average (ARIMA), Long short-term memory (LSTM) and Dense-based autoencoders, as well as Generative adversarial networks (GAN) models, are used to detect both point and collective anomalies. Towards the goal of preserving the privacy of industries' information, the smart contracts employ techniques to ensure that only anonymized pointers to the actual data are stored on the ledger while sensitive information remains off-chain. In the same spirit, blockchain technology guarantees the security of the data storage through strong cryptography as well as the integrity of the data through the decentralization of the network and the execution of the smart contracts by the majority of the blockchain network actors. 
The blockchain component of the Data Traceability Software is based on the Hyperledger Fabric framework, which lays the ground for the deployment of smart contracts and APIs to expose the functionality to the end-users. The results of this work demonstrate that such a system can increase the quality of the end-products and the trustworthiness of the monitoring process in the smart manufacturing domain. The proposed AI-enabled data traceability software can be employed by industries to accurately trace and verify records about quality through the entire production chain and take advantage of the multitude of monitoring records in their databases.
Keywords: blockchain, data quality, industry4.0, product quality
Procedia PDF Downloads 189
24232 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction
Authors: Qais M. Yousef, Yasmeen A. Alshaer
Abstract:
Over the last few years, the amount of data available worldwide has increased rapidly. This has come with the emergence of recent concepts, such as big data and the Internet of Things, which have furnished a suitable solution for the availability of data all over the world. However, managing this massive amount of data remains a challenge due to the large variety of its types and its distribution. Therefore, locating the required file, particularly on the first attempt, has turned out to be a difficult task, due to the strong similarity of names among different files distributed on the web. Consequently, the accuracy and speed of search have been negatively affected. This work presents a method that uses electroencephalography (EEG) signals to locate files based on their contents. Given the concept of natural mind wave processing, this work analyzes the mind wave signals of different people, extracts their most appropriate features using a multi-objective metaheuristic algorithm, and then classifies them using an artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find files based on their contents using human thoughts only. Implementing this approach and testing it on real people demonstrated its ability to find the desired files accurately within a noticeably shorter time and to retrieve them as the first choice for the user.
Keywords: artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization
Procedia PDF Downloads 175
24231 IoT Based Approach to Healthcare System for a Quadriplegic Patient Using EEG
Authors: R. Gautam, P. Sastha Kanagasabai, G. N. Rathna
Abstract:
The proposed healthcare system enables quadriplegic patients, people with severe motor disabilities, to send commands to electronic devices and monitor their vitals. The growth of Brain-Computer-Interface (BCI) technology has led to rapid development in 'assistive systems' for the disabled called 'assistive domotics'. A Brain-Computer-Interface is capable of reading the brainwaves of an individual and analysing them to obtain meaningful data. This processed data can be used to help people with speech disorders, and sometimes people with limited locomotion, to communicate. In this project, an Emotiv EPOC headset is used to obtain the electroencephalogram (EEG). The obtained data is processed to communicate pre-defined commands over the internet to the desired mobile phone user. Other vital information, such as heartbeat, blood pressure, ECG and body temperature, is monitored and uploaded to the server. Data analytics enables physicians to scan databases for a specific illness. The data is processed on an Intel Edison system on chip (SoC). Patient metrics are displayed via the Intel IoT Analytics cloud service.
Keywords: brain computer interface, Intel Edison, Emotiv EPOC, IoT analytics, electroencephalogram
Procedia PDF Downloads 186
24230 Searchable Encryption in Cloud Storage
Authors: Ren Junn Hwang, Chung-Chien Lu, Jain-Shing Wu
Abstract:
Cloud outsourced storage is one of the important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving exactly the target file from among the encrypted files is difficult for the cloud server. This study proposes a protocol for performing multi-keyword searches over encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning the search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong letters does not exceed a given threshold, the user can still retrieve the target files from the cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.
Keywords: fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption
Procedia PDF Downloads 383
24229 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model
Authors: Fatemah A. Alqallaf, Debasis Kundu
Abstract:
The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions and non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and computing them involves solving a four-dimensional optimization problem. To avoid that, we have proposed to use an EM algorithm, which involves solving only one non-linear equation at each 'E'-step. Hence, the implementation of the proposed EM algorithm is very straightforward in practice. Extensive simulation experiments and the analysis of one data set have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data. One data set has been analyzed to show the effectiveness of the proposed model.
Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators
Procedia PDF Downloads 143
24228 Blind Data Hiding Technique Using Interpolation of Subsampled Images
Authors: Singara Singh Kasana, Pankaj Garg
Abstract:
In this paper, a blind data hiding technique based on interpolation of subsampled versions of a cover image is proposed. A subsampled image is taken as the reference image, and an interpolated image is generated from this reference image. Then the difference between the original cover image and the interpolated image is used to embed secret data. Comparisons with existing interpolation-based techniques show that the proposed technique provides higher embedding capacity and marked images of better visual quality. Moreover, the performance of the proposed technique is more stable across different images.
Keywords: interpolation, image subsampling, PSNR, SIM
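The first three steps named in this abstract (subsampling, interpolation, and taking the difference) can be illustrated on a single row of pixels. This is a minimal sketch under assumptions: the factor-of-two subsampling, the neighbour-averaging interpolation, and the pixel values are all hypothetical choices, and the paper's actual embedding rule is not reproduced.

```python
def subsample(row):
    """Keep every second pixel as the reference row (assumed factor of 2)."""
    return row[::2]


def interpolate(reference, length):
    """Rebuild a full-length row: reference pixels stay in place and the
    skipped positions are filled with the mean of the neighbouring
    reference pixels (integer division, edge pixel repeated)."""
    out = []
    for i in range(length):
        left = reference[i // 2]
        if i % 2 == 0:
            out.append(left)
        else:
            right = reference[min(i // 2 + 1, len(reference) - 1)]
            out.append((left + right) // 2)
    return out


# Hypothetical grey-level values, not taken from the paper.
cover_row = [100, 104, 110, 111, 120, 119]
reference = subsample(cover_row)
interpolated = interpolate(reference, len(cover_row))
differences = [c - p for c, p in zip(cover_row, interpolated)]
```

In this sketch the differences at the reference positions are always zero, so a receiver who sees unchanged reference pixels could recompute the same interpolated row without access to the original cover image.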
Procedia PDF Downloads 578
24227 Active Contours for Image Segmentation Based on Complex Domain Approach
Authors: Sajid Hussain
Abstract:
A complex domain approach for image segmentation based on active contours has been designed, in which the contour deforms step by step to partition an image into numerous expedient regions. A novel region-based trigonometric complex pressure force function is proposed, which propagates around the region of interest using image forces. The signed trigonometric force function controls the propagation of the active contour, and the active contour stops accurately on the exact edges of the object. The proposed model makes the level set function binary and uses a Gaussian smoothing kernel to adjust it and avoid the re-initialization procedure. The working principle of the proposed model is as follows: the real image data is transformed into complex data by multiplying the image data by iota (i), and the average of iota (i) times the horizontal and vertical components of the gradient of the image data is inserted into the proposed model to capture the complex gradient of the image data. A simple finite difference technique has been used to implement the proposed model. The efficiency and robustness of the proposed model have been verified and compared with other state-of-the-art models.
Keywords: image segmentation, active contour, level set, Mumford and Shah model
Procedia PDF Downloads 114
24226 Discerning Divergent Nodes in Social Networks
Authors: Mehran Asadi, Afrand Agah
Abstract:
In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules that can be applied to classification trees. In this research, we used an online social network dataset and all of its attributes (e.g., node features, labels, etc.) to determine what constitutes an above-average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we utilize overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them to different groups based on their properties and similarities. In data mining, recursive partitioning is utilized to probe the structure of a data set, which allows us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions, with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the resulting classification methods and utilized decision trees (with k-fold cross validation to prune the tree). The method applied to this dataset is decision trees. 
A decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.
Keywords: online social networks, data mining, social cloud computing, interaction and collaboration
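The correlation check mentioned in this abstract, which looks for predictors that are highly correlated with one another, can be sketched as below. The authors report using R; this Python version is an illustrative stand-in, and the feature values, the `degree` column, and the 0.9 threshold are made up for the example rather than taken from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise correlations between all named predictor columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up predictor values for five nodes (illustration only).
features = {
    "density": [0.10, 0.20, 0.30, 0.40, 0.50],
    "transitivity": [0.12, 0.22, 0.29, 0.41, 0.52],
    "degree": [5.0, 3.0, 6.0, 2.0, 4.0],
}
matrix = correlation_matrix(features)

# Flag each unordered pair once when its absolute correlation is high.
strong = [
    (a, b)
    for a in features
    for b in features
    if a < b and abs(matrix[a][b]) > 0.9
]
```

With these toy values only the density and transitivity pair is flagged, mirroring the kind of finding the abstract describes.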
Procedia PDF Downloads 158
24225 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network
Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu
Abstract:
A database records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning models from this data that can predict future traffic speed would benefit applications such as car navigation systems, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as the predictive model is that it does not require any explicit model building. However, k-NN takes a long time to make a prediction because it needs to search for the k nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched without a significant sacrifice of prediction accuracy. The rationale behind this is that we had better look only at recent data, because traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features, which are the current and past traffic speeds of the target link and of the neighbor links upstream and downstream of it. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.
Keywords: big data, k-NN, machine learning, traffic speed prediction
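The idea of restricting the k-NN search to recent records can be sketched as follows. Everything here is an assumption made for illustration: the two-element feature vectors, the Euclidean distance, the averaging of neighbour speeds, the `window` parameter, and the sample values are not taken from the paper.

```python
from math import dist


def knn_predict(history, query, k, window=None):
    """Predict the next speed as the mean over the k nearest past records.

    history: list of (feature_vector, next_speed) pairs, oldest first.
    window:  if given, only the most recent `window` records are searched
             (assumed to be at least k), which shrinks the search cost.
    """
    if window is None:
        candidates = history
    else:
        candidates = history[max(0, len(history) - window):]
    nearest = sorted(candidates, key=lambda rec: dist(rec[0], query))[:k]
    return sum(speed for _, speed in nearest) / len(nearest)


# Hypothetical records: (current speed, previous speed) -> next speed, in km/h.
history = [
    ((60.0, 58.0), 57.0),
    ((30.0, 28.0), 27.0),
    ((59.0, 57.0), 56.0),
    ((31.0, 29.0), 30.0),
    ((32.0, 30.0), 31.0),
]
query = (31.0, 29.5)
full = knn_predict(history, query, k=2)
recent = knn_predict(history, query, k=2, window=3)
```

In this toy case the windowed search examines three records instead of five and returns the same prediction, which is the kind of accuracy-versus-search-cost trade-off the abstract sets out to measure.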
Procedia PDF Downloads 363
24224 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University
Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang
Abstract:
Classification is one of the data mining techniques that aims to discover a model from training data that assigns records to the appropriate category or class. Data mining classification methods can be applied in education, for example, to determine the classification of non-active students in Indonesia Open University. This paper presents a comparison of three methods of classification: Naïve Bayes, Bagging, and C4.5. The criteria used to evaluate the performance of the three methods of classification are stratified cross-validation, confusion matrix, the value of the area under the ROC Curve (AUC), Recall, Precision, and F-measure. The data used for this paper are from the non-active Indonesia Open University students in the registration periods 2004.1 to 2012.2. For the target analysis, non-active students were divided into 3 groups: C1, C2, and C3. The data analyzed cover 4173 students. Results of the study show: (1) the Bagging method gave a higher degree of classification accuracy than Naïve Bayes and C4.5, (2) the Bagging classification accuracy rate is 82.99 %, while those of Naïve Bayes and C4.5 are 80.04 % and 82.74 % respectively, (3) the result of the Bagging classification tree method has a large number of nodes, so it is quite difficult to use in decision making, (4) the classification of non-active Indonesia Open University student characteristics uses the C4.5 algorithm, (5) based on the C4.5 algorithm, there are 5 interesting rules which can describe the characteristics of non-active Indonesia Open University students.
Keywords: comparative analysis, data mining, classification, Bagging, Naïve Bayes, C4.5, non-active students, Indonesia Open University
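Several of the evaluation criteria listed in this abstract (accuracy, Precision, Recall, and F-measure) follow directly from a confusion matrix. The sketch below shows one common way to compute them per class; the three-class matrix is invented for the example and does not reproduce the study's results.

```python
def per_class_metrics(confusion, labels):
    """Return {label: (precision, recall, f_measure)}.

    confusion[i][j] is the number of records whose true class is
    labels[i] and whose predicted class is labels[j].
    """
    metrics = {}
    size = len(labels)
    for idx, label in enumerate(labels):
        tp = confusion[idx][idx]
        fp = sum(confusion[row][idx] for row in range(size)) - tp
        fn = sum(confusion[idx]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_measure = 2 * precision * recall / (precision + recall)
        else:
            f_measure = 0.0
        metrics[label] = (precision, recall, f_measure)
    return metrics


def accuracy(confusion):
    """Share of records on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Invented counts for the three groups named in the abstract.
labels = ["C1", "C2", "C3"]
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 2, 8],
]
metrics = per_class_metrics(confusion, labels)
overall = accuracy(confusion)
```

Reporting the per-class values alongside overall accuracy helps when two classifiers have similar accuracy rates, as Bagging and C4.5 do in this study.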
Procedia PDF Downloads 316
24223 A Study of the Adaptive Reuse for School Land Use Strategy: An Application of the Analytic Network Process and Big Data
Authors: Wann-Ming Wey
Abstract:
With today's popularity and progress of information technology, big data sets and their analysis are no longer a major conundrum. Now, we can not only use the relevant big data to analyze and emulate the possible status of urban development in the near future, but also provide a more comprehensive and reasonable basis for policy implementation for government units or decision-makers via these analysis and emulation results. In this research, we set Taipei City as the research scope and use the relevant big data variables (e.g., population, facility utilization and related social policy ratings) and the Analytic Network Process (ANP) approach to carry out in-depth research and discussion on the possible reduction of land use in primary and secondary schools of Taipei City. In addition to enhancing urban activities through public facility utilization, the final results of this research could help improve the efficiency of urban land use in the future. Furthermore, the assessment model and research framework established in this research also provide a good reference for land use and adaptive reuse strategies for schools or other public facilities in the future.
Keywords: adaptive reuse, analytic network process, big data, land use strategy
Procedia PDF Downloads 203
24222 Interoperability Standard for Data Exchange in Educational Documents in Professional and Technological Education: A Comparative Study and Feasibility Analysis for the Brazilian Context
Authors: Giovana Nunes Inocêncio
Abstract:
Professional and technological education (EPT) plays a pivotal role in equipping students for specialized careers, and it is imperative to establish a framework for efficient data exchange among educational institutions. The primary focus of this article is to address the pressing need for document interoperability within the context of EPT. The challenges, motivations, and benefits of implementing interoperability standards for digital educational documents are thoroughly explored. These documents include EPT completion certificates, academic records, and curricula. In conjunction with the prior abstract, it is evident that the intersection of IT governance and interoperability standards holds the key to transforming the landscape of technical education in Brazil. IT governance provides the strategic framework for effective data management, aligning with educational objectives, ensuring compliance, and managing risks. By adopting interoperability standards, the technical education sector in Brazil can facilitate data exchange, enhance data security, and promote international recognition of qualifications. The utilization of the XML (Extensible Markup Language) standard further strengthens the foundation for structured data exchange, fostering efficient communication, standardization of curricula, and enhancement of educational materials. IT governance, interoperability standards, and data management play a critical role in driving the quality, efficiency, and security of technical education. The adoption of these standards fosters transparency, stakeholder coordination, and regulatory compliance, ultimately empowering the technical education sector to meet the dynamic demands of the 21st century.
Keywords: interoperability, education, standards, governance
Procedia PDF Downloads 70
24221 Generating Real-Time Visual Summaries from Located Sensor-Based Data with Chorems
Authors: Z. Bouattou, R. Laurini, H. Belbachir
Abstract:
This paper describes a new approach for the automatic generation of visual summaries, dealing with cartographic visualization methods and real-time sensor data modeling. The concept of chorems seems an interesting candidate for visualizing real-time geographic database summaries. Chorems were defined by Roger Brunet (1980) as schematized visual representations of territories. However, time information is not yet handled in existing chorematic map approaches, an issue that is discussed in this paper. Our approach is based on spatial analysis: the values recorded at the same time by the available sensors give a number of distributed observations over the study areas, and spatial interpolation methods are used to find the concentration fields. From these fields, by applying spatial data mining procedures on the fly, it is possible to extract important patterns as geographic rules. Then, those patterns are visualized as chorems.
Keywords: geovisualization, spatial analytics, real-time, geographic data streams, sensors, chorems
Procedia PDF Downloads 401
24220 Need for Privacy in the Technological Era: An Analysis in the Indian Perspective
Authors: Amrashaa Singh
Abstract:
In the digital age, with its vast cyberspace, data protection and privacy have become major issues. There was a time when social media and online shopping websites were treated as a blessing for the people. But now the tables have turned, and people have started to look at them with suspicion. They are becoming aware of the privacy implications, and they do not feel as safe as they initially did. The picture became clear to people when Edward Snowden informed the world about the snooping that United States security agencies had been doing. After the Cambridge Analytica case, where the data of Facebook users were stored without their consent, doubts arose in the minds of people about how safe they actually are. In India, the case of the spyware Pegasus also raised a lot of concerns. It was used to snoop on many human rights activists and lawyers, and the company that developed the spyware claims that it sells it only to governments. The paper deals with privacy concerns from the Indian perspective using an analytical methodology. The Supreme Court of India recently declared the right to privacy a Fundamental Right under Article 21 of the Constitution of India. Further, the Government is also working on the Data Protection Bill. The point to note is that India is still a developing country, and with the bill, the government aims at data localization. But there are doubts in the minds of many people that the Government would actually be snooping on the data of individuals. It looks more like an attempt to curb dissenters ‘lawfully’. The focus of the paper is on these issues in India in light of the European Union (EU) General Data Protection Regulation (GDPR). The Indian Data Protection Bill is also said to be loosely based on the EU GDPR. How helpful these laws would actually be is another concern, since the economic and social conditions in India and the EU are very different. 
The paper aims at discussing these concerns, how good or bad the intention of the government behind the bill is, and how nations can act together and draft common regulations so that there is some uniformity in the laws and their application.
Keywords: Article 21, data protection, dissent, fundamental right, India, privacy
Procedia PDF Downloads 114
24219 An Online 3D Modeling Method Based on a Lossless Compression Algorithm
Authors: Jiankang Wang, Hongyang Yu
Abstract:
This paper proposes a portable online 3D modeling method. The method first utilizes a depth camera to collect data and compresses the depth data using a frame-by-frame lossless data compression method. The color image is encoded using the H.264 encoding format. After the cloud obtains the color image and depth image, a 3D modeling method based on bundlefusion is used to complete the 3D modeling. The results of this study indicate that this method is portable, works online, and is highly efficient, and that it has a wide range of application prospects.
Keywords: 3D reconstruction, bundlefusion, lossless compression, depth image
Procedia PDF Downloads 82
24218 H∞ Sampled-Data Control for Linear Systems Time-Varying Delays: Application to Power System
Authors: Chang-Ho Lee, Seung-Hoon Lee, Myeong-Jin Park, Oh-Min Kwon
Abstract:
This paper investigates improved stability criteria for sampled-data control of linear systems with disturbances and time-varying delays. Based on Lyapunov-Krasovskii stability theory, delay-dependent conditions sufficient to ensure H∞ stability for the system are derived in the form of linear matrix inequalities (LMI). The effectiveness of the proposed method will be shown in numerical examples.
Keywords: sampled-data control system, Lyapunov-Krasovskii functional, time delay-dependent, LMI, H∞ control
Procedia PDF Downloads 320
24217 Logistics Information Systems in the Distribution of Flour in Nigeria
Authors: Cornelius Femi Popoola
Abstract:
This study investigated logistics information systems in the distribution of flour in Nigeria. A case study design was used, and 50 staff of Honeywell Flour Mill were sampled for the study. Data generated through a questionnaire were analysed using correlation and regression analysis. The findings of the study revealed that logistics information systems such as e-commerce, interactive telephone systems and electronic data interchange were positively correlated with the distribution of flour in Honeywell Flour Mill. The findings also indicated that e-commerce, interactive telephone systems and electronic data interchange jointly and positively contribute to the distribution of flour in Honeywell Flour Mill in Nigeria (R = .935; Adj. R2 = .642; F (3,47) = 14.739; p < .05). The study therefore recommended that Honeywell Flour Mill should upgrade its logistics information systems to computer-to-computer communication of business transactions and documents, as well as adopt new technology such as tracking-and-tracing systems (barcode scanning for packages and pallets), tracking vehicles with Global Positioning System (GPS), measuring vehicle performance with ‘black boxes’ (containing logistic data), and Automatic Equipment Identification (AEI) into its systems.
Keywords: e-commerce, electronic data interchange, flour distribution, information system, interactive telephone systems
Procedia PDF Downloads 553
24216 Cascaded Neural Network for Internal Temperature Forecasting in Induction Motor
Authors: Hidir S. Nogay
Abstract:
In this study, two systems were created to predict the internal temperature of an induction motor. The first is a simple ANN model with two layers, ten input parameters, and one output parameter. The second consists of eight ANN models connected in cascade and has 17 inputs. The main reason for using the cascaded system is to obtain more accurate estimates by increasing the number of inputs to the ANN system. The cascaded ANN system is compared with the simple conventional ANN model to demonstrate this advantage. The dataset was obtained from experimental applications. A small part of the dataset was used to obtain more understandable graphs. The dataset contains 329 samples, 30% of which were used for testing and validation. Test data and validation data were determined separately for each ANN model, and the reliability of each model was tested. The study shows that the cascaded ANN system produced more accurate estimates than the conventional ANN model.
Keywords: cascaded neural network, internal temperature, inverter, three-phase induction motor
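The data split described in this abstract can be sketched as follows. This is a minimal illustration, not the author's procedure: the abstract only states that 30% of the 329 samples were used for testing and validation, so the even division of that hold-out between validation and test, the function name `split_indices`, and the fixed seed are all assumptions.

```python
import random


def split_indices(n_samples, holdout_fraction=0.30, seed=0):
    """Shuffle sample indices and split them into train / validation / test.

    Assumption: the hold-out portion is divided roughly in half between
    validation and test; the abstract does not give this ratio.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle

    n_holdout = round(n_samples * holdout_fraction)
    n_val = n_holdout // 2

    holdout, train = indices[:n_holdout], indices[n_holdout:]
    return train, holdout[:n_val], holdout[n_val:]


# 329 is the dataset size reported in the abstract.
train, val, test = split_indices(329)
print(len(train), len(val), len(test))
```

Since the abstract says test and validation data were determined separately for each ANN model, a sketch like this would be called once per model, for example with a different seed each time.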
Procedia PDF Downloads 345
24215 Big Data and Health: An Australian Perspective Which Highlights the Importance of Data Linkage to Support Health Research at a National Level
Authors: James Semmens, James Boyd, Anna Ferrante, Katrina Spilsbury, Sean Randall, Adrian Brown
Abstract:
‘Big data’ is a relatively new concept that describes data so large and complex that it exceeds the storage or computing capacity of most systems to perform timely and accurate analyses. Health services generate large amounts of data from a wide variety of sources such as administrative records, electronic health records, health insurance claims, and even smart phone health applications. Health data is viewed in Australia and internationally as highly sensitive. Strict ethical requirements must be met for the use of health data to support health research. These requirements differ markedly from those imposed on data use from industry or other government sectors and may have the impact of reducing the capacity of health data to be incorporated into the real time demands of the Big Data environment. This ‘big data revolution’ is increasingly supported by national governments, who have invested significant funds into initiatives designed to develop and capitalize on big data and methods for data integration using record linkage. The benefits to health following research using linked administrative data are recognised internationally and by the Australian Government through the National Collaborative Research Infrastructure Strategy Roadmap, which outlined a multi-million dollar investment strategy to develop national record linkage capabilities. This led to the establishment of the Population Health Research Network (PHRN) to coordinate and champion this initiative. The purpose of the PHRN was to establish record linkage units in all Australian states, to support the implementation of secure data delivery and remote access laboratories for researchers, and to develop the Centre for Data Linkage for the linkage of national and cross-jurisdictional data. 
The Centre for Data Linkage has been established within Curtin University in Western Australia; it provides essential record linkage infrastructure necessary for large-scale, cross-jurisdictional linkage of health-related data in Australia and uses a best practice ‘separation principle’ to support data privacy and security. Privacy-preserving record linkage technology is also being developed to link records without the use of names, to overcome important legal and privacy constraints. This paper will present the findings of the first ‘Proof of Concept’ project selected to demonstrate the effectiveness of increased record linkage capacity in supporting nationally significant health research. This project explored how cross-jurisdictional linkage can inform the nature and extent of cross-border hospital use and hospital-related deaths. The technical challenges associated with national record linkage, and the extent of cross-border population movements, were explored as part of this pioneering research project. Access to person-level data linked across jurisdictions identified geographical hot spots of cross-border hospital use and hospital-related deaths in Australia. This has implications for planning of health service delivery and for longitudinal follow-up studies, particularly those involving mobile populations.
Keywords: data integration, data linkage, health planning, health services research
Procedia PDF Downloads 216
24214 Spatial Variability of Brahmaputra River Flow Characteristics
Authors: Hemant Kumar
Abstract:
In Hindu mythology, the Brahmaputra River is known as the son of Lord Brahma. The river causes mass destruction during the monsoon season in Assam, a state in the north-eastern part of India. Assam is one of the seven states of north-eastern India and carries almost the entire flow of the Brahmaputra; the other states carry its tributaries. In the present case study, a number of MODIS datasets were acquired for spatial analysis. In the change detection method, the spray content was determined during heavy rainfall and in the flooded monsoon season. With this method, analysis over the Brahmaputra outflow identifies the flood season. The charged particles associated with the aerosol content indicate heavy water content below the ground surface, which is validated by trend analysis of rainfall spectrum data. This is confirmed by in-situ sampled data from different positions along the Brahmaputra River. Further, Hyperion hyperspectral data at 30 m resolution were used to scan the sediment deposits, which is also confirmed by in-situ sampled data from different positions.
Keywords: aerosol, change detection, spatial analysis, trend analysis
Procedia PDF Downloads 147
24213 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change
Authors: Ermias A. Tegegn, Million Meshesha
Abstract:
Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death in most African countries. Ethiopia is one of the seriously affected countries in sub-Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become a chronic but manageable disease. The study focused on a data mining technique to predict the future living status of HIV/AIDS patients at the time of drug regimen change, when patients develop toxicity to their current ART drug combination. The data were taken from the University of Gondar Hospital ART program database. A hybrid methodology was followed to explore the application of data mining to the ART program dataset. Data cleaning, handling of missing values, and data transformation were used to preprocess the data. WEKA 3.7.9 data mining tools, classification algorithms, and expertise were used to address the research problem. Using four classification algorithms (J48 classifier, PART rule induction, Naïve Bayes, and neural network) and adjusting their parameters, thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performance of the models was evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model for predicting the status of HIV patients with drug regimen substitution is a pruned J48 decision tree, with a classification accuracy of 98.01%. This study extracts attributes of interest such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender to predict the status of drug regimen substitution. The outcome of this study can be used as an assistant tool to help clinicians make more appropriate drug regimen substitutions. Future research directions are proposed for developing an applicable system in the area of the study.
Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model
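The four evaluation metrics named in this abstract can be computed from predicted and true labels as in the sketch below. It uses the standard binary definitions; the function name `classification_metrics` and the "alive"/"dead" labels are hypothetical and are not taken from the study or from WEKA.

```python
def classification_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall, and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Made-up labels for illustration only.
y_true = ["alive", "alive", "dead", "dead", "alive"]
y_pred = ["alive", "dead", "dead", "alive", "alive"]
print(classification_metrics(y_true, y_pred, positive="alive"))
```

Accuracy alone can look high on an imbalanced dataset, which is one reason to report precision, recall, and F-measure alongside it.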
Procedia PDF Downloads 142
24212 Internal Cycles from Hydrometric Data and Variability Detected Through Hydrological Modelling Results, on the Niger River, over 1901-2020
Authors: Salif Koné
Abstract:
We analyze hydrometric data at the Koulikoro station on the Niger River; this basin drains 120,600 km² and covers three countries in West Africa: Guinea, Mali, and Ivory Coast. Two subsequent decadal cycles are highlighted (1925-1936 and 1929-1939) instead of the single decadal cycle presumed in the literature. Moreover, the observed hydrometric data show a multidecadal 40-year period, which is confirmed by graphing a spatial coefficient of variation of runoff over decades (starting at 1901-1910). Spatial runoff data are produced on 48 grid cells (0.5 degree by 0.5 degree) through semi-distributed versions of both the SimulHyd model and the GR2M model, which are variants of a French hydrologic model (GR2M stands for Génie Rural with 2 parameters at a monthly time step). The two extreme decades in terms of the runoff coefficient of variation are compared: 1951-1960 has the minimal coefficient of variation, and 1981-1990 has the maximal value during the three months of high water level (August, September, and October). Mapping the relative variation between these two decadal situations suggests the following hypothesis: the scale of variation between the two extreme situations could serve to fix boundary conditions for further simulations using data from climate scenarios.
Keywords: internal cycles, hydrometric data, Niger River, GR2M and SimulHyd framework, runoff coefficient of variation
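The spatial coefficient of variation used in this abstract to compare decades can be illustrated with a short sketch. It assumes the usual definition (standard deviation across grid cells divided by their mean) and uses the population standard deviation; the abstract does not say which estimator the author used. The function names and the runoff values are invented for illustration.

```python
from statistics import mean, pstdev


def spatial_cv(runoff_per_cell):
    """Coefficient of variation of runoff across grid cells for one period."""
    avg = mean(runoff_per_cell)
    if avg == 0:
        raise ValueError("mean runoff is zero; CV is undefined")
    return pstdev(runoff_per_cell) / avg


def extreme_decades(runoff_by_decade):
    """Return (decade with minimal CV, decade with maximal CV, all CVs)."""
    cvs = {decade: spatial_cv(cells) for decade, cells in runoff_by_decade.items()}
    return min(cvs, key=cvs.get), max(cvs, key=cvs.get), cvs


# Invented per-cell runoff values, NOT data from the study.
demo = {
    "1951-1960": [10, 11, 9, 10],
    "1981-1990": [2, 4, 4, 4, 5, 5, 7, 9],
}
lowest, highest, cvs = extreme_decades(demo)
print(lowest, highest, cvs)
```

In the study's setting, each list would hold one runoff value per grid cell (48 cells), computed for the high-water months of the decade in question.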
Procedia PDF Downloads 95
24211 A Novel Probabilistic Spatial Locality of Reference Technique for Automatic Cleansing of Digital Maps
Authors: A. Abdullah, S. Abushalmat, A. Bakshwain, A. Basuhail, A. Aslam
Abstract:
GIS (Geographic Information System) applications require geo-referenced data; this data may be available as databases or in the form of digital or hard-copy agro-meteorological maps. These parameter maps are color-coded, with different regions corresponding to different parameter values, and converting them into a database is not very difficult. However, text and various planimetric elements overlaid on these maps make accurate image-to-database conversion a challenging problem. The reason is that it is almost impossible to exactly replace what was underneath the text or icons, which points to the need for inpainting. In this paper, we propose a probabilistic inpainting approach that uses the probability of spatial locality of colors in the map to replace overlaid elements with the underlying color. We tested the limits of our proposed technique using non-textual simulated data and compared its text-removal results with those of a popular image editing tool on public domain data, with promising results.
Keywords: noise, image, GIS, digital map, inpainting
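One simple way to read "probability of spatial locality of colors" is sketched below: each overlaid pixel is replaced by the color that is most frequent among the already-known pixels in a small window around it. This is an illustrative assumption, not the authors' algorithm; the function name `inpaint_by_locality`, the window radius, and the grid-of-labels representation are all hypothetical.

```python
from collections import Counter


def inpaint_by_locality(grid, mask, radius=1):
    """Fill masked cells with the locally most probable color.

    grid: list of rows of color labels (e.g. region codes from a color-coded map).
    mask: same shape, True where text or an icon covers the map.
    """
    height, width = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    unknown = {(r, c) for r in range(height) for c in range(width) if mask[r][c]}

    while unknown:
        updates = {}
        for r, c in unknown:
            # Empirical color distribution over known neighbours in the window.
            counts = Counter(
                out[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
                if (rr, cc) not in unknown
            )
            if counts:
                updates[(r, c)] = counts.most_common(1)[0][0]
        if not updates:
            break  # nothing known nearby; leave the remaining cells untouched
        for (r, c), color in updates.items():
            out[r][c] = color
        unknown.difference_update(updates)  # filled cells are now known
    return out


# "T" marks a pixel covered by overlaid text on a two-region map.
grid = [["A", "A", "B", "B"], ["A", "T", "B", "B"], ["A", "A", "B", "B"]]
mask = [[cell == "T" for cell in row] for row in grid]
print(inpaint_by_locality(grid, mask))
```

Filling proceeds inward from the edge of each overlaid element, so larger text blobs are resolved over several passes.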
Procedia PDF Downloads 352
24210 Evaluation of Urban Parks Based on POI Data: Taking Futian District of Shenzhen as an Example
Authors: Juanling Lin
Abstract:
The construction of urban parks is an important part of eco-city construction. Big data provides a more scientific and rational basis for assessing urban parks: it helps identify and correct irrational urban park planning at the macroscopic level and thereby promotes rational planning. Taking Futian District of Shenzhen as the research object, the study builds an urban park assessment system based on urban road network data and POI data and uses a geographic information system (GIS) to assess the park system of Futian District in five aspects: park spatial distribution, accessibility, service capacity, demand, and supply-demand relationship. The urban park assessment system can effectively reflect the current state of urban park construction and offers a useful exploration toward rational and fair urban park planning.
Keywords: urban parks, assessment system, POI, supply and demand
Procedia PDF Downloads 42
24209 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model
Authors: Alam Ali, Ashok Kumar Pathak
Abstract:
Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the best fit to the data. Sometimes, exogenous variables do not show significant direct and indirect effects when the assumptions of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main aim of this article is to investigate the efficacy of the copula-based regression approach relative to the classical regression approach and to calculate the direct and indirect effects of variables when the data violate the OLS assumptions and the variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is presented to demonstrate the superiority of the copula approach.
Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique
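For orientation, the classical (OLS) baseline that this abstract compares against can be sketched for the simplest path model, X → M → Y with an additional direct path X → Y. The direct effect is the coefficient of X in the regression of Y on X and M, and the indirect effect is the product of the X → M and M → Y coefficients. This sketch shows only the OLS side, not the copula-based estimator; the variable names and data are hypothetical.

```python
from statistics import mean


def cov(u, v):
    """Population covariance of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def path_effects(x, m, y):
    """Direct, indirect, and total effect of x on y with a single mediator m."""
    # Path a: slope of the regression M ~ X.
    a = cov(x, m) / cov(x, x)

    # Regression Y ~ X + M solved from the 2x2 normal equations.
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    if det == 0:
        raise ValueError("x and m are perfectly collinear")
    direct = (sxy * smm - smy * sxm) / det  # coefficient of X
    b = (smy * sxx - sxy * sxm) / det       # coefficient of M

    indirect = a * b
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Made-up data: y is built as 0.5*x + 3*m, so direct = 0.5 and b = 3.
x = [1, 2, 3, 4, 5]
m = [2, 5, 5, 9, 10]
y = [0.5 * xi + 3 * mi for xi, mi in zip(x, m)]
print(path_effects(x, m, y))
```

With OLS, the total effect computed this way equals the slope of the simple regression of Y on X, which gives a quick sanity check on the decomposition.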
Procedia PDF Downloads 72
24208 Optimizing Quantum Machine Learning with Amplitude and Phase Encoding Techniques
Authors: Om Viroje
Abstract:
Quantum machine learning represents a frontier in computational technology, promising significant advancements in data processing capabilities. This study explores the significance of data encoding techniques, specifically amplitude and phase encoding, in this emerging field. By employing a comparative analysis methodology, the research evaluates how these encoding techniques affect the accuracy, efficiency, and noise resilience of quantum algorithms. Our findings reveal that amplitude encoding enhances algorithmic accuracy and noise tolerance, whereas phase encoding significantly boosts computational efficiency. These insights are crucial for developing robust quantum frameworks that can be effectively applied in real-world scenarios. In conclusion, optimizing encoding strategies is essential for advancing quantum machine learning, potentially transforming various industries through improved data processing and analysis.
Keywords: quantum machine learning, data encoding, amplitude encoding, phase encoding, noise resilience
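The two encodings compared in this abstract can be illustrated classically by computing the state vectors they would prepare. The sketch assumes the common textbook definitions: amplitude encoding stores a normalized feature vector in the amplitudes of a register, and phase encoding stores each feature as the relative phase of one qubit. The abstract does not specify the exact circuits, so the function names and conventions here are assumptions.

```python
import cmath
import math


def amplitude_encode(features):
    """Return amplitudes of a state whose entries are the normalized features.

    The vector is zero-padded to the next power of two, so n features need
    about log2(n) qubits.
    """
    size = 1
    while size < len(features):
        size *= 2
    padded = list(features) + [0.0] * (size - len(features))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0:
        raise ValueError("cannot encode the all-zero vector")
    return [v / norm for v in padded]


def phase_encode(features):
    """Return one single-qubit state (|0> + e^{i*x}|1>)/sqrt(2) per feature.

    Each feature is used as an angle in radians, so n features need n qubits.
    """
    scale = 1 / math.sqrt(2)
    return [(scale, scale * cmath.exp(1j * x)) for x in features]


print(amplitude_encode([3.0, 4.0]))
print(phase_encode([0.0, math.pi / 2]))
```

The sketch makes the resource trade-off visible: amplitude encoding is compact in qubits but needs normalization of the whole vector, while phase encoding treats each feature independently.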
Procedia PDF Downloads 16
24207 Reversible Information Hitting in Encrypted JPEG Bitstream by LSB Based on Inherent Algorithm
Authors: Vaibhav Barve
Abstract:
Reversible information hiding has drawn considerable interest recently. Because it is reversible, the original digital data can be restored completely. It is a scheme in which secret data is stored in digital media such as images, video, or audio to prevent unauthorized access and for security purposes. Generally, a JPEG bitstream is used to store this secret data: the JPEG bitstream is first encrypted into a well-organized structure, and the secret data is then embedded into the encrypted region by slightly modifying the JPEG bitstream. Pixels suitable for data embedding are computed, and the secret data is embedded accordingly. In our proposed framework, we use the RC4 algorithm to encrypt the JPEG bitstream. The encryption key is supplied by the user and is also used at decryption time. We implement enhanced least significant bit (LSB) replacement steganography using a genetic algorithm. The number of bits to be embedded in a given coefficient is adaptive. With proper parameters, we can obtain high capacity while ensuring high security. We use a logistic map to shuffle the bits and a genetic algorithm (GA) to find suitable parameters for the logistic map. A data embedding key is used at embedding time. With the correct image encryption key and data embedding key, the receiver can easily extract the embedded secret data and completely recover both the original image and the original secret information. When the embedding key is absent, the original image can still be recovered approximately with sufficient quality.
Keywords: data embedding, decryption, encryption, reversible data hiding, steganography
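The combination of logistic-map shuffling and LSB replacement mentioned in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: it embeds one bit per coefficient by plain LSB replacement, derives the shuffle from a logistic map with hand-picked parameters `x0` and `r` (the abstract uses a genetic algorithm to choose them), and omits RC4 encryption, adaptive capacity, and the reversibility mechanism. All function and parameter names are hypothetical.

```python
def logistic_permutation(n, x0, r):
    """Derive a permutation of range(n) from the logistic map x <- r*x*(1-x)."""
    x = x0
    keys = []
    for _ in range(n):
        x = r * x * (1 - x)
        keys.append(x)
    # Sorting positions by their chaotic key gives a key-dependent shuffle.
    return sorted(range(n), key=keys.__getitem__)


def embed_lsb(coefficients, bits, x0=0.3141, r=3.99):
    """Shuffle the secret bits, then write them into the coefficient LSBs."""
    if len(bits) > len(coefficients):
        raise ValueError("not enough coefficients for the payload")
    perm = logistic_permutation(len(bits), x0, r)
    shuffled = [bits[i] for i in perm]
    stego = list(coefficients)
    for pos, bit in enumerate(shuffled):
        stego[pos] = (stego[pos] & ~1) | bit  # replace the least significant bit
    return stego


def extract_lsb(stego, n_bits, x0=0.3141, r=3.99):
    """Read the LSBs and undo the shuffle; (x0, r) act as the embedding key."""
    perm = logistic_permutation(n_bits, x0, r)
    shuffled = [value & 1 for value in stego[:n_bits]]
    bits = [0] * n_bits
    for pos, original_index in enumerate(perm):
        bits[original_index] = shuffled[pos]
    return bits


# Made-up integer coefficients and payload for illustration.
cover = [12, -7, 5, 8, 0, 3, -2, 9]
secret = [1, 0, 1, 1, 0]
stego = embed_lsb(cover, secret)
print(stego, extract_lsb(stego, len(secret)))
```

Because the permutation depends on `x0` and `r`, a receiver without these values can read the LSBs but not restore the bit order, which is the role the embedding key plays in the abstract.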
Procedia PDF Downloads 288