Search results for: missing data estimation
23830 A Web Service-Based Framework for Mining E-Learning Data
Authors: Felermino D. M. A. Ali, S. C. Ng
Abstract:
E-learning is an evolutionary form of distance learning and has become better over time as new technologies emerged. Today, efforts are still being made to embrace E-learning systems with emerging technologies in order to make them better. Among these advancements, Educational Data Mining (EDM) is one that is gaining a huge and increasing popularity due to its wide application for improving the teaching-learning process in online practices. However, even though EDM promises to bring many benefits to educational industry in general and E-learning environments in particular, its principal drawback is the lack of easy to use tools. The current EDM tools usually require users to have some additional technical expertise to effectively perform EDM tasks. Thus, in response to these limitations, this study intends to design and implement an EDM application framework which aims at automating and simplify the development of EDM in E-learning environment. The application framework introduces a Service-Oriented Architecture (SOA) that hides the complexity of technical details and enables users to perform EDM in an automated fashion. The framework was designed based on abstraction, extensibility, and interoperability principles. The framework implementation was made up of three major modules. The first module provides an abstraction for data gathering, which was done by extending Moodle LMS (Learning Management System) source code. The second module provides data mining methods and techniques as services; it was done by converting Weka API into a set of Web services. The third module acts as an intermediary between the first two modules, it contains a user-friendly interface that allows dynamically locating data provider services, and running knowledge discovery tasks on data mining services. An experiment was conducted to evaluate the overhead of the proposed framework through a combination of simulation and implementation. The experiments have shown that the overhead introduced by the SOA mechanism is relatively small, therefore, it has been concluded that a service-oriented architecture can be effectively used to facilitate educational data mining in E-learning environments.Keywords: educational data mining, e-learning, distributed data mining, moodle, service-oriented architecture, Weka
Procedia PDF Downloads 23723829 Effect of Fractional Flow Curves on the Heavy Oil and Light Oil Recoveries in Petroleum Reservoirs
Authors: Abdul Jamil Nazari, Shigeo Honma
Abstract:
This paper evaluates and compares the effect of fractional flow curves on the heavy oil and light oil recoveries in a petroleum reservoir. Fingering of flowing water is one of the serious problems of the oil displacement by water and another problem is the estimation of the amount of recover oil from a petroleum reservoir. To address these problems, the fractional flow of heavy oil and light oil are investigated. The fractional flow approach treats the multi-phases flow rate as a total mixed fluid and then describes the individual phases as fractional of the total flow. Laboratory experiments are implemented for two different types of oils, heavy oil, and light oil, to experimentally obtain relative permeability and fractional flow curves. Application of the light oil fractional curve, which exhibits a regular S-shape, to the water flooding method showed that a large amount of mobile oil in the reservoir is displaced by water injection. In contrast, the fractional flow curve of heavy oil does not display an S-shape because of its high viscosity. Although the advance of the injected waterfront is faster than in light oil reservoirs, a significant amount of mobile oil remains behind the waterfront.Keywords: fractional flow, relative permeability, oil recovery, water fingering
Procedia PDF Downloads 30423828 Mathematics Bridging Theory and Applications for a Data-Driven World
Authors: Zahid Ullah, Atlas Khan
Abstract:
In today's data-driven world, the role of mathematics in bridging the gap between theory and applications is becoming increasingly vital. This abstract highlights the significance of mathematics as a powerful tool for analyzing, interpreting, and extracting meaningful insights from vast amounts of data. By integrating mathematical principles with real-world applications, researchers can unlock the full potential of data-driven decision-making processes. This abstract delves into the various ways mathematics acts as a bridge connecting theoretical frameworks to practical applications. It explores the utilization of mathematical models, algorithms, and statistical techniques to uncover hidden patterns, trends, and correlations within complex datasets. Furthermore, it investigates the role of mathematics in enhancing predictive modeling, optimization, and risk assessment methodologies for improved decision-making in diverse fields such as finance, healthcare, engineering, and social sciences. The abstract also emphasizes the need for interdisciplinary collaboration between mathematicians, statisticians, computer scientists, and domain experts to tackle the challenges posed by the data-driven landscape. By fostering synergies between these disciplines, novel approaches can be developed to address complex problems and make data-driven insights accessible and actionable. Moreover, this abstract underscores the importance of robust mathematical foundations for ensuring the reliability and validity of data analysis. Rigorous mathematical frameworks not only provide a solid basis for understanding and interpreting results but also contribute to the development of innovative methodologies and techniques. In summary, this abstract advocates for the pivotal role of mathematics in bridging theory and applications in a data-driven world. By harnessing mathematical principles, researchers can unlock the transformative potential of data analysis, paving the way for evidence-based decision-making, optimized processes, and innovative solutions to the challenges of our rapidly evolving society.Keywords: mathematics, bridging theory and applications, data-driven world, mathematical models
Procedia PDF Downloads 7723827 Evaluation of Geomechanical and Geometrical Parameters’ Effects on Hydro-Mechanical Estimation of Water Inflow into Underground Excavations
Authors: M. Mazraehli, F. Mehrabani, S. Zare
Abstract:
In general, mechanical and hydraulic processes are not independent of each other in jointed rock masses. Therefore, the study on hydro-mechanical coupling of geomaterials should be a center of attention in rock mechanics. Rocks in their nature contain discontinuities whose presence extremely influences mechanical and hydraulic characteristics of the medium. Assuming this effect, experimental investigations on intact rock cannot help to identify jointed rock mass behavior. Hence, numerical methods are being used for this purpose. In this paper, water inflow into a tunnel under significant water table has been estimated using hydro-mechanical discrete element method (HM-DEM). Besides, effects of geomechanical and geometrical parameters including constitutive model, friction angle, joint spacing, dip of joint sets, and stress factor on the estimated inflow rate have been studied. Results demonstrate that inflow rates are not identical for different constitutive models. Also, inflow rate reduces with increased spacing and stress factor.Keywords: distinct element method, fluid flow, hydro-mechanical coupling, jointed rock mass, underground excavations
Procedia PDF Downloads 16723826 Pseudo Modal Operating Deflection Shape Based Estimation Technique of Mode Shape Using Time History Modal Assurance Criterion
Authors: Doyoung Kim, Hyo Seon Park
Abstract:
Studies of System Identification(SI) based on Structural Health Monitoring(SHM) have actively conducted for structural safety. Recently SI techniques have been rapidly developed with output-only SI paradigm for estimating modal parameters. The features of these output-only SI methods consist of Frequency Domain Decomposition(FDD) and Stochastic Subspace Identification(SSI) are using the algorithms based on orthogonal decomposition such as singular value decomposition(SVD). But the SVD leads to high level of computational complexity to estimate modal parameters. This paper proposes the technique to estimate mode shape with lower computational cost. This technique shows pseudo modal Operating Deflections Shape(ODS) through bandpass filter and suggests time history Modal Assurance Criterion(MAC). Finally, mode shape could be estimated from pseudo modal ODS and time history MAC. Analytical simulations of vibration measurement were performed and the results with mode shape and computation time between representative SI method and proposed method were compared.Keywords: modal assurance criterion, mode shape, operating deflection shape, system identification
Procedia PDF Downloads 41123825 AI-Enabled Smart Contracts for Reliable Traceability in the Industry 4.0
Authors: Harris Niavis, Dimitra Politaki
Abstract:
The manufacturing industry was collecting vast amounts of data for monitoring product quality thanks to the advances in the ICT sector and dedicated IoT infrastructure is deployed to track and trace the production line. However, industries have not yet managed to unleash the full potential of these data due to defective data collection methods and untrusted data storage and sharing. Blockchain is gaining increasing ground as a key technology enabler for Industry 4.0 and the smart manufacturing domain, as it enables the secure storage and exchange of data between stakeholders. On the other hand, AI techniques are more and more used to detect anomalies in batch and time-series data that enable the identification of unusual behaviors. The proposed scheme is based on smart contracts to enable automation and transparency in the data exchange, coupled with anomaly detection algorithms to enable reliable data ingestion in the system. Before sensor measurements are fed to the blockchain component and the smart contracts, the anomaly detection mechanism uniquely combines artificial intelligence models to effectively detect unusual values such as outliers and extreme deviations in data coming from them. Specifically, Autoregressive integrated moving average, Long short-term memory (LSTM) and Dense-based autoencoders, as well as Generative adversarial networks (GAN) models, are used to detect both point and collective anomalies. Towards the goal of preserving the privacy of industries' information, the smart contracts employ techniques to ensure that only anonymized pointers to the actual data are stored on the ledger while sensitive information remains off-chain. In the same spirit, blockchain technology guarantees the security of the data storage through strong cryptography as well as the integrity of the data through the decentralization of the network and the execution of the smart contracts by the majority of the blockchain network actors. The blockchain component of the Data Traceability Software is based on the Hyperledger Fabric framework, which lays the ground for the deployment of smart contracts and APIs to expose the functionality to the end-users. The results of this work demonstrate that such a system can increase the quality of the end-products and the trustworthiness of the monitoring process in the smart manufacturing domain. The proposed AI-enabled data traceability software can be employed by industries to accurately trace and verify records about quality through the entire production chain and take advantage of the multitude of monitoring records in their databases.Keywords: blockchain, data quality, industry4.0, product quality
Procedia PDF Downloads 19123824 Maintenance Work Order Management Tool (Desktop & Mobile Solution)
Authors: Haitham Al Rawahi
Abstract:
Oman Electricity Transmission Company (OETC) has implemented Computerized Maintenance Management System (CMMS), which is based on Oracle enterprise asset management model e-AM. This was implemented with cooperation of Nama Shared Services (NSS). CMMS is mainly used to create maintenance work orders with a preconfigured workflow of defined maintenance schedules/plans, required resources, and materials, obtaining shutdown approvals, completing maintenance activities, and closing the work orders. Furthermore, CMMS is also configured with asset failure classifications, asset hierarchy, asset maintenance activities, integration with spare inventories, etc. Since the year 2017, site engineer is working on CMMS by filling-in manually all related maintenance and inspection records on paper forms and then scanning and attaching it in CMMS for further analysis. Site engineer will finalize all paper works at site and then goes back to office to scan and attach it to work order in CMMS. This creates sub tasks for site engineer and makes it very difficult and lengthy process. Also, there is a significant risk for missing or deleted important fields on the paper due to usage of pen to fill the paper. In addition to that, site engineer may take time and days working outside of the office. therefore, OETC has decided to digitize these inspection and maintenance forms in one platform in CMMS, and it can be opened with both functionalities online and offline. The ArcGIS product formats or web-enabled solutions which has ability to access from mobile and desktop devices via arc map modules will be used too. The purpose of interlinking is to setup for maintenance and inspection forms to work orders in e-AM, which the site engineer has daily interactions with. This ArcGIS environment or tool is designed to link with e-AM, so when site engineer opens this application from the site and a window will take him through same ArcGIS. This window opens the maintenance forms and shows the required fields to fill-in and save the work through his mobile application. After saving his work with the availability of network (Off/In) line, notification will trigger to his line manager to review and take further actions (approve/reject/request more information). In this function, the user can see the assigned work orders to his departments as well as chart of all work orders with status. The approver has ability to see the statistics of all work.Keywords: e-AM, GIS, CMMS, integration
Procedia PDF Downloads 9923823 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction
Authors: Qais M. Yousef, Yasmeen A. Alshaer
Abstract:
Over the last few years, the amount of data available on the globe has been increased rapidly. This came up with the emergence of recent concepts, such as the big data and the Internet of Things, which have furnished a suitable solution for the availability of data all over the world. However, managing this massive amount of data remains a challenge due to their large verity of types and distribution. Therefore, locating the required file particularly from the first trial turned to be a not easy task, due to the large similarities of names for different files distributed on the web. Consequently, the accuracy and speed of search have been negatively affected. This work presents a method using Electroencephalography signals to locate the files based on their contents. Giving the concept of natural mind waves processing, this work analyses the mind wave signals of different people, analyzing them and extracting their most appropriate features using multi-objective metaheuristic algorithm, and then classifying them using artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find the files based on their contents using human thoughts only. Implementing this approach and testing it on real people proved its ability to find the desired files accurately within noticeably shorter time and retrieve them as a first choice for the user.Keywords: artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization
Procedia PDF Downloads 17923822 IoT Based Approach to Healthcare System for a Quadriplegic Patient Using EEG
Authors: R. Gautam, P. Sastha Kanagasabai, G. N. Rathna
Abstract:
The proposed healthcare system enables quadriplegic patients, people with severe motor disabilities to send commands to electronic devices and monitor their vitals. The growth of Brain-Computer-Interface (BCI) has led to rapid development in 'assistive systems' for the disabled called 'assistive domotics'. Brain-Computer-Interface is capable of reading the brainwaves of an individual and analyse it to obtain some meaningful data. This processed data can be used to assist people having speech disorders and sometimes people with limited locomotion to communicate. In this Project, Emotiv EPOC Headset is used to obtain the electroencephalogram (EEG). The obtained data is processed to communicate pre-defined commands over the internet to the desired mobile phone user. Other Vital Information like the heartbeat, blood pressure, ECG and body temperature are monitored and uploaded to the server. Data analytics enables physicians to scan databases for a specific illness. The Data is processed in Intel Edison, system on chip (SoC). Patient metrics are displayed via Intel IoT Analytics cloud service.Keywords: brain computer interface, Intel Edison, Emotiv EPOC, IoT analytics, electroencephalogram
Procedia PDF Downloads 18723821 Searchable Encryption in Cloud Storage
Authors: Ren Junn Hwang, Chung-Chien Lu, Jain-Shing Wu
Abstract:
Cloud outsource storage is one of important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving the target file from the encrypted files exactly is difficult for cloud server. This study proposes a protocol for performing multikeyword searches for encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong alphabet does not exceed a given threshold; the user still can retrieve the target files from cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.Keywords: fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption
Procedia PDF Downloads 38423820 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model
Authors: Fatemah A. Alqallaf, Debasis Kundu
Abstract:
The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions, and they have non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and it involves solving a four-dimensional optimization problem. To avoid that, we have proposed to use an EM algorithm, and it involves solving only one non-linear equation at each `E'-step. Hence, the implementation of the proposed EM algorithm is very straight forward in practice. Extensive simulation experiments and the analysis of one data set have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data. One data set has been analyzed to show the effectiveness of the proposed model.Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators
Procedia PDF Downloads 14423819 Blind Data Hiding Technique Using Interpolation of Subsampled Images
Authors: Singara Singh Kasana, Pankaj Garg
Abstract:
In this paper, a blind data hiding technique based on interpolation of sub sampled versions of a cover image is proposed. Sub sampled image is taken as a reference image and an interpolated image is generated from this reference image. Then difference between original cover image and interpolated image is used to embed secret data. Comparisons with the existing interpolation based techniques show that proposed technique provides higher embedding capacity and better visual quality marked images. Moreover, the performance of the proposed technique is more stable for different images.Keywords: interpolation, image subsampling, PSNR, SIM
Procedia PDF Downloads 58023818 Introduction to Various Innovative Techniques Suggested for Seismic Hazard Assessment
Authors: Deepshikha Shukla, C. H. Solanki, Mayank K. Desai
Abstract:
Amongst all the natural hazards, earthquakes have the potential for causing the greatest damages. Since the earthquake forces are random in nature and unpredictable, the quantification of the hazards becomes important in order to assess the hazards. The time and place of a future earthquake are both uncertain. Since earthquakes can neither be prevented nor be predicted, engineers have to design and construct in such a way, that the damage to life and property are minimized. Seismic hazard analysis plays an important role in earthquake design structures by providing a rational value of input parameter. In this paper, both mathematical, as well as computational methods adopted by researchers globally in the past five years, will be discussed. Some mathematical approaches involving the concepts of Poisson’s ratio, Convex Set Theory, Empirical Green’s Function, Bayesian probability estimation applied for seismic hazard and FOSM (first-order second-moment) algorithm methods will be discussed. Computational approaches and numerical model SSIFiBo developed in MATLAB to study dynamic soil-structure interaction problem is discussed in this paper. The GIS-based tool will also be discussed which is predominantly used in the assessment of seismic hazards.Keywords: computational methods, MATLAB, seismic hazard, seismic measurements
Procedia PDF Downloads 34223817 Development and Implementation of An "Electric Island" Monitoring Infrastructure for Promoting Energy Efficiency in Schools
Authors: Vladislav Grigorovitch, Marina Grigorovitch, David Pearlmutter, Erez Gal
Abstract:
The concept of “electric island” is involved with achieving the balance between the self-power generation ability of each educational institution and energy consumption demand. Photo-Voltaic (PV) solar system installed on the roofs of educational buildings is a common way to absorb the available solar energy and generate electricity for self-consumption and even for returning to the grid. The main objective of this research is to develop and implement an “electric island” monitoring infrastructure for promoting energy efficiency in educational buildings. A microscale monitoring methodology will be developed to provide a platform to estimate energy consumption performance classified by rooms and subspaces rather than the more common macroscale monitoring of the whole building. The monitoring platform will be established on the experimental sites, enabling an estimation and further analysis of the variety of environmental and physical conditions. For each building, separate measurement configurations will be applied taking into account the specific requirements, restrictions, location and infrastructure issues. The direct results of the measurements will be analyzed to provide deeper understanding of the impact of environmental conditions and sustainability construction standards, not only on the energy demand of public building, but also on the energy consumption habits of the children that study in those schools and the educational and administrative staff that is responsible for providing the thermal comfort conditions and healthy studying atmosphere for the children. A monitoring methodology being developed in this research is providing online access to real-time data of Interferential Therapy (IFTs) from any mobile phone or computer by simply browsing the dedicated website, providing powerful tools for policy makers for better decision making while developing PV production infrastructure to achieve “electric islands” in educational buildings. A detailed measurement configuration was technically designed based on the specific conditions and restriction of each of the pilot buildings. A monitoring and analysis methodology includes a large variety of environmental parameters inside and outside the schools to investigate the impact of environmental conditions both on the energy performance of the school and educational abilities of the children. Indoor measurements are mandatory to acquire the energy consumption data, temperature, humidity, carbon dioxide and other air quality conditions in different parts of the building. In addition to that, we aim to study the awareness of the users to the energy consideration and thus the impact on their energy consumption habits. The monitoring of outdoor conditions is vital for proper design of the off-grid energy supply system and validation of its sufficient capacity. The suggested outcomes of this research include: 1. both experimental sites are designed to have PV production and storage capabilities; 2. Developing an online information feedback platform. The platform will provide consumer dedicated information to academic researchers, municipality officials and educational staff and students; 3. Designing an environmental work path for educational staff regarding optimal conditions and efficient hours for operating air conditioning, natural ventilation, closing of blinds, etc.Keywords: sustainability, electric island, IOT, smart building
Procedia PDF Downloads 18023816 Active Contours for Image Segmentation Based on Complex Domain Approach
Authors: Sajid Hussain
Abstract:
The complex domain approach for image segmentation based on active contour has been designed, which deforms step by step to partition an image into numerous expedient regions. A novel region-based trigonometric complex pressure force function is proposed, which propagates around the region of interest using image forces. The signed trigonometric force function controls the propagation of the active contour and the active contour stops on the exact edges of the object accurately. The proposed model makes the level set function binary and uses Gaussian smoothing kernel to adjust and escape the re-initialization procedure. The working principle of the proposed model is as follows: The real image data is transformed into complex data by iota (i) times of image data and the average iota (i) times of horizontal and vertical components of the gradient of image data is inserted in the proposed model to catch complex gradient of the image data. A simple finite difference mathematical technique has been used to implement the proposed model. The efficiency and robustness of the proposed model have been verified and compared with other state-of-the-art models.Keywords: image segmentation, active contour, level set, Mumford and Shah model
Procedia PDF Downloads 11523815 Discerning Divergent Nodes in Social Networks
Authors: Mehran Asadi, Afrand Agah
Abstract:
In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules, which can be applied to classification trees. In this research, we used online social network dataset and all of its attributes (e.g., Node features, labels, etc.) to determine what constitutes an above average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we utilize overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them into different groups based on their properties and similarities. In data mining, recursive partitioning is being utilized to probe the structure of a data set, which allow us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions, with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the result classification methods and utilized decision trees (with k-fold cross validation to prune the tree). The method performed on this dataset is decision trees. Decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.Keywords: online social networks, data mining, social cloud computing, interaction and collaboration
Procedia PDF Downloads 16023814 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network
Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu
Abstract:
A database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning from this data the models that can predict future traffic speed would be beneficial for the applications such as the car navigation system, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as predictive models is that it does not require any explicit model building. Instead, k-NN takes a long time to make a prediction because it needs to search for the k-nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched for without a significant sacrifice of prediction accuracy. The rationale behind this is that we had a better look at only the recent data because the traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features which are the current and past traffic speeds of the target link and the neighbor links in its up/down-stream. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.Keywords: big data, k-NN, machine learning, traffic speed prediction
Procedia PDF Downloads 36523813 ESRA: An End-to-End System for Re-identification and Anonymization of Swiss Court Decisions
Authors: Joel Niklaus, Matthias Sturmer
Abstract:
The publication of judicial proceedings is a cornerstone of many democracies. It enables the court system to be made accountable by ensuring that justice is made in accordance with the laws. Equally important is privacy, as a fundamental human right (Article 12 in the Declaration of Human Rights). Therefore, it is important that the parties (especially minors, victims, or witnesses) involved in these court decisions be anonymized securely. Today, the anonymization of court decisions in Switzerland is performed either manually or semi-automatically using primitive software. While much research has been conducted on anonymization for tabular data, the literature on anonymization for unstructured text documents is thin and virtually non-existent for court decisions. In 2019, it has been shown that manual anonymization is not secure enough. In 21 of 25 attempted Swiss federal court decisions related to pharmaceutical companies, pharmaceuticals, and legal parties involved could be manually re-identified. This was achieved by linking the decisions with external databases using regular expressions. An automated re-identification system serves as an automated test for the safety of existing anonymizations and thus promotes the right to privacy. Manual anonymization is very expensive (recurring annual costs of over CHF 20M in Switzerland alone, according to an estimation). Consequently, many Swiss courts only publish a fraction of their decisions. An automated anonymization system reduces these costs substantially, further leading to more capacity for publishing court decisions much more comprehensively. For the re-identification system, topic modeling with latent dirichlet allocation is used to cluster an amount of over 500K Swiss court decisions into meaningful related categories. A comprehensive knowledge base with publicly available data (such as social media, newspapers, government documents, geographical information systems, business registers, online address books, obituary portal, web archive, etc.) is constructed to serve as an information hub for re-identifications. For the actual re-identification, a general-purpose language model is fine-tuned on the respective part of the knowledge base for each category of court decisions separately. The input to the model is the court decision to be re-identified, and the output is a probability distribution over named entities constituting possible re-identifications. For the anonymization system, named entity recognition (NER) is used to recognize the tokens that need to be anonymized. Since the focus lies on Swiss court decisions in German, a corpus for Swiss legal texts will be built for training the NER model. The recognized named entities are replaced by the category determined by the NER model and an identifier to preserve context. This work is part of an ongoing research project conducted by an interdisciplinary research consortium. Both a legal analysis and the implementation of the proposed system design ESRA will be performed within the next three years. This study introduces the system design of ESRA, an end-to-end system for re-identification and anonymization of Swiss court decisions. Firstly, the re-identification system tests the safety of existing anonymizations and thus promotes privacy. Secondly, the anonymization system substantially reduces the costs of manual anonymization of court decisions and thus introduces a more comprehensive publication practice.Keywords: artificial intelligence, courts, legal tech, named entity recognition, natural language processing, ·privacy, topic modeling
Procedia PDF Downloads 15023812 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University
Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang
Abstract:
Classification is one of data mining techniques that aims to discover a model from training data that distinguishes records into the appropriate category or class. Data mining classification methods can be applied in education, for example, to determine the classification of non-active students in Indonesia Open University. This paper presents a comparison of three methods of classification: Naïve Bayes, Bagging, and C.45. The criteria used to evaluate the performance of three methods of classification are stratified cross-validation, confusion matrix, the value of the area under the ROC Curve (AUC), Recall, Precision, and F-measure. The data used for this paper are from the non-active Indonesia Open University students in registration period of 2004.1 to 2012.2. Target analysis requires that non-active students were divided into 3 groups: C1, C2, and C3. Data analyzed are as many as 4173 students. Results of the study show: (1) Bagging method gave a high degree of classification accuracy than Naïve Bayes and C.45, (2) the Bagging classification accuracy rate is 82.99 %, while the Naïve Bayes and C.45 are 80.04 % and 82.74 % respectively, (3) the result of Bagging classification tree method has a large number of nodes, so it is quite difficult in decision making, (4) classification of non-active Indonesia Open University student characteristics uses algorithms C.45, (5) based on the algorithm C.45, there are 5 interesting rules which can describe the characteristics of non-active Indonesia Open University students.Keywords: comparative analysis, data mining, clasiffication, Bagging, Naïve Bayes, C.45, non-active students, Indonesia Open University
Procedia PDF Downloads 31723811 Chemical Characterization and Antioxidant Capacity of Flour From Two Soya Bean Cultivars (Glycine Max)
Authors: Meziani Samira, Menadi Noreddine, Labga Lahouaria, Chenni Fatima Zohra, Toumi Asma
Abstract:
A comparative study between two varieties of soya beans was carried out in this work. The method consists of studying and proceeding to prepare a by-product (Flour) from two varieties of soybeans, a Chinese variety imported and marketed in Algeria. The chemical composition of ash, protein and fat was determined in this study. The minerals, namely potassium and sodium, were measured by flame spectrophotometer. In addition, the estimation of the polyphenol content and evaluation of the antioxidant activity Ferric Reducing Antioxidant Power assay (FRAP) f the methanol extracts of the flours were also carried out. The result revealed that soy flour from two cultivars, on average, contained 8% moisture, more than 50% protein, 1.58-1.87g fat, and 0.28-0.30g of ash. A slight difference was found for contents of 489 mg/ml of K + and 20 mg/ml of NA +. In addition, the phenolic content of the methanolic extracts gives a value of almost 37 mg EAG / g for both cultivars of soy flour. The estimated Reductive Antioxidant Iron (FRAP) potency of soy flour might be related to its polyphenol richness, which is similar to the variety of China. The flour Soya varieties tested contained a significant amount of protein and phenolic compounds with good antioxidant properties.Keywords: soye beans, soya flour, protein, total polyphenols
Procedia PDF Downloads 9423810 A Study of the Adaptive Reuse for School Land Use Strategy: An Application of the Analytic Network Process and Big Data
Authors: Wann-Ming Wey
Abstract:
In today's popularity and progress of information technology, the big data set and its analysis are no longer a major conundrum. Now, we could not only use the relevant big data to analysis and emulate the possible status of urban development in the near future, but also provide more comprehensive and reasonable policy implementation basis for government units or decision-makers via the analysis and emulation results as mentioned above. In this research, we set Taipei City as the research scope, and use the relevant big data variables (e.g., population, facility utilization and related social policy ratings) and Analytic Network Process (ANP) approach to implement in-depth research and discussion for the possible reduction of land use in primary and secondary schools of Taipei City. In addition to enhance the prosperous urban activities for the urban public facility utilization, the final results of this research could help improve the efficiency of urban land use in the future. Furthermore, the assessment model and research framework established in this research also provide a good reference for schools or other public facilities land use and adaptive reuse strategies in the future.Keywords: adaptive reuse, analytic network process, big data, land use strategy
Procedia PDF Downloads 20523809 Interoperability Standard for Data Exchange in Educational Documents in Professional and Technological Education: A Comparative Study and Feasibility Analysis for the Brazilian Context
Authors: Giovana Nunes Inocêncio
Abstract:
The professional and technological education (EPT) plays a pivotal role in equipping students for specialized careers, and it is imperative to establish a framework for efficient data exchange among educational institutions. The primary focus of this article is to address the pressing need for document interoperability within the context of EPT. The challenges, motivations, and benefits of implementing interoperability standards for digital educational documents are thoroughly explored. These documents include EPT completion certificates, academic records, and curricula. In conjunction with the prior abstract, it is evident that the intersection of IT governance and interoperability standards holds the key to transforming the landscape of technical education in Brazil. IT governance provides the strategic framework for effective data management, aligning with educational objectives, ensuring compliance, and managing risks. By adopting interoperability standards, the technical education sector in Brazil can facilitate data exchange, enhance data security, and promote international recognition of qualifications. The utilization of the XML (Extensible Markup Language) standard further strengthens the foundation for structured data exchange, fostering efficient communication, standardization of curricula, and enhancing educational materials. The IT governance, interoperability standards, and data management critical role in driving the quality, efficiency, and security of technical education. The adoption of these standards fosters transparency, stakeholder coordination, and regulatory compliance, ultimately empowering the technical education sector to meet the dynamic demands of the 21st century.Keywords: interoperability, education, standards, governance
Procedia PDF Downloads 7323808 Generating Real-Time Visual Summaries from Located Sensor-Based Data with Chorems
Authors: Z. Bouattou, R. Laurini, H. Belbachir
Abstract:
This paper describes a new approach for the automatic generation of the visual summaries dealing with cartographic visualization methods and sensors real time data modeling. Hence, the concept of chorems seems an interesting candidate to visualize real time geographic database summaries. Chorems have been defined by Roger Brunet (1980) as schematized visual representations of territories. However, the time information is not yet handled in existing chorematic map approaches, issue has been discussed in this paper. Our approach is based on spatial analysis by interpolating the values recorded at the same time, by sensors available, so we have a number of distributed observations on study areas and used spatial interpolation methods to find the concentration fields, from these fields and by using some spatial data mining procedures on the fly, it is possible to extract important patterns as geographic rules. Then, those patterns are visualized as chorems.Keywords: geovisualization, spatial analytics, real-time, geographic data streams, sensors, chorems
Procedia PDF Downloads 40323807 Need for Privacy in the Technological Era: An Analysis in the Indian Perspective
Authors: Amrashaa Singh
Abstract:
In the digital age and the large cyberspace, Data Protection and Privacy have become major issues in this technological era. There was a time when social media and online shopping websites were treated as a blessing for the people. But now the tables have turned, and the people have started to look at them with suspicion. They are getting aware of the privacy implications, and they do not feel as safe as they used to initially. When Edward Snowden informed the world about the snooping United States Security Agencies had been doing, that is when the picture became clear for the people. After the Cambridge Analytica case where the data of Facebook users were stored without their consent, the doubts arose in the minds of people about how safe they actually are. In India, the case of spyware Pegasus also raised a lot of concerns. It was used to snoop on a lot of human right activists and lawyers and the company which invented the spyware claims that it only sells it to the government. The paper will be dealing with the privacy concerns in the Indian perspective with an analytical methodology. The Supreme Court here had recently declared a right to privacy a Fundamental Right under Article 21 of the Constitution of India. Further, the Government is also working on the Data Protection Bill. The point to note is that India is still a developing country, and with the bill, the government aims at data localization. But there are doubts in the minds of many people that the Government would actually be snooping on the data of the individuals. It looks more like an attempt to curb dissenters ‘lawfully’. The focus of the paper would be on these issues in India in light of the European Union (EU) General Data Protection Regulation (GDPR). The Indian Data Protection Bill is also said to be loosely based on EU GDPR. But how helpful would these laws actually be is another concern since the economic and social conditions in both countries are very different? The paper aims at discussing these concerns, how good or bad is the intention of the government behind the bill, and how the nations can act together and draft common regulations so that there is some uniformity in the laws and their application.Keywords: Article 21, data protection, dissent, fundamental right, India, privacy
Procedia PDF Downloads 11523806 An Online 3D Modeling Method Based on a Lossless Compression Algorithm
Authors: Jiankang Wang, Hongyang Yu
Abstract:
This paper proposes a portable online 3D modeling method. The method first utilizes a depth camera to collect data and compresses the depth data using a frame-by-frame lossless data compression method. The color image is encoded using the H.264 encoding format. After the cloud obtains the color image and depth image, a 3D modeling method based on bundlefusion is used to complete the 3D modeling. The results of this study indicate that this method has the characteristics of portability, online, and high efficiency and has a wide range of application prospects.Keywords: 3D reconstruction, bundlefusion, lossless compression, depth image
Procedia PDF Downloads 8323805 Numerical Study for the Estimation of Hydrodynamic Current Drag Coefficients for the Colombian Navy Frigates Using Computational Fluid Dynamics
Authors: Mauricio Gracia, Luis Leal, Bharat Verma
Abstract:
Computational fluid dynamics (CFD) has become nowadays an important tool in the process of hydrodynamic design of modern ships. CFD is used to model any phenomena related to fluid flow in a control volume like a ship or any offshore structure in the sea. In the present study, the current force drag coefficients for a Colombian Navy Frigate in deep and shallow water are estimated through the application of CFD. The study shows the process of simulating the ship current drag coefficients using the CFD simulations method, which is conducted using STAR-CCM+ software package. The Almirante Padilla class Frigate ship scale model is investigated. The results show the ship current drag coefficient calculated considering a current speed of 1 knot with a 90° drift angle for the full-scale ship. Predicted results were compared against the current drag coefficients published in the Lloyds register OCIMF report. It is shown that the simulation results agree fairly well with the published results and that STAR-CCM+ code can predict current drag coefficients.Keywords: CFD, current draft coefficient, STAR-CCM+, OCIMF, Bollard pull
Procedia PDF Downloads 17723804 H∞ Sampled-Data Control for Linear Systems Time-Varying Delays: Application to Power System
Authors: Chang-Ho Lee, Seung-Hoon Lee, Myeong-Jin Park, Oh-Min Kwon
Abstract:
This paper investigates improved stability criteria for sampled-data control of linear systems with disturbances and time-varying delays. Based on Lyapunov-Krasovskii stability theory, delay-dependent conditions sufficient to ensure H∞ stability for the system are derived in the form of linear matrix inequalities(LMI). The effectiveness of the proposed method will be shown in numerical examples.Keywords: sampled-data control system, Lyapunov-Krasovskii functional, time delay-dependent, LMI, H∞ control
Procedia PDF Downloads 32123803 Redefining Health Information Systems with Machine Learning: Harnessing the Potential of AI-Powered Data Fusion Ecosystems
Authors: Shohoni Mahabub
Abstract:
Health Information Systems (HIS) are essential to contemporary healthcare; nonetheless, they frequently encounter challenges such as data fragmentation, inefficiencies, and an absence of real-time analytics. The advent of machine learning (ML) and artificial intelligence (AI) provides a revolutionary potential to address these difficulties via AI-driven data fusion ecosystems. These ecosystems integrate many health data sources, including electronic health records (EHRs), wearable devices, and genetic data, with sophisticated machine learning techniques such as natural language processing (NLP) and predictive analytics to produce actionable insights. Through the integration of strong data intake layers, secure interoperability protocols, and privacy-preserving models, these ecosystems provide individualized treatment, early illness diagnosis, and enhanced operational efficiency. This paradigm change enhances clinical decision-making and rectifies systemic inefficiencies in healthcare delivery. Nonetheless, adoption presents problems such as data privacy concerns, ethical considerations, and scalability constraints. The study examines options such as federated learning for safe, decentralized data sharing, explainable AI for transparency, and cloud-based infrastructure for scalability to address these issues. These ecosystems aim to address health equity disparities, particularly in resource-limited environments, and improve public health surveillance, notably in pandemic response initiatives. This article emphasizes the revolutionary potential of AI-driven data fusion ecosystems in redefining Health Information Systems by providing an implementation roadmap and showcasing successful deployment case studies. The suggested method promotes a cooperative initiative among legislators, healthcare professionals, and technology to establish a cohesive, efficient, and patient-centric healthcare model.Keywords: AI-powered healthcare systems, data fusion ecosystem, predictive analytics, digital health interoperability
Procedia PDF Downloads 1623802 End-to-End Pyramid Based Method for Magnetic Resonance Imaging Reconstruction
Authors: Omer Cahana, Ofer Levi, Maya Herman
Abstract:
Magnetic Resonance Imaging (MRI) is a lengthy medical scan that stems from a long acquisition time. Its length is mainly due to the traditional sampling theorem, which defines a lower boundary for sampling. However, it is still possible to accelerate the scan by using a different approach such as Compress Sensing (CS) or Parallel Imaging (PI). These two complementary methods can be combined to achieve a faster scan with high-fidelity imaging. To achieve that, two conditions must be satisfied: i) the signal must be sparse under a known transform domain, and ii) the sampling method must be incoherent. In addition, a nonlinear reconstruction algorithm must be applied to recover the signal. While the rapid advances in Deep Learning (DL) have had tremendous successes in various computer vision tasks, the field of MRI reconstruction is still in its early stages. In this paper, we present an end-to-end method for MRI reconstruction from k-space to image. Our method contains two parts. The first is sensitivity map estimation (SME), which is a small yet effective network that can easily be extended to a variable number of coils. The second is reconstruction, which is a top-down architecture with lateral connections developed for building high-level refinement at all scales. Our method holds the state-of-art fastMRI benchmark, which is the largest, most diverse benchmark for MRI reconstruction.Keywords: magnetic resonance imaging, image reconstruction, pyramid network, deep learning
Procedia PDF Downloads 9223801 Logistics Information Systems in the Distribution of Flour in Nigeria
Authors: Cornelius Femi Popoola
Abstract:
This study investigated logistics information systems in the distribution of flour in Nigeria. A case study design was used and 50 staff of Honeywell Flour Mill was sampled for the study. Data generated through a questionnaire were analysed using correlation and regression analysis. The findings of the study revealed that logistic information systems such as e-commerce, interactive telephone systems and electronic data interchange positively correlated with the distribution of flour in Honeywell Flour Mill. Finding also deduced that e-commerce, interactive telephone systems and electronic data interchange jointly and positively contribute to the distribution of flour in Honeywell Flour Mill in Nigeria (R = .935; Adj. R2 = .642; F (3,47) = 14.739; p < .05). The study therefore recommended that Honeywell Flour Mill should upgrade their logistic information systems to computer-to-computer communication of business transactions and documents, as well adopt new technology such as, tracking-and-tracing systems (barcode scanning for packages and palettes), tracking vehicles with Global Positioning System (GPS), measuring vehicle performance with ‘black boxes’ (containing logistic data), and Automatic Equipment Identification (AEI) into their systems.Keywords: e-commerce, electronic data interchange, flour distribution, information system, interactive telephone systems
Procedia PDF Downloads 557