Search results for: categorical data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24596

Search results for: categorical data

24506 Assessing an Instrument Usability: Response Interpolation and Scale Sensitivity

Authors: Betsy Ng, Seng Chee Tan, Choon Lang Quek, Peter Looker, Jaime Koh

Abstract:

The purpose of the present study was to determine the particular scale rating that stands out for an instrument. The instrument was designed to assess student perceptions of various learning environments, namely face-to-face, online and blended. The original instrument had a 5-point Likert items (1 = strongly disagree and 5 = strongly agree). Alternate versions were modified with a 6-point Likert scale and a bar scale rating. Participants consisted of undergraduates in a local university were involved in the usability testing of the instrument in an electronic setting. They were presented with the 5-point, 6-point and percentage-bar (100-point) scale ratings, in response to their perceptions of learning environments. The 5-point and 6-point Likert scales were presented in the form of radio button controls for each number, while the percentage-bar scale was presented with a sliding selection. Among these responses, 6-point Likert scale emerged to be the best overall. When participants were confronted with the 5-point items, they either chose 3 or 4, suggesting that data loss could occur due to the insensitivity of instrument. The insensitivity of instrument could be due to the discreet options, as evidenced by response interpolation. To avoid the constraint of discreet options, the percentage-bar scale rating was tested, but the participant responses were not well-interpolated. The bar scale might have provided a variety of responses without a constraint of a set of categorical options, but it seemed to reflect a lack of perceived and objective accuracy. The 6-point Likert scale was more likely to reflect a respondent’s perceived and objective accuracy as well as higher sensitivity. This finding supported the conclusion that 6-point Likert items provided a more accurate measure of the participant’s evaluation. The 5-point and bar scale ratings might not be accurately measuring the participants’ responses. This study highlighted the importance of the respondent’s perception of accuracy, respondent’s true evaluation, and the scale’s ease of use. Implications and limitations of this study were also discussed.

Keywords: usability, interpolation, sensitivity, Likert scales, accuracy

Procedia PDF Downloads 399
24505 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have within a short period of time. Consequently this fast increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and its current trends. This paper presents a complete discussion on state-of-the-art big data technologies based on group and stream data processing. Moreover, strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendor, several open research challenges and the chances brought about by big data. The similarities and differences of these techniques and technologies based on important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, it community, industry, big data

Procedia PDF Downloads 181
24504 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the multiplied data growth, many computer science tools have been developed to process and analyze these Big Data. High-performance computing architectures have been designed to meet the treatment needs of Big Data (view transaction processing standpoint, strategic, and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures especially what has a relation with Analytics and Data Mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 506
24503 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories

Authors: Prashant Shrivastava

Abstract:

The purpose of this study is to explore re3dat.org registry to identify research data repositories registration workflow process. Further objective is to depict a graph for present development of research data repositories in India. Preliminarily with an approach to understand re3data.org registry framework and schema design then further proceed to explore the status of research data repositories of India in re3data.org registry. Research data repositories are getting wider relevance due to e-research concepts. Now available registry re3data.org is a good tool for users and researchers to identify appropriate research data repositories as per their research requirements. In Indian environment, a compatible National Research Data Policy is the need of the time to boost the management of research data. Registry for Research Data Repositories is a crucial tool to discover specific information in specific domain. Also, Research Data Repositories in India have not been studied. Re3data.org registry and status of Indian research data repositories both discussed in this study.

Keywords: research data, research data repositories, research data registry, re3data.org

Procedia PDF Downloads 311
24502 Study of Three-Dimensional Computed Tomography of Frontoethmoidal Cells Using International Frontal Sinus Anatomy Classification

Authors: Prabesh Karki, Shyam Thapa Chettri, Bajarang Prasad Sah, Manoj Bhattarai, Sudeep Mishra

Abstract:

Introduction: Frontal sinus is frequently described as the most difficult sinus to access surgically due to its proximity to the cribriform plate, orbit, and anterior ethmoid artery. Frontal sinus surgery requires a detailed understanding of the cellular structure and FSDP unique to each patient, making high-resolution CT scans an indispensable tool to assess the difficulty of planned sinus surgery. International Frontal Sinus Anatomy Classification (IFAC) was developed to provide a more precise nomenclature for cells in the frontal recess, classifying cells based on their anatomic origin. Objectives: To assess the proportion of frontal cell variants defined by IFAC, variation with respect to age and gender. Methods: 54 cases were enrolled after a detailed clinical history, thorough general and physical examinations, and CT a report ordered in a film. Assessment and tabulation of the presence of frontal cells according to the IFAC analyzed. The prevalence of each cell type was calculated, and data were entered in MS Excel and analyzed using Statistical Package for the Social Sciences (SPSS). Descriptive statistics and frequencies were defined for categorical and numerical variables. Frequency, percentage, the mean and standard deviation were calculated. Result: Among 54 patients, 30 (55.6%) were male and 24 (44.4%) were female. The patient enrolled ranged from 18 to 78 years. Majority33.3% (n=18) were in age group of >50 years.According to IFAC, Agger nasi cells (92.6%) were most common, whereas supraorbital ethmoidal cells were least common 16 (29.6%). Prevalence of other frontoethmoidal cells was SAC- 57.4%, SAFC- 38.9%, SBC- 74.1%, SBFC- 33.3%, FSC- 38.9% of 54 cases. Conclusion: IFAC is an international consensus document that describes an anatomically precise nomenclature for classifying frontoethmoidal cells' anatomy. This study has defined the prevalence, symmetry and reliability of frontoethmoidal cells as established by the IFAC system as in other parts of the world.

Keywords: frontal sinus, frontoethmoidal cells, international frontal sinus anatomy classification

Procedia PDF Downloads 84
24501 A Study of Cloud Computing Solution for Transportation Big Data Processing

Authors: Ilgin Gökaşar, Saman Ghaffarian

Abstract:

The need for fast processed big data of transportation ridership (eg., smartcard data) and traffic operation (e.g., traffic detectors data) which requires a lot of computational power is incontrovertible in Intelligent Transportation Systems. Nowadays cloud computing is one of the important subjects and popular information technology solution for data processing. It enables users to process enormous measure of data without having their own particular computing power. Thus, it can also be a good selection for transportation big data processing as well. This paper intends to examine how the cloud computing can enhance transportation big data process with contrasting its advantages and disadvantages, and discussing cloud computing features.

Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing

Procedia PDF Downloads 449
24500 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data to be ready for data mining exploration take up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected from an actual harmonic monitoring system in a distribution system in Australia for three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes to better understanding of the data have been presented. Underlying classes in this data has then been identified using clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 233
24499 Merging and Comparing Ontologies Generically

Authors: Xiuzhan Guo, Arthur Berrill, Ajinkya Kulkarni, Kostya Belezko, Min Luo

Abstract:

Ontology operations, e.g., aligning and merging, were studied and implemented extensively in different settings, such as categorical operations, relation algebras, and typed graph grammars, with different concerns. However, aligning and merging operations in the settings share some generic properties, e.g., idempotence, commutativity, associativity, and representativity, labeled by (I), (C), (A), and (R), respectively, which are defined on an ontology merging system (D~M), where D is a non-empty set of the ontologies concerned, ~ is a binary relation on D modeling ontology aligning and M is a partial binary operation on D modeling ontology merging. Given an ontology repository, a finite set O ⊆ D, its merging closure Ô is the smallest set of ontologies, which contains the repository and is closed with respect to merging. If (I), (C), (A), and (R) are satisfied, then both D and Ô are partially ordered naturally by merging, Ô is finite and can be computed, compared, and sorted efficiently, including sorting, selecting, and querying some specific elements, e.g., maximal ontologies and minimal ontologies. We also show that the ontology merging system, given by ontology V -alignment pairs and pushouts, satisfies the properties: (I), (C), (A), and (R) so that the merging system is partially ordered and the merging closure of a given repository with respect to pushouts can be computed efficiently.

Keywords: ontology aligning, ontology merging, merging system, poset, merging closure, ontology V-alignment pair, ontology homomorphism, ontology V-alignment pair homomorphism, pushout

Procedia PDF Downloads 882
24498 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies and even in countries competitions. Analyzing of patent data is crucial since patents cover large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.

Keywords: data mining, fuzzy sets, linguistic summarization, patent data

Procedia PDF Downloads 258
24497 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles e.g. forests, hills, urban areas, the data collection is realized in several ways. The collection of data uses connection via wireless communication, LAN network, GSM network and in certain areas data are collected by using vehicles. In order to ensure the connection to the server most of the probes have ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select suitable communication channel.

Keywords: communication, computer network, data collection, probe

Procedia PDF Downloads 350
24496 A Comparative Study between Digital Mammography, B Mode Ultrasound, Shear-Wave and Strain Elastography to Distinguish Benign and Malignant Breast Masses

Authors: Arjun Prakash, Samanvitha H.

Abstract:

BACKGROUND: Breast cancer is the commonest malignancy among women globally, with an estimated incidence of 2.3 million new cases as of 2020, representing 11.7% of all malignancies. As per Globocan data 2020, it accounted for 13.5% of all cancers and 10.6% of all cancer deaths in India. Early diagnosis and treatment can improve the overall morbidity and mortality, which necessitates the importance of differentiating benign from malignant breast masses. OBJECTIVE: The objective of the present study was to evaluate and compare the role of Digital Mammography (DM), B mode Ultrasound (USG), Shear Wave Elastography (SWE) and Strain Elastography (SE) in differentiating benign and malignant breast masses (ACR BI-RADS 3 - 5). Histo-Pathological Examination (HPE) was considered the Gold standard. MATERIALS & METHODS: We conducted a cross-sectional study on 53 patients with 64 breast masses over a period of 10 months. All patients underwent DM, USG, SWE and SE. These modalities were individually assessed to know their accuracy in differentiating benign and malignant masses. All Digital Mammograms were done using the Fujifilm AMULET Innovality Digital Mammography system and all Ultrasound examinations were performed on SAMSUNG RS 80 EVO Ultrasound system equipped with 2 to 9 MHz and 3 – 16 MHz linear transducers. All masses were subjected to HPE. Independent t-test and Chi-square or Fisher’s exact test were used to assess continuous and categorical variables, respectively. ROC analysis was done to assess the accuracy of diagnostic tests. RESULTS: Of 64 lesions, 51 (79.68%) were malignant and 13 (20.31%) (p < 0.0001) were benign. SE was the most specific (100%) (p < 0.0001) and USG (98%) (p < 0.0001) was the most sensitive of all the modalities. E max, E mean, E max ratio, E mean ratio and Strain Ratio of the malignant masses significantly differed from those of the benign masses. Maximum SWE value showed the highest sensitivity (88.2%) (p < 0.0001) among the elastography parameters. A combination of USG, SE and SWE had good sensitivity (86%) (p < 0.0001). CONCLUSION: A combination of USG, SE and SWE improves overall diagnostic yield in differentiating benign and malignant breast masses. Early diagnosis and treatment of breast carcinoma will reduce patient mortality and morbidity.

Keywords: digital mammography, breast cancer, ultrasound, elastography

Procedia PDF Downloads 97
24495 A Review on Big Data Movement with Different Approaches

Authors: Nay Myo Sandar

Abstract:

With the growth of technologies and applications, a large amount of data has been producing at increasing rate from various resources such as social media networks, sensor devices, and other information serving devices. This large collection of massive, complex and exponential growth of dataset is called big data. The traditional database systems cannot store and process such data due to large and complexity. Consequently, cloud computing is a potential solution for data storage and processing since it can provide a pool of resources for servers and storage. However, moving large amount of data to and from is a challenging issue since it can encounter a high latency due to large data size. With respect to big data movement problem, this paper reviews the literature of previous works, discusses about research issues, finds out approaches for dealing with big data movement problem.

Keywords: Big Data, Cloud Computing, Big Data Movement, Network Techniques

Procedia PDF Downloads 70
24494 External Validation of Established Pre-Operative Scoring Systems in Predicting Response to Microvascular Decompression for Trigeminal Neuralgia

Authors: Kantha Siddhanth Gujjari, Shaani Singhal, Robert Andrew Danks, Adrian Praeger

Abstract:

Background: Trigeminal neuralgia (TN) is a heterogenous pain syndrome characterised by short paroxysms of lancinating facial pain in the distribution of the trigeminal nerve, often triggered by usually innocuous stimuli. TN has a low prevalence of less than 0.1%, of which 80% to 90% is caused by compression of the trigeminal nerve from an adjacent artery or vein. The root entry zone of the trigeminal nerve is most sensitive to neurovascular conflict (NVC), causing dysmyelination. Whilst microvascular decompression (MVD) is an effective treatment for TN with NVC, all patients do not achieve long-term pain relief. Pre-operative scoring systems by Panczykowski and Hardaway have been proposed but have not been externally validated. These pre-operative scoring systems are composite scores calculated according to a subtype of TN, presence and degree of neurovascular conflict, and response to medical treatments. There is discordance in the assessment of NVC identified on pre-operative magnetic resonance imaging (MRI) between neurosurgeons and radiologists. To our best knowledge, the prognostic impact for MVD of this difference of interpretation has not previously been investigated in the form of a composite scoring system such as those suggested by Panczykowski and Hardaway. Aims: This study aims to identify prognostic factors and externally validate the proposed scoring systems by Panczykowski and Hardaway for TN. A secondary aim is to investigate the prognostic difference between a neurosurgeon's interpretation of NVC on MRI compared with a radiologist’s. Methods: This retrospective cohort study included 95 patients who underwent de novo MVD in a single neurosurgical unit in Melbourne. Data was recorded from patients’ hospital records and neurosurgeon’s correspondence from perioperative clinic reviews. Patient demographics, type of TN, distribution of TN, response to carbamazepine, neurosurgeon, and radiologist interpretation of NVC on MRI, were clearly described prospectively and preoperatively in the correspondence. Scoring systems published by Panczykowski et al. and Hardaway et al. were used to determine composite scores, which were compared with the recurrence of TN recorded during follow-up over 1-year. Categorical data analysed using Pearson chi-square testing. Independent numerical and nominal data analysed with logistical regression. Results: Logistical regression showed that a Panczykowski composite score of greater than 3 points was associated with a higher likelihood of pain-free outcome 1-year post-MVD with an OR 1.81 (95%CI 1.41-2.61, p=0.032). The composite score using neurosurgeon’s impression of NVC had an OR 2.96 (95%CI 2.28-3.31, p=0.048). A Hardaway composite score of greater than 2 points was associated with a higher likelihood of pain-free outcome 1 year post-MVD with an OR 3.41 (95%CI 2.58-4.37, p=0.028). The composite score using neurosurgeon’s impression of NVC had an OR 3.96 (95%CI 3.01-4.65, p=0.042). Conclusion: Composite scores developed by Panczykowski and Hardaway were validated for the prediction of response to MVD in TN. A composite score based on the neurosurgeon’s interpretation of NVC on MRI, when compared with the radiologist’s had a greater correlation with pain-free outcomes 1 year post-MVD.

Keywords: de novo microvascular decompression, neurovascular conflict, prognosis, trigeminal neuralgia

Procedia PDF Downloads 66
24493 Optimized Approach for Secure Data Sharing in Distributed Database

Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal

Abstract:

In the current age of technology, information is the most precious asset of a company. Today, companies have a large amount of data. As the data become larger, access to data for some particular information is becoming slower day by day. Faster data processing to shape it in the form of information is the biggest issue. The major problems in distributed databases are the efficiency of data distribution and response time of data distribution. The security of data distribution is also a big issue. For these problems, we proposed a strategy that can maximize the efficiency of data distribution and also increase its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The newly proposed technique facilitates the companies for secure data sharing efficiently and quickly.

Keywords: ER-schema, electronic record, P2P framework, API, query formulation

Procedia PDF Downloads 319
24492 Intelligent Indoor Localization Using WLAN Fingerprinting

Authors: Gideon C. Joseph

Abstract:

The ability to localize mobile devices is quite important, as some applications may require location information of these devices to operate or deliver better services to the users. Although there are several ways of acquiring location data of mobile devices, the WLAN fingerprinting approach has been considered in this work. This approach uses the Received Signal Strength Indicator (RSSI) measurement as a function of the position of the mobile device. RSSI is a quantitative technique of describing the radio frequency power carried by a signal. RSSI may be used to determine RF link quality and is very useful in dense traffic scenarios where interference is of major concern, for example, indoor environments. This research aims to design a system that can predict the location of a mobile device, when supplied with the mobile’s RSSIs. The developed system takes as input the RSSIs relating to the mobile device, and outputs parameters that describe the location of the device such as the longitude, latitude, floor, and building. The relationship between the Received Signal Strengths (RSSs) of mobile devices and their corresponding locations is meant to be modelled; hence, subsequent locations of mobile devices can be predicted using the developed model. It is obvious that describing mathematical relationships between the RSSIs measurements and localization parameters is one option to modelling the problem, but the complexity of such an approach is a serious turn-off. In contrast, we propose an intelligent system that can learn the mapping of such RSSIs measurements to the localization parameters to be predicted. The system is capable of upgrading its performance as more experiential knowledge is acquired. The most appealing consideration to using such a system for this task is that complicated mathematical analysis and theoretical frameworks are excluded or not needed; the intelligent system on its own learns the underlying relationship in the supplied data (RSSI levels) that corresponds to the localization parameters. These localization parameters to be predicted are of two different tasks: Longitude and latitude of mobile devices are real values (regression problem), while the floor and building of the mobile devices are of integer values or categorical (classification problem). This research work presents artificial neural network based intelligent systems to model the relationship between the RSSIs predictors and the mobile device localization parameters. The designed systems were trained and validated on the collected WLAN fingerprint database. The trained networks were then tested with another supplied database to obtain the performance of trained systems on achieved Mean Absolute Error (MAE) and error rates for the regression and classification tasks involved therein.

Keywords: indoor localization, WLAN fingerprinting, neural networks, classification, regression

Procedia PDF Downloads 333
24491 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before taking a decision about money. The aim of this work is to analyze the factors that influence the final price of the houses through data mining algorithms. To our best knowledge, previous work was researched just to compare results. Furthermore, before using the data of the data set, the Z-Transformation were used to standardize the data in the same range. Hence, the data was classified into two groups to visualize them in a readability format. A decision tree was built, and graphical data is displayed where clearly is easy to see the results and the factors' influence in these graphics. The definitions of these methods are described, as well as the descriptions of the results. Finally, conclusions and recommendations are presented related to the released results that our research showed making it easier to apply these algorithms using a customized data set.

Keywords: algorithms, data, decision tree, transformation

Procedia PDF Downloads 362
24490 Application of Blockchain Technology in Geological Field

Authors: Mengdi Zhang, Zhenji Gao, Ning Kang, Rongmei Liu

Abstract:

Management and application of geological big data is an important part of China's national big data strategy. With the implementation of a national big data strategy, geological big data management becomes more and more critical. At present, there are still a lot of technology barriers as well as cognition chaos in many aspects of geological big data management and application, such as data sharing, intellectual property protection, and application technology. Therefore, it’s a key task to make better use of new technologies for deeper delving and wider application of geological big data. In this paper, we briefly introduce the basic principle of blockchain technology at the beginning and then make an analysis of the application dilemma of geological data. Based on the current analysis, we bring forward some feasible patterns and scenarios for the blockchain application in geological big data and put forward serval suggestions for future work in geological big data management.

Keywords: blockchain, intellectual property protection, geological data, big data management

Procedia PDF Downloads 71
24489 Effectiveness of Adrenal Venous Sampling in the Management of Primary Aldosteronism: Single Centered Cohort Study at a Tertiary Care Hospital in Sri Lanka

Authors: Balasooriya B. M. C. M., Sujeeva N., Thowfeek Z., Siddiqa Omo, Liyanagunawardana J. E., Jayawardana Saiu, Manathunga S. S., Katulanda G. W.

Abstract:

Introduction and objectives: Adrenal venous sampling (AVS) is the gold standard to discriminate unilateral primary aldosteronism (UPA) from bilateral disease (BPA). AVS is technically demanding and only performed in a limited number of centers worldwide. To the best of our knowledge, Except for one study conducted in India, no other research studies on this area have been conducted in South Asia. This study aimed to evaluate the effectiveness of AVS in the management of primary aldosteronism. Methods: A total of 32 patients who underwent AVS at the National Hospital of Sri Lanka from April 2021 to April 2023 were enrolled. Demographic, clinical and laboratory data were obtained retrospectively. A procedure was considered successful when adequate cannulation of both adrenal veins was demonstrated. Cortisol gradient across the adrenal vein (AV) and the peripheral vein was used to establish the success of venous cannulation. Lateralization was determined by the aldosterone gradient between the two sides. Continuous and categorical variables were summarized with mean, SD, and proportions, respectively. The mean and standard deviation of the contralateral suppression index (CSI) were estimated with an intercept-only Bayesian inference model. Results: Of the 32 patients, the average age was 52.47 +26.14 and 19 (59.4%) were males. Both AVs were successfully cannulated in 12 (37.5%). Among them, lateralization was demonstrated in 11(91.7%), and one was diagnosed as a bilateral disease. There were no total failures. Right AV cannulation was unsuccessful in 18 (56.25%), of which lateralization was demonstrated in 9 (50%), and others were inconclusive. Left AV cannulation was unsuccessful only in 2 (6.25%); one was lateralized, and the other remained inconclusive. The estimated mean of the CSI was 0.33 (89% credible interval 0.11-0.86). Seven patients underwent unilateral adrenalectomy and demonstrated significant improvement in blood pressure during follow-up. Two patients await surgery. Others were treated medically. Conclusions: Despite failure due to procedural difficulties, AVS remained useful in the management of patients with PA. Moreover, the success of the procedure needs experienced hands and advanced equipment to achieve optimal outcomes in PA.

Keywords: adrenal venous sampling, lateralization, contralateral suppression index, primary aldosteronism

Procedia PDF Downloads 52
24488 Comparative Study of Deep Reinforcement Learning Algorithm Against Evolutionary Algorithms for Finding the Optimal Values in a Simulated Environment Space

Authors: Akshay Paranjape, Nils Plettenberg, Robert Schmitt

Abstract:

Traditional optimization methods like evolutionary algorithms are widely used in production processes to find an optimal or near-optimal solution of control parameters based on the simulated environment space of a process. These algorithms are computationally intensive and therefore do not provide the opportunity for real-time optimization. This paper utilizes the Deep Reinforcement Learning (DRL) framework to find an optimal or near-optimal solution for control parameters. A model based on maximum a posteriori policy optimization (Hybrid-MPO) that can handle both numerical and categorical parameters is used as a benchmark for comparison. A comparative study shows that DRL can find optimal solutions of similar quality as compared to evolutionary algorithms while requiring significantly less time making them preferable for real-time optimization. The results are confirmed in a large-scale validation study on datasets from production and other fields. A trained XGBoost model is used as a surrogate for process simulation. Finally, multiple ways to improve the model are discussed.

Keywords: reinforcement learning, evolutionary algorithms, production process optimization, real-time optimization, hybrid-MPO

Procedia PDF Downloads 99
24487 Frequent Item Set Mining for Big Data Using MapReduce Framework

Authors: Tamanna Jethava, Rahul Joshi

Abstract:

Frequent Item sets play an essential role in many data Mining tasks that try to find interesting patterns from the database. Typically it refers to a set of items that frequently appear together in transaction dataset. There are several mining algorithm being used for frequent item set mining, yet most do not scale to the type of data we presented with today, so called “BIG DATA”. Big Data is a collection of large data sets. Our approach is to work on the frequent item set mining over the large dataset with scalable and speedy way. Big Data basically works with Map Reduce along with HDFS is used to find out frequent item sets from Big Data on large cluster. This paper focuses on using pre-processing & mining algorithm as hybrid approach for big data over Hadoop platform.

Keywords: frequent item set mining, big data, Hadoop, MapReduce

Procedia PDF Downloads 410
24486 The Role Of Data Gathering In NGOs

Authors: Hussaini Garba Mohammed

Abstract:

Background/Significance: The lack of data gathering is affecting NGOs world-wide in general to have good data information about educational and health related issues among communities in any country and around the world. For example, HIV/AIDS smoking (Tuberculosis diseases) and COVID-19 virus carriers is becoming a serious public health problem, especially among old men and women. But there is no full details data survey assessment from communities, villages, and rural area in some countries to show the percentage of victims and patients, especial with this world COVID-19 virus among the people. These data are essential to inform programming targets, strategies, and priorities in getting good information about data gathering in any society.

Keywords: reliable information, data assessment, data mining, data communication

Procedia PDF Downloads 165
24485 The Application of Data Mining Technology in Building Energy Consumption Data Analysis

Authors: Liang Zhao, Jili Zhang, Chongquan Zhong

Abstract:

Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into the data mining technology to determine its application in the analysis of building energy consumption data including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature are reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.

Keywords: data mining, data analysis, prediction, optimization, building operational performance

Procedia PDF Downloads 836
24484 To Handle Data-Driven Software Development Projects Effectively

Authors: Shahnewaz Khan

Abstract:

Machine learning (ML) techniques are often used in projects for creating data-driven applications. These tasks typically demand additional research and analysis. The proper technique and strategy must be chosen to ensure the success of data-driven projects. Otherwise, even exerting a lot of effort, the necessary development might not always be possible. In this post, an effort to examine the workflow of data-driven software development projects and its implementation process in order to describe how to manage a project successfully. Which will assist in minimizing the added workload.

Keywords: data, data-driven projects, data science, NLP, software project

Procedia PDF Downloads 67
24483 Comparison Between Bispectral Index Guided Anesthesia and Standard Anesthesia Care in Middle Age Adult Patients Undergoing Modified Radical Mastectomy

Authors: Itee Chowdhury, Shikha Modi

Abstract:

Introduction: Cancer is beginning to outpace cardiovascular disease as a cause of death affecting every major organ system with profound implications for perioperative management. Breast cancer is the most common cancer in women in India, accounting for 27% of all cancers. The small changes in analgesic management of cancer patients can greatly improve prognosis and reduce the risk of postsurgical cancer recurrence as opioid-based analgesia has a deleterious effect on cancer outcomes. Shortened postsurgical recovery time facilitates earlier return to intended oncological therapy maximising the chance of successful treatment. Literature reveals that the role of BIS since FDA approval has been assessed in various types of surgeries, but clinical data on its use in oncosurgical patients are scanty. Our study focuses on the role of BIS-guided anaesthesia for breast cancer surgery patients. Methods: A prospective randomized controlled study in patients aged 36-55years scheduled for modified radical mastectomy was conducted in 51 patients in each group who met the inclusion and exclusion criteria, and randomization was done by sealed envelope technique. In BIS guided anaesthesia group (B), sevoflurane was titrated to keep the BIS value 45-60, and thereafter if the patient showed hypertension/tachycardia, an opioid was given. In standard anaesthesia care (group C), sevoflurane was titrated to keep MAC in the range of 0.8-1, and fentanyl was given if the patient showed hypertension/tachycardia. Intraoperative opioid consumption was calculated. Postsurgery recovery characteristics, including Aldrete score, were assessed. Patients were questioned for pain, PONV, and recall of the intraoperative event. A comparison of age, BMI, ASA, recovery characteristics, opioid, and VAS score was made using the non-parametric Mann-Whitney U test. Categorical data like intraoperative awareness of surgery and PONV was studied using the Chi-square test. A comparison of heart rate and MAP was made by an independent sample t-test. #ggplot2 package was used to show the trend of the BIS index for all intraoperative time points for each patient. For a statistical test of significance, the cut-off p-value was set as <0.05. Conclusions: BIS monitoring led to reduced opioid consumption and early recovery from anaesthesia in breast cancer patients undergoing MRM resulting in less postoperative nausea and vomiting and less pain intensity in the immediate postoperative period without any recall of the intraoperative event. Thus, the use of a Bispectral index monitor allows for tailoring of anaesthesia administration with a good outcome.

Keywords: bispectral index, depth of anaesthesia, recovery, opioid consumption

Procedia PDF Downloads 114
24482 Emotions in Health Tweets: Analysis of American Government Official Accounts

Authors: García López

Abstract:

The Government Departments of Health have the task of informing and educating citizens about public health issues. For this, they use channels like Twitter, key in the search for health information and the propagation of content. The tweets, important in the virality of the content, may contain emotions that influence the contagion and exchange of knowledge. The goal of this study is to perform an analysis of the emotional projection of health information shared on Twitter by official American accounts: the disease control account CDCgov, National Institutes of Health, NIH, the government agency HHSGov, and the professional organization PublicHealth. For this, we used Tone Analyzer, an International Business Machines Corporation (IBM) tool specialized in emotion detection in text, corresponding to the categorical model of emotion representation. For 15 days, all tweets from these accounts were analyzed with the emotional analysis tool in text. The results showed that their tweets contain an important emotional load, a determining factor in the success of their communications. This exposes that official accounts also use subjective language and contain emotions. The predominance of emotion joy over sadness and the strong presence of emotions in their tweets stimulate the virality of content, a key in the work of informing that government health departments have.

Keywords: emotions in tweets, emotion detection in the text, health information on Twitter, American health official accounts, emotions on Twitter, emotions and content

Procedia PDF Downloads 125
24481 The Relationship Between Artificial Intelligence, Data Science, and Privacy

Authors: M. Naidoo

Abstract:

Artificial intelligence often requires large amounts of good quality data. Within important fields, such as healthcare, the training of AI systems predominately relies on health and personal data; however, the usage of this data is complicated by various layers of law and ethics that seek to protect individuals’ privacy rights. This research seeks to establish the challenges AI and data sciences pose to (i) informational rights, (ii) privacy rights, and (iii) data protection. To solve some of the issues presented, various methods are suggested, such as embedding values in technological development, proper balancing of rights and interests, and others.

Keywords: artificial intelligence, data science, law, policy

Procedia PDF Downloads 93
24480 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.

Keywords: simulation data, data summarization, spatial histograms, exploration, visualization

Procedia PDF Downloads 165
24479 Algorithms used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data like GIS data is important to reduce the data and extract information. Therefore, the development of new techniques and tools that support the human in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. Thus, we introduce a set of database primitives or basic operations for spatial data mining which are sufficient to express most of the spatial data mining algorithms from the literature. This approach has several advantages. Similar to the relational standard language SQL, the use of standard primitives will speed-up the development of new data mining algorithms and will also make them more portable. We introduced a database-oriented framework for spatial data mining which is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths were defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives by a commercial DBMS were presented.

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 443
24478 Data Stream Association Rule Mining with Cloud Computing

Authors: B. Suraj Aravind, M. H. M. Krishna Prasad

Abstract:

There exist emerging applications of data streams that require association rule mining, such as network traffic monitoring, web click streams analysis, sensor data, data from satellites etc. Data streams typically arrive continuously in high speed with huge amount and changing data distribution. This raises new issues that need to be considered when developing association rule mining techniques for stream data. This paper proposes to introduce an improved data stream association rule mining algorithm by eliminating the limitation of resources. For this, the concept of cloud computing is used. Inclusion of this may lead to additional unknown problems which needs further research.

Keywords: data stream, association rule mining, cloud computing, frequent itemsets

Procedia PDF Downloads 489
24477 A Comprehensive Survey and Improvement to Existing Privacy Preserving Data Mining Techniques

Authors: Tosin Ige

Abstract:

Ethics must be a condition of the world, like logic. (Ludwig Wittgenstein, 1889-1951). As important as data mining is, it possess a significant threat to ethics, privacy, and legality, since data mining makes it difficult for an individual or consumer (in the case of a company) to control the accessibility and usage of his data. This research focuses on Current issues and the latest research and development on Privacy preserving data mining methods as at year 2022. It also discusses some advances in those techniques while at the same time highlighting and providing a new technique as a solution to an existing technique of privacy preserving data mining methods. This paper also bridges the wide gap between Data mining and the Web Application Programing Interface (web API), where research is urgently needed for an added layer of security in data mining while at the same time introducing a seamless and more efficient way of data mining.

Keywords: data, privacy, data mining, association rule, privacy preserving, mining technique

Procedia PDF Downloads 153