Search results for: data discovery
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25294

24724 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves applying a set of data processing techniques and algorithms to create predictive models. Its principle is to find relationships between explanatory variables and predicted variables: past occurrences are exploited to predict and derive the unknown outcome. With the advent of big data, many studies have suggested using predictive analytics to process and analyze big data. Nevertheless, they have been curbed by the limits of classical predictive analysis methods when the amount of data is large. In fact, because of the volume, nature (semi-structured or unstructured), and variety of big data, it is impossible to analyze it efficiently via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm Classification And Regression Trees (CART) in order to adapt it for big data analysis. The major changes to this algorithm are presented, and a version of the extended algorithm is then defined to make it applicable to a huge quantity of data.
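
For orientation, the split search at the core of classical (serial) CART can be sketched as follows. This is a minimal illustration that assumes Gini impurity and a single numeric feature; the function names `gini` and `best_split` are hypothetical, and the sketch is not the extended algorithm proposed by the authors.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split `x <= threshold`.

    If no split improves on the unsplit node, the threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate two equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best
```

Every candidate threshold requires a pass over the node's data, which is one reason a serial implementation becomes costly on very large datasets.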

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 136
24723 Canopy Temperature Acquired from Daytime and Nighttime Aerial Data as an Indicator of Trees’ Health Status

Authors: Agata Zakrzewska, Dominik Kopeć, Adrian Ochtyra

Abstract:

The growing number of new cameras, sensors, and research methods allows for a broader application of thermal data in remote sensing vegetation studies. The aim of this research was to check whether thermal infrared data in the 3.6-4.9 μm spectral range, obtained during the day and the night, can be used to assess the health condition of selected species of deciduous trees in an urban environment. For this purpose, research was carried out in the city center of Warsaw (Poland) in 2020. During the airborne data acquisition, thermal data, laser scanning, and orthophoto map images were collected. Synchronously with the airborne data, ground reference data were obtained for 617 studied trees of five species (Acer platanoides, Acer pseudoplatanus, Aesculus hippocastanum, Tilia cordata, and Tilia × euchlora) in different health condition states. The results were as follows: (i) healthy trees are cooler than trees in poor condition and dying trees in both the daytime and nighttime data; (ii) the mean difference in canopy temperature between healthy and dying trees was 1.06 °C in the nighttime data and 3.28 °C in the daytime data; (iii) condition classes differ significantly in both daytime and nighttime thermal data, but only in the daytime data did all condition classes differ statistically significantly from each other. In conclusion, aerial thermal data can be considered an alternative to hyperspectral data as a method of assessing the health condition of trees in an urban environment, especially data obtained during the day, which differentiate condition classes better than data obtained at night. The method based on thermal infrared and laser scanning data fusion could be a quick and efficient solution for identifying trees in poor health that should be visually checked in the field.

Keywords: middle wave infrared, thermal imagery, tree discoloration, urban trees

Procedia PDF Downloads 110
24722 Biodegradation of Endoxifen in Wastewater: Isolation and Identification of Bacteria Degraders, Kinetics, and By-Products

Authors: Marina Arino Martin, John McEvoy, Eakalak Khan

Abstract:

Endoxifen is an active metabolite responsible for the effectiveness of tamoxifen, a chemotherapeutic drug widely used for endocrine-responsive breast cancer and long-term chemo-preventive treatment. Tamoxifen and endoxifen are not completely metabolized in the human body and are actively excreted. As a result, they are released to the water environment via wastewater treatment plants (WWTPs). The presence of tamoxifen in the environment has negative effects on aquatic life due to its antiestrogenic activity. Because endoxifen is 30-100 times more potent than tamoxifen itself and also presents antiestrogenic activity, its presence in the water environment could result in even more toxic effects on aquatic life. Data on actual concentrations of endoxifen in the environment are limited because the pharmaceutical activity of endoxifen was discovered only recently. However, endoxifen has been detected in hospital and municipal wastewater effluents. The detection of endoxifen in wastewater effluents calls the treatment efficiency of WWTPs into question. Studies reporting on endoxifen removal in WWTPs are also scarce. One study used chlorination to eliminate endoxifen in wastewater; however, inefficient degradation of endoxifen by chlorination and the production of hazardous disinfection by-products were observed. Therefore, there is a need to remove endoxifen from wastewater prior to chlorination in order to reduce the potential release of endoxifen into the environment and its possible effects. The aim of this research is to isolate and identify bacterial strain(s) capable of degrading endoxifen into less hazardous compound(s). For this purpose, bacterial strains from WWTPs were exposed to endoxifen as a sole carbon and nitrogen source for 40 days. Bacteria presenting positive growth were isolated and tested for endoxifen biodegradation. Endoxifen concentration and by-product formation were monitored. The Monod kinetic model was used to determine the endoxifen biodegradation rate. Preliminary results of the study suggest that bacteria isolated from WWTPs are able to grow in the presence of endoxifen as a sole carbon and nitrogen source. Ongoing work includes identification of these bacterial strains and the by-product(s) of endoxifen biodegradation.
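
As a reminder of what the Monod model expresses, the specific rate saturates with substrate concentration: rate = mu_max * S / (K_s + S). The sketch below is only an illustration of that formula; the function name and the parameter values in the usage note are assumptions, not values from the study.

```python
def monod_rate(substrate, mu_max, k_s):
    """Monod specific rate: mu_max * S / (K_s + S).

    substrate: substrate concentration S (for example, endoxifen)
    mu_max:    maximum specific rate
    k_s:       half-saturation constant, in the same units as S
    """
    if substrate < 0 or k_s <= 0:
        raise ValueError("substrate must be >= 0 and k_s must be > 0")
    return mu_max * substrate / (k_s + substrate)
```

With hypothetical values `mu_max = 0.5` and `k_s = 2.0`, the rate at `S = k_s` is exactly half of `mu_max`, which is the defining property of the half-saturation constant.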

Keywords: biodegradation, bacterial degraders, endoxifen, wastewater

Procedia PDF Downloads 209
24721 Directional Search for Dark Matter Using Nuclear Emulsion

Authors: Ali Murat Guler

Abstract:

A variety of experiments have been developed over the past decades, aiming at the detection of Weakly Interacting Massive Particles (WIMPs) via their scattering in an instrumented medium. The sensitivity of these experiments has improved at tremendous speed, thanks to the constant development of detectors and analysis methods. Detectors capable of reconstructing the direction of the nuclear recoil induced by WIMP scattering are opening a new frontier that could extend dark matter searches beyond the neutrino background. Measuring the WIMP direction will allow us to detect the galactic origin of dark matter and therefore to obtain a clear signal-background separation. The NEWSdm experiment, based on nuclear emulsions, is intended to measure the direction of WIMP-induced nuclear recoils with a solid-state detector, and thus with high sensitivity. We discuss the discovery potential of a directional experiment based on a solid target made of newly developed nuclear emulsions and novel read-out systems achieving nanometric resolution. We also report the results of a technical test conducted at Gran Sasso.

Keywords: dark matter, direct detection, nuclear emulsion, WIMPS

Procedia PDF Downloads 268
24720 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is the process of grouping objects and data into clusters so that data objects from the same cluster are similar to each other. Clustering algorithms are one of the areas of data mining, and they can be classified into partition, hierarchical, density-based, and grid-based methods. In this paper, we survey and review four major hierarchical clustering algorithms: CURE, ROCK, CHAMELEON, and BIRCH. The resulting state of the art of these algorithms will help in eliminating current problems, as well as in deriving more robust and scalable clustering algorithms.

Keywords: clustering, unsupervised learning, algorithms, hierarchical

Procedia PDF Downloads 881
24719 End to End Monitoring in Oracle Fusion Middleware for Data Verification

Authors: Syed Kashif Ali, Usman Javaid, Abdullah Chohan

Abstract:

In large enterprises, multiple departments use different sorts of information systems and databases according to their needs. These systems are independent and heterogeneous in nature, and sharing information/data between them is not an easy task. The use of middleware technologies has made data sharing between systems very easy. However, monitoring the exchange of data/information between target and source systems for verification purposes is often complex or impossible for the maintenance department due to security/access privileges on the target and source systems. In this paper, we present our experience of an end-to-end data monitoring approach at the middleware level, implemented in Oracle BPEL for data verification without the help of any monitoring tool.

Keywords: service level agreement, SOA, BPEL, oracle fusion middleware, web service monitoring

Procedia PDF Downloads 476
24718 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases where entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets that need analyzing become ever larger. One type of symbolic data is the histogram, which makes it possible to save huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, analyzing histograms is complicated. This paper proposes a method that allows two histogram-valued variables to be compared and therefore a dissimilarity between two histograms to be found. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows histograms with different numbers of bins and bin widths (so-called general histograms) to be compared. The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, with the concept of cluster compactness used to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 157
24717 Chinese Language Teaching as a Second Language: Immersion Teaching

Authors: Lee Bih Ni, Kiu Su Na

Abstract:

This paper discusses Chinese language teaching as a second language by focusing on immersion teaching. The researchers used a narrative literature review to describe the current state of both art and science in the focused areas of inquiry. Immersion teaching comes with a standard that teachers must reliably meet. Chinese language-immersion instruction consists of language and content lessons, including functional usage of the language, academic language, authentic language, and correct Chinese sociocultural language. The researchers used narrative literature reviews to build a scientific knowledge base. They collected all the important points of discussion and present them here with reference to the specific field on which this paper is based. The findings show that Chinese language immersion teaching is unlike a standard foreign language classroom; the immersion setting provides more opportunities to teach students colloquial language than academic language. Immersion techniques also introduce a language’s cultural and social contexts in a meaningful and memorable way. It is particularly important that immersion teachers connect classwork with real-life experiences. Immersion also includes more elements of discovery and inquiry-based learning than other kinds of instructional practices do. Students always and consistently interpret conclusions and context clues.

Keywords: a second language, Chinese language teaching, immersion teaching, instructional strategies

Procedia PDF Downloads 447
24716 WiFi Data Offloading: Bundling Method in a Canvas Business Model

Authors: Majid Mokhtarnia, Alireza Amini

Abstract:

Mobile operators face growing data traffic as a critical issue. As a result, a vital responsibility of the operators is to deal with this trend in order to create added value. This paper addresses a bundling method in a Canvas business model for a WiFi Data Offloading (WDO) strategy, by which some elements of the model may be affected. In the proposed method, a number of data packages are sold to subscribers, some of which include a complimentary volume of WiFi-offloaded data. The paper at hand analyzes this method in terms of attractiveness and profitability. The results demonstrate that the quality of the WDO implementation strongly affects the final result, and they help the decision maker make the best decision.

Keywords: bundling, canvas business model, telecommunication, WiFi data offloading

Procedia PDF Downloads 193
24715 Rheological Characteristics of Ice Slurries Based on Propylene- and Ethylene-Glycol at High Ice Fractions

Authors: Senda Trabelsi, Sébastien Poncet, Michel Poirier

Abstract:

Ice slurries are considered promising phase-changing secondary fluids for air-conditioning, packaging, or cooling industrial processes. An experimental study has been carried out here to measure the rheological characteristics of ice slurries. Ice slurries consist of a solid phase (flake ice crystals) and a liquid phase. The latter is composed of a mixture of liquid water and an additive, here either (1) Propylene-Glycol (PG) or (2) Ethylene-Glycol (EG), used to lower the freezing point of water. Concentrations of 5%, 14%, and 24% of both additives are investigated with ice mass fractions ranging from 5% to 85%. The rheological measurements are carried out using a Discovery HR-2 vane-concentric cylinder with four full-length blades. The experimental results show that the behavior of ice slurries is generally non-Newtonian, with shear-thinning or shear-thickening behavior depending on the experimental conditions. In order to determine the consistency and the flow index, the Herschel-Bulkley model is used to describe the behavior of ice slurries. The present results are finally validated against an experimental database found in the literature and the predictions of an Artificial Neural Network model.
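
For readers unfamiliar with the Herschel-Bulkley model, it relates shear stress to shear rate as tau = tau_0 + K * gamma_dot^n, where tau_0 is the yield stress, K the consistency, and n the flow index (n < 1 is shear-thinning, n > 1 is shear-thickening). The sketch below only evaluates that formula; the function names are hypothetical and no fitted values from the study are implied.

```python
def herschel_bulkley_stress(shear_rate, tau0, consistency, flow_index):
    """Shear stress from the Herschel-Bulkley model: tau0 + K * shear_rate**n."""
    if shear_rate < 0:
        raise ValueError("shear_rate must be non-negative")
    return tau0 + consistency * shear_rate ** flow_index


def apparent_viscosity(shear_rate, tau0, consistency, flow_index):
    """Apparent viscosity: shear stress divided by shear rate (shear_rate > 0)."""
    if shear_rate <= 0:
        raise ValueError("shear_rate must be positive")
    return herschel_bulkley_stress(shear_rate, tau0, consistency, flow_index) / shear_rate
```

With `tau0 = 0` and `flow_index = 1` the model reduces to a Newtonian fluid whose viscosity equals the consistency; with `flow_index < 1` the apparent viscosity falls as the shear rate rises.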

Keywords: ice slurry, propylene-glycol, ethylene-glycol, rheology

Procedia PDF Downloads 257
24714 Distributed Perceptually Important Point Identification for Time Series Data Mining

Authors: Tak-Chung Fu, Ying-Kit Hung, Fu-Lai Chung

Abstract:

In the field of time series data mining, the concept of the Perceptually Important Point (PIP) identification process was first introduced in 2001. This process was originally used for financial time series pattern matching and was later found suitable for time series dimensionality reduction and representation. Its strength lies in preserving the overall shape of the time series by identifying its salient points. With the rise of Big Data, time series data makes up a major proportion, especially data generated by sensors in the Internet of Things (IoT) environment. Given the nature of PIP identification and its successful applications, it is worth further exploring the opportunity to apply PIP to time series ‘Big Data’. However, the performance of PIP identification is always considered a limitation when dealing with ‘Big’ time series data. In this paper, two distributed versions of PIP identification based on the Specialized Binary (SB) Tree are proposed. The proposed approaches remove the bottleneck of running the PIP identification process on a standalone computer. The distributed versions achieve an improvement in speed.
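
The basic serial PIP identification process can be sketched as follows: start from the first and last points, then repeatedly add the point that lies farthest from the line joining its two neighboring PIPs. This is a minimal illustration that assumes the vertical-distance variant and evenly spaced samples; the names `vertical_distance` and `find_pips` are hypothetical, and the sketch is neither the SB-Tree formulation nor the distributed versions proposed in the paper.

```python
def vertical_distance(point, left, right):
    """Vertical distance from `point` to the straight line through `left` and `right`."""
    (x, y), (x1, y1), (x2, y2) = point, left, right
    y_on_line = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    return abs(y - y_on_line)


def find_pips(series, k):
    """Return the sorted indices of the first k perceptually important points."""
    n = len(series)
    if n < 2 or k < 2:
        raise ValueError("need at least two points and k >= 2")
    pips = [0, n - 1]  # the endpoints are always the first two PIPs
    while len(pips) < min(k, n):
        best_index, best_distance = None, -1.0
        # Scan every point that lies strictly between two adjacent PIPs.
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                d = vertical_distance(
                    (i, series[i]), (left, series[left]), (right, series[right])
                )
                if d > best_distance:
                    best_index, best_distance = i, d
        pips.append(best_index)
        pips.sort()
    return pips
```

Each new PIP requires a scan over all remaining points, so the cost of this naive version grows quickly with the series length, which is the kind of bottleneck the abstract refers to.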

Keywords: distributed computing, performance analysis, Perceptually Important Point identification, time series data mining

Procedia PDF Downloads 429
24713 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNNs) have demonstrated high performance in image analysis, but oftentimes only structured data is available for a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. Applying a single neural network to analyze multimodal data, e.g., both structured and unstructured information, can yield significant advantages in terms of time complexity and energy efficiency. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNNs to multimodal datasets, which often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations in which the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. The resulting image is then analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 119
24712 Investigation of Delivery of Triple Play Data in GE-PON Fiber to the Home Network

Authors: Ashima Anurag Sharma

Abstract:

Optical fiber-based networks can deliver performance that supports the increasing demand for high-speed connections. One of the new technologies that have emerged in recent years is the Passive Optical Network. This research paper aims to show the simultaneous delivery of triple play service (data, voice, and video). A comparison between various data rates is presented. It is demonstrated that as the data rate increases, the number of users that can be supported decreases due to the increase in bit error rate.

Keywords: BER, PON, TDMPON, GPON, CWDM, OLT, ONT

Procedia PDF Downloads 523
24711 Microarray Gene Expression Data Dimensionality Reduction Using PCA

Authors: Fuad M. Alkoot

Abstract:

Different experimental technologies, such as microarray sequencing, have been proposed to generate high-resolution genetic data in order to understand the complex dynamic interactions between complex diseases and the biological system components of genes and gene products. However, the generated samples have a very large dimension, reaching thousands, which hinders all attempts to design a classifier system that can identify diseases based on such data. Additionally, the high overlap in the class distributions makes the task more difficult. The data we experiment with was generated for the identification of autism. It includes 142 samples, which is small compared to the large dimension of the data. Classifier systems trained on this data yield very low classification rates that are almost equivalent to a guess. We aim to reduce the data dimension and improve the data for classification. Here, we experiment with applying a multistage PCA to the genetic data to reduce its dimensionality. Results show a significant improvement in the classification rates, which increases the possibility of building an automated system for autism detection.

Keywords: PCA, gene expression, dimensionality reduction, classification, autism

Procedia PDF Downloads 557
24710 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal will ascertain the major risk factors for diabetes and design a predictive model for risk assessment. The project aims to improve early detection and management of diabetes by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. Using the Diabetes Health Indicators Dataset from Kaggle as the research data, the phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
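
Four of the five metrics named above follow directly from the confusion-matrix counts of a binary classifier. The sketch below is a minimal illustration; the function name `binary_metrics` is hypothetical, labels are assumed to be 0/1 with 1 as the positive (diabetic) class, and AUC-ROC is omitted because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for 0/1 labels (1 = positive)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Returning 0.0 when a denominator is zero is a convention chosen for this sketch; other conventions (such as raising an error or returning NaN) are equally defensible.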

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 71
24709 A Methodology to Integrate Data in the Company Based on the Semantic Standard in the Context of Industry 4.0

Authors: Chang Qin, Daham Mustafa, Abderrahmane Khiat, Pierre Bienert, Paulo Zanini

Abstract:

Nowadays, companies face many challenges in the process of digital transformation, which can be a complex and costly undertaking. Digital transformation involves the collection and analysis of large amounts of data, which can create challenges around data management and governance. Furthermore, it is also challenging to integrate data from multiple systems and technologies. Despite these difficulties, companies are still pursuing digitalization because, by embracing advanced technologies, they can improve efficiency, quality, decision-making, and customer experience while also creating different business models and revenue streams. This paper focuses on the issue of data being stored in data silos with different schemas and structures. The conventional approaches to addressing this issue involve data warehousing, data integration tools, data standardization, and business intelligence tools. However, these approaches primarily focus on the grammar and structure of the data and neglect the importance of semantic modeling and semantic standardization, which are essential for achieving data interoperability. In this session, the challenge of data silos in Industry 4.0 is addressed by developing a semantic modeling approach compliant with Asset Administration Shell (AAS) models as an efficient standard for communication in Industry 4.0. The paper highlights how our approach can facilitate the data mapping process and semantic lifting according to existing industry standards such as ECLASS and other industrial dictionaries. It also incorporates the Asset Administration Shell technology to model and map the company’s data and utilizes a knowledge graph for data storage and exploration.

Keywords: data interoperability in industry 4.0, digital integration, industrial dictionary, semantic modeling

Procedia PDF Downloads 90
24708 Monte Carlo Simulation of Thyroid Phantom Imaging Using Geant4-GATE

Authors: Parimalah Velo, Ahmad Zakaria

Abstract:

Introduction: Monte Carlo simulations of preclinical imaging systems offer the opportunity to enable new research, ranging from hardware design to the discovery of new imaging applications. A simulation system that can accurately model an imaging modality provides a platform for imaging developments that might be inconvenient in physical experiment systems due to expense, unnecessary radiation exposure, and technological difficulties. The aim of the present study is to validate the Monte Carlo simulation of thyroid phantom imaging using Geant4-GATE for a Siemens e-cam single-head gamma camera. After the gamma camera simulation model is validated by comparing physical characteristics such as energy resolution, spatial resolution, sensitivity, and dead time, the GATE simulation of thyroid phantom imaging is carried out. Methods: A thyroid phantom is defined geometrically, comprising 2 lobes 80 mm in diameter, 1 hot spot, and 3 cold spots. This geometry accurately resembles the actual dimensions of the thyroid phantom. A planar image of 500k counts with a 128x128 matrix size was acquired using the simulation model and in the actual experimental setup. After image acquisition, quantitative image analysis was performed by investigating the total number of counts in the image, the contrast of the image, the radioactivity distribution in the image, and the dimension of the hot spot. The algorithm for each quantification is described in detail. The difference between estimated and actual values for both the simulation and the experimental setup is analyzed for the radioactivity distribution and the dimension of the hot spot. Results: The results show that the difference between the contrast level of the simulation image and that of the experimental image is within 2%. The difference in the total count between the simulation and the actual study is 0.4%. The results of activity estimation show that the relative difference between estimated and actual activity for the experiment and the simulation is 4.62% and 3.03%, respectively. The deviation in the estimated hot spot diameter is similar for the simulation and the experimental study, at 0.5 pixel. In conclusion, the comparisons show good agreement between the simulation and experimental data.

Keywords: gamma camera, Geant4 application of tomographic emission (GATE), Monte Carlo, thyroid imaging

Procedia PDF Downloads 266
24707 Big Data Analytics and Data Security in the Cloud via Fully Homomorphic Encryption

Authors: Waziri Victor Onomza, John K. Alhassan, Idris Ismaila, Noel Dogonyaro Moses

Abstract:

This paper describes the problem of building secure computational services for encrypted information in cloud computing without decrypting the encrypted data; it therefore addresses the need for a computational encryption model that could enhance the security of big data in terms of privacy, confidentiality, and availability for users. The cryptographic model applied to computation on the encrypted data is the Fully Homomorphic Encryption Scheme. We contribute theoretical presentations of high-level computational processes, based on number theory and algebra, that can easily be integrated and leveraged in cloud computing, with detailed theoretical mathematical concepts for fully homomorphic encryption models. This contribution supports the full implementation of a cryptographic security algorithm for big data analytics.

Keywords: big data analytics, security, privacy, bootstrapping, homomorphic, homomorphic encryption scheme

Procedia PDF Downloads 374
24706 Clonal Evaluation of Malignant Mesothelioma

Authors: Sabahattin Comertpay, Sandra Pastorino, Rosanna Mezzapelle, Mika Tanji, Oriana Strianese, Andrea Napolitano, Tracey Weigel, Joseph Friedberg, Paul Sugarbaker, Thomas Krausz, Ena Wang, Amy Powers, Giovanni Gaudino, Harvey I. Pass, Fatmagul Ozcelik, Barbara L. Parsons, Haining Yang, Michele Carbone

Abstract:

Tumors are thought to be monoclonal in origin. This paradigm arose decades ago, primarily from the study of hematopoietic malignancies and sarcomas. The clonal origin of malignant mesothelioma (MM), a deadly cancer resistant to current therapies, has not been investigated. Examination of the pleura from patients with MM often shows the presence of multiple pleural nodules, raising the question of whether they represent independent or metastatic growth processes. To investigate the clonality patterns of MM, we used the HUMARA (Human Androgen Receptor) assay to examine 14 sporadic and 2 familial MMs. Of the 16 specimens studied, 15 were informative, and 14/15 revealed two electrophoretically distinct methylated HUMARA alleles, indicating a polyclonal origin for these tumors. This discovery has important clinical implications because an accurate assessment of tumor clonality is key to the design of novel molecular strategies for the treatment of MM.

Keywords: malignant mesothelioma, clonal origin, HUMARA, sarcomas

Procedia PDF Downloads 456
24705 Protecting Privacy and Data Security in Online Business

Authors: Bilquis Ferdousi

Abstract:

With the exponential growth of the online business, the threat to consumers’ privacy and data security has become a serious challenge. This literature review-based study focuses on a better understanding of those threats and what legislative measures have been taken to address those challenges. Research shows that people are increasingly involved in online business using different digital devices and platforms, although this practice varies based on age groups. The threat to consumers’ privacy and data security is a serious hindrance in developing trust among consumers in online businesses. There are some legislative measures taken at the federal and state level to protect consumers’ privacy and data security. The study was based on an extensive review of current literature on protecting consumers’ privacy and data security and legislative measures that have been taken.

Keywords: privacy, data security, legislation, online business

Procedia PDF Downloads 100
24704 Flowing Online Vehicle GPS Data Clustering Using a New Parallel K-Means Algorithm

Authors: Orhun Vural, Oguz Bayat, Rustu Akay, Osman N. Ucan

Abstract:

This study presents a new parallel approach to clustering GPS data. The evaluation compares the execution time of various clustering algorithms on GPS data. The paper proposes a parallel, neighborhood-based K-means algorithm in order to make K-means faster. The proposed parallelization approach assumes that each GPS data point represents a vehicle and that vehicles close to each other communicate after the vehicles are clustered. This parallelization approach has been examined on continuously changing GPS data of different sizes and compared with the serial K-means algorithm and other serial clustering algorithms. The results demonstrate that the proposed parallel K-means algorithm works much faster than the other clustering algorithms.
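
For reference, the serial K-means baseline that such parallel variants are compared against can be sketched as follows (Lloyd's algorithm). This is a minimal illustration for 2-D points such as projected GPS coordinates; the function name `kmeans` is hypothetical, the first k points are used as initial centroids purely to keep the example deterministic, and none of this is the authors' parallel algorithm.

```python
def kmeans(points, k, max_iters=100):
    """Serial K-means (Lloyd's algorithm) on 2-D points.

    Returns (centroids, labels), where labels[i] is the cluster index of points[i].
    """
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]  # deterministic initialization
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Assignment step: attach every point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean_x = sum(p[0] for p in members) / len(members)
                mean_y = sum(p[1] for p in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels
```

The assignment step is independent for every point, which is what makes K-means a natural candidate for parallelization.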

Keywords: parallel k-means algorithm, parallel clustering, clustering algorithms, clustering on flowing data

Procedia PDF Downloads 217
24703 An Analysis of Privacy and Security for Internet of Things Applications

Authors: Dhananjay Singh, M. Abdullah-Al-Wadud

Abstract:

The Internet of Things is the concept of a large-scale ecosystem of wireless actuators. The actuators are defined as things in the IoT: those that contribute or produce data for the ecosystem. However, ubiquitous data collection, data security, privacy preservation, large-volume data processing, and intelligent analytics are some of the key challenges facing IoT technologies. To address the security requirements, challenges, and threats in the IoT, we discuss a message authentication mechanism for IoT applications. Finally, we discuss a data encryption mechanism for message authentication before messages are propagated into IoT networks.
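
The abstract does not specify which message authentication mechanism is used. As a generic illustration only, the sketch below uses HMAC-SHA256 from the Python standard library, assuming the device and the receiver share a secret key; the function names and the sample payload are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 authentication tag for `message`."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag using a constant-time comparison to avoid timing leaks."""
    return hmac.compare_digest(sign(key, message), tag)
```

A sender would transmit the message together with `sign(key, message)`; the receiver recomputes the tag and accepts the message only if `verify` returns `True`. HMAC provides integrity and authenticity but not confidentiality, so any encryption would be a separate step.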

Keywords: Internet of Things (IoT), message authentication, privacy, security

Procedia PDF Downloads 379
24702 Synthesis of Gold Nanoparticles Stabilized in Na-Montmorillonite for Nitrophenol Reduction

Authors: Fatima Ammari, Meriem Chenouf

Abstract:

The synthesis of gold nanoparticles has attracted much attention since the pioneering discovery of the high catalytic activity of supported gold nanoparticles in CO oxidation at low temperature. In this work, we used Na-montmorillonite to stabilize gold nanoparticles at loadings of 1, 2, and 5%. The gold nanoparticles were obtained by chemical reduction using NaBH4 as the reducing agent. The resulting gold nanoparticles stabilized in Na-montmorillonite (Au-mont) were used as catalysts for the reduction of 4-nitrophenol to 4-aminophenol with sodium borohydride at room temperature. The UV-Vis results directly confirm the formation of gold nanoparticles. The XRD and N2 adsorption results showed the formation of gold nanoparticles in the pores of montmorillonite, with an average size of 5 nm for the 2%Au-mont samples. The gold particle size increased with increasing gold loading. The reduction of 4-nitrophenol to 4-aminophenol with NaBH4 catalyzed by Au-Na-montmorillonite exhibits remarkably high activity; the reaction was completed within 9 min for 1Au-mont and within 3 min for 2Au-mont.

Keywords: chemical reduction, gold, montmorillonite, nano particles, 4-nitrophenol

Procedia PDF Downloads 325
24701 Cognitive Science Based Scheduling in Grid Environment

Authors: N. D. Iswarya, M. A. Maluk Mohamed, N. Vijaya

Abstract:

A grid is an infrastructure that allows large distributed data sets from multiple locations to be deployed toward a common goal. Scheduling data-intensive applications becomes challenging because the data sets are very large. Two solutions exist to tackle this issue. First, the computation that requires huge data sets can be transferred to the data site. Second, the required data sets can be transferred to the computation site. In the former scenario, the computation cannot be transferred because the servers are storage/data servers with little or no computational capability. Hence, the second scenario is considered for further exploration. During scheduling, transferring huge data sets from one site to another requires more network bandwidth. To mitigate this issue, this work focuses on incorporating cognitive science into scheduling. Cognitive science is the study of the human brain and its related activities. Current research mainly focuses on incorporating cognitive science into various computational modeling techniques. In this work, the problem-solving approach of the human brain is studied and incorporated into data-intensive scheduling in grid environments. A cognitive engine (CE) is designed and deployed at various grid sites. The intelligent agents in the CE help analyze requests and create the knowledge base. Depending on the link capacity, a decision is made whether to transfer the data sets or to partition them. The agents predict the next request in order to serve the requesting site with data sets in advance. This reduces the data availability time and data transfer time. The replica catalog and metadata catalog created by the agents assist in the decision-making process.
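The transfer-or-partition decision described above could be sketched as a simple rule on estimated transfer time. The abstract gives no thresholds or formulas, so the time budget, the units, and the function name `plan_transfer` are hypothetical and serve only to make the idea concrete.

```python
import math


def plan_transfer(size_gb: float, link_gbps: float, max_seconds: float):
    """Decide whether to send a data set whole or split it into partitions.

    Returns ("transfer", 1) when the whole data set fits in the time budget,
    otherwise ("partition", n), where n is the smallest number of equal
    parts such that each part fits in the budget on this link.
    """
    if link_gbps <= 0 or max_seconds <= 0:
        raise ValueError("link capacity and time budget must be positive")
    seconds = size_gb * 8 / link_gbps  # gigabytes to gigabits, then divide by rate
    if seconds <= max_seconds:
        return ("transfer", 1)
    return ("partition", math.ceil(seconds / max_seconds))


# Hypothetical requests on a 1 Gbit/s link with a 100-second budget.
small = plan_transfer(size_gb=10, link_gbps=1, max_seconds=100)
large = plan_transfer(size_gb=100, link_gbps=1, max_seconds=100)
```

In a fuller design, the link capacity would come from monitoring data and the budget from the requesting site's deadline; both are fixed constants here for simplicity.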

Keywords: data grid, grid workflow scheduling, cognitive artificial intelligence

Procedia PDF Downloads 391
24700 Heritage and Tourism in the Era of Big Data: Analysis of Chinese Cultural Tourism in Catalonia

Authors: Xinge Liao, Francesc Xavier Roige Ventura, Dolores Sanchez Aguilera

Abstract:

With the development of the Internet, the study of tourism behavior has rapidly expanded from the traditional physical market to the online market. Data on the Internet is characterized by dynamic changes, and new data appear all the time. In recent years, a large volume of data has been generated from forums, blogs, and other sources, which have expanded over time and space; together they constitute large-scale Internet data, known as Big Data. This data of technological origin, which derives from the use of devices and the activity of multiple users, is becoming a source of great importance for the study of geography and the behavior of tourists. The study focuses on cultural heritage tourist practices in the context of Big Data, exploring the characteristics and behavior of Chinese tourists in relation to the cultural heritage of Catalonia. Geographical information, target image, and perceptions in user-generated content will be studied through analysis of data from Weibo, the largest blog-based social network in China. By analyzing the behavior of heritage tourists in the Big Data environment, this study seeks to understand the practices (activities, motivations, perceptions) of cultural tourists, and then their needs and preferences, in order to better guide the sustainable development of tourism at heritage sites.

Keywords: Barcelona, Big Data, Catalonia, cultural heritage, Chinese tourism market, tourists’ behavior

Procedia PDF Downloads 136
24699 Towards A Framework for Using Open Data for Accountability: A Case Study of A Program to Reduce Corruption

Authors: Darusalam, Jorish Hulstijn, Marijn Janssen

Abstract:

The media have revealed a variety of corruption cases in regional and local governments all over the world. Many governments have pursued anti-corruption reforms and created systems of checks and balances. Citizens face three types of corruption: administrative corruption, collusion, and extortion. Accountability is one of the benchmarks for building transparent government. The public sector is required to report the results of the programs it has implemented so that citizens can judge whether the institution has been working economically, efficiently, and effectively. Open data offers solutions for implementing good governance in organizations that want to be more transparent. In addition, open data can create transparency and accountability to the community. The objective of this paper is to build a framework for using open data for accountability to combat corruption. The paper investigates the relationship between open data and accountability as part of anti-corruption initiatives, as well as the impact of open data implementation on public organizations.

Keywords: open data, accountability, anti-corruption, framework

Procedia PDF Downloads 330
24698 Preliminary Phytochemical Screening and Comparison of Different Extracts of Capparidaceae Family

Authors: Noshaba Dilbar, Maria Jabbar

Abstract:

Medicinal plants are considered to be the richest source for drug discovery. The medicinal properties of plants are mainly due to the bioactive compounds they contain. Phytochemical screening is a valuable process that detects bioactive compounds (secondary metabolites) in plants. The present study was carried out to determine the phytochemical profile and ethnobotanical importance of two Capparidaceae species (Capparis spinosa and Dipterygium glaucum). The plants were selected on the basis of traditional knowledge of their use in ayurvedic medicines. Different solvents (ethanol, methanol, chloroform, benzene, and petroleum ether) were used to make extracts of dry and fresh plants. Phytochemical screening was performed using various standard techniques. The results reveal the presence of a wide range of bioactive compounds, i.e., alkaloids, saponins, flavonoids, terpenoids, glycosides, phenols, and steroids. Methanol, petroleum ether, and chloroform extracts showed high extractability of bioactive compounds. The results suggest that these plants are a reliable source for the pharmaceutical industry and can be used to make various biologically friendly drugs.

Keywords: bioactive compounds, Capparidaceae, phytochemical screening, secondary metabolites

Procedia PDF Downloads 170
24697 Contribution of Artificial Intelligence in the Studies of Natural Compounds Against SARS-COV-2

Authors: Salah Belaidi

Abstract:

We carried out extensive, in-depth research to search for bioactive compounds from Algerian plants. A selection of 50 ligands from Algerian medicinal plants was studied. Several compounds used in herbal medicine were drawn using Marvin Sketch software. We determined the three-dimensional structures of the ligands with the MMFF94 force field in order to prepare them for molecular docking. The 3D protein structure of the SARS-CoV-2 main protease was taken from the Protein Data Bank. We used AutoDock Vina software to perform molecular docking. The hydrogen atoms were added during the molecular docking process, and all the twist bonds of the ligands were added using the (ligand) module in the AutoDock software. The COVID-19 main protease (Mpro) is a key enzyme that plays a vital role in viral transcription and in mediating replication, so it is a very attractive drug target for SARS-CoV-2. In this work, the biologically active compounds present in the selected medicinal plants were evaluated as inhibitors of the COVID-19 protease enzyme, with in-depth molecular docking calculations using AutoDock Vina. The top 7 ligands (Phloroglucinol, Afzelin, Myricetin-3-O-rutinoside, Tricin 7-neohesperidoside, Silybin, Silychristin, and Kaempferol) were selected from the 50 molecules studied on the basis of the best binding energy, which is relatively low compared to the reference molecule, with binding affinities of -9.3, -9.3, -9, -8.9, -8.5, -8.3, and -8.3 kcal mol-1, respectively. Then, we analyzed the ADME properties of the best 7 ligands using the SwissADME web server. Two ligands (Silybin and Silychristin) were found to be potential candidates for the discovery and design of novel inhibitors of the SARS-CoV-2 protease enzyme.
The stability of the two ligands in complex with the Mpro protease was validated by molecular dynamics simulation; both showed a stable trajectory in terms of RMSD and RMSF, with coherent interactions throughout the simulations. Finally, we conclude that the Silybin ligand forms a more stable complex with the Mpro protease than the Silychristin ligand.
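For readers unfamiliar with the two stability measures named above, the sketch below computes RMSD (deviation of one frame from a reference structure) and RMSF (per-atom fluctuation about the mean position) for a toy trajectory. It assumes the frames are already aligned to the reference, uses invented coordinates, and is not the analysis pipeline used in the study.

```python
from math import sqrt


def rmsd(frame, reference):
    """Root-mean-square deviation of one frame from a reference structure."""
    squared = [sum((a - b) ** 2 for a, b in zip(p, q))
               for p, q in zip(frame, reference)]
    return sqrt(sum(squared) / len(squared))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about each atom's mean position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        positions = [frame[i] for frame in trajectory]
        mean = [sum(c) / n_frames for c in zip(*positions)]
        msf = sum(sum((x - m) ** 2 for x, m in zip(pos, mean))
                  for pos in positions) / n_frames
        result.append(sqrt(msf))
    return result


# Toy trajectory (hypothetical coordinates): 2 frames of 2 atoms each.
traj = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
deviation = rmsd(traj[1], traj[0])  # frame 2 compared with frame 1
fluctuation = rmsf(traj)            # one value per atom
```

A flat RMSD curve over time and low RMSF values at the binding-site residues are the usual signs of a stable complex in this kind of analysis.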

Keywords: COVID-19, medicinal plants, molecular docking, ADME properties, molecular dynamics

Procedia PDF Downloads 27
24696 Role of Natural Products in Drug Discovery of Anti-Biotic and Anti-Cancer Agents

Authors: Sunil Kumar

Abstract:

For many years, small organic molecules derived naturally from microbes and plants have delivered a number of useful therapeutic agents. The search for naturally occurring lead compounds has continued in recent years as well, with the constituents of marine flora and fauna, along with those of telluric microorganisms and plants, being investigated for their anti-bacterial and anti-cancer activities. It has been observed that such promising lead molecules tend to promptly attract substantial attention from scientists such as synthetic organic chemists and biologists. Subsequently, the supply of a given precious natural product sample may be increased, and it may be possible to form a preliminary idea of structure-activity relationships to develop synthetic analogues. For instance, the anti-tumor drug topotecan is a synthetic chemical compound similar in chemical structure to camptothecin, which is found in extracts of Camptotheca acuminata. Similarly, researchers at AstraZeneca discovered the antibiotic pyrrolamide through a fragment-based lead generation approach from kibdelomycin, which is isolated from Staphylococcus aureus.

Keywords: anticancer, antibiotic, lead molecule, natural product, synthetic analogues

Procedia PDF Downloads 147
24695 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical data sources. Using social media data such as tweets for syndromic surveillance is becoming increasingly popular, aided by open platforms for collecting data and by the advantages of microblogging text and mobile geographic location features. In this paper, a syndromic surveillance framework with a machine learning kernel is presented that uses tweet data analytics. Influenza and three cities in the United Arab Emirates (Abu Dhabi, Al Ain, and Dubai) are used as the test disease and trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation. In our model, a Latent Dirichlet allocation (LDA) engine is adapted to perform supervised learning classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.
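To show how a recall figure can be derived from N-fold cross-validation confusion matrices, the sketch below sums the per-fold matrices and computes per-class and overall recall. The abstract does not say how its overall recall was aggregated, so the micro-averaged definition used here is an assumption, and the fold matrices are invented for illustration.

```python
def sum_folds(fold_matrices):
    """Add per-fold confusion matrices element-wise into one matrix."""
    size = len(fold_matrices[0])
    return [[sum(m[i][j] for m in fold_matrices) for j in range(size)]
            for i in range(size)]


def recall_per_class(confusion):
    """Recall for each class; confusion[i][j] counts true class i predicted as j."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(confusion)]


def overall_recall(confusion):
    """Micro-averaged recall: correctly classified samples over all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical 2-fold results for two classes (flu-related, not flu-related).
folds = [
    [[8, 2], [1, 9]],
    [[9, 1], [3, 7]],
]
combined = sum_folds(folds)
per_class = recall_per_class(combined)
overall = overall_recall(combined)
```

With these invented counts, the combined matrix is `[[17, 3], [4, 16]]`, so the flu-related class has a recall of 17/20 and the overall figure is 33/40.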

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 109