Search results for: multivariate failure-time data
24638 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy
Authors: Abdullah Al Mamun, Talal Alkharobi
Abstract:
As data volumes grow, people have become more accustomed to storing large amounts of secret information in cloud storage. Companies regularly need to transfer massive business files from one end to another. Privacy is lost if such data is transmitted as is, repeatedly, without securing the communication mechanism through proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption, it can only encrypt a limited amount of data, which makes it unsuitable for large data sets. In this paper, we propose a probable approach, based on Pretty Good Privacy, for encrypting big data using both symmetric and asymmetric keys. Our goal is to encrypt huge collections of information and transmit them through a secure communication channel to protect business and personal privacy. To justify our method, an experimental dataset from three different platforms is provided. We aim to show that our approach works efficiently and reliably for massive volumes of varied data.
Keywords: big data, cloud computing, cryptography, hadoop, public key
Procedia PDF Downloads 320
24637 Implementation of Big Data Concepts Led by the Business Pressures
Authors: Snezana Savoska, Blagoj Ristevski, Violeta Manevska, Zlatko Savoski, Ilija Jolevski
Abstract:
Big data is widely accepted by pharmaceutical companies as a result of business demands created through legal pressure. Pharmaceutical companies face many legal demands as well as standards requirements and have to adapt their procedures to the legislation. To cope with these demands, they have to standardize their use of current information technology and adopt the latest software tools. This paper highlights some important aspects of experience with implementing big data projects in a Macedonian pharmaceutical company. These projects improved the company's business processes with the help of new software tools selected to comply with legal and business demands. The company uses IT as a strategic tool to obtain competitive advantage in the market and to reengineer its processes to meet the demands of the new Internet economy and of quality standards. The company is required to manage vast amounts of structured as well as unstructured data. For these reasons, it implements projects for emerging and appropriate software tools that deal with the big data concepts accepted in the company.
Keywords: big data, unstructured data, SAP ERP, documentum
Procedia PDF Downloads 271
24636 Saving Energy at a Wastewater Treatment Plant through Electrical and Production Data Analysis
Authors: Adriano Araujo Carvalho, Arturo Alatrista Corrales
Abstract:
This paper shows how analysis of electrical energy consumption and production data was used to find opportunities to save energy at the Taboada wastewater treatment plant in Callao, Peru. To access the data, independent data networks were used for the electrical and the process instruments. The data were analyzed under an ISO 50001 energy audit, which considered Energy Performance Indexes for each process and followed a step-by-step guide presented in this text. By applying this methodology and data mining techniques to information gathered through electronic multimeters (conveniently placed on substation switchboards connected to a cloud network), it was possible to characterize the performance of each process thoroughly and thus reveal previously hidden saving opportunities. The data analysis reduced both costs and energy use, allowing the plant to save significant resources and to be certified under ISO 50001.
Keywords: energy and production data analysis, energy management, ISO 50001, wastewater treatment plant energy analysis
Procedia PDF Downloads 193
24635 Data Clustering in Wireless Sensor Network Implemented on Self-Organization Feature Map (SOFM) Neural Network
Authors: Krishan Kumar, Mohit Mittal, Pramod Kumar
Abstract:
A wireless sensor network is one of the most promising communication networks for monitoring remote environmental areas. In this network, all the sensor nodes communicate with each other via radio signals. The sensor nodes are capable of sensing, data storage, and processing. Information collected by the sensor nodes is passed through neighboring nodes to a particular node. Data collection and processing are done by data aggregation techniques. For data aggregation, a clustering technique is implemented in the sensor network using a self-organizing feature map (SOFM) neural network. Some of the sensor nodes are selected as cluster head nodes. Information is aggregated at the cluster head nodes from the non-cluster head nodes and is then transferred to the base station (or sink nodes). The aim of this paper is to manage the huge amount of data with the help of the SOFM neural network. Clustered data, rather than all the information aggregated at the cluster head nodes, is selected for transfer to the base station. This reduces the battery consumption incurred by managing large amounts of data, and the network lifetime is extended considerably.
Keywords: artificial neural network, data clustering, self organization feature map, wireless sensor network
Procedia PDF Downloads 517
24634 Genetic and Non-Genetic Factors Affecting the Response to Clopidogrel Therapy
Authors: Snezana Mugosa, Zoran Todorovic, Zoran Bukumiric, Ivan Radosavljevic, Natasa Djordjevic
Abstract:
Introduction: Various studies have shown that the frequency of clopidogrel resistance ranges from 4% to 40%. The aim of this study was to provide an in-depth analysis of the genetic and non-genetic factors that influence clopidogrel resistance in cardiology patients. Methods: We conducted a prospective study of 200 patients hospitalized at the Cardiology Centre of the Clinical Centre of Montenegro. CYP2C19 genetic testing was conducted, and the PREDICT score was calculated, in 102 of the 200 patients treated with clopidogrel in order to determine the influence of genetic and non-genetic factors on the outcomes of interest. Adverse cardiovascular events and adverse reactions to clopidogrel were assessed during a 12-month follow-up period. Results: Multivariate logistic regression found the PREDICT score and CYP2C19 enzymatic activity to be statistically significant predictors of a lack of therapeutic efficacy of clopidogrel, without multicollinearity or interaction between the predictors (p = 0.002 and 0.009, respectively). Conclusions: Pharmacogenetic analyses, performed for the first time in the Montenegrin patient population, suggest that such analyses can predict patient response to a given therapy. A stepwise approach could be used in assessing clopidogrel resistance in cardiology patients, combining the PREDICT score, a platelet aggregation test, and genetic testing for CYP2C19 polymorphism.
Keywords: clopidogrel, pharmacogenetics, pharmacotherapy, PREDICT score
Procedia PDF Downloads 349
24633 Review and Comparison of Associative Classification Data Mining Approaches
Authors: Suzan Wedyan
Abstract:
Data mining is one of the main phases of Knowledge Discovery in Databases (KDD) and is responsible for finding hidden and useful knowledge in databases. Data mining covers many different tasks, including regression, pattern recognition, clustering, classification, and association rule discovery. In recent years, a promising data mining approach called associative classification (AC) has been proposed. AC integrates classification and association rule discovery to build classification models (classifiers). This paper surveys and critically compares several AC algorithms with reference to the different procedures used in each algorithm, such as rule learning, rule sorting, rule pruning, classifier building, and class allocation for test cases.
Keywords: associative classification, classification, data mining, learning, rule ranking, rule pruning, prediction
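To make the rule-learning step in associative classification more concrete, here is a minimal sketch of how the support and confidence of a single class-association rule (antecedent items implying a class label) might be computed. It does not reproduce any of the surveyed algorithms; the function name `rule_stats`, the record layout, and the toy data are assumptions introduced purely for illustration.

```python
def rule_stats(records, antecedent, label):
    """Return (support, confidence) of the rule `antecedent -> label`.

    records    : list of (item_set, class_label) pairs (hypothetical layout)
    antecedent : set of items that must all appear in a record
    label      : class label predicted by the rule
    """
    n = len(records)
    # Class labels of every record that contains all antecedent items.
    covered = [lbl for items, lbl in records if antecedent <= items]
    # Covered records whose class matches the rule's consequent.
    hits = sum(1 for lbl in covered if lbl == label)
    support = hits / n if n else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Toy training data, invented for this example only.
records = [
    ({"a", "b"}, "yes"),
    ({"a"}, "yes"),
    ({"a", "c"}, "no"),
    ({"b"}, "no"),
]

support, confidence = rule_stats(records, {"a"}, "yes")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

An AC algorithm would typically generate many such rules, keep those above minimum support and confidence thresholds, and then sort and prune them before building the classifier.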
Procedia PDF Downloads 537
24632 Hierarchical Checkpoint Protocol in Data Grids
Authors: Rahma Souli-Jbali, Minyar Sassi Hidri, Rahma Ben Ayed
Abstract:
Grids of computing nodes have emerged as a representative means of connecting distributed computers or resources scattered all over the world for the purposes of computing and distributed storage. Since fault tolerance becomes complex owing to the availability of resources in a decentralized grid environment, it can be used in connection with replication in data grids. The objective of our work is to present fault tolerance in data grids with a data replication-driven model based on clustering. The performance of the protocol is evaluated with the OMNeT++ simulator. The computational results show the efficiency of our protocol in terms of recovery time and the number of processes in rollbacks.
Keywords: data grids, fault tolerance, clustering, chandy-lamport
Procedia PDF Downloads 341
24631 An Observation of the Information Technology Research and Development Based on Article Data Mining: A Survey Study on Science Direct
Authors: Muhammet Dursun Kaya, Hasan Asil
Abstract:
One of the most important factors in research and development is deep insight into the evolution of scientific development. State-of-the-art tools and instruments can considerably assist researchers, and many organizations worldwide have become aware of the advantages of data mining for acquiring the required knowledge from unstructured data. This paper is an attempt to review the articles on information technology published in the past five years with the aid of data mining. A clustering approach was used to study these articles, and the results revealed that three topics, namely health, innovation, and information systems, have captured the special attention of researchers.
Keywords: information technology, data mining, scientific development, clustering
Procedia PDF Downloads 278
24630 Security in Resource Constraints: Network Energy Efficient Encryption
Authors: Mona Almansoori, Ahmed Mustafa, Ahmad Elshamy
Abstract:
Wireless nodes in a sensor network are designed to gather, process, and communicate critical information. The information flooding through such a network is critical for decision making and data processing, and preserving the integrity of that data, without compromising the processing and transmission capability of the network, is one of the most critical factors in wireless security. This paper presents a mechanism to securely transmit data over a chain of sensor nodes without compromising the throughput of the network, utilizing the battery resources available at the sensor node.
Keywords: hybrid protocol, data integrity, lightweight encryption, neighbor based key sharing, sensor node data processing, Z-MAC
Procedia PDF Downloads 145
24629 Data Mining Techniques for Anti-Money Laundering
Authors: M. Sai Veerendra
Abstract:
Today, money laundering (ML) poses a serious threat not only to financial institutions but also to the nation. This criminal activity is becoming more and more sophisticated and seems to have moved from the cliché of drug trafficking to financing terrorism, not forgetting personal gain. Most financial institutions internationally have been implementing anti-money laundering (AML) solutions to fight investment fraud. However, traditional investigative techniques consume numerous man-hours. Recently, data mining approaches have been developed and are considered well suited to detecting ML activities. Within the scope of a collaboration project on developing a new data mining solution for AML units in an international investment bank in Ireland, we survey recent data mining approaches for AML. In this paper, we not only present these approaches but also give an overview of the important factors in building data mining solutions for AML activities.
Keywords: data mining, clustering, money laundering, anti-money laundering solutions
Procedia PDF Downloads 537
24628 Phenotypic Diversity of the Tomato Germplasm from the Lazio Region in Central Italy, with a Case Study on Molecular Distinctiveness
Authors: Barbara Farinon, Maurizio E. Picarella, Lorenzo Mancini, Andrea Mazzucato
Abstract:
Italy is notoriously a secondary center of diversification for cultivated tomatoes (Solanum lycopersicum L.). The study of phenotypic and genetic diversity in landrace collections is important for germplasm conservation and biodiversity protection. Here, we set out to study the germplasm collected in the Lazio region of Central Italy, with a focus on the distinctiveness among landraces and the attribution of membership to unnamed accessions. Our regional collection included 30 accessions belonging to six different locally recognized landraces and 21 unnamed accessions. All accessions were gathered in Lazio and belonged to the collections held at the Regional Agency for the Development and Innovation of Agriculture in Lazio (ARSIAL, in application of Regional Act n. 15/2000, funded by the Lazio Rural Development Plan 2014–2020, Agro-environmental Measure, Action 10.2.1) and at the University of Tuscia. We included 13 control genotypes as references. The collection showed wide phenotypic variability for several traits, such as fruit weight (range 14-277 g), locule number (2-12), shape index (0.54-2.65), yield (0.24-3.08 kg/plant), and soluble solids (3.4-7.5 °Brix). A few landraces showed uncommon phenotypes, such as potato leaf, colorless fruit epidermis, or delayed ripening. Multivariate analysis of 25 cardinal phenotypic variables grouped the named varieties and allowed some of the unnamed accessions to be assigned to recognized groups. A case study for distinctiveness is presented for the flattened-ribbed types, which showed overlapping distributions according to the phenotypic data. Molecular markers retrieved from previous studies revealed differences compared to the phenotypic clustering, indicating that the named varieties “Scatolone di Bolsena” and “Pantano Romanesco” belong to the Marmande group, together with the reference landrace from Tuscany “Costoluto Fiorentino”. In contrast, the landrace “Spagnoletta di Formia e Gaeta” was clearly distinct from the former at the molecular level. Therefore, a genotypic analysis of the collection appears necessary to better define the molecular distinctiveness among the flattened-ribbed accessions, as well as to properly attribute group membership to the unnamed accessions.
Keywords: distinctiveness, flattened-ribbed fruits, regional landraces, tomato
Procedia PDF Downloads 138
24627 Development of New Technology Evaluation Model by Using Patent Information and Customers' Review Data
Authors: Kisik Song, Kyuwoong Kim, Sungjoo Lee
Abstract:
Many global firms and corporations derive new technologies and opportunities by identifying vacant technology through patent analysis. However, previous studies failed to focus on technologies that promised continuous growth in industrial fields. Most studies that derive new technology opportunities do not test practical effectiveness. Because previous studies depended on expert judgment, evaluating new technologies based on patent analysis became costly and time-consuming. Therefore, this research suggests a quantitative and systematic approach to technology evaluation indicators that uses patent data together with data from customer communities. The first step involves collecting the two types of data. The data are then used to construct evaluation indicators, and these indicators are applied to the evaluation of new technologies. This type of data mining allows a new method of technology evaluation and a better predictor of how new technologies are adopted.
Keywords: data mining, evaluating new technology, technology opportunity, patent analysis
Procedia PDF Downloads 377
24626 Anomaly Detection Based on System Log Data
Authors: M. Kamel, A. Hoayek, M. Batton-Hubert
Abstract:
With the increase in network virtualization and the disparity of vendors, the continuous monitoring and detection of anomalies cannot rely on static rules. An advanced analytical methodology is needed to discriminate between ordinary events and unusual anomalies. In this paper, we focus on log data (textual data), which is a crucial source of information for network performance. We then introduce an algorithm used as a pipeline to help with the pretreatment of such data, group it into patterns, and dynamically label each pattern as an anomaly or not. Such tools will provide users and experts with a continuous, real-time log monitoring capability to detect anomalies and failures in the underlying system that can affect performance. An application to real-world data illustrates the algorithm.
Keywords: logs, anomaly detection, ML, scoring, NLP
Procedia PDF Downloads 94
24625 Exploring Suicidal Behaviors among Transgender and Gender Nonconforming Youth in China
Authors: Krystal Wang, Chongzheng Wei, Runsen Chen, Shufang Sun
Abstract:
Suicide is a global public mental health issue and is the tenth leading cause of death globally. Approximately 75% of suicides occur in low- and middle-income countries (LMIC). Compared to the general population, transgender and gender nonconforming (TGNC) young people have higher suicidal risks. Research has shown that the prevalence of suicidal behaviors among TGNC populations was high in both the United States and China. However, studies were mostly embedded within Western cultures. Limited data and research were available to assess suicidal behaviors among TGNC youth in LMIC countries and to consider various types of TGNC youth. The goal of the current project is to 1) investigate the prevalence of lifetime and past-year suicidal ideations, plans, and attempts among Chinese TGNC youth, 2) explore the relationship between gender identity and suicidal outcomes among TGNC youth in China, 3) identify individual, school, and family level risk and protective factors for suicidal behaviors. The study used data from a cross-sectional survey conducted by Beijing LGBTQ Center in 2021. The survey was the largest TGNC population study in China to understand the health conditions of TGNC individuals. Of the 7612 individuals who completed the survey, a total of 5632 youth (aged 10 to 19) was included in the final analysis. 2259 (40.11%) participants were categorized as transfeminine youth, 1034 (18.36%) as transmasculine youth, 1169 (20.76%) as nonbinary youth AFAB, 568 (10.09%) as nonbinary youth AMAB, 344 (6.11%) as questioning youth AFAB and 258 (4.58%) as questioning youth AMAB. Suicidal behaviors were assessed by asking about lifetime suicidal ideation and attempts, past 12 months suicidal ideation, plan and attempts, and suicidal methods. 
To achieve the aims, we conducted statistical analysis in Stata/SE 17.0 to 1) describe the prevalence of suicidal outcomes and 2) assess the relationship between gender identity and suicidal outcomes by performing crosstabs and bivariate and multivariate logistic regressions, adjusting for covariates. The lifetime prevalence of suicidal ideation and attempts for the whole sample was 85.13% and 51.7%, respectively. Transfeminine youth had a significantly higher risk than transmasculine youth for lifetime suicidal ideation (odds ratio (OR) = 1.67, CI: 1.28, 2.18) and attempts (OR = 1.66, CI: 1.35, 2.03), adjusting for age and past-year binge drinking, known risk factors for suicidal behavior. Past-year prevalence of suicidal behaviors was also high among TGNC youth, with 75.69% in suicidal ideation, 88.77% in suicidal plans, and 57.96% in suicidal attempts. Among the six subgroups, transfeminine youth had the highest risk for past-year suicidal ideation and attempts compared to transmasculine youth. Non-binary youth, regardless of sex assigned at birth, also had a significantly higher risk for suicidal ideation. The prevalence of lifetime and past-year suicidal behaviors was alarming among TGNC youth in China. Among the different categories of TGNC youth, transfeminine youth reported the most elevated suicidal risk. The findings indicate a compelling need for researchers and practitioners to address the mental health risks for this specific group and to target interventions for TGNC youth in China.
Keywords: child and adolescent mental health, gender minority health, cross-cultural perspective, preventing suicide in youth
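For readers unfamiliar with how an odds ratio and its confidence interval are derived, the following sketch computes an unadjusted odds ratio with a Wald 95% interval from a 2x2 table. It is not the authors' analysis: the abstract reports estimates from logistic regression in Stata with covariate adjustment, whereas this sketch uses no covariates. The function name `odds_ratio_ci` and the cell counts are hypothetical and were not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: group 1 with the outcome      b: group 1 without the outcome
    c: group 2 with the outcome      d: group 2 without the outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four cells must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, invented for illustration only.
or_value, lower, upper = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_value:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at roughly the 5% level, which is how intervals such as those reported in the abstract are usually read.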
Procedia PDF Downloads 74
24624 The 10-year Risk of Major Osteoporotic and Hip Fractures Among Indonesian People Living with HIV
Authors: Iqbal Pramukti, Mamat Lukman, Hasniatisari Harun, Kusman Ibrahim
Abstract:
Introduction: People living with HIV have a higher risk of osteoporotic fracture than the general population. The purpose of this study was to predict the 10-year risk of fracture among people living with HIV (PLWH) using FRAX™ and to identify characteristics related to fracture risk. Methodology: This study included 75 subjects. The 10-year probability of major osteoporotic fractures (MOF) and hip fractures (HF) was assessed using the FRAX™ algorithm. Cross-tabulation was used to identify the participants' characteristics related to fracture risk. Results: The overall mean 10-year probability of fracture was 2.4% (1.7) for MOF and 0.4% (0.3) for hip fractures. For the MOF score, participants with a parental history of hip fracture, smoking behavior, or glucocorticoid use showed a higher score than those without (3.1 vs. 2.5; 4.6 vs. 2.5; and 3.4 vs. 2.5, respectively). For the HF score, participants with a parental history of hip fracture, smoking behavior, or glucocorticoid use also showed a higher score than those without (0.5 vs. 0.3; 0.8 vs. 0.3; and 0.5 vs. 0.3, respectively). Conclusions: The 10-year risk of fracture was higher among PLWH with several factors, including parental hip fracture history, smoking behavior, and glucocorticoid use. Further analysis of the determining factors using multivariate regression with a larger sample size is required to confirm the factors associated with high fracture risk.
Keywords: HIV, PLWH, osteoporotic fractures, hip fractures, 10-year risk of fracture, FRAX
Procedia PDF Downloads 49
24623 EnumTree: An Enumerative Biclustering Algorithm for DNA Microarray Data
Authors: Haifa Ben Saber, Mourad Elloumi
Abstract:
In a number of domains, such as DNA microarray data analysis, we need to cluster the rows (genes) and columns (conditions) of a data matrix simultaneously to identify groups of constant rows with a group of columns. This kind of clustering is called biclustering. Biclustering algorithms are extensively used in DNA microarray data analysis, and more effective biclustering algorithms are highly desirable. We introduce a new algorithm called Enumerative Tree (EnumTree) for biclustering binary microarray data. EnumTree is an algorithm that adopts the approach of enumerating biclusters. It extracts all consistent, good-quality biclusters. The main idea of EnumTree is the construction of a new tree structure to adequately represent the different biclusters discovered during the enumeration process. The algorithm adopts the strategy of discovering all biclusters at a time. The performance of the proposed algorithm is assessed using both synthetic and real DNA microarray data; our algorithm outperforms other biclustering algorithms for binary microarray data. Biclusters with different numbers of rows. Moreover, we test the biological significance using a gene annotation web tool to show that our proposed method is able to produce biologically relevant biclusters.
Keywords: DNA microarray, biclustering, gene expression data, tree, data mining
Procedia PDF Downloads 372
24622 The Impact of Financial Reporting on Sustainability
Authors: Lynn Ruggieri
Abstract:
The worldwide pandemic has only increased sustainability awareness. The public is demanding that businesses be held accountable for their impact on the environment. While financial data enjoys uniformity in reporting requirements, there are no uniform reporting requirements for non-financial data. Europe is leading the way with some standards being implemented for reporting non-financial sustainability data; however, there is no uniformity globally. And without uniformity, there is not a clear understanding of what information to include and how to disclose it. Sustainability reporting will provide important information to stakeholders and will enable businesses to understand their impact on the environment. Therefore, there is a crucial need for this data. This paper looks at the history of sustainability reporting in the countries of the European Union and throughout the world and makes a case for worldwide reporting requirements for sustainability.
Keywords: financial reporting, non-financial data, sustainability, global financial reporting
Procedia PDF Downloads 178
24621 Testing Two Actors Contextual Interaction Theory in a Multi Actors Context: Case of COVID-19 Disease Prevention and Control Policy
Authors: Muhammad Fayyaz Nazir, Ellen Wayenberg, Shahzadaah Faahed Qureshi
Abstract:
Introduction: The study is based on the constructs of Contextual Interaction Theory (CIT) to explore the role of policy actors in implementing the COVID-19 Disease Prevention and Control (DP&C) Policy. The study analyzes the role of healthcare workers' contextual factors, such as cognition, motives, and resources, and their interactions in implementing Social Distancing (SD). In this way, we test a two-actor policy implementation theory, i.e., the CIT, in a three-actor context. Methods: Data were collected through document analysis and semi-structured interviews. Following a qualitative study design, interviews covering cognition, motives, and resources were conducted with the healthcare workers involved in implementing SD in the local context of Multan, Pakistan. The possible interactions resulting from the contextual factors of the policy actors (healthcare workers) were identified through a framework analysis protocol guided by CIT and supported by trustworthiness criteria and data saturation. Results: This inquiry resulted in theory application, addition, and enrichment. The theoretical application in the three-actor context illustrates the different levels of motives, cognition, and resources of healthcare workers: senior administrators, managers, and healthcare professionals. The senior administrators working in the National Command and Operations Center (NCOC), Provincial Technical Committees (PTCs), and Districts Covid Teams (DCTs) played their role with high motivation. They were fully informed about the policy and moderately resourceful. The policy implementers, healthcare managers working on implementing SD within their respective hospitals, played their role with high motivation and were fully informed about the policy. However, they lacked the resources required to implement SD. The target medical and allied healthcare professionals were moderately motivated but lacked resources and information. The interaction resulted in cooperation and the need for learning to manage future healthcare crises. However, the lack of resources created opposition to the implementation of SD. Objectives of the Study: The study aimed to apply a two-actor theory in a multi-actor context. We take this as an opportunity to qualitatively test the theory in the novel situation of the COVID-19 pandemic and to make way for its quantitative application by designing a survey instrument so that implementation researchers can apply CIT through multivariate analyses or higher-order statistical modeling. Conclusion: Applying a two-actor implementation theory to explore a complex case of healthcare intervention in a three-actor context is, to the best of our knowledge, unique work that has not been done before. The work will therefore contribute to policy implementation studies by applying, extending, and enriching an implementation theory in the novel case of the COVID-19 pandemic, ultimately filling a gap in the implementation literature. Policy institutions and other low- or middle-income countries can learn from this research and improve SD implementation by working on the variables with weak significance levels.
Keywords: COVID-19, disease prevention and control policy, implementation, policy actors, social distancing
Procedia PDF Downloads 58
24620 Methods and Algorithms of Ensuring Data Privacy in AI-Based Healthcare Systems and Technologies
Authors: Omar Farshad Jeelani, Makaire Njie, Viktoriia M. Korzhuk
Abstract:
The application of AI-powered algorithms in healthcare continues to flourish. In particular, access to healthcare information, including patient health history, diagnostic data, and PII (Personally Identifiable Information), is paramount to delivering efficient patient outcomes. However, as the exchange of healthcare information between patients and healthcare providers through AI-powered solutions increases, protecting a person’s information and privacy has become even more important. Arguably, the increased adoption of healthcare AI has concentrated attention on the security risks to healthcare data and on measures to protect its security and privacy, leading to escalated analyses and enforcement. Since these challenges arise from the use of AI-based healthcare solutions to manage healthcare data, AI-based data protection measures are used to resolve the underlying problems. Consequently, this project proposes AI-powered safeguards and policies/laws to protect the privacy of healthcare data. The project presents the best-in-class techniques used to preserve the data privacy of AI-powered healthcare applications. Popular privacy-protecting methods such as federated learning, cryptographic techniques, differential privacy methods, and hybrid methods are discussed together with potential cyber threats, data security concerns, and prospects. The project also discusses some of the relevant data security acts/laws that govern the collection, storage, and processing of healthcare data to guarantee that owners’ privacy is preserved. This inquiry discusses various gaps and uncertainties associated with healthcare AI data collection procedures and identifies potential correction/mitigation measures.
Keywords: data privacy, artificial intelligence (AI), healthcare AI, data sharing, healthcare organizations (HCOs)
Procedia PDF Downloads 93
24619 Assessment of Marine Diversity on Rocky Shores of Triporti, Vlore, Albania
Authors: Ina Nasto, Denada Sota, Kerol Sacaj, Brunilda Veshaj, Hajdar Kicaj
Abstract:
Rocky shores are often used as models to describe the dynamics of biodiversity around the world, making them one of the most studied marine habitats and their communities. The variability in the number of species and the abundance of hard-bottom benthic animal communities on the coast of Triporti, north of the Bay of Vlora, Albania is described in relation to environmental variables using multivariate analysis. The purpose of this study is to monitor the species composition, quantitative characteristics, and seasonal variations of the benthic macroinvertebrate populations of the shallow rocky shores of the Triportit-Vlora area, as well as the assessment of the ecological condition of these populations. The rocky coast of Triport, with a length of 7 km, was divided into three sampling stations, with three transects each of 50m. The monitoring of benthic macroinvertebrates in these areas was carried out in two seasons, spring and summer (June and August 2021). In each station and sampling season, estimates of the total and average density for each species, the presence constant, and the assessment of biodiversity were calculated using the Shannon–Wiener and the Simpson index. The species composition, the quantitative characteristics of the populations, and the indicators mentioned above were analyzed in a comparative way, both between the seasons within one station and between the three stations with each other. Statistical processing of the data was carried out to analyze the changes between the seasons and between the sampling stations for the species composition, population density, as well as correlation between them. A total of 105 benthic macroinvertebrate taxa were found, dominated by Molluscs, Annelids, and Arthropods. The small density of species and the low degree of stability of the macrozoobenthic community are indicators of the poor ecological condition and environmental impact in the studied areas. 
Algal cover, the diversity of coastal microhabitats, and the degree of coastal exposure to waves play an important role in the characteristics of macrozoobenthos populations in the studied areas. The rocky shores are also of special interest because the infralittoral of these areas contains dense kelp forests with Gongolaria barbata and Ericaria crinita, as well as fragmented areas with Posidonia oceanica that reach the coast; these are priority habitats of special conservation importance in the Mediterranean.
Keywords: Macrozoobenthic communities, Shannon–Wiener, Triporti, Vlore, rocky shore
Procedia PDF Downloads 98
24618 Mapping Tunnelling Parameters for Global Optimization in Big Data via Dye Laser Simulation
Authors: Sahil Imtiyaz
Abstract:
One of the biggest challenges has emerged from the ever-expanding, dynamic, and instantaneously changing space of Big Data; to find a data point and inherit wisdom in this space is a hard task. In this paper, we reduce the space of big data in a Hamiltonian formalism that is in concordance with the Ising model. For this formulation, we simulate the system using a dye laser in FORTRAN and analyse the dynamics of the data point in the energy well of a rhodium atom. After mapping the photon intensity and pulse width to energy and potential, we conclude that as the energy increases, the probability of tunnelling also increases up to some point, then starts decreasing, and then shows randomizing behaviour. This is due to decoherence with the environment, and hence there is a loss of ‘quantumness’. This interprets the efficiency parameter and the extent of quantum evolution. The results are strongly encouraging in favour of the use of ‘Topological Property’ as a source of information instead of the qubit.
Keywords: big data, optimization, quantum evolution, hamiltonian, dye laser, fermionic computations
Procedia PDF Downloads 194
24617 Applying Different Stenography Techniques in Cloud Computing Technology to Improve Cloud Data Privacy and Security Issues
Authors: Muhammad Muhammad Suleiman
Abstract:
Cloud Computing is a versatile concept that refers to a service that allows users to outsource their data without having to worry about local storage issues. However, the most pressing issue to be addressed is maintaining a secure and reliable data repository rather than relying on untrustworthy service providers. In this study, we look at how steganography approaches, combined with Digital Watermarking, can greatly improve a system's effectiveness and data security when used for Cloud Computing. The main requirement of such frameworks, where data is transferred or exchanged between servers and users, is safe data management in cloud environments. Steganography in the cloud is among the most effective methods for safe communication. Steganography is a method of writing coded messages in such a way that only the sender and recipient can safely interpret and display the information hidden in the communication channel. This study presents a new text steganography method for hiding a loaded hidden English text file in a cover English text file to ensure data protection in cloud computing. Data protection, data hiding capability, and time were all improved using the proposed technique.
Keywords: cloud computing, steganography, information hiding, cloud storage, security
Procedia PDF Downloads 191
24616 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics
Authors: Farhad Asadi, Mohammad Javad Mollakazemi
Abstract:
In this paper, Bayesian online inference for models of data series is constructed with a change-point algorithm, which separates the observed time series into independent series and studies the changes and variation in the regime of the data together with the related statistical characteristics. Variation in the statistical characteristics of time series data often represents distinct phenomena in a dynamical system, such as a change in the state of brain dynamics reflected in EEG signal measurements, or a change in an important regime of the data in many dynamical systems. In this paper, a prediction algorithm for studying change-point locations in time series data is simulated. It is verified that the pattern of the proposed data distribution is an important factor in obtaining simpler and smoother fluctuation of the hazard rate parameter and in better identifying change-point locations. Finally, the conditions under which the time series distribution affects the factors in this approach are explained and validated with different time series databases for some dynamical systems.
Keywords: time series, fluctuation in statistical characteristics, optimal learning, change-point algorithm
Procedia PDF Downloads 426
24615 Determination of the Risks of Heart Attack at the First Stage as Well as Their Control and Resource Planning with the Method of Data Mining
Authors: İbrahi̇m Kara, Seher Arslankaya
Abstract:
Frequently preferred in the field of engineering in particular, data mining has now begun to be used in the field of health as well, since the data in the health sector have reached great dimensions. Data mining aims to derive models suited to the purpose from large amounts of raw data and to search for the rules and relationships in large data sets that enable predictions about the future. It helps the decision-maker find the relationships among the data that arise at the stage of decision-making. This study aims to determine the risk of heart attack at the first stage, to control it, and to plan the related resources using data mining. Through the early and correct diagnosis of heart attacks, it aims to reveal the factors that affect the disease, to protect health and choose the right treatment methods, to reduce health expenditures, and to shorten patients’ hospital stays. In this way, the diagnosis and treatment costs of a heart attack will be scrutinized, which will be useful for determining the risk of the disease at the first stage, controlling it, and planning the related resources.
Keywords: data mining, decision support systems, heart attack, health sector
Procedia PDF Downloads 356
24614 Bayesian Borrowing Methods for Count Data: Analysis of Incontinence Episodes in Patients with Overactive Bladder
Authors: Akalu Banbeta, Emmanuel Lesaffre, Reynaldo Martina, Joost Van Rosmalen
Abstract:
Including data from previous studies (historical data) in the analysis of the current study may reduce the sample size requirement and/or increase the power of the analysis. The most common example is incorporating historical control data in the analysis of a current clinical trial. However, this only applies when the historical control data are similar enough to the current control data. Recently, several Bayesian approaches for incorporating historical data have been proposed, such as the meta-analytic-predictive (MAP) prior and the modified power prior (MPP), both for a single historical control arm and for multiple historical control arms. Here, we examine the performance of the MAP and the MPP approaches for the analysis of (over-dispersed) count data. To this end, we propose a computational method for the MPP approach for the Poisson and the negative binomial models. We conducted an extensive simulation study to assess the performance of the Bayesian approaches. Additionally, we illustrate our approaches on an overactive bladder data set. For similar data across the control arms, the MPP approach outperformed the MAP approach with respect to the statistical power. When the means across the control arms are different, the MPP yielded a slightly inflated type I error (TIE) rate, whereas the MAP did not. In contrast, when the dispersion parameters are different, the MAP gave an inflated TIE rate, whereas the MPP did not. We conclude that the MPP approach is more promising than the MAP approach for incorporating historical count data.
Keywords: count data, meta-analytic prior, negative binomial, poisson
Procedia PDF Downloads 117
24613 Strategic Citizen Participation in Applied Planning Investigations: How Planners Use Etic and Emic Community Input Perspectives to Fill-in the Gaps in Their Analysis
Authors: John Gaber
Abstract:
Planners regularly use citizen input as empirical data to help them better understand community issues they know very little about. This type of community data is based on the lived experiences of local residents and is known as "emic" data. What is becoming more common practice for planners is their use of data from local experts and stakeholders (known as "etic" data or the outsider perspective) to help them fill in the gaps in their analysis of applied planning research projects. Utilizing international Health Impact Assessment (HIA) data, I look at who planners invite to their citizen input investigations. Research presented in this paper shows that planners access a wide range of emic and etic community perspectives in their search for the “community’s view.” The paper concludes with how planners can chart out a new empirical path in their execution of emic/etic citizen participation strategies in their applied planning research projects.
Keywords: citizen participation, emic data, etic data, Health Impact Assessment (HIA)
Procedia PDF Downloads 484
24612 Data Augmentation for Automatic Graphical User Interface Generation Based on Generative Adversarial Network
Authors: Xulu Yao, Moi Hoon Yap, Yanlong Zhang
Abstract:
As a branch of artificial neural networks, deep learning is widely used in the field of image recognition, but a lack of data leads to imperfect model learning. Analysing the data scale requirements of deep learning with a view to GUI generation shows that collecting a GUI dataset is a time-consuming and labor-intensive project, which makes it difficult to meet the needs of current deep learning networks. To solve this problem, this paper proposes a semi-supervised deep learning model that relies on the original small-scale datasets to produce a large number of reliable data sets. By combining a cyclic (recurrent) neural network with a generative adversarial network, the recurrent network learns the sequence relationships and characteristics of the data, which lets the generative adversarial network generate reasonable data and thereby expand the Rico dataset. Relying on the network structure, the characteristics of the collected data can be well analysed, and a large amount of reasonable data can be generated according to these characteristics. After data processing, a reliable dataset for model training can be formed, which alleviates the problem of dataset shortage in deep learning.
Keywords: GUI, deep learning, GAN, data augmentation
Procedia PDF Downloads 184
24611 Modelling Rainfall-Induced Shallow Landslides in the Northern New South Wales
Authors: S. Ravindran, Y.Liu, I. Gratchev, D.Jeng
Abstract:
Rainfall-induced shallow landslides are common in northern New South Wales (NSW), Australia. From 2009 to 2017, around 105 rainfall-induced landslides occurred along the road corridors and caused temporary road closures in northern NSW. The rainfall that causes shallow landslides follows different distributions, ranging from uniform and normal to decreasing and increasing rainfall intensity. According to historical data, the duration of rainfall varied from one day to 18 days. The objective of this research is to analyse the slope instability of some of the sites in northern NSW by varying the cumulative rainfall using SLOPE/W and SEEP/W, and to compare the results with field data on rainfall causing shallow landslides. Rainfall and topographical data from public authorities and soil data obtained from laboratory tests will be used for this modelling. According to the field data, shallow landslides are likely if the cumulative rainfall is between 100 mm and 400 mm.
Keywords: landslides, modelling, rainfall, suction
Procedia PDF Downloads 179
24610 Machine Learning-Enabled Classification of Climbing Using Small Data
Authors: Nicholas Milburn, Yu Liang, Dalei Wu
Abstract:
Athlete performance scoring within the climbing domain presents interesting challenges, as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable, as it can be used to mark progress while training and can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The available dataset was composed of dynamic force data recorded during climbing; however, it came with challenges such as data scarcity, class imbalance, and temporal heterogeneity. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross-validation strategies. The investigated solutions to the classification problem included the lightweight machine learning classifiers KNN and SVM as well as deep learning with a CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.
Keywords: classification, climbing, data imbalance, data scarcity, machine learning, time sequence
Procedia PDF Downloads 142
24609 Analysis of Expression Data Using Unsupervised Techniques
Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe
Abstract:
This study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to better identify subtypes of tumors. Identifying subtypes of cancer helps in improving the efficacy and reducing the toxicity of treatments by providing clues for finding targeted therapeutics. The process of gene expression data analysis is described in three steps: preprocessing, clustering, and cluster validation. Feature selection is important since genomic data are high dimensional, with a large number of features compared to samples. Hierarchical clustering and K-Means are often used in the analysis of gene expression data. Several cluster validation techniques are used to validate the clusters. Heatmaps are an effective external validation method that allows comparison of the identified classes with clinical variables and visual analysis of the classes.
Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation
Procedia PDF Downloads 149