Search results for: survival data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25515

24735 Sensor Data Analysis for a Large Mining Major

Authors: Sudipto Shanker Dasgupta

Abstract:

One of the largest mining companies wanted to look at health analytics for its driverless trucks. These trucks were the key to its supply chain logistics. The automated trucks had multi-level sub-assemblies that sent out sensor information. The use case was to capture the sensor signals from the truck subcomponents and analyze the health of the trucks from a repair and replacement perspective. Open source software was used to stream the data into a clustered Hadoop setup in the Amazon Web Services cloud, and Apache Spark SQL was used to analyze the data. With a 10-node Amazon setup (32 cores, 64 GB RAM), real-time analytics was achieved on 300 million records. To check the scalability of the system, the cluster was increased to a 100-node setup. This talk will highlight how open source software was used to achieve the above use case and share insights on high data throughput in a cloud setup.

Keywords: streaming analytics, data science, big data, Hadoop, high throughput, sensor data

Procedia PDF Downloads 400
24734 Data-Centric Anomaly Detection with Diffusion Models

Authors: Sheldon Liu, Gordon Wang, Lei Liu, Xuefeng Liu

Abstract:

Anomaly detection, also referred to as one-class classification, plays a crucial role in identifying product images that deviate from the expected distribution. This study introduces Data-centric Anomaly Detection with Diffusion Models (DCADDM), presenting a systematic strategy for data collection and further diversifying the data with image generation via diffusion models. The algorithm addresses data collection challenges in real-world scenarios and points toward data augmentation with the integration of generative AI capabilities. The paper explores the generation of normal images using diffusion models. The experiments demonstrate that with 30% of the original normal image size, modeling in an unsupervised setting with state-of-the-art approaches can achieve equivalent performances. With the addition of generated images via diffusion models (10% equivalence of the original dataset size), the proposed algorithm achieves better or equivalent anomaly localization performance.

Keywords: diffusion models, anomaly detection, data-centric, generative AI

Procedia PDF Downloads 78
24733 Oncogenic Functions of Long Non-Coding RNA XIST in Human Nasopharyngeal Carcinoma by Targeting MiR-34a-5p

Authors: Cheng-Cao Sun, Shu-Jun Li, De-Jia Li

Abstract:

Long non-coding RNA (lncRNA) X inactivate-specific transcript (XIST) has been verified as an oncogenic gene in several human malignant tumors, and its dysregulation is closely associated with tumor initiation, development and progression. Nevertheless, whether the aberrant expression of XIST in human nasopharyngeal carcinoma (NPC) is correlated with malignancy, metastasis or prognosis has not been elaborated. Here, we discovered that XIST was up-regulated in NPC tissues and that higher expression of XIST contributed to a markedly poorer survival time. In addition, multivariate analysis demonstrated that XIST was an independent risk factor for prognosis. XIST over-expression enhanced, while XIST silencing hampered, cell growth in NPC. Additionally, mechanistic analysis revealed that XIST up-regulated the expression of the miR-34a-5p target gene E2F3 by acting as a competitive ‘sponge’ of miR-34a-5p. Taking all into account, we concluded that XIST functioned as an oncogene in NPC through up-regulating E2F3, in part through ‘sponging’ miR-34a-5p.

Keywords: X inactivate-specific transcript; hsa-miRNA-34a-5p, miR-34a-5p; E2F3, nasopharyngeal carcinoma, tumorigenesis

Procedia PDF Downloads 235
24732 Regulation on the Protection of Personal Data Versus Quality Data Assurance in the Healthcare System Case Report

Authors: Elizabeta Krstić Vukelja

Abstract:

Digitization of personal data is a consequence of the development of information and communication technologies that create a new work environment with many advantages and challenges, but also potential threats to privacy and personal data protection. Regulation (EU) 2016/679 of the European Parliament and of the Council is becoming a law and obligation that should address the issues of personal data protection and information security. The existence of the Regulation leads to the conclusion that national legislation in the field of the virtual environment, protection of the rights of EU citizens and processing of their personal data is insufficiently effective. In the health system, special emphasis is placed on the processing of special categories of personal data, such as health data. The healthcare industry is recognized as a particularly sensitive area in which a large amount of medical data is processed, the digitization of which enables quick access and quick identification of the health insured. The protection of the individual requires quality IT solutions that guarantee the technical protection of personal categories. However, the real problems are of a technical and human nature, along with the spatial limitations of the application of the Regulation. Some conclusions will be drawn by analyzing the implementation of the basic principles of the Regulation on the example of the Croatian health care system and comparing it with similar activities in other EU member states.

Keywords: regulation, healthcare system, personal data protection, quality data assurance

Procedia PDF Downloads 35
24731 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations by applying a single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth, which affects the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time, so that processors need not access the memory; we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different times. We use this memory to apply parallel vector operations to data streams at the first orbit level. Data processed in the first level move to the upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with the serial code limitations inherent in all parallel applications and interleave it with lower-level vector operations.

Keywords: Memory Organization, Parallel Processors, Serial Code, Vector Processing

Procedia PDF Downloads 261
24730 Reconstructability Analysis for Landslide Prediction

Authors: David Percy

Abstract:

Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that work exclusively with discrete data. While RA has been used in medical applications and social science extensively, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data are binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding type of schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The results of an RA session with a data set are a report on every combination of variables and their probability of landslide events occurring. In this way, every combination of informative state combinations can be examined.
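The binning step mentioned in the abstract can be illustrated with a minimal sketch. It assumes equal-width bins; the function name `bin_equal_width`, the bin count, and the porosity values are hypothetical and are not taken from the paper, which does not state how its continuous layers were binned.

```python
def bin_equal_width(values, n_bins, lo=None, hi=None):
    """Map each continuous value to a bin index in 0..n_bins-1 using equal-width bins."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    width = (hi - lo) / n_bins
    bins = []
    for v in values:
        idx = int((v - lo) / width)
        # Clamp so that v == hi (and any out-of-range value) lands in a valid bin.
        bins.append(min(max(idx, 0), n_bins - 1))
    return bins


# Hypothetical porosity fractions, discretized into three classes (0, 1, 2).
porosity = [0.05, 0.12, 0.31, 0.48, 0.50]
porosity_class = bin_equal_width(porosity, n_bins=3, lo=0.0, hi=0.5)
```

Other schemes, such as quantile bins, would work equally well here; the point is only that a continuous layer becomes a discrete variable before it enters the model.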

Keywords: reconstructability analysis, machine learning, landslides, raster analysis

Procedia PDF Downloads 57
24729 Development of a Robust Protein Classifier to Predict EMT Status of Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC) Tumors

Authors: Zhenlin Ju, Christopher P. Vellano, Rehan Akbani, Yiling Lu, Gordon B. Mills

Abstract:

The epithelial–mesenchymal transition (EMT) is a process by which epithelial cells acquire mesenchymal characteristics, such as profound disruption of cell-cell junctions, loss of apical-basolateral polarity, and extensive reorganization of the actin cytoskeleton to induce cell motility and invasion. A hallmark of EMT is its capacity to promote metastasis, which is due in part to activation of several transcription factors and subsequent downregulation of E-cadherin. Unfortunately, current approaches have yet to uncover robust protein marker sets that can classify tumors as possessing strong EMT signatures. In this study, we utilize reverse phase protein array (RPPA) data and consensus clustering methods to successfully classify a subset of cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) tumors into an EMT protein signaling group (EMT group). The overall survival (OS) of patients in the EMT group is significantly worse than those in the other Hormone and PI3K/AKT signaling groups. In addition to a shrinkage and selection method for linear regression (LASSO), we applied training/test set and Monte Carlo resampling approaches to identify a set of protein markers that predicts the EMT status of CESC tumors. We fit a logistic model to these protein markers and developed a classifier, which was fixed in the training set and validated in the testing set. The classifier robustly predicted the EMT status of the testing set with an area under the curve (AUC) of 0.975 by Receiver Operating Characteristic (ROC) analysis. This method not only identifies a core set of proteins underlying an EMT signature in cervical cancer patients, but also provides a tool to examine protein predictors that drive molecular subtypes in other diseases.
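For readers unfamiliar with the AUC figure quoted above, the following sketch shows one common way to compute it: as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The labels and scores are made-up illustrative values, not data from the study, and the function name `roc_auc` is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs: 1 = EMT group, 0 = other groups.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the sample size, which is fine for a test set of a few hundred tumors; a rank-based formulation would be preferable for large data.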

Keywords: consensus clustering, TCGA CESC, Silhouette, Monte Carlo LASSO

Procedia PDF Downloads 461
24728 An Investigation into Why Very Few Small Start-Ups Business Survive for Longer Than Three Years: An Explanatory Study in the Context of Saudi Arabia

Authors: Motaz Alsolaim

Abstract:

Nowadays, the challenges of running a start-up can be very complex and are perhaps more difficult than at any other time in the past. Changes in technology, manufacturing innovation, and product development, combined with intense competition and market regulations, are factors that have put pressure on classic ways of managing firms, thereby forcing change. As a result, the rate of closure, exit or discontinuation of start-ups and young businesses is very high. Despite the essential role of small firms in an economy, they still tend to face obstacles that exert a negative influence on their performance and rate of survival. In fact, it is not easy to determine with any certainty the reasons why small firms fail. For this reason, failure itself is not clearly defined, and its exact causes are hard to diagnose. In this current study, therefore, the barriers to survival will be covered more broadly, especially personal/entrepreneurial, enterprise and environmental factors with regard to various possible reasons for this failure, in order to determine the best solutions and make appropriate recommendations. Methodology: It could be argued that mixed methods might help to improve entrepreneurship research by addressing the challenges emphasized in previous studies and by achieving triangulation. Calls for the combined use of quantitative and qualitative research have also been made in the entrepreneurship field, since entrepreneurship is a multi-faceted area of research. Therefore, an explanatory sequential mixed method was used, consisting of an online questionnaire survey of entrepreneurs followed by semi-structured interviews. Over 750 surveys were collected, of which 296 were accepted as valid; after that, 13 interviews were conducted with senior government officials, businessmen, successful entrepreneurs, and non-successful entrepreneurs.
Findings: The first-phase (quantitative) findings show the obstacles to survival. Among the personal/entrepreneurial factors, past work experience and lack of skills and interest are positive factors, while the gender, age and education level of the owner are negative factors. Internal factors such as lack of marketing research and weak business planning are positive. Among the environmental factors, difficulty in finding labor (economic perspective) and social restrictions and traditions (socio-cultural perspective) were found to be negative factors. On the other hand, from the political perspective, cost of compliance and insufficient government plans were found to be positive factors for small business failure. From the infrastructure perspective, lack of skilled labor, a high level of bureaucracy and lack of information are positive factors. Conclusion: This paper serves to enrich the understanding of failure factors in the MENA region, and more precisely in SA, with a view to minimizing the probability of failure of small and micro entrepreneurial start-ups in SA in the light of the Saudi government’s Vision 2030 plan.

Keywords: small business barriers, start-up business, entrepreneurship, Saudi Arabia

Procedia PDF Downloads 174
24727 Data Analytics in Hospitality Industry

Authors: Tammy Wee, Detlev Remy, Arif Perdana

Abstract:

In recent years, data analytics has become the buzzword in the hospitality industry. The hospitality industry is another example of a data-rich industry that has yet to fully benefit from the insights of data analytics. Effective use of data analytics can change how hotels operate, market and position themselves competitively in the hospitality industry. However, at the moment, the data obtained by individual hotels remain under-utilized. This is a preliminary study on data analytics in the hospitality industry, using an in-depth face-to-face interview at one hotel as the start of a multi-level research project. The main case study of this research, hotel A, belongs to an international hotel chain that has been systematically gathering data on its own customers for the past five years. The data collection points run from the moment a guest books a room until the guest leaves the hotel premises, and include room reservation, spa booking, and catering. Although hotel A has been gathering data intelligence on its customers for some time, it has yet to utilize the data to its fullest potential, and it is aware of its limitations as well as the potential of data analytics. Currently, the utilization of data analytics in hotel A is limited to the area of customer service improvement, namely enhancing the personalization of service for each individual customer. Hotel A is able to utilize the data to improve and enhance its service, which in turn encourages repeat customers. According to hotel A, 50% of its guests returned to the hotel, and 70% extended their stay because of the personalized service. Apart from using data analytics to enhance customer service, hotel A also uses the data in marketing. Hotel A uses data analytics to predict or forecast changes in consumer behavior and demand by tracking its guests’ booking preferences, payment preferences and demand shifts between properties.
However, hotel A admitted that the data it has been collecting were not fully utilized due to two challenges. The first challenge of using data analytics in hotel A is that the data are not clean. At the moment, the data collected for one guest profile are meaningful for only one department in the hotel and meaningless for another. Cleaning up the data and setting correct standards for usage by different departments are some of the main concerns of hotel A. The second challenge of using data analytics in hotel A is the non-integrated internal system. At the moment, the internal systems used by hotel A do not integrate well with each other, limiting the ability to collect data systematically. Hotel A is considering another system to replace the current one for more comprehensive data collection. Hotel proprietors recognize the potential of data analytics, as reported in this research; however, implementing a system to collect data comes at a cost. This research has identified the current utilization of data analytics and the challenges faced when it comes to implementing data analytics.

Keywords: data analytics, hospitality industry, customer relationship management, hotel marketing

Procedia PDF Downloads 173
24726 Realization of a (GIS) for Drilling (DWS) through the Adrar Region

Authors: Djelloul Benatiallah, Ali Benatiallah, Abdelkader Harouz

Abstract:

Geographic Information Systems (GIS) include various methods and computer techniques to model, digitally capture, store, manage, view and analyze geographic data. Geographic information systems characteristically draw on many scientific and technical fields and many methods. In this article we present a complete and operational geographic information system that follows the theoretical principles of data management and is adapted to spatial data, especially data concerning the monitoring of drinking water supply (DWS) wells in the Adrar region. The system is expected, on the one hand, to offer standard features for consulting, updating and editing beneficiary and geographical data and, on the other hand, to provide specific functionality for data entered by contractors, parameterized calculations and statistics.

Keywords: GIS, DWS, drilling, Adrar

Procedia PDF Downloads 305
24725 Generic Data Warehousing for Consumer Electronics Retail Industry

Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel

Abstract:

The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision-making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with minimal configuration. The solution includes a dimensional data model, a template SQL script, high-level architectural descriptions, an ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK, resulting in improved productivity and enhanced sales growth.

Keywords: consumer electronics, data warehousing, dimensional data model, generic, retail industry

Procedia PDF Downloads 405
24724 Sequential Data Assimilation with High-Frequency (HF) Radar Surface Current

Authors: Lei Ren, Michael Hartnett, Stephen Nash

Abstract:

The abundant surface current measurements from an HF radar system in a coastal area are assimilated into a model to improve its forecasting ability. A simple sequential data assimilation scheme, Direct Insertion (DI), is applied to update the model forecast states. The influence of Direct Insertion data assimilation over time is analyzed at one reference point. Vector maps of surface current from the models are compared with HF radar measurements. The Root-Mean-Squared-Error (RMSE) between modeling results and HF radar measurements is calculated during the last four days with no data assimilation.
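As a minimal sketch of the two operations named in this abstract, the code below applies Direct Insertion in its simplest form (model values are overwritten wherever a radar observation exists) and computes the RMSE over the observed points. The function names, the use of `None` for missing observations, and the current values are assumptions for illustration; the paper's actual grid handling and model are not described in the abstract.

```python
import math


def direct_insertion(model_state, observations):
    """Replace model values with observations where available (None = no observation)."""
    return [m if o is None else o for m, o in zip(model_state, observations)]


def rmse(model, observed):
    """Root-mean-squared error over the points that have an observation."""
    pairs = [(m, o) for m, o in zip(model, observed) if o is not None]
    if not pairs:
        raise ValueError("no observed points to compare")
    return math.sqrt(sum((m - o) ** 2 for m, o in pairs) / len(pairs))


# Hypothetical east-west surface current (m/s) at four grid points.
model_u = [0.10, 0.25, 0.40, 0.30]
radar_u = [0.20, None, 0.35, 0.50]

updated_u = direct_insertion(model_u, radar_u)
error_before = rmse(model_u, radar_u)
error_after = rmse(updated_u, radar_u)
```

Immediately after insertion the error at observed points is zero by construction, which is why the abstract evaluates RMSE over a later period with no assimilation.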

Keywords: data assimilation, CODAR, HF radar, surface current, direct insertion

Procedia PDF Downloads 567
24723 Measured versus Default Interstate Traffic Data in New Mexico, USA

Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder

Abstract:

This study investigates how site-specific traffic data differ from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed on Interstate-40 (I-40) and Interstate-25 (I-25) to develop site-specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year of data, from November 2013 to October 2014, was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axles per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between the measured data and the AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 differ significantly from the default values.

Keywords: AASHTOWare, traffic, weigh-in-motion, axle load distribution

Procedia PDF Downloads 337
24722 Effect of Ginger, Red Pepper, and Their Mixture in Diet on Growth Performance and Body Composition of Oscar, Astronotus ocellatus

Authors: Sarah Jorjani, Afshin Ghelichi, Mazyar Kamali

Abstract:

The aim of this study was to estimate the effect of adding ginger, red pepper, and their mixture to the diet on growth performance, survival rate and body composition of Astronotus ocellatus (Oscar fish). The study was carried out for 8 weeks. For this purpose, 132 Oscar fish with an initial weight of 2.44±0.26 g were divided into 4 treatments with three replicates in a completely randomized design and fed a 100% Biomar diet (T1), Biomar + red pepper (55 mg/kg) (T2), Biomar + ginger (1%) (T3), or Biomar + a mixture of red pepper and ginger (T4). The fish were fed at 5% of their body weight. The results showed that T2 differed significantly from the other treatments in most growth parameters, such as PBWI, SGR, PER and SR (P < 0.05), but there were no significant differences between treatments in FCR and FE (P > 0.05).
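The abstract does not define its growth indices. The sketch below uses definitions that are common in aquaculture studies and are assumed here rather than taken from the paper: percent body weight increase (PBWI), specific growth rate (SGR) and feed conversion ratio (FCR). The initial weight (2.44 g) and duration (8 weeks = 56 days) come from the abstract; the final weight and feed amount are hypothetical.

```python
import math


def pbwi(w_initial, w_final):
    """Percent body weight increase: 100 * (Wf - Wi) / Wi."""
    return 100.0 * (w_final - w_initial) / w_initial


def sgr(w_initial, w_final, days):
    """Specific growth rate in % per day: 100 * (ln Wf - ln Wi) / days."""
    return 100.0 * (math.log(w_final) - math.log(w_initial)) / days


def fcr(feed_given, w_initial, w_final):
    """Feed conversion ratio: feed given (g) per gram of weight gained."""
    gain = w_final - w_initial
    if gain <= 0:
        raise ValueError("FCR is undefined without positive weight gain")
    return feed_given / gain


w_i = 2.44          # initial weight in g (from the abstract)
w_f = 4.88          # hypothetical final weight in g
days = 8 * 7        # 8-week trial (from the abstract)
feed = 3.66         # hypothetical total feed per fish in g

growth = {
    "PBWI": pbwi(w_i, w_f),
    "SGR": sgr(w_i, w_f, days),
    "FCR": fcr(feed, w_i, w_f),
}
```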

Keywords: red pepper, ginger, oscar fish, growth performance, body composition

Procedia PDF Downloads 417
24721 Design of Knowledge Management System with Geographic Information System

Authors: Angga Hidayah Ramadhan, Luciana Andrawina, M. Azani Hasibuan

Abstract:

Data can become the core of a decision if it is treated well, that is, if data is processed into information and information into knowledge in order to arrive at wisdom or a decision. Today, many organizations have not realized this, including the XYZ University Admission Directorate, the executor of the national admission process called Seleksi Masuk Bersama (SMB), where until now the workers have relied only on their intuition to make decisions. If the data were processed in this way, the organization could analyze it to make the right decisions and obtain as many PIN sales as possible from the student candidates or registrants who take part in SMB. Therefore, a Knowledge Management System (KMS) with a Geographic Information System (GIS) using 5C4C is needed to make the organization's data more useful and to support decision-making. This information system processes data into information based on the PIN sales data with 5C (Contextualized, Categorize, Calculation, Correction, Condensed) and converts information into knowledge with 4C (Comparing, Consequence, Connection, Conversation). After these steps, the data can be used to make decisions more easily, resolve problems, communicate, and help inexperienced employees learn more quickly. The data are also easier to view and visualize spatially, because the system is equipped with GIS functionality that can indicate events in each province using the indicators provided in the system. The system also has a function to store tacit knowledge and then convert it into explicit knowledge in an expert system, based on the problems found from the consequences of the information. With the system, each team can make decisions in the same structured way and, most importantly, based on actual events and data.

Keywords: 5C4C, data, information, knowledge

Procedia PDF Downloads 459
24720 Rhizobium leguminosarum: Selecting Strain and Exploring Delivery Systems for White Clover

Authors: Laura Villamizar, David Wright, Claudia Baena, Marie Foxwell, Maureen O'Callaghan

Abstract:

Leguminous crops can be self-sufficient for their nitrogen requirements when their roots are nodulated with an effective Rhizobium strain, and for this reason seed or soil inoculation is practiced worldwide to ensure nodulation and nitrogen fixation in grain and forage legumes. The most widely used method of applying commercially available inoculants is using peat cultures which are coated onto seeds prior to sowing. In general, rhizobia survive well in peat, but some species die rapidly after inoculation onto seeds. The development of improved formulation methodology is essential to achieve extended persistence of rhizobia on seeds and improved efficacy. Formulations can be solid or liquid. The most popular solid formulations or delivery systems are wettable powders (WP), water dispersible granules (WG), and granules (DG). Liquid formulations are generally suspension concentrates (SC) or emulsifiable concentrates (EC). In New Zealand, R. leguminosarum bv. trifolii strain TA1 has been used as a commercial inoculant for white clover over wide areas for many years. Seed inoculation is carried out by mixing the seeds with inoculated peat, some adherents and lime, but rhizobial populations on stored seeds decline over several weeks due to a number of factors including desiccation and antibacterial compounds produced by the seeds. In order to develop a more stable and suitable delivery system to incorporate rhizobia in pastures, two strains of R. leguminosarum (TA1 and CC275e) and several formulations and processes were explored (peat granules, self-sticky peat for seed coating, emulsions and a powder containing spray-dried microcapsules). Emulsions prepared with fresh broth of strain TA1 were very unstable under storage and after seed inoculation. Formulations where inoculated peat was used as the active ingredient were significantly more stable than those prepared with fresh broth.
The strain CC275e was more tolerant to stress conditions generated during formulation and seed storage. Peat granules and peat-inoculated seeds using strain CC275e maintained an acceptable loading of 10^8 CFU/g of granules or 10^5 CFU/g of seeds, respectively, during six months of storage at room temperature. Strain CC275e inoculated on peat was also microencapsulated with a natural biopolymer by spray drying and, after optimizing operational conditions, microparticles containing 10^7 CFU/g and a mean particle size between 10 and 30 micrometers were obtained. Survival of rhizobia during storage of the microcapsules is being assessed. The development of a stable product depends on selecting an active ingredient (microorganism) robust enough to tolerate some adverse conditions generated during formulation, storage, and commercialization and after its use in the field. However, the design and development of an adequate formulation, using compatible ingredients, optimization of the formulation process and selecting the appropriate delivery system, is possibly the best tool to overcome the poor survival of rhizobia and provide farmers with better quality inoculants to use.

Keywords: formulation, Rhizobium leguminosarum, storage stability, white clover

Procedia PDF Downloads 146
24719 A Policy Strategy for Building Energy Data Management in India

Authors: Shravani Itkelwar, Deepak Tewari, Bhaskar Natarajan

Abstract:

The energy consumption data plays a vital role in energy efficiency policy design, implementation, and impact assessment. Any demand-side energy management intervention's success relies on the availability of accurate, comprehensive, granular, and up-to-date data on energy consumption. The Building sector, including residential and commercial, is one of the largest consumers of energy in India after the Industrial sector. With economic growth and increasing urbanization, the building sector is projected to grow at an unprecedented rate, resulting in a 5.6 times escalation in energy consumption till 2047 compared to 2017. Therefore, energy efficiency interventions will play a vital role in decoupling the floor area growth and associated energy demand, thereby increasing the need for robust data. In India, multiple institutions are involved in the collection and dissemination of data. This paper focuses on energy consumption data management in the building sector in India for both residential and commercial segments. It evaluates the robustness of data available through administrative and survey routes to estimate the key performance indicators and identify critical data gaps for making informed decisions. The paper explores several issues in the data, such as lack of comprehensiveness, non-availability of disaggregated data, the discrepancy in different data sources, inconsistent building categorization, and others. The identified data gaps are justified with appropriate examples. Moreover, the paper prioritizes required data in order of relevance to policymaking and groups it into "available," "easy to get," and "hard to get" categories. The paper concludes with recommendations to address the data gaps by leveraging digital initiatives, strengthening institutional capacity, institutionalizing exclusive building energy surveys, and standardization of building categorization, among others, to strengthen the management of building sector energy consumption data.

Keywords: energy data, energy policy, energy efficiency, buildings

Procedia PDF Downloads 181
24718 Genomic Resilience and Ecological Vulnerability in Coffea Arabica: Insights from Whole Genome Resequencing at Its Center of Origin

Authors: Zewdneh Zana Zate

Abstract:

The study focuses on the evolutionary and ecological genomics of both wild and cultivated Coffea arabica L. at its center of origin, Ethiopia, aiming to uncover how this vital species may withstand future climate changes. Utilizing bioclimatic models, we project the future distribution of Arabica under varied climate scenarios for 2050 and 2080, identifying potential conservation zones and immediate risk areas. Through whole-genome resequencing of accessions from Ethiopian gene banks, this research assesses genetic diversity and divergence between wild and cultivated populations. It explores relationships, demographic histories, and potential hybridization events among Coffea arabica accessions to better understand the species' origins and its connection to parental species. This genomic analysis also seeks to detect signs of natural or artificial selection across populations. Integrating these genomic discoveries with ecological data, the study evaluates the current and future ecological and genomic vulnerabilities of wild Coffea arabica, emphasizing necessary adaptations for survival. We have identified key genomic regions linked to environmental stress tolerance, which could be crucial for breeding more resilient Arabica varieties. Additionally, our ecological modeling predicted a contraction of suitable habitats, urging immediate conservation actions in identified key areas. This research not only elucidates the evolutionary history and adaptive strategies of Arabica but also informs conservation priorities and breeding strategies to enhance resilience to climate change. By synthesizing genomic and ecological insights, we provide a robust framework for developing effective management strategies aimed at sustaining Coffea arabica, a species of profound global importance, in its native habitat under evolving climatic conditions.

Keywords: coffea arabica, climate change adaptation, conservation strategies, genomic resilience

Procedia PDF Downloads 37
24717 A Survey on Data-Centric and Data-Aware Techniques for Large Scale Infrastructures

Authors: Silvina Caíno-Lores, Jesús Carretero

Abstract:

Large scale computing infrastructures have been widely developed with the core objective of providing a suitable platform for high-performance and high-throughput computing. These systems are designed to support resource-intensive and complex applications, which can be found in many scientific and industrial areas. Currently, large scale data-intensive applications are hindered by the high latencies that result from access to vastly distributed data. Recent works have suggested that improving data locality is key to moving towards exascale infrastructures efficiently, as solutions to this problem aim to reduce the bandwidth consumed in data transfers and the overheads that arise from them. There are several techniques that attempt to move computations closer to the data. In this survey we analyse the different mechanisms that have been proposed to provide data locality for large scale high-performance and high-throughput systems. This survey intends to assist the scientific computing community in understanding the various technical aspects and strategies that have been reported in recent literature regarding data locality. As a result, we present an overview of locality-oriented techniques, which are grouped into four main categories: application development, task scheduling, in-memory computing and storage platforms. Finally, the authors include a discussion of future research lines and synergies among these techniques.

Keywords: data locality, data-centric computing, large scale infrastructures, cloud computing

Procedia PDF Downloads 254
24716 Wind Speed Data Analysis in Colombia in 2013 and 2015

Authors: Harold P. Villota, Alejandro Osorio B.

Abstract:

Energy meteorology is a field that studies energy complementarity and the use of renewable sources in interconnected systems. To diversify the energy matrix in Colombia with wind sources, it is necessary to know the available databases on wind. However, the time series provided by 260 automatic weather stations contain empty and not-applicable values, so the purpose is to fill the time series by selecting two years to characterize and impute, and to use them as a base for completing the data between 2005 and 2020.
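
The abstract does not say which imputation method was used. As an illustration only, the sketch below fills gaps in a short wind speed series by linear interpolation between the nearest known readings; the function name `fill_gaps` and the sample values are assumptions, not taken from the study.

```python
def fill_gaps(series):
    """Fill None entries by linear interpolation between known neighbours.

    Gaps at the start or end copy the nearest known value.
    This is an assumed method, not the one used in the study.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        return filled  # nothing to interpolate from
    # Leading and trailing gaps: repeat the nearest known reading.
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    # Interior gaps: straight line between the two surrounding readings.
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical hourly wind speeds in m/s; None marks a missing reading.
wind = [None, 4.0, None, None, 7.0, None]
result = fill_gaps(wind)
```

Linear interpolation is only reasonable for short gaps; longer gaps in real station data would normally need a method that accounts for daily and seasonal cycles.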

Keywords: complementarity, wind speed, renewable, colombia, characterization, imputation

Procedia PDF Downloads 161
24715 Industrial Process Mining Based on Data Pattern Modeling and Nonlinear Analysis

Authors: Hyun-Woo Cho

Abstract:

Unexpected events may occur with serious impacts on industrial processes. This work utilizes a data representation technique to model and analyze process data patterns for the purpose of diagnosis. The use of a triangular representation of process data is evaluated using a simulated process. Furthermore, the effect of using different pre-treatment techniques based on linear or nonlinear reduced spaces is compared. This work extracts the fault pattern in the reduced space, not in the original data space. The results show that the diagnosis method based on the nonlinear technique produces more reliable results and outperforms the linear method.

Keywords: process monitoring, data analysis, pattern modeling, fault, nonlinear techniques

Procedia PDF Downloads 383
24714 Creating Inclusive Educational Environments for Women Faculty of Color Harnessing Ubuntu Perspectives

Authors: Gonzaga Mukasa, Faith Maina, Amani Zaier

Abstract:

This study investigated whether harnessing Ubuntu perspectives can aid in healing the wounds that Hierarchical Microaggressive intersectionalities inflict on African immigrant women faculty in predominantly white institutions. The study interviewed 8 African immigrant faculty from different higher education institutions in the United States, selected using the snowball sampling technique. The Ubuntu Theory anchored the study. Findings indicated that women faculty of color experience Hierarchical Microaggressive intersectionalities, leading them to lose job satisfaction and feel deprofessionalized and isolated. The recommendations were that institutions make their recruitment more inclusive of women of color to avoid isolation, and that they embrace Ubuntu perspectives such as survival, solidarity, compassion, dignity, and mutual respect to architect educational environments that foster diversity and inclusion.

Keywords: ubuntu, women faculty, African immigrants, hierarchical microaggressive intersectionalities

Procedia PDF Downloads 63
24713 Recommender System Based on Mining Graph Databases for Data-Intensive Applications

Authors: Mostafa Gamal, Hoda K. Mohamed, Islam El-Maddah, Ali Hamdi

Abstract:

In recent years, many digital documents on the web have been created due to the rapid growth of ‘social applications’ communities or ‘data-intensive applications’. The evolution of online multimedia data poses new challenges in storing and querying large amounts of data for online recommender systems. Graph data models have been shown to be more efficient than relational data models for processing complex data. This paper will explain the key differences between graph and relational databases, their strengths and weaknesses, and why graph databases are the best technology for building a real-time recommendation system. The paper will also discuss several similarity metric algorithms that can be used to compute a similarity score for pairs of nodes based on their neighbourhoods or their properties. Finally, the paper will explore how NLP strategies provide a basis for improving the accuracy and coverage of real-time recommendations by extracting information from stored unstructured knowledge, which makes up the bulk of the world’s data, and enriching the graph database with it. As the size and number of data items are increasing rapidly, the proposed system should meet current and future needs.
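
The abstract does not list the similarity metrics it covers. One widely used neighbourhood-based metric is the Jaccard coefficient, sketched below on an in-memory adjacency map; the graph contents and the function name `jaccard_similarity` are illustrative assumptions, and a real system would query the graph database instead.

```python
def jaccard_similarity(graph, a, b):
    """Jaccard score of two nodes: shared neighbours / all neighbours."""
    neighbours_a = graph.get(a, set())
    neighbours_b = graph.get(b, set())
    union = neighbours_a | neighbours_b
    if not union:
        return 0.0  # two isolated or unknown nodes share nothing
    return len(neighbours_a & neighbours_b) / len(union)


# Hypothetical toy graph: each user maps to the items they interacted with.
graph = {
    "alice": {"book", "film", "song"},
    "bob": {"film", "song", "game"},
    "carol": {"game"},
}

score_ab = jaccard_similarity(graph, "alice", "bob")    # 2 shared of 4 distinct
score_ac = jaccard_similarity(graph, "alice", "carol")  # no shared neighbours
```

A recommender could then suggest to a user the items held by the highest-scoring neighbours that the user has not yet seen.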

Keywords: graph databases, NLP, recommendation systems, similarity metrics

Procedia PDF Downloads 102
24712 Digital Revolution a Veritable Infrastructure for Technological Development

Authors: Osakwe Jude Odiakaosa

Abstract:

Today’s digital society is characterized by e-education or e-learning, e-commerce, and so on. All these have been propelled by the digital revolution. Digital technologies such as computer technology, the Global Positioning System (GPS) and Geographic Information Systems (GIS) have had a tremendous impact on the field of technology. This development has positively affected the scope, methods and speed of data acquisition, data management and the rate of delivery of the results (maps and other map products) of data processing. This paper addresses the impact of the revolution brought about by digital technology.

Keywords: digital revolution, internet, technology, data management

Procedia PDF Downloads 444
24711 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy

Authors: Abdullah Al Mamun, Talal Alkharobi

Abstract:

As data sizes grow, people have become more accustomed to storing large amounts of secret information in cloud storage. Companies frequently need to transfer massive business files from one end to another. We are going to lose privacy if we transmit data as it is and keep repeating the same scenario without securing the communication mechanism, that is, without proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption, it can only encrypt a limited amount of data, which makes it unsuitable for large data encryption. In this paper we propose a probable approach, in the style of Pretty Good Privacy, for encrypting big data using both symmetric and asymmetric keys. Our goal is to encrypt huge collections of information and transmit them through a secure communication channel to protect business and personal privacy. To justify our method, an experimental dataset from three different platforms is provided. We would like to show that our approach works efficiently and reliably for massive volumes of varied data.

Keywords: big data, cloud computing, cryptography, hadoop, public key

Procedia PDF Downloads 314
24710 Implementation of Big Data Concepts Led by the Business Pressures

Authors: Snezana Savoska, Blagoj Ristevski, Violeta Manevska, Zlatko Savoski, Ilija Jolevski

Abstract:

Big data is widely accepted by pharmaceutical companies as a result of business demands created through legal pressure. Pharmaceutical companies face many legal demands as well as demands from standards, and have to adapt their procedures to the legislation. To manage these demands, they have to standardize the usage of current information technology and use the latest software tools. This paper highlights some important aspects of experience with implementing big data projects in a Macedonian pharmaceutical company. These projects improved the company's business processes with the help of new software tools selected to comply with legal and business demands. The company uses IT as a strategic tool to obtain competitive advantage on the market and to reengineer its processes towards the new Internet economy and quality demands. The company is required to manage vast amounts of structured as well as unstructured data. For these reasons, it implements projects for emerging and appropriate software tools that have to deal with the big data concepts accepted in the company.

Keywords: big data, unstructured data, SAP ERP, documentum

Procedia PDF Downloads 265
24709 Saving Energy at a Wastewater Treatment Plant through Electrical and Production Data Analysis

Authors: Adriano Araujo Carvalho, Arturo Alatrista Corrales

Abstract:

This paper shows how the analysis of electrical energy consumption and production data was used to find opportunities to save energy at the Taboada wastewater treatment plant in Callao, Peru. To access the data, independent data networks were used for the electrical and process instruments. The data were analyzed under an ISO 50001 energy audit, which considered Energy Performance Indexes for each process and a step-by-step guide presented in this text. Using this methodology and data mining techniques applied to information gathered through electronic multimeters (conveniently placed on substation switchboards connected to a cloud network), it was possible to identify thoroughly the performance of each process and thus reveal previously hidden saving opportunities. The data analysis reduced both costs and energy consumption, allowing the plant to save significant resources and to be certified under ISO 50001.
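
The abstract does not define its Energy Performance Indexes. A common form is energy consumed per unit of output, and the sketch below assumes that form (kWh per cubic metre treated); the process names, the readings and the function name `energy_performance_index` are hypothetical.

```python
def energy_performance_index(energy_kwh, output_m3):
    """Energy used per unit of output, here kWh per cubic metre treated."""
    if output_m3 <= 0:
        raise ValueError("output must be positive")
    return energy_kwh / output_m3


# Hypothetical daily readings per process: (energy in kWh, volume in m3).
readings = {
    "pumping": (1200.0, 40000.0),
    "screening": (300.0, 40000.0),
}

# One index per process, so processes can be compared on the same scale.
indexes = {
    name: energy_performance_index(energy, volume)
    for name, (energy, volume) in readings.items()
}

# The process with the highest index is the first candidate for savings.
most_intensive = max(indexes, key=indexes.get)
```

Tracking such an index per process over time, rather than total consumption alone, is what lets an audit separate a real efficiency change from a change in plant throughput.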

Keywords: energy and production data analysis, energy management, ISO 50001, wastewater treatment plant energy analysis

Procedia PDF Downloads 190
24708 Data Clustering in Wireless Sensor Network Implemented on Self-Organization Feature Map (SOFM) Neural Network

Authors: Krishan Kumar, Mohit Mittal, Pramod Kumar

Abstract:

The wireless sensor network is one of the most promising communication networks for monitoring remote environmental areas. In this network, all the sensor nodes communicate with each other via radio signals. The sensor nodes are capable of sensing, data storage and processing. The sensor nodes relay the collected information through neighboring nodes to a particular node. Data collection and processing are done by data aggregation techniques. For data aggregation, a clustering technique is implemented in the sensor network using a self-organizing feature map (SOFM) neural network. Some of the sensor nodes are selected as cluster head nodes. Information from non-cluster head nodes is aggregated at the cluster head nodes and then transferred to the base station (or sink nodes). The aim of this paper is to manage the huge amount of data with the help of the SOFM neural network. Clustered data is selected for transfer to the base station instead of all the information aggregated at the cluster head nodes. This reduces the battery consumption associated with managing huge amounts of data. The network lifetime is enhanced to a great extent.

Keywords: artificial neural network, data clustering, self organization feature map, wireless sensor network

Procedia PDF Downloads 511
24707 Review and Comparison of Associative Classification Data Mining Approaches

Authors: Suzan Wedyan

Abstract:

Data mining is one of the main phases in Knowledge Discovery in Databases (KDD) and is responsible for finding hidden and useful knowledge in databases. There are many different data mining tasks, including regression, pattern recognition, clustering, classification, and association rule discovery. In recent years, a promising data mining approach called associative classification (AC) has been proposed. AC integrates classification and association rule discovery to build classification models (classifiers). This paper surveys and critically compares several AC algorithms with reference to the different procedures used in each algorithm, such as rule learning, rule sorting, rule pruning, classifier building, and class allocation for test cases.
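
To make the rule learning and rule sorting steps concrete, the sketch below computes support and confidence for a few class association rules and ranks them by confidence, then support. This ordering is a common convention and is assumed here; the surveyed algorithms may rank differently, and the data and names are hypothetical.

```python
def rule_stats(rows, antecedent, label):
    """Return (support, confidence) of the rule: antecedent -> label."""
    # Class labels of all rows that contain every item of the antecedent.
    matches = [cls for items, cls in rows if antecedent <= items]
    hits = sum(1 for cls in matches if cls == label)
    support = hits / len(rows)
    confidence = hits / len(matches) if matches else 0.0
    return support, confidence


# Hypothetical training rows: (set of attribute values, class label).
rows = [
    ({"sunny", "hot"}, "no"),
    ({"sunny", "mild"}, "no"),
    ({"rainy", "mild"}, "yes"),
    ({"rainy", "hot"}, "yes"),
    ({"sunny", "cool"}, "yes"),
]

# Candidate rules: (antecedent item set, predicted class).
rules = [
    (frozenset({"sunny"}), "no"),
    (frozenset({"rainy"}), "yes"),
    (frozenset({"hot"}), "no"),
]


def rank_key(rule):
    support, confidence = rule_stats(rows, *rule)
    return (-confidence, -support)  # higher confidence first, then support


ranked = sorted(rules, key=rank_key)
```

A classifier built from `ranked` would typically assign a test case the class of the first rule whose antecedent it satisfies, which is why the sorting step matters.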

Keywords: associative classification, classification, data mining, learning, rule ranking, rule pruning, prediction

Procedia PDF Downloads 532
24706 Hemispheric Locus and Gender Predict the Delay between the Moment of Stroke and Hospitalization

Authors: D. Anderlini, G. Wallis

Abstract:

Background: The number of people experiencing stroke is steadily increasing due to changes in diet and lifestyle, to longer life expectancy resulting in an older population, and to higher survival rates as a consequence of improvements in care during the acute phase. This study considers what risk factors might contribute to delayed entry to hospital for treatment. Methods: We analyzed data from 2472 patients admitted to the Stroke Unit of the Royal Brisbane Women's Hospital, Australia, between 2002 and 2011. Results: Previous studies have reported that factors which can contribute to delay include the patient’s age, the time of day, physical location, visiting a GP instead of going to the emergency department, means of transport, severity of symptoms and type of stroke. Contrary to the findings of other studies, we found a strong correlation between side of lesion and delay in admission: patients with right hemisphere lesions had an average delay of 3.78 days, while patients with left hemisphere lesions had an average delay of 1.49 days. Damage to the right hemisphere generally results in motor impairment in the non-dominant hand and no speech impediment. In contrast, left hemisphere lesions can result in deficits in dominant hand function and in aphasia, which will be noticed even if their impact on performance is relatively minor. A finding which goes against many previous studies is that women get to the hospital much sooner than men, showing an average delay of 0.92 days in women vs. 3.36 days in men. Conclusion: Acute surgical-pharmacological therapies are most effective if applied immediately after stroke. Hence delays to admission can be crucial to the degree of recovery. The tendency of patients to overlook symptoms of right hemisphere lesions should be the target of information campaigns for both the general public and GPs. Why do men go to hospital so late? We don't know yet! Nevertheless, an awareness plan specifically directed at the male population should be on the agenda of Health Departments.

Keywords: gender, admission delay, stroke location, bioinformatics, biomedicine

Procedia PDF Downloads 224