Search results for: data harvesting
24692 How Western Donors Allocate Official Development Assistance: New Evidence From a Natural Language Processing Approach
Authors: Daniel Benson, Yundan Gong, Hannah Kirk
Abstract:
Advances in natural language processing techniques have increased data processing speeds and reduced the need for the cumbersome manual processing that is often required when adapting data from multilateral organizations for specific purposes. Using named entity recognition (NER) modeling and the Organisation for Economic Co-operation and Development (OECD) Creditor Reporting System database, we present the first geotagged dataset of OECD donor Official Development Assistance (ODA) projects on a global, subnational basis. Our resulting data contains 52,086 ODA projects geocoded to subnational locations across 115 countries, worth a combined $87.9bn. This represents the first global OECD donor ODA project database with geocoded projects. We use this new data to revisit old questions of how ‘well’ donors allocate ODA to the developing world. This understanding is imperative for policymakers seeking to improve ODA effectiveness.
Keywords: international aid, geocoding, subnational data, natural language processing, machine learning
Procedia PDF Downloads 78
24691 Compressed Suffix Arrays to Self-Indexes Based on Partitioned Elias-Fano
Abstract:
A practical and simple self-indexing data structure, the Partitioned Elias-Fano (PEF) Compressed Suffix Array (CSA), is built in linear time for the CSA based on PEF indexes. The PEF-CSA is compared with two classical compressed indexing methods, the Ferragina and Manzini implementation (FMI) and Sad-CSA, on files of different types and sizes from the Pizza & Chili corpus. The PEF-CSA performs better on the existing data in terms of compression ratio and count and locate times, except for evenly distributed data such as protein data. The experiments show that the distribution of φ matters more than the alphabet size for the compression ratio: an unevenly distributed φ compresses better, and the larger the number of hits, the longer the count and locate times.
Keywords: compressed suffix array, self-indexing, partitioned Elias-Fano, PEF-CSA
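As a rough illustration of the encoding that PEF indexes build on, the sketch below implements plain (non-partitioned) Elias-Fano for a sorted integer sequence. It is not the PEF-CSA from the abstract: the function names `ef_encode` and `ef_decode`, the list-based bit vector, and the sample values are assumptions made for illustration only.

```python
import math


def ef_encode(values, universe):
    """Encode a sorted list of ints in [0, universe) with plain Elias-Fano.

    Hypothetical helper for illustration; a real index would pack the
    bits and add rank/select support.
    """
    n = len(values)
    # Low bits per element: floor(log2(universe / n)), never negative.
    low_bits = max(0, math.floor(math.log2(universe / n))) if n else 0
    mask = (1 << low_bits) - 1
    lows = [v & mask for v in values]
    # High parts as unary-coded gaps: a 0 advances the bucket, a 1 emits an element.
    high = []
    bucket = 0
    for v in values:
        h = v >> low_bits
        while bucket < h:
            high.append(0)
            bucket += 1
        high.append(1)
    return low_bits, lows, high


def ef_decode(low_bits, lows, high):
    """Invert ef_encode by walking the unary high-bit vector."""
    out = []
    bucket = 0
    i = 0
    for bit in high:
        if bit == 0:
            bucket += 1
        else:
            out.append((bucket << low_bits) | lows[i])
            i += 1
    return out


values = [3, 4, 7, 13, 14, 15, 21, 43]
low_bits, lows, high = ef_encode(values, universe=44)
decoded = ef_decode(low_bits, lows, high)
```

The partitioned variant splits the sequence into chunks and encodes each chunk separately, which is why the distribution of the encoded values affects the compression ratio.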
Procedia PDF Downloads 252
24690 Data, Digital Identity and Antitrust Law: An Exploratory Study of Facebook’s Novi Digital Wallet
Authors: Wanjiku Karanja
Abstract:
Facebook has monopoly power in the social networking market. It has grown and entrenched its monopoly power through the capture of its users’ data value chains. However, antitrust law’s consumer welfare roots have prevented it from effectively addressing the role of data capture in Facebook’s market dominance. These regulatory blind spots are augmented in Facebook’s proposed Diem cryptocurrency project and its Novi Digital wallet. Novi, which is Diem’s digital identity component, shall enable Facebook to collect an unprecedented volume of consumer data. Consequently, Novi has seismic implications for internet identity, as the network effects of Facebook’s large user base could establish it as the de facto internet identity layer. Moreover, the large tracts of data Facebook shall collect through Novi shall further entrench Facebook's market power. As such, the attendant lock-in effects of this project shall be very difficult to reverse. Urgent regulatory action is therefore required to prevent this expansion of Facebook’s data resources and monopoly power. This research thus highlights the importance of data capture to competition and market health in the social networking industry. It utilizes interviews with key experts to empirically interrogate the impact of Facebook’s data capture and control of its users’ data value chains on its market power. This inquiry is contextualized against Novi’s expansive effect on Facebook’s data value chains. It thus addresses the novel antitrust issues arising at the nexus of Facebook’s monopoly power and the privacy of its users’ data. It also explores the impact of platform design principles, specifically data interoperability and data portability, in mitigating Facebook’s anti-competitive practices. As such, this study finds that Facebook is a powerful monopoly that dominates the social media industry to the detriment of potential competitors.
Facebook derives its power from its size, annexure of the consumer data value chain, and control of its users’ social graphs. Additionally, the platform design principles of data interoperability and data portability are not a panacea to restoring competition in the social networking market. Their success depends on the establishment of robust technical standards and regulatory frameworks.
Keywords: antitrust law, data protection law, data portability, data interoperability, digital identity, Facebook
Procedia PDF Downloads 123
24689 Recommendations for Data Quality Filtering of Opportunistic Species Occurrence Data
Authors: Camille Van Eupen, Dirk Maes, Marc Herremans, Kristijn R. R. Swinnen, Ben Somers, Stijn Luca
Abstract:
In ecology, species distribution models are commonly implemented to study species-environment relationships. These models increasingly rely on opportunistic citizen science data when high-quality species records collected through standardized recording protocols are unavailable. While these opportunistic data are abundant, uncertainty is usually high, e.g., due to observer effects or a lack of metadata. Data quality filtering is often used to reduce these types of uncertainty in an attempt to increase the value of studies relying on opportunistic data. However, filtering should not be performed blindly. In this study, recommendations are built for data quality filtering of opportunistic species occurrence data that are used as input for species distribution models. Using an extensive database of 5.7 million citizen science records from 255 species in Flanders, the impact on model performance was quantified by applying three data quality filters, and these results were linked to species traits. More specifically, presence records were filtered based on record attributes that provide information on the observation process or post-entry data validation, and changes in the area under the receiver operating characteristic (AUC), sensitivity, and specificity were analyzed using the Maxent algorithm with and without filtering. Controlling for sample size enabled us to study the combined impact of data quality filtering, i.e., the simultaneous impact of an increase in data quality and a decrease in sample size. Further, the variation among species in their response to data quality filtering was explored by clustering species based on four traits often related to data quality: commonness, popularity, difficulty, and body size. 
Findings show that model performance is affected by i) the quality of the filtered data, ii) the proportional reduction in sample size caused by filtering and the remaining absolute sample size, and iii) a species' ‘quality profile’, resulting from a species classification based on the four traits related to data quality. The findings resulted in recommendations on when and how to filter volunteer-generated and opportunistically collected data. This study confirms that correctly processed citizen science data can make a valuable contribution to ecological research and species conservation.
Keywords: citizen science, data quality filtering, species distribution models, trait profiles
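The sensitivity and specificity that the abstract above tracks before and after filtering can be computed from binary presence/absence predictions. The following is a minimal sketch, not the study's Maxent workflow; the function name and the toy labels are assumptions for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = presence, 0 = absence)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # presences predicted present
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # presences missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # absences predicted absent
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # absences predicted present
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, made up for illustration only.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(observed, predicted)
```

Comparing these two values for a model trained with and without a given filter, at matched sample size, mirrors the kind of comparison the abstract describes.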
Procedia PDF Downloads 202
24688 Data Quality Enhancement with String Length Distribution
Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda
Abstract:
The volume of collectable manufacturing data is increasing rapidly. At the same time, large-scale product recalls (mega recalls) are becoming a serious social problem. Under these circumstances, there is a growing need to prevent mega recalls through defect analysis, such as root cause analysis and anomaly detection, using manufacturing data. However, classifying the strings in manufacturing data with traditional methods takes too long to meet the requirements of quick defect analysis. We therefore present the String Length Distribution Classification (SLDC) method, which classifies strings correctly in a short time. The method learns character features, especially the string length distribution, from fields such as Product ID and Machine ID in the BOM and asset list. By applying the proposed method to strings in actual manufacturing data, we verified that string classification time can be reduced by 80%. This suggests that the requirement of quick defect analysis can be fulfilled.
Keywords: string classification, data quality, feature selection, probability distribution, string length
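The idea of classifying strings by their length distribution can be sketched as follows. This is not the authors' SLDC implementation: the scoring rule (summed log-probability of observed lengths), the smoothing floor, and the sample IDs are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def length_distribution(strings):
    """Estimate P(length) from example strings of one class."""
    counts = Counter(len(s) for s in strings)
    total = sum(counts.values())
    return {length: c / total for length, c in counts.items()}


def classify(strings, class_distributions, floor=1e-6):
    """Pick the class whose length distribution best explains the strings.

    `floor` is an assumed smoothing value for lengths never seen in training.
    """
    best_label, best_score = None, -math.inf
    for label, dist in class_distributions.items():
        score = sum(math.log(dist.get(len(s), floor)) for s in strings)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training examples; real IDs would come from the BOM and asset list.
distributions = {
    "product_id": length_distribution(["P-000123", "P-000456", "P-000789"]),
    "machine_id": length_distribution(["M01", "M02", "M17", "M113"]),
}
label_a = classify(["P-991234", "P-120000"], distributions)
label_b = classify(["M44", "M05"], distributions)
```

Because only string lengths are inspected, scoring is a dictionary lookup per string, which is one plausible reason a length-based classifier can be fast.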
Procedia PDF Downloads 318
24687 Temporally Coherent 3D Animation Reconstruction from RGB-D Video Data
Authors: Salam Khalifa, Naveed Ahmed
Abstract:
We present a new method to reconstruct a temporally coherent 3D animation from single or multi-view RGB-D video data using unbiased feature point sampling. Given RGB-D video data in the form of a 3D point cloud sequence, our method first extracts feature points using both color and depth information. In the subsequent steps, these feature points are used to match two 3D point clouds in consecutive frames, independent of their resolution. Our new motion-vector-based dynamic alignment method then fully reconstructs a spatio-temporally coherent 3D animation. We perform extensive quantitative validation using novel error functions to analyze the results. We show that, despite the limiting factors of temporal and spatial noise associated with RGB-D data, it is possible to extract temporal coherence and faithfully reconstruct a temporally coherent 3D animation from RGB-D video data.
Keywords: 3D video, 3D animation, RGB-D video, temporally coherent 3D animation
Procedia PDF Downloads 373
24686 Determining Abnomal Behaviors in UAV Robots for Trajectory Control in Teleoperation
Authors: Kiwon Yeom
Abstract:
Change points are abrupt variations in a data sequence. Detection of change points is useful in modeling, analyzing, and predicting time series in application areas such as robotics and teleoperation. In this paper, a change point is defined as a discontinuity in one of the derivatives of the data sequence. This paper presents a reliable method for detecting discontinuities within three-dimensional trajectory data. The problem of determining one or more discontinuities is considered for regular and irregular trajectory data from teleoperation. We examine the geometric detection algorithm and illustrate the use of the method on real data examples.
Keywords: change point, discontinuity, teleoperation, abrupt variation
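The definition of a change point as a discontinuity in a derivative can be illustrated with finite differences on a sampled 3D trajectory. This sketch is not the geometric detection algorithm the paper examines: it assumes uniformly spaced samples, and the function name and threshold are made up for illustration.

```python
import math


def derivative_jumps(points, threshold):
    """Return sample indices where the discrete velocity jumps by more than `threshold`.

    `points` is a list of (x, y, z) tuples sampled at a uniform rate (an assumption).
    """
    # First differences approximate the first derivative between consecutive samples.
    velocities = [
        tuple(b - a for a, b in zip(p, q))
        for p, q in zip(points, points[1:])
    ]
    change_points = []
    for i in range(1, len(velocities)):
        # velocities[i - 1] arrives at sample i and velocities[i] leaves it.
        jump = math.dist(velocities[i], velocities[i - 1])
        if jump > threshold:
            change_points.append(i)
    return change_points


# A path that moves along x, then turns sharply to move along y.
trajectory = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (3, 1, 0), (3, 2, 0)]
corners = derivative_jumps(trajectory, threshold=0.5)
```

On noisy or irregularly sampled teleoperation data the differences would need to be divided by the time step and smoothed before thresholding.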
Procedia PDF Downloads 167
24685 Microalgae Bacteria Granules, an Alternative Technology to the Conventional Wastewater Treatment: Structural and Metabolic Characterization
Authors: M. Nita-Lazar, E. Manea, C. Bumbac, A. Banciu, C. Stoica
Abstract:
Population and economic growth have generated a significant number of new pollutant compounds, which have to be degraded before reaching the environment. Wastewater treatment plants (WWTPs) have been the last barrier between domestic and/or industrial wastewaters and the environment. At present, conventional WWTPs have very high operational costs, most of them linked to the aeration process (60-65% of the total energy costs related to wastewater treatment). In addition, they have shown low efficiency in removing pollutants such as pharmaceuticals and other resilient anthropogenic compounds. Our study focused on new wastewater treatment strategies to enhance the efficiency of pollutant removal and decrease the operational costs of wastewater treatment. The use of mixed microalgae-bacteria granule technology delivered high efficiency and low costs through better harvesting and less expensive aeration. The intertrophic relationships between microalgae and bacteria were characterized from the structure of the population community to their metabolic relationships. The results, obtained by microscopic studies, showed well-organized and stratified microalgae-bacteria granules in which bacteria were enveloped in the microalgal structures. Moreover, their population community structure, as well as their nitrification and denitrification processes (analysis based on qPCR gene expression), was modulated by the type and amount of the pollutant compounds. In conclusion, understanding and modulating the intertrophic relationships between microalgae and bacteria could be an economically and technologically viable alternative to conventional wastewater treatment. Acknowledgements: This research was supported by grant PN-III-P4-ID-PCE-2016-0865 from the Romanian National Authority for Scientific Research and Innovation CNCS/CCCDI-UEFISCDI.
Keywords: activated sludge, bacteria, granules, microalgae
Procedia PDF Downloads 122
24684 Multidimensional Item Response Theory Models for Practical Application in Large Tests Designed to Measure Multiple Constructs
Authors: Maria Fernanda Ordoñez Martinez, Alvaro Mauricio Montenegro
Abstract:
This work presents a statistical methodology for measuring and finding constructs in Latent Semantic Analysis. The approach combines the properties of Factor Analysis on binary data with interpretations from Item Response Theory. More precisely, we propose first reducing dimensionality with Principal Component Analysis on the linguistic data and then producing axes of groups derived from a clustering analysis of the semantic data. This approach allows the user to give meaning to previous clusters and to find the real latent structure present in the data. The methodology is applied to a set of real semantic data, with impressive results for coherence, speed, and precision.
Keywords: semantic analysis, factorial analysis, dimension reduction, penalized logistic regression
Procedia PDF Downloads 443
24683 Analysis of Production Forecasting in Unconventional Gas Resources Development Using Machine Learning and Data-Driven Approach
Authors: Dongkwon Han, Sangho Kim, Sunil Kwon
Abstract:
Unconventional gas resources have dramatically changed the future energy landscape. Unlike conventional gas resources, a key challenge in unconventional gas has been the need for advanced production forecasting approaches, owing to the uncertainty and complexity of fluid flow. In this study, an artificial neural network (ANN) model that integrates machine learning and a data-driven approach was developed to predict productivity in shale gas. A database of 129 wells in the Eagle Ford shale basin was used for training and testing the ANN model. Input data related to hydraulic fracturing, well completion, and shale gas productivity were selected, and the output is cumulative production. The performance of the ANN using all data sets, the clustering models, and the variable importance (VI) models was compared using the mean absolute percentage error (MAPE). The MAPE values were 44.22% for the ANN using all data sets; 10.08% (cluster 1), 5.26% (cluster 2), and 6.35% (cluster 3) for the clustering models; and 32.23% (ANN VI) and 23.19% (SVM VI) for the VI models. The results showed that the pre-trained ANN model provides more accurate results than the ANN model using all data sets.
Keywords: unconventional gas, artificial neural network, machine learning, clustering, variables importance
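The mean absolute percentage error used to compare the models above is the average of |actual - predicted| / |actual|, expressed as a percentage. The short sketch below shows the calculation; the function name and the sample production figures are assumptions for illustration and are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up cumulative production values for two wells.
error = mape([100.0, 200.0], [90.0, 220.0])
```

Because each error is scaled by the actual value, wells with small cumulative production weigh as heavily as large ones, which is worth keeping in mind when comparing MAPE across clusters of different sizes.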
Procedia PDF Downloads 196
24682 Fungal Flocculation of Single Algae Species and Mixed Algal Communities
Authors: Digby Wrede, Stephen Gray, Syed Hussainy
Abstract:
Microalgae are extremely useful organisms but notoriously hard to harvest. The use of fungal pellets has been found to be an efficient way to flocculate numerous species of algae. However, only the flocculation of single species of algae has been investigated. In the environment, algae are generally found in complex communities comprising numerous species, ranging from simple single-cell algae such as Chlorella to more complex or communal algae such as Dictyosphaerium. This study investigated the capability of Aspergillus oryzae to flocculate four species of algae (Chlorella vulgaris, Scenedesmus quadricauda, Scenedesmus acuminatus, and Dictyosphaerium sp.) and the algal communities in four different types of domestic effluent from a lagoon-based treatment plant: primary effluent, secondary effluent, and high rate algal pond (HRAP) effluent at a natural and at a lowered pH level. Spectrophotometry was used to measure the changes in algal population. C. vulgaris, S. acuminatus, and S. quadricauda had over 90% reduction of algae in suspension after 24 hours. Dictyosphaerium sp. showed little to no removal after 24 hours. The primary, secondary, and natural-pH HRAP effluents had roughly 50% removal after 24 hours, while the HRAP effluent grown at a lower pH level had over 90% removal after 24 hours. pH has previously been shown to affect fungal flocculation. Fungal and algal pellets have been shown to be able to treat wastewater and can be converted to biofuels by a method very similar to how algae are currently converted. The mixture of fungi and algae has also been shown to provide a higher yield of oils than either separately and to treat wastewater more efficiently than algae or fungi by themselves.
Keywords: algae harvesting, Aspergillus oryzae, fungal flocculation, wastewater treatment
Procedia PDF Downloads 161
24681 Procedure Model for Data-Driven Decision Support Regarding the Integration of Renewable Energies into Industrial Energy Management
Authors: M. Graus, K. Westhoff, X. Xu
Abstract:
Climate change causes changes in all aspects of society. While the expansion of renewable energies proceeds, general studies of the potential of demand side management have not convinced industry to reinforce smart grid considerations in its operational business. In this article, a procedure model for case-specific, data-driven decision support for industrial energy management, based on a holistic data analytics approach, is presented. The model is applied to the strategic decision problem of integrating renewable energies into industrial energy management. This question arises from considerations of changing the electricity contract model from a standard rate to volatile energy prices that follow the energy spot market, which is increasingly affected by renewable energies. The procedure model corresponds to a data analytics process consisting of data modeling, analysis, simulation, and optimization steps. This procedure will help to quantify the potential of sustainable production concepts based on the data from a factory. The model is validated with data from a printer, used as an analogy for a simple production machine. The overall goal is to establish smart grid principles for industry via the transformation from knowledge-driven to data-driven decisions within manufacturing companies.
Keywords: data analytics, green production, industrial energy management, optimization, renewable energies, simulation
Procedia PDF Downloads 435
24680 Dissimilarity-Based Coloring for Symbolic and Multivariate Data Visualization
Authors: K. Umbleja, M. Ichino, H. Yaguchi
Abstract:
In this paper, we propose a coloring method for multivariate data visualization using parallel coordinates, based on dissimilarity and tree structure information gathered during hierarchical clustering. The proposed method is an extension of proximity-based coloring, which suffers from a few undesired side effects if the hierarchical tree structure is not a balanced tree. We describe the algorithm for assigning colors based on dissimilarity information, show the application of the proposed method to three commonly used datasets, and compare the results with proximity-based coloring. We found our proposed method to be especially beneficial for symbolic data visualization, where many individual objects have already been aggregated into a single symbolic object.
Keywords: data visualization, dissimilarity-based coloring, proximity-based coloring, symbolic data
Procedia PDF Downloads 170
24679 The Impact of Data Science on Geography: A Review
Authors: Roberto Machado
Abstract:
We conducted a systematic review using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses methodology, analyzing 2,996 studies and synthesizing 41 of them to explore the evolution of data science and its integration into geography. By employing optimization algorithms, we accelerated the review process, significantly enhancing the efficiency and precision of literature selection. Our findings indicate that data science has developed over five decades, facing challenges such as the diversified integration of data and the need for advanced statistical and computational skills. In geography, the integration of data science underscores the importance of interdisciplinary collaboration and methodological innovation. Techniques like large-scale spatial data analysis and predictive algorithms show promise in natural disaster management and transportation route optimization, enabling faster and more effective responses. These advancements highlight the transformative potential of data science in geography, providing tools and methodologies to address complex spatial problems. The relevance of this study lies in the use of optimization algorithms in systematic reviews and the demonstrated need for deeper integration of data science into geography. Key contributions include identifying specific challenges in combining diverse spatial data and the necessity for advanced computational skills. Examples of connections between these two fields encompass significant improvements in natural disaster management and transportation efficiency, promoting more effective and sustainable environmental solutions with a positive societal impact.
Keywords: data science, geography, systematic review, optimization algorithms, supervised learning
Procedia PDF Downloads 29
24678 Developing Structured Sizing Systems for Manufacturing Ready-Made Garments of Indian Females Using Decision Tree-Based Data Mining
Authors: Hina Kausher, Sangita Srivastava
Abstract:
In India, there is no standard, systematic sizing approach for producing ready-made garments. Garment manufacturing companies use their own size tables, created by modifying international sizing charts for ready-made garments. The purpose of this study is to tabulate anthropometric data covering the variety of figure proportions in both height and girth. A total of 3,000 data records were collected through an anthropometric survey of females between the ages of 16 and 80 years from some states of India, in order to produce a sizing system suitable for clothing manufacture and retailing. The data are used for the statistical analysis of body measurements and the formulation of sizing systems and body measurement tables. Factor analysis is used to filter the control body dimensions from a large number of variables. Decision tree-based data mining is used to cluster the data. The standard and structured sizing system can facilitate pattern grading and garment production. Moreover, it can exceed buying ratios and upgrade size allocations to retail segments.
Keywords: anthropometric data, data mining, decision tree, garments manufacturing, sizing systems, ready-made garments
Procedia PDF Downloads 133
24677 A Framework on Data and Remote Sensing for Humanitarian Logistics
Authors: Vishnu Nagendra, Marten Van Der Veen, Stefania Giodini
Abstract:
Effective humanitarian logistics operations are a cornerstone of successful disaster relief operations. To be effective, however, they need to be demand driven and supported by adequate data for prioritization. Without such data, operations are carried out in an ad hoc manner and eventually become chaotic. The current availability of geospatial data helps in creating models for predictive damage and vulnerability assessment, which can help logisticians understand the nature and extent of the disaster damage. This translates into actionable information on the demand for relief goods, the state of the transport infrastructure, and subsequently the priority areas for relief delivery. However, due to the unpredictable nature of disasters, the accuracy of the models needs improvement, which can be achieved using remote sensing data from UAVs (Unmanned Aerial Vehicles) or satellite imagery, which in turn come with certain limitations. This research addresses the need for a framework to combine data from different sources to support humanitarian logistics operations and prediction models. The focus is on developing a workflow to combine data from satellites and UAVs after a disaster strikes. A three-step approach is followed. First, the data requirements for logistics activities are made explicit by carrying out semi-structured interviews with on-field logistics workers. Second, the limitations of current data collection tools are analyzed to develop workaround solutions by following a systems design approach. Third, the data requirements and the developed workaround solutions are fitted together into a coherent workflow. The outcome of this research will provide a new method for logisticians to obtain immediately accurate and reliable data to support data-driven decision making.
Keywords: unmanned aerial vehicles, damage prediction models, remote sensing, data driven decision making
Procedia PDF Downloads 378
24676 The Role of Sustainable Financing Models for Smallholder Tree Growers in Ghana
Authors: Raymond Awinbilla
Abstract:
The call for tree planting has long been set in motion by the government of Ghana. The Forestry Commission encourages plantation development through numerous interventions, including formulating policies and enacting legislation. However, forest policies have failed, and that has generated major concern over the vast gap between the intentions of national policies and the realities on the ground. This study addresses three objectives: 1) assessing the farmers' response and contribution to the tree planting initiative, 2) identifying socio-economic factors hindering the development of smallholder plantations as a livelihood strategy, and 3) determining the level of support available for smallholder tree growers and the factors influencing it. The field work was done in 12 farming communities in Ghana. The article shows that farmers have responded to the call for tree planting and have planted both exotic and indigenous tree species. Farmers have converted 17.2% (369.48 ha) of their total land size into plantations and have no problem with land tenure. Operations and marketing constraints include lack of funds for operations, delay in payment, low price of wood, manipulation of price by buyers, documentation by buyers, and no ready market for harvesting wood products. Environmental institutions encourage tree planting; the only exception is the Lands Commission. Support availed to farmers includes capacity building in silvicultural practices, organisation of farmers, and linkage to markets and finance. Efforts by the Government of Ghana to enhance forest resources in the country could rely on the input of local populations.
Keywords: livelihood strategy, marketing constraints, environmental institutions, silvicultural practices
Procedia PDF Downloads 58
24675 Facility Data Model as Integration and Interoperability Platform
Authors: Nikola Tomasevic, Marko Batic, Sanja Vranes
Abstract:
Emerging Semantic Web technologies can be seen as the next step in the evolution of intelligent facility management systems. In particular, this involves increased usage of open source and/or standardized concepts for data classification and semantic interpretation. To deliver such facility management systems, providing a comprehensive integration and interoperability platform in the form of a facility data model is a prerequisite. In this paper, one possible modelling approach for providing such an integrative facility data model, based on the ontology modelling concept, is presented. The complete ontology development process is described, starting from input data acquisition, through the definition of ontology concepts, to the population of those concepts. First, the core facility ontology was developed, representing a generic facility infrastructure comprising the common facility concepts relevant from the facility management perspective. To develop the data model of a specific facility infrastructure, the core facility ontology was first extended and then populated. For the development of full-blown facility data models, Malpensa and Fiumicino airports in Italy, two major European air-traffic hubs, were chosen as a test-bed platform. Furthermore, the way these ontology models supported the integration and interoperability of the overall airport energy management system was analyzed as well.
Keywords: airport ontology, energy management, facility data model, ontology modeling
Procedia PDF Downloads 448
24674 In Vitro Propagation in Barleria prionitis L. Via Callus Organogenesis
Authors: Rashmi Ranade, Neelu Joshi
Abstract:
Barleria prionitis L. is a well-explored Indian medicinal plant valued for its stem and leaf, which form an important ingredient of many Ayurvedic formulations. It is used for the treatment of various disorders such as toothache, bleeding gums, whooping cough, inflammation, arthritis, enlargement of the scrotum, and sciatica, and for strengthening gums. The plant is propagated vegetatively through stem cuttings. Frequent harvesting of this plant has led to a shortage of planting material, and it has acquired the status of a vulnerable plant species. Plant tissue culture technology offers a very good alternative for the propagation and conservation of such plant species. The present investigation was undertaken to develop an in vitro regeneration protocol for B. prionitis L. via the callus organogenesis pathway. Stem and leaf explants were used for this purpose. Different media and plant growth regulators were optimized to develop the protocol. The problem of phenol secretion and browning of in vitro cultures at the establishment phase was successfully curbed with the use of antibrowning agents such as ascorbic acid and activated charcoal. Optimum shoot multiplication was achieved by the use of liquid media and the incorporation of silver nitrate and TIBA (triiodobenzoic acid) into the media. A high rooting percentage (76%) was observed on WPM media supplemented with IBA (2.0 mg/l), IAA (0.5 mg/l), GA3(0.5) and activated charcoal (500 mg/l). The rooted plantlets were subjected to in vitro hardening on sterile potting mix (soil:farmyard manure:compost; 1:2:1) and acclimatized under greenhouse conditions. Around 85% survival of plantlets was recorded upon acclimatization. This lab-scale protocol would be tested for in vitro scaling up of production of B. prionitis L.
Keywords: explant browning, liquid culture, micropropagation, shoot multiplication, phenolic secretion
Procedia PDF Downloads 283
24673 Anti-Tyrosinase and Antibacterial Activities of Marine Fungal Extracts
Authors: Shivankar Agrawal, Sunil Kumar Deshmukh, Colin Barrow, Alok Adholeya
Abstract:
A variety of genetic and environmental factors cause various cosmetic and dermatological problems. Drugs claimed to treat these problems are already available on the market. However, the challenge remains to find cosmeceuticals that are more potent, environmentally friendly, and economical and that cause minimal side effects. This has led to an increased demand for natural cosmeceutical products over the last few decades. Plant-derived ingredients are limited because plants either contain toxic metabolites, grow too slowly, or pose problems of seasonal harvesting. The research carried out in this project aims at the isolation and characterization of marine fungal secondary metabolites and the evaluation of their potential use in future cosmetic skin care products. We have isolated and purified 35 morphologically different fungal isolates from various marine habitats of India. These isolates have been functionally characterized for anti-tyrosinase, antioxidant, and anti-acne activities. For molecular characterization, the Internal Transcribed Spacer (ITS) region of 15 functionally active marine fungal isolates was amplified using the universal primers ITS1 and ITS4 and sequenced. Of the 15 marine fungal isolates, crude extracts of strains D4 (Aspergillus terreus) and P2 (Talaromyces stipitatus) showed 70% and 57% tyrosinase inhibition at 1 mg/mL, respectively. Strain D5 (Simplicillium lamellicola) showed significant inhibition against Propionibacterium acnes and Staphylococcus epidermidis. In addition, all these strains displayed DPPH radical scavenging activity and may be utilized in skin cosmeceutical applications. Purification and characterization of the crude extracts to identify the active lead molecule are in progress.
Keywords: anti-acne, anti-tyrosinase, cosmeceutical, marine fungi
Procedia PDF Downloads 277
24672 A Machine Learning Model for Dynamic Prediction of Chronic Kidney Disease Risk Using Laboratory Data, Non-Laboratory Data, and Metabolic Indices
Authors: Amadou Wurry Jallow, Adama N. S. Bah, Karamo Bah, Shih-Ye Wang, Kuo-Chung Chu, Chien-Yeh Hsu
Abstract:
Chronic kidney disease (CKD) is a major public health challenge with high prevalence, rising incidence, and serious adverse consequences. Developing effective risk prediction models is a cost-effective approach to predicting and preventing complications of chronic kidney disease (CKD). This study aimed to develop an accurate machine learning model that can dynamically identify individuals at risk of CKD using various kinds of diagnostic data, with or without laboratory data, at different follow-up points. Creatinine is a key component used to predict CKD. These models will enable affordable and effective screening for CKD even with incomplete patient data, such as the absence of creatinine testing. This retrospective cohort study included data on 19,429 adults provided by a private research institute and screening laboratory in Taiwan, gathered between 2001 and 2015. Univariate Cox proportional hazard regression analyses were performed to determine the variables with high prognostic values for predicting CKD. We then identified interacting variables and grouped them according to diagnostic data categories. Our models used three types of data gathered at three points in time: non-laboratory, laboratory, and metabolic indices data. Next, we used subgroups of variables within each category to train two machine learning models (Random Forest and XGBoost). Our machine learning models can dynamically discriminate individuals at risk for developing CKD. All the models performed well using all three kinds of data, with or without laboratory data. Using only non-laboratory-based data (such as age, sex, body mass index (BMI), and waist circumference), both models predict chronic kidney disease as accurately as models using laboratory and metabolic indices data. Our machine learning models have demonstrated the use of different categories of diagnostic data for CKD prediction, with or without laboratory data. 
The machine learning models are simple to use and flexible because they work even with incomplete data and can be applied in any clinical setting, including settings where laboratory data is difficult to obtain.
Keywords: chronic kidney disease, glomerular filtration rate, creatinine, novel metabolic indices, machine learning, risk prediction
Procedia PDF Downloads 105
24671 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines
Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma
Abstract:
Useful information has been extracted from road accident data in the United Kingdom (UK), using data analytics methods, to help avoid possible accidents in rural and urban areas. This analysis makes use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of the UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since data is expected to grow continuously over time, this work primarily proposes a new framework model which can be trained, adapt itself to new data, and make accurate predictions. This work also throws some light on the use of the SVM methodology for text classifiers built from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of the SVM methodology for this kind of research work.
Keywords: support vector mechanism (SVM), machine learning (ML), support vector machines (SVM), department of transportation (DFT)
Procedia PDF Downloads 274
24670 A Relational Data Base for Radiation Therapy
Authors: Raffaele Danilo Esposito, Domingo Planes Meseguer, Maria Del Pilar Dorado Rodriguez
Abstract:
As far as we know, no commercial solution is yet available that allows the huge amount of data generated in a modern Radiation Oncology Department to be managed openly and configured to user needs. Currently available information management systems are mainly focused on Record & Verify and clinical data, and only to a small extent on physical data. This results in a partial and limited use of the information that is actually available. In the present work we describe the implementation at our department of a centralized information management system based on a web server. Our system manages both information generated during patient planning and treatment and information of general interest for the whole department (e.g. treatment protocols, quality assurance protocols). Our objective is to be able to analyze all the available data in a simple and efficient way and thus to obtain quantitative evaluations of our treatments. This would allow us to improve our workflow and protocols. To this end we have implemented a relational database which allows us to use all the available information in a practical and efficient way. As always, we use only license-free software.
Keywords: information management system, radiation oncology, medical physics, free software
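As a rough illustration of how a relational database can turn planning data into quantitative summaries, the sketch below uses Python's built-in `sqlite3` module. The abstract does not name the database engine, schema, or queries the authors used, so the table names, column names, and the `build_demo_db` and `dose_summary_by_site` helpers are all hypothetical, and the inserted rows are made-up placeholder values.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only
# and are not taken from the system described in the abstract.
SCHEMA = """
CREATE TABLE patient (
    id   INTEGER PRIMARY KEY,
    code TEXT NOT NULL UNIQUE          -- anonymized identifier
);
CREATE TABLE plan (
    id                 INTEGER PRIMARY KEY,
    patient_id         INTEGER NOT NULL REFERENCES patient(id),
    site               TEXT NOT NULL,  -- treated anatomical site
    prescribed_dose_gy REAL NOT NULL,
    fractions          INTEGER NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO patient (id, code) VALUES (?, ?)",
        [(1, "P001"), (2, "P002"), (3, "P003")],
    )
    conn.executemany(
        "INSERT INTO plan (patient_id, site, prescribed_dose_gy, fractions) "
        "VALUES (?, ?, ?, ?)",
        [(1, "breast", 40.0, 15), (2, "breast", 50.0, 25), (3, "prostate", 78.0, 39)],
    )
    conn.commit()
    return conn


def dose_summary_by_site(conn):
    """Return (site, number of plans, mean prescribed dose) for each site."""
    return conn.execute(
        "SELECT site, COUNT(*), AVG(prescribed_dose_gy) "
        "FROM plan GROUP BY site ORDER BY site"
    ).fetchall()


if __name__ == "__main__":
    for row in dose_summary_by_site(build_demo_db()):
        print(row)
```

Keeping patients and plans in separate tables linked by a key is what lets a single query aggregate across the whole department, which is the kind of quantitative evaluation the abstract aims for.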
Procedia PDF Downloads 240
24669 A Study of Safety of Data Storage Devices of Graduate Students at Suan Sunandha Rajabhat University
Authors: Komol Phaisarn, Natcha Wattanaprapa
Abstract:
This survey research studies the safety of the data storage devices used by graduate students at Suan Sunandha Rajabhat University in academic year 2013. Data were collected with a questionnaire on the safety of data storage devices according to the CIA principle. A sample of 81 students was drawn from the population by purposive sampling. The results show that most graduate students of academic year 2013 at Suan Sunandha Rajabhat University use handy drives to store their data and that the safety level of the devices is good.
Keywords: security, safety, storage devices, graduate students
Procedia PDF Downloads 352
24668 Simulation of a Cost Model Response Requests for Replication in Data Grid Environment
Authors: Kaddi Mohammed, A. Benatiallah, D. Benatiallah
Abstract:
Data grids are a technology that brings new challenges, such as the heterogeneity and availability of geographically distributed resources, fast data access, latency minimization, and fault tolerance. Researchers interested in this technology address problems such as task scheduling, load balancing, and replication. Replication is an effective way to achieve good performance in terms of data access and grid resource usage, and to improve data availability and cost. In a system with replication, a coherence protocol is used to impose some degree of synchronization between the various copies and some order on updates. In this project, we present an approach for placing replicas that minimizes the response cost of read and write requests, and we implement our model in a simulation environment. The placement techniques are based on a cost model which depends on several factors, such as bandwidth, data size, and storage nodes.
Keywords: response time, query, consistency, bandwidth, storage capacity, CERN
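The abstract names bandwidth, data size, and storage nodes as inputs to the cost model but does not give its formula. The sketch below is therefore only one plausible reading, not the authors' model: it assumes the response cost of a request is the transfer time (size divided by bandwidth), that local access is free, and that a node is a feasible replica site only if it has enough free storage. All function names, node names, and numbers are hypothetical.

```python
def transfer_cost(size_mb, bandwidth_mbps):
    """Seconds needed to move size_mb megabytes over a bandwidth_mbps link."""
    return size_mb * 8 / bandwidth_mbps


def placement_cost(candidate, requests, bandwidth, size_mb):
    """Total response cost if the replica is stored on `candidate`.

    requests  maps each node to the number of requests it issues.
    bandwidth maps frozenset({a, b}) to the link bandwidth in Mbit/s.
    Assumption: a node that holds the replica serves itself at zero cost.
    """
    total = 0.0
    for node, count in requests.items():
        if node == candidate:
            continue
        link = bandwidth[frozenset((node, candidate))]
        total += count * transfer_cost(size_mb, link)
    return total


def best_replica_site(free_storage_mb, requests, bandwidth, size_mb):
    """Pick the feasible node with the lowest total response cost."""
    feasible = [n for n, free in free_storage_mb.items() if free >= size_mb]
    if not feasible:
        raise ValueError("no node has enough free storage for the replica")
    return min(feasible, key=lambda n: placement_cost(n, requests, bandwidth, size_mb))


if __name__ == "__main__":
    # Made-up three-node grid used only to exercise the functions.
    free = {"A": 100, "B": 1000, "C": 1000}
    reqs = {"A": 10, "B": 1, "C": 5}
    bw = {
        frozenset(("A", "B")): 100,
        frozenset(("A", "C")): 10,
        frozenset(("B", "C")): 50,
    }
    print(best_replica_site(free, reqs, bw, size_mb=500))
```

A fuller model would also charge write requests for propagating updates to every copy, which is where the coherence protocol mentioned in the abstract would enter.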
Procedia PDF Downloads 271
24667 Improving the Genetic Diversity of Soybean Seeds and Tolerance to Drought Irradiated with Gamma Rays
Authors: Aminah Muchdar
Abstract:
Several approaches, including introduction, crossing, mutation, and genetic transformation, are used to increase the genetic diversity of soybean so that it can adapt to the agroecology of Indonesia. The purpose of this research is to obtain early-maturing, large-seeded, drought-tolerant soybean mutant lines with high yield potential. The study consisted of two stages. The first was a gamma-ray sensitivity test carried out in the BATAN laboratory. The variety used was Anjasmoro. Seeds were irradiated with gamma rays at a source activity of 1046.16976 Ci for irradiation times of 0-71 minutes. The irradiation doses were 0, 100, 200, 300, 400, 500, 600, 700, 800, 900 and 1000 Gy. The results indicated that, of all the doses from 0 to 1000 Gy, only 200 and 300 Gy gave the best germination percentage, plant height, number of leaves, number of normal sprouts, and green leaves, so these doses could be carried forward to a second trial to assemble and obtain the expected mutants. The second stage was a planting trial of the irradiated M2 soybean population, in which the seed planted was the first derivative of the M2 irradiated seeds. At 30 ADP, the plants already showed growth and development that varied compared with the parent in plant height, number of leaves, leaf shape, and leaf forage level. In the generative phase, some plants irradiated at 200 and 300 Gy formed flower packs but no pods, while others formed flower packs but few pods. The soybean morphological characters observed included plant height, number of branches, pods, days to flowering, harvesting, seed weight, and seed number.
Keywords: gamma ray, genetic mutation, irradiation, soybean
Procedia PDF Downloads 400
24666 Prompt Design for Code Generation in Data Analysis Using Large Language Models
Authors: Lu Song Ma Li Zhi
Abstract:
With the rapid advancement of artificial intelligence technology, large language models (LLMs) have become a milestone in the field of natural language processing, demonstrating remarkable capabilities in semantic understanding, intelligent question answering, and text generation. These models are gradually penetrating various industries, particularly showcasing significant application potential in the data analysis domain. However, retraining or fine-tuning these models requires substantial computational resources and ample downstream task datasets, which poses a significant challenge for many enterprises and research institutions. Without modifying the internal parameters of the large models, prompt engineering techniques can rapidly adapt these models to new domains. This paper proposes a prompt design strategy aimed at leveraging the capabilities of large language models to automate the generation of data analysis code. By carefully designing prompts, data analysis requirements can be described in natural language, which the large language model can then understand and convert into executable data analysis code, thereby greatly enhancing the efficiency and convenience of data analysis. This strategy not only lowers the threshold for using large models but also significantly improves the accuracy and efficiency of data analysis. Our approach includes requirements for the precision of natural language descriptions, coverage of diverse data analysis needs, and mechanisms for immediate feedback and adjustment. Experimental results show that with this prompt design strategy, large language models perform exceptionally well in multiple data analysis tasks, generating high-quality code and significantly shortening the data analysis cycle. 
This method provides an efficient and convenient tool for the data analysis field and demonstrates the enormous potential of large language models in practical applications.
Keywords: large language models, prompt design, data analysis, code generation
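The abstract above describes prompts that state the analysis requirement precisely in natural language and a mechanism for feedback and adjustment, but it does not reproduce the authors' templates. The sketch below is a hypothetical illustration of that idea only: the `build_prompt` and `add_feedback` helpers, the wording of the template, and the choice of pandas as the target library are all assumptions. No model is called here; the functions only assemble text.

```python
def build_prompt(task, columns, library="pandas"):
    """Assemble a prompt describing the data and the requested analysis.

    task    is a natural-language description of the analysis.
    columns maps column names to a short type description.
    """
    column_lines = "\n".join(f"- {name}: {dtype}" for name, dtype in columns.items())
    return (
        "You are a data analysis assistant.\n"
        f"Write Python code using {library} for the task below.\n\n"
        f"Task: {task}\n\n"
        "The data is in a DataFrame named df with these columns:\n"
        f"{column_lines}\n\n"
        "Return only executable code, with no explanation."
    )


def add_feedback(prompt, generated_code, error_message):
    """Extend the prompt with the failed code and its error for a retry."""
    return (
        f"{prompt}\n\n"
        f"Your previous code was:\n{generated_code}\n\n"
        f"It failed with this error:\n{error_message}\n\n"
        "Return a corrected version."
    )


if __name__ == "__main__":
    prompt = build_prompt(
        "Compute mean sales per region",
        {"region": "str", "sales": "float"},
    )
    print(prompt)
    print(add_feedback(prompt, "df.mean()", "KeyError: 'sales'"))
```

Listing the column names and types explicitly is one way to meet the precision requirement the abstract mentions, and feeding the error message back gives the model the context it needs to adjust its output.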
Procedia PDF Downloads 39
24665 Comparison of Different Methods to Produce Fuzzy Tolerance Relations for Rainfall Data Classification in the Region of Central Greece
Authors: N. Samarinas, C. Evangelides, C. Vrekos
Abstract:
The aim of this paper is to compare three different methods for producing fuzzy tolerance relations for rainfall data classification. More specifically, the three methods are the correlation coefficient, cosine amplitude, and max-min methods. The data were obtained from seven rainfall stations in the region of central Greece and refer to 20-year time series of average monthly rainfall height. The three methods were used to express these data as a fuzzy relation. For each method, the resulting fuzzy tolerance relation was transformed into an equivalence relation by max-min composition. From the equivalence relation, the rainfall stations were categorized and classified according to the degree of confidence. The classification shows the similarities among the rainfall stations. Stations with high similarity can be used interchangeably in water resource management scenarios, or data from one can be used to augment data from another. Due to the complexity of the calculations, it is important to find out which of the methods is computationally simpler and needs fewer compositions to give reliable results.
Keywords: classification, fuzzy logic, tolerance relations, rainfall data
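The pipeline outlined in the abstract above (build a fuzzy tolerance relation, compose it with itself until it becomes an equivalence relation, then classify at a chosen confidence level) can be sketched with the standard textbook definitions of the cosine amplitude method, the max-min method, and max-min composition. This is a minimal sketch under those assumptions, not the authors' code: the function names are illustrative, the inputs are assumed to be non-negative and non-zero (as rainfall heights would be), and the station values in the demo are made up.

```python
import math


def cosine_amplitude(data):
    """r_ij = |sum_k x_ik x_jk| / sqrt(sum_k x_ik^2 * sum_k x_jk^2)."""
    n = len(data)
    rel = [[1.0] * n for _ in range(n)]  # the diagonal stays exactly 1 (reflexive)
    for i in range(n):
        for j in range(n):
            if i != j:
                dot = sum(a * b for a, b in zip(data[i], data[j]))
                norm = math.sqrt(
                    sum(a * a for a in data[i]) * sum(b * b for b in data[j])
                )
                rel[i][j] = abs(dot) / norm
    return rel


def max_min_method(data):
    """r_ij = sum_k min(x_ik, x_jk) / sum_k max(x_ik, x_jk)."""
    n = len(data)
    return [
        [
            sum(min(a, b) for a, b in zip(data[i], data[j]))
            / sum(max(a, b) for a, b in zip(data[i], data[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


def compose(r, s):
    """Max-min composition: t_ij = max_k min(r_ik, s_kj)."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def to_equivalence(rel):
    """Compose rel with itself until it stops changing.

    Returns the transitive relation and the number of compositions needed.
    """
    steps = 0
    while True:
        nxt = compose(rel, rel)
        if nxt == rel:
            return rel, steps
        rel, steps = nxt, steps + 1


def classify(rel, level):
    """Group indices whose membership in the equivalence relation is >= level."""
    classes, assigned = [], set()
    for i in range(len(rel)):
        if i not in assigned:
            group = [j for j in range(len(rel)) if rel[i][j] >= level]
            assigned.update(group)
            classes.append(group)
    return classes


if __name__ == "__main__":
    # Made-up rainfall heights (mm) for three hypothetical stations.
    stations = [[60.0, 45.0, 10.0], [58.0, 47.0, 12.0], [20.0, 15.0, 40.0]]
    equivalence, steps = to_equivalence(max_min_method(stations))
    print(steps, classify(equivalence, 0.8))
```

The step count returned by `to_equivalence` is the quantity the abstract cares about when it asks which method needs fewer compositions.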
Procedia PDF Downloads 314
24664 Customer Satisfaction and Effective HRM Policies: Customer and Employee Satisfaction
Authors: S. Anastasiou, C. Nathanailides
Abstract:
The purpose of this study is to examine the possible link between employee and customer satisfaction. The service provided by employees helps to build a good relationship with customers and can help increase their loyalty. Published data on job satisfaction and indicators of customer service were gathered from relevant published works, which included data from five different countries. The reviewed data indicate a significant correlation between indicators of customer and employee satisfaction in the banking sector (Pearson correlation R2=0.52, P<0.05). The reviewed data thus provide some practical evidence linking these two parameters.
Keywords: job satisfaction, job performance, customer service, banks, human resources management
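For readers who want to see how a Pearson correlation of the kind reported above is computed, here is a minimal sketch using only the Python standard library. The `pearson_r` helper and the paired scores in the demo are hypothetical and are not the study's data. If the reported R2 of 0.52 is a coefficient of determination, the corresponding correlation coefficient would have a magnitude of roughly 0.72.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up paired scores, for illustration only.
    employee_satisfaction = [3.1, 3.8, 4.0, 4.4, 3.5]
    customer_satisfaction = [3.0, 3.6, 4.2, 4.1, 3.9]
    r = pearson_r(employee_satisfaction, customer_satisfaction)
    print(r, r ** 2)
```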
Procedia PDF Downloads 321
24663 Management of Insect Pests Using Baculovirus Based Biopesticides in India
Authors: Mudasir Gani, Rakesh Kumar Gupta, Kamlesh Bali, Abdul Rouf Wani
Abstract:
The gypsy moth (Lymantria obfuscata) and tent caterpillar (Malacosoma indicum) are serious pests that attack a wide range of fruit and forest trees in the Jammu & Kashmir region of the North-Western Himalayas in India. Investigations were carried out to isolate and bioprospect naturally occurring nucleopolyhedroviruses (NPVs) as potent biopesticides against these pests. The biological and molecular characterization of NPV isolates from different ecosystems was conducted, and the polh, lef-8 and lef-9 genes were sequenced and subjected to phylogenetic analysis. The L. obfuscata NPV was more closely related to the L. dispar NPV, whereas the M. indicum NPV was more closely related to the M. californicum NPV in the NCBI taxonomy database. Among the different isolates, the Bhaderwah isolates exhibited the highest virus activity (LD₅₀ = 250 POBs/larva) and speed of kill (ST₅₀ = 6.80 days) against L. obfuscata, whereas the Mahor isolates proved most virulent against M. indicum, with the lowest LD₅₀ (257 POBs/larva) and ST₅₀ (6.80 days). In vivo mass production trials for highest productivity and quality revealed that, for L. obfuscata, the optimum yield was obtained when 3rd instar larvae were inoculated with a viral dose of 1.44 × 10⁵ POBs/larva and incubated for nine days. For M. indicum larvae, a viral dose of 2.88 × 10⁶ POBs/larva and an incubation period of 10 days were found to be optimum. Harvesting moribund larvae was found to yield good quality NPV. Field application of L. obfuscata NPV and M. indicum NPV against the respective host populations on apple and willow at the pre-standardized dosage of 1 × 10¹² POBs/acre reduced the larval population density by 25-63%.
Keywords: baculoviruses, biopesticides, Lymantria obfuscata, Malacosoma indicum
Procedia PDF Downloads 112