Search results for: data mining analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25609

24229 Issue Reorganization Using the Measure of Relevance

Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim

Abstract:

Recently, the demand for extracting R&D keywords from issues and using them to retrieve R&D information has been increasing rapidly. However, it is hard to identify related issues or to distinguish between them. Although the similarity between issues cannot be identified directly, an R&D lexicon makes it possible to determine which issues consistently share the same R&D keywords. In detail, the R&D keywords associated with a particular issue imply the key technology elements needed to solve the problem of that issue. Furthermore, related issues that share the same R&D keywords can be shown in a more systematic way through issue clustering constructed from the perspective of R&D. Thus, sharing of R&D results and reuse of R&D technology can be facilitated. Indirectly, redundant investment in the same R&D can be reduced, because R&D information can be shared between the corresponding issues and the reusability of related R&D can be improved. Therefore, a methodology for constructing an issue clustering from the perspective of common R&D keywords is proposed to satisfy these demands.
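
The idea of grouping issues that share R&D keywords can be illustrated with a minimal sketch. It links two issues whenever their keyword sets overlap strongly enough and returns the resulting groups. The issue names, the keywords, the `threshold` value and the use of Jaccard similarity are all assumptions made for illustration; the abstract does not state which similarity measure or clustering algorithm the authors use.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_issues(issue_keywords, threshold=0.5):
    """Group issues whose keyword sets overlap (single-link grouping via union-find).

    issue_keywords: dict mapping an issue name to its set of R&D keywords.
    threshold: assumed minimum Jaccard similarity for linking two issues.
    """
    parent = {issue: issue for issue in issue_keywords}

    def find(x):
        # Follow parent pointers to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(issue_keywords, 2):
        if jaccard(issue_keywords[a], issue_keywords[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for issue in issue_keywords:
        clusters.setdefault(find(issue), []).append(issue)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical issues and keywords, invented for this example.
issues = {
    "air pollution": {"filter", "sensor", "catalyst"},
    "fine dust": {"filter", "sensor", "mask"},
    "battery fire": {"electrolyte", "separator"},
}
clusters = cluster_issues(issues)
```

With these made-up inputs, the two issues that share "filter" and "sensor" end up in one group and the battery issue stays on its own.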

Keywords: clustering, social network analysis, text mining, topic analysis

Procedia PDF Downloads 573
24228 Cloning and Characterization of UDP-Glucose Pyrophosphorylases from Lactobacillus kefiranofaciens and Rhodococcus wratislaviensis

Authors: Mesfin Angaw Tesfay

Abstract:

Uridine-5′-diphosphate (UDP)-glucose is one of the most versatile building blocks within the metabolism of prokaryotes and eukaryotes, serving as an activated sugar donor during the glycosylation of natural products. It is formed by the enzyme UDP-glucose pyrophosphorylase (UGPase) using uridine-5′-triphosphate (UTP) and α-D-glucose 1-phosphate as substrates. Herein, two UGPase genes from Lactobacillus kefiranofaciens ZW3 (LkUGPase) and Rhodococcus wratislaviensis IFP 2016 (RwUGPase) were identified through genome mining approaches. LkUGPase and RwUGPase have 299 and 306 amino acids, respectively. Both UGPases have the conserved UTP binding site (G-X-G-T-R-X-L-P) and the glucose-1-phosphate binding site (V-E-K-P). LkUGPase and RwUGPase were cloned in E. coli, and SDS-PAGE analysis showed the expression of both enzymes, each forming a protein band of about 36 kDa after induction. LkUGPase and RwUGPase have activities of 1549.95 and 671.53 U/mg, respectively. Their kinetic properties are currently under investigation.

Keywords: UGPase, LkUGPase, RwUGPase, UDP-glucose, glycosylation

Procedia PDF Downloads 24
24227 Prediction of Marine Ecosystem Changes Based on the Integrated Analysis of Multivariate Data Sets

Authors: Prozorkevitch D., Mishurov A., Sokolov K., Karsakov L., Pestrikova L.

Abstract:

The current body of knowledge about the marine environment and the dynamics of marine ecosystems includes a huge amount of heterogeneous data collected over decades. It generally includes a wide range of hydrological, biological and fishery data. Marine researchers collect these data and analyze how and why the ecosystem changes from past to present. Based on these historical records and linkages between the processes it is possible to predict future changes. Multivariate analysis of trends and their interconnection in the marine ecosystem may be used as an instrument for predicting further ecosystem evolution. A wide range of information about the components of the marine ecosystem for more than 50 years needs to be used to investigate how these arrays can help to predict the future.

Keywords: Barents Sea ecosystem, abiotic, biotic, data sets, trends, prediction

Procedia PDF Downloads 117
24226 Optical Fiber Data Throughput in a Quantum Communication System

Authors: Arash Kosari, Ali Araghi

Abstract:

A mathematical model for an optical-fiber communication channel is developed, resulting in an expression for the throughput and loss of the corresponding link. The data are assumed to be transmitted using separate photons with different polarizations. The derived model also shows the dependence of data throughput on the length of the channel and the depolarization factor. It is observed that photon absorption affects the throughput more strongly than depolarization does. In addition, the probabilities of depolarization and absorption of the radiated photons are obtained.

Keywords: absorption, data throughput, depolarization, optical fiber

Procedia PDF Downloads 286
24225 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay and bandwidth are the prime concerns in wireless sensor networks (WSNs). Optimizing energy usage and using bandwidth efficiently are important issues in a WSN. Event-triggered data aggregation supports both goals in the event-affected area of a WSN. Reliable delivery of critical information to the sink node is also a major challenge. To tackle these issues, we propose an event-driven dynamic clustering and data aggregation scheme for WSNs that extends the lifetime of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever an event is triggered, the event-triggered node selects the cluster head. (2) The cluster head gathers data from the sensor nodes within the cluster. (3) The cluster head identifies and classifies the events in the collected data using a Bayesian classifier. (4) The data are aggregated using a statistical method. (5) The cluster head discovers paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data are critical, the cluster head sends them over multiple paths for reliable communication. (7) Otherwise, the aggregated data are transmitted to the sink node over the single path that has the most bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.
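
Steps (5) to (7) of the scheme can be sketched roughly as follows: candidate paths are ranked, critical data go out over several paths, and everything else goes over the single best one. The `Path` fields, their units, the ranking order (bandwidth first, then residual energy, then distance) and the number of paths `k` used for critical data are assumptions for illustration; the abstract does not specify how the three criteria are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    residual_energy: float  # energy left along the path (assumed unit: joules)
    bandwidth: float        # available bandwidth (assumed unit: kbit/s)
    distance: float         # path length to the sink (assumed unit: metres)


def rank_paths(paths):
    """Order paths: more bandwidth first, then more residual energy, then shorter distance."""
    return sorted(paths, key=lambda p: (-p.bandwidth, -p.residual_energy, p.distance))


def select_paths(paths, critical, k=2):
    """Return up to k paths for critical data, otherwise only the best-ranked path."""
    ranked = rank_paths(paths)
    return ranked[:k] if critical else ranked[:1]


# Hypothetical candidate paths discovered by a cluster head.
candidates = [
    Path("A", residual_energy=5.0, bandwidth=250.0, distance=120.0),
    Path("B", residual_energy=8.0, bandwidth=250.0, distance=150.0),
    Path("C", residual_energy=9.0, bandwidth=100.0, distance=80.0),
]
multipath = [p.name for p in select_paths(candidates, critical=True)]
single = [p.name for p in select_paths(candidates, critical=False)]
```

A weighted score over the three criteria would be an equally plausible design; the lexicographic ordering is used here only because it keeps the example easy to follow.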

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 451
24224 Offshore Outsourcing: Global Data Privacy Controls and International Compliance Issues

Authors: Michelle J. Miller

Abstract:

In recent years, two emerging issues that affect the global employment and business market have come to the fore, and the legal community must review them more closely: offshore outsourcing and data privacy. These two issues intersect because employment opportunities are shifting due to offshore outsourcing, and in some states, such as the United States, anti-outsourcing legislation has been passed or proposed to retain jobs within the country. In addition, the legal requirements for a global employer to protect the privacy of data extend to employees and third-party service providers, including services outsourced to offshore locations. For this reason, this paper reviews the intersection of these two issues with a specific focus on data privacy.

Keywords: outsourcing, data privacy, international compliance, multinational corporations

Procedia PDF Downloads 411
24223 Weighted Data Replication Strategy for Data Grid Considering Economic Approach

Authors: N. Mansouri, A. Asadi

Abstract:

A Data Grid is a geographically distributed environment that deals with data-intensive applications in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of the Latest Access Largest Weight strategy. Replication should be used wisely, however, because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replica replacement task. ELALW replaces replicas based on the number of future requests, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when several sites hold replicas. The proposed replica selection chooses the best replica location from among the many replicas based on response time, which is determined by considering the data transfer time, the storage access latency, the replica requests waiting in the storage queue and the distance between nodes. Simulation results obtained with OptorSim show that our replication strategy achieves better overall performance than other strategies in terms of job execution time, effective network usage and storage resource usage.
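
The replica selection step can be sketched as picking the site with the smallest estimated response time. The abstract names the four ingredients (transfer time, storage access latency, queued requests and distance) but not how they are combined, so the additive formula, the field names and the per-hop delay below are assumptions for illustration only.

```python
def response_time(replica, file_size_mb, per_hop_delay_s=0.01):
    """Estimate response time in seconds for fetching a file from one replica site.

    Assumed model: transfer time + storage latency + queue wait + hop delay.
    """
    transfer = file_size_mb / replica["bandwidth_mb_s"]
    queue_wait = replica["queued_requests"] * replica["service_time_s"]
    hop_delay = replica["hops"] * per_hop_delay_s
    return transfer + replica["storage_latency_s"] + queue_wait + hop_delay


def select_replica(replicas, file_size_mb):
    """Return the replica with the lowest estimated response time."""
    return min(replicas, key=lambda r: response_time(r, file_size_mb))


# Hypothetical sites holding a copy of the same 100 MB file.
replicas = [
    {"site": "site1", "bandwidth_mb_s": 50.0, "storage_latency_s": 0.5,
     "queued_requests": 4, "service_time_s": 1.0, "hops": 2},
    {"site": "site2", "bandwidth_mb_s": 20.0, "storage_latency_s": 0.2,
     "queued_requests": 0, "service_time_s": 1.0, "hops": 5},
]
best = select_replica(replicas, file_size_mb=100.0)
```

In this made-up case the slower link wins because the faster site has a long storage queue, which is the kind of trade-off a response-time criterion is meant to capture.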

Keywords: data grid, data replication, simulation, replica selection, replica placement

Procedia PDF Downloads 260
24222 Evaluation of Satellite and Radar Rainfall Product over Seyhan Plain

Authors: Kazım Kaba, Erdem Erdi, M. Akif Erdoğan, H. Mustafa Kandırmaz

Abstract:

Rainfall is a crucial data source for many disciplines, such as agriculture, hydrology and climate. Therefore, the rain rate of any area should be well known both spatially and temporally. Traditionally, rainfall has been measured for many years with rain gauges at meteorological ground stations. At present, rainfall products with temporal and spatial continuity are also derived from radar and satellite images. In this study, we investigated the accuracy of these rainfall data against rain-gauge data. For this purpose, we used the Adana-Hatay radar hourly total precipitation product (RN1) and the Meteosat convective rainfall rate (CRR) product over the Seyhan plain. We calculated daily rainfall values from the RN1 and CRR hourly precipitation products. We used data for the rainy days at four stations located within range of the radar from October 2013 to November 2015. We examined the two rainfall data sets over the Seyhan plain and found the correlation between the rain-gauge data and the two raster rainfall data sets to be low.
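
The two processing steps described above, summing hourly products into daily totals and correlating them with rain-gauge values, can be sketched as follows. The record layout, the sample values and the use of the Pearson coefficient are assumptions for illustration; the abstract does not say which correlation measure was used.

```python
from collections import defaultdict
from math import sqrt


def daily_totals(hourly_records):
    """Sum hourly precipitation (mm) per day.

    hourly_records: iterable of (date_string, hour, precipitation_mm) tuples.
    """
    totals = defaultdict(float)
    for day, _hour, mm in hourly_records:
        totals[day] += mm
    return dict(totals)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented hourly radar values and gauge totals for three rainy days.
hourly = [
    ("2014-01-01", 0, 1.0), ("2014-01-01", 1, 2.0),
    ("2014-01-02", 5, 0.5),
    ("2014-01-03", 7, 4.0),
]
gauge = {"2014-01-01": 3.5, "2014-01-02": 1.0, "2014-01-03": 2.0}

radar_daily = daily_totals(hourly)
days = sorted(gauge)
r = pearson([radar_daily[d] for d in days], [gauge[d] for d in days])
```

Only days present in the gauge record are compared, which mirrors the restriction to rainy days at the stations.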

Keywords: Meteosat, radar, rainfall, rain-gauge, Turkey

Procedia PDF Downloads 328
24221 Data-Driven Dynamic Overbooking Model for Tour Operators

Authors: Kannapha Amaruchkul

Abstract:

We formulate a dynamic overbooking model for a tour operator, in which most reservations contain at least two people. The cancellation rate and the timing of the cancellation may depend on the group size. We propose two overbooking policies, namely economic- and service-based. In an economic-based policy, we want to minimize the expected oversold and underused cost, whereas, in a service-based policy, we ensure that the probability of an oversold situation does not exceed the pre-specified threshold. To illustrate the applicability of our approach, we use tour package data in 2016-2018 from a tour operator in Thailand to build a data-driven robust optimization model, and we tested the proposed overbooking policy in 2019. We also compare the data-driven approach to the conventional approach of fitting data into a probability distribution.
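
The service-based policy, which caps the probability of an oversold situation, can be illustrated with a deliberately simplified static sketch. It assumes individual bookings that each show up independently with probability `show_prob` (a binomial model) and finds the largest number of bookings that keeps the oversold probability within the threshold. The paper's own model is dynamic, handles group reservations and uses data-driven robust optimization, so this is not the authors' method; all names and numbers are hypothetical.

```python
from math import comb


def oversold_probability(bookings, capacity, show_prob):
    """P(more than `capacity` of `bookings` customers show up) under a binomial model."""
    return sum(
        comb(bookings, k) * show_prob**k * (1 - show_prob) ** (bookings - k)
        for k in range(capacity + 1, bookings + 1)
    )


def max_bookings(capacity, show_prob, threshold):
    """Largest booking limit whose oversold probability does not exceed `threshold`."""
    if not 0 < show_prob <= 1 or not 0 <= threshold < 1:
        raise ValueError("need 0 < show_prob <= 1 and 0 <= threshold < 1")
    limit = capacity
    # The oversold probability never decreases as bookings are added, so stop at the first violation.
    while oversold_probability(limit + 1, capacity, show_prob) <= threshold:
        limit += 1
    return limit


# Tiny hypothetical example: 2 seats, 50% show-up rate, at most 20% oversold risk.
limit = max_bookings(capacity=2, show_prob=0.5, threshold=0.2)
```

An economic-based policy would instead minimize expected oversold plus underused cost over the same distribution.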

Keywords: applied stochastic model, data-driven robust optimization, overbooking, revenue management, tour operator

Procedia PDF Downloads 134
24220 Modeling and Statistical Analysis of a Soap Production Mix in Bejoy Manufacturing Industry, Anambra State, Nigeria

Authors: Okolie Chukwulozie Paul, Iwenofu Chinwe Onyedika, Sinebe Jude Ebieladoh, M. C. Nwosu

Abstract:

The research work is based on the statistical analysis of processing data. The aim is to analyze the data statistically and to generate a design model for the production mix of soap manufacturing products at the Bejoy manufacturing company, Nkpologwu, Aguata Local Government Area, Anambra State, Nigeria. The statistical analysis shows the correlation of the data. The t-test, partial correlation and bivariate correlation were used to understand what the data portray. The design model developed was used to model the production yield, and the correlation of the variables shows that R² is 98.7%. The results confirm that the data are fit for further analysis and modeling, as shown by the correlation and the R-squared value.

Keywords: General Linear Model, correlation, variables, Pearson, significance, T-test, soap, production mix, statistics

Procedia PDF Downloads 445
24219 The Importance of Imaging and Functional Tests for Early Detection of Occupational Diseases in Kosovo's Miners

Authors: Krenare Shabani, Kreshnike Dedushi Hoti, Serbeze Kabashi, Jeton Shatri, Arben Rroji, Mrikë Bunjaku, Leotrim Berisha, Jona Kosova, Edmond Puca, Bleriana Shabani

Abstract:

Introduction: Workers in Kosovo's mining industry are subjected to hazardous working conditions and airborne particles, such as silica dust, which can cause silicosis and other severe respiratory illnesses. The purpose of this research is to assess the health impacts of such exposures, as well as the importance of imaging and functional testing in detecting pathological changes early on. Methodology: The study is prospective and cross-sectional and was carried out during 2024. A total of 626 people (446 miners and 180 non-miners) were enrolled in the study. Subjects underwent spirometry and chest radiography. Data were analysed with SPSS 24. Results: The average age of the participants was 48 years. Demographics and Smoking: Smoking was common among young miners. Radiological Changes: Radiographic abnormalities in the lungs were seen in 23.1% of miners and 10.6% of non-miners, including small irregular opacities and emphysematous changes. Lung Function: The FEV1/FVC ratio decreased with increased exposure time, indicating a decline in pulmonary function. Impact of Exposure Duration: Longer exposure duration was associated with a higher number of miners experiencing coughs and requiring medical consultations such as CT scans and biopsies. Conclusions: Medical imaging and functional testing are critical for early diagnosis of lung abnormalities in miners. The findings demonstrate a strong correlation between extended exposure to mine dust and the development of respiratory disorders, emphasising the importance of preventative measures and routine health monitoring.

Keywords: silicosis, miners, imaging, spirometry

Procedia PDF Downloads 28
24218 pH-Responsive Carrier Based on Polymer Particle

Authors: Florin G. Borcan, Ramona C. Albulescu, Adela Chirita-Emandi

Abstract:

pH-responsive drug delivery systems are gaining importance because they deliver the drug at a specific time according to pathophysiological need, resulting in improved therapeutic efficacy and patient compliance. Polyurethane materials are well known for industrial applications (elastomers and foams used in insulation and the automotive industry), but they are also versatile biocompatible materials with many applications in medicine, such as artificial skin for premature neonates, membranes in the hybrid artificial pancreas, prosthetic heart valves, etc. This study aimed to obtain the physico-chemical characterization of a drug delivery system based on polyurethane microparticles. The synthesis is based on a polyaddition reaction between an aqueous phase (a mixture of polyethylene glycol M=200, 1,4-butanediol and Tween® 20) and an organic phase (lysin-diisocyanate in acetone) combined with simultaneous emulsification. Different active agents (omeprazole, amoxicillin, metoclopramide) were used to verify the release profile of the macromolecular particles in media of different pH. Zetasizer measurements were performed using an instrument based on two modules, a Vasco size analyzer and a Wallis Zeta potential analyzer (Cordouan Technol., France), on samples kept in various solutions with different pH, and the maximum absorbances in the UV-Vis spectra were collected on a UVi Line 9400 Spectrophotometer (SI Analytics, Germany). The results of this investigation revealed that these particles are suitable for prolonged release in gastric medium, where they can ensure an almost constant concentration of the active agents for 1-2 weeks, while they disassemble faster in a medium with neutral pH, such as the intestinal fluid.

Keywords: lysin-diisocyanate, nanostructures, polyurethane, Zetasizer

Procedia PDF Downloads 184
24217 The Study of Internship Performances: Comparison of Information Technology Interns towards Students’ Types and Background Profiles

Authors: Shutchapol Chopvitayakun

Abstract:

An internship program is a compulsory course in many undergraduate programs in Thailand. It gives many senior students the opportunity to practice their working skills as interns in real organizations and to face real-world working problems. Interns also learn how to solve those problems through direct and indirect experience. In many schools this program is a well-structured course with a contract or agreement made with real business organizations. Moreover, the program offers interns the opportunity to get a job, after completing it, at the organization where the internship took place. Interns also learn how to work in a team and how to associate with colleagues, trainers, and superiors in each organization in terms of social hierarchy, self-responsibility, and self-discipline. This research focuses on senior students of Suan Sunandha Rajabhat University, Thailand, majoring in information technology. They practiced their working skills or took internship programs in the real business sector or real operating organizations in 2015-2016. Interns are categorized into two types: normal program and special program. In the special program, students study on weekday evenings from Monday to Friday or at weekends, and most of them work full-time or part-time. In the normal program, students study during weekday working hours, and most of them do not work. The differences between these characteristics and the outcomes of internship performance were studied and analyzed in this research. This work applied statistical analytics to find out whether internship performance differs statistically between the two intern types.

Keywords: internship, intern, senior student, information technology program

Procedia PDF Downloads 263
24216 Examination of Occupational Health and Safety Practices in Ghana

Authors: Zakari Mustapha, Clinto Aigbavboa, Wellinton Didi Thwala

Abstract:

Occupational Health and Safety (OHS) issues have been a major challenge to the Ghanaian government. The purpose of the study was to examine OHS practices in Ghana. The study looked at the views of different scholars on OHS practices in order to achieve its objective. A literature review was conducted on OHS in Ghana. Findings from the study show that the Ministry of Roads and Transport (MRT) and the Ministry of Water Resources, Works and Housing (MWRWH) are the two government ministries in charge of construction and of implementing construction sector policy. The Factories, Offices and Shops Act 1970, Act 328 and the Mining Regulations 1970 LI 665 are the two major edicts. The study presents a strong background on OHS practices in Ghana and contributes to the body of knowledge on solutions to the current trends and challenges of OHS in the construction sector.

Keywords: ILO convention, OHS challenges, OHS practices, OHS improvement

Procedia PDF Downloads 367
24215 Optimization of Real Time Measured Data Transmission, Given the Amount of Data Transmitted

Authors: Michal Kopcek, Tomas Skulavik, Michal Kebisek, Gabriela Krizanova

Abstract:

The operation of nuclear power plants involves continuous monitoring of the environment in their area. This monitoring is performed using a complex data acquisition system, which collects status information about the system itself and the values of many important physical variables, e.g. temperature, humidity and dose rate. This paper describes a proposal for, and optimization of, the communication that takes place in a teledosimetric system between the central control server, which is responsible for data processing and storage, and the decentralized measuring stations, which measure the physical variables. Analyses of the ongoing communication were performed, and the system architecture and communication were optimized accordingly.

Keywords: communication protocol, transmission optimization, data acquisition, system architecture

Procedia PDF Downloads 518
24214 Industrial Kaolinite Resource Deposits Study in Grahamstown Area, Eastern Cape, South Africa

Authors: Adeola Ibukunoluwa Samuel, Afsoon Kazerouni

Abstract:

The industrial mineral kaolin has many favourable properties, such as colour, shape, softness, non-abrasiveness, natural whiteness, and chemical stability. It occurs extensively north of Bedford Road, Grahamstown, South Africa. The relationship between its physical and chemical properties has led to its application in the production of certain industrial products used by the public, including paper, ceramics, rubber, paint, and plastics. Despite its interesting economic potential, the kaolinite clay mineral remains undermined, and this is threatening its sustainability in the mineral industry. This research study focuses on a detailed evaluation of the kaolinite mineral and possible ways to increase its lifespan in the industry. The methods employed for this study include petrographic microscopy analysis, X-ray powder diffraction (XRD) analysis, and a field reconnaissance survey. Results emanating from this research include updated geological information on Grahamstown. Also, mineral transformation phases such as quartz, kaolinite, calcite and muscovite were identified in the clay samples. Petrographic analysis of the samples showed that the study area has been subjected to intense tectonic deformation and cement replacement. Different dissolution patterns were also identified in the Grahamstown kaolinitic clay deposits. Hence, drawing on the analytical studies and data interpretations, possible measures are suggested, such as the establishment of a processing refinery near mining plants, which will in turn provide employment for the locals, and land reclamation. In addition, sustainable future industrial applications of the clay minerals seem possible if additives such as cellulosic wastes are used to alter the clay mineral.

Keywords: kaolinite, industrial use, sustainability, Grahamstown, clay minerals

Procedia PDF Downloads 188
24213 Shear Strength Characterization of Coal Mine Spoil in Very-High Dumps with Large Scale Direct Shear Testing

Authors: Leonie Bradfield, Stephen Fityus, John Simmons

Abstract:

The shearing behavior of current and planned coal mine spoil dumps up to 400m in height is studied using large-sample-high-stress direct shear tests performed on a range of spoils common to the coalfields of Eastern Australia. The motivation for the study is to address industry concerns that some constructed spoil dump heights ( > 350m) are exceeding the scale ( ≤ 120m) for which reliable design information exists, and because modern geotechnical laboratories are not equipped to test representative spoil specimens at field-scale stresses. For more than two decades, shear strength estimation for spoil dumps has been based on either infrequent, very small-scale tests where oversize particles are scalped to comply with device specimen size capacity such that the influence of prototype-sized particles on shear strength is not captured; or on published guidelines that provide linear shear strength envelopes derived from small-scale test data and verified in practice by slope performance of dumps up to 120m in height. To date, these published guidelines appear to have been reliable. However, in the field of rockfill dam design there is a broad acceptance of a curvilinear shear strength envelope, and if this is applicable to coal mine spoils, then these industry-accepted guidelines may overestimate the strength and stability of dumps at higher stress levels. The pressing need to rationally define the shearing behavior of more representative spoil specimens at field-scale stresses led to the successful design, construction and operation of a large direct shear machine (LDSM) and its subsequent application to provide reliable design information for current and planned very-high dumps. The LDSM can test at a much larger scale, in terms of combined specimen size (720mm x 720mm x 600mm) and stress (σn up to 4.6MPa), than has ever previously been achieved using a direct shear machine for geotechnical testing of rockfill. 
The results of an extensive LDSM testing program on a wide range of coal-mine spoils are compared to a published framework that is widely accepted by the Australian coal mining industry as the standard for shear strength characterization of mine spoil. A critical outcome is that the LDSM data highlight several non-compliant spoils, and stress-dependent shearing behavior, for which the correct application of the published framework will not provide reliable shear strength parameters for design. Shear strength envelopes developed from the LDSM data are also compared with dam engineering knowledge, where failure envelopes of rockfills are curved in a concave-down manner. The LDSM data indicate that shear strength envelopes for coal-mine spoils abundant with rock fragments are not in fact curved and that the shape of the failure envelope is ultimately determined by the strength of rock fragments. Curvilinear failure envelopes were found to be appropriate for soil-like spoils containing minor or no rock fragments, or hard-soil aggregates.

Keywords: coal mine, direct shear test, high dump, large scale, mine spoil, shear strength, spoil dump

Procedia PDF Downloads 161
24212 The Comparison of Safety Factor in Dry and Rainy Condition at Coal Bearing Formation. Case Study: Lahat Area South Sumatera Province, Indonesia

Authors: Teguh Nurhidayat, Nurhamid, Dicky Muslim, Zufialdi Zakaria, Irvan Sophian

Abstract:

This paper presents the role of climate change as a factor that induces landslides. The case study is located in Lahat Regency, South Sumatera Province, Indonesia. The study area has coal reserves of high economic value (mostly subbituminous to bituminous), which could be developed for open-pit coal mining in the future. The seams are found in the Muara Enim Formation. This formation lies in the South Sumatera basin, a back-arc basin formed in the Tertiary as a result of the collision between the Indian plate and the Eurasian plate. This study aims to unravel the relationship between slope stability and seasonal conditions in a tropical climate. Undisturbed soil samples were obtained in the field along with other geological data. Laboratory work was carried out to obtain the physical and mechanical properties of the soils. Slope stability was analyzed with the Bishop method, which was used to determine the safety factor of the slope. The results show that slopes are more prone to landslides in rainy-season conditions than in the dry season. In the dry season, with a moisture content of 22.65%, the safety factor is 1.28 and the slope is in a stable condition. As rain approaches and the moisture content increases to 97.8%, the slope begins to become critical. In wet conditions the groundwater level rises, with γ (unit weight), c (cohesion), and φ (angle of friction) at 18.04, 5.88 kN/m², and 28.04°, respectively, which ultimately gives a safety factor (FS) of 1.01 (slope in unstable condition).
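
For readers unfamiliar with the Bishop method named above, a minimal sketch of the simplified Bishop iteration is given here. The safety factor appears on both sides of the equation, so it is solved by repeated substitution. The cohesion and friction angle reuse the wet-condition values quoted in the abstract, but the slice geometry, weights and pore pressures are invented, so the result is not comparable with the paper's FS values.

```python
from math import cos, radians, sin, tan


def bishop_rhs(slices, fs, cohesion, phi_deg):
    """Right-hand side of the simplified Bishop equation for a trial safety factor `fs`.

    Each slice is a dict with base angle `alpha_deg`, `width` (m), `weight` (kN/m)
    and `pore_pressure` (kPa) at its base.
    """
    tan_phi = tan(radians(phi_deg))
    resisting = 0.0
    driving = 0.0
    for s in slices:
        alpha = radians(s["alpha_deg"])
        m_alpha = cos(alpha) + sin(alpha) * tan_phi / fs
        effective_weight = s["weight"] - s["pore_pressure"] * s["width"]
        resisting += (cohesion * s["width"] + effective_weight * tan_phi) / m_alpha
        driving += s["weight"] * sin(alpha)
    return resisting / driving


def bishop_factor_of_safety(slices, cohesion, phi_deg, tol=1e-6, max_iter=100):
    """Solve FS = rhs(FS) by fixed-point iteration, starting from FS = 1."""
    fs = 1.0
    for _ in range(max_iter):
        new_fs = bishop_rhs(slices, fs, cohesion, phi_deg)
        if abs(new_fs - fs) < tol:
            return new_fs
        fs = new_fs
    raise RuntimeError("Bishop iteration did not converge")


# Hypothetical slip-circle slices (geometry and weights are made up).
slices = [
    {"alpha_deg": 10.0, "width": 2.0, "weight": 80.0, "pore_pressure": 0.0},
    {"alpha_deg": 25.0, "width": 2.0, "weight": 120.0, "pore_pressure": 0.0},
    {"alpha_deg": 40.0, "width": 2.0, "weight": 90.0, "pore_pressure": 0.0},
]
fs = bishop_factor_of_safety(slices, cohesion=5.88, phi_deg=28.04)
```

Raising the pore pressure on the slices lowers the effective weight term and therefore the computed safety factor, which is the mechanism behind the dry-season versus rainy-season contrast described in the abstract.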

Keywords: rainfall, moisture content, slope analysis, landslide prone

Procedia PDF Downloads 313
24211 Anyword: A Digital Marketing Tool to Increase Productivity in Newly Launching Businesses

Authors: Jana Atteah, Wid Jan, Yara AlHibshi, Rahaf AlRougi

Abstract:

Anyword is an AI copywriting tool that helps marketers create effective campaigns for specific audiences. It offers a wide range of templates for various platforms, brand voice guidelines, and valuable analytics insights. Anyword is used by top global companies and has been recognized as one of the "Fastest Growing Products" in the 2023 software awards. A recent study examined the utilization and impact of AI-powered writing tools, specifically focusing on the adoption of AI in writing pursuits and the use of the Anyword platform. The results indicate that a majority of respondents (52.17%) had not previously used Anyword, but those who had were generally satisfied with the platform. Notable productivity improvements were observed among 13% of the participants, while an additional 34.8% reported a slight increase in productivity. A majority (47.8%) maintained a neutral stance, suggesting that their productivity remained unaffected. Only a minimal percentage (4.3%) claimed that their productivity did not improve with the usage of Anyword AI. In terms of the quality of written content generated, the participants responded positively. Approximately 91% of participants gave Anyword AI a score of 5 or higher, with roughly 17% giving it a perfect score. A small percentage (approximately 9%) gave a low score between 0-2. The mode result was a score of 7, indicating a generally positive perception of the quality of content generated using Anyword AI. These findings suggest that AI can contribute to increased productivity and positively influence the quality of written content. Further research and exploration of AI tools in writing pursuits are warranted to fully understand their potential and limitations.

Keywords: artificial intelligence, marketing platforms, productivity, user interface

Procedia PDF Downloads 63
24210 Sourcing and Compiling a Maltese Traffic Dataset MalTra

Authors: Gabriele Borg, Alexei De Bono, Charlie Abela

Abstract:

There is a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is experiencing ongoing population growth and an increase in mobility demand. This research takes advantage of data that are continuously being sourced and converts them into useful information related to the traffic problem on Maltese roads. The scope of this paper is to provide a methodology for creating a custom dataset (MalTra - Malta Traffic), compiled from multiple participants at various locations across the island, in order to identify the most common routes taken and expose the main areas of activity. Such use of big data appears in various technologies referred to as Intelligent Transportation Systems (ITSs), and it has been concluded that there is significant potential in utilising such sources of data on a nationwide scale.

Keywords: Big Data, vehicular traffic, traffic management, mobile data patterns

Procedia PDF Downloads 109
24209 Comparative Study of Accuracy of Land Cover/Land Use Mapping Using Medium Resolution Satellite Imagery: A Case Study

Authors: M. C. Paliwal, A. K. Jain, S. K. Katiyar

Abstract:

Classification of satellite imagery is very important for the assessment of its accuracy. In order to determine the accuracy of the classified image, the assumed-true data are usually derived from ground truth data collected using the Global Positioning System. The data derived from the satellite imagery and the ground truth data are then compared to determine the accuracy of the data, and error matrices are prepared. Overall and individual accuracies are calculated using different methods. The study illustrates advanced classification and accuracy assessment of land use/land cover mapping using satellite imagery. IRS-1C-LISS IV data were used for the classification. The satellite image was classified into fourteen classes, including water bodies, agricultural fields, forest land, urban settlement, barren land and unclassified area. Classification of the satellite imagery and calculation of accuracy were done using ERDAS-Imagine software to find the best method. This study is based on data collected within the Bhopal city boundaries in the state of Madhya Pradesh, India.
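
The overall and individual accuracies mentioned above are conventionally read off an error (confusion) matrix. The sketch below shows one common way to compute them, assuming rows hold the classified labels and columns hold the ground-truth labels. The two-class matrix is invented for illustration, and the abstract does not state which accuracy measures the authors computed in ERDAS-Imagine.

```python
def accuracy_from_error_matrix(matrix):
    """Compute overall, user's and producer's accuracy from a square error matrix.

    matrix[i][j] = number of samples classified as class i whose ground truth is class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = [matrix[i][i] for i in range(n)]
    row_totals = [sum(matrix[i]) for i in range(n)]                       # classified as i
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # truly class j
    return {
        "overall": sum(correct) / total,
        # User's accuracy: share of samples labelled i that really are i.
        "users": [correct[i] / row_totals[i] for i in range(n)],
        # Producer's accuracy: share of true class j samples that were labelled j.
        "producers": [correct[j] / col_totals[j] for j in range(n)],
    }


# Hypothetical two-class error matrix (e.g. water vs. barren land), 100 samples.
error_matrix = [
    [45, 5],
    [10, 40],
]
result = accuracy_from_error_matrix(error_matrix)
```

User's accuracy reflects errors of commission and producer's accuracy reflects errors of omission, which is why the two lists differ for the same matrix.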

Keywords: resolution, accuracy assessment, land use mapping, satellite imagery, ground truth data, error matrices

Procedia PDF Downloads 508
24208 Glasshouse Experiment to Improve Phytomanagement Solutions for Cu-Polluted Mine Soils

Authors: Marc Romero-Estonllo, Judith Ramos-Castro, Yaiza San Miguel, Beatriz Rodríguez-Garrido, Carmela Monterroso

Abstract:

Mining activity is among the main sources of trace and heavy metal(loid) pollution worldwide, which is a hazard to human and environmental health. That is why several projects have emerged for the remediation of such polluted places. Phytomanagement strategies perform well and bring substantial side benefits. In this work, a glasshouse assay was set up with trace-element-polluted soils from an old Cu mine (NW of Spain) that forms part of the PhytoSUDOE network of phytomanaged contaminated field sites (PhytoSUDOE Project (SOE1/P5/E0189)). The objective was to evaluate the improvements induced by the following phytoremediation-related treatments. Three increasingly complex amendments, alone or together with plant growth (Populus nigra L. alone and together with Trifolium repens L.), were tested, and three different rhizosphere bioinocula were applied (Plant Growth Promoting Bacteria (PGP), mycorrhiza (MYC), or mixed (PGP+MYC)). After 110 days of growth, plants were collected, biomass was weighed, and tree length was measured. Physical-chemical analyses were carried out to determine pH, effective Cation Exchange Capacity, carbon and nitrogen contents, bioavailable phosphorus (Olsen bicarbonate method), pseudo-total element content (microwave acid-digested fraction), EDTA-extractable metals (complexed fraction), and NH4NO3-extractable metals (easily bioavailable fraction). On plant material, nitrogen content and acid-digested elements were determined. Amendment usage, plant growth, and bioinoculation were demonstrated to improve soil fertility and/or plant health within the time span of this study. In particular, pH levels increased from 3 (highly acidic) to 5 (acidic) in the worst-case scenario, even reaching 7 (neutrality) in the best plots. Increases in organic matter and pH were related to decreases in the bioavailability of polluting metals. Plants grew better with both the most complex amendment and the middle one, with few differences due to bioinoculation. With the least complex amendment (just compost), the beneficial effects of bioinoculants were more observable, although plants did not thrive very well. On unamended soils, plants neither sprouted nor bloomed. The scheme assayed in this study is suitable for phytomanagement of these kinds of soils affected by mining activity. These findings should now be tested on a larger scale.

Keywords: aided phytoremediation, mine pollution, phytostabilization, soil pollution, trace elements

Procedia PDF Downloads 66
24207 Effect of Genuine Missing Data Imputation on Prediction of Urinary Incontinence

Authors: Suzan Arslanturk, Mohammad-Reza Siadat, Theophilus Ogunyemi, Ananias Diokno

Abstract:

Missing data is a common challenge in statistical analyses of most clinical survey datasets. A variety of methods have been developed to enable analysis of survey data with missing values. Imputation is the most commonly used of these methods. However, in order to minimize the bias introduced by imputation, one must choose the right imputation technique and apply it to the correct type of missing data. In this paper, we have identified different types of missing values: missing data due to skip pattern (SPMD), undetermined missing data (UMD), and genuine missing data (GMD), and applied rough set imputation to only the GMD portion of the missing data. We have used rough set imputation to evaluate the effect of such imputation on prediction by generating several simulation datasets based on an existing epidemiological dataset (MESA). To measure how well each dataset lends itself to the prediction model (logistic regression), we have used p-values from the Wald test. To evaluate the accuracy of the prediction, we have considered the width of the 95% confidence interval for the probability of incontinence. Both imputed and non-imputed simulation datasets were fit to the prediction model, and both turned out to be significant (p-value < 0.05). However, the Wald score shows a better fit for the imputed than for the non-imputed datasets (28.7 vs. 23.4). The average confidence interval width decreased by 10.4% when the imputed dataset was used, meaning higher precision. The results show that using the rough set method for missing data imputation on GMD improves the predictive capability of the logistic regression. Further studies are required to generalize this conclusion to other clinical survey datasets.

Keywords: rough set, imputation, clinical survey data simulation, genuine missing data, predictive index

Procedia PDF Downloads 168
24206 Database Management System for Orphanages to Help Track of Orphans

Authors: Srivatsav Sanjay Sridhar, Asvitha Raja, Prathit Kalra, Soni Gupta

Abstract:

A database management system keeps track of details about the people in an organisation. Few orphanages these days are shifting to a computer- and program-based system, and unfortunately most still have only pen-and-paper records, which not only consume space but are also not eco-friendly. Viewing a person's record is a hassle, as one has to search through multiple records, which consumes time. This program organises all the data and can pull out any information about anyone whose data has been entered. It is also a safe way of storing data, as physical records degrade over time or, worse, are destroyed by natural disasters. In this developing world, it is only sensible to shift all data to an electronic storage system. The program provides features for creating, inserting, searching, and deleting the data, as well as printing them.

Keywords: database, orphans, programming, C++

Procedia PDF Downloads 156
24205 Computerized Scoring System: A Stethoscope to Understand Consumer's Emotion through His or Her Feedback

Authors: Chen Yang, Jun Hu, Ping Li, Lili Xue

Abstract:

Most companies pay careful attention to collecting consumer feedback, so a ‘feedback’ button is commonly found in all kinds of mobile apps. Yet it is much more challenging to analyze these feedback texts and to catch the true feelings of the consumer who submits the feedback, whether it concerns a problem or a compliment. For Chinese content in particular, the same feedback may be positive in one context and negative in another. For example, in Chinese, the feedback 'operating with loudness' can apply to both a refrigerator and a stereo system. This feedback is clearly negative for a refrigerator; however, the same feedback is positive for a stereo system. By introducing Bradley, M. and Lang, P.'s Affective Norms for English Text (ANET) theory and Bucci W.’s Referential Activity (RA) theory, we, usability researchers at Pingan, are able to decipher the feedback and find the hidden feelings behind the content. We take 2 dimensions, ‘valence’ and ‘dominance’, out of the 3 of ANET and 2 dimensions, ‘concreteness’ and ‘specificity’, out of the 4 of RA to build our own rating system with a scale of 1 to 5 points. This rating system enables us to judge the feeling/emotion behind each piece of feedback, and it works well with both a single word/phrase and a whole paragraph. The rating reflects the strength of the consumer's feeling/emotion when he/she is typing the feedback. In our daily work, we first require a consumer to answer the net promoter score (NPS) question before writing the feedback, so we can determine whether the feedback is positive or negative. Secondly, we code the feedback content according to the company's list of problems, which contains 200 problem items. In this way, we are able to count how many pieces of feedback left by consumers belong to each typical problem. Thirdly, we rate each piece of feedback using the rating system mentioned above to capture the strength of the feeling/emotion when the consumer writes the feedback. In this way, we obtain two kinds of data: 1) the portion, meaning how many pieces of feedback are ascribed to one problem item, and 2) the severity, meaning how strong the negative feeling/emotion is when the consumer is writing the feedback. By crossing these two, plotting the portion on the X-axis and the severity on the Y-axis, we are able to find which typical problems score high in both portion and severity. The higher a problem's score, the more urgently it should be solved, as it means more people express stronger negative feelings in feedback regarding this problem. Moreover, by using a hidden Markov model to program our rating system, we are able to computerize the scoring system and process thousands of pieces of feedback in a short period of time, which is efficient and accurate enough for industrial purposes.
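The crossing of portion and severity described above can be sketched roughly as follows. This is an illustration only: the function name, the sample problem items, and the ratings are hypothetical, and ranking by the product of portion and severity is an assumption, since the abstract does not state how the two axes are combined into a single score.

```python
from collections import defaultdict


def portion_and_severity(feedback):
    """Aggregate coded feedback into (portion, mean severity) per problem item.

    `feedback` is a list of (problem_item, severity_rating) pairs, where the
    rating is on the 1-to-5 scale mentioned in the abstract.
    """
    ratings = defaultdict(list)
    for item, rating in feedback:
        ratings[item].append(rating)
    total = len(feedback)
    return {
        item: (len(values) / total, sum(values) / len(values))
        for item, values in ratings.items()
    }


# Hypothetical coded feedback: (problem item, severity rating).
sample = [("login", 5), ("login", 4), ("payment", 2), ("login", 3)]
stats = portion_and_severity(sample)

# Assumed prioritisation rule: sort by portion * severity, highest first.
priority = sorted(stats, key=lambda k: stats[k][0] * stats[k][1], reverse=True)
print(priority)
```

Plotting each item's portion against its mean severity would give the X/Y view the authors describe, with the most urgent problems in the upper-right region.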

Keywords: computerized scoring system, feeling/emotion of consumer feedback, referential activity, text mining

Procedia PDF Downloads 176
24204 New Two-Way Map-Reduce Join Algorithm: Hash Semi Join

Authors: Marwa Hussein Mohamed, Mohamed Helmy Khafagy, Samah Ahmed Senbel

Abstract:

MapReduce is a programming model used to handle and support massive data sets. The rapid increase in data size makes the analysis of big data one of the most important issues today. MapReduce is used to analyze data and extract more helpful information using two simple functions, map and reduce, which are the only parts written by the programmer; the framework provides load balancing, fault tolerance, and high scalability. The most important operation in data analysis is the join, but MapReduce does not directly support joins. This paper explains two-way map-reduce join algorithms, semi-join and per-split semi-join, and proposes a new algorithm, hash semi-join, which uses a hash table to increase performance by eliminating unused records as early as possible and applies the join using the hash table, rather than using the map function to match the join key with the other data table in the second phase. Using hash tables does not affect memory size, because only the matched records from the second table are saved. Our experimental results show that the hash semi-join algorithm has higher performance than the other two algorithms as the data size increases from 10 million to 500 million records, and that running time increases with the size of the joined records between the two tables.
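The idea of filtering early and joining through a hash table can be sketched in a single process as follows. This is not the authors' MapReduce implementation: the function name, the record layout (dictionaries with a shared key field), and the sample tables are assumptions made for illustration.

```python
from collections import defaultdict


def hash_semi_join(left, right, key):
    """Join two lists of dict records on `key`, keeping only matched rows.

    Step 1: collect the distinct join keys of the left table.
    Step 2: scan the right table once and keep, in a hash table, only the
            records whose key appears on the left (unused records are dropped
            early, so only matched records are held in memory).
    Step 3: probe the hash table for each left record to emit joined rows.
    """
    left_keys = {rec[key] for rec in left}

    matched = defaultdict(list)
    for rec in right:
        if rec[key] in left_keys:
            matched[rec[key]].append(rec)

    joined = []
    for l_rec in left:
        for r_rec in matched.get(l_rec[key], ()):
            joined.append({**r_rec, **l_rec})
    return joined


# Hypothetical tables.
users = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
orders = [{"id": 1, "amount": 10}, {"id": 3, "amount": 5}, {"id": 1, "amount": 7}]
result = hash_semi_join(users, orders, "id")
print(result)
```

In a distributed setting the key set and the filtered hash table would have to be shipped to the workers; how that is done in the paper is not described in the abstract.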

Keywords: map reduce, hadoop, semi join, two way join

Procedia PDF Downloads 513
24203 Enabling Quantitative Urban Sustainability Assessment with Big Data

Authors: Changfeng Fu

Abstract:

Sustainable urban development has been widely accepted as common sense in modern urban planning and design. However, the measurement and assessment of urban sustainability, especially quantitative assessment, have always been an issue troubling planning and design professionals. This paper presents ongoing research on the principles and technologies for developing quantitative urban sustainability assessment techniques, which aim to integrate indicators, geospatial and geo-referenced data, and assessment techniques into a single mechanism. It is based on the principles and techniques of geospatial analysis with GIS and statistical analysis methods. Decision-making technologies and methods such as AHP and SMART are also adopted to arrive at overall assessment conclusions. The possible interfaces and presentation of data and quantitative assessment results are also described. This research is based on the knowledge, situations, and data sources of the UK, but it is potentially adaptable to other countries or regions. The implementation potential of the mechanism is also discussed.

Keywords: urban sustainability assessment, quantitative analysis, sustainability indicator, geospatial data, big data

Procedia PDF Downloads 359
24202 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin

Authors: A. Ishag Mohamed, A. A. Rabah

Abstract:

The objective of this research is to develop a generalized correlation for the prediction of the thermal conductivity of n-alkanes and alkenes. Research on, and correlations for, the thermal conductivity of liquids are scarce in the open literature. The available experimental data were collected, covering the groups of n-alkanes and alkenes. The data were assumed to correlate with temperature using the Filippov correlation. Nonparametric regression with the Grace algorithm was used to develop the generalized correlation model. A spreadsheet program based on Microsoft Excel was used to plot the data and calculate the values of the coefficients. The results obtained were compared with the data found in Perry's Chemical Engineers' Handbook. The experimental data were correlated with temperature over the range 273.15 to 673.15 K, with R² = 0.99. The developed correlation reproduced experimental data that were not included in the regression with an absolute average percent deviation (AAPD) of less than 7%. Thus, the spreadsheet was quite accurate and produces reliable data.
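For reference, the AAPD figure quoted above is usually computed as the mean of the absolute relative errors, expressed in percent. The sketch below assumes that standard definition; the function name and the sample conductivity values are made up for illustration and are not data from the paper.

```python
def aapd(experimental, predicted):
    """Absolute average percent deviation between two equal-length sequences.

    AAPD = (100 / N) * sum(|predicted_i - experimental_i| / |experimental_i|)
    """
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((p - e) / e) for e, p in zip(experimental, predicted))
    return 100.0 * total / len(experimental)


# Hypothetical thermal conductivities in W/(m*K).
measured = [0.10, 0.20]
estimated = [0.11, 0.19]
deviation = aapd(measured, estimated)
print(f"AAPD = {deviation:.2f} %")
```

Because each error is divided by the measured value, the measure is undefined for a measured value of zero, which is not a concern for liquid thermal conductivities.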

Keywords: N-Alkanes, N-Alkenes, nonparametric, regression

Procedia PDF Downloads 654
24201 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has recently become one of the important business and research priorities. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity, and variety. Extracting valuable information and trends from these data would aid better understanding and decision-making. Multiple analysis techniques are deployed for English content. However, Arabic is one of the languages that produce a large amount of data over social networks and yet is among the least analyzed. The proposed paper is a survey of the research efforts to analyze Arabic content on Twitter, focusing on the tools and methods used to extract sentiment from that content.

Keywords: big data, social networks, sentiment analysis, twitter

Procedia PDF Downloads 576
24200 Estimating Current Suicide Rates Using Google Trends

Authors: Ladislav Kristoufek, Helen Susannah Moat, Tobias Preis

Abstract:

Data on the number of people who have committed suicide tends to be reported with a substantial time lag of around two years. We examine whether online activity measured by Google searches can help us improve estimates of the number of suicide occurrences in England before official figures are released. Specifically, we analyse how data on the number of Google searches for the terms “depression” and “suicide” relate to the number of suicides between 2004 and 2013. We find that estimates drawing on Google data are significantly better than estimates using previous suicide data alone. We show that a greater number of searches for the term “depression” is related to fewer suicides, whereas a greater number of searches for the term “suicide” is related to more suicides. Data on suicide related search behaviour can be used to improve current estimates of the number of suicide occurrences.

Keywords: nowcasting, search data, Google Trends, official statistics

Procedia PDF Downloads 357