Search results for: data centre

7526 Ultra High Speed Approach for Document Skew Detection and Correction Based On Centre of Gravity

Abstract:

Skew detection and correction (SDC) has a direct effect on the efficiency and accuracy of document segmentation and analysis, and is therefore considered a very important step in document analysis. Skew is a major problem in document analysis for every language. For Arabic/Persian scripts the problem is more severe because of the special features of these languages. In this paper, an efficient and fast algorithm for Document Skew Detection (DSD) based on the concepts of segmentation and Center of Gravity (COG) is proposed. The algorithm was tested on 150 Arabic/Persian and English documents, and the SDC process succeeded for 93 percent of them with an error of less than 1°. The algorithm shows better results for English documents than for Arabic/Persian documents. The proposed method also gives favorable results for handwritten, printed and complicated documents such as newspapers and journals, even at very low quality and resolution.
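
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of combining segmentation with centres of gravity. It assumes a segment (for example one text line) is given as a list of foreground pixel coordinates, splits it into a left and a right half, and takes the angle of the line joining the two COGs as the skew estimate. The function names and the half-split strategy are illustrative assumptions, not the authors' method.

```python
import math


def center_of_gravity(pixels):
    """Return the mean (x, y) position of a collection of foreground pixels."""
    n = len(pixels)
    if n == 0:
        raise ValueError("segment contains no foreground pixels")
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def estimate_skew_degrees(pixels):
    """Estimate skew from the COGs of the left and right halves of a segment.

    `pixels` is a list of (x, y) foreground coordinates for one segment.
    The angle is measured in image coordinates, where y grows downward.
    """
    xs = [x for x, _ in pixels]
    mid = (min(xs) + max(xs)) / 2
    left = [p for p in pixels if p[0] <= mid]
    right = [p for p in pixels if p[0] > mid]
    (lx, ly), (rx, ry) = center_of_gravity(left), center_of_gravity(right)
    return math.degrees(math.atan2(ry - ly, rx - lx))


# Synthetic stroke lying exactly on the diagonal y = x.
diagonal = [(x, x) for x in range(20)]
angle = estimate_skew_degrees(diagonal)
```

Rotating the page by the negative of the estimated angle would then correct the skew; a real system would also need binarization and segmentation steps that are outside this sketch.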

Keywords: Arabic/Persian document, Baseline, Centre of gravity, Document segmentation, Skew detection and correction.

Downloads 1911

7525 Centre Of Mass Selection Operator Based Meta-Heuristic For Unbounded Knapsack Problem

Authors: D. Venkatesan, K. Kannan, S. Raja Balachandar

Abstract:

In this paper, a new Genetic Algorithm based on a heuristic operator and a Centre of Mass selection operator (CMGA) is designed for the unbounded knapsack problem (UKP), which is an NP-hard combinatorial optimization problem. The proposed genetic algorithm is based on a heuristic operator that utilizes problem-specific knowledge. The centre of mass operator, when combined with other genetic operators, forms an algorithm competitive with existing ones. Computational results show that the proposed algorithm is capable of obtaining high-quality solutions for standard randomly generated knapsack instances. A comparative study of CMGA with a simple GA on unbounded knapsack instances of size up to 200 shows the superiority of CMGA. Thus CMGA is an efficient tool for solving the UKP, and it is also competitive with other Genetic Algorithms.
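
To make the problem statement concrete, here is a small exact dynamic-programming solver for the UKP. This is not the CMGA described in the abstract; it is the textbook pseudo-polynomial baseline that a heuristic such as a GA could be checked against on small instances. The function name and the example instance are illustrative assumptions.

```python
def unbounded_knapsack(capacity, weights, values):
    """Exact DP for the unbounded knapsack problem (items may be reused).

    Returns (best_value, counts), where counts[i] is how many copies of
    item i are packed. Weights must be positive integers.
    """
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive integers")

    best = [0] * (capacity + 1)      # best[c]: max value with total weight <= c
    choice = [-1] * (capacity + 1)   # last item added to reach best[c]
    for c in range(1, capacity + 1):
        for i, (w, v) in enumerate(zip(weights, values)):
            if w <= c and best[c - w] + v > best[c]:
                best[c] = best[c - w] + v
                choice[c] = i

    # Walk the choices backwards to recover how often each item is used.
    counts = [0] * len(weights)
    c = capacity
    while c > 0 and choice[c] != -1:
        i = choice[c]
        counts[i] += 1
        c -= weights[i]
    return best[capacity], counts


value, counts = unbounded_knapsack(11, [2, 3, 5], [3, 5, 9])
```

The running time grows with the capacity, which is why metaheuristics are attractive for large instances.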

Keywords: Genetic Algorithm, Unbounded Knapsack Problem, Combinatorial Optimization, Meta-Heuristic, Center of Mass

Downloads 1700

7524 Effect of Footing Shape on Bearing Capacity and Settlement of Closely Spaced Footings on Sandy Soil

Authors: A. Shafaghat, H. Khabbaz, S. Moravej, Ah. Shafaghat

Abstract:

The bearing capacity of closely spaced shallow footings varies with their spacing and the shape of the footing. In this study, the bearing capacity and settlement of two adjacent footings constructed on a sand layer are investigated. The effect of different footing shapes, including square, circular, ring and strip, on sandy soil is captured in the calculations. The investigations are carried out numerically using PLAXIS-3D software and analytically using conventional settlement equations. For this purpose, foundations are modelled in the program with practical dimensions and spacing ratios ranging from 1 to 5. The spacing ratio is defined as the ratio of the centre-to-centre distance to the width of the foundations (S/B). Overall, 24 models are analyzed, and the results are compared and discussed in detail. It can be concluded that the presence of an adjacent foundation reduces the bearing capacity of round footings, while it can increase the bearing capacity of rectangular footings at some specific distances.

Keywords: Bearing capacity, finite element analysis, loose sand, settlement equations, shallow foundation.

Downloads 1070

7523 Consumption Pattern and Dietary Practices of Pregnant Women in Odeda Local Government Area of Ogun State

Authors: Ademuyiwa, M. O., Sanni, S. A.

Abstract:

The importance of maternal nutritional practices during pregnancy cannot be overemphasized. This paper assessed the consumption pattern and dietary practices of 50 pregnant women selected using a purposive sampling technique from three health care centres (Primary Health Care Centre, Obantoko; Primary Health Care Centre, Alabata; and the General Hospital, Odeda) in Odeda Local Government Area of Ogun State, Nigeria. A structured questionnaire was used to elicit information on socioeconomic status, consumption pattern and dietary practices. Data were analyzed using the Statistical Package for Social Sciences (SPSS, 17). The results indicated that about 58% of the pregnant women were below the age of 30 while 42% were aged 28-40 years. Only 16% had tertiary education while 38% had secondary education; 52% earned income through petty trading. On food intake, 52% got their energy from rice on a daily basis, followed by pap (38%) and eko (34%). For protein intake, 36% consumed bean cake on a daily basis while 66% consumed moinmoin 2-3 times a week. Orange (48%) and green leafy vegetables (40%) were the most consumed fruit and vegetable on a daily basis. Among foods of animal origin, fish (76%), meat (58%) and eggs (30%) were consumed daily, while chicken and snail were consumed occasionally by 54% and 42%, respectively. Forty-six percent (46%) of the pregnant women ate more than three times daily, while 60% ate outside their homes; lunch was the meal most often eaten out (42% of respondents) and dinner the least (two percent). It is important to increase awareness campaigns to sensitize pregnant women to the importance of good nutrition, especially fruits, vegetables and dairy products.

Keywords: Consumption Pattern, Dietary Practices, Pregnant, Women, Nigeria.

Downloads 4920

7522 CFD of Oscillating Airfoil Pitch Cycle by using PISO Algorithm

Authors: Muhammad Amjad Sohail, Rizwan Ullah

Abstract:

This research paper presents the CFD analysis of an oscillating airfoil during a pitch cycle. Unsteady subsonic flow is simulated for a pitching airfoil at Mach number 0.283 and Reynolds number 3.45 million. Turbulent effects are also considered in this study by using the k-ω SST turbulence model. A two-dimensional unsteady compressible Navier-Stokes code including a two-equation turbulence model and PISO pressure-velocity coupling is used. A pressure-based implicit solver with a first-order implicit unsteady formulation is used. The simulated pitch cycle results are compared with the available experimental data and agree well with them. Aerodynamic characteristics during pitch cycles have been studied and validated.

Keywords: Angle of attack, Centre of pressure, subsonic flow, pitching moment coefficient, turbulence model

Downloads 2393

7521 Applying WILSERV in Measuring Visitor Satisfaction at Sepilok Orangutan Rehabilitation Centre (SORC)

Authors: A. H. Hendry, H. S. Mogindol

Abstract:

There is increasing worldwide demand for tourism involving interaction with wildlife. Studies of service quality within this sphere are plentiful. However, studies on service quality in wildlife attractions, especially semi-captured wildlife tourism, are still limited. The Sepilok Orangutan Rehabilitation Centre (SORC) in Sandakan, Sabah, Malaysia is one good example of a semi-captured wildlife attraction and a renowned attraction in Sabah. This study presents a gap analysis by measuring the perception and expectation of service quality at SORC through the use of a modified SERVQUAL, referred to as WILSERV. A survey questionnaire was devised and administered to 190 visitors to SORC. The study revealed that the mean perception scores for all six dimensions were lower than the expectations. The largest gap was in the dimension of reliability (-0.21), followed by tangible (-0.17), responsiveness (-0.11), assurance (-0.11), empathy (-0.11) and wild-tangible (-0.05). Similarly, the mean perception scores for all six dimensions were lower than the expectations for both local and foreign visitors.
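
The gap analysis can be illustrated with a short sketch, assuming the usual SERVQUAL convention that a dimension's gap is its mean perception score minus its mean expectation score, so a negative gap means the service fell short of expectations. The scores below are hypothetical placeholders, not data from the study, and the function name is an assumption.

```python
from statistics import mean


def servqual_gaps(perceptions, expectations):
    """Return {dimension: mean perception - mean expectation}."""
    return {
        dim: mean(perceptions[dim]) - mean(expectations[dim])
        for dim in perceptions
    }


# Hypothetical Likert-scale responses, for illustration only.
perceptions = {"reliability": [4, 4], "empathy": [5, 3]}
expectations = {"reliability": [5, 4], "empathy": [4, 4]}

gaps = servqual_gaps(perceptions, expectations)
# Dimensions ordered from the largest shortfall to the smallest.
ranked = sorted(gaps, key=gaps.get)
```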

Keywords: Gap analysis, service quality, WILSERV, wildlife tourism.

Downloads 1109

7520 A New Approaches for Seismic Signals Discrimination

Authors: M. Benbrahim, K. Benjelloun, A. Ibenbrahim, M. Kasmi, E. Ardil

Abstract:

The automatic discrimination of seismic signals is an important practical goal for earth-science observatories because of the large amount of information that they receive continuously. An essential discrimination task is to allocate the incoming signal to a group associated with the kind of physical phenomenon producing it. In this paper, we present new techniques for seismic signal classification: local, regional and global discrimination. These techniques were tested on seismic signals from the database of the National Geophysical Institute of the Centre National pour la Recherche Scientifique et Technique (Morocco) by using the Moroccan software for seismic signals analysis.

Keywords: Seismic signals, local discrimination, regional discrimination, global discrimination, Moroccan software for seismic signals analysis.

Downloads 1557

7519 Image Restoration in Non-Linear Filtering Domain using MDB approach

Authors: S. K. Satpathy, S. Panda, K. K. Nagwanshi, C. Ardil

Abstract:

This paper proposes a new technique based on a nonlinear Minmax Detector Based (MDB) filter for image restoration. The aim of image enhancement is to reconstruct the true image from the corrupted image. The process of image acquisition frequently leads to degradation, and the quality of the digitized image becomes inferior to that of the original image. Image degradation can be due to the addition of different types of noise to the original image. Image noise can be modeled in many ways, and impulse noise is one of them. Impulse noise generates pixels with gray values not consistent with their local neighborhood. It appears as a sprinkle of both light and dark spots, or only light spots, in the image. Filtering is a technique for enhancing the image. Linear filtering is filtering in which the value of an output pixel is a linear combination of neighborhood values, which can produce blur in the image. Thus a variety of non-linear smoothing techniques have been developed. The median filter is one of the most popular non-linear filters. With a small neighborhood it is highly efficient, but with a large window and high noise it gives rise to more blurring of the image. The Centre Weighted Mean (CWM) filter has better average performance than the median filter. However, the original pixel is corrupted and noise reduction is substantial under high-noise conditions; hence this technique also has a blurring effect on the image. To illustrate the superiority of the proposed approach, the proposed scheme has been simulated along with the standard ones, and various restoration performance measures have been compared.
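
As a reference point for the filters the abstract compares against, the following is a minimal sketch of a plain median filter on a grayscale image stored as a list of rows. It is the standard baseline, not the proposed MDB filter, whose details the abstract does not give. Border pixels are handled here by clamping coordinates to the image edge, which is one common choice and an assumption of this sketch.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2D list of gray values."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    height, width = len(image), len(image[0])
    r = size // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighborhood, clamping indices at the borders.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out


# A flat gray patch with a single impulse ("salt") pixel in the middle.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
restored = median_filter(noisy)
```

Because the impulse is a minority within every 3 x 3 window, the median discards it; with larger windows or denser noise the same mechanism starts to remove genuine detail, which is the blurring the abstract refers to.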

Keywords: Filtering, Minmax Detector Based (MDB), noise, centre weighted mean filter, PSNR, restoration.

Downloads 2739

7518 Developing Optical Sensors with Application of Cancer Detection by Elastic Light Scattering Spectroscopy

Authors: May Fadheel Estephan, Richard Perks

Abstract:

Cancer is a serious health concern that affects millions of people worldwide. Early detection and treatment are essential for improving patient outcomes. However, current methods for cancer detection have limitations, such as low sensitivity and specificity. The aim of this study was to develop an optical sensor for cancer detection using elastic light scattering spectroscopy (ELSS). ELSS is a non-invasive optical technique that can be used to characterize the size and concentration of particles in a solution. An optical probe was fabricated with a 100-μm-diameter core and a 132-μm centre-to-centre separation. The probe was used to measure the ELSS spectra of polystyrene spheres with diameters of 2 μm, 0.8 μm, and 0.413 μm. The spectra were then analysed to determine the size and concentration of the spheres. The results showed that the optical probe was able to differentiate between the three different sizes of polystyrene spheres. The probe was also able to detect the presence of polystyrene spheres in suspension concentrations as low as 0.01%. The results of this study demonstrate the potential of ELSS for cancer detection. ELSS is a non-invasive technique that can be used to characterize the size and concentration of cells in a tissue sample. This information can be used to identify cancer cells and assess the stage of the disease. The data for this study were collected by measuring the ELSS spectra of polystyrene spheres with different diameters. The spectra were collected using a spectrometer and a computer. The ELSS spectra were analysed using a software program to determine the size and concentration of the spheres. The software program used a mathematical algorithm to fit the spectra to a theoretical model. The question addressed by this study was whether ELSS could be used to detect cancer cells. 
The results of the study showed that ELSS could be used to differentiate between different sizes of cells, suggesting that it could be used to detect cancer cells. The findings of this research show the utility of ELSS in the early identification of cancer. ELSS is a non-invasive method for characterizing the number and size of cells in a tissue sample. To determine cancer cells and determine the disease's stage, this information can be employed. Further research is needed to evaluate the clinical performance of ELSS for cancer detection.

Keywords: Elastic Light Scattering Spectroscopy, Polystyrene spheres in suspension, optical probe, fibre optics.

Downloads 149

7517 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and, more importantly, the challenges it poses to privacy and data protection. First, the nature of big data is briefly discussed before presenting the potential of big data today. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing it in big data. In conclusion, the paper puts forward the debate on the adequacy of the existing legal framework for protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Downloads 3928

7516 Dosimetric Analysis of Intensity Modulated Radiotherapy versus 3D Conformal Radiotherapy in Adult Primary Brain Tumors: Regional Cancer Centre, India

Authors: Ravi Kiran Pothamsetty, Radha Rani Ghosh, Baby Paul Thaliath

Abstract:

Radiation therapy has undergone many advancements and evolved from 2D to 3D. Recently, the rapid pace of drug discoveries, cutting-edge technology, and clinical trials has driven innovative advancements in computer technology and treatment planning, leading to intensity modulated radiotherapy (IMRT) which delivers in homogenous dose to tumor and normal tissues. The present study was a hospital-based experience comparing two different conformal radiotherapy techniques for brain tumors. This analytical study was conducted at Regional Cancer Centre, India, from January 2014 to January 2015. Ten patients were selected after applying inclusion and exclusion criteria. All the patients were treated on an Artiste Siemens Linac Accelerator. The tolerance level for maximum dose was 6.0 Gy for lenses and 54.0 Gy for brain stem, optic chiasm and optical nerves as per RTOG criteria. Mean and standard deviation values of PTV 98%, PTV 95% and PTV 2% in IMRT were 93.16±2.9, 95.01±3.4 and 103.1±1.1 respectively; for 3DCRT they were 91.4±4.7, 94.17±2.6 and 102.7±0.39 respectively. PTV max dose (%) in IMRT and 3D-CRT were 104.7±0.96 and 103.9±1.0 respectively. Maximum dose to the tumor can be delivered with IMRT within acceptable toxicity limits. Variables such as expertise, location of tumor, patient condition, and TPS influence the outcome of the treatment.

Keywords: IMRT, 3D CRT, Brain, tumors, OARs, RTOG.

Downloads 819

7515 Customers’ Perception towards the Service Marketing Mix and Frequency of Use of Mercedes Benz Automobile Service, Thailand

Authors: Pranee Tridhoskul

Abstract:

This research paper aims to examine the relationship between the service marketing mix and customers’ frequency of use of service at Mercedes Benz Auto Repair Centres under Thonburi Group, Thailand. From a population of 2,267 customers who used the service of Thonburi Group’s Auto Repair Centres, a sample of 340 was drawn using a probability sampling technique. Systematic random sampling was applied, and data were collected by questionnaire at Thonburi Group’s Auto Repair Centres. Means and Pearson’s correlation were used to analyze the data. The study found a medium level of customers’ perception of the product and service of Thonburi Group’s Auto Repair Centres, price, place or distribution channel, and promotion. The people who provided service were also perceived at a medium level, whereas the physical evidence and service process were perceived at a high level. Furthermore, there was a correlation between the physical evidence and service process and customers’ frequency of use of automobile service per year.

Keywords: Service Marketing Mix, Behavior, Mercedes Auto Service Centre.

Downloads 2966

7514 Remote Vital Signs Monitoring in Neonatal Intensive Care Unit Using a Digital Camera

Authors: Fatema-Tuz-Zohra Khanam, Ali Al-Naji, Asanka G. Perera, Kim Gibson, Javaan Chahl

Abstract:

Conventional contact-based vital signs monitoring sensors such as pulse oximeters or electrocardiogram (ECG) may cause discomfort, skin damage, and infections, particularly in neonates with fragile, sensitive skin. Therefore, remote monitoring of vital signs is desired in both clinical and non-clinical settings to overcome these issues. Camera-based vital signs monitoring is a recent technology for these applications with many positive attributes. However, there are still limited camera-based studies on neonates in a clinical setting. In this study, the heart rate (HR) and respiratory rate (RR) of eight infants at the Neonatal Intensive Care Unit (NICU) in Flinders Medical Centre were remotely monitored using a digital camera applying color and motion-based computational methods. The region-of-interest (ROI) was efficiently selected by incorporating an image decomposition method. Furthermore, spatial averaging, spectral analysis, band-pass filtering, and peak detection were also used to extract both HR and RR. The experimental results were validated with the ground truth data obtained from an ECG monitor and showed a strong correlation, with Pearson correlation coefficients (PCC) of 0.9794 and 0.9412 for HR and RR, respectively. The root mean square errors (RMSE) between camera-based data and ECG data for HR and RR were 2.84 beats/min and 2.91 breaths/min, respectively. A Bland-Altman analysis of the data also showed close agreement between both data sets, with a mean bias of 0.60 beats/min and 1 breath/min, and lower and upper limits of agreement of -4.9 to +6.1 beats/min and -4.4 to +6.4 breaths/min for HR and RR, respectively. Therefore, video camera imaging may replace conventional contact-based monitoring in NICU and has potential applications in other contexts such as home health monitoring.
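
The three agreement measures reported above can be sketched as follows, assuming paired readings from the camera and from the ECG reference. The limits of agreement use the conventional bias ± 1.96 standard deviations of the differences; whether the study used exactly this form is an assumption. The readings shown are hypothetical values, not data from the study.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root mean square error between paired measurements."""
    return math.sqrt(mean((a - b) ** 2 for a, b in zip(x, y)))


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) of agreement for x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of differences
    return bias, bias - spread, bias + spread


# Hypothetical heart-rate readings in beats/min, for illustration only.
camera = [120, 130, 140, 150]
ecg = [121, 129, 142, 148]

r = pearson(camera, ecg)
error = rmse(camera, ecg)
bias, lower, upper = bland_altman(camera, ecg)
```

Correlation alone can be high even when one method is systematically offset from the other, which is why the bias and limits of agreement are reported alongside it.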

Keywords: Neonates, NICU, digital camera, heart rate, respiratory rate, image decomposition.

Downloads 580

7513 Correlation between the Sowing Date and the Yield of Maize on Chernozem Soil, in Connection with the Leaf Area Index and the Photosynthesis

Authors: E. Bene

Abstract:

Our sowing date experiment took place in the Demonstration Garden of Institution of Plant Sciences, Centre for Agricultural Sciences of University of Debrecen, in 2012-2014. The paper contains data from the 2014 test year. Our purpose, besides several other examinations, was to observe how sowing date influences the leaf area index and the photosynthetic activity of maize hybrids, and how those factors affect fruiting. In the experiment we monitored the change in the leaf area index and the photosynthesis of hybrids with four different growing seasons. The results confirm that not only do the environmental and agricultural factors in the growing season have an effect on the yield, but other factors such as the leaf area index and photosynthesis are also determining parameters, and all those factors together, modifying the effects of each other, determine average yields.

Keywords: Sowing date, hybrid, leaf area index, photosynthetic capacity.

Downloads 1437

7512 Exploring the Safety of Sodium Glucose Co-Transporter-2 Inhibitors at the Imperial College London Diabetes Centre, UAE

Authors: Raad Nari, Maura Moriaty, Maha T. Barakat

Abstract:

Introduction: Sodium-glucose co-transporter-2 (SGLT2) inhibitors are a new class of oral anti-diabetic drugs with a unique mechanism of action. They are used to improve glycaemic control in adults with type 2 diabetes by enhancing urinary glucose excretion. In the UAE, there has certainly been an increased use of these medications. As with any new medication, there are safety considerations related to their use in patients with type 2 diabetes. A retrospective study was conducted at the three main centres of the Imperial College London Diabetes Centre. Methodology: All patients in the electronic database (Diamond) from October 2014 to October 2017 with a minimum of six months' usage of sodium glucose co-transporter inhibitors, comprising canagliflozin, dapagliflozin and empagliflozin, were included. There were 15 paired sample biochemical and clinical correlations. The analysis was done at the start of the study and at three months and six months. SPSS version 24 was used for this study. Conclusion: This study of sodium glucose co-transporter-2 inhibitor use showed significant reductions in weight, glycated haemoglobin A1C, and systolic and diastolic blood pressures. As is the case with systematic reviews, there were similar changes in liver enzymes, raised total cholesterol, low density lipoprotein and high density lipoprotein. There was also a slight improvement in estimated glomerular filtration rate. Our analysis also showed that they increased the incidence of urinary tract symptoms and urinary tract infections.

Keywords: SGLT2 inhibitors dapagliflozin empagliflozin canagliflozin, adverse effects, amputation diabetic ketoacidosis DKA, urinary tract infection.

Downloads 722

7511 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present, or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for every data set, but this is not the case. Thus, we present the most well-known algorithms for each step of data pre-processing so that one can achieve the best performance for one's data set.
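
As one concrete instance of the normalization step listed above, here is a short sketch of two widely used feature scalings: min-max scaling to the range [0, 1] and z-score standardization. The abstract does not say which variants the paper covers, so treat the selection, the function names, and the handling of constant columns as assumptions.

```python
from statistics import mean, pstdev


def min_max_scale(column):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def z_score(column):
    """Standardize values to zero mean and unit (population) variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


feature = [2, 4, 6]
scaled = min_max_scale(feature)
standardized = z_score(feature)
```

In practice the minimum, maximum, mean and standard deviation should be computed on the training set only and then reused for the test set, so that no information leaks from the test data.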

Keywords: Data mining, feature selection, data cleaning.

Downloads 6091

7510 Forecasting Direct Normal Irradiation at Djibouti Using Artificial Neural Network

Authors: Ahmed Kayad Abdourazak, Abderafi Souad, Zejli Driss, Idriss Abdoulkader Ibrahim

Abstract:

In this paper, an Artificial Neural Network (ANN) is used to predict solar irradiation in Djibouti for the first time, which is useful for the integration of Concentrating Solar Power (CSP) and for site selection for new or future solar plants as part of solar energy development. An ANN algorithm was developed to establish a forward/reverse correspondence between latitude, longitude, altitude and monthly solar irradiation. For this purpose, German Aerospace Centre (DLR) data for eight Djibouti sites were used for training and testing in a standard three-layer network with the Levenberg-Marquardt back-propagation algorithm. The results show very good agreement for solar irradiation prediction in Djibouti and indicate that the proposed approach can serve as an efficient prediction tool, providing helpful information for site selection, design and planning of solar plants.

Keywords: Artificial neural network, solar irradiation, concentrated solar power, Levenberg-Marquardt.

Downloads 1083

7509 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA), which may allow academic institutions to better understand learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitations before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Downloads 4877

7508 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation data discovery algorithms expressed as formal formulas, which can find inconsistent data in all target columns that depend on condition attributes, whether the data is structured (SQL) or unstructured (NoSQL). Finally, it gives six data cleaning methods based on these algorithms.
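
The idea of discovering violations of a dependency rule can be sketched with the simplest such rule, a functional dependency: rows that agree on the condition attributes must also agree on the target attribute. This is a generic illustration rather than one of the paper's algorithms, which the abstract does not detail; the function name and the sample records are hypothetical.

```python
from collections import defaultdict


def find_fd_violations(rows, condition_attrs, target_attr):
    """Find groups of rows that break the rule condition_attrs -> target_attr.

    Returns {condition values: set of conflicting target values} for every
    group in which more than one distinct target value occurs.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[attr] for attr in condition_attrs)
        seen[key].add(row[target_attr])
    return {key: values for key, values in seen.items() if len(values) > 1}


# Hypothetical records: a postal code should determine the city.
records = [
    {"id": 1, "zip": "10001", "city": "New York"},
    {"id": 2, "zip": "10001", "city": "Boston"},
    {"id": 3, "zip": "02108", "city": "Boston"},
]
violations = find_fd_violations(records, ["zip"], "city")
```

Because the rows are plain dictionaries, the same check applies to records read from a relational table or from a document store, provided the attributes are present. Deciding which of the conflicting values to keep is the separate repair step.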

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Downloads 2612

7507 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures to provide access to data for analysis. Traditionally, OLAP operations focus on retrieving data from a single data mart. An exception is the drill-across operator. This, however, is restricted to retrieving facts on the common dimensions of multiple data marts. Our concern is to define further operations for retrieving data from multiple data marts. To this end, we have defined six operations which coalesce data marts. In doing so, we consider the common as well as the non-common dimensions of the data marts.
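
For context, the existing drill-across operator mentioned above can be sketched as follows: each data mart's facts are first rolled up to the dimensions the marts share, and the results are then matched on those dimensions. The six coalescing operations proposed in the paper are not described in the abstract and are not shown here. The marts, dimension names and measures below are hypothetical.

```python
from collections import defaultdict


def roll_up(facts, dims, measure):
    """Sum a measure over all rows that share the same values for `dims`."""
    totals = defaultdict(float)
    for row in facts:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)


def drill_across(facts_a, measure_a, facts_b, measure_b, common_dims):
    """Pair up two measures on the dimensions common to both data marts."""
    a = roll_up(facts_a, common_dims, measure_a)
    b = roll_up(facts_b, common_dims, measure_b)
    return {key: (a[key], b[key]) for key in a.keys() & b.keys()}


# Hypothetical marts: "store" and "carrier" are non-common dimensions.
sales = [
    {"product": "p1", "month": "jan", "store": "s1", "units_sold": 10},
    {"product": "p1", "month": "jan", "store": "s2", "units_sold": 5},
    {"product": "p2", "month": "jan", "store": "s1", "units_sold": 3},
]
shipments = [
    {"product": "p1", "month": "jan", "carrier": "c1", "units_shipped": 7},
]
combined = drill_across(sales, "units_sold", shipments, "units_shipped",
                        ["product", "month"])
```

Note that the non-common dimensions are aggregated away before the match, which is the restriction the abstract sets out to relax.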

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Downloads 1559

7506 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from it.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Downloads 2480

7505 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, many efforts and studies have been carried out to develop proficient tools for performing various tasks in big data. Recently, big data has received a lot of publicity, for good reasons. Because the datasets are large and complex, they are difficult to process with traditional data processing applications. This concern makes it all the more necessary to produce various big data tools. Moreover, the main aim of big data analytics is to apply advanced analytic techniques to very large, diverse datasets that range in size from terabytes to zettabytes and in type from structured to unstructured and from batch to streaming. Big data is useful for data sets whose size or type is beyond the capability of traditional relational databases to capture, manage and process the data with low latency. The resulting challenges have led to the emergence of powerful big data tools. In this survey, a varied collection of big data tools is illustrated and compared by their salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Downloads 3775

7504 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, which are characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classificaiton, Orders of Sets of Labels

Downloads 1304

7503 Analysis of the Fire Hazard Posed by Petrol Stations in Stellenbosch and the Degree of Risk Acknowledgement in Land-Use Planning

Authors: K. Qonono

Abstract:

Despite the significance and economic benefits of petrol stations in South Africa, these still pose a huge risk of fire and explosion threatening public safety. This research paper examines the extent to which land-use planning in Stellenbosch, South Africa, considers the fire risk posed by petrol stations and the implications for public safety as well as preparedness for large fires or explosions. To achieve this, the research identified the land-use types around petrol stations in Stellenbosch and determined the extent to which their locations comply with the local, national, and international land-use planning regulations. A mixed research method consisting of the collection and analysis of geospatial data and qualitative data was applied, where petrol stations within a six-kilometre radius of Stellenbosch’s town centre were utilised as study sites. The research examined the risk of fires/explosions at these petrol stations. The research investigated Stellenbosch Municipality’s institutional preparedness to respond in the event of a fire/explosion at these petrol stations. The research observed that siting of petrol stations does not comply with local, national, and international good practices, thus exposing the surrounding developments to fires and explosions. Land-use planning practice does not consider hazards created by petrol stations. Despite the potential for major fires at petrol stations, Stellenbosch Municipality’s level of preparedness to respond to petrol station fires appears low due to the prioritisation of more frequent events.

Keywords: Petrol stations, technological hazard, DRR, land-use planning, risk analysis.

Downloads 142

7502 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The need for the ever-increasing use of distributed data in computer networks is clear. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication strategies. We examine their characteristics for some scenarios and then propose some suggestions for their improvement.

Keywords: data replication, data hiding, consistency, dynamic data replication strategy

Downloads 1635

7501 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, a wide variety of data are being generated in many fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to test various sensors easily, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, using a board to collect data involves many difficulties, especially when the user is not a computer programmer or is using the board for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads 2010

7500 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity, and that come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver integrated (big) data related services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Downloads 2146

7499 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is compounded when the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads 2795

7498 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing private patient data with people other than clinicians or hospital staff is a security problem, so we have to remove personal identification information from the medical data. After a de-identification process, the medical data without private data are available for any research purpose. In this paper, we introduce a universal, automatic, rule-based de-identification application that performs this task on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The same identification number is used for all of a patient's medical data, so relationships within the data are preserved. The hospital can benefit from research feedback based on the results.

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads 1644
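
The identifier replacement described in the abstract of entry 7498 can be illustrated with a minimal sketch. It is not the authors' application: the class name `Pseudonymizer`, the field names (`patient_id`, `name`, `birth_date`), and the sample records are all hypothetical, and the sketch covers only structured fields, not burned-in pixel annotations. It shows the one property the abstract states, namely that the same patient always receives the same replacement number, so records stay linked after private fields are removed.

```python
import itertools


class Pseudonymizer:
    """Assigns each real patient identifier a stable surrogate number.

    Hypothetical illustration only; field names are assumptions.
    """

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._mapping = {}  # real identifier -> surrogate number

    def pseudonym(self, patient_id):
        # Reuse the surrogate if this patient was seen before,
        # otherwise hand out the next unused number.
        if patient_id not in self._mapping:
            self._mapping[patient_id] = next(self._counter)
        return self._mapping[patient_id]

    def deidentify(self, record, id_field="patient_id",
                   private_fields=("name", "birth_date")):
        # Drop private fields, then swap the identifier for its surrogate.
        cleaned = {k: v for k, v in record.items() if k not in private_fields}
        cleaned[id_field] = self.pseudonym(record[id_field])
        return cleaned


# Hypothetical sample records: two studies for one patient, one for another.
records = [
    {"patient_id": "A-100", "name": "Jane Doe", "birth_date": "1970-01-01", "modality": "CT"},
    {"patient_id": "B-200", "name": "John Roe", "birth_date": "1980-02-02", "modality": "MR"},
    {"patient_id": "A-100", "name": "Jane Doe", "birth_date": "1970-01-01", "modality": "MR"},
]

pseudonymizer = Pseudonymizer()
deidentified = [pseudonymizer.deidentify(r) for r in records]
```

In this sketch the first and third records keep a shared surrogate identifier, which is what preserves the relationship between a patient's studies. A real system would also need to protect or discard the mapping table itself.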

7497 Economics of Open and Distance Education in the University of Ibadan, Nigeria

Authors: Babatunde Kasim Oladele

Abstract:

One of the major objectives of the Nigerian national policy on education is the provision of equal educational opportunities to all citizens at different levels of education. With regard to higher education, an aspect of the policy encourages distance learning to be organized and delivered by tertiary institutions in Nigeria. This study, therefore, determines how much of the Government's resources are committed, how the resources are utilized, and what alternative sources of funding are available for this system of education. This study investigated the trends in recurrent costs between 2004/2005 and 2013/2014 at the University of Ibadan Distance Learning Centre (DLC). A descriptive survey research design was employed for the study. A questionnaire was the research instrument used for the collection of data. The population of the study was 280 current distance learning education students, 70 academic staff, and 50 administrative staff. Only 354 questionnaires were correctly filled and returned. The data collected were coded and analyzed using frequencies, ratios, averages, and percentages to answer all the research questions. The study revealed that the salaries and allowances of academic and non-academic staff represent the most important variable that influences the cost of education. About 55% of resources were allocated to this sector alone. The study also indicates that costs rise every year with increases in enrolment, representing a situation of diseconomies of scale. This study recommends that universities that operate distance learning programs should strive to explore other internally generated revenue options to boost their revenue. The University of Ibadan, being the premier university in Nigeria, should be given foreign aid and home support, both financially and materially, to enable the institution to run a formidable distance education program that would measure up in planning and implementation with those of developed nations.

Keywords: Open education, distance education, University of Ibadan, cost of education, Nigeria.

Downloads 938