Search results for: minimum data set
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26525

25475 Data-Driven Dynamic Overbooking Model for Tour Operators

Authors: Kannapha Amaruchkul

Abstract:

We formulate a dynamic overbooking model for a tour operator, in which most reservations contain at least two people. The cancellation rate and the timing of the cancellation may depend on the group size. We propose two overbooking policies, namely economic- and service-based. In an economic-based policy, we want to minimize the expected oversold and underused cost, whereas, in a service-based policy, we ensure that the probability of an oversold situation does not exceed the pre-specified threshold. To illustrate the applicability of our approach, we use tour package data in 2016-2018 from a tour operator in Thailand to build a data-driven robust optimization model, and we tested the proposed overbooking policy in 2019. We also compare the data-driven approach to the conventional approach of fitting data into a probability distribution.
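The service-based policy described above can be illustrated with a minimal sketch. It assumes a much simpler setting than the paper's: a single booking period, one person per reservation, and independent show-ups with a fixed probability (a binomial model, in the spirit of the conventional distribution-fitting approach rather than the data-driven robust model). The function names and parameters are hypothetical.

```python
from math import comb


def oversold_probability(bookings: int, capacity: int, show_prob: float) -> float:
    """P(show-ups > capacity) when each booking shows up independently."""
    return sum(
        comb(bookings, k) * show_prob**k * (1 - show_prob) ** (bookings - k)
        for k in range(capacity + 1, bookings + 1)
    )


def service_based_limit(capacity: int, show_prob: float, threshold: float) -> int:
    """Largest booking limit whose oversold probability stays within the threshold."""
    if not 0 < show_prob <= 1 or not 0 <= threshold < 1:
        raise ValueError("need 0 < show_prob <= 1 and 0 <= threshold < 1")
    limit = capacity  # accepting exactly `capacity` bookings can never oversell
    # The oversold probability grows with the number of bookings, so we can
    # raise the limit one booking at a time until the threshold is violated.
    while oversold_probability(limit + 1, capacity, show_prob) <= threshold:
        limit += 1
    return limit


if __name__ == "__main__":
    # Illustrative numbers only: 10 seats, 50% show-up rate, 5% oversold threshold.
    print(service_based_limit(capacity=10, show_prob=0.5, threshold=0.05))
```

An economic-based policy would instead weigh oversold and underused costs in expectation; that variant is not sketched here.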

Keywords: applied stochastic model, data-driven robust optimization, overbooking, revenue management, tour operator

Procedia PDF Downloads 134
25474 Modeling and Statistical Analysis of a Soap Production Mix in Bejoy Manufacturing Industry, Anambra State, Nigeria

Authors: Okolie Chukwulozie Paul, Iwenofu Chinwe Onyedika, Sinebe Jude Ebieladoh, M. C. Nwosu

Abstract:

This research is based on the statistical analysis of processing data. The aim is to analyze the data statistically and to generate a design model for the production mix of soap products at the Bejoy manufacturing company in Nkpologwu, Aguata Local Government Area, Anambra State, Nigeria. The analysis examines the correlation of the data: a t-test, partial correlation, and bivariate correlation were used to understand what the data portray. The design model developed was used to model the production yield, and the correlation of the variables gives an R² of 98.7%. The results confirm that the data are fit for further analysis and modeling, as shown by the correlation and the R-squared value.

Keywords: General Linear Model, correlation, variables, pearson, significance, T-test, soap, production mix and statistic

Procedia PDF Downloads 445
25473 Opportunities for Reducing Post-Harvest Losses of Cactus Pear (Opuntia Ficus-Indica) to Improve Small-Holder Farmers Income in Eastern Tigray, Northern Ethiopia: Value Chain Approach

Authors: Meron Zenaselase Rata, Euridice Leyequien Abarca

Abstract:

The production of major crops in Northern Ethiopia, especially the Tigray Region, is at subsistence level due to drought, erratic rainfall, and poor soil fertility. Since cactus pear is a drought-resistant plant, it is considered a lifesaver fruit and a strategy for poverty reduction in the drought-affected areas of the region. Despite its contribution to household income and food security in the area, the cactus pear sub-sector is experiencing many constraints, with limited attention given to its post-harvest loss management. Therefore, this research was carried out to identify opportunities for reducing post-harvest losses and to recommend possible strategies to reduce them, thereby improving production and smallholders' income. Both probability and non-probability sampling techniques were employed to collect the data. Ganta Afeshum district was selected from Eastern Tigray, and two peasant associations (Buket and Golea) were purposively selected from the district for their potential in cactus pear production. Simple random sampling was employed to survey 30 households from each of the two peasant associations, and a semi-structured questionnaire was used as the data collection tool. In addition, 2 collectors, 2 wholesalers, 1 processor, 3 retailers, and 2 consumers were interviewed; two focus group discussions were held with 14 key farmers using a semi-structured checklist; and key informant interviews were conducted with governmental and non-governmental organizations to gather more information about cactus pear production, post-harvest losses, the strategies used to reduce post-harvest losses, and suggestions to improve post-harvest management. SPSS version 20 was used to enter and analyze the quantitative data, whereas MS Word was used to transcribe the qualitative data. The data were presented using frequency and descriptive tables and graphs.
The data analysis was also done using a chain map, correlations, a stakeholder matrix, and gross margin. Mean comparisons such as ANOVA and t-tests between variables were used. The analysis shows that the present cactus pear value chain involves main actors and supporters. However, there is inadequate information flow and there are informal market linkages among actors in the cactus pear value chain. The farmers' gross margin is higher when they sell to the processor than when they sell to collectors. The most significant post-harvest loss in the cactus pear value chain is at the producer level, followed by wholesalers and retailers. The maximum and minimum volumes of post-harvest losses at the producer level are 4212 and 240 kg per season. The post-harvest loss was caused by limited farmer skill in on-farm management and harvesting, low market price, limited market information, absence of producer organizations, poor post-harvest handling, absence of cold storage, absence of collection centers, poor infrastructure, inadequate credit access, use of traditional transportation systems, absence of quality control, illegal traders, inadequate research and extension services, and use of inappropriate packaging material. Therefore, the recommendations include providing adequate practical training, forming producer organizations, and constructing collection centers.

Keywords: cactus pear, post-harvest losses, profit margin, value-chain

Procedia PDF Downloads 130
25472 Implications of Human Cytomegalovirus as a Protective Factor in the Pathogenesis of Breast Cancer

Authors: Marissa Dallara, Amalia Ardeljan, Lexi Frankel, Nadia Obaed, Naureen Rashid, Omar Rashid

Abstract:

Human Cytomegalovirus (HCMV) is a ubiquitous virus that remains latent in approximately 60% of individuals in developed countries. Viral load is kept at a minimum by the robust immune response produced in most individuals, who remain asymptomatic. HCMV has recently been implicated in cancer research because it may impose oncomodulatory effects on the tumor cells it infects, which could have an impact on the progression of cancer. HCMV has been implicated in increased pathogenicity of certain cancers such as gliomas, but in contrast, it can also exhibit anti-tumor activity. HCMV seropositivity has been recorded in tumor cells, but this may also have implications in decreased pathogenesis of certain forms of cancer such as leukemia as well as increased pathogenesis in others. This study aimed to investigate the correlation between cytomegalovirus and the incidence of breast cancer. Methods: The data used in this project were extracted from a Health Insurance Portability and Accountability Act (HIPAA) compliant national database to compare patients infected with cytomegalovirus against patients not infected, using ICD-10 and ICD-9 codes. Permission to utilize the database was given by Holy Cross Health, Fort Lauderdale, for the purpose of academic research. Data analysis was conducted using standard statistical methods. Results: The query was analyzed for dates ranging from January 2010 to December 2019, which resulted in 14,309 patients in each of the infected and control groups. The two groups were matched by age range and CCI score. The incidence of breast cancer was 1.642% (235 patients) in the cytomegalovirus group compared to 4.752% (680 patients) in the control group. The difference was statistically significant, with a p-value of less than 2.2 × 10^-16 and an odds ratio of 0.43 (95% confidence interval 0.4 to 0.48).
An investigation into the effects of HCMV treatment modalities, including Valganciclovir, Cidofovir, and Foscarnet, on breast cancer in both groups was conducted, but the numbers were insufficient to yield any statistically significant correlations. Conclusion: This study demonstrates a statistically significant correlation between cytomegalovirus and a reduced incidence of breast cancer. If HCMV can exert anti-tumor effects on breast cancer and inhibit growth, it could potentially be used to formulate immunotherapy that targets various types of breast cancer. Further evaluation is warranted to assess the implications of cytomegalovirus in reducing the incidence of breast cancer.

Keywords: human cytomegalovirus, breast cancer, immunotherapy, anti-tumor

Procedia PDF Downloads 208
25471 Implementing a Database from a Requirement Specification

Authors: M. Omer, D. Wilson

Abstract:

Creating a database schema is essentially a manual process. The information contained in a requirement specification has to be analyzed and reduced into a set of tables, attributes and relationships. This is a time-consuming process that has to go through several stages before an acceptable database schema is achieved. The purpose of this paper is to implement a Natural Language Processing (NLP) based tool to produce a relational database schema from a requirement specification. The Stanford CoreNLP version 3.3.1 and the Java programming language were used to implement the proposed model. The outcome of this study indicates that the first draft of a relational database schema can be extracted from a requirement specification by using NLP tools and techniques with minimum user intervention. Therefore, this method is a step forward in finding a solution that requires little or no user intervention.

Keywords: information extraction, natural language processing, relation extraction

Procedia PDF Downloads 261
25470 Helping the Development of Public Policies with Knowledge of Criminal Data

Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno

Abstract:

The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as associative algorithms and extraction of tree decision rules, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea, with a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.

Keywords: social data analysis, criminal records, computational techniques, data mining, big data

Procedia PDF Downloads 84
25469 Optimization of Real Time Measured Data Transmission, Given the Amount of Data Transmitted

Authors: Michal Kopcek, Tomas Skulavik, Michal Kebisek, Gabriela Krizanova

Abstract:

The operation of nuclear power plants involves continuous monitoring of the environment in their area. This monitoring is performed using a complex data acquisition system, which collects status information about the system itself and the values of many important physical variables, e.g. temperature, humidity, and dose rate. This paper describes a proposal for, and optimization of, the communication that takes place in a teledosimetric system between the central control server, which is responsible for data processing and storage, and the decentralized measuring stations, which measure the physical variables. Analyses of the ongoing communication were performed, and the system architecture and communication were optimized accordingly.

Keywords: communication protocol, transmission optimization, data acquisition, system architecture

Procedia PDF Downloads 518
25468 A Green Optically Active Hydrogen and Oxygen Generation System Employing Terrestrial and Extra-Terrestrial Ultraviolet Solar Irradiance

Authors: H. Shahid

Abstract:

Due to ozone layer depletion on Earth, incoming ultraviolet (UV) radiation is recorded at high index levels, such as 25 in southern Peru (13.5° S, 3360 m a.s.l.). The planning of human habitation on Mars, where UV radiation is quite high, is also under discussion. Exposure to UV is hazardous to health and is avoided by using UV filters. On the other hand, artificial UV sources are in use for water thermolysis to generate Hydrogen and Oxygen, which are later used as fuels. This paper presents the utility of employing UVA (315-400 nm) and UVB (280-315 nm) electromagnetic radiation from the solar spectrum to design and implement an optically active Hydrogen and Oxygen generation system via thermolysis of desalinated seawater. The proposed system finds its utility on Earth and can be deployed in the future on Mars (UVB). In this system, by using Fresnel lens arrays as an optical filter and via active tracking, the ultraviolet light from the sun is concentrated and then allowed to fall on two sub-systems of the proposed system. The first sub-system generates electrical energy by using UV-based tandem photovoltaic cells such as GaAs/GaInP/GaInAs/GaInAsP, and the second elevates the temperature of the water to lower the electric potential required to electrolyze it. An empirical analysis is performed at 30 atm, and the electrical potential is observed to be the main controlling factor for the rate of production of Hydrogen and Oxygen and hence the operating point (Q-point) of the proposed system. The Hydrogen production rate of the commercial system in static mode (650 °C, 0.6 V) is taken as a reference. The silicon oxide electrolyzer cell (SOEC) is used in the proposed (UV) system for the Hydrogen and Oxygen production. To achieve the same amount of Hydrogen as the reference system, with a minimum chamber operating temperature of 850 °C in static mode, the corresponding required electrical potential is calculated as 0.3 V.
In practice, however, the Hydrogen production rate at 850 °C and 0.3 V is observed to be low in comparison to the reference system. It has been shown empirically that the Hydrogen production can be enhanced by raising the electrical potential to 0.45 V, which increases the production rate to the same level as that of the reference system. Therefore, 850 °C and 0.45 V are assigned as the Q-point of the proposed system, which is actively stabilized via proportional integral derivative controllers that adjust the axial position of the lens arrays for both sub-systems. The controllers work by holding the chamber at the Q-point of 850 °C (the minimum operating temperature) and 0.45 V to realize the same Hydrogen production rate as the reference system.

Keywords: hydrogen, oxygen, thermolysis, ultraviolet

Procedia PDF Downloads 133
25467 Effect on Bandwidth of Using Double Substrates Based Metamaterial Planar Antenna

Authors: Smrity Dwivedi

Abstract:

This paper examines the effect of double substrates on the bandwidth performance of planar antennas. The material used is important for achieving minimum return loss and improved directivity. The author uses double substrates to enhance the efficiency of the antenna in terms of gain. A metamaterial-based antenna has its own specific structure, which increases the performance of the antenna. The improved return loss is -20 dB and the voltage standing wave ratio (VSWR) is 1.2, which is better than the single-substrate case with a return loss of -15 dB and a VSWR of 1.4. All results are obtained using the commercial software CST Microwave Studio.
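The reported return loss and VSWR figures can be cross-checked with the standard transmission-line relation between the two quantities: the reflection coefficient magnitude is |Γ| = 10^(-RL/20) for a return loss of RL dB, and VSWR = (1 + |Γ|) / (1 - |Γ|). The sketch below is not taken from the paper; the function name is hypothetical, and it accepts the return loss with either sign, since the abstract quotes it as a negative dB value.

```python
def vswr_from_return_loss(return_loss_db: float) -> float:
    """Convert a return loss in dB (sign ignored) to a VSWR value."""
    magnitude_db = abs(return_loss_db)
    if magnitude_db == 0:
        raise ValueError("0 dB return loss means total reflection (infinite VSWR)")
    gamma = 10 ** (-magnitude_db / 20)  # reflection coefficient magnitude
    return (1 + gamma) / (1 - gamma)


if __name__ == "__main__":
    for rl in (-20.0, -15.0):
        print(f"return loss {rl} dB -> VSWR {vswr_from_return_loss(rl):.2f}")
```

For -20 dB and -15 dB this gives roughly 1.22 and 1.43, consistent with the rounded values of 1.2 and 1.4 quoted in the abstract.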

Keywords: CST microwave studio, metamaterial, return loss, VSWR

Procedia PDF Downloads 390
25466 Automated Building Internal Layout Design Incorporating Post-Earthquake Evacuation Considerations

Authors: Sajjad Hassanpour, Vicente A. González, Yang Zou, Jiamou Liu

Abstract:

Earthquakes pose a significant threat to both structural and non-structural elements in buildings, putting human lives at risk. Effective post-earthquake evacuation is critical for ensuring the safety of building occupants. However, current design practices often neglect the integration of post-earthquake evacuation considerations into the early-stage architectural design process. To address this gap, this paper presents a novel automated internal architectural layout generation tool that optimizes post-earthquake evacuation performance. The tool takes an initial plain floor plan as input, along with specific requirements from the user/architect, such as minimum room dimensions, corridor width, and exit lengths. Based on these inputs, firstly, the tool randomly generates different architectural layouts. Secondly, the human post-earthquake evacuation behaviour will be thoroughly assessed for each generated layout using the advanced Agent-Based Building Earthquake Evacuation Simulation (AB2E2S) model. The AB2E2S prototype is a post-earthquake evacuation simulation tool that incorporates variables related to earthquake intensity, architectural layout, and human factors. It leverages a hierarchical agent-based simulation approach, incorporating reinforcement learning to mimic human behaviour during evacuation. The model evaluates different layout options and provides feedback on evacuation flow, time, and possible casualties due to earthquake non-structural damage. By integrating the AB2E2S model into the automated layout generation tool, architects and designers can obtain optimized architectural layouts that prioritize post-earthquake evacuation performance. Through the use of the tool, architects and designers can explore various design alternatives, considering different minimum room requirements, corridor widths, and exit lengths. This approach ensures that evacuation considerations are embedded in the early stages of the design process. 
In conclusion, this research presents an innovative automated internal architectural layout generation tool that integrates post-earthquake evacuation simulation. By incorporating evacuation considerations into the early-stage design process, architects and designers can optimize building layouts for improved post-earthquake evacuation performance. This tool empowers professionals to create resilient designs that prioritize the safety of building occupants in the face of seismic events.

Keywords: agent-based simulation, automation in design, architectural layout, post-earthquake evacuation behavior

Procedia PDF Downloads 104
25465 The Duty of Application and Connection Providers Regarding the Supply of Internet Protocol by Court Order in Brazil to Determine Authorship of Acts Practiced on the Internet

Authors: João Pedro Albino, Ana Cláudia Pires Ferreira de Lima

Abstract:

Humanity has undergone a transformation from the physical to the virtual world, generating an enormous amount of data on the world wide web, known as big data. Many facts that occur in the physical world or in the digital world are proven through records made on the internet, such as digital photographs, posts on social media, contract acceptances by digital platforms, email, banking, and messaging applications, among others. These data recorded on the internet have been used as evidence in judicial proceedings. The identification of internet users is essential for the security of legal relationships. This research was carried out on scientific articles and materials from courses and lectures, with an analysis of Brazilian legislation and some judicial decisions on the request of static data from logs and Internet Protocols (IPs) from application and connection providers. In this article, we will address the determination of authorship of data processing on the internet by obtaining the IP address and the appropriate judicial procedure for this purpose under Brazilian law.

Keywords: IP address, digital forensics, big data, data analytics, information and communication technology

Procedia PDF Downloads 124
25464 Early Phase Design Study of a Sliding Door with Multibody Simulations

Authors: Erkan Talay, Mustafa Yigit Yagci

Abstract:

For systems such as a sliding door, designers should predict not only the strength but also the dynamic behavior of the system, and this prediction becomes more critical when the design changes radically from previous designs. Physical tests can also cost more than expected, especially for rail geometry changes, since this geometry affects the design of the body. The aim of the study is to observe and understand the dynamics of the sliding door in a virtual environment. For this purpose, a multibody dynamic model of the sliding door was built, and the effects of various parameters such as rail geometry, roller diameters, and center of mass were determined. A design of experiment study was also performed to observe the interactions of these parameters.

Keywords: design of experiment, minimum closing effort, multibody simulation, sliding door

Procedia PDF Downloads 137
25463 Sourcing and Compiling a Maltese Traffic Dataset MalTra

Authors: Gabriele Borg, Alexei De Bono, Charlie Abela

Abstract:

There is a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is experiencing ongoing population growth and an increase in mobilization demand. This research takes advantage of data that is continuously being sourced and converts it into useful information related to the traffic problem on Maltese roads. The scope of this paper is to provide a methodology for creating a custom dataset (MalTra - Malta Traffic), compiled from multiple participants in various locations across the island, to identify the most common routes taken and expose the main areas of activity. Such use of big data appears in various technologies referred to as Intelligent Transportation Systems (ITSs), and it is concluded that there is significant potential in utilising such sources of data on a nationwide scale.

Keywords: Big Data, vehicular traffic, traffic management, mobile data patterns

Procedia PDF Downloads 109
25462 Comparative Study of Accuracy of Land Cover/Land Use Mapping Using Medium Resolution Satellite Imagery: A Case Study

Authors: M. C. Paliwal, A. K. Jain, S. K. Katiyar

Abstract:

Assessing the accuracy of classified satellite imagery is very important. In order to determine the accuracy of the classified image, the assumed-true data are usually derived from ground truth data collected using the Global Positioning System. The data obtained from the satellite imagery and the ground truth data are then compared to determine the accuracy of the data, and error matrices are prepared. Overall and individual accuracies are calculated using different methods. The study illustrates advanced classification and accuracy assessment of land use/land cover mapping using satellite imagery. IRS-1C-LISS IV data were used for classification of satellite imagery. The satellite image was classified into fourteen classes, including water bodies, agricultural fields, forest land, urban settlement, barren land, and unclassified area. Classification of the satellite imagery and calculation of accuracy were done using ERDAS-Imagine software to find the best method. This study is based on data collected within the boundaries of Bhopal city in Madhya Pradesh State, India.
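To make the error-matrix step concrete, here is a small sketch of how overall and per-class accuracies are commonly derived from such a matrix. It is not the ERDAS-Imagine procedure used in the study. It assumes rows hold the classified (map) labels and columns hold the ground truth labels, and the function name is hypothetical.

```python
def accuracy_from_error_matrix(matrix):
    """Return (overall, user's, producer's) accuracies from a square error matrix.

    matrix[i][j] counts samples classified as class i whose ground truth is class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(n)]  # correctly classified samples
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = sum(diagonal) / total
    # User's accuracy: share of samples mapped to a class that truly belong to it.
    users = [d / r if r else 0.0 for d, r in zip(diagonal, row_totals)]
    # Producer's accuracy: share of ground truth samples of a class mapped correctly.
    producers = [d / c if c else 0.0 for d, c in zip(diagonal, col_totals)]
    return overall, users, producers


if __name__ == "__main__":
    # Made-up two-class matrix, for illustration only.
    print(accuracy_from_error_matrix([[45, 5], [10, 40]]))
```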

Keywords: resolution, accuracy assessment, land use mapping, satellite imagery, ground truth data, error matrices

Procedia PDF Downloads 508
25461 Effect of Genuine Missing Data Imputation on Prediction of Urinary Incontinence

Authors: Suzan Arslanturk, Mohammad-Reza Siadat, Theophilus Ogunyemi, Ananias Diokno

Abstract:

Missing data is a common challenge in statistical analyses of most clinical survey datasets. A variety of methods have been developed to enable analysis of survey data with missing values. Imputation is the most commonly used of these methods. However, in order to minimize the bias introduced by imputation, one must choose the right imputation technique and apply it to the correct type of missing data. In this paper, we have identified different types of missing values: missing data due to skip pattern (SPMD), undetermined missing data (UMD), and genuine missing data (GMD), and applied rough set imputation to only the GMD portion of the missing data. We have used rough set imputation to evaluate the effect of such imputation on prediction by generating several simulation datasets based on an existing epidemiological dataset (MESA). To measure how well each dataset lends itself to the prediction model (logistic regression), we have used p-values from the Wald test. To evaluate the accuracy of the prediction, we have considered the width of the 95% confidence interval for the probability of incontinence. Both imputed and non-imputed simulation datasets were fit to the prediction model, and both turned out to be significant (p-value < 0.05). However, the Wald score shows a better fit for the imputed than for the non-imputed datasets (28.7 vs. 23.4). The average confidence interval width decreased by 10.4% when the imputed dataset was used, meaning higher precision. The results show that using the rough set method for missing data imputation on GMD data improves the predictive capability of the logistic regression. Further studies are required to generalize this conclusion to other clinical survey datasets.

Keywords: rough set, imputation, clinical survey data simulation, genuine missing data, predictive index

Procedia PDF Downloads 168
25460 Groundwater Level Modelling by ARMA and PARMA Models (Case Study: Qorveh Aquifer)

Authors: Motalleb Byzedi, Seyedeh Chaman Naderi Korvandan

Abstract:

Using annual groundwater level statistics from the existing piezometers in the Qorveh plain, both ARMA and PARMA modeling methods were applied in this study using SAMS software. After the required tests were performed, the model with the minimum Akaike information criterion was selected as the suitable model for each piezometer. These models were then used to estimate future fluctuations in each piezometer. According to the results, the ARMA model was better suited to modeling the aquifer. It also became clear that the eastern parts of the aquifer had more failures than other parts. Therefore, it is necessary to prohibit abstraction in the critical parts and to supervise the withdrawal rates of wells more closely.
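Selecting the model with the minimum Akaike information criterion can be sketched as follows. The study used SAMS software; this is only an illustration of the criterion itself, AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L is the maximized log-likelihood. The candidate names and fitted values below are hypothetical placeholders, not results from the paper.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates: dict) -> str:
    """Return the name of the candidate model with the smallest AIC.

    `candidates` maps a model name to (maximized log-likelihood, parameter count).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    # Hypothetical fits for one piezometer; the numbers are invented.
    fits = {
        "ARMA(1,0)": (-120.0, 2),
        "ARMA(1,1)": (-115.0, 3),
        "ARMA(2,1)": (-114.8, 4),
    }
    print(select_by_aic(fits))
```

The extra parameter in the largest model buys only a small likelihood gain, so the penalty term makes the middle candidate the preferred one in this made-up case.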

Keywords: qorveh plain, groundwater level, ARMA, PARMA

Procedia PDF Downloads 286
25459 Database Management System for Orphanages to Help Track of Orphans

Authors: Srivatsav Sanjay Sridhar, Asvitha Raja, Prathit Kalra, Soni Gupta

Abstract:

A database management system keeps track of details about the people in an organisation. Few orphanages these days are shifting to a computer and program-based system; unfortunately, most have only pen-and-paper records, which not only consume space but are also not eco-friendly. Viewing a person's record is a hassle because one has to search through multiple records, which takes time. This program will organise all the data and can pull out any information about anyone whose data has been entered. It is also a safe means of storage, as physical records degrade over time or, worse, are destroyed by natural disasters. In this developing world, it is only sensible to shift all data to an electronic storage system. The program comes with all the required features, including creating, inserting, searching, and deleting the data, as well as printing them.

Keywords: database, orphans, programming, C++

Procedia PDF Downloads 156
25458 Reducing CO2 Emission Using EDA and Weighted Sum Model in Smart Parking System

Authors: Rahman Ali, Muhammad Sajjad, Farkhund Iqbal, Muhammad Sadiq Hassan Zada, Mohammed Hussain

Abstract:

Emission of carbon dioxide (CO2) has adversely affected the environment. One of the major sources of CO2 emission is transportation. In the last few decades, the increase in the mobility of people using vehicles has enormously increased the emission of CO2 into the environment. To reduce CO2 emission, a sustainable transportation system is required, in which smart parking is one of the important measures that need to be established. To contribute to reducing the amount of CO2 emission, this research proposes a smart parking system. A cloud-based solution is provided to drivers which automatically searches for and recommends the most preferred parking slots. To determine preferences among the parking areas, the methodology exploits a number of unique parking features, which ultimately results in the selection of a parking area that leads to the minimum level of CO2 emission from the current position of the vehicle. To realize the methodology, a scenario-based implementation is considered. During the implementation, a mobile application with GPS signals, vehicles with a number of vehicle features, and a list of parking areas with parking features are processed by sorting, multi-level filtering, exploratory data analysis (EDA), Analytical Hierarchy Process (AHP), and weighted sum model (WSM) to rank the parking areas and recommend the top-k most preferred parking areas to drivers. In the EDA process, “2020testcar-2020-03-03”, a freely available dataset, is used to estimate the CO2 emission of a particular vehicle. To evaluate the system, the results of the proposed system are compared with the conventional approach, which reveals that the proposed methodology outperforms the conventional one in reducing the emission of CO2 into the atmosphere.
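The ranking step with a weighted sum model can be sketched as below. This is a generic WSM, not the paper's implementation: the criteria names, weights, and parking data are invented for illustration, all criteria are treated as costs (lower is better), and every value is assumed to be strictly positive so that the usual min/value normalization is defined. In the paper, the weights would come from the AHP step.

```python
def rank_parking_areas(areas: dict, weights: dict, k: int) -> list:
    """Rank parking areas by weighted sum score and return the top-k names.

    `areas` maps an area name to {criterion: value}; all criteria are costs.
    `weights` maps each criterion to its weight (ideally summing to 1).
    """
    # Best (smallest) value per criterion, used for cost normalization.
    best = {c: min(values[c] for values in areas.values()) for c in weights}
    scores = {
        name: sum(w * best[c] / values[c] for c, w in weights.items())
        for name, values in areas.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical parking areas with estimated CO2 (grams) and distance (km).
    areas = {
        "A": {"co2_g": 100.0, "distance_km": 1.0},
        "B": {"co2_g": 200.0, "distance_km": 0.5},
        "C": {"co2_g": 400.0, "distance_km": 2.0},
    }
    weights = {"co2_g": 0.7, "distance_km": 0.3}
    print(rank_parking_areas(areas, weights, k=2))
```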

Keywords: car parking, CO2, CO2 reduction, IoT, merge sort, number plate recognition, smart car parking

Procedia PDF Downloads 146
25457 New Two-Way Map-Reduce Join Algorithm: Hash Semi Join

Authors: Marwa Hussein Mohamed, Mohamed Helmy Khafagy, Samah Ahmed Senbel

Abstract:

Map Reduce is a programming model used to handle and support massive data sets. With the rapid increase in data size, analyzing big data has become one of the most important issues today. Map Reduce is used to analyze data and extract more helpful information by using two simple functions, map and reduce, which are the only parts written by the programmer, and it includes load balancing, fault tolerance, and high scalability. The most important operation in data analysis is the join, but Map Reduce does not directly support joins. This paper explains two two-way map-reduce join algorithms, semi-join and per-split semi-join, and proposes a new algorithm, hash semi-join, that uses a hash table to increase performance by eliminating unused records as early as possible and applies the join using the hash table, rather than using the map function, to match the join key with the other data table in the second phase. Using hash tables does not affect memory size because only matched records from the second table are saved. Our experimental results show that the hash semi-join algorithm has higher performance than the two other algorithms as the data size increases from 10 million records to 500 million, and that running time increases with the size of the joined records between the two tables.
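The idea of discarding unmatched records early and probing a hash table can be illustrated with a single-machine sketch. This is not the authors' Map Reduce implementation: it runs in memory on lists of dictionaries, and the function and argument names are hypothetical.

```python
from collections import defaultdict


def hash_semi_join(left: list, right: list, key: str) -> list:
    """Join two lists of records on `key`, keeping only matched right records."""
    # Phase 1: collect the distinct join keys that occur in the left table.
    left_keys = {row[key] for row in left}

    # Phase 2: scan the right table once and store only the records whose key
    # appears in the left table; unmatched records are dropped immediately.
    matched = defaultdict(list)
    for row in right:
        if row[key] in left_keys:
            matched[row[key]].append(row)

    # Phase 3: probe the hash table for each left record and emit joined rows.
    joined = []
    for l_row in left:
        for r_row in matched.get(l_row[key], []):
            joined.append({**r_row, **l_row})
    return joined


if __name__ == "__main__":
    customers = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    orders = [{"id": 2, "city": "x"}, {"id": 3, "city": "y"}, {"id": 2, "city": "z"}]
    print(hash_semi_join(customers, orders, "id"))
```

Because the hash table only ever holds right-side records that have a partner on the left, its size is bounded by the join result rather than by the full second table, which mirrors the memory argument made in the abstract.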

Keywords: map reduce, hadoop, semi join, two way join

Procedia PDF Downloads 513
25456 Using Implicit Data to Improve E-Learning Systems

Authors: Slah Alsaleh

Abstract:

In recent years, with the growing popularity of the internet and technology, e-learning has become a major part of most education systems. One of the advantages e-learning systems provide is the large amount of information available about students' behavior while they interact with the system. Such information is very rich and can be used to improve the capability and efficiency of e-learning systems. This paper discusses how e-learning can benefit from implicit data in different ways, including creating homogeneous groups of students, evaluating students' learning, creating behavior profiles for students and identifying students through their behavior.

Keywords: e-learning, implicit data, user behavior, data mining

Procedia PDF Downloads 310
25455 Enabling Quantitative Urban Sustainability Assessment with Big Data

Authors: Changfeng Fu

Abstract:

Sustainable urban development has been widely accepted as common sense in modern urban planning and design. However, the measurement and assessment of urban sustainability, especially quantitative assessment, has always been an issue troubling planning and design professionals. This paper presents ongoing research on principles and techniques for quantitative urban sustainability assessment, which aim to integrate indicators, geospatial and geo-referenced data, and assessment techniques into a single mechanism. It is based on the principles and techniques of geospatial analysis with GIS and on statistical analysis methods. Decision-making methods such as AHP and SMART are also adopted to reach overall assessment conclusions. The possible interfaces and the presentation of data and quantitative assessment results are also described. This research is based on the knowledge, situations and data sources of the UK, but it is potentially adaptable to other countries or regions. The implementation potential of the mechanism is also discussed.

Keywords: urban sustainability assessment, quantitative analysis, sustainability indicator, geospatial data, big data

Procedia PDF Downloads 359
25454 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin

Authors: A. Ishag Mohamed, A. A. Rabah

Abstract:

The objective of this research is to develop a generalized correlation for the prediction of the thermal conductivity of n-Alkanes and Alkenes. There is little research on, and a lack of correlations for, the thermal conductivity of liquids in the open literature. The available experimental data covering the groups of n-Alkanes and Alkenes were collected. The data were assumed to correlate with temperature using the Filippov correlation. Nonparametric regression with the Grace Algorithm was used to develop the generalized correlation model. A spreadsheet program based on Microsoft Excel was used to plot the data and calculate the values of the coefficients. The results obtained were compared with the data found in Perry's Chemical Engineers' Handbook. The experimental data were correlated with temperature over the range 273.15 to 673.15 K, with R2 = 0.99. The developed correlation reproduced experimental data that were not included in the regression with an absolute average percent deviation (AAPD) of less than 7%. Thus, the spreadsheet is quite accurate and produces reliable data.

Keywords: N-Alkanes, N-Alkenes, nonparametric, regression

Procedia PDF Downloads 654
25453 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has recently become one of the important business and research priorities. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data characterized by high volume, velocity and variety. Extracting valuable information and trends from these data would aid better understanding and decision-making. Multiple analysis techniques are deployed for English content. Arabic, however, is one of the languages that produces a large amount of data over social networks yet is among the least analyzed. This paper surveys the research efforts to analyze Arabic content on Twitter, focusing on the tools and methods used to extract sentiment from that content.

Keywords: big data, social networks, sentiment analysis, twitter

Procedia PDF Downloads 576
25452 Estimating Current Suicide Rates Using Google Trends

Authors: Ladislav Kristoufek, Helen Susannah Moat, Tobias Preis

Abstract:

Data on the number of people who have committed suicide tends to be reported with a substantial time lag of around two years. We examine whether online activity measured by Google searches can help us improve estimates of the number of suicide occurrences in England before official figures are released. Specifically, we analyse how data on the number of Google searches for the terms “depression” and “suicide” relate to the number of suicides between 2004 and 2013. We find that estimates drawing on Google data are significantly better than estimates using previous suicide data alone. We show that a greater number of searches for the term “depression” is related to fewer suicides, whereas a greater number of searches for the term “suicide” is related to more suicides. Data on suicide related search behaviour can be used to improve current estimates of the number of suicide occurrences.

Keywords: nowcasting, search data, Google Trends, official statistics

Procedia PDF Downloads 357
25451 Modern Trends in Pest Management Agroindustry

Authors: Amarjit S Tanda

Abstract:

Integrated Pest Management Technology (IPMT) offers a crop protection model for sustainable agricultural production with minimum damage to the environment and human health. The concept of agro-ecological crop protection seems unsuitable under dynamic environmental systems. To remedy this, we propose the Genetically Engineered Crop Protection System (GECPS) as an alternative concept in IPMT that suggests how GE cultivars can be optimally put to the service of crop protection. Genetically engineered cultivars developed through gene-editing biotechnology may provide a preventive defense against insect pests and plant diseases, making them a suitable alternative crop system for blending into IPMT programs in the future agro-industry.

Keywords: integrated, pest, management, technology

Procedia PDF Downloads 73
25450 Geology and Geochemistry of the Paleozoic Basement, Western Algeria

Authors: Hadj Mohamed Nacera, Boutaleb Abdelhak

Abstract:

The Hercynian granite in Western Algeria has a typical high-K calc-alkaline evolution with a peraluminous trend. U-Pb zircon geochronology yielded a minimum emplacement age of 297 ± 1 Ma. The granite shows dark microgranular enclaves and veins of pegmatite, aplite, tourmaline and quartz. The granite plutons selected for this study were formed during the late Variscan phase and intrude the Lower Silurian metasediments, which were affected by the major Hercynian folding phases. An important quartz vein field cross-cuts the metasedimentary and granitic rocks. Invisible gold occurs in very small arsenopyrite minerals. The purpose of this study is to highlight the relationship between the gold mineralisation and the intrusion by combining petrographic and geochemical studies.

Keywords: Algeria, basement, geochemistry, granite

Procedia PDF Downloads 271
25449 On the Network Packet Loss Tolerance of SVM Based Activity Recognition

Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir

Abstract:

In this study, the data loss tolerance of a Support Vector Machine (SVM) based activity recognition model and its multi-activity classification performance when data are received over a lossy wireless sensor network are examined. First, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Second, the effect of differentiated quality-of-service performance on activity recognition success is measured with activity data acquired from a multi-hop wireless sensor network, which introduces high data loss. The effect of the number of nodes on reliability and multi-activity classification success is demonstrated in a simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on the activity detection success rate of an SVM based classification algorithm has not been studied before.

Keywords: activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss

Procedia PDF Downloads 475
25448 GIS Data Governance: GIS Data Submission Process for Build-in Project, Replacement Project at Oman Electricity Transmission Company

Authors: Rahma Saleh Hussein Al Balushi

Abstract:

Oman Electricity Transmission Company's (OETC) vision is to be a renowned world-class transmission grid by 2025. One indication of achieving that vision is obtaining Asset Management ISO55001 certification, which requires setting out documented Standard Operating Procedures (SOPs). Hence, a documented SOP for the Geographical Information System (GIS) data process has been established. In addition, to effectively manage and improve OETC power transmission, asset data and information need to be governed by the Asset Information & GIS department. This paper describes in detail the current GIS data submission process and the journey of developing it. The methodology used to develop the process is based on three main pillars: system and end-user requirements, risk evaluation, and data availability and accuracy. The output of this paper shows the dramatic change in the process used, which has resulted in more efficient, accurate, and up-to-date data. Furthermore, thanks to this process, GIS has been integrated, and is ready to be integrated, with other systems, and serves as the source of data for all OETC users. Some decisions related to issuing No Objection Certificates (NOC) for excavation permits and scheduling asset maintenance plans in the Computerized Maintenance Management System (CMMS) have been made as a consequence of GIS data availability. Defining agreed and documented procedures for data collection, data systems update, data release/reporting and data alterations has also contributed to reducing missing attributes and enhancing the data quality index of GIS transmission data. A considerable difference in Geodatabase (GDB) completeness percentage was observed between 2017 and 2022. Overall, we conclude that through governance, the Asset Information & GIS department can control the GIS data process and collect, properly record, and manage asset data and information within the OETC network. This control extends to other applications and systems integrated with or related to GIS systems.

Keywords: asset management ISO55001, standard procedures process, governance, CMMS

Procedia PDF Downloads 125
25447 Effects of Data Correlation in a Sparse-View Compressive Sensing Based Image Reconstruction

Authors: Sajid Abas, Jon Pyo Hong, Jung-Ryun Le, Seungryong Cho

Abstract:

Computed tomography and laminography are heavily investigated in a compressive sensing based image reconstruction framework to reduce the dose to patients as well as to radiosensitive devices such as multilayer microelectronic circuit boards. Researchers are actively working on optimizing compressive sensing based iterative image reconstruction algorithms to obtain better quality images. However, the effects of the sampled data's properties on the reconstructed image's quality, particularly under insufficiently sampled data conditions, have not been explored in computed laminography. In this paper, we investigate the effects of two data properties, namely sampling density and data incoherence, on the reconstructed image obtained by conventional computed laminography and by a recently proposed method called the spherical sinusoidal scanning scheme. We have found that in a compressive sensing based image reconstruction framework, the image quality mainly depends upon the data incoherence when the data are uniformly sampled.

Keywords: computed tomography, computed laminography, compressive sensing, low-dose

Procedia PDF Downloads 464
25446 Fuzzy Wavelet Model to Forecast the Exchange Rate of IDR/USD

Authors: Tri Wijayanti Septiarini, Agus Maman Abadi, Muhammad Rifki Taufik

Abstract:

The IDR/USD exchange rate can serve as an indicator for analyzing the Indonesian economy. It is an important factor because it has a large effect on the Indonesian economy overall, so exchange rate data need to be analyzed. The IDR/USD exchange rate data are decomposed into frequency and time components, which can help the government monitor the Indonesian economy. This method is very effective at identifying the case, yields highly accurate results and has a simple structure. The data used in this paper are weekly exchange rates from December 17, 2010 to November 11, 2014.
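
The time-frequency decomposition the abstract refers to can be illustrated with one level of a discrete wavelet transform. The sketch below assumes the Haar wavelet, which the abstract does not specify, and uses made-up rate values rather than the paper's data. The approximation coefficients capture the local trend and the detail coefficients capture week-to-week fluctuation.

```python
import math
from typing import List, Tuple


def haar_dwt(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar DWT on an even-length signal."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Scaled pairwise sums: the low-frequency (trend) part.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Scaled pairwise differences: the high-frequency (fluctuation) part.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx: List[float], detail: List[float]) -> List[float]:
    """Invert haar_dwt, recovering the original samples."""
    s = math.sqrt(2.0)
    signal: List[float] = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / s)
        signal.append((a - d) / s)
    return signal


# Illustrative weekly IDR/USD values (hypothetical, not the paper's data).
rates = [9000.0, 9020.0, 9100.0, 9080.0]

approx, detail = haar_dwt(rates)
reconstructed = haar_idwt(approx, detail)
print(approx, detail, reconstructed)
```

In a fuzzy wavelet model such as the one named in the keywords, coefficients like these would typically serve as inputs to the fuzzy inference step; that step is not sketched here.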

Keywords: exchange rate, fuzzy Mamdani, discrete wavelet transforms, fuzzy wavelet

Procedia PDF Downloads 571