Search results for: startup data analytics
23639 On Estimating the Low Income Proportion with Several Auxiliary Variables
Authors: Juan F. Muñoz-Rosas, Rosa M. García-Fernández, Encarnación Álvarez-Verdejo, Pablo J. Moya-Fernández
Abstract:
Poverty measurement is an important topic in many social science studies. One of the most important indicators when measuring poverty is the low income proportion, which gives the proportion of people in a population classified as poor. This indicator is generally unknown, and for this reason it is estimated from survey data obtained through official surveys carried out by statistical agencies such as Eurostat. The main feature of these survey data is that they contain several variables. The variable used to estimate the low income proportion is called the variable of interest. The survey data may also contain several additional variables, known as auxiliary variables, that are related to the variable of interest; if so, they can be used to improve the estimation of the low income proportion. In this paper, we use Monte Carlo simulation studies to analyze numerically the performance of estimators based on several auxiliary variables. In this simulation study, we considered real data sets obtained from the 2011 European Union Survey on Income and Living Conditions. Results derived from this study indicate that the estimators based on auxiliary variables are more accurate than the naive estimator.
Keywords: inclusion probability, poverty, poverty line, survey sampling
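As an illustration of the quantity being estimated, the sketch below computes a design-weighted low income proportion with no auxiliary information. It is not taken from the paper: the function name, the toy incomes and inclusion probabilities, and the poverty line set at 60% of the unweighted sample median are all assumptions made for the example, and whether this matches the paper's naive estimator is not confirmed by the abstract.

```python
from statistics import median


def low_income_proportion(incomes, inclusion_probs, poverty_line):
    """Weighted share of sampled units whose income is below the poverty line.

    Each unit is weighted by the inverse of its inclusion probability
    (a Hajek-type ratio estimator). Hypothetical helper, not the paper's code.
    """
    weights = [1.0 / p for p in inclusion_probs]
    poor_weight = sum(w for y, w in zip(incomes, weights) if y < poverty_line)
    return poor_weight / sum(weights)


# Hypothetical sample: incomes and first-order inclusion probabilities.
incomes = [8.0, 12.0, 15.0, 20.0, 30.0]
inclusion_probs = [0.5, 0.5, 0.25, 0.25, 0.5]

# Assumption: poverty line at 60% of the (unweighted) sample median income.
poverty_line = 0.6 * median(incomes)

naive = low_income_proportion(incomes, inclusion_probs, poverty_line)
print(f"poverty line = {poverty_line:.2f}, estimated proportion = {naive:.4f}")
```

Estimators that exploit auxiliary variables would adjust the weights before this ratio is formed; that step is left out because the abstract does not specify it.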
Procedia PDF Downloads 459
23638 TessPy – Spatial Tessellation Made Easy
Authors: Jonas Hamann, Siavash Saki, Tobias Hagen
Abstract:
Discretization of urban areas is a crucial aspect of many spatial analyses. The process of discretizing space into subspaces without overlaps or gaps is called tessellation. It helps in understanding space and provides a framework for analyzing geospatial data. Tessellation methods can be divided into two groups: regular tessellations and irregular tessellations. While regular tessellation methods, such as square grids or hexagon grids, are suitable for addressing pure geometry problems, they cannot take the unique characteristics of different subareas into account. Irregular tessellation methods, however, allow the borders between subareas to be defined more realistically based on urban features such as a road network or Points of Interest (POI). Even though Python is one of the most widely used programming languages for spatial analysis, there is currently no library that combines different tessellation methods to enable users and researchers to compare different techniques. To close this gap, we propose TessPy, an open-source Python package that combines all of the above-mentioned tessellation methods and makes them easily accessible to everyone. The core functions of TessPy represent the five different tessellation methods: squares, hexagons, adaptive squares, Voronoi polygons, and city blocks. With regular methods, users can set the resolution of the tessellation, which defines the fineness of the discretization and the desired number of tiles. Irregular tessellation methods allow users to define which spatial data to consider (e.g., amenity, building, office) and how fine the tessellation should be. The spatial data used is open-source and provided by OpenStreetMap. This data can be easily extracted and used for further analyses. Besides the methodology of the different techniques, the state of the art, including examples and future work, will be discussed.
All dependencies can be installed using conda or pip; however, the former is recommended.
Keywords: geospatial data science, geospatial data analysis, tessellations, urban studies
Procedia PDF Downloads 129
23637 A CFD Analysis of Hydraulic Characteristics of the Rod Bundles in the BREST-OD-300 Wire-Spaced Fuel Assemblies
Authors: Dmitry V. Fomichev, Vladimir V. Solonin
Abstract:
This paper presents the findings from a numerical simulation of the flow in 37-rod fuel assembly models spaced by a double-wire trapezoidal wrapping, as applied to the BREST-OD-300 experimental nuclear reactor. Data on a high static pressure distribution within the models and equations for determining the fuel bundle flow friction factors have been obtained. Recommendations are provided on using the closing turbulence models available in ANSYS Fluent. A comparative analysis has been performed against the existing empirical equations for determining the flow friction factors. The fit between the calculated and experimental data has been shown. An analysis of the experimental data and the results of the numerical simulation of the BREST-OD-300 fuel rod assembly hydrodynamic performance are presented.
Keywords: BREST-OD-300, wire-spaced, fuel assembly, computational fluid dynamics
Procedia PDF Downloads 385
23636 Analysis of Lead Time Delays in Supply Chain: A Case Study
Authors: Abdel-Aziz M. Mohamed, Nermeen Coutry
Abstract:
Lead time is an important measure of supply chain performance. It impacts both customer satisfaction and the total cost of inventory. This paper presents the results of a study analyzing the customer order lead time for a multinational company. In the study, the lead time was divided into three stages: order entry, order fulfillment, and order delivery. A sample of 2,425 order lines from the company records was considered for this study. The sample data include information regarding customer orders from the time of order entry until order delivery. Data regarding the lead time of each stage for different orders were also provided. Summary statistics on the lead time data reveal that about 30% of the orders were delivered after the scheduled due date. The multiple linear regression analysis revealed that component type, logistics parameter, order size, and customer type have a significant impact on lead time. Data analysis of the lead time stages indicates that stage 2 consumes over 50% of the lead time. Pareto analysis was used to study the reasons for customer order delays in each of the three stages. Recommendations were given to resolve the problem.
Keywords: lead time reduction, customer satisfaction, service quality, statistical analysis
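The Pareto analysis mentioned in this abstract can be sketched as a ranked table with cumulative percentages. The delay reasons and counts below are hypothetical placeholders, not data from the study, and `pareto_table` is an illustrative name.

```python
def pareto_table(counts):
    """Rank categories by frequency and attach the cumulative percentage."""
    total = sum(counts.values())
    rows = []
    cumulative = 0
    # Largest contributors first, as in a Pareto chart.
    for reason, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += n
        rows.append((reason, n, 100.0 * cumulative / total))
    return rows


# Hypothetical delay reasons for one lead-time stage (not from the paper).
delay_reasons = {
    "missing component": 50,
    "credit hold": 30,
    "carrier delay": 15,
    "data entry error": 5,
}

rows = pareto_table(delay_reasons)
for reason, n, cum_pct in rows:
    print(f"{reason:<20} {n:>4} {cum_pct:6.1f}%")
```

Reading down the cumulative column shows how few reasons account for most delayed orders, which is the usual basis for prioritizing corrective actions.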
Procedia PDF Downloads 734
23635 A Unified Approach for Digital Forensics Analysis
Authors: Ali Alshumrani, Nathan Clarke, Bogdan Ghite, Stavros Shiaeles
Abstract:
Digital forensics has become an essential tool in the investigation of cyber and computer-assisted crime. Arguably, given the prevalence of technology and the subsequent digital footprints that exist, it could have a significant role across almost all crimes. However, the variety of technology platforms (such as computers, mobiles, Closed-Circuit Television (CCTV), Internet of Things (IoT), databases, drones, cloud computing services), the heterogeneity and volume of data, forensic tool capability, and the investigative cost make investigations both technically challenging and prohibitively expensive. Forensic tools also tend to be siloed into specific technologies, e.g., File System Forensic Analysis Tools (FS-FAT) and Network Forensic Analysis Tools (N-FAT), and many data sources have few or no specialist forensic tools. Increasingly, it is also becoming essential to compare and correlate evidence across data sources efficiently and effectively, enabling an investigator to answer high-level questions of the data in a timely manner without having to trawl through data and perform the correlation manually. This paper proposes a Unified Forensic Analysis Tool (U-FAT), which aims to establish a common language for electronic information and permit multi-source forensic analysis. Core to this approach is the identification and development of forensic analyses that automate complex data correlations, enabling investigators to investigate cases more efficiently. The paper presents a systematic analysis of major crime categories and identifies what forensic analyses could be used.
For example, in a child abduction, an investigation team might have evidence from a range of sources including computing devices (mobile phone, PC), CCTV (potentially a large number), ISP records, and mobile network cell tower data, in addition to third-party databases such as the National Sex Offender registry and tax records, with the desire to auto-correlate across sources and visualize the results in a cognitively effective manner. U-FAT provides a holistic, flexible, and extensible approach to digital forensics in a technology-, application-, and data-agnostic manner, providing powerful and automated forensic analysis.
Keywords: digital forensics, evidence correlation, heterogeneous data, forensics tool
Procedia PDF Downloads 198
23634 Analyzing Medical Workflows Using Market Basket Analysis
Authors: Mohit Kumar, Mayur Betharia
Abstract:
With the emergence of the Electronic Medical Record (EMR), the healthcare domain collects a large amount of data, which has been attracting the interest of data mining experts. In the past, doctors have relied on their intuition while making critical clinical decisions. This paper presents a means of analyzing medical workflows to extract business insights from large archived medical databases. Market Basket Analysis (MBA), a data mining technique, has been widely used in marketing and e-commerce to discover associations between products bought together by customers. It helps businesses increase their sales by analyzing the purchasing behavior of customers and pitching the right product to the right customer. This paper is an attempt to demonstrate Market Basket Analysis applications in healthcare. In particular, it discusses applications of the Market Basket Analysis algorithm ‘Apriori’ within healthcare in major areas such as analyzing the workflow of diagnostic procedures, up-selling and cross-selling of healthcare systems, and designing more user-friendly healthcare systems. In the paper, we demonstrate the MBA applications using angiography systems, but the approach can be extrapolated to other modalities as well.
Keywords: data mining, market basket analysis, healthcare applications, knowledge discovery in healthcare databases, customer relationship management, healthcare systems
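To make the Apriori idea concrete, here is a minimal level-wise frequent-itemset search with a confidence calculation for one rule. It is a sketch under stated assumptions, not the authors' implementation: the workflow step names, the transactions, and the 0.5 support threshold are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Apriori-style search: return {itemset: support} for frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]  # candidate 1-itemsets
    result = {}
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: k / n for c, k in counts.items() if k / n >= min_support}
        result.update(frequent)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        keys = list(frequent)
        candidates = {a | b for a, b in combinations(keys, 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: keep a candidate only if all its k-subsets are frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in frequent
                          for s in combinations(c, len(c) - 1))]
    return result


# Hypothetical workflow "baskets": steps used together in one examination.
transactions = [
    {"contrast injection", "fluoroscopy", "3D reconstruction"},
    {"contrast injection", "fluoroscopy"},
    {"contrast injection", "3D reconstruction"},
    {"fluoroscopy", "roadmap"},
]

sets = frequent_itemsets(transactions, min_support=0.5)

# Confidence of the rule "contrast injection -> fluoroscopy".
pair = frozenset({"contrast injection", "fluoroscopy"})
confidence = sets[pair] / sets[frozenset({"contrast injection"})]
print(f"support = {sets[pair]:.2f}, confidence = {confidence:.2f}")
```

The prune step is what distinguishes Apriori from brute-force counting: an itemset is only counted if every smaller subset already met the support threshold.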
Procedia PDF Downloads 174
23633 Infrastructural Investment and Economic Growth in Indian States: A Panel Data Analysis
Authors: Jonardan Koner, Basabi Bhattacharya, Avinash Purandare
Abstract:
The study aims to determine the impact of infrastructural investment on economic development in Indian states. The study uses panel data analysis to measure the impact of infrastructural investment on Real Gross Domestic Product in Indian states. The panel data analysis incorporates the Unit Root Test, Cointegration Test, Pooled Ordinary Least Squares, Fixed Effect Approach, Random Effect Approach, and Hausman Test. The study analyzes annual panel data ranging from 1991 to 2012 and concludes that infrastructural investment has a desirable impact on economic development in India. Finally, the study reveals that infrastructural investment significantly explains the variation in the economic indicator.
Keywords: infrastructural investment, real GDP, unit root test, cointegration test, pooled ordinary least squares, fixed effect approach, random effect approach, Hausman test
Procedia PDF Downloads 404
23632 Adjusting Electricity Demand Data to Account for the Impact of Loadshedding in Forecasting Models
Authors: Migael van Zyl, Stefanie Visser, Awelani Phaswana
Abstract:
The electricity landscape in South Africa is characterized by frequent occurrences of loadshedding, a measure implemented by Eskom to manage electricity generation shortages by curtailing demand. Loadshedding, classified into stages ranging from 1 to 8 based on severity, involves the systematic rotation of power cuts across municipalities according to predefined schedules. However, this practice introduces distortions in recorded electricity demand, posing challenges to the accurate forecasting essential for budgeting, network planning, and generation scheduling. Addressing this challenge requires the development of a methodology to quantify the impact of loadshedding and integrate it back into metered electricity demand data. Fortunately, comprehensive records of loadshedding impacts are maintained in a database, enabling the alignment of loadshedding effects with hourly demand data. This adjustment ensures that forecasts accurately reflect true demand patterns, independent of loadshedding's influence, thereby enhancing the reliability of electricity supply management in South Africa. This paper presents a methodology for determining the hourly impact of loadshedding and subsequently adjusting historical demand data to account for it. Furthermore, two forecasting models are developed: one utilizing the original dataset and the other using the adjusted data. A comparative analysis is conducted to evaluate forecast accuracy improvements resulting from the adjustment process. By implementing this methodology, stakeholders can make more informed decisions regarding electricity infrastructure investments, resource allocation, and operational planning, contributing to the overall stability and efficiency of South Africa's electricity supply system.
Keywords: electricity demand forecasting, load shedding, demand side management, data science
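The adjustment step described above, integrating the recorded loadshedding impact back into metered demand, could look roughly like the following. The additive form, the hourly keys, and the megawatt figures are assumptions for illustration; the abstract does not give the exact formula or data layout.

```python
def adjust_demand(metered_mw, shed_mw):
    """Add the recorded loadshedding impact back onto metered hourly demand.

    Assumption: the impact is additive and keyed by the same hour labels.
    Hours with no loadshedding record are left unchanged.
    """
    return {hour: metered + shed_mw.get(hour, 0.0)
            for hour, metered in metered_mw.items()}


# Hypothetical hourly values in MW (not from the paper).
metered_mw = {
    "2023-06-01 17:00": 950.0,
    "2023-06-01 18:00": 900.0,
    "2023-06-01 19:00": 1020.0,
}
shed_mw = {
    "2023-06-01 18:00": 150.0,  # estimated demand curtailed in this hour
}

adjusted_mw = adjust_demand(metered_mw, shed_mw)
for hour, value in adjusted_mw.items():
    print(hour, value)
```

A forecasting model trained on `adjusted_mw` rather than `metered_mw` would then see the evening peak that the meter readings alone understate.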
Procedia PDF Downloads 63
23631 Corporate Governance and Share Prices: Firm Level Review in Turkey
Authors: Raif Parlakkaya, Ahmet Diken, Erkan Kara
Abstract:
This paper examines the relationship between corporate governance ratings and stock prices of 26 Turkish firms listed on the Turkish stock exchange (Borsa Istanbul) by using panel data analysis over a five-year period. The paper also investigates the stock performance of firms with a governance rating relative to the market portfolio (i.e., the BIST 100 Index) both before and after governance scoring began. The empirical results show that there is no relation between corporate governance rating and stock prices when using panel data for annual variation in both rating score and stock prices. Further analysis yields the surprising result that, while the selected firms outperform the market significantly prior to rating, the same performance does not continue afterwards.
Keywords: corporate governance, stock price, performance, panel data analysis
Procedia PDF Downloads 394
23630 Special Education Teachers’ Knowledge and Application of the Concept of Curriculum Adaptation for Learners with Special Education Needs in Zambia
Authors: Kenneth Kapalu Muzata, Dikeledi Mahlo, Pinkie Mabunda Mabunda
Abstract:
This paper presents the results of a study conducted to establish special education teachers’ knowledge and application of curriculum adaptation of the 2013 revised curriculum in Zambia. From a sample of 134 respondents (120 special education teachers, 12 education officers, and 2 curriculum specialists), the study collected both quantitative and qualitative data to establish whether teachers understood and applied the concept of curriculum adaptation in teaching learners with special education needs. To support data validity and reliability, the researchers collected data using mixed methods. Semi-structured questionnaires and interviews were administered. Lesson observations and post-lesson discussions were conducted with 12 teachers selected from the sample of 120 who answered the questionnaires. Frequencies, percentages, and significant differences were derived using the Statistical Package for the Social Sciences (SPSS). Qualitative data were analyzed with the help of NVivo qualitative software to create themes and obtain coding density to help with conclusions. Quantitative and qualitative data were concurrently compared and related. The results revealed that special education teachers lacked a thorough understanding of the concept of curriculum adaptation, thus denying learners with special education needs the opportunity to benefit from the revised curriculum. The teachers were not oriented on the revised curriculum and hence faced numerous challenges in trying to adapt it. The study recommended training special education teachers in curriculum adaptation.
Keywords: curriculum adaptation, special education, learners with special education needs, special education teachers
Procedia PDF Downloads 180
23629 Simultaneous Determination of Methotrexate and Aspirin Using Fourier Transform Convolution Emission Data under Non-Parametric Linear Regression Method
Authors: Marwa A. A. Ragab, Hadir M. Maher, Eman I. El-Kimary
Abstract:
Co-administration of methotrexate (MTX) and aspirin (ASP) can cause a pharmacokinetic interaction and a subsequent increase in blood MTX concentrations, which may increase the risk of MTX toxicity. Therefore, it is important to develop a sensitive, selective, accurate, and precise method for their simultaneous determination in urine. A new hybrid chemometric method has been applied to the emission response data of the two drugs. A spectrofluorimetric method for the determination of MTX through measurement of its acid-degradation product, 4-amino-4-deoxy-10-methylpteroic acid (4-AMP), was developed. Moreover, the acid-catalyzed degradation reaction enables the spectrofluorimetric determination of ASP through the formation of its active metabolite, salicylic acid (SA). The proposed chemometric method deals with the convolution of emission data using 8-point sin xi polynomials (discrete Fourier functions) after the derivative treatment of these emission data. The first and second derivative curves (D1 and D2) were obtained first, and then these curves were convolved to obtain the first and second derivative under Fourier functions curves (D1/FF and D2/FF). This new application was used to resolve the overlapped emission bands of the degradation products of both drugs to allow their simultaneous indirect determination in human urine. Not only was this chemometric approach applied to the emission data, but the obtained data were also subjected to non-parametric linear regression analysis (Theil’s method). The proposed method was fully validated according to the ICH guidelines, and it yielded linearity ranges of 0.05-0.75 and 0.5-2.5 µg mL⁻¹ for MTX and ASP, respectively. The non-parametric method was found to be superior to the parametric one in the simultaneous determination of MTX and ASP after the chemometric treatment of the emission spectra of their degradation products.
The work combines the advantages of derivative and convolution using discrete Fourier functions with the reliability and efficacy of non-parametric data analysis. The achieved sensitivity, along with the low values of LOD (0.01 and 0.06 µg mL⁻¹) and LOQ (0.04 and 0.2 µg mL⁻¹) for MTX and ASP, respectively, obtained with the second derivative under Fourier functions (D2/FF), is promising and supports the method's application for monitoring the two drugs in patients’ urine samples.
Keywords: chemometrics, emission curves, derivative, convolution, Fourier transform, human urine, non-parametric regression, Theil’s method
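For readers unfamiliar with Theil's method, the sketch below fits a calibration line using the median of all pairwise slopes, with the intercept taken as the median of the residual offsets (one common convention, assumed here). The concentrations and responses are invented to show the robustness to a single outlier; they are not the paper's measurements.

```python
from itertools import combinations
from statistics import median


def theil_fit(x, y):
    """Non-parametric line fit: median of pairwise slopes (Theil's method)."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[j] != x[i]]
    slope = median(slopes)
    # Assumption: intercept as the median of y_i - slope * x_i.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical calibration data; the last response is a deliberate outlier.
concentration = [0.5, 1.0, 1.5, 2.0, 2.5]
response = [1.0, 2.0, 3.0, 4.0, 10.0]

slope, intercept = theil_fit(concentration, response)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Because six of the ten pairwise slopes are unaffected by the outlying point, the median stays on the underlying line, whereas an ordinary least-squares fit would be pulled upward.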
Procedia PDF Downloads 431
23628 Adopting Structured Mini Writing Retreats as a Tool for Undergraduate Researchers
Authors: Clare Cunningham
Abstract:
Whilst there is a strong global research base on the benefits of structured writing retreats and similar provisions, such as Shut Up and Write events, for academic staff and postgraduate researchers, very little has been published about the worth of such events for undergraduate students. This is despite the fact that, internationally, undergraduate student researchers experience similar pressures, distractions, and feelings towards writing as those at more senior levels within the academy. This paper reports on a mixed-methods study with cohorts of third-year undergraduate students over the course of four academic years. A range of research instruments was adopted over the four years of the study, including the administration of four questionnaires across three academic years, a collection of ethnographic recordings in the second year, and the collation of reflective journal entries and evaluations from all four years. The final two years of data collection took place during the period of Covid-19 restrictions, when writing retreats moved to the virtual space, which adds a further dimension of interest to the analysis. The analysis involved the collation of quantitative questionnaire data to observe patterns in expressions of attitudes towards writing. Qualitative data were analysed thematically and used to corroborate and support the quantitative data when appropriate. The resulting data confirmed that the biggest challenges for undergraduate students mirror those reported in the findings of studies focused on more experienced researchers. This is not surprising, especially given the number of undergraduate students who now work alongside their studies, as well as the increasing number who have caring responsibilities, but it has, as yet, been under-reported.
The data showed that the groups of writing retreat participants all had very positive experiences, with accountability, a sense of community, and procrastination avoidance among the key aspects. The analysis revealed the sometimes transformative power of these events for a number of these students in terms of changing the way they viewed writing and themselves as writers. The data presented in this talk will support the proposal that retreats should be offered much more widely to undergraduate students across the world.
Keywords: academic writing, students, undergraduates, writing retreat
Procedia PDF Downloads 201
23627 Detecting Overdispersion for Mortality AIDS in Zero-inflated Negative Binomial Death Rate (ZINBDR) Co-infection Patients in Kelantan
Authors: Mohd Asrul Affedi, Nyi Nyi Naing
Abstract:
Overdispersion is often present in count data, and when it occurs, a negative binomial (NB) model is commonly used in place of the standard Poisson model. For the analysis of count events such as mortality cases, a Poisson regression model is the usual choice. However, that model is not appropriate when the data contain excess zero values; in that case, the zero-inflated negative binomial model is appropriate. In this article, we modelled mortality cases as the dependent variable by age category. The objective of this study is to determine whether overdispersion exists in the mortality data of AIDS co-infection patients in Kelantan.
Keywords: negative binomial death rate, overdispersion, zero-inflation negative binomial death rate, AIDS
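A quick descriptive check for the two features discussed here, overdispersion and excess zeros, is sketched below. It compares the variance-to-mean ratio with 1 (its value under a Poisson model) and the observed share of zeros with the Poisson expectation exp(-mean). The counts are hypothetical, and this screening step is not the regression-based procedure of the paper.

```python
from math import exp
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by the mean; about 1 for Poisson-like data."""
    return variance(counts) / mean(counts)


# Hypothetical death counts per age category (not the Kelantan data).
deaths = [0, 0, 0, 1, 0, 2, 0, 7, 0, 10]

index = dispersion_index(deaths)
zero_share = deaths.count(0) / len(deaths)
poisson_zero_share = exp(-mean(deaths))  # P(Y = 0) under Poisson(mean)

print(f"dispersion index = {index:.2f}")
print(f"observed zeros = {zero_share:.2f}, Poisson expects {poisson_zero_share:.2f}")
```

An index well above 1 points towards a negative binomial model, and an observed zero share far above the Poisson expectation points towards a zero-inflated variant.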
Procedia PDF Downloads 465
23626 Using Geospatial Analysis to Reconstruct the Thunderstorm Climatology for the Washington DC Metropolitan Region
Authors: Mace Bentley, Zhuojun Duan, Tobias Gerken, Dudley Bonsal, Henry Way, Endre Szakal, Mia Pham, Hunter Donaldson, Chelsea Lang, Hayden Abbott, Leah Wilcynzski
Abstract:
Air pollution has the potential to modify the lifespan and intensity of thunderstorms and the properties of lightning. Using data mining and geovisualization, we investigate how background climate and weather conditions shape variability in urban air pollution and how this, in turn, shapes thunderstorms as measured by the intensity, distribution, and frequency of cloud-to-ground lightning. A spatiotemporal analysis was conducted to identify thunderstorms using high-resolution lightning detection network data. Over seven million lightning flashes were used to identify more than 196,000 thunderstorms that occurred between 2006 and 2020 in the Washington, DC Metropolitan Region. Each lightning flash in the dataset was grouped into thunderstorm events by means of a temporal and spatial clustering algorithm. Once the thunderstorm event database was constructed, hourly wind direction, wind speed, and atmospheric thermodynamic data were added to the initiation and dissipation times and locations for the 196,000 identified thunderstorms. Hourly aerosol and air quality data for the thunderstorm initiation times and locations were also incorporated into the dataset. Developing thunderstorm climatologies using a lightning tracking algorithm and lightning detection network data was found to be useful for visualizing the spatial and temporal distribution of urban-augmented thunderstorms in the region.
Keywords: lightning, urbanization, thunderstorms, climatology
Procedia PDF Downloads 77
23625 Real-Time Network Anomaly Detection Systems Based on Machine-Learning Algorithms
Authors: Zahra Ramezanpanah, Joachim Carvallo, Aurelien Rodriguez
Abstract:
This paper aims to detect anomalies in streaming data using machine learning algorithms. To this end, we designed two separate pipelines and evaluated the effectiveness of each separately. The first pipeline, based on supervised machine learning methods, consists of two phases. In the first phase, we trained several supervised models using the UNSW-NB15 dataset. We measured the efficiency of each using different performance metrics and selected the best model for the second phase. At the beginning of the second phase, we sniffed a local area network using Argus Server. Several types of attacks were simulated, and the sniffed data were then sent to a running algorithm at short intervals. This algorithm can display the results for each packet of received data in real time using the trained model. The second pipeline presented in this paper is based on unsupervised algorithms, in which a Temporal Graph Network (TGN) is used to monitor a local network. The TGN is trained to predict the probability of future states of the network based on its past behavior. Our contribution in this section is the introduction of an indicator to identify anomalies from these predicted probabilities.
Keywords: temporal graph network, anomaly detection, cyber security, IDS
Procedia PDF Downloads 105
23624 Diabetes Diagnosis Model Using Rough Set and K- Nearest Neighbor Classifier
Authors: Usiobaifo Agharese Rosemary, Osaseri Roseline Oghogho
Abstract:
Diabetes is a complex group of diseases with a variety of causes; it is a disorder of the body's metabolism in the digestion of carbohydrates. The application of machine learning in the field of medical diagnosis has been the focus of many researchers, and the use of recognition and classification models as decision support tools has helped medical experts in the diagnosis of diseases. Considering the large volume of medical data, which requires special techniques, experience, and high diagnostic skill in the diagnosis of diseases, an artificial intelligence system that assists medical personnel in enhancing their efficiency and accuracy in diagnosis would be an invaluable tool. This study proposes a diabetes diagnosis model using rough set and the K-nearest neighbor classifier algorithm. The system consists of two modules: the feature extraction module and the predictor module. Rough set is used to preprocess the attributes, while the K-nearest neighbor classifier is used to classify the given data. The dataset used for this model was taken from the University of Benin Teaching Hospital (UBTH) database. Half of the data was used for training, while the other half was used for testing the system. The proposed model was able to achieve over 80% accuracy.
Keywords: classifier algorithm, diabetes, diagnostic model, machine learning
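The predictor module's K-nearest neighbor step can be illustrated with a small majority-vote classifier. The two features, their values, and the labels below are made up for the example, and the rough set preprocessing stage is omitted because the abstract does not describe it in enough detail to reproduce.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    # Sort training points by Euclidean distance to the query, keep the k closest.
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical records: (fasting glucose in mg/dL, body mass index).
train_x = [(85, 22), (90, 24), (95, 23), (160, 31), (170, 33), (180, 35)]
train_y = ["negative", "negative", "negative",
           "positive", "positive", "positive"]

high = knn_predict(train_x, train_y, (165, 32), k=3)
low = knn_predict(train_x, train_y, (88, 23), k=3)
print(high, low)
```

In practice the features would be scaled before computing distances, since otherwise the attribute with the largest numeric range dominates the vote.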
Procedia PDF Downloads 337
23623 Neural Network-based Risk Detection for Dyslexia and Dysgraphia in Sinhala Language Speaking Children
Authors: Budhvin T. Withana, Sulochana Rupasinghe
Abstract:
Dyslexia and Dysgraphia, two learning disabilities that affect reading and writing abilities, respectively, are a major concern for the educational system. Due to the complexity and uniqueness of the Sinhala language, these conditions are especially difficult for children who speak it. Traditional risk detection methods for Dyslexia and Dysgraphia frequently rely on subjective assessments, making broad risk detection difficult and time-consuming. As a result, diagnoses may be delayed and opportunities for early intervention may be lost. The project developed a hybrid model that uses various deep learning techniques to detect the risk of Dyslexia and Dysgraphia. Specifically, ResNet50, VGG16, and YOLOv8 were integrated to detect handwriting issues, and their outputs were fed into an MLP model along with several other input data. The hyperparameters of the MLP model were fine-tuned using Grid Search CV, which allowed the optimal values to be identified for the model. This approach proved effective in accurately predicting the risk of Dyslexia and Dysgraphia, providing a valuable tool for early detection of and intervention in these conditions. The ResNet50 model achieved an accuracy of 0.9804 on the training data and 0.9653 on the validation data. The VGG16 model achieved an accuracy of 0.9991 on the training data and 0.9891 on the validation data. The MLP model achieved a training accuracy of 0.99918 and a testing accuracy of 0.99223, with a loss of 0.01371. These results demonstrate that the proposed hybrid model achieved a high level of accuracy in predicting the risk of Dyslexia and Dysgraphia.
Keywords: neural networks, risk detection system, Dyslexia, Dysgraphia, deep learning, learning disabilities, data science
Procedia PDF Downloads 119
23622 A Critical Analysis on Gaps Associated with Culture Policy Milieu Governing Traditional Male Circumcision in the Eastern Cape, South Africa
Authors: Thanduxolo Nomngcoyiya, Simon M. Kang’ethe
Abstract:
The paper aimed to critically analyse gaps pertaining to the cultural policy environments governing traditional male circumcision in the Eastern Cape, as exemplified by an empirical case study. The original study from which this paper is derived utilized a qualitative paradigm and encompassed 28 participants. It used in-depth one-on-one interviews, complemented by focus group discussions and key informants, as its method of data collection. It also adopted an interview guide as the data collection instrument. The original study was cross-sectional in nature, and the data were audio recorded and later transcribed during the data analysis and coding process. The data were analysed using content thematic analysis, which identified the following major findings on the culture of male circumcision policy: lack of clarity on the operations of the culture of male circumcision policy; myths surrounding procedures in the culture of male circumcision; divergent views on cultural policies between government and male circumcision custodians; unclear cultural policies on selection criteria for practitioners; and lack of policy enforcement and implementation against transgressors of the culture of male circumcision. The paper recommended stringent selection criteria for practitioners, the need to carry out death-free male circumcision, and the need for male circumcision stakeholders to work with other culture- and tradition-friendly stakeholders.
Keywords: human rights, policy enforcement, traditional male circumcision, traditional surgeons and nurses
Procedia PDF Downloads 298
23621 River Network Delineation from Sentinel 1 Synthetic Aperture Radar Data
Authors: Christopher B. Obida, George A. Blackburn, James D. Whyatt, Kirk T. Semple
Abstract:
In many regions of the world, especially in developing countries, river network data are outdated or completely absent, yet such information is critical for supporting important functions such as flood mitigation efforts, land use and transportation planning, and the management of water resources. In this study, a method was developed for delineating river networks using Sentinel 1 imagery. Unsupervised classification was applied to multi-temporal Sentinel 1 data to discriminate water bodies from other land covers, and then the outputs were combined to generate a single persistent water bodies product. A thinning algorithm was then used to delineate river centre lines, which were converted into vector features and built into a topologically structured geometric network. The complex river system of the Niger Delta was used to compare the performance of the Sentinel-based method against alternative freely available water body products from the United States Geological Survey, the European Space Agency, and OpenStreetMap, and a river network derived from a Shuttle Radar Topography Mission Digital Elevation Model. From both raster-based and vector-based accuracy assessments, it was found that the Sentinel-based river network products were superior to the comparator data sets by a substantial margin. The geometric river network that was constructed permitted a flow routing analysis, which is important for a variety of environmental management and planning applications. The extracted network will potentially be applied to modelling the dispersion of hydrocarbon pollutants in Ogoniland, a part of the Niger Delta. The approach developed in this study holds considerable potential for generating up-to-date, detailed river network data for the many countries where such data are deficient.
Keywords: Sentinel 1, image processing, river delineation, large scale mapping, data comparison, geometric network
Procedia PDF Downloads 140
23620 Modeling Local Warming Trend: An Application of Remote Sensing Technique
Authors: Khan R. Rahaman, Quazi K. Hassan
Abstract:
Global changes in climate, environment, economies, populations, governments, institutions, and cultures converge in localities. Changes at a local scale, in turn, contribute to global changes as well as being affected by them. Our hypothesis is built on the consideration that temperature varies at the local level (termed local warming) in comparison to the models predicted at the regional and/or global scale. To date, the bulk of the research relating local places to global climate change has been top-down, from the global toward the local, concentrating on methods of impact analysis that use as a starting point climate change scenarios derived from global models, even though these have little regional or local specificity. Thus, our focus is to understand such trends over southern Alberta, which will enable decision makers, scientists, the research community, and local people to adapt their policies based on local-level temperature variations and to act accordingly. The specific objectives of this study are: (i) to understand the local warming (temperature in particular) trend in the context of temperature normals during the period 1961-2010 at point locations using meteorological data; (ii) to validate the data by using specific yearly data; and (iii) to delineate the spatial extent of the local warming trends and understand the influential factors so that local governments can adapt to the situation. Existing data provide evidence of such changes, and future research will emphasize validating this hypothesis using remotely sensed data (i.e., MODIS products from NASA).
Keywords: local warming, climate change, urban area, Alberta, Canada
Procedia PDF Downloads 349
23619 Development of Electroencephalograph Collection System in Language-Learning Self-Study System That Can Detect Learning State of the Learner
Authors: Katsuyuki Umezawa, Makoto Nakazawa, Manabu Kobayashi, Yutaka Ishii, Michiko Nakano, Shigeichi Hirasawa
Abstract:
This research aims to develop a self-study system equipped with an artificial teacher who gives advice to students by detecting the learners and to evaluate language learning in a unified framework. 'Detecting the learners' means that the system understands the learners' learning conditions, such as each learner’s degree of understanding, the difference in each learner’s thinking process, the degree of concentration or boredom in learning, and problem solving for each learner, which can be interpreted from learning behavior. In this paper, we propose a system to efficiently collect brain waves from learners, focusing only on brain waves among the biological information available for 'detecting the learners'. The conventional method of electroencephalograph (EEG) measurement during learning, which uses a simple EEG, has the following disadvantages: (1) the start and end of EEG measurement must be triggered manually by the experiment participant or staff; (2) a weak EEG signal may go unnoticed, so the data may not be obtained; and (3) since the acquired EEG data are stored on each PC, the recorded time of data acquisition may differ between PCs. We therefore developed a system that collects brain wave data on the server side, which overcomes the above disadvantages.
Keywords: artificial teacher, e-learning, self-study system, simple EEG
Procedia PDF Downloads 147
23618 Characterization of Optical Communication Channels as Non-Deterministic Model
Authors: Valentina Alessandra Carvalho do Vale, Elmo Thiago Lins Cöuras Ford
Abstract:
Telecommunications sectors are increasingly adopting optical technologies because of their ability to transmit large amounts of data over long distances. However, as in all data transmission systems, optical communication channels suffer from undesirable, non-deterministic effects, and it is essential to understand them. This research therefore assesses these effects, characterizes them, and examines how they might be put to beneficial use.
Keywords: optical communication, optical fiber, non-deterministic effects, telecommunication
Procedia PDF Downloads 788
23617 Liquefaction Potential Assessment Using Screw Driving Testing and Microtremor Data: A Case Study in the Philippines
Authors: Arturo Daag
Abstract:
The Philippine Institute of Volcanology and Seismology (PHIVOLCS) is enhancing its liquefaction hazard map towards a detailed probabilistic approach using Screw Driving Testing (SDS) and geophysical data. The target sites for liquefaction assessment are public schools in Metro Manila. Since the target sites are in a highly urbanized setting, the objective of the project is to conduct non-destructive geotechnical studies using SDS combined with geophysical data such as refraction microtremor array (ReMi), 3-component microtremor Horizontal to Vertical Spectral Ratio (HVSR), and ground penetrating RADAR (GPR). Initial tests were conducted in Pampanga Province, in areas affected by liquefaction during the Mw 6.1 Central Luzon earthquake of April 22, 2019. Numerous accounts of liquefaction events were documented in areas underlain by quaternary alluvium and mostly covered by recent lahar deposits. SDS estimated values showed a good correlation with actual SPT values obtained from available borehole data. This confirms that SDS can be an alternative tool for liquefaction assessment that is more efficient in terms of cost and time than SPT and CPT. Borehole drilling may be limited by access in highly urbanized areas. In order to extend or extrapolate the SPT borehole data, non-destructive geophysical equipment was used. A 3-component microtremor obtains a 1-D subsurface model of seismic shear wave velocity in the upper 30 meters of the profile (Vs30). For the ReMi, surveys were conducted with a 12-geophone array at 6 to 8-meter spacing. Microtremor data were evaluated through the Factor of Safety, which is the quotient of Cyclic Resistance Ratio (CRR) and Cyclic Stress Ratio (CSR). Complementary GPR was used to infer subsurface structures and groundwater conditions.
Keywords: screw drive testing, microtremor, ground penetrating RADAR, liquefaction
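The Factor of Safety mentioned in the abstract above can be illustrated with a minimal Python sketch. It only shows the quotient CRR / CSR; how CRR and CSR are derived from SDS or microtremor data is not described in the abstract and is not attempted here. The function names, the input values, and the threshold of 1.0 (a common convention, not stated in the abstract) are assumptions for illustration.

```python
def factor_of_safety(crr: float, csr: float) -> float:
    """Return the Factor of Safety as the quotient CRR / CSR."""
    if csr <= 0:
        raise ValueError("CSR must be positive")
    return crr / csr


def is_potentially_liquefiable(crr: float, csr: float, threshold: float = 1.0) -> bool:
    """Flag a layer when its Factor of Safety falls below the threshold.

    The default threshold of 1.0 is an assumed convention, not a value
    taken from the abstract.
    """
    return factor_of_safety(crr, csr) < threshold


# Hypothetical values for a single soil layer (not from the study).
example_fs = factor_of_safety(crr=0.2, csr=0.4)
example_flag = is_potentially_liquefiable(crr=0.2, csr=0.4)
```

With these made-up inputs the resistance is half the imposed cyclic stress, so the layer would be flagged under the assumed threshold.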
Procedia PDF Downloads 203
23616 Association Rules Mining Task Using Metaheuristics: Review
Authors: Abir Derouiche, Abdesslem Layeb
Abstract:
Association Rule Mining (ARM) is one of the most popular data mining tasks, and it is widely used in various areas. The search for association rules is an NP-complete problem, which is why metaheuristics have been widely used to solve it. The present paper presents ARM as an optimization problem and surveys the metaheuristic-based approaches proposed in the literature.
Keywords: Optimization, Metaheuristics, Data Mining, Association rules Mining
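To make the idea of ARM as an optimization problem concrete, the following Python sketch computes the standard support and confidence of a rule and combines them into a single score that a metaheuristic could try to maximize. The weighted-sum `fitness` function, its weight `w`, and the toy transactions are assumptions for illustration; the abstract does not specify which objective functions the surveyed approaches use.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in the itemset."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    a = frozenset(antecedent)
    sup_a = support(a, transactions)
    if sup_a == 0:
        return 0.0
    return support(a | frozenset(consequent), transactions) / sup_a


def fitness(antecedent: Iterable[str], consequent: Iterable[str],
            transactions: List[FrozenSet[str]], w: float = 0.5) -> float:
    """Hypothetical objective: weighted sum of support and confidence."""
    a, c = frozenset(antecedent), frozenset(consequent)
    return w * support(a | c, transactions) + (1 - w) * confidence(a, c, transactions)


# Toy transaction database (invented for illustration).
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support({"bread", "milk"}, TRANSACTIONS)          # 2 of 4 transactions
rule_confidence = confidence({"bread"}, {"milk"}, TRANSACTIONS)  # 2 of the 3 with bread
```

A metaheuristic such as a genetic algorithm would encode candidate rules and search for those with a high score, rather than enumerating every itemset.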
Procedia PDF Downloads 163
23615 Ubiquitous Life People Informatics Engine (U-Life PIE): Wearable Health Promotion System
Authors: Yi-Ping Lo, Shi-Yao Wei, Chih-Chun Ma
Abstract:
Since Google launched Google Glass in 2012, a number of commercial wearable devices have been released, such as smart belts, smart bands, smart shoes, and smart clothes. However, most of these devices act as sensors that show measurement readings, and few of them provide interactive feedback to the user. Furthermore, these are single-task devices that cannot communicate with each other. In this paper, a new health promotion system, Ubiquitous Life People Informatics Engine (U-Life PIE), is presented. This engine consists of the People Informatics Engine (PIE) and an interactive user interface. PIE collects all the data from compatible devices, analyzes the data comprehensively, and communicates between devices via various application programming interfaces. All data and information are stored on the PIE unit, so users can view instant and historical data on their mobile devices at any time. It also provides real-time, hands-free feedback and instructions through the user interface visually, acoustically, and tactilely. This feedback and these instructions prompt users to adjust their posture or habits in order to avoid physical injuries and prevent illness.
Keywords: machine learning, wearable devices, user interface, user experience, internet of things
Procedia PDF Downloads 294
23614 Study and Conservation of Cultural and Natural Heritages with the Use of Laser Scanner and Processing System for 3D Modeling Spatial Data
Authors: Julia Desiree Velastegui Caceres, Luis Alejandro Velastegui Caceres, Oswaldo Padilla, Eduardo Kirby, Francisco Guerrero, Theofilos Toulkeridis
Abstract:
It is fundamental to conserve sites of natural and cultural heritage with any available technique or existing preservation methodology in order to sustain them for future generations. We propose a further approach to protecting the present appearance of such sites, in which high-technology instrumentation is used to digitally preserve natural and cultural heritage, applied here in Ecuador. In this project, laser technology is used to produce three-dimensional models with high accuracy in a relatively short period of time. So far, there are no records in Ecuador of the use and processing of data obtained with this new technology. The importance of the project lies in its description of the methodology of the laser scanner system using the Faro Laser Scanner Focus 3D 120, the method for 3D modeling of geospatial data, and the development of virtual environments in the areas of cultural and natural heritage. To inform users about this technology, the resulting three-dimensional models have been prepared so that they can be displayed in all kinds of digital formats. The resulting 3D models demonstrate that this technology is extremely useful in these areas, but also indicate that each data campaign needs its own, slightly different procedure, from data capture through processing, to obtain the chosen virtual environments.
Keywords: laser scanner system, 3D model, cultural heritage, natural heritage
Procedia PDF Downloads 310
23613 Marginalized Two-Part Joint Models for Generalized Gamma Family of Distributions
Authors: Mohadeseh Shojaei Shahrokhabadi, Ding-Geng (Din) Chen
Abstract:
Positive continuous outcomes with a substantial number of zero values and incomplete longitudinal follow-up are quite common in medical cost data. To jointly model semi-continuous longitudinal cost data and survival data and to provide marginalized covariate effect estimates, a marginalized two-part joint model (MTJM) has been developed for outcome variables with lognormal distributions. In this paper, we propose MTJM models for outcome variables from a generalized gamma (GG) family of distributions. The GG distribution constitutes a general family that includes nearly all of the most frequently used distributions, such as the Gamma, Exponential, Weibull, and Log-Normal. In the proposed MTJM-GG model, the conditional mean from a conventional two-part model with a three-parameter GG distribution is parameterized to provide the marginal interpretation for regression coefficients. In addition, MTJM-gamma and MTJM-Weibull are developed as special cases of MTJM-GG. To illustrate the applicability of the MTJM-GG, we applied the model to a set of real electronic health record data recently collected in Iran, and we provide SAS code for application. The simulation results showed that when the outcome distribution is unknown or misspecified, which is usually the case in real data sets, the MTJM-GG consistently outperforms other models. The GG family of distributions facilitates estimating a model with improved fit over the MTJM-gamma, standard Weibull, or Log-Normal distributions.
Keywords: marginalized two-part model, zero-inflated, right-skewed, semi-continuous, generalized gamma
Procedia PDF Downloads 177
23612 Proposing an Architecture for Drug Response Prediction by Integrating Multiomics Data and Utilizing Graph Transformers
Authors: Nishank Raisinghani
Abstract:
Efficiently predicting drug response remains a challenge in the realm of drug discovery. To address this issue, we propose four model architectures that combine graphical representation with varying positions of multiheaded self-attention mechanisms. By leveraging two types of multi-omics data, transcriptomics and genomics, we create a comprehensive representation of target cells and enable drug response prediction in precision medicine. A majority of our architectures utilize multiple transformer models, one with a graph attention mechanism and the other with a multiheaded self-attention mechanism, to generate latent representations of drug and omics data, respectively. Our model architectures apply an attention mechanism to both drug and multiomics data, with the goal of procuring more comprehensive latent representations. The latent representations are then concatenated and input into a fully connected network to predict the IC-50 score, a measure of cell drug response. We experiment with all four of these architectures and extract results from all of them. Our study contributes to the future of drug discovery and precision medicine by seeking to optimize the time and accuracy of drug response prediction.
Keywords: drug discovery, transformers, graph neural networks, multiomics
Procedia PDF Downloads 156
23611 Masked Candlestick Model: A Pre-Trained Model for Trading Prediction
Authors: Ling Qi, Matloob Khushi, Josiah Poon
Abstract:
This paper introduces a pre-trained Masked Candlestick Model (MCM) for trading time-series data. The pre-trained model is based on three core designs. First, we convert the trading price data at each data point into a set of normalized elements and produce embeddings of each element. Second, we generate a masked sequence of such embedded elements as inputs for self-supervised learning. Third, we use the encoder mechanism from the transformer to train the inputs. The masked model learns the contextual relations among the sequence of embedded elements, which can aid downstream classification tasks. To evaluate the performance of the pre-trained model, we fine-tune MCM for three different downstream classification tasks to predict future price trends. The fine-tuned models achieved better accuracy rates for all three tasks than the baseline models. To better analyze the effectiveness of MCM, we test the same architecture for three currency pairs, namely EUR/GBP, AUD/USD, and EUR/JPY. The experimental results demonstrate MCM’s effectiveness on all three currency pairs and indicate MCM’s capability for signal extraction from trading data.
Keywords: masked language model, transformer, time series prediction, trading prediction, embedding, transfer learning, self-supervised learning
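The first two design steps described above (normalizing each candlestick and masking part of the sequence) can be sketched in plain Python. This is only an illustration of the general masked-sequence idea: the normalization relative to the open price, the `"[MASK]"` placeholder, the 15% mask ratio, and all function names are assumptions, not details taken from the paper. The embedding and transformer encoder steps are not shown.

```python
import random
from typing import Dict, List, Tuple, Union

Candle = Tuple[float, float, float, float]       # (open, high, low, close)
Element = Union[Tuple[float, float, float], str]

MASK_TOKEN = "[MASK]"  # hypothetical placeholder for a hidden element


def normalize_candle(candle: Candle) -> Tuple[float, float, float]:
    """Express high, low, and close as relative changes from the open price.

    This particular normalization is an assumption made for illustration.
    """
    o, h, l, c = candle
    return ((h - o) / o, (l - o) / o, (c - o) / o)


def mask_sequence(elements: List[Element], mask_ratio: float = 0.15,
                  seed: int = 0) -> Tuple[List[Element], Dict[int, Element]]:
    """Replace a random subset of elements with MASK_TOKEN.

    Returns the masked sequence and a mapping from masked positions to the
    original elements, which a self-supervised model would learn to predict.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(elements) * mask_ratio))
    positions = rng.sample(range(len(elements)), n_mask)
    masked = list(elements)
    targets: Dict[int, Element] = {}
    for p in positions:
        targets[p] = masked[p]
        masked[p] = MASK_TOKEN
    return masked, targets


# Invented price data: ten candles with slowly rising opens.
candles: List[Candle] = [(1.0 + 0.01 * i, 1.02 + 0.01 * i, 0.99 + 0.01 * i, 1.01 + 0.01 * i)
                         for i in range(10)]
normalized: List[Element] = [normalize_candle(c) for c in candles]
masked_seq, targets = mask_sequence(normalized)
```

In a full pipeline, the masked sequence would be embedded and passed to an encoder trained to reconstruct the entries stored in `targets`.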
Procedia PDF Downloads 130
23610 Design of Traffic Counting Android Application with Database Management System and Its Comparative Analysis with Traditional Counting Methods
Authors: Muhammad Nouman, Fahad Tiwana, Muhammad Irfan, Mohsin Tiwana
Abstract:
Traffic congestion has been increasing significantly in major metropolitan areas as a result of increased motorization, urbanization, population growth, and changes in urban density. Traffic congestion compromises the efficiency of transport infrastructure and causes multiple traffic concerns, including but not limited to increased travel time, safety hazards, air pollution, and fuel consumption. Traffic management has become a serious challenge for federal and provincial governments, as well as for exasperated commuters. Effective, flexible, efficient, and user-friendly traffic information/database management systems characterize traffic conditions by making use of traffic counts for storage, processing, and visualization. While emerging data collection technologies continue to proliferate, their accuracy can be verified by comparing the observed data with manual handheld counts. This paper presents the design of a tablet-based manual traffic counting application and a framework for the development of a traffic database management system for Pakistan. The database management system comprises three components: a traffic counting Android application, an online database, and its visualization using Google Maps. An Oracle relational database was chosen to develop the data structure, whereas structured query language (SQL) was adopted to program the system architecture. The GIS application links the data from the database and projects it onto a dynamic map for visualizing traffic conditions. The traffic counting device and the example of a database application to a real-world problem provided a creative outlet to visualize the uses and advantages of a database management system in real time. Also, traffic counts collected by means of a handheld tablet/mobile application can be used for transportation planning and forecasting.
Keywords: manual count, emerging data sources, traffic information quality, traffic surveillance, traffic counting device, android, data visualization, traffic management
Procedia PDF Downloads 197