Search results for: maximal data sets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25280

25190 Combining Real Actors with Virtual Sets: The Future of Immersive Virtual Reality Fiction Cinema

Authors: Nefeli Dimitriadi

Abstract:

This paper aims to present immersive cinema where real actors are filmed and integrated in Virtual Reality environments and 360 cinematic narrative, in comparison to 360 filming of real actors and sets and to fully computer graphics animation movies with 3D avatars. Objectives: This research aims to present immersive cinema where real actors are integrated in Virtual Reality environments and 360 cinematic narrative as the future of immersive cinema. Methodology: A comparative analysis is conducted between filming real actors combined with Virtual Reality sets, 360 filming of real actors and sets, and fully computer-generated animation movies with 3D avatars, using the Virtual Reality movie Neurosynapses and others as case studies. Contribution: This research contributes to defining the best practices leading to impactful immersive cinematic narratives.

Keywords: virtual reality, 360 movies, immersive cinema, directing for virtual reality

Procedia PDF Downloads 103
25189 3D Point Cloud Model Color Adjustment by Combining Terrestrial Laser Scanner and Close Range Photogrammetry Datasets

Authors: M. Pepe, S. Ackermann, L. Fregonese, C. Achille

Abstract:

3D models obtained with advanced survey techniques such as close-range photogrammetry and laser scanning are nowadays particularly appreciated in the Cultural Heritage and Archaeology fields. In order to produce high-quality models representing archaeological evidence and anthropological artifacts, the appearance of the model (i.e., color), beyond its geometric accuracy, is not a negligible aspect. The integration of close-range photogrammetry survey techniques with the laser scanner is still a topic of study and research. Combining point cloud data sets of the same object generated with both technologies, or with the same technology but registered at different moments and/or under different natural light conditions, can produce a final point cloud with accentuated color dissimilarities. In this paper, a methodology is presented to make the different data sets uniform, to improve the chromatic quality, and to highlight further details by balancing the point color.

Keywords: color models, cultural heritage, laser scanner, photogrammetry

Procedia PDF Downloads 267
25188 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga, Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common as a way to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring an extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI offer a user-friendly platform that is accessible to non-technical professionals, allowing them to generate synthetic data to augment existing data for further analysis. The SDV also provides additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC curves and AUC values for the data generated with the added Gaussian copula layer are much higher than those for the data generated by the CTGAN alone.
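The comparison in this abstract rests on ROC/AUC values. As a rough illustration of that metric only, the sketch below computes AUC from classifier scores using the rank formulation. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the paper or from the SDV library.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as one half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores obtained on two synthetic data sets.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
scores_b = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
auc_a = roc_auc(labels, scores_a)
auc_b = roc_auc(labels, scores_b)
```

A higher AUC for one data set would suggest that a classifier trained on it separates the classes better, which is the kind of comparison the abstract reports.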

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 56
25187 Cloud-Based Multiresolution Geodata Cube for Efficient Raster Data Visualization and Analysis

Authors: Lassi Lehto, Jaakko Kahkonen, Juha Oksanen, Tapani Sarjakoski

Abstract:

The use of raster-formatted data sets in geospatial analysis is increasing rapidly. At the same time, geographic data are being introduced into disciplines outside the traditional domain of geoinformatics, like climate change, intelligent transport, and immigration studies. These developments call for better methods to deliver raster geodata in an efficient and easy-to-use manner. Data cube technologies have traditionally been used in the geospatial domain for managing Earth Observation data sets that have strict requirements for effective handling of time series. The same approach and methodologies can also be applied in managing other types of geospatial data sets. A cloud service-based geodata cube, called GeoCubes Finland, has been developed to support online delivery and analysis of the most important geospatial data sets with national coverage. The main target group of the service is the academic research institutes in the country. The most significant aspects of the GeoCubes data repository include the use of multiple resolution levels, a cloud-optimized file structure, and a customized, flexible content access API. Input data sets are pre-processed while being ingested into the repository to bring them into a harmonized form in aspects like georeferencing, sampling resolutions, spatial subdivision, and value encoding. All the resolution levels are created using an appropriate generalization method, selected depending on the nature of the source data set. Multiple pre-processed resolutions enable new kinds of online analysis approaches to be introduced. Analysis processes based on interactive visual exploration can be effectively carried out, as the level of resolution closest to the visual scale can always be used. In the same way, statistical analysis can be carried out on resolution levels that best reflect the scale of the phenomenon being studied. Access times remain close to constant, independent of the scale applied in the application.
The cloud service-based approach applied in the GeoCubes Finland repository enables analysis operations to be performed on the server platform, thus making high-performance computing facilities easily accessible. The developed GeoCubes API supports this kind of approach for online analysis. The use of cloud-optimized file structures in data storage enables the fast extraction of subareas. The access API allows for the use of vector-formatted administrative areas and user-defined polygons as definitions of subareas for data retrieval. Administrative areas of the country at four levels are readily available from the GeoCubes platform. In addition to direct delivery of raster data, the service also supports the so-called virtual file format, in which only a small text file is first downloaded. The text file contains links to the raster content on the service platform. The actual raster data is downloaded on demand, from the spatial area and resolution level required in each stage of the application. With the geodata cube approach, pre-harmonized geospatial data sets are made accessible to new categories of inexperienced users in an easy-to-use manner. At the same time, the multiresolution nature of the GeoCubes repository helps expert users introduce new kinds of interactive online analysis operations.
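The multiresolution idea described above can be sketched as follows. This is a minimal illustration, not the GeoCubes implementation: it assumes 2x2 block averaging as the generalization method and a doubling of cell size per level, and the names `downsample_mean`, `build_pyramid`, and `pick_level` are hypothetical.

```python
def downsample_mean(grid):
    """Halve the resolution by averaging each 2x2 block (one possible generalization)."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def build_pyramid(grid, base_cell_size, levels):
    """Return a list of (cell_size, grid) pairs, finest level first."""
    pyramid = [(base_cell_size, grid)]
    for _ in range(levels - 1):
        cell_size, current = pyramid[-1]
        pyramid.append((cell_size * 2, downsample_mean(current)))
    return pyramid


def pick_level(pyramid, target_cell_size):
    """Choose the pre-computed level whose cell size is closest to the requested scale."""
    return min(pyramid, key=lambda level: abs(level[0] - target_cell_size))


# Hypothetical 4x4 raster with 1 m cells.
raster = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
pyramid = build_pyramid(raster, base_cell_size=1.0, levels=3)
cell_size, level_grid = pick_level(pyramid, target_cell_size=2.2)
```

Because every level is computed at ingestion time, a request at any scale only reads one small, already generalized grid, which is consistent with the near-constant access times mentioned in the abstract.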

Keywords: cloud service, geodata cube, multiresolution, raster geodata

Procedia PDF Downloads 118
25186 Holomorphic Prioritization of Sets within Decagram of Strategic Decision Making of POSM Using Operational Research (OR): Analytic Hierarchy Process (AHP) Analysis

Authors: Elias Ogutu Azariah Tembe, Hussain Abdullah Habib Al-Salamin

Abstract:

There is a decagram of strategic decisions of operations and production/service management (POSM) within operational research (OR) that must collate, namely: design, inventory, quality, location, process and capacity, layout, scheduling, maintenance, and supply chain. This paper presents a conceptual framework that configures this decagram of decision sets as a mathematical complete graph and an abelian graph. Mathematically, a complete graph, whether undirected (UDG) or directed (DG), is a relationship in which every pair of vertices is connected, collated, confluent, and holomorphic. However, no study has yet prioritized the holomorphic sets of POSM within the OR field of study. This study uses the structured OR technique known as the Analytic Hierarchy Process (AHP) to organize, sort, and prioritize (rank) the sets within the decagram of POSM according to their attribution (propensity), and analyzes how the prioritization applies in the real world of the 21st century.
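To make the AHP ranking step concrete, here is a minimal sketch that derives priority weights from a pairwise comparison matrix. It uses the geometric-mean approximation, which is one common way to obtain AHP priorities and is not necessarily the variant used in the paper. The three criteria and the judgment values are hypothetical.

```python
import math


def ahp_priorities(matrix):
    """Approximate AHP priority weights with the row geometric-mean method.

    matrix[i][j] states how much more important criterion i is than
    criterion j; the matrix is assumed to be reciprocal.
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments for three of the decision sets.
criteria = ["design", "inventory", "quality"]
comparisons = [
    [1.0, 2.0, 4.0],
    [1.0 / 2.0, 1.0, 2.0],
    [1.0 / 4.0, 1.0 / 2.0, 1.0],
]
weights = ahp_priorities(comparisons)
ranking = sorted(zip(criteria, weights), key=lambda pair: pair[1], reverse=True)
```

With these perfectly consistent judgments the weights come out as 4/7, 2/7, and 1/7, so `design` ranks first. Real judgments are rarely this consistent, and a full AHP analysis would also check a consistency ratio.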

Keywords: holomorphic, decagram, decagon, confluent, complete graph, AHP analysis, SCM, HRM, OR, OM, abelian graph

Procedia PDF Downloads 393
25185 AM/E/c Queuing Hub Maximal Covering Location Model with Fuzzy Parameter

Authors: M. H. Fazel Zarandi, N. Moshahedi

Abstract:

The hub location problem appears in a variety of applications such as medical centers, firefighting facilities, cargo delivery systems, and telecommunication network design. The location of service centers has a strong influence on the congestion at each of them and, consequently, on the quality of service. This paper presents a fuzzy maximal hub covering location problem (FMCHLP) in which the travel cost between any pair of nodes is treated as a fuzzy variable. In order to consider the quality of service, we model each hub as a queue. Arrivals follow a Poisson process and service times follow an Erlang distribution. First, a nonlinear mathematical programming model is presented. Then, we convert it to a linear one. We solved the linear model using GAMS software for instances of up to 25 nodes; for larger instances, owing to the complexity of hub covering location problems, a simulated annealing algorithm was developed to solve and test the model. We also used the possibilistic c-means clustering method to find an initial solution.

Keywords: fuzzy modeling, location, possibilistic clustering, queuing

Procedia PDF Downloads 382
25184 An Interpretable Data-Driven Approach for the Stratification of the Cardiorespiratory Fitness

Authors: D. Mendes, J. Henriques, P. Carvalho, T. Rocha, S. Paredes, R. Cabiddu, R. Trimer, R. Mendes, A. Borghi-Silva, L. Kaminsky, E. Ashley, R. Arena, J. Myers

Abstract:

The exploration of clinically relevant predictive models continues to be an important pursuit. Cardiorespiratory fitness (CRF) conveys vital clinical information, and as such its accurate prediction is of high importance. Therefore, the aim of the current study was to develop a data-driven model, based on computational intelligence techniques and, in particular, clustering approaches, to predict CRF. Two prediction models were implemented and compared: 1) the traditional Wasserman/Hansen Equations; and 2) an interpretable clustering approach. Data used for this analysis were from the 'FRIEND - Fitness Registry and the Importance of Exercise: The National Data Base'; in the present study a subset of 10690 apparently healthy individuals was utilized. The accuracy of the models was assessed through the computation of sensitivity, specificity, and geometric mean values. The results show the superiority of the clustering approach in the accurate estimation of CRF (i.e., maximal oxygen consumption).
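The three evaluation measures named in the abstract can be computed from a confusion matrix as sketched below. The binary fitness labels (1 = low CRF, 0 = normal CRF) and the example predictions are hypothetical and are not taken from the FRIEND registry.

```python
import math


def stratification_metrics(actual, predicted):
    """Return (sensitivity, specificity, geometric mean) for binary labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    g_mean = math.sqrt(sensitivity * specificity)
    return sensitivity, specificity, g_mean


# Hypothetical labels: 1 = low CRF, 0 = normal CRF.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 1]
sensitivity, specificity, g_mean = stratification_metrics(actual, predicted)
```

The geometric mean rewards models that do well on both classes at once, which matters when the classes are imbalanced.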

Keywords: cardiorespiratory fitness, data-driven models, knowledge extraction, machine learning

Procedia PDF Downloads 273
25183 The Various Forms of a Soft Set and Its Extension in Medical Diagnosis

Authors: Biplab Singha, Mausumi Sen, Nidul Sinha

Abstract:

In order to deal with the impreciseness and uncertainty of a system, D. Molodtsov introduced the concept of the ‘Soft Set’ in 1999. Since then, a number of related definitions have been conceptualized. This paper includes a study on various forms of Soft Sets with examples. The paper contains the concepts of domain and co-domain of a soft set, conversion to one-one and onto function, matrix representation of a soft set and its relation with one-one function, upper and lower triangular matrix, transpose and Kernel of a soft set. This paper also gives the idea of the extension of soft sets in medical diagnosis. Here, two soft sets related to disease and symptoms are considered, and the diagnosis of the disease is worked out through appropriate examples using the AND and OR operations.
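The AND and OR operations mentioned above can be illustrated with a small sketch. It assumes the commonly used definitions in which a soft set maps each parameter to a subset of the universe, AND pairs parameters and intersects their sets, and OR pairs parameters and unites them. The patients, symptoms, and disease indicator are hypothetical, and the paper's own diagnostic procedure may differ.

```python
def soft_and(F, G):
    """(F, A) AND (G, B): maps each pair (a, b) to F[a] intersected with G[b]."""
    return {(a, b): F[a] & G[b] for a in F for b in G}


def soft_or(F, G):
    """(F, A) OR (G, B): maps each pair (a, b) to F[a] united with G[b]."""
    return {(a, b): F[a] | G[b] for a in F for b in G}


# Hypothetical universe of patients p1..p3.
symptoms = {
    "fever": {"p1", "p2"},
    "cough": {"p2", "p3"},
}
disease = {
    "flu": {"p2", "p3"},
}
both = soft_and(symptoms, disease)
either = soft_or(symptoms, disease)
```

Under these assumptions, `both[("fever", "flu")]` holds the patients who show fever and are also associated with flu, which is the kind of set a diagnostic rule would inspect.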

Keywords: kernel of a soft set, soft set, transpose of a soft set, upper and lower triangular matrix of a soft set

Procedia PDF Downloads 327
25182 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features

Authors: Bushra Zafar, Usman Qamar

Abstract:

Large sample sizes and high dimensionality reduce the effectiveness of conventional data mining methodologies. Data mining techniques are important tools for collecting knowledge from a variety of databases; they provide supervised learning in the form of classification to design models that describe vital data classes, where the structure of the classifier is based on the class attribute. Classification efficiency and accuracy are often influenced to a great extent by noisy and undesirable features in real application data sets. The inherent nature of a data set greatly masks its quality analysis and leaves us with quite few practical approaches to use. To our knowledge, we present for the first time a new approach for investigating the structure and quality of datasets by providing a targeted analysis of the localization of noisy and irrelevant features. Machine learning relies heavily on feature selection as a pre-processing step, which selects a small subset of the available features by reducing the feature space according to a certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching for a small set of important features that may result in good classification performance. For this purpose, a heuristic for wrapper-based feature selection using a genetic algorithm is used, together with an external classifier for discriminative feature selection. A feature is selected based on its number of occurrences in the chosen chromosomes. A sample dataset is used to demonstrate the proposed idea. The proposed method achieves an improved average accuracy of about 95% across different datasets. Experimental results illustrate that the proposed algorithm increases the accuracy of prediction of different diseases.
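The final selection step, choosing features by how often they occur in the chosen chromosomes, can be sketched as below. The sketch assumes chromosomes are bit lists (1 = feature included) and that the genetic algorithm and external classifier have already produced the chosen chromosomes. The threshold `min_count` and the example data are hypothetical.

```python
from collections import Counter


def select_by_occurrence(chromosomes, min_count):
    """Keep feature indices that appear in at least min_count chosen chromosomes."""
    counts = Counter()
    for chromosome in chromosomes:
        for index, bit in enumerate(chromosome):
            if bit:
                counts[index] += 1
    return sorted(index for index, count in counts.items() if count >= min_count)


# Hypothetical best chromosomes from several GA runs over five features.
chosen = [
    [1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 1, 0],
]
selected = select_by_occurrence(chosen, min_count=3)
```

Here features 0 and 3 occur in at least three of the four chromosomes, so they form the selected subset.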

Keywords: data mining, genetic algorithm, KNN algorithms, wrapper based feature selection

Procedia PDF Downloads 304
25181 Mining Multicity Urban Data for Sustainable Population Relocation

Authors: Xu Du, Aparna S. Varde

Abstract:

In this research, we propose to conduct diagnostic and predictive analysis about the key factors and consequences of urban population relocation. To achieve this goal, urban simulation models extract the urban development trends as land use change patterns from a variety of data sources. The results are treated as part of urban big data with other information such as population change and economic conditions. Multiple data mining methods are deployed on this data to analyze nonlinear relationships between parameters. The result determines the driving force of population relocation with respect to urban sprawl and urban sustainability and their related parameters. Experiments so far reveal that data mining methods discover useful knowledge from the multicity urban data. This work sets the stage for developing a comprehensive urban simulation model for catering to specific questions by targeted users. It contributes towards achieving sustainability as a whole.

Keywords: data mining, environmental modeling, sustainability, urban planning

Procedia PDF Downloads 283
25180 Effect of Class V Cavity Configuration and Loading Situation on the Stress Concentration

Authors: Jia-Yu Wu, Chih-Han Chang, Shu-Fen Chuang, Rong-Yang Lai

Abstract:

Objective: This study examined the stress distribution of teeth with different class V restorations under different loading situations and geometries by 3D finite element (FE) analysis. Methods: A series of FE models of mandibular premolars containing class V cavities were constructed using micro-CT. The class V cavities were assigned combinations of different cavity depths x occlusal-gingival heights: 1x2, 1x4, 2x2, and 2x4 mm. Three alveolar bone loss conditions were examined: 0, 1, and 2 mm. A 200 N force was exerted on the buccal cusp tip in various directions (vertical, V; oblique at a 30° angle, O; oblique and parallel to the individual occlusal cavity wall, P). A 3D FE analysis was performed, and the von Mises stress was used to summarize the stress distribution and maximum stress. Results: The maximal stress did not vary with alveolar bone height. For each geometry, the maximal stress was found at the bilateral corners of the cavity. The peak stress of restorations was significantly higher under load P than under loads V and O, while the latter two were similar. The 2x2 mm cavity exhibited significantly increased (2.88-fold) stress under load P compared to that under load V, followed by 1x2 mm (2.11-fold), 2x4 mm (1.98-fold), and 1x4 mm (1.1-fold). Conclusion: Load direction has the greatest impact on the stress results, while the effect of alveolar bone loss is minor. A load direction parallel to the cavity wall may enhance the stress concentration, especially in deep and narrow class V cavities.
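For reference, the von Mises stress used to summarize the FE results is a scalar computed from the six components of the stress tensor. The sketch below shows the standard formula. The function name and the sample stress values are illustrative and unrelated to the study's data.

```python
import math


def von_mises(sx, sy, sz, txy=0.0, tyz=0.0, tzx=0.0):
    """Von Mises equivalent stress from normal (s) and shear (t) components."""
    normal_part = 0.5 * ((sx - sy) ** 2 + (sy - sz) ** 2 + (sz - sx) ** 2)
    shear_part = 3.0 * (txy ** 2 + tyz ** 2 + tzx ** 2)
    return math.sqrt(normal_part + shear_part)


# Illustrative stress states in MPa (not from the study).
uniaxial = von_mises(100.0, 0.0, 0.0)
pure_shear = von_mises(0.0, 0.0, 0.0, txy=50.0)
hydrostatic = von_mises(80.0, 80.0, 80.0)
```

Uniaxial tension returns the applied stress itself, and a purely hydrostatic state returns zero, which is why the measure highlights distortion rather than uniform compression.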

Keywords: class v restoration, finite element analysis, loading situation, stress

Procedia PDF Downloads 233
25179 Using Gene Expression Programming in Learning Process of Rough Neural Networks

Authors: Sanaa Rashed Abdallah, Yasser F. Hassan

Abstract:

The paper introduces an approach in which rough sets, gene expression programming, and rough neural networks are used cooperatively for learning and classification support. The objective of the gene expression programming rough neural networks (GEP-RNN) approach is to obtain newly classified data with minimum error in the training and testing processes. The starting point of the GEP-RNN approach is an information system, and its output is a rough neural network structure, including the weights and thresholds, with minimum classification error.

Keywords: rough sets, gene expression programming, rough neural networks, classification

Procedia PDF Downloads 363
25178 BER of the Leaky Feeder under Rayleigh Fading Multichannel Reception with Imperfect Phase Estimation

Authors: Hasan Farahneh, Xavier Fernando

Abstract:

Leaky Feeder (LF) has been a proven technology for many decades, and it promises broadband wireless access over short ranges, but it has been overlooked until now. The LF is a natural MIMO transceiver, ideal for micro and pico cells. In this work, the LF is considered as a linear antenna array in a Multiple-Input Single-Output (MISO) configuration, and we derive the average bit error rate (BER) in a Rayleigh fading channel assuming ideal and independent paths (iid), that is, no correlation or mutual coupling between the transmit antennas (slots) or the receiver antenna, for QPSK modulation with imperfect phase estimation. We consider maximal ratio transmission (MRT) at the transmit end and maximal ratio combining (MRC) at the receiving end. Analytical expressions are derived for the BER with radiating cable transmitters. The effects of slot spacing and carrier frequency on the BER are also studied. Numerical evaluations show that the radiating cable transmitter offers a much lower BER than a single antenna transmitter with the same SNR.
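The combining step named in the abstract can be sketched as follows. This is a noise-free toy model of maximal ratio combining with an optional per-branch phase estimation error, not the paper's BER derivation. The channel gains, noise power, and function names are hypothetical.

```python
import cmath
import math


def mrc_combine(received, channel_gains, phase_errors=None):
    """Maximal ratio combining: weight each branch by the conjugate of its
    estimated gain. phase_errors (radians) models imperfect phase estimation."""
    if phase_errors is None:
        phase_errors = [0.0] * len(channel_gains)
    estimates = [h * cmath.exp(1j * e) for h, e in zip(channel_gains, phase_errors)]
    numerator = sum(g.conjugate() * r for g, r in zip(estimates, received))
    denominator = sum(abs(g) ** 2 for g in estimates)
    return numerator / denominator


def mrc_output_snr(channel_gains, noise_power, symbol_power=1.0):
    """With perfect estimates, the MRC output SNR is the sum of the branch SNRs."""
    return symbol_power * sum(abs(h) ** 2 for h in channel_gains) / noise_power


# Hypothetical two-branch channel and one unit-energy QPSK symbol.
symbol = (1 + 1j) / math.sqrt(2)
gains = [0.8 + 0.6j, 0.3 - 0.4j]
received = [h * symbol for h in gains]  # noise-free for clarity

perfect = mrc_combine(received, gains)
rotated = mrc_combine(received, gains, phase_errors=[0.1, 0.1])
snr = mrc_output_snr(gains, noise_power=0.5)
```

With perfect estimates the symbol is recovered exactly, while a common phase error of 0.1 rad rotates the combined symbol by the same amount, moving it toward a QPSK decision boundary. That rotation is the mechanism by which imperfect phase estimation raises the BER.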

Keywords: leaky feeder, BER, QPSK, rayleigh fading, channel gain, phase mismatch

Procedia PDF Downloads 366
25177 Structural Design Optimization of Reinforced Thin-Walled Vessels under External Pressure Using Simulation and Machine Learning Classification Algorithm

Authors: Lydia Novozhilova, Vladimir Urazhdin

Abstract:

An optimization problem for reinforced thin-walled vessels under uniform external pressure is considered. Conventional approaches to optimization generally start with pre-defined geometric parameters of the vessels and then employ analytic or numeric calculations and/or experimental testing to verify functionality, such as stability under the projected conditions. The proposed approach consists of two steps. First, the feasibility domain is identified in the multidimensional parameter space. Every point in the feasibility domain defines a design satisfying both geometric and functional constraints. Second, an objective function defined in this domain is formulated and optimized. The broader applicability of the suggested methodology is maximized by implementing the Support Vector Machines (SVM) classification algorithm of machine learning for identification of the feasible design region. Training data for the SVM classifier are obtained using the Simulation package of SOLIDWORKS®. Based on the data, the SVM algorithm produces a curvilinear boundary separating admissible and inadmissible sets of design parameters with maximal margins. Then the vessel parameters are optimized in the feasibility domain using standard algorithms for constrained optimization. As an example, optimization of a ring-stiffened closed cylindrical thin-walled vessel with semi-spherical caps under high external pressure is implemented. The von Mises stress criterion is used as the functional constraint, but any other stability constraint admitting a mathematical formulation can be incorporated into the proposed approach. The suggested methodology has good potential for reducing the design time needed to find optimal parameters of thin-walled vessels under uniform external pressure.

Keywords: design parameters, feasibility domain, von Mises stress criterion, Support Vector Machine (SVM) classifier

Procedia PDF Downloads 312
25176 Data Mining Meets Educational Analysis: Opportunities and Challenges for Research

Authors: Carla Silva

Abstract:

Recent developments in information and communication technology enable us to acquire, collect, and analyse data in various fields of socioeconomic-technological systems. Along with the increase of economic globalization and the evolution of information technology, data mining has become an important approach for economic data analysis. As a result, there is a critical need for automated approaches to the effective and efficient use of massive amounts of educational data, in order to support institutions in strategic planning and investment decision-making. In this article, we address data from several different perspectives and define the data applied to the sciences. Many believe that 'big data' will transform business, government, and other aspects of the economy. We discuss how new data may impact educational policy and educational research. Large-scale administrative data sets and proprietary private sector data can greatly improve the way we measure, track, and describe educational activity and educational impact. We also consider whether the big data predictive modeling tools that have emerged in statistics and computer science may prove useful in education and, furthermore, in economics. Finally, we highlight a number of challenges and opportunities for future research.

Keywords: data mining, research analysis, investment decision-making, educational research

Procedia PDF Downloads 342
25175 Predicting Medical Check-Up Patient Re-Coming Using Sequential Pattern Mining and Association Rules

Authors: Rizka Aisha Rahmi Hariadi, Chao Ou-Yang, Han-Cheng Wang, Rajesri Govindaraju

Abstract:

With the growing popularity of medical check-ups, a huge amount of medical check-up data is stored in databases and remains unused. These data can be very useful for future strategic planning if mined correctly. On the other hand, unpredictable patient arrivals and limited available facilities mean that the medical check-up service offered by hospitals is not maximal. To solve that problem, this study used the medical check-up data to predict patient re-coming. Sequential pattern mining (SPM) and association rules were chosen because these methods are suitable for predicting patient re-coming from sequential data. First, based on patient personal information, the data were grouped into … groups, and discriminant analysis was then performed to check the significance of the grouping. Second, for each group, frequent patterns were generated using the SPM method. Third, based on the frequent patterns of each group, pairs of variables were extracted using association rules to obtain a general pattern of re-coming patients. Last, the results are discussed and conclusions drawn to give some implications.
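The association-rule step can be illustrated with the two standard rule measures, support and confidence. The visit records and item names below are hypothetical and are not drawn from the study's data; the paper's exact rule-mining procedure is not specified in the abstract.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical check-up records: package taken and whether the patient came back.
visits = [
    {"basic_package", "returned"},
    {"basic_package", "returned"},
    {"basic_package"},
    {"premium_package", "returned"},
    {"premium_package"},
]
rule_support = support(visits, {"basic_package", "returned"})
rule_confidence = confidence(visits, {"basic_package"}, {"returned"})
```

A rule such as `basic_package -> returned` would only be kept if both measures exceed thresholds chosen by the analyst.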

Keywords: patient re-coming, medical check-up, health examination, data mining, sequential pattern mining, association rules, discriminant analysis

Procedia PDF Downloads 626
25174 Optimal Pricing Based on Real Estate Demand Data

Authors: Vanessa Kummer, Maik Meusel

Abstract:

Real estate demand estimates are typically derived from transaction data. However, in regions with excess demand, transactions are driven by supply and therefore do not indicate what people are actually looking for. To estimate the demand for housing in Switzerland, search subscriptions from all important Swiss real estate platforms are used. These data do, however, suffer from missing information—for example, many users do not specify how many rooms they would like or what price they would be willing to pay. In economic analyses, it is often the case that only complete data is used. Usually, however, the proportion of complete data is rather small which leads to most information being neglected. Also, the data might have a strong distortion if it is complete. In addition, the reason that data is missing might itself also contain information, which is however ignored with that approach. An interesting issue is, therefore, if for economic analyses such as the one at hand, there is an added value by using the whole data set with the imputed missing values compared to using the usually small percentage of complete data (baseline). Also, it is interesting to see how different algorithms affect that result. The imputation of the missing data is done using unsupervised learning. Out of the numerous unsupervised learning approaches, the most common ones, such as clustering, principal component analysis, or neural networks techniques are applied. By training the model iteratively on the imputed data and, thereby, including the information of all data into the model, the distortion of the first training set—the complete data—vanishes. In a next step, the performances of the algorithms are measured. This is done by randomly creating missing values in subsets of the data, estimating those values with the relevant algorithms and several parameter combinations, and comparing the estimates to the actual data. 
After having found the optimal parameter set for each algorithm, the missing values are being imputed. Using the resulting data sets, the next step is to estimate the willingness to pay for real estate. This is done by fitting price distributions for real estate properties with certain characteristics, such as the region or the number of rooms. Based on these distributions, survival functions are computed to obtain the functional relationship between characteristics and selling probabilities. Comparing the survival functions shows that estimates which are based on imputed data sets do not differ significantly from each other; however, the demand estimate that is derived from the baseline data does. This indicates that the baseline data set does not include all available information and is therefore not representative for the entire sample. Also, demand estimates derived from the whole data set are much more accurate than the baseline estimation. Thus, in order to obtain optimal results, it is important to make use of all available data, even though it involves additional procedures such as data imputation.
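The relationship between asking price and selling probability can be sketched with an empirical survival function. The abstract fits price distributions first, so this is a simplified stand-in that works directly on sampled values. The maximum prices below are hypothetical.

```python
def survival_function(max_prices):
    """Return S(price): the share of searchers willing to pay at least that price."""
    ordered = sorted(max_prices)
    n = len(ordered)

    def survival(price):
        return sum(1 for p in ordered if p >= price) / n

    return survival


# Hypothetical maximum prices (CHF) from search subscriptions for one segment.
willingness_to_pay = [500_000, 650_000, 700_000, 800_000, 1_000_000]
S = survival_function(willingness_to_pay)
share_at_700k = S(700_000)
share_at_400k = S(400_000)
share_at_1200k = S(1_200_000)
```

Comparing such curves for the complete-case data and for each imputed data set is one way to see whether the baseline demand estimate differs from the others.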

Keywords: demand estimate, missing-data imputation, real estate, unsupervised learning

Procedia PDF Downloads 272
25173 Airborne SAR Data Analysis for Impact of Doppler Centroid on Image Quality and Registration Accuracy

Authors: Chhabi Nigam, S. Ramakrishnan

Abstract:

This paper presents an analysis of airborne Synthetic Aperture Radar (SAR) data to study the impact of the Doppler centroid on image quality and geocoding accuracy from the perspective of the Stripmap mode of data acquisition. Although in Stripmap mode the radar beam points at 90 degrees broadside (side looking), a shift in the Doppler centroid is inevitable due to platform motion. Inaccurate estimation of the Doppler centroid leads to poor image quality and image misregistration. The effect of the Doppler centroid is analyzed in this paper using multiple sets of data collected from an airborne platform. Occurrences of ghost (ambiguous) targets and their power levels, which affect the appropriate choice of PRF, are analyzed. The effect of aircraft attitude (roll, pitch, and yaw) on the Doppler centroid is also analyzed with the collected data sets. The various stages of the Range Doppler Algorithm (RDA) used for image formation in Stripmap mode (range compression, Doppler centroid estimation, azimuth compression, and range cell migration correction) are analyzed to find the performance limits and the dependence of the final image on the imaging geometry. The ability of Doppler centroid estimation to enhance the imaging accuracy for registration is also illustrated. The paper also covers the processing of low-squint SAR data, the challenges involved, and the performance limits imposed by the imaging geometry and the platform dynamics on the final image quality metrics. Finally, the effect on various terrain types, including land, water, and bright scatterers, is presented.

Keywords: ambiguous target, Doppler Centroid, image registration, Airborne SAR

Procedia PDF Downloads 204
25172 Iterative Method for Lung Tumor Localization in 4D CT

Authors: Sarah K. Hagi, Majdi Alnowaimi

Abstract:

In the last decade, there have been immense advancements in medical imaging modalities. These modalities can scan the whole volume of the lung in high-resolution images within a short time. As a result, physicians can clearly identify the complicated anatomical and pathological structures of the lung. These advancements therefore offer great opportunities to improve all available types of lung cancer treatment and to increase the survival rate. However, lung cancer is still one of the major causes of death, with around 19% of all the cancer patients. Several factors may affect the survival rate. One serious factor is the breathing process, which can affect the accuracy of diagnosis and of the lung tumor treatment plan. We have therefore developed a semi-automated algorithm to localize the 3D lung tumor positions across all respiratory data during respiratory motion. The algorithm can be divided into two stages. First, the lung tumor is segmented in the first phase of the 4D computed tomography (CT) using an active contours method. Then, the 3D tumor position is localized across all subsequent phases using an affine transformation with 12 degrees of freedom. Two data sets were used in this study: a computer-simulated 4D CT using the extended cardiac-torso (XCAT) phantom, and clinical 4D CT data sets. The results and error calculation are presented as root mean square error (RMSE). The average error in the data sets is 0.94 ± 0.36 mm. Finally, the results were evaluated and quantitatively compared with a state-of-the-art registration algorithm. The results obtained from the proposed localization algorithm are promising for localizing a lung tumor in 4D CT data.
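The two quantitative ingredients of the abstract, a 12-degrees-of-freedom affine transformation and the RMSE error measure, can be sketched as follows. The 3x3 matrix plus 3-vector translation account for the 12 parameters. The coordinates (in mm) and the pure-translation example are hypothetical.

```python
import math


def apply_affine(point, matrix, translation):
    """Map a 3D point with a 3x3 matrix (9 parameters) plus a translation (3)."""
    x, y, z = point
    return tuple(
        matrix[i][0] * x + matrix[i][1] * y + matrix[i][2] * z + translation[i]
        for i in range(3)
    )


def rmse(estimated, reference):
    """Root mean square of the Euclidean distances between paired 3D points."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(estimated, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical tumor centroid in phase 1 and a 3 mm shift along z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
centroid = (10.0, 20.0, 30.0)
moved = apply_affine(centroid, identity, (0.0, 0.0, 3.0))

# Hypothetical estimated vs. reference positions over two later phases.
estimated = [moved, moved]
reference = [(10.0, 20.0, 33.0), (10.0, 20.0, 34.0)]
error_mm = rmse(estimated, reference)
```

In practice the 12 parameters would be found by a registration routine that optimizes image similarity between phases. The sketch only shows how an already estimated transform is applied and scored.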

Keywords: automated algorithm, computed tomography, lung tumor, tumor localization

Procedia PDF Downloads 591
25171 Different Sampling Schemes for Semi-Parametric Frailty Model

Authors: Nursel Koyuncu, Nihal Ata Tutkun

Abstract:

The frailty model is a survival model that takes unobserved heterogeneity into account when exploring the relationship between the survival of an individual and several covariates. In recent years, proposed survival models have become more complex, which causes convergence problems, especially in large data sets. Therefore, selecting a sample from these big data sets is very important for parameter estimation. In the sampling literature, some authors have defined new sampling schemes to estimate parameters correctly. To this end, we examine the effect of the sampling design in the semi-parametric frailty model. We conducted a simulation study in R to estimate the parameters of the semi-parametric frailty model for different sample sizes and censoring rates under classical simple random sampling and ranked set sampling schemes. In the simulation study, we used as the population a data set recording 17260 male civil servants aged 40–64 years with complete 10-year follow-up. Time to death from coronary heart disease is treated as the survival time, and age and systolic blood pressure are used as covariates. We selected 1000 samples from the population using the different sampling schemes and estimated the parameters. From the simulation study, we concluded that the ranked set sampling design performs better than simple random sampling in each scenario.
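
For readers unfamiliar with the two designs, the sketch below contrasts simple random sampling with the standard balanced form of ranked set sampling: in each cycle, draw as many sets as the set size, rank each set, and keep the i-th ranked unit from the i-th set. It is written in Python rather than R, ranks on the measured value itself (perfect ranking), and uses hypothetical function names; the paper's actual ranking variable and settings are not stated in the abstract.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n units without replacement."""
    return rng.sample(population, n)

def ranked_set_sample(population, set_size, cycles, rng, key=None):
    """Balanced ranked set sample of size set_size * cycles.

    For each cycle and each rank i, draw a fresh set of set_size units,
    sort it (by key if given), and keep only the i-th smallest unit.
    """
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = rng.sample(population, set_size)
            candidates.sort(key=key)
            sample.append(candidates[rank])
    return sample

rng = random.Random(2024)
population = list(range(1000))
srs = simple_random_sample(population, 20, rng)
rss = ranked_set_sample(population, set_size=5, cycles=4, rng=rng)
```

Both calls return 20 units, but the ranked set sample spreads its units across the order statistics, which is the usual argument for its efficiency.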

Keywords: frailty model, ranked set sampling, efficiency, simple random sampling

Procedia PDF Downloads 198
25170 Shoulder-Arm Mobility and Upper and Lower Extremity Muscle Function are Impaired in Patients with Systemic Sclerosis

Authors: F. Bringby, A. Nordin, L. Björnådal, E. Svenungsson, C. Boström, H. Alexanderson

Abstract:

Patients with systemic sclerosis (SSc) have reduced hand function and self-reported limitations in daily activities. Few studies have explored limitations in shoulder-arm mobility and muscle function, or whether there are differences in physical function between diffuse cutaneous (dcSSc) and limited cutaneous (lcSSc) SSc. The purpose of this study was to describe objectively assessed shoulder-arm mobility, lower extremity muscle function and muscle endurance in SSc and to evaluate possible differences between lcSSc and dcSSc. A total of 121 patients with SSc were included in this cross-sectional study. Shoulder-arm mobility was examined using the Shoulder Function Assessment Scale (SFA), which includes 5 tasks; lower extremity muscle function was measured by the Timed Stands Test (TST); and muscle endurance in the shoulder and hip flexors was assessed by the Functional Index 2 (FI-2). Patients with dcSSc had a median SFA “hand to back” score of 5 (4-6) and a median “hand to seat” score of 5 (4-6), compared to patients with lcSSc, with corresponding median values of 6 (4-6) and 6 (5-6), respectively (p<0.01-p<0.05). In both patient groups, 50% had lower muscle function as assessed by the TST compared to age- and gender-matched reference values, but there were no differences in TST between the two patient groups. There was no difference in FI-2 scores between dcSSc and lcSSc. The whole group had 40 (28-83) % and 38 (32-72) % of the maximal FI-2 shoulder flexion score on the right and left sides, and 40 (23-63) % and 37 (23-62) % of the maximal FI-2 hip flexion score on the right and left sides. Reference values for the FI-2 indicate that healthy individuals achieve, on average, 100 % of the maximal score. Patients with dcSSc were more limited than patients with lcSSc. Patients with SSc have reduced muscle function compared to reference values.
These results highlight the importance of assessing shoulder-arm mobility and muscle function, as well as the need for further research to identify exercise interventions that target these limitations.

Keywords: diffuse, limited, mobility, muscle function, physical therapy, systemic sclerosis

Procedia PDF Downloads 379
25169 Parallel Fuzzy Rough Support Vector Machine for Data Classification in Cloud Environment

Authors: Arindam Chaudhuri

Abstract:

Classification of data has been actively used as an effective and efficient means of conveying knowledge and information to users. The primary focus has always been on techniques for extracting useful knowledge from data such that returns are maximized. With the emergence of huge data sets, existing classification techniques often fail to produce desirable results. The challenge lies in analyzing and understanding the characteristics of massive data sets by retrieving useful geometric and statistical patterns. We propose a supervised parallel fuzzy rough support vector machine (PFRSVM) for data classification in a cloud environment. The classification is performed by PFRSVM using a hyperbolic tangent kernel. The fuzzy rough set model handles sensitivity to noisy samples and imprecision in training samples, bringing robustness to the results. The membership function is a function of the center and radius of each class in feature space and is represented with a kernel. It plays an important role in sampling the decision surface. The success of PFRSVM is governed by the choice of appropriate parameter values. The training samples are either linearly or nonlinearly separable. Different input points make unique contributions to the decision surface. The algorithm is parallelized in order to reduce training times. The system is built on a support vector machine library using the Hadoop implementation of MapReduce. The algorithm is tested on large data sets to check its feasibility and convergence. The performance of the classifier is also assessed in terms of the number of support vectors. The challenges encountered in implementing big data classification in machine learning frameworks are also discussed. The experiments are performed in the cloud environment available at University of Technology and Management, India. The results are illustrated for Gaussian RBF and Bayesian kernels.
The effect of variability in prediction and generalization of PFRSVM is examined with respect to the values of parameter C. PFRSVM effectively resolves the effects of outliers, class imbalance and overlapping class problems, generalizes to unseen data and relaxes the dependency between features and labels. The average classification accuracy of PFRSVM is better than that of other classifiers for both Gaussian RBF and Bayesian kernels. The experimental results on both synthetic and real data sets clearly demonstrate the superiority of the proposed technique.
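
The abstract does not give the membership formula, so the following is only a sketch of one common fuzzy-SVM choice: membership falls off linearly with the kernel-space distance from the class center, scaled by the class radius. The formula, the default parameters and the function names are all assumptions. The membership sketch uses a Gaussian RBF kernel because the hyperbolic tangent kernel is not positive semi-definite in general, which can make the squared kernel distance negative.

```python
import math

def tanh_kernel(x, y, scale=0.1, offset=0.0):
    """Hyperbolic tangent (sigmoid) kernel: tanh(scale * <x, y> + offset)."""
    return math.tanh(scale * sum(a * b for a, b in zip(x, y)) + offset)

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def class_memberships(samples, kernel=rbf_kernel, delta=1e-6):
    """Assumed membership rule: 1 - d_i / (radius + delta).

    d_i is the feature-space distance of sample i from the class center,
    computed from kernel values only; radius is the largest such distance.
    """
    n = len(samples)
    gram = [[kernel(a, b) for b in samples] for a in samples]
    center_norm = sum(sum(row) for row in gram) / (n * n)
    distances = []
    for i in range(n):
        squared = gram[i][i] - 2.0 * sum(gram[i]) / n + center_norm
        distances.append(math.sqrt(max(squared, 0.0)))  # guard rounding error
    radius = max(distances)
    return [1.0 - d / (radius + delta) for d in distances]

# One class with three tight samples and one outlier.
memberships = class_memberships([(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (5.0, 5.0)])
```

Under this rule the outlier receives a membership close to zero, so it contributes little to the decision surface.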

Keywords: FRSVM, Hadoop, MapReduce, PFRSVM

Procedia PDF Downloads 479
25168 Single-Cell Visualization with Minimum Volume Embedding

Authors: Zhenqiu Liu

Abstract:

Visualizing the heterogeneity within cell populations in single-cell RNA-seq data is crucial for studying the functional diversity of cells. However, because of the high levels of noise, outliers, and dropouts, it is very challenging to measure cell-to-cell similarity (distance) and to visualize and cluster the data in a low-dimensional space. Minimum volume embedding (MVE) projects the data into a lower-dimensional space and is a promising tool for data visualization. However, it is computationally inefficient to solve a semi-definite programming (SDP) problem when the sample size is large. MVE is therefore not applicable to single-cell RNA-seq data with thousands of samples. In this paper, we develop an efficient algorithm based on an accelerated proximal gradient method and use it to visualize single-cell RNA-seq data efficiently. We demonstrate that the proposed approach separates known subpopulations in single-cell data sets more accurately than other existing dimension reduction methods.

Keywords: single-cell RNA-seq, minimum volume embedding, visualization, accelerated proximal gradient method

Procedia PDF Downloads 215
25167 Heuristic Search Algorithm (HSA) for Enhancing the Lifetime of Wireless Sensor Networks

Authors: Tripatjot S. Panag, J. S. Dhillon

Abstract:

The lifetime of a wireless sensor network can be effectively increased by scheduling sensor operations. Once the sensors are randomly deployed, the task at hand is to find the largest number of disjoint sets of sensors such that every sensor set provides complete coverage of the target area. At any instant, only one of these disjoint sets is switched on, while all others are switched off. This paper proposes a heuristic search method to find the maximum number of disjoint sets that completely cover the region. A population of randomly initialized members is made to explore the solution space. A set of heuristics is applied to guide the members to a possible solution in their neighborhood. The heuristics accelerate the convergence of the algorithm. The best solution explored by the population is recorded and continuously updated. The proposed algorithm has been tested on applications that require sensing of multiple target points, referred to as point coverage applications. Results show that the proposed algorithm outperforms the existing algorithms. It always finds the optimum solution, and does so with fewer fitness function evaluations than the existing approaches.
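
To make the problem statement concrete, the sketch below builds disjoint covers with a plain greedy rule: repeatedly pick the unused sensor that covers the most still-uncovered targets, and start a new cover once every target is covered. This is a simple baseline for point coverage, not the population-based HSA proposed in the paper, and all names are hypothetical.

```python
def greedy_disjoint_covers(coverage, targets):
    """Greedy baseline for the disjoint set cover problem.

    coverage maps each sensor to the set of target points it can sense.
    Returns a list of covers; each cover is a list of sensors, and no
    sensor appears in more than one cover.
    """
    targets = set(targets)
    if not targets:
        return []
    unused = dict(coverage)
    covers = []
    while True:
        uncovered = set(targets)
        pool = dict(unused)
        cover = []
        while uncovered:
            best = max(pool, key=lambda s: len(pool[s] & uncovered), default=None)
            if best is None or not (pool[best] & uncovered):
                return covers  # remaining sensors cannot complete another cover
            cover.append(best)
            uncovered -= pool.pop(best)
        for sensor in cover:
            del unused[sensor]
        covers.append(cover)

sensors = {"a": {1, 2}, "b": {3}, "c": {1, 2, 3}, "d": {1}, "e": {2, 3}}
covers = greedy_disjoint_covers(sensors, targets={1, 2, 3})
```

Each cover found can be switched on in turn while the others sleep, so the number of covers is a proxy for network lifetime.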

Keywords: coverage, disjoint sets, heuristic, lifetime, scheduling, Wireless sensor networks, WSN

Procedia PDF Downloads 435
25166 Using Combination of Sets of Features of Molecules for Aqueous Solubility Prediction: A Random Forest Model

Authors: Muhammet Baldan, Emel Timuçin

Abstract:

Generally, absorption and bioavailability increase as solubility increases; therefore, predicting solubility is crucial in drug discovery applications. Molecular descriptors and molecular properties are traditionally used to predict water solubility. Various key descriptors are used for this purpose, namely Dragon descriptors, Morgan descriptors, MACCS keys, etc., and each has a different prediction capability, with varying success across data sets. Structural features are another commonly used source for solubility prediction. However, few to no studies combine three or more properties or descriptors to produce a more powerful prediction model. Unlike available models, we used a combination of those features in a random forest machine learning model to improve solubility prediction and thereby contribute to drug discovery systems.

Keywords: solubility, random forest, molecular descriptors, MACCS keys

Procedia PDF Downloads 24
25165 Determining Optimal Number of Trees in Random Forests

Authors: Songul Cinaroglu

Abstract:

Background: Random Forest is an efficient, multi-class machine learning method used for classification, regression and other tasks. It operates by constructing each tree from a different bootstrap sample of the data. Determining the number of trees in random forests is an open question in the literature on improving the classification performance of random forests. Aim: The aim of this study is to analyze whether there is an optimal number of trees in Random Forests and how the performance of Random Forests changes as the number of trees increases, using sample health data sets in R. Method: We analyzed the performance of Random Forests as the number of trees grows, doubling the number of trees at every iteration, using the “random forest” package in R. To determine the minimum and optimal number of trees, we used the McNemar test and the Area Under the ROC Curve, respectively. Results: The analysis showed that a growing number of trees does not always mean that the forest performs better than forests with fewer trees. In other words, a larger number of trees increases computational cost but does not increase performance. Conclusion: Although the general practice with random forests is to generate a large number of trees to obtain high performance, this study shows that increasing the number of trees does not always improve performance. Future studies can compare different kinds of data sets and different performance measures to test whether Random Forest performance changes as the number of trees increases.
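
The comparison step described in the Method can be sketched without any modelling library: generate the doubling schedule of forest sizes, then compare the predictions of two consecutive forests with the McNemar statistic (here with the usual continuity correction). The sketch is in Python rather than R, the function names are hypothetical, and the prediction lists are made-up placeholders rather than results from the study.

```python
def doubling_schedule(start, limit):
    """Forest sizes start, 2*start, 4*start, ... up to and including limit."""
    if start < 1:
        raise ValueError("start must be at least 1")
    sizes = []
    n = start
    while n <= limit:
        sizes.append(n)
        n *= 2
    return sizes

def mcnemar_statistic(y_true, pred_a, pred_b):
    """McNemar chi-square statistic with continuity correction.

    b counts cases only classifier A gets right, c counts cases only
    classifier B gets right; the statistic is (|b - c| - 1)^2 / (b + c).
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)

sizes = doubling_schedule(2, 64)
# Placeholder predictions from two hypothetical forests on ten cases.
truth = [1] * 10
small_forest = [1] * 8 + [0] * 2
large_forest = [0] * 6 + [1] * 4
statistic = mcnemar_statistic(truth, small_forest, large_forest)
```

A statistic below 3.841, the 5% critical value of the chi-square distribution with one degree of freedom, gives no evidence that the two forests differ in accuracy.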

Keywords: classification methods, decision trees, number of trees, random forest

Procedia PDF Downloads 384
25164 A New Heuristic Algorithm for Maximization Total Demands of Nodes and Number of Covered Nodes Simultaneously

Authors: Ehsan Saghehei, Mahdi Eghbali

Abstract:

The maximal covering location problem (MCLP) was originally developed to determine a set of facility locations that maximizes the total customer demand served by the facilities within a predetermined critical service criterion. However, in some problems where there are large differences in the demand of the covered nodes or in the number of nodes each node covers, methods for solving the MCLP may ignore these differences. In this paper, a heuristic solution is proposed based on ranking the demand of each node and the number of nodes covered by each node according to a predetermined critical value. The method maximizes the total demand of nodes and the number of covered nodes simultaneously. The solution algorithm is described through an example, and its results are compared with those of the Greedy and Lagrange algorithms. Results for larger problem sizes, compared with other methods, are also provided. A summary and future work conclude the paper.
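
For context, the classical greedy baseline for the MCLP can be sketched as follows: at each step, open the candidate site that adds the most not-yet-covered demand, until p sites are open. This is the kind of greedy method the abstract compares against, not the proposed ranking heuristic; the data layout and names are hypothetical.

```python
def greedy_mclp(demand, covers, p):
    """Greedy baseline for the maximal covering location problem.

    demand maps each node to its demand; covers maps each candidate site
    to the set of nodes within the critical service distance of that site.
    Returns the chosen sites and the total demand they cover.
    """
    chosen = []
    covered = set()
    for _ in range(p):
        best_site, best_gain = None, 0
        for site, nodes in covers.items():
            if site in chosen:
                continue
            gain = sum(demand[n] for n in nodes - covered)
            if gain > best_gain:
                best_site, best_gain = site, gain
        if best_site is None:
            break  # no remaining site adds any demand
        chosen.append(best_site)
        covered |= covers[best_site]
    return chosen, sum(demand[n] for n in covered)

demand = {1: 10, 2: 5, 3: 5, 4: 1}
covers = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}}
sites, total = greedy_mclp(demand, covers, p=2)
```

In this toy instance the second pick goes to site C rather than B because only newly covered demand counts toward the gain.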

Keywords: heuristic solution, maximal covering location problem, ranking, set covering

Procedia PDF Downloads 555
25163 Two-Photon-Exchange Effects in the Electromagnetic Production of Pions

Authors: Hui-Yun Cao, Hai-Qing Zhou

Abstract:

High-precision measurements and experiments play increasingly important roles in particle physics and atomic physics. To analyse precise experimental data sets, correspondingly precise and reliable theoretical calculations are necessary. The form factors of elemental constituents such as the pion and the proton remain attractive issues in Quantum Chromodynamics (QCD). In this work, the two-photon-exchange (TPE) effects in ep→enπ⁺ at small -t are discussed within a hadronic model. Under the pion dominance approximation and in the limit mₑ→0, the TPE contribution to the amplitude can be described by a scalar function. We calculate the TPE contributions to the amplitude and to the unpolarized differential cross section, considering only the elastic intermediate state. The results show that the TPE corrections to the unpolarized differential cross section range from about -4% to -20% at Q²=1-1.6 GeV². After applying the TPE corrections to the experimental data sets of the unpolarized differential cross section, we analyze the TPE corrections to the separated cross sections σ(L,T,LT,TT). We find that the TPE corrections (at Q²=1-1.6 GeV²) to σL range from about -10% to -30%, those to σT are about 20%, and those to σ(LT,TT) are much larger. From these analyses, we conclude that the TPE contributions in ep→enπ⁺ at small -t are important for extracting the separated cross sections σ(L,T,LT,TT) and the electromagnetic form factor of π⁺ in experimental analyses.

Keywords: differential cross section, form factor, hadronic, two-photon

Procedia PDF Downloads 117
25162 In Vivo Response of Scaffolds of Bioactive Glass-Ceramic

Authors: Ana Claudia Muniz Rennó, Karina Nogueira

Abstract:

This study aimed to investigate the in vivo tissue response to bioactive mesh (BM) scaffolds using a model of tibial bone defect implants in rats. Although a previous in vivo study demonstrated a highly positive response to particulate bioactive materials in the morphological and biomechanical properties of the bone callus, the effects of a material with superior bioactivity in the form of meshes have not yet been studied. Eighty male Wistar rats with 3 mm tibial defects were used. Animals were divided into four groups: intact group (IG) – tibia without any injury; bone defect day zero (0dG) – bone defects, sacrificed immediately after injury; bone defect control group (CG) – bone defects without any filler; and bone defect filled with BM scaffold (BM). The animals of the BM and CG groups were sacrificed 15, 30 and 45 days post-injury to compare the temporal-spatial effects of the scaffolds on bone healing. The histological analysis revealed organized newly formed bone at 30 and 45 days post-surgery in the BM group. This group also presented increased COX-2 expression on days 15 and 30 post-surgery. Furthermore, the immunohistochemistry analysis revealed that BM presented positive immunoexpression of RUNX-2 during all periods evaluated. The biomechanical analysis revealed that at 15 days after surgery, no statistically significant difference was observed between BM and CG, and both groups had significantly higher values of maximal load compared to 0dG and significantly lower values than IG. On days 30 and 45 post-surgery, BM presented statistically lower values of maximal load compared to the CG. Nevertheless, in the same periods, BM did not show a statistically significant difference compared to the IG maximal load values (p > 0.05). Our results revealed that the implantation of the BM scaffolds was effective in stimulating new bone formation.

Keywords: bone, biomaterials, scaffolds, cartilage

Procedia PDF Downloads 328
25161 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores

Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay

Abstract:

Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction, and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function with online hard negative mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using the Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.
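
The training objective mentioned above can be illustrated on plain Python tuples: for every anchor-positive pair in a batch, take the closest embedding from a different class as the hard negative and apply the hinge max(0, d(a, p) - d(a, n) + margin). This is a minimal sketch under assumed choices (Euclidean distance, a default margin of 0.2, hypothetical function names); the paper's exact mining strategy and hyperparameters are not given in the abstract.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def hard_negative_triplet_loss(embeddings, labels, margin=0.2):
    """Mean triplet loss over all anchor-positive pairs in a batch.

    For each anchor, the negative is mined online as the closest embedding
    with a different label. Anchors without a positive or a negative are
    skipped.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        negatives = [
            euclidean(anchor, e)
            for e, other in zip(embeddings, labels) if other != label
        ]
        if not negatives:
            continue
        hardest_negative = min(negatives)
        for j, (candidate, other) in enumerate(zip(embeddings, labels)):
            if j != i and other == label:
                positive = euclidean(anchor, candidate)
                losses.append(max(0.0, positive - hardest_negative + margin))
    return sum(losses) / len(losses) if losses else 0.0

# Toy 2D embeddings: two images of product "x" and one of product "y".
batch = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
loss = hard_negative_triplet_loss(batch, ["x", "x", "y"], margin=1.5)
```

In a real pipeline the same computation would run on encoder outputs inside the training loop, so the mined negatives change as the embeddings improve.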

Keywords: retail stores, faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition

Procedia PDF Downloads 136