Search results for: open dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4168

4078 Comparison between XGBoost, LightGBM and CatBoost Using a Home Credit Dataset

Authors: Essam Al Daoud

Abstract:

Gradient boosting methods have proven to be a very important strategy. Many successful machine learning solutions were developed using XGBoost and its derivatives. The aim of this study is to investigate and compare the efficiency of three gradient boosting methods. The Home Credit dataset used in this work contains 219 features and 356251 records. New features are generated, and several techniques are used to rank and select the best features. The implementation indicates that LightGBM is faster and more accurate than CatBoost and XGBoost across varying numbers of features and records.

Keywords: gradient boosting, XGBoost, LightGBM, CatBoost, home credit

Procedia PDF Downloads 170
4077 Strategies of Spatial Optimization for Open Space in the Old-Age Friendly City: An Investigation of the Behavior of the Elderly in Xicheng Square in Hangzhou

Authors: Yunxiang Fang

Abstract:

With the aging trend continuing to accelerate, open space is important for the daily life of the elderly, and its old-age friendliness is worthy of attention. Based on behavioral observation and literature research, this paper studies the behavior of the elderly in urban open space. Through the investigation, classification and quantitative analysis of the activity types, time characteristics and spatial behavior order of the elderly in Xicheng Square in Hangzhou, and drawing on the literature research, it summarizes the square space suitable for the psychological, physiological and activity needs of the elderly. Finally, suggestions for improving the old-age friendliness of Xicheng Square are put forward with respect to microclimate, safety and accessibility, space richness and service facility quality.

Keywords: behavior characteristics, old-age friendliness, open space, square

Procedia PDF Downloads 168
4076 PaSA: A Dataset for Patent Sentiment Analysis to Highlight Patent Paragraphs

Authors: Renukswamy Chikkamath, Vishvapalsinhji Ramsinh Parmar, Christoph Hewel, Markus Endres

Abstract:

Given a patent document, identifying distinct semantic annotations is an interesting research aspect. Text annotation helps the patent practitioners such as examiners and patent attorneys to quickly identify the key arguments of any invention, successively providing a timely marking of a patent text. In the process of manual patent analysis, to attain better readability, recognising the semantic information by marking paragraphs is in practice. This semantic annotation process is laborious and time-consuming. To alleviate such a problem, we proposed a dataset to train machine learning algorithms to automate the highlighting process. The contributions of this work are: i) we developed a multi-class dataset of size 150k samples by traversing USPTO patents over a decade, ii) articulated statistics and distributions of data using imperative exploratory data analysis, iii) baseline Machine Learning models are developed to utilize the dataset to address patent paragraph highlighting task, and iv) future path to extend this work using Deep Learning and domain-specific pre-trained language models to develop a tool to highlight is provided. This work assists patent practitioners in highlighting semantic information automatically and aids in creating a sustainable and efficient patent analysis using the aptitude of machine learning.

Keywords: machine learning, patents, patent sentiment analysis, patent information retrieval

Procedia PDF Downloads 89
4075 Green Open Space in Sustainable Housing and Islamic Values Perspectives – Case Study Kampung Kauman Malang

Authors: Nunik Junara, Sugeng Triyadi

Abstract:

Sustainable housing, from an Islamic perspective, can be defined as a multi-dimensional process that seeks to achieve a balance between economic and socio-cultural aspects on one side and the environmental aspect on the other. Many verses in the Quran and Hadith lead to the belief that Islam is Rahmatan lil Alamin, in which people are encouraged to act wisely in treating nature and all living things in it. One aspect of the natural environment that is close to humans is plants. In a settlement, the availability of plants, also called green open space, is highly recommended. The availability of green open space in the neighborhood, both public and private, is expected to reduce the effects of global warming that has engulfed various parts of the world. Green open space, which can be viewed from the eco-aesthetic and eco-medical angles in sustainable architecture, is expected to increase the temperature and provide an aesthetic impression to the surrounding environment. This paper attempts to discuss the principles of Islamic values related to the natural environment as a major resource for sustainability. This paper also aims to raise awareness of the importance of the theme of sustainability in settlements, especially in big cities. Analysis of the availability of green open space in Kampung Kauman Malang is one example of the effort to apply the principles of sustainable housing.

Keywords: green open space, sustainable housing, Islamic values, Kampung Kauman Malang

Procedia PDF Downloads 410
4074 Generation of High-Quality Synthetic CT Images from Cone Beam CT Images Using A.I. Based Generative Networks

Authors: Heeba A. Gurku

Abstract:

Introduction: Cone Beam CT (CBCT) images play an integral part in proper patient positioning in cancer patients undergoing radiation therapy treatment, but these images are low in quality. The purpose of this study is to generate high-quality synthetic CT images from CBCT using generative models. Material and Methods: This study utilized two datasets from The Cancer Imaging Archive (TCIA): 1) a lung cancer dataset of 20 patients (with full view CBCT images) and 2) a pancreatic cancer dataset of 40 patients (only the 27 patients having limited view images were included in the study). Cycle Generative Adversarial Networks (Cycle GAN) and its variant, Attention Guided Generative Adversarial Networks (AGGAN), were used to generate the synthetic CTs. Models were evaluated by visual evaluation and on four metrics, Structural Similarity Index Measure (SSIM), Peak Signal Noise Ratio (PSNR), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), to compare the synthetic CT and original CT images. Results: For the pancreatic dataset with limited view CBCT images, our study showed that with the Cycle GAN model, MAE, RMSE and PSNR improved from 12.57 to 8.49, 20.94 to 15.29 and 21.85 to 24.63, respectively, but structural similarity only marginally increased from 0.78 to 0.79. Similar results were achieved with AGGAN, with no improvement over Cycle GAN. However, for the lung dataset with full view CBCT images, Cycle GAN was able to reduce MAE significantly from 89.44 to 15.11, and AGGAN was able to reduce it to 19.77. Similarly, RMSE decreased from 92.68 to 23.50 with Cycle GAN and to 29.02 with AGGAN. SSIM and PSNR also improved significantly, from 0.17 to 0.59 and from 8.81 to 21.06 respectively with Cycle GAN, while with AGGAN SSIM increased to 0.52 and PSNR increased to 19.31. In both datasets, GAN models were able to reduce artifacts, reduce noise, and provide better resolution and better contrast enhancement.
Conclusion and Recommendation: Both Cycle GAN and AGGAN were able to significantly reduce MAE and RMSE and to increase PSNR in both datasets. However, the full view lung dataset showed more improvement in SSIM and image quality than the limited view pancreatic dataset.
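For readers unfamiliar with the error metrics named in the abstract above, the following is a minimal sketch of MAE, RMSE and PSNR on flattened pixel lists. The function names, the `data_range` of 255 and the sample values are assumptions for illustration and do not reproduce the study's evaluation code; SSIM is omitted because it needs windowed image statistics.

```python
import math


def mae(reference, synthetic):
    """Mean absolute error between two equally sized flat pixel lists."""
    return sum(abs(r - s) for r, s in zip(reference, synthetic)) / len(reference)


def rmse(reference, synthetic):
    """Root mean square error between two equally sized flat pixel lists."""
    squared = sum((r - s) ** 2 for r, s in zip(reference, synthetic))
    return math.sqrt(squared / len(reference))


def psnr(reference, synthetic, data_range=255.0):
    """Peak signal-to-noise ratio in dB; data_range is an assumed intensity span."""
    error = rmse(reference, synthetic)
    if error == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(data_range / error)


# Hypothetical pixel values, not taken from the TCIA datasets.
reference = [0.0, 50.0, 100.0, 150.0]
synthetic = [10.0, 50.0, 90.0, 150.0]

print(mae(reference, synthetic))
print(rmse(reference, synthetic))
print(psnr(reference, synthetic))
```

Lower MAE and RMSE indicate a closer match, whereas a higher PSNR indicates a closer match, which is why the results report PSNR as increasing.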

Keywords: CT images, CBCT images, cycle GAN, AGGAN

Procedia PDF Downloads 83
4073 Evaluation of Massive Open Online Course in a Rural Marginalized Area: Case Study of Alice Community, Eastern Cape, South Africa

Authors: Dare Ebenezer Fatumo, Olusesan Emmanuel Adelabu

Abstract:

Online learning has taken on another dimension through the introduction of Massive Open Online Courses (MOOCs), which have also become an important resource base for teaching and learning. This research aimed at investigating the use of MOOCs in a rural marginalized area. A descriptive survey research design was adopted to evaluate the awareness and usage of MOOCs in Alice community, Eastern Cape, South Africa. This study also employed a quantitative approach, using a self-structured questionnaire to elicit information from the respondents. The data collected were analyzed with the Statistical Package for Social Sciences (SPSS). The findings revealed, amongst others, the efficacy of MOOCs in fostering teaching and learning in rural marginalized areas. This study concludes that MOOCs are a veritable medium for busy or less privileged individuals to acquire a degree or certification. Therefore, the study recommends that MOOC platforms be fully embraced by people in rural marginalized areas and that awareness programs about their usefulness be propagated across municipalities nationwide.

Keywords: distance learning, information and communication technology, massive open online course, online learning, teaching and learning

Procedia PDF Downloads 177
4072 Vertebral Transverse Open Wedge Osteotomy in Correction of Thoracolumbar Kyphosis Resulting from Ankylosing Spondylitis

Authors: S. AliReza Mirghasemi, Amin Mohamadi, Zameer Hussain, Narges Rahimi Gabaran, Mir Mostafa Sadat, Shervin Rashidinia

Abstract:

In progressive cases of Ankylosing Spondylitis, patients will have high degrees of kyphosis leading to severe disabilities. Several operative techniques have been used in this stage, but little knowledge exists on the indications for and outcome of these methods. In this study, we examined the efficacy of monosegmental transverse open wedge osteotomy of L3 in 11 patients with progressive spinal kyphosis. The average correction was 36° (20 to 42) with no loss of correction after operation. The average operating time was 120 minutes (100 to 130) and the mean blood loss was 1500 ml (1100 to 2000). Osteotomy corrected all patients sufficiently to allow them to see ahead and their posture was improved. There were no fatal complications but one patient had paraplegia after the operation.

Keywords: ankylosing spondylitis, thoracolumbar kyphosis, open wedge osteotomy, L3 transverse open wedge osteotomy

Procedia PDF Downloads 392
4071 Dataset Quality Index: Development of Composite Indicator Based on Standard Data Quality Indicators

Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros

Abstract:

Nowadays, poor data quality is considered one of the major costs of a data project. A data project with data quality awareness devotes a large share of its time to data quality processes, while a data project without data quality awareness suffers negative impacts on financial resources, efficiency, productivity, and credibility. One of the processes that takes a long time is defining the expectations and measurements of data quality, because the expectations differ according to the purpose of each data project. This is especially true of big data projects, which may involve many datasets and stakeholders and therefore take a long time to discuss and define quality expectations and measurements. Therefore, this study aimed at developing meaningful indicators that describe the overall data quality of each dataset to support quick comparison and prioritization. The objectives of this study were to: (1) develop practical data quality indicators and measurements, (2) develop data quality dimensions based on statistical characteristics, and (3) develop a composite indicator that can describe the overall data quality of each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After the datasets were collected, the Dataset Quality Index (SDQI) was developed in five steps. First, we define standard data quality expectations. Second, we find indicators that can be measured directly on the data within datasets. Third, the indicators are aggregated into dimensions using factor analysis. Next, the indicators and dimensions are weighted by the effort required for the data preparation process and by usability. Finally, the dimensions are aggregated into the composite indicator. The results of these analyses showed that: (1) the developed indicators and measurements comprise ten indicators; (2) for the data quality dimensions based on statistical characteristics, the ten indicators can be reduced to 4 dimensions;
(3) the SDQI can describe the overall quality of each dataset and can separate datasets into 3 levels: Good Quality, Acceptable Quality, and Poor Quality. In conclusion, the SDQI provides an overall description of data quality within datasets and a meaningful composition. We can use the SDQI to assess all data in a data project and to support effort estimation and prioritization. The SDQI also works well with the Agile method by using the SDQI for assessment in the first sprint. After passing the initial evaluation, we can add more specific data quality indicators in the next sprint.
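The final aggregation step described above, weighting dimension scores and mapping the result to one of three levels, can be sketched as follows. This is only an illustration under stated assumptions: the four dimension names, the weights, the 0-to-1 score scale and the level thresholds are all hypothetical, and a plain weighted mean stands in for the factor-analysis-based aggregation used in the study.

```python
def weighted_mean(scores, weights):
    """Weighted average of scores; both dicts share the same keys."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def quality_level(index, good=0.8, acceptable=0.5):
    """Map an index in [0, 1] to a level; both thresholds are assumed values."""
    if index >= good:
        return "Good Quality"
    if index >= acceptable:
        return "Acceptable Quality"
    return "Poor Quality"


# Hypothetical dimension scores and weights for one dataset.
dimension_scores = {
    "completeness": 0.9,
    "consistency": 0.7,
    "uniqueness": 1.0,
    "validity": 0.6,
}
dimension_weights = {
    "completeness": 2.0,
    "consistency": 1.0,
    "uniqueness": 1.0,
    "validity": 1.0,
}

index = weighted_mean(dimension_scores, dimension_weights)
print(index, quality_level(index))
```

Because the level depends only on the index, datasets can be compared and prioritized by sorting on a single number.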

Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis

Procedia PDF Downloads 138
4070 Combined Surface Tension and Natural Convection of Nanofluids in a Square Open Cavity

Authors: Habibis Saleh, Ishak Hashim

Abstract:

Combined surface tension and natural convection heat transfer in an open cavity is studied numerically in this article. The cavity is filled with water-Cu nanofluids. The left wall is kept at a low temperature, the right wall at a high temperature, and the bottom and top walls are adiabatic. The top free surface is assumed to be flat and non-deformable. The finite difference method is applied to solve the dimensionless governing equations. It is found that adding the nanoparticles has an insignificant effect at about $Ma_{bf}=250$.

Keywords: natural convection, marangoni convection, nanofluids, square open cavity

Procedia PDF Downloads 549
4069 Enhancing Cultural Heritage Data Retrieval by Mapping COURAGE to CIDOC Conceptual Reference Model

Authors: Ghazal Faraj, Andras Micsik

Abstract:

The CIDOC Conceptual Reference Model (CRM) is an extensible ontology that provides integrated access to heterogeneous and digital datasets. The CIDOC-CRM offers a “semantic glue” intended to promote accessibility to several diverse and dispersed sources of cultural heritage data. That is achieved by providing a formal structure for the implicit and explicit concepts and their relationships in the cultural heritage field. The COURAGE (“Cultural Opposition – Understanding the CultuRal HeritAGE of Dissent in the Former Socialist Countries”) project aimed to explore methods about socialist-era cultural resistance during 1950-1990 and planned to serve as a basis for further narratives and digital humanities (DH) research. This project highlights the diversity of the alternative cultural scenes that flourished in Eastern Europe before 1989. Moreover, the dataset of COURAGE is an online RDF-based registry that consists of historical people, organizations, collections, and featured items. To increase the inter-links between different datasets and retrieve more relevant data from various data silos, a shared federated ontology for reconciled data is needed. As a first step towards these goals, a full understanding of the CIDOC CRM ontology (target ontology), as well as the COURAGE dataset, was required to start the work. Subsequently, the queries toward the ontology were determined, and a table of equivalent properties from COURAGE and CIDOC CRM was created. The structural diagrams that clarify the mapping process and construct queries are in progress to map person, organization, and collection entities to the ontology. Through mapping the COURAGE dataset to the CIDOC-CRM ontology, the dataset will have a common ontological foundation with several other datasets.
Therefore, the expected results are: 1) retrieving more detailed data about existing entities, 2) retrieving new entities’ data, 3) aligning COURAGE dataset to a standard vocabulary, 4) running distributed SPARQL queries over several CIDOC-CRM datasets and testing the potentials of distributed query answering using SPARQL. The next plan is to map CIDOC-CRM to other upper-level ontologies or large datasets (e.g., DBpedia, Wikidata), and address similar questions on a wide variety of knowledge bases.

Keywords: CIDOC CRM, cultural heritage data, COURAGE dataset, ontology alignment

Procedia PDF Downloads 144
4068 Predictive Analysis of Chest X-rays Using NLP and Large Language Models with the Indiana University Dataset and Random Forest Classifier

Authors: Azita Ramezani, Ghazal Mashhadiagha, Bahareh Sanabakhsh

Abstract:

This study researches the combination of Random Forest classifiers with large language models (LLMs) and natural language processing (NLP) to improve diagnostic accuracy in chest X-ray analysis using the Indiana University dataset. Utilizing advanced NLP techniques, the research preprocesses textual data from radiological reports to extract key features, which are then merged with image-derived data. This improved dataset is analyzed with Random Forest classifiers to predict specific clinical results, focusing on the identification of health issues and the estimation of case urgency. The findings reveal that the combination of NLP, LLMs, and machine learning not only increases diagnostic precision but also reliability, especially in quickly identifying critical conditions. Achieving an accuracy of 99.35%, the model shows significant advancements over conventional diagnostic techniques. The results emphasize the large potential of machine learning in medical imaging, suggesting that these technologies could greatly enhance clinician judgment and patient outcomes by offering quicker and more precise diagnostic approximations.

Keywords: natural language processing (NLP), large language models (LLMs), random forest classifier, chest x-ray analysis, medical imaging, diagnostic accuracy, indiana university dataset, machine learning in healthcare, predictive modeling, clinical decision support systems

Procedia PDF Downloads 42
4067 Plant Identification Using Convolution Neural Network and Vision Transformer-Based Models

Authors: Virender Singh, Mathew Rees, Simon Hampton, Sivaram Annadurai

Abstract:

Plant identification is a challenging task that aims to identify the family, genus, and species according to plant morphological features. Automated deep learning-based computer vision algorithms are widely used for identifying plants and can help users narrow down the possibilities. However, numerous morphological similarities between and within species render correct classification difficult. In this paper, we tested custom convolution neural network (CNN) and vision transformer (ViT) based models using the PyTorch framework to classify plants. We used a large dataset of 88,000 images provided by the Royal Horticultural Society (RHS) and a smaller dataset of 16,000 images from the PlantClef 2015 dataset for classifying plants at genus and species levels, respectively. Our results show that for classifying plants at the genus level, ViT models perform better compared to CNN-based models ResNet50 and ResNet-RS-420 and other state-of-the-art CNN-based models suggested in previous studies on a similar dataset. The ViT model achieved a top accuracy of 83.3% for classifying plants at the genus level. For classifying plants at the species level, ViT models perform better compared to CNN-based models ResNet50 and ResNet-RS-420, with a top accuracy of 92.5%. We show that the correct set of augmentation techniques plays an important role in classification success. In conclusion, these results could help end users, professionals and the general public alike in identifying plants more quickly and with improved accuracy.

Keywords: plant identification, CNN, image processing, vision transformer, classification

Procedia PDF Downloads 102
4066 VideoAssist: A Labelling Assistant to Increase Efficiency in Annotating Video-Based Fire Dataset Using a Foundation Model

Authors: Keyur Joshi, Philip Dietrich, Tjark Windisch, Markus König

Abstract:

In the field of surveillance-based fire detection, the volume of incoming data is increasing rapidly. However, the labeling of a large industrial dataset is costly due to the high annotation costs associated with current state-of-the-art methods, which often require bounding boxes or segmentation masks for model training. This paper introduces VideoAssist, a video annotation solution that utilizes a video-based foundation model to annotate entire videos with minimal effort, requiring the labeling of bounding boxes for only a few keyframes. To the best of our knowledge, VideoAssist is the first method to significantly reduce the effort required for labeling fire detection videos. The approach offers bounding box and segmentation annotations for the video dataset with minimal manual effort. Results demonstrate that the performance of labels annotated by VideoAssist is comparable to those annotated by humans, indicating the potential applicability of this approach in fire detection scenarios.

Keywords: fire detection, label annotation, foundation models, object detection, segmentation

Procedia PDF Downloads 6
4065 Artificial Neural Networks Application on Nusselt Number and Pressure Drop Prediction in Triangular Corrugated Plate Heat Exchanger

Authors: Hany Elsaid Fawaz Abdallah

Abstract:

This study presents a new artificial neural network (ANN) model to predict the Nusselt Number and pressure drop for the turbulent flow in a triangular corrugated plate heat exchanger for forced air and turbulent water flow. An experimental investigation was performed to create a new dataset for the Nusselt Number and pressure drop values in the following range of dimensionless parameters: the plate corrugation angles (from 0° to 60°), the Reynolds number (from 10000 to 40000), pitch to height ratio (from 1 to 4), and Prandtl number (from 0.7 to 200). Based on the ANN performance graph, the three-layer structure with {12-8-6} hidden neurons has been chosen. The training procedure includes back-propagation with the biases and weight adjustment, the evaluation of the loss function for the training and validation dataset and feed-forward propagation of the input parameters. The linear function was used at the output layer as the activation function, while for the hidden layers, the rectified linear unit activation function was utilized. In order to accelerate the ANN training, the loss function minimization may be achieved by the adaptive moment estimation algorithm (ADAM). The “MinMax” normalization approach was utilized to avoid the increase in the training time due to drastic differences in the loss function gradients with respect to the values of weights. Since the test dataset is not being used for the ANN training, a cross-validation technique is applied to the ANN network using the new data. This procedure was repeated until loss function convergence was achieved or for 4000 epochs with a batch size of 200 points. The program code was written in Python 3.0 using open-source ANN libraries such as scikit-learn, TensorFlow and Keras. The mean average percent error values of 9.4% for the Nusselt number and 8.2% for pressure drop for the ANN model have been achieved. Therefore, higher accuracy compared to the generalized correlations was achieved.
The performance validation of the obtained model was based on a comparison of predicted data with the experimental results yielding excellent accuracy.
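The “MinMax” normalization mentioned above rescales each input feature to the range 0 to 1 so that features with very different magnitudes (for example a Reynolds number of 40000 and a Prandtl number of 0.7) contribute comparably during training. The sketch below is a plain-Python illustration, not the study's code: the helper names are hypothetical, the first and last sample rows reuse the range endpoints quoted in the abstract, and the middle row is invented.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima of a list of feature rows."""
    columns = list(zip(*rows))
    return [min(col) for col in columns], [max(col) for col in columns]


def transform_min_max(rows, lows, highs):
    """Scale every column to [0, 1]; constant columns are mapped to 0."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ])
    return scaled


# Columns: corrugation angle (deg), Reynolds number,
# pitch-to-height ratio, Prandtl number.
samples = [
    [0.0, 10000.0, 1.0, 0.7],
    [30.0, 25000.0, 2.5, 7.0],   # invented intermediate point
    [60.0, 40000.0, 4.0, 200.0],
]

lows, highs = fit_min_max(samples)
scaled = transform_min_max(samples, lows, highs)
for row in scaled:
    print(row)
```

In practice the minima and maxima would be fitted on the training data only and then reused for validation and test data.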

Keywords: artificial neural networks, corrugated channel, heat transfer enhancement, Nusselt number, pressure drop, generalized correlations

Procedia PDF Downloads 86
4064 PatchMix: Learning Transferable Semi-Supervised Representation by Predicting Patches

Authors: Arpit Rai

Abstract:

In this work, we propose PatchMix, a semi-supervised method for pre-training visual representations. PatchMix mixes patches of two images and then solves an auxiliary task of predicting the label of each patch in the mixed image. Our experiments on the CIFAR-10, CIFAR-100 and SVHN datasets show that the representations learned by this method encode useful information for transfer to new tasks and outperform the baseline Residual Network encoders by 12% on CIFAR-10 with ResNet-101 and by 2% with ResNet-56, by 4% on CIFAR-100 with ResNet-101, and by 6% on SVHN with the ResNet-101 baseline model.
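The mixing step described in the abstract can be illustrated with a small sketch. It is an assumption-laden toy, not the paper's implementation: images are single-channel nested lists, each patch is taken from the second image with probability 0.5, and the function name `patch_mix` and its parameters are hypothetical.

```python
import random


def patch_mix(image_a, image_b, label_a, label_b, patch, rng):
    """Mix two equally sized images patch by patch.

    Returns the mixed image and a grid holding the source label of each
    patch, which is the target of the auxiliary prediction task.
    """
    height, width = len(image_a), len(image_a[0])
    mixed = [row[:] for row in image_a]
    patch_labels = []
    for top in range(0, height, patch):
        label_row = []
        for left in range(0, width, patch):
            take_b = rng.random() < 0.5  # assumed mixing probability
            if take_b:
                for r in range(top, min(top + patch, height)):
                    for c in range(left, min(left + patch, width)):
                        mixed[r][c] = image_b[r][c]
            label_row.append(label_b if take_b else label_a)
        patch_labels.append(label_row)
    return mixed, patch_labels


# Toy 4x4 images: all zeros versus all ones, split into 2x2 patches.
image_a = [[0] * 4 for _ in range(4)]
image_b = [[1] * 4 for _ in range(4)]
mixed, patch_labels = patch_mix(image_a, image_b, "cat", "dog", 2, random.Random(0))

for row in mixed:
    print(row)
print(patch_labels)
```

A network trained on such pairs would receive `mixed` as input and be asked to predict `patch_labels`, one label per patch.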

Keywords: self-supervised learning, representation learning, computer vision, generalization

Procedia PDF Downloads 89
4063 Open Source, Open Hardware Ground Truth for Visual Odometry and Simultaneous Localization and Mapping Applications

Authors: Janusz Bedkowski, Grzegorz Kisala, Michal Wlasiuk, Piotr Pokorski

Abstract:

Ground-truth data is essential for the quantitative evaluation of VO (Visual Odometry) and SLAM (Simultaneous Localization and Mapping) using, e.g., ATE (Absolute Trajectory Error) and RPE (Relative Pose Error). Many open-access datasets provide raw and ground-truth data for benchmark purposes. An issue appears when one would like to validate Visual Odometry and/or SLAM approaches on data captured using the device for which the algorithm is targeted, for example a mobile phone, and to disseminate the data to other researchers. For this reason, we propose an open source, open hardware ground-truth system that provides an accurate and precise trajectory with a 3D point cloud. It is based on a Livox Mid-360 LiDAR with a non-repetitive scanning pattern, an on-board Raspberry Pi 4B computer, a battery and software for off-line calculations (camera to LiDAR calibration, LiDAR odometry, SLAM, georeferencing). We show how this system can be used for the evaluation of various state-of-the-art algorithms (Stella SLAM, ORB SLAM3, DSO) in typical indoor monocular VO/SLAM.
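As an illustration of how a ground-truth trajectory is used, the sketch below computes ATE as the root mean square of position differences. It assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same frame; a real evaluation would first align them with a rigid or similarity transform. The function name and the sample points are hypothetical.

```python
import math


def absolute_trajectory_error(estimated, ground_truth):
    """RMSE of 3D position differences between two matched trajectories."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equally long")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est_point, gt_point))
        for est_point, gt_point in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical positions in metres; only the middle pose deviates (by 0.5 m).
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
estimated = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.4), (2.0, 0.0, 0.0)]

ate = absolute_trajectory_error(estimated, ground_truth)
print(ate)
```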

Keywords: SLAM, ground truth, navigation, LiDAR, visual odometry, mapping

Procedia PDF Downloads 66
4062 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling

Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari

Abstract:

A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t = Zw with ‖w‖ = 1. Denoting by Z' the transpose of Z, we define herein a latent variable by t = ZZ'q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w = Z'q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u = YY'q to which we associate the variable t = XX'YY'q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY'XX'YY' associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u = YY'q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k = 1, ..., K), namely tk = XkXk'YY'q.
Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k = 1, ..., K). The solution to this problem is given by q, eigenvector of YY'XX'YY', where X is the dataset obtained by horizontally merging datasets Xk (k = 1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t = XX'YY'q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies and performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.
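The first Rd-PLS component can be sketched numerically from the definitions in the abstract: with u = YY'q and t = XX'YY'q, the product u't equals q'(YY'XX'YY')q, so maximizing it under a unit-norm constraint on q gives the dominant eigenvector of YY'XX'YY'. The plain-Python sketch below uses power iteration as a stand-in for a full eigendecomposition; the helper names, the iteration count and the small centered datasets are assumptions for illustration, and deflation for higher-order components is not shown.

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def rd_pls_first_component(x, y, iterations=200):
    """Return (q, u, t) for the first Rd-PLS component of centered X and Y."""
    yy = matmul(y, transpose(y))            # YY', n x n
    xx = matmul(x, transpose(x))            # XX', n x n
    m = matmul(matmul(yy, xx), yy)          # YY'XX'YY'
    # A constant start vector lies in the null space of YY' for centered
    # data, so a non-constant start vector is used instead.
    q = normalize([float(i + 1) for i in range(len(m))])
    for _ in range(iterations):             # power iteration
        q = normalize(matvec(m, q))
    u = matvec(yy, q)                       # u = YY'q
    t = matvec(xx, u)                       # t = XX'YY'q
    return q, u, t


# Hypothetical centered data: 4 individuals, 2 variables per block.
X = [[-1.5, 0.5], [-0.5, -0.5], [0.5, -0.5], [1.5, 0.5]]
Y = [[-1.0, 0.5], [-1.0, -0.5], [0.0, -0.5], [2.0, 0.5]]

q, u, t = rd_pls_first_component(X, Y)
print(q)
print(sum(a * b for a, b in zip(u, t)))  # u't, the maximized criterion
```

For the multi-block case described in the abstract, the same routine would apply with X replaced by the horizontal merge of the blocks Xk.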

Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis

Procedia PDF Downloads 146
4061 Aerodynamic Study of an Open Window Moving Bus with Passengers

Authors: Pawan Kumar Pant, Bhanu Gupta, S. R. Kale, S. V. Veeravalli

Abstract:

In many countries, buses are the principal means of transport, of which a majority are naturally ventilated with open windows. The design of this ventilation has little scientific basis, and to address this problem a study has been undertaken involving both experiments and numerical simulations. The flow pattern inside and around an open-window bus with passengers has been investigated in detail. A full scale three-dimensional numerical simulation has been used for a) a bus with closed windows and b) a bus with open windows. In both simulations, the bus had 58 seated passengers. The bus dimensions used were 2500 mm wide × 2500 mm high (exterior) × 10500 mm long and its speed was set at 40 km/h. In both cases, the flow separates at the top front edge forming a vortex and reattaches close to the mid-length. This attached flow separates once more as it leaves the bus. However, the strength and shape of the vortices at the top front and in the wake region differ between the two cases. The streamline pattern around the bus is also different for the two cases. For the bus with open windows, the dominant airflow inside the bus is from the rear to the front of the bus, and the air velocity at the face level of the passengers was found to be 1/10th of the free stream velocity. These findings are in good agreement with flow visualization experiments performed in a water channel at 10 m/s, and with smoke/tuft visualizations in a wind tunnel with a free-stream velocity of approximately 40 km/h on a 1:25 scaled Perspex model.

Keywords: air flow, moving bus, open windows, vortex, wind tunnel

Procedia PDF Downloads 232
4060 Cosmetic Recommendation Approach Using Machine Learning

Authors: Shakila N. Senarath, Dinesh Asanka, Janaka Wijayanayake

Abstract:

The necessity of cosmetic products is arising to fulfill consumer needs of personality appearance and hygiene. A cosmetic product consists of various chemical ingredients which may help to keep the skin healthy or may lead to damages. Every chemical ingredient in a cosmetic product does not perform on every human. The most appropriate way to select a healthy cosmetic product is to identify the texture of the body first and select the most suitable product with safe ingredients. Therefore, the selection process of cosmetic products is complicated. Consumer surveys have shown most of the time, the selection process of cosmetic products is done in an improper way by consumers. From this study, a content-based system is suggested that recommends cosmetic products for the human factors. To such an extent, the skin type, gender and price range will be considered as human factors. The proposed system will be implemented by using Machine Learning. Consumer skin type, gender and price range will be taken as inputs to the system. The skin type of consumer will be derived by using the Baumann Skin Type Questionnaire, which is a value-based approach that includes several numbers of questions to derive the user’s skin type to one of the 16 skin types according to the Bauman Skin Type indicator (BSTI). Two datasets are collected for further research proceedings. The user data set was collected using a questionnaire given to the public. Those are the user dataset and the cosmetic dataset. Product details are included in the cosmetic dataset, which belongs to 5 different kinds of product categories (Moisturizer, Cleanser, Sun protector, Face Mask, Eye Cream). An alternate approach of TF-IDF (Term Frequency – Inverse Document Frequency) is applied to vectorize cosmetic ingredients in the generic cosmetic products dataset and user-preferred dataset. 
Using the IF-IPF vectors, each user-preferred products dataset and the generic cosmetic products dataset can be represented as sparse vectors. The similarity between each user-preferred product and each generic cosmetic product will be calculated using the cosine similarity method. For the recommendation process, a similarity matrix can be used. The higher the similarity, the better the match for the consumer. By sorting a user column of the similarity matrix in descending order, the recommended products can be retrieved in ascending order of rank. Although the results return a list of similar products, the user information that has been gathered, such as gender and price range for product purchasing, allows further optimization by weighting those parameters once a set of recommended products has been retrieved for a user.
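
The vectorize-then-rank step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses plain TF-IDF as a stand-in for the abstract's alternative weighting, and the product names, ingredient lists and function names are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document (plain TF-IDF)."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(user_ingredients, catalog):
    """Rank catalog products by similarity to the user's preferred ingredients."""
    names = list(catalog)
    vectors = tfidf_vectors([catalog[n] for n in names] + [user_ingredients])
    user_vec = vectors[-1]
    scores = [(name, cosine(user_vec, vec)) for name, vec in zip(names, vectors)]
    # Descending similarity: the best match comes first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical data, for illustration only.
catalog = {
    "moisturizer_a": ["water", "glycerin", "niacinamide"],
    "cleanser_b": ["water", "sodium laureth sulfate", "fragrance"],
    "eye_cream_c": ["water", "glycerin", "caffeine"],
}
user_preferred = ["water", "glycerin", "niacinamide", "panthenol"]

ranked = recommend(user_preferred, catalog)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Gender and price range could then be applied as a re-weighting or filtering pass over `ranked`, which is the further optimization the abstract mentions.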

Keywords: content-based filtering, cosmetics, machine learning, recommendation system

Procedia PDF Downloads 134
4059 Open Data for e-Governance: Case Study of Bangladesh

Authors: Sami Kabir, Sadek Hossain Khoka

Abstract:

Open Government Data (OGD) refers to all data produced by government that are accessible in a reusable way, free of cost, to ordinary people with access to the Internet. In line with the “Digital Bangladesh” vision of the Bangladesh government, the concept of open data has been gaining momentum in the country. Opening all government data in a digital and customizable format from a single platform can enhance e-governance, which will make government more transparent to the people. This paper presents a well-in-progress case study on the OGD portal of the Bangladesh Government, which aims to link decentralized data. The initiative is intended to facilitate e-services for citizens through this one-stop web portal. The paper further discusses ways of collecting data in digital format from relevant agencies with a view to making it publicly available through this single point of access. Further, a possible layout of this web portal is presented.

Keywords: e-governance, one-stop web portal, open government data, reusable data, web of data

Procedia PDF Downloads 354
4058 Degeneracy and Defectiveness in Non-Hermitian Systems with Open Boundary

Authors: Yongxu Fu, Shaolong Wan

Abstract:

We study the band degeneracy, defectiveness, as well as exceptional points of non-Hermitian systems and materials analytically. We elaborate on the energy bands, the band degeneracy, and the defectiveness of eigenstates under open boundary conditions based on developing a general theory of one-dimensional (1D) non-Hermitian systems. We research the presence of the exceptional points in a generalized non-Hermitian Su-Schrieffer-Heeger model under open boundary conditions. Beyond our general theory, there exist infernal points in 1D non-Hermitian systems, where the energy spectra under open boundary conditions converge on some discrete energy values. We study two 1D non-Hermitian models with the existence of infernal points. We generalize the infernal points to the infernal knots in four-dimensional non-Hermitian systems.

Keywords: non-Hermitian, degeneracy, defectiveness, exceptional points, infernal points

Procedia PDF Downloads 129
4057 Developing an Intonation Labeled Dataset for Hindi

Authors: Esha Banerjee, Atul Kumar Ojha, Girish Nath Jha

Abstract:

This study aims to develop an intonation labeled database for Hindi. Although no single standard for prosody labeling exists in Hindi, researchers in the past have employed perceptual and statistical methods in literature to draw inferences about the behavior of prosody patterns in Hindi. Based on such existing research and largely agreed upon intonational theories in Hindi, this study attempts to develop a manually annotated prosodic corpus of Hindi speech data, which can be used for training speech models for natural-sounding speech in the future. 100 sentences ( 500 words) each for declarative and interrogative types have been labeled using Praat.

Keywords: speech dataset, Hindi, intonation, labeled corpus

Procedia PDF Downloads 196
4056 Crowdsourcing as an Open Innovation Tool for Entrepreneurship

Authors: Zeynep Ayfer Bozat

Abstract:

As traditional innovation has already taken its place on managers’ to-do lists, managers and companies have started to look for new ways to go beyond it. Because of its cost, traditional innovation has become a burden for companies, since they use only internal sources. Companies have turned to external innovation sources to decrease innovation costs, and Open Innovation has become a new solution for companies at this point. Crowdsourcing is a tool of Open Innovation, and the term combines two words: outsourcing and crowd. Crowdsourcing aims to benefit from the efforts and ideas of a virtual crowd via Internet technologies. In addition, crowdsourcing can help entrepreneurs to innovate and grow their businesses. They can crowdsource anything they can use to grow their businesses: ideas, investment, new business, new partners, new solutions, new policies, data, insight, marketing or talent. Therefore, the aim of the study is to show some possible ways for entrepreneurs to benefit from crowdsourcing to expand or foster their businesses. In the study, the term crowdsourcing is explained in detail, and these possible ways are explored and presented.

Keywords: crowdsourcing, entrepreneurship, innovation, open innovation

Procedia PDF Downloads 291
4055 The Impact of External Technology Acquisition and Exploitation on Firms' Process Innovation Performance

Authors: Thammanoon Charmjuree, Yuosre F. Badir, Umar Safdar

Abstract:

There is a consensus among innovation scholars that knowledge is a vital antecedent of a firm’s innovation, e.g., process innovation. Recently, increasing attention has been paid to more open approaches to innovation. This open model emphasizes the use of purposive flows of knowledge across organizational boundaries. Firms adopt an open innovation strategy to improve their innovation performance by bringing knowledge into the organization (inbound open innovation) to accelerate internal innovation or by transferring knowledge outside (outbound open innovation) to expand the markets for external use of innovation. A review of open innovation research reveals the following. First, the majority of existing studies have focused on inbound open innovation and less on outbound open innovation. Second, limited research has considered the possible interaction between both and how this interaction may impact the firm’s innovation performance. Third, scholars have focused mainly on the impact of open innovation strategy on product innovation and less on process innovation. Therefore, our knowledge of the relationship between firms’ inbound and outbound open innovation and how these two impact process innovation is still limited. This study focuses on the firm’s external technology acquisition (ETA) and external technology exploitation (ETE) and the firm’s process innovation performance. ETA represents inbound openness, in which firms rely on the acquisition and absorption of external technologies to complement their technology portfolios. ETE, on the other hand, refers to commercializing technology assets exclusively or in addition to their internal application. 
This study hypothesized that both ETA and ETE have a positive relationship with process innovation performance and that ETE fully mediates the relationship between ETA and process innovation performance, i.e., ETA has a positive impact on ETE, and in turn, ETE has a positive impact on process innovation performance. This study empirically explored these hypotheses in software development firms in Thailand. These firms were randomly selected from a list of software firms registered with the Department of Business Development, Ministry of Commerce of Thailand. The questionnaires were sent to 1689 firms. After follow-ups and periodic reminders, we obtained 329 (19.48%) completed usable questionnaires. Structural equation modeling (SEM) was used to analyze the data. An analysis of the data from the 329 firms provides support for our three hypotheses: First, the firm’s ETA has a positive impact on its process innovation performance. Second, the firm’s ETA has a positive impact on its ETE. Third, the firm’s ETE fully mediates the relationship between the firm’s ETA and its process innovation performance. This study fills a gap in the open innovation literature by examining the relationship between inbound (ETA) and outbound (ETE) open innovation and suggests that, in order to benefit from the promises of openness, firms must engage in both. The study goes one step further by explaining the mechanism through which ETA influences process innovation performance.
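
The reported response rate follows directly from the two counts given in the abstract. As a quick arithmetic check (the variable names are illustrative only):

```python
# Counts taken from the abstract.
questionnaires_sent = 1689
usable_responses = 329

# Response rate as a percentage of questionnaires sent.
response_rate = 100 * usable_responses / questionnaires_sent
print(f"{response_rate:.2f}%")  # prints 19.48%
```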

Keywords: process innovation performance, external technology acquisition, external technology exploitation, open innovation

Procedia PDF Downloads 201
4054 The Impact of Open Defecation on Fecal-Oral Infections: A Case Study in Burat and Ngaremara Wards of Isiolo County, Kenya

Authors: Kimutai Joan Jepkorir, Moturi Wilkister Nyaora

Abstract:

The practice of open defecation can be devastating for human health as well as the environment, and its persistence could be due to ingrained habits that individuals continue to engage in despite having a better alternative. Safe disposal of human excreta is essential for public health protection. This study sought to find out whether open defecation is related to fecal-oral infections in Burat and Ngaremara Wards in Isiolo County. This was achieved by conducting a cross-sectional study. A simple random sampling technique was used to select the 385 households used in the study. Data were collected using questionnaires and observation checklists. The results show that 66% of the respondents disposed of fecal matter in a safe manner, whereas 34% disposed of fecal matter in an unsafe manner through open defecation. The prevalence proportions per 1000 of diarrhea and intestinal worms among children under 5 years of age were 142 and 21, respectively. The prevalence proportions per 1000 of diarrhea and typhoid among children over 5 years of age were 20 and 20, respectively.

Keywords: faecal-oral infections, open defecation, prevalence proportion, sanitation

Procedia PDF Downloads 304
4053 Open Innovation Laboratory for Rapid Realization of Sensing, Smart and Sustainable Products (S3 Products) for Higher Education

Authors: J. Miranda, D. Chavarría-Barrientos, M. Ramírez-Cadena, M. E. Macías, P. Ponce, J. Noguez, R. Pérez-Rodríguez, P. K. Wright, A. Molina

Abstract:

Higher education methods need to evolve because the new generations of students are learning in different ways. One way is by adopting emergent technologies and new learning methods and by promoting the maker movement. As a result, Tecnologico de Monterrey is developing Open Innovation Laboratories as an immediate response to the educational challenges of the world. This paper presents an Open Innovation Laboratory for Rapid Realization of Sensing, Smart and Sustainable Products (S3 Products). The Open Innovation Laboratory is composed of a set of specific resources that students and teachers use to provide solutions to current problems of priority sectors through the development of a new generation of products. This new generation of products considers the concepts Sensing, Smart, and Sustainable. The Open Innovation Laboratory has been implemented in different courses in the context of New Product Development (NPD) and Integrated Manufacturing Systems (IMS) at Tecnologico de Monterrey. The implementation consists of adapting this Open Innovation Laboratory within the course’s syllabus in combination with specific methodologies for product development, learning methods (Active Learning and Blended Learning using Massive Open Online Courses, MOOCs) and rapid product realization platforms. Using the proposed concepts, it is possible to demonstrate that students can propose innovative and sustainable products, and to demonstrate how the learning process could be improved using technological resources applied in the higher education sector. Finally, examples of innovative S3 products developed at Tecnologico de Monterrey are presented.

Keywords: active learning, blended learning, maker movement, new product development, open innovation laboratory

Procedia PDF Downloads 395
4052 Numerical Study of a 6080HP Open Drip Proof (ODP) Motor

Authors: Feng-Hisang Lai

Abstract:

Computational fluid dynamics (CFD) is used to numerically study the flow and heat transfer features of a two-pole, 6,080HP, 60Hz, 3,150V open drip-proof (ODP) motor. The stator and rotor cores in this high-voltage induction motor are segmented with spacers for cooling purposes, which leads to difficulties in meshing when the entire system is to be simulated. The system is divided into 4 parts, which are meshed separately and then combined using interfaces. The deviation between the CFD and experimental results in temperature and flow rate is less than 10%. The internal flow is further examined and a final design is proposed to reduce the winding temperature by 10 degrees.

Keywords: CFD, open drip proof, induction motor, cooling

Procedia PDF Downloads 197
4051 Open Education Resources a Gateway for Accessing Hospitality and Tourism Learning Materials

Authors: Isiya Shinkafi Salihu

Abstract:

Open education resources (OER) are open learning materials in different formats, course content and context to support learning globally. This study investigated the level of awareness of Hospitality and Tourism OER among students in the Department of Tourism and Hotel Management in a university. Specifically, it investigated students’ awareness, use and accessibility of OER in learning. The research design was quantitative, using an online questionnaire. The research shows that respondents frequently use OER but with little knowledge of the content and context of the material. Most of the respondents have little knowledge of the concept even though they use it. Information and communication technologies are tools for information gathering, social networking, and knowledge sharing and transfer. OER are open education materials accessible online, such as curricula, maps, course materials, and videos, that users create, adapt, and reuse for learning and research. The few respondents who used OER in learning faced challenges such as the high cost of data, poor connectivity and a lack of proper guidance. The results suggest a lack of awareness of OER among students in the faculty of tourism and the need for support from teachers in the utilization of OER. The thesis also reveals that some of the international students are accessing the internet as beginners in their studies, which requires guidance. The research recommends that further studies be conducted in other faculties.

Keywords: creative commons, open education resources, open licenses, information and communication technology

Procedia PDF Downloads 176
4050 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Five implementations of three data mining classification techniques were tested for extracting important insights from tourism data. The aim was to find the best-performing algorithm among those compared for tourism knowledge discovery. The knowledge discovery from data process was used as the process model. The 10-fold cross-validation method was used for testing. Various data preprocessing activities were performed to obtain the final dataset for model building. Classification models of the selected algorithms were built under different scenarios on the preprocessed dataset. The best-performing algorithm on the tourism dataset was Random Forest (76%) before applying information-gain-based attribute selection and J48 (C4.5) (75%) after selecting the top attributes relevant to the class (target) attribute. In terms of model-building time, attribute selection improves the efficiency of all algorithms. The Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented; they show intricate, non-trivial knowledge that would otherwise not be discovered by simple statistical analysis, despite the mediocre accuracy of the classification algorithms.
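
The 10-fold cross-validation procedure mentioned in this abstract can be sketched as follows. This is a minimal standard-library illustration, not the authors' setup: the function names are hypothetical, and a majority-class baseline stands in for the classifiers the abstract compares.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(X, y, fit, predict, k=10):
    """Return the mean accuracy over k held-out folds."""
    accuracies = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)

# Stand-in classifier (an assumption for illustration): always predict
# the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]

def predict_majority(model, x):
    return model

# Hypothetical toy data: 25 samples that all share one label.
X = [[i] for i in range(25)]
y = ["visit"] * 25
print(cross_validate(X, y, fit_majority, predict_majority))
```

Any classifier exposing a comparable `fit`/`predict` pair could be passed in place of the baseline to compare algorithms on the same folds.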

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 294
4049 Digital Repositories in Algerian Universities: Content and Search Possibilities

Authors: Hakim Benoumelghar

Abstract:

The launch in 1999 of the Open Archives Initiative (OAI) and its protocol for sharing metadata, OAI-PMH, together with the availability of open-source deposit platforms such as DSpace in 2002, allowed libraries to develop digital repositories and to play a leading role in the open access movement by building institutional open archives alongside thematic ones. This study focuses on Algerian universities and their projects and platforms for digital repositories of theses and scientific papers, and on the access they give the university community to archives of digital scientific content produced by the scientific community. This contribution attempts to compare Algerian institutional repositories with those of developed countries in order to identify developments and perspectives that would facilitate scientific research and give the scientific community more possibilities in documentary matters.
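
For readers unfamiliar with OAI-PMH, the protocol exposes repository metadata through plain HTTP requests that carry a `verb` argument. A minimal sketch of building such requests is shown below. The base URL is a hypothetical placeholder, not an endpoint of any repository discussed in the study, and the helper name is illustrative.

```python
from urllib.parse import urlencode

def oai_pmh_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://repository.example.edu/oai/request"

# "Identify" returns basic information about the repository.
identify_url = oai_pmh_url(BASE_URL, "Identify")

# "ListRecords" harvests records; oai_dc is unqualified Dublin Core.
list_records_url = oai_pmh_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(list_records_url)
```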

Keywords: digital repository, repository software, university, algeria

Procedia PDF Downloads 80