Search results for: categorical datasets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 868

Search results for: categorical datasets

328 Investigating the Dimensions of Perceived Attributions in Making Sense of Failure: An Exploratory Study of Lebanese Entrepreneurs

Authors: Ghiwa Dandach

Abstract:

By challenging the anti-failure bias and contributing to the theoretical territory of the attribution theory, this thesis develops a comprehensive process for entrepreneurial learning from failure. The practical implication of the findings suggests assisting entrepreneurs (current, failing, and nascent) in effectively anticipating and reflecting upon failure. Additionally, the process is suggested to enhance the level of institutional and private (accelerators and financers) support provided to entrepreneurs, the implications of which may improve future opportunities for entrepreneurial success. Henceforth, exploring learning from failure is argued to impact the potential survival of future ventures, subsequently revitalizing the economic contribution of entrepreneurship. This learning process can be enhanced with the cognitive development of causal ascriptions for failure, which eventually impacts learning outcomes. However, the mechanism with which entrepreneurs make sense of failure, reflect on the journey, and transform experience into knowledge is still under-researched. More specifically, the cognitive process of failure attribution is under-explored, majorly in the context of developing economies, calling for a more insightful understanding on how entrepreneurs ascribe failure. Responding to the call for more thorough research in such cultural contexts, this study expands the understanding of the dimensions of failure attributions as perceived by entrepreneurs and the impact of these dimensions on learning outcomes in the Lebanese context. The research adopted the exploratory interpretivism paradigm and collected data from interviews with industry experts first, followed by narratives of entrepreneurs using the qualitative multimethod approach. The holistic and categorical content analysis of narratives, preceded by the thematic analysis of interviews, unveiled how entrepreneurs ascribe failure by developing minor and major dimensions of each failure attribution. The findings have also revealed how each dimension impacts the learning from failure when accompanied by emotional resilience. The thesis concludes that exploring in-depth the dimensions of failure attributions significantly determines the level of learning generated. They are moving beyond the simple categorisation of ascriptions as primary internal or external unveiled how learning may occur with each attribution at the individual, venture, and ecosystem levels. This has further accentuated that a major internal attribution of failure combined with a minor external attribution generated the highest levels of transformative and double-loop learning, emphasizing the role of personal blame and responsibility on enhancing learning outcomes.

Keywords: attribution, entrepreneurship, reflection, sense-making, emotions, learning outcomes, failure, exit

Procedia PDF Downloads 227
327 Leveraging the Power of Dual Spatial-Temporal Data Scheme for Traffic Prediction

Authors: Yang Zhou, Heli Sun, Jianbin Huang, Jizhong Zhao, Shaojie Qiao

Abstract:

Traffic prediction is a fundamental problem in urban environment, facilitating the smart management of various businesses, such as taxi dispatching, bike relocation, and stampede alert. Most earlier methods rely on identifying the intrinsic spatial-temporal correlation to forecast. However, the complex nature of this problem entails a more sophisticated solution that can simultaneously capture the mutual influence of both adjacent and far-flung areas, with the information of time-dimension also incorporated seamlessly. To tackle this difficulty, we propose a new multi-phase architecture, DSTDS (Dual Spatial-Temporal Data Scheme for traffic prediction), that aims to reveal the underlying relationship that determines future traffic trend. First, a graph-based neural network with an attention mechanism is devised to obtain the static features of the road network. Then, a multi-granularity recurrent neural network is built in conjunction with the knowledge from a grid-based model. Subsequently, the preceding output is fed into a spatial-temporal super-resolution module. With this 3-phase structure, we carry out extensive experiments on several real-world datasets to demonstrate the effectiveness of our approach, which surpasses several state-of-the-art methods.

Keywords: traffic prediction, spatial-temporal, recurrent neural network, dual data scheme

Procedia PDF Downloads 117
326 Machine Learning for Targeting of Conditional Cash Transfers: Improving the Effectiveness of Proxy Means Tests to Identify Future School Dropouts and the Poor

Authors: Cristian Crespo

Abstract:

Conditional cash transfers (CCTs) have been targeted towards the poor. Thus, their targeting assessments check whether these schemes have been allocated to low-income households or individuals. However, CCTs have more than one goal and target group. An additional goal of CCTs is to increase school enrolment. Hence, students at risk of dropping out of school also are a target group. This paper analyses whether one of the most common targeting mechanisms of CCTs, a proxy means test (PMT), is suitable to identify the poor and future school dropouts. The PMT is compared with alternative approaches that use the outputs of a predictive model of school dropout. This model was built using machine learning algorithms and rich administrative datasets from Chile. The paper shows that using machine learning outputs in conjunction with the PMT increases targeting effectiveness by identifying more students who are either poor or future dropouts. This joint targeting approach increases effectiveness in different scenarios except when the social valuation of the two target groups largely differs. In these cases, the most likely optimal approach is to solely adopt the targeting mechanism designed to find the highly valued group.

Keywords: conditional cash transfers, machine learning, poverty, proxy means tests, school dropout prediction, targeting

Procedia PDF Downloads 204
325 The Predictive Utility of Subjective Cognitive Decline Using Item Level Data from the Everyday Cognition (ECog) Scales

Authors: J. Fox, J. Randhawa, M. Chan, L. Campbell, A. Weakely, D. J. Harvey, S. Tomaszewski Farias

Abstract:

Early identification of individuals at risk for conversion to dementia provides an opportunity for preventative treatment. Many older adults (30-60%) report specific subjective cognitive decline (SCD); however, previous research is inconsistent in terms of what types of complaints predict future cognitive decline. The purpose of this study is to identify which specific complaints from the Everyday Cognition Scales (ECog) scales, a measure of self-reported concerns for everyday abilities across six cognitive domains, are associated with: 1) conversion from a clinical diagnosis of normal to either MCI or dementia (categorical variable) and 2) progressive cognitive decline in memory and executive function (continuous variables). 415 cognitively normal older adults were monitored annually for an average of 5 years. Cox proportional hazards models were used to assess associations between self-reported ECog items and progression to impairment (MCI or dementia). A total of 114 individuals progressed to impairment; the mean time to progression was 4.9 years (SD=3.4 years, range=0.8-13.8). Follow-up models were run controlling for depression. A subset of individuals (n=352) underwent repeat cognitive assessments for an average of 5.3 years. For those individuals, mixed effects models with random intercepts and slopes were used to assess associations between ECog items and change in neuropsychological measures of episodic memory or executive function. Prior to controlling for depression, subjective concerns on five of the eight Everyday Memory items, three of the nine Everyday Language items, one of the seven Everyday Visuospatial items, two of the five Everyday Planning items, and one of the six Everyday Organization items were associated with subsequent diagnostic conversion (HR=1.25 to 1.59, p=0.003 to 0.03). However, after controlling for depression, only two specific complaints of remembering appointments, meetings, and engagements and understanding spoken directions and instructions were associated with subsequent diagnostic conversion. Episodic memory in individuals reporting no concern on ECog items did not significantly change over time (p>0.4). More complaints on seven of the eight Everyday Memory items, three of the nine Everyday Language items, and three of the seven Everyday Visuospatial items were associated with a decline in episodic memory (Interaction estimate=-0.055 to 0.001, p=0.003 to 0.04). Executive function in those reporting no concern on ECog items declined slightly (p <0.001 to 0.06). More complaints on three of the eight Everyday Memory items and three of the nine Everyday Language items were associated with a decline in executive function (Interaction estimate=-0.021 to -0.012, p=0.002 to 0.04). These findings suggest that specific complaints across several cognitive domains are associated with diagnostic conversion. Specific complaints in the domains of Everyday Memory and Language are associated with a decline in both episodic memory and executive function. Increased monitoring and treatment of individuals with these specific SCD may be warranted.

Keywords: alzheimer’s disease, dementia, memory complaints, mild cognitive impairment, risk factors, subjective cognitive decline

Procedia PDF Downloads 80
324 Pushing the Boundary of Parallel Tractability for Ontology Materialization via Boolean Circuits

Authors: Zhangquan Zhou, Guilin Qi

Abstract:

Materialization is an important reasoning service for applications built on the Web Ontology Language (OWL). To make materialization efficient in practice, current research focuses on deciding tractability of an ontology language and designing parallel reasoning algorithms. However, some well-known large-scale ontologies, such as YAGO, have been shown to have good performance for parallel reasoning, but they are expressed in ontology languages that are not parallelly tractable, i.e., the reasoning is inherently sequential in the worst case. This motivates us to study the problem of parallel tractability of ontology materialization from a theoretical perspective. That is we aim to identify the ontologies for which materialization is parallelly tractable, i.e., in the NC complexity. Since the NC complexity is defined based on Boolean circuit that is widely used to investigate parallel computing problems, we first transform the problem of materialization to evaluation of Boolean circuits, and then study the problem of parallel tractability based on circuits. In this work, we focus on datalog rewritable ontology languages. We use Boolean circuits to identify two classes of datalog rewritable ontologies (called parallelly tractable classes) such that materialization over them is parallelly tractable. We further investigate the parallel tractability of materialization of a datalog rewritable OWL fragment DHL (Description Horn Logic). Based on the above results, we analyze real-world datasets and show that many ontologies expressed in DHL belong to the parallelly tractable classes.

Keywords: ontology materialization, parallel reasoning, datalog, Boolean circuit

Procedia PDF Downloads 271
323 Diversity Indices as a Tool for Evaluating Quality of Water Ways

Authors: Khadra Ahmed, Khaled Kheireldin

Abstract:

In this paper, we present a pedestrian detection descriptor called Fused Structure and Texture (FST) features based on the combination of the local phase information with the texture features. Since the phase of the signal conveys more structural information than the magnitude, the phase congruency concept is used to capture the structural features. On the other hand, the Center-Symmetric Local Binary Pattern (CSLBP) approach is used to capture the texture information of the image. The dimension less quantity of the phase congruency and the robustness of the CSLBP operator on the flat images, as well as the blur and illumination changes, lead the proposed descriptor to be more robust and less sensitive to the light variations. The proposed descriptor can be formed by extracting the phase congruency and the CSLBP values of each pixel of the image with respect to its neighborhood. The histogram of the oriented phase and the histogram of the CSLBP values for the local regions in the image are computed and concatenated to construct the FST descriptor. Several experiments were conducted on INRIA and the low resolution DaimlerChrysler datasets to evaluate the detection performance of the pedestrian detection system that is based on the FST descriptor. A linear Support Vector Machine (SVM) is used to train the pedestrian classifier. These experiments showed that the proposed FST descriptor has better detection performance over a set of state of the art feature extraction methodologies.

Keywords: planktons, diversity indices, water quality index, water ways

Procedia PDF Downloads 518
322 Mapping of Alteration Zones in Mineral Rich Belt of South-East Rajasthan Using Remote Sensing Techniques

Authors: Mrinmoy Dhara, Vivek K. Sengar, Shovan L. Chattoraj, Soumiya Bhattacharjee

Abstract:

Remote sensing techniques have emerged as an asset for various geological studies. Satellite images obtained by different sensors contain plenty of information related to the terrain. Digital image processing further helps in customized ways for the prospecting of minerals. In this study, an attempt has been made to map the hydrothermally altered zones using multispectral and hyperspectral datasets of South East Rajasthan. Advanced Space-borne Thermal Emission and Reflection Radiometer (ASTER) and Hyperion (Level1R) dataset have been processed to generate different Band Ratio Composites (BRCs). For this study, ASTER derived BRCs were generated to delineate the alteration zones, gossans, abundant clays and host rocks. ASTER and Hyperion images were further processed to extract mineral end members and classified mineral maps have been produced using Spectral Angle Mapper (SAM) method. Results were validated with the geological map of the area which shows positive agreement with the image processing outputs. Thus, this study concludes that the band ratios and image processing in combination play significant role in demarcation of alteration zones which may provide pathfinders for mineral prospecting studies.

Keywords: ASTER, hyperion, band ratios, alteration zones, SAM

Procedia PDF Downloads 279
321 A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm

Authors: Javad Rahimipour Anaraki, Saeed Samet, Mahdi Eftekhari, Chang Wook Ahn

Abstract:

Feature selection and attribute reduction are crucial problems, and widely used techniques in the field of machine learning, data mining and pattern recognition to overcome the well-known phenomenon of the Curse of Dimensionality. This paper presents a feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we have employed the fuzzy-rough dependency degree (FRFDD) of the lower approximation-based fuzzy-rough feature selection (L-FRFS) due to its effectiveness in feature selection. As for the search strategy, a modified version of a binary shuffled frog leaping algorithm is proposed (B-SFLA). The proposed feature selection method is obtained by hybridizing the B-SFLA with the FRDD. Nine classifiers have been employed to compare the proposed approach with several existing methods over twenty two datasets, including nine high dimensional and large ones, from the UCI repository. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of the number of selected features and the classification accuracy.

Keywords: binary shuffled frog leaping algorithm, feature selection, fuzzy-rough set, minimal reduct

Procedia PDF Downloads 225
320 The Influence of Housing Choice Vouchers on the Private Rental Market

Authors: Randy D. Colon

Abstract:

Through a freedom of information request, data pertaining to Housing Choice Voucher (HCV) households has been obtained from the Chicago Housing Authority, including rent price and number of bedrooms per HCV household, community area, and zip code from 2013 to the first quarter of 2018. Similar data pertaining to the private rental market will be obtained through public records found through the United States Department of Housing and Urban Development. The datasets will be analyzed through statistical and mapping software to investigate the potential link between HCV households and distorted rent prices. Quantitative data will be supplemented by qualitative data to investigate the lived experience of Chicago residents. Qualitative data will be collected at community meetings in the Chicago Englewood neighborhood through participation in neighborhood meetings and informal interviews with residents and community leaders. The qualitative data will be used to gain insight on the lived experience of community leaders and residents of the Englewood neighborhood in relation to housing, the rental market, and HCV. While there is an abundance of quantitative data on this subject, this qualitative data is necessary to capture the lived experience of local residents effected by a changing rental market. This topic reflects concerns voiced by members of the Englewood community, and this study aims to keep the community relevant in its findings.

Keywords: Chicago, housing, housing choice voucher program, housing subsidies, rental market

Procedia PDF Downloads 118
319 Real-Time Multi-Vehicle Tracking Application at Intersections Based on Feature Selection in Combination with Color Attribution

Authors: Qiang Zhang, Xiaojian Hu

Abstract:

In multi-vehicle tracking, based on feature selection, the tracking system efficiently tracks vehicles in a video with minimal error in combination with color attribution, which focuses on presenting a simple and fast, yet accurate and robust solution to the problem such as inaccurately and untimely responses of statistics-based adaptive traffic control system in the intersection scenario. In this study, a real-time tracking system is proposed for multi-vehicle tracking in the intersection scene. Considering the complexity and application feasibility of the algorithm, in the object detection step, the detection result provided by virtual loops were post-processed and then used as the input for the tracker. For the tracker, lightweight methods were designed to extract and select features and incorporate them into the adaptive color tracking (ACT) framework. And the approbatory online feature selection algorithms are integrated on the mature ACT system with good compatibility. The proposed feature selection methods and multi-vehicle tracking method are evaluated on KITTI datasets and show efficient vehicle tracking performance when compared to the other state-of-the-art approaches in the same category. And the system performs excellently on the video sequences recorded at the intersection. Furthermore, the presented vehicle tracking system is suitable for surveillance applications.

Keywords: real-time, multi-vehicle tracking, feature selection, color attribution

Procedia PDF Downloads 163
318 Visualization and Performance Measure to Determine Number of Topics in Twitter Data Clustering Using Hybrid Topic Modeling

Authors: Moulana Mohammed

Abstract:

Topic models are widely used in building clusters of documents for more than a decade, yet problems occurring in choosing optimal number of topics. The main problem is the lack of a stable metric of the quality of topics obtained during the construction of topic models. The authors analyzed from previous works, most of the models used in determining the number of topics are non-parametric and quality of topics determined by using perplexity and coherence measures and concluded that they are not applicable in solving this problem. In this paper, we used the parametric method, which is an extension of the traditional topic model with visual access tendency for visualization of the number of topics (clusters) to complement clustering and to choose optimal number of topics based on results of cluster validity indices. Developed hybrid topic models are demonstrated with different Twitter datasets on various topics in obtaining the optimal number of topics and in measuring the quality of clusters. The experimental results showed that the Visual Non-negative Matrix Factorization (VNMF) topic model performs well in determining the optimal number of topics with interactive visualization and in performance measure of the quality of clusters with validity indices.

Keywords: interactive visualization, visual mon-negative matrix factorization model, optimal number of topics, cluster validity indices, Twitter data clustering

Procedia PDF Downloads 134
317 Data Mining Approach: Classification Model Evaluation

Authors: Lubabatu Sada Sodangi

Abstract:

The rapid growth in exchange and accessibility of information via the internet makes many organisations acquire data on their own operation. The aim of data mining is to analyse the different behaviour of a dataset using observation. Although, the subset of the dataset being analysed may not display all the behaviours and relationships of the entire data and, therefore, may not represent other parts that exist in the dataset. There is a range of techniques used in data mining to determine the hidden or unknown information in datasets. In this paper, the performance of two algorithms Chi-Square Automatic Interaction Detection (CHAID) and multilayer perceptron (MLP) would be matched using an Adult dataset to find out the percentage of an/the adults that earn > 50k and those that earn <= 50k per year. The two algorithms were studied and compared using IBM SPSS statistics software. The result for CHAID shows that the most important predictors are relationship and education. The algorithm shows that those are married (husband) and have qualification: Bachelor, Masters, Doctorate or Prof-school whose their age is > 41<57 earn > 50k. Also, multilayer perceptron displays marital status and capital gain as the most important predictors of the income. It also shows that individuals that their capital gain is less than 6,849 and are single, separated or widow, earn <= 50K, whereas individuals with their capital gain is > 6,849, work > 35 hrs/wk, and > 27yrs their income will be > 50k. By comparing the two algorithms, it is observed that both algorithms are reliable but there is strong reliability in CHAID which clearly shows that relation and education contribute to the prediction as displayed in the data visualisation.

Keywords: data mining, CHAID, multi-layer perceptron, SPSS, Adult dataset

Procedia PDF Downloads 378
316 On Exploring Search Heuristics for improving the efficiency in Web Information Extraction

Authors: Patricia Jiménez, Rafael Corchuelo

Abstract:

Nowadays the World Wide Web is the most popular source of information that relies on billions of on-line documents. Web mining is used to crawl through these documents, collect the information of interest and process it by applying data mining tools in order to use the gathered information in the best interest of a business, what enables companies to promote theirs. Unfortunately, it is not easy to extract the information a web site provides automatically when it lacks an API that allows to transform the user-friendly data provided in web documents into a structured format that is machine-readable. Rule-based information extractors are the tools intended to extract the information of interest automatically and offer it in a structured format that allow mining tools to process it. However, the performance of an information extractor strongly depends on the search heuristic employed since bad choices regarding how to learn a rule may easily result in loss of effectiveness and/or efficiency. Improving search heuristics regarding efficiency is of uttermost importance in the field of Web Information Extraction since typical datasets are very large. In this paper, we employ an information extractor based on a classical top-down algorithm that uses the so-called Information Gain heuristic introduced by Quinlan and Cameron-Jones. Unfortunately, the Information Gain relies on some well-known problems so we analyse an intuitive alternative, Termini, that is clearly more efficient; we also analyse other proposals in the literature and conclude that none of them outperforms the previous alternative.

Keywords: information extraction, search heuristics, semi-structured documents, web mining.

Procedia PDF Downloads 335
315 Disclosure on Adherence of the King Code's Audit Committee Guidance: Cluster Analyses to Determine Strengths and Weaknesses

Authors: Philna Coetzee, Clara Msiza

Abstract:

In modern society, audit committees are seen as the custodians of accountability and the conscience of management and the board. But who holds the audit committee accountable for their actions or non-actions and how do we know what they are supposed to be doing and what they are doing? The purpose of this article is to provide greater insight into the latter part of this problem, namely, determine what best practises for audit committees and the disclosure of what is the realities are. In countries where governance is well established, the roles and responsibilities of the audit committee are mostly clearly guided by legislation and/or guidance documents, with countries increasingly providing guidance on this topic. With high cost involved to adhere to governance guidelines, the public (for public organisations) and shareholders (for private organisations) expect to see the value of their ‘investment’. For audit committees, the dividends on the investment should reflect in less fraudulent activities, less corruption, higher efficiency and effectiveness, improved social and environmental impact, and increased profits, to name a few. If this is not the case (which is reflected in the number of fraudulent activities in both the private and the public sector), stakeholders have the right to ask: where was the audit committee? Therefore, the objective of this article is to contribute to the body of knowledge by comparing the adherence of audit committee to best practices guidelines as stipulated in the King Report across public listed companies, national and provincial government departments, state-owned enterprises and local municipalities. After constructs were formed, based on the literature, factor analyses were conducted to reduce the number of variables in each construct. Thereafter, cluster analyses, which is an explorative analysis technique that classifies a set of objects in such a way that objects that are more similar are grouped into the same group, were conducted. The SPSS TwoStep Clustering Component was used, being capable of handling both continuous and categorical variables. In the first step, a pre-clustering procedure clusters the objects into small sub-clusters, after which it clusters these sub-clusters into the desired number of clusters. The cluster analyses were conducted for each construct and the measure, namely the audit opinion as listed in the external audit report, were included. Analysing 228 organisations' information, the results indicate that there is a clear distinction between the four spheres of business that has been included in the analyses, indicating certain strengths and certain weaknesses within each sphere. The results may provide the overseers of audit committees’ insight into where a specific sector’s strengths and weaknesses lie. Audit committee chairs will be able to improve the areas where their audit committee is lacking behind. The strengthening of audit committees should result in an improvement of the accountability of boards, leading to less fraud and corruption.

Keywords: audit committee disclosure, cluster analyses, governance best practices, strengths and weaknesses

Procedia PDF Downloads 167
314 Intelligent Indoor Localization Using WLAN Fingerprinting

Authors: Gideon C. Joseph

Abstract:

The ability to localize mobile devices is quite important, as some applications may require location information of these devices to operate or deliver better services to the users. Although there are several ways of acquiring location data of mobile devices, the WLAN fingerprinting approach has been considered in this work. This approach uses the Received Signal Strength Indicator (RSSI) measurement as a function of the position of the mobile device. RSSI is a quantitative technique of describing the radio frequency power carried by a signal. RSSI may be used to determine RF link quality and is very useful in dense traffic scenarios where interference is of major concern, for example, indoor environments. This research aims to design a system that can predict the location of a mobile device, when supplied with the mobile’s RSSIs. The developed system takes as input the RSSIs relating to the mobile device, and outputs parameters that describe the location of the device such as the longitude, latitude, floor, and building. The relationship between the Received Signal Strengths (RSSs) of mobile devices and their corresponding locations is meant to be modelled; hence, subsequent locations of mobile devices can be predicted using the developed model. It is obvious that describing mathematical relationships between the RSSIs measurements and localization parameters is one option to modelling the problem, but the complexity of such an approach is a serious turn-off. In contrast, we propose an intelligent system that can learn the mapping of such RSSIs measurements to the localization parameters to be predicted. The system is capable of upgrading its performance as more experiential knowledge is acquired. The most appealing consideration to using such a system for this task is that complicated mathematical analysis and theoretical frameworks are excluded or not needed; the intelligent system on its own learns the underlying relationship in the supplied data (RSSI levels) that corresponds to the localization parameters. These localization parameters to be predicted are of two different tasks: Longitude and latitude of mobile devices are real values (regression problem), while the floor and building of the mobile devices are of integer values or categorical (classification problem). This research work presents artificial neural network based intelligent systems to model the relationship between the RSSIs predictors and the mobile device localization parameters. The designed systems were trained and validated on the collected WLAN fingerprint database. The trained networks were then tested with another supplied database to obtain the performance of trained systems on achieved Mean Absolute Error (MAE) and error rates for the regression and classification tasks involved therein.

Keywords: indoor localization, WLAN fingerprinting, neural networks, classification, regression

Procedia PDF Downloads 347
313 Automatic Early Breast Cancer Segmentation Enhancement by Image Analysis and Hough Transform

Authors: David Jurado, Carlos Ávila

Abstract:

Detection of early signs of breast cancer development is crucial to quickly diagnose the disease and to define adequate treatment to increase the survival probability of the patient. Computer Aided Detection systems (CADs), along with modern data techniques such as Machine Learning (ML) and Neural Networks (NN), have shown an overall improvement in digital mammography cancer diagnosis, reducing the false positive and false negative rates becoming important tools for the diagnostic evaluations performed by specialized radiologists. However, ML and NN-based algorithms rely on datasets that might bring issues to the segmentation tasks. In the present work, an automatic segmentation and detection algorithm is described. This algorithm uses image processing techniques along with the Hough transform to automatically identify microcalcifications that are highly correlated with breast cancer development in the early stages. Along with image processing, automatic segmentation of high-contrast objects is done using edge extraction and circle Hough transform. This provides the geometrical features needed for an automatic mask design which extracts statistical features of the regions of interest. The results shown in this study prove the potential of this tool for further diagnostics and classification of mammographic images due to the low sensitivity to noisy images and low contrast mammographies.

Keywords: breast cancer, segmentation, X-ray imaging, hough transform, image analysis

Procedia PDF Downloads 83
312 The Importance of Downstream Supply Chain in Supply Chain Risk Management: Multi-Objective Optimization

Authors: Zohreh Khojasteh-Ghamari, Takashi Irohara

Abstract:

One of the efficient ways in supply chain risk management is avoiding the interruption in Supply Chain (SC) before it occurs. Although the majority of the organizations focus on their first-tier suppliers to avoid risk in the SC, studies show that in only 60 percent of the disruption cases the reason is first tier suppliers. In the 40 percent of the SC disruptions, the reason is downstream SC, which is the second tier and lower. Due to the increasing complexity and interrelation of modern supply chains, the SC elements have become difficult to trace. Moreover, studies show that there is a vital need to better understand the integration of risk and visibility, especially in the context of multiple objectives. In this study, we propose a multi-objective programming model to avoid disruption in SC. The objective of this study is evaluating the effect of downstream SCV on managing supply chain risk. We propose a multi-objective mathematical programming model with the objective functions of minimizing the total cost and maximizing the downstream supply chain visibility (SCV). The decision variable is supplier selection. We assume there are several manufacturers and several candidate suppliers. For each manufacturer, our model proposes the best suppliers with the lowest cost and maximum visibility in downstream supply chain. We examine the applicability of the model by numerical examples. We also define several scenarios for datasets and observe the tendency. The results show that minimum visibility in downstream SC is needed to have a safe SC network.

Keywords: downstream supply chain, optimization, supply chain risk, supply chain visibility

Procedia PDF Downloads 244
311 Analysis of the Elastic Energy Released and Characterization of the Eruptive Episodes Intensity’s during 2014-2015 at El Reventador Volcano, Ecuador

Authors: Paúl I. Cornejo

Abstract:

The elastic energy released through Strombolian explosions has been quite studied, detailing various processes, sources, and precursory events at several volcanoes. We realized an analysis based on the relative partitioning of the elastic energy radiated into the atmosphere and ground by Strombolian-type explosions recorded at El Reventador volcano, using infrasound and seismic signals at high and moderate seismicity episodes during intense eruptive stages of explosive and effusive activity. Our results show that considerable values of Volcano Acoustic-Seismic Ratio (VASR or η) are obtained at high seismicity stages. VASR is a physical diagnostic of explosive degassing that we used to compare eruption mechanisms at El Reventador volcano for two datasets of explosions recorded at a Broad-Band BB seismic and infrasonic station located at ~5 kilometers from the vent. We conclude that the acoustic energy EA released during explosive activity (VASR η = 0.47, standard deviation σ = 0.8) is higher than the EA released during effusive activity; therefore, producing the highest values of η. Furthermore, we realized the analysis and characterization of the eruptive intensity for two episodes at high seismicity, calculating a η three-time higher for an episode of effusive activity with an occasional explosive component (η = 0.32, and σ = 0.42), than a η for an episode of only effusive activity (η = 0.11, and σ = 0.18), but more energetic.

Keywords: effusive, explosion quakes, explosive, Strombolian, VASR

Procedia PDF Downloads 184
310 Identifying Autism Spectrum Disorder Using Optimization-Based Clustering

Authors: Sharifah Mousli, Sona Taheri, Jiayuan He

Abstract:

Autism spectrum disorder (ASD) is a complex developmental condition involving persistent difficulties with social communication, restricted interests, and repetitive behavior. The challenges associated with ASD can interfere with an affected individual’s ability to function in social, academic, and employment settings. Although there is no effective medication known to treat ASD, to our best knowledge, early intervention can significantly improve an affected individual’s overall development. Hence, an accurate diagnosis of ASD at an early phase is essential. The use of machine learning approaches improves and speeds up the diagnosis of ASD. In this paper, we focus on the application of unsupervised clustering methods in ASD as a large volume of ASD data generated through hospitals, therapy centers, and mobile applications has no pre-existing labels. We conduct a comparative analysis using seven clustering approaches such as K-means, agglomerative hierarchical, model-based, fuzzy-C-means, affinity propagation, self organizing maps, linear vector quantisation – as well as the recently developed optimization-based clustering (COMSEP-Clust) approach. We evaluate the performances of the clustering methods extensively on real-world ASD datasets encompassing different age groups: toddlers, children, adolescents, and adults. Our experimental results suggest that the COMSEP-Clust approach outperforms the other seven methods in recognizing ASD with well-separated clusters.

Keywords: autism spectrum disorder, clustering, optimization, unsupervised machine learning

Procedia PDF Downloads 115
309 Complete Ensemble Empirical Mode Decomposition with Adaptive Noise Temporal Convolutional Network for Remaining Useful Life Prediction of Lithium Ion Batteries

Authors: Jing Zhao, Dayong Liu, Shihao Wang, Xinghua Zhu, Delong Li

Abstract:

Uhumanned Underwater Vehicles generally operate in the deep sea, which has its own unique working conditions. Lithium-ion power batteries should have the necessary stability and endurance for use as an underwater vehicle’s power source. Therefore, it is essential to accurately forecast how long lithium-ion batteries will last in order to maintain the system’s reliability and safety. In order to model and forecast lithium battery Remaining Useful Life (RUL), this research suggests a model based on Complete Ensemble Empirical Mode Decomposition with Adaptive noise-Temporal Convolutional Net (CEEMDAN-TCN). In this study, two datasets, NASA and CALCE, which have a specific gap in capacity data fluctuation, are used to verify the model and examine the experimental results in order to demonstrate the generalizability of the concept. The experiments demonstrate the network structure’s strong universality and ability to achieve good fitting outcomes on the test set for various battery dataset types. The evaluation metrics reveal that the CEEMDAN-TCN prediction performance of TCN is 25% to 35% better than that of a single neural network, proving that feature expansion and modal decomposition can both enhance the model’s generalizability and be extremely useful in industrial settings.

Keywords: lithium-ion battery, remaining useful life, complete EEMD with adaptive noise, temporal convolutional net

Procedia PDF Downloads 152
308 The Effectiveness of Psychosocial Interventions for Survivors of Natural Disasters: A Systematic Review

Authors: Santhani M. Selveindran

Abstract:

Background: Natural disasters are traumatic global events that are becoming increasing more common, with significant psychosocial impact on survivors. This impact results not only in psychosocial distress but, for many, can lead to psychosocial disorders and chronic psychopathology. While there are currently available interventions that seek to prevent and treat these psychosocial sequelae, their effectiveness is uncertain. The evidence-base is emerging with more primary studies evaluating the effectiveness of various psychosocial interventions for survivors of natural disasters, which remains to be synthesized. Aim of Review: To identify, critically appraise and synthesize the current evidence-base on the effectiveness of psychosocial interventions in preventing or treating Post-Traumatic Stress Disorder (PTSD), Major Depressive Disorder (MDD) and/or Generalized Anxiety Disorder (GAD) in adults and children who are survivors of natural disasters. Methods: A protocol was developed as a guide to carry out this review. A systematic search was conducted in eight international electronic databases, three grey literature databases, one dissertation and thesis repository, websites of six humanitarian and non-governmental organizations renowned for their work on natural disasters, as well as bibliographic and citation searching for eligible articles. Papers meeting the specific inclusion criteria underwent quality assessment using the Downs and Black checklist. Data were extracted from the included papers and analysed by way of narrative synthesis. Results: Database and website searching returned 3777 papers where 31 met the criteria for inclusion. Additional 2 papers were obtained through bibliographic and citation searching. Methodological quality of most papers was fair. Twenty-five studies evaluated psychological interventions, five, social interventions whereas three studies evaluated ‘mixed’ psychological and social interventions. All studies, irrespective of methodological quality, reported post-intervention reductions in symptom scores for PTSD, depression and/or anxiety and where assessed, reduced diagnosis of PTSD and MDD, and produced improvements in self-efficacy and quality of life. Statistically significant results were seen in 27 studies. However, three studies demonstrated that the evaluated interventions may not have been very beneficial. Conclusions: The overall positive results suggest that any psychosocial interventions are favourable and should be delivered to all natural disaster survivors, irrespective of age, country, and phase of disaster. Yet, heterogeneity and methodological shortcomings of the current evidence-base makes it difficult to draw definite conclusions needed to formulate categorical guidance or frameworks. Further, rigorously conducted research is needed in this area, although the feasibility of such, given the context and nature of the problem, is also recognized.

Keywords: psychosocial interventions, natural disasters, survivors, effectiveness

Procedia PDF Downloads 154
307 A Monte Carlo Fuzzy Logistic Regression Framework against Imbalance and Separation

Authors: Georgios Charizanos, Haydar Demirhan, Duygu Icen

Abstract:

Two of the most impactful issues in classical logistic regression are class imbalance and complete separation. These can result in model predictions heavily leaning towards the imbalanced class on the binary response variable or over-fitting issues. Fuzzy methodology offers key solutions for handling these problems. However, most studies propose the transformation of the binary responses into a continuous format limited within [0,1]. This is called the possibilistic approach within fuzzy logistic regression. Following this approach is more aligned with straightforward regression since a logit-link function is not utilized, and fuzzy probabilities are not generated. In contrast, we propose a method of fuzzifying binary response variables that allows for the use of the logit-link function; hence, a probabilistic fuzzy logistic regression model with the Monte Carlo method. The fuzzy probabilities are then classified by selecting a fuzzy threshold. Different combinations of fuzzy and crisp input, output, and coefficients are explored, aiming to understand which of these perform better under different conditions of imbalance and separation. We conduct numerical experiments using both synthetic and real datasets to demonstrate the performance of the fuzzy logistic regression framework against seven crisp machine learning methods. The proposed framework shows better performance irrespective of the degree of imbalance and presence of separation in the data, while the considered machine learning methods are significantly impacted.

Keywords: fuzzy logistic regression, fuzzy, logistic, machine learning

Procedia PDF Downloads 74
306 Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning

Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan

Abstract:

We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor on the ceiling, to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate into the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared the performance, in terms of accuracy, and training progress. Although the collected dataset is relatively small, the proposed algorithm has a good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.

Keywords: daily activity recognition, healthcare, IoT sensors, transfer learning

Procedia PDF Downloads 132
305 A Case Study for User Rating Prediction on Automobile Recommendation System Using Mapreduce

Authors: Jiao Sun, Li Pan, Shijun Liu

Abstract:

Recommender systems have been widely used in contemporary industry, and plenty of work has been done in this field to help users to identify items of interest. Collaborative Filtering (CF, for short) algorithm is an important technology in recommender systems. However, less work has been done in automobile recommendation system with the sharp increase of the amount of automobiles. What’s more, the computational speed is a major weakness for collaborative filtering technology. Therefore, using MapReduce framework to optimize the CF algorithm is a vital solution to this performance problem. In this paper, we present a recommendation of the users’ comment on industrial automobiles with various properties based on real world industrial datasets of user-automobile comment data collection, and provide recommendation for automobile providers and help them predict users’ comment on automobiles with new-coming property. Firstly, we solve the sparseness of matrix using previous construction of score matrix. Secondly, we solve the data normalization problem by removing dimensional effects from the raw data of automobiles, where different dimensions of automobile properties bring great error to the calculation of CF. Finally, we use the MapReduce framework to optimize the CF algorithm, and the computational speed has been improved times. UV decomposition used in this paper is an often used matrix factorization technology in CF algorithm, without calculating the interpolation weight of neighbors, which will be more convenient in industry.

Keywords: collaborative filtering, recommendation, data normalization, mapreduce

Procedia PDF Downloads 217
304 Identification of Hepatocellular Carcinoma Using Supervised Learning Algorithms

Authors: Sagri Sharma

Abstract:

Analysis of diseases integrating multi-factors increases the complexity of the problem and therefore, development of frameworks for the analysis of diseases is an issue that is currently a topic of intense research. Due to the inter-dependence of the various parameters, the use of traditional methodologies has not been very effective. Consequently, newer methodologies are being sought to deal with the problem. Supervised Learning Algorithms are commonly used for performing the prediction on previously unseen data. These algorithms are commonly used for applications in fields ranging from image analysis to protein structure and function prediction and they get trained using a known dataset to come up with a predictor model that generates reasonable predictions for the response to new data. Gene expression profiles generated by DNA analysis experiments can be quite complex since these experiments can involve hypotheses involving entire genomes. The application of well-known machine learning algorithm - Support Vector Machine - to analyze the expression levels of thousands of genes simultaneously in a timely, automated and cost effective way is thus used. The objectives to undertake the presented work are development of a methodology to identify genes relevant to Hepatocellular Carcinoma (HCC) from gene expression dataset utilizing supervised learning algorithms and statistical evaluations along with development of a predictive framework that can perform classification tasks on new, unseen data.

Keywords: artificial intelligence, biomarker, gene expression datasets, hepatocellular carcinoma, machine learning, supervised learning algorithms, support vector machine

Procedia PDF Downloads 429
303 Autism Disease Detection Using Transfer Learning Techniques: Performance Comparison between Central Processing Unit vs. Graphics Processing Unit Functions for Neural Networks

Authors: Mst Shapna Akter, Hossain Shahriar

Abstract:

Neural network approaches are machine learning methods used in many domains, such as healthcare and cyber security. Neural networks are mostly known for dealing with image datasets. While training with the images, several fundamental mathematical operations are carried out in the Neural Network. The operation includes a number of algebraic and mathematical functions, including derivative, convolution, and matrix inversion and transposition. Such operations require higher processing power than is typically needed for computer usage. Central Processing Unit (CPU) is not appropriate for a large image size of the dataset as it is built with serial processing. While Graphics Processing Unit (GPU) has parallel processing capabilities and, therefore, has higher speed. This paper uses advanced Neural Network techniques such as VGG16, Resnet50, Densenet, Inceptionv3, Xception, Mobilenet, XGBOOST-VGG16, and our proposed models to compare CPU and GPU resources. A system for classifying autism disease using face images of an autistic and non-autistic child was used to compare performance during testing. We used evaluation matrices such as Accuracy, F1 score, Precision, Recall, and Execution time. It has been observed that GPU runs faster than the CPU in all tests performed. Moreover, the performance of the Neural Network models in terms of accuracy increases on GPU compared to CPU.

Keywords: autism disease, neural network, CPU, GPU, transfer learning

Procedia PDF Downloads 118
302 Traffic Prediction with Raw Data Utilization and Context Building

Authors: Zhou Yang, Heli Sun, Jianbin Huang, Jizhong Zhao, Shaojie Qiao

Abstract:

Traffic prediction is essential in a multitude of ways in modern urban life. The researchers of earlier work in this domain carry out the investigation chiefly with two major focuses: (1) the accurate forecast of future values in multiple time series and (2) knowledge extraction from spatial-temporal correlations. However, two key considerations for traffic prediction are often missed: the completeness of raw data and the full context of the prediction timestamp. Concentrating on the two drawbacks of earlier work, we devise an approach that can address these issues in a two-phase framework. First, we utilize the raw trajectories to a greater extent through building a VLA table and data compression. We obtain the intra-trajectory features with graph-based encoding and the intertrajectory ones with a grid-based model and the technique of back projection that restore their surrounding high-resolution spatial-temporal environment. To the best of our knowledge, we are the first to study direct feature extraction from raw trajectories for traffic prediction and attempt the use of raw data with the least degree of reduction. In the prediction phase, we provide a broader context for the prediction timestamp by taking into account the information that are around it in the training dataset. Extensive experiments on several well-known datasets have verified the effectiveness of our solution that combines the strength of raw trajectory data and prediction context. In terms of performance, our approach surpasses several state-of-the-art methods for traffic prediction.

Keywords: traffic prediction, raw data utilization, context building, data reduction

Procedia PDF Downloads 127
301 Ontology Expansion via Synthetic Dataset Generation and Transformer-Based Concept Extraction

Authors: Andrey Khalov

Abstract:

The rapid proliferation of unstructured data in IT infrastructure management demands innovative approaches for extracting actionable knowledge. This paper presents a framework for ontology-based knowledge extraction that combines relational graph neural networks (R-GNN) with large language models (LLMs). The proposed method leverages the DOLCE framework as the foundational ontology, extending it with concepts from ITSMO for domain-specific applications in IT service management and outsourcing. A key component of this research is the use of transformer-based models, such as DeBERTa-v3-large, for automatic entity and relationship extraction from unstructured texts. Furthermore, the paper explores how transfer learning techniques can be applied to fine-tune large language models (LLaMA) for using to generate synthetic datasets to improve precision in BERT-based entity recognition and ontology alignment. The resulting IT Ontology (ITO) serves as a comprehensive knowledge base that integrates domain-specific insights from ITIL processes, enabling more efficient decision-making. Experimental results demonstrate significant improvements in knowledge extraction and relationship mapping, offering a cutting-edge solution for enhancing cognitive computing in IT service environments.

Keywords: ontology expansion, synthetic dataset, transformer fine-tuning, concept extraction, DOLCE, BERT, taxonomy, LLM, NER

Procedia PDF Downloads 14
300 In-Context Meta Learning for Automatic Designing Pretext Tasks for Self-Supervised Image Analysis

Authors: Toktam Khatibi

Abstract:

Self-supervised learning (SSL) includes machine learning models that are trained on one aspect and/or one part of the input to learn other aspects and/or part of it. SSL models are divided into two different categories, including pre-text task-based models and contrastive learning ones. Pre-text tasks are some auxiliary tasks learning pseudo-labels, and the trained models are further fine-tuned for downstream tasks. However, one important disadvantage of SSL using pre-text task solving is defining an appropriate pre-text task for each image dataset with a variety of image modalities. Therefore, it is required to design an appropriate pretext task automatically for each dataset and each downstream task. To the best of our knowledge, the automatic designing of pretext tasks for image analysis has not been considered yet. In this paper, we present a framework based on In-context learning that describes each task based on its input and output data using a pre-trained image transformer. Our proposed method combines the input image and its learned description for optimizing the pre-text task design and its hyper-parameters using Meta-learning models. The representations learned from the pre-text tasks are fine-tuned for solving the downstream tasks. We demonstrate that our proposed framework outperforms the compared ones on unseen tasks and image modalities in addition to its superior performance for previously known tasks and datasets.

Keywords: in-context learning (ICL), meta learning, self-supervised learning (SSL), vision-language domain, transformers

Procedia PDF Downloads 80
299 Carbon Skimming: Towards an Application to Summarise and Compare Embodied Carbon to Aid Early-Stage Decision Making

Authors: Rivindu Nethmin Bandara Menik Hitihamy Mudiyanselage, Matthias Hank Haeusler, Ben Doherty

Abstract:

Investors and clients in the Architectural, Engineering and Construction industry find it difficult to understand complex datasets and reports with little to no graphic representation. The stakeholders examined in this paper include designers, design clients and end-users. Communicating embodied carbon information graphically and concisely can aid with decision support early in a building's life cycle. It is essential to create a common visualisation approach as the level of knowledge about embodied carbon varies between stakeholders. The tool, designed in conjunction with Bates Smart, condenses Tally Life Cycle Assessment data to a carbon hot-spotting visualisation, highlighting the sections with the highest amounts of embodied carbon. This allows stakeholders at every stage of a given project to have a better understanding of the carbon implications with minimal effort. It further allows stakeholders to differentiate building elements by their carbon values, which enables the evaluation of the cost-effectiveness of the selected materials at an early stage. To examine and build a decision-support tool, an action-design research methodology of cycles of iterations was used along with precedents of embodied carbon visualising tools. Accordingly, the importance of visualisation and Building Information Modelling are also explored to understand the best format for relaying these results.

Keywords: embodied carbon, visualisation, summarisation, data filtering, early-stage decision-making, materiality

Procedia PDF Downloads 82