Search results for: inference on the semantic web
286 A Proposed Framework for Software Redocumentation Using Distributed Data Processing Techniques and Ontology
Authors: Laila Khaled Almawaldi, Hiew Khai Hang, Sugumaran A. l. Nallusamy
Abstract:
Legacy systems are crucial for organizations, but their intricacy and lack of documentation pose challenges for maintenance and enhancement. Redocumentation of legacy systems is vital for automatically or semi-automatically creating documentation for software lacking sufficient records. It aims to enhance system understandability, maintainability, and knowledge transfer. However, existing redocumentation methods need improvement in data processing performance and document generation efficiency. This stems from the necessity to efficiently handle the extensive and complex code of legacy systems. This paper proposes a method for semi-automatic legacy system redocumentation using semantic parallel processing and ontology. Leveraging parallel processing and ontology addresses current challenges by distributing the workload and creating documentation with logically interconnected data. The paper outlines challenges in legacy system redocumentation and suggests a method of redocumentation using parallel processing and ontology for improved efficiency and effectiveness.
Keywords: legacy systems, redocumentation, big data analysis, parallel processing
Procedia PDF Downloads 46
285 Bayesian Inference for High Dimensional Dynamic Spatio-Temporal Models
Authors: Sofia M. Karadimitriou, Kostas Triantafyllopoulos, Timothy Heaton
Abstract:
Reduced dimension Dynamic Spatio-Temporal Models (DSTMs) jointly describe the spatial and temporal evolution of a function observed subject to noise. A basic state space model is adopted for the discrete temporal variation, while a continuous autoregressive structure describes the continuous spatial evolution. Application of such a DSTM relies upon the pre-selection of a suitable reduced set of basis functions, and this can present a challenge in practice. In this talk, we propose an online estimation method for high dimensional spatio-temporal data based upon DSTM, and we attempt to resolve this issue by allowing the basis to adapt to the observed data. Specifically, we present a wavelet decomposition in order to obtain a parsimonious approximation of the spatial continuous process. This parsimony can be achieved by placing a Laplace prior distribution on the wavelet coefficients. The aim of using the Laplace prior is to filter out wavelet coefficients with low contribution and thus achieve the dimension reduction with significant computation savings. We then propose a Hierarchical Bayesian State Space model, for the estimation of which we offer an appropriate particle filter algorithm. The proposed methodology is illustrated using real environmental data.
Keywords: multidimensional Laplace prior, particle filtering, spatio-temporal modelling, wavelets
Procedia PDF Downloads 427
284 An Efficient Propensity Score Method for Causal Analysis With Application to Case-Control Study in Breast Cancer Research
Authors: Ms Azam Najafkouchak, David Todem, Dorothy Pathak, Pramod Pathak, Joseph Gardiner
Abstract:
Propensity score (PS) methods have recently become a standard tool for causal inference in observational studies, where exposure is not randomly assigned and confounding can therefore bias the estimated treatment effect on the outcome. For a binary outcome, the treatment effect can be estimated by odds ratios, relative risks, and risk differences. However, different PS methods may give different estimates of the treatment effect. The main PS methods in use include matching, inverse probability weighting, stratification, and covariate adjustment on the PS. Because of the dangers of discretizing continuous variables (exposure, covariates), this paper focuses on how variation in cut-points or boundaries affects the average treatment effect (ATE) estimated by PS stratification. To avoid choosing arbitrary cut-points, we instead continuously discretize the PS and accumulate information across all cut-points for inference. We use Monte Carlo simulation to evaluate the ATE, focusing on two PS methods: stratification and covariate adjustment on the PS. We then illustrate the approach with data from a case-control study of breast cancer, the Polish Women’s Health Study.
Keywords: average treatment effect, propensity score, stratification, covariate adjusted, Monte Carlo estimation, breast cancer, case-control study
Procedia PDF Downloads 105
283 On the Utility of Bidirectional Transformers in Gene Expression-Based Classification
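As a rough illustration of the stratification idea in the abstract above, the following minimal sketch estimates a risk-difference ATE by splitting units into equal-sized propensity score strata. It assumes the propensity scores have already been estimated; the function name `stratified_ate` and the toy data are hypothetical, and this is plain stratification with a fixed number of strata, not the authors' continuous-discretization method.

```python
def stratified_ate(ps, treated, outcome, n_strata):
    """Risk-difference ATE estimated by propensity score stratification.

    ps       -- estimated propensity scores (assumed given)
    treated  -- 1 if the unit was exposed, 0 otherwise
    outcome  -- binary outcome (0/1)
    n_strata -- number of equal-sized strata (the cut-point choice)
    """
    # Order units by propensity score, then cut into equal-sized strata.
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    n = len(order)
    weighted_sum = 0.0
    total_weight = 0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [outcome[i] for i in idx if treated[i]]
        y0 = [outcome[i] for i in idx if not treated[i]]
        if not y1 or not y0:
            continue  # no treated/control overlap in this stratum; skip it
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        weighted_sum += len(idx) * diff
        total_weight += len(idx)
    return weighted_sum / total_weight if total_weight else float("nan")


# Toy data, invented for illustration only.
ps = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
treated = [0, 1, 0, 1, 0, 1, 0, 1]
outcome = [0, 1, 0, 1, 1, 1, 0, 1]
ate_two_strata = stratified_ate(ps, treated, outcome, n_strata=2)
```

Calling the function with several values of `n_strata` on real data is one simple way to see how sensitive the estimate is to the choice of cut-points.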
Authors: Babak Forouraghi
Abstract:
A genetic circuit is a collection of interacting genes and proteins that enable individual cells to implement and perform vital biological functions such as cell division, growth, death, and signaling. In cell engineering, synthetic gene circuits are engineered networks of genes specifically designed to implement functionalities that are not evolved by nature. These engineered networks enable scientists to tackle complex problems such as engineering cells to produce therapeutics within the patient's body, altering T cells to target cancer-related antigens for treatment, improving antibody production using engineered cells, tissue engineering, and production of genetically modified plants and livestock. Construction of computational models to realize genetic circuits is an especially challenging task since it requires the discovery of the flow of genetic information in complex biological systems. Building synthetic biological models is also a time-consuming process with relatively low prediction accuracy for highly complex genetic circuits. The primary goal of this study was to investigate the utility of a pre-trained bidirectional encoder transformer that can accurately predict gene expressions in genetic circuit designs. The main reason behind using transformers is their innate ability (attention mechanism) to take account of the semantic context present in long DNA chains that are heavily dependent on the spatial representation of their constituent genes. Previous approaches to gene circuit design, such as CNN and RNN architectures, are unable to capture semantic dependencies in long contexts, as required in most real-world applications of synthetic biology. For instance, RNN models (LSTM, GRU), although able to learn long-term dependencies, greatly suffer from vanishing gradient and low-efficiency problem when they sequentially process past states and compresses contextual information into a bottleneck with long input sequences. 
In other words, these architectures are not equipped with the necessary attention mechanisms to follow a long chain of genes with thousands of tokens. To address the above-mentioned limitations, a transformer model was built in this work as a variation to the existing DNA Bidirectional Encoder Representations from Transformers (DNABERT) model. It is shown that the proposed transformer is capable of capturing contextual information from long input sequences with an attention mechanism. In previous works on genetic circuit design, the traditional approaches to classification and regression, such as Random Forest, Support Vector Machine, and Artificial Neural Networks, were able to achieve reasonably high R² accuracy levels of 0.95 to 0.97. However, the transformer model utilized in this work, with its attention-based mechanism, was able to achieve a perfect accuracy level of 100%. Further, it is demonstrated that the efficiency of the transformer-based gene expression classifier is not dependent on the presence of large amounts of training examples, which may be difficult to compile in many real-world gene circuit designs.
Keywords: machine learning, classification and regression, gene circuit design, bidirectional transformers
Procedia PDF Downloads 61
282 Personal Knowledge Management: Systematic Review and Future Direction
Authors: Kuribachew Gizaw Tohiye, Monica Garfield
Abstract:
Personal knowledge management is the aspect of knowledge management that relates to the way in which individuals organize and manage their own set of knowledge. Although there has been research in this area for the past 25 years, it is now necessary to reflect on what research has been done and what we have discovered about this area of knowledge management. In contrast to organizational knowledge management, which focuses on a firm’s profitability and competitiveness, personal knowledge management (PKM) is concerned with the person’s self-effectiveness, competence and success. People are interested in managing their knowledge in order to become more efficient in a variety of personal and organizational interests. This study presents a systematic review of PKM studies. Articles with PKM concepts are reviewed with the objective of clearly defining PKM, identifying the benefits of PKM, classifying the tools that enable PKM and finding the research gaps to indicate future research directions in the area. Consequently, we have developed a definition of PKM and identified the benefits of PKM, including an understanding of who seeks PKM and for what. Tools enabling PKM are identified and classified under three categories: Web 1.0, 2.0 and 3.0. Finally, the research gap and future directions are suggested. Research that facilitates collaboration by using semantic technologies is suggested for further study to improve PKM effectiveness.
Keywords: personal knowledge management, knowledge management, organizational knowledge management, systematic review
Procedia PDF Downloads 331
281 Medicompills Architecture: A Mathematical Precise Tool to Reduce the Risk of Diagnosis Errors on Precise Medicine
Authors: Adriana Haulica
Abstract:
Powered by Machine Learning, Precise medicine is tailored by now to use genetic and molecular profiling, with the aim of optimizing the therapeutic benefits for cohorts of patients. As the majority of Machine Language algorithms come from heuristics, the outputs have contextual validity. This is not very restrictive in the sense that medicine itself is not an exact science. Meanwhile, the progress made in Molecular Biology, Bioinformatics, Computational Biology, and Precise Medicine, correlated with the huge amount of human biology data and the increase in computational power, opens new healthcare challenges. A more accurate diagnosis is needed along with real-time treatments by processing as much as possible from the available information. The purpose of this paper is to present a deeper vision for the future of Artificial Intelligence in Precise medicine. In fact, actual Machine Learning algorithms use standard mathematical knowledge, mostly Euclidian metrics and standard computation rules. The loss of information arising from the classical methods prevents obtaining 100% evidence on the diagnosis process. To overcome these problems, we introduce MEDICOMPILLS, a new architectural concept tool of information processing in Precise medicine that delivers diagnosis and therapy advice. This tool processes poly-field digital resources: global knowledge related to biomedicine in a direct or indirect manner but also technical databases, Natural Language Processing algorithms, and strong class optimization functions. As the name suggests, the heart of this tool is a compiler. The approach is completely new, tailored for omics and clinical data. Firstly, the intrinsic biological intuition is different from the well-known “a needle in a haystack” approach usually used when Machine Learning algorithms have to process differential genomic or molecular data to find biomarkers. 
Also, even if the input is seized from various types of data, the working engine inside the MEDICOMPILLS does not search for patterns as an integrative tool. This approach deciphers the biological meaning of input data up to the metabolic and physiologic mechanisms, based on a compiler with grammars issued from bio-algebra-inspired mathematics. It translates input data into bio-semantic units with the help of contextual information iteratively until Bio-Logical operations can be performed on the basis of the “common denominator” rule. The rigorousness of MEDICOMPILLS comes from the structure of the contextual information on functions, built to be analogous to mathematical “proofs”. The major impact of this architecture is expressed by the high accuracy of the diagnosis. Detected as a multiple conditions diagnostic, constituted by some main diseases along with unhealthy biological states, this format is highly suitable for therapy proposal and disease prevention. The use of MEDICOMPILLS architecture is highly beneficial for the healthcare industry. The expectation is to generate a strategic trend in Precise medicine, making medicine more like an exact science and reducing the considerable risk of errors in diagnostics and therapies. The tool can be used by pharmaceutical laboratories for the discovery of new cures. It will also contribute to better design of clinical trials and speed them up.
Keywords: bio-semantic units, multiple conditions diagnosis, NLP, omics
Procedia PDF Downloads 70
280 Data Driven Infrastructure Planning for Offshore Wind farms
Authors: Isha Saxena, Behzad Kazemtabrizi, Matthias C. M. Troffaes, Christopher Crabtree
Abstract:
The calculations done at the beginning of the life of a wind farm are rarely reliable, which makes it important to study the failure and repair rates of wind turbines under various conditions. This miscalculation happens because current models make the simplifying assumption that the failure/repair rate remains constant over time, which means that the reliability function is exponential in nature. This research aims to create a more accurate model using sensor data and a data-driven approach. Data cleaning and data processing are done by comparing the power curve data of the wind turbines with SCADA data. The result is then converted to times-to-repair and times-to-failure time-series data. Several different mathematical functions are fitted to the times-to-failure and times-to-repair data of the wind turbine components using Maximum Likelihood Estimation and the posterior expectation method for Bayesian parameter estimation. Initial results indicate that the two-parameter Weibull function and the exponential function produce almost identical results. Further analysis is being done using complex system analysis, considering the failures of each electrical and mechanical component of the wind turbine. The aim of this project is to perform a more accurate reliability analysis that can help engineers schedule maintenance and repairs to decrease turbine downtime.
Keywords: reliability, Bayesian parameter inference, maximum likelihood estimation, Weibull function, SCADA data
Procedia PDF Downloads 86
279 Argument Representation in Non-Spatial Motion Bahasa Melayu Based Conceptual Structure Theory
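For reference, the two reliability functions compared in the abstract above can be written in their standard textbook forms. The symbols $\lambda$ (constant failure rate), $\eta$ (Weibull scale) and $\beta$ (Weibull shape) are notation assumed here; they do not appear in the abstract.

```latex
% Exponential model: constant hazard rate \lambda
R_{\mathrm{exp}}(t) = e^{-\lambda t}, \qquad h_{\mathrm{exp}}(t) = \lambda

% Two-parameter Weibull model: scale \eta > 0, shape \beta > 0
R_{\mathrm{W}}(t) = \exp\!\left[-\left(\frac{t}{\eta}\right)^{\beta}\right], \qquad
h_{\mathrm{W}}(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}

% With \beta = 1 the Weibull model reduces to the exponential model
R_{\mathrm{W}}(t)\big|_{\beta=1} = e^{-t/\eta}, \qquad \lambda = \frac{1}{\eta}
```

Since the Weibull model contains the exponential model as the special case $\beta = 1$, one possible reading of near-identical fits is a fitted shape parameter close to 1. The abstract does not report the fitted values, so this is only an interpretation.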
Authors: Nurul Jamilah Binti Rosly
Abstract:
The typology of motion must be understood as a change from one location to another. From a conceptual point of view, however, motion can also occur in non-spatial contexts associated with human and social factors. The concept of non-spatial motion therefore involves the movement of time, ownership, identity, state, and existence. Accordingly, this study focuses on lexical items such as shared, accept, be, store, and exist as the study material. The data in this study were extracted from the Database of Languages and Literature Corpus Database, Malaysia, and were analyzed using semantic and syntactic concepts from Ray Jackendoff's (2002) Conceptual Structure Theory. Semantic representations are expressed in the form of conceptual structures in argument functions that include the functions [events], [situations], [objects], [paths] and [places]. The findings show that the mapping of these arguments comprises three main stages, namely mapping the argument structure, mapping the tree, and mapping the role of thematic items. Accordingly, this study will show the representation of non-spatial Malay language areas.
Keywords: arguments, concepts, constituencies, events, situations, thematics
Procedia PDF Downloads 129
278 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification
Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro
Abstract:
Voice recognition algorithms such as automatic speech recognition and text-to-speech systems with African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify user responses as inputs for an interactive voice response system. A dataset of the Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two-stage Data Augmentation approach is adopted to enlarge the dataset to the size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. To perform voice response classification, the recordings are transformed into sound frequency feature spectra and then classified with image classification methodology using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications associated with both web and mobile platforms.
Keywords: automatic speech recognition, interactive voice response, voice response recognition, Wolof word classification
Procedia PDF Downloads 116
277 The Language of Fliptop among Filipino Youth: A Discourse Analysis
Authors: Bong Borero Lumabao
Abstract:
This qualitative research is a study of the lines of Fliptop talks performed by Fliptop rappers, employing Finnegan’s (2008) discourse analysis. This paper aimed to analyze the phonological, morphological, and semantic features of Fliptop talk, to explore the structures in the lines of Fliptop among Filipino youth, and to uncover the various insights that can be gained from it. The corpora of the study included all 20 Fliptop videos downloaded from the YouTube channel of Fliptop. Results revealed that Fliptop contains phonological features such as assonance, consonance, deletion, lengthening, and rhyming. Morphological features include acronym, affixation, blending, borrowing, code-mixing and switching, compounding, conversion or functional shifts, and dysphemism. The semantic analysis presented the lexical category, meaning, and words used in the Fliptop talks. The structure of Fliptop revolves around the personal attack (physical attributes), the attack on the bars (rapping skills), extension to family members and friends, antithesis, profane words, figurative language, sexual undertones, anime characters, homosexuality, and the involvement of famous celebrities.
Keywords: discourse analysis, fliptop talks, Filipino youth, fliptop videos, Philippines
Procedia PDF Downloads 242
276 The Intersection/Union Region Computation for Drosophila Brain Images Using Encoding Schemes Based on Multi-Core CPUs
Authors: Ming-Yang Guo, Cheng-Xian Wu, Wei-Xiang Chen, Chun-Yuan Lin, Yen-Jen Lin, Ann-Shyn Chiang
Abstract:
As the number of Drosophila Driver and Neuron images grows, finding the similarity relationships among them becomes important for functional inference. A general problem is how to find a Drosophila Driver image that covers a set of Drosophila Driver/Neuron images. To solve this problem, the intersection/union region for a set of images is computed first, and a comparison step then calculates the similarities between that region and other images. In this paper, three encoding schemes, namely Integer, Boolean, and Decimal, are proposed to encode each image as a one-dimensional structure. The intersection/union region of these images can then be computed using compare operations, Boolean operators, and a lookup table method. Finally, the comparison is performed in the same way as the union region computation, and the similarity score is calculated according to the definition of the Tanimoto coefficient. These region computation methods are also implemented in a multi-core CPU environment with OpenMP. In the experimental results, the Boolean scheme performs best in the encoding phase, while the Decimal scheme performs best in the region computation phase when the number of images is large. The speedup ratio reaches 12 on 16 CPUs. This work was supported by the Ministry of Science and Technology under the grant MOST 106-2221-E-182-070.
Keywords: Drosophila driver image, Drosophila neuron images, intersection/union computation, parallel processing, OpenMP
Procedia PDF Downloads 239
275 Improved Performance in Content-Based Image Retrieval Using Machine Learning Approach
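The intersection/union and Tanimoto steps described in the abstract above can be sketched as follows. This is a minimal illustration that assumes each image has already been reduced to a flat sequence of 0/1 voxel flags; packing the flags into a Python integer stands in for a Boolean-style encoding, and all function names are hypothetical rather than taken from the paper.

```python
def encode_boolean(flags):
    """Pack a flat sequence of 0/1 flags into one integer bitmask."""
    mask = 0
    for position, on in enumerate(flags):
        if on:
            mask |= 1 << position
    return mask


def union_region(masks):
    """Bitwise OR of all masks: voxels present in at least one image."""
    region = 0
    for mask in masks:
        region |= mask
    return region


def intersection_region(masks):
    """Bitwise AND of all masks: voxels present in every image."""
    masks = list(masks)
    region = masks[0]
    for mask in masks[1:]:
        region &= mask
    return region


def tanimoto(a, b):
    """Tanimoto coefficient |A and B| / |A or B| of two bitmasks."""
    shared = bin(a & b).count("1")
    combined = bin(a | b).count("1")
    return shared / combined if combined else 1.0


# Two toy four-voxel "images", invented for illustration.
image_a = encode_boolean([1, 1, 0, 0])
image_b = encode_boolean([0, 1, 1, 0])
score = tanimoto(image_a, image_b)  # 1 shared voxel out of 3 occupied
```

Because the region computations are plain bitwise reductions over independent images, they split naturally across cores, which is in keeping with the multi-core setting the abstract describes.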
Authors: B. Ramesh Naik, T. Venugopal
Abstract:
This paper presents a novel machine learning approach that improves the high-level semantics of images. Contemporary approaches to image retrieval and object recognition include Fourier transforms, wavelets, SIFT, and HoG. Although these descriptors are helpful in a wide range of applications, they exploit zero-order statistics and therefore lack highly descriptive image features. They usually rely on primitive visual features such as shape, color, texture, and spatial location to describe images. These features are not adequate to describe the high-level semantics of images. The resulting semantic gap leads to unacceptable performance in image retrieval systems. A novel method, referred to as discriminative learning, is proposed; it is derived from a machine learning approach and efficiently discriminates image features. The analysis and results of the proposed approach were validated thoroughly on the WANG and Caltech-101 databases. The results showed that this approach is very competitive in content-based image retrieval.
Keywords: CBIR, discriminative learning, region weight learning, scale invariant feature transforms
Procedia PDF Downloads 181
274 Next-Gen Solutions: How Generative AI Will Reshape Businesses
Authors: Aishwarya Rai
Abstract:
This study explores the transformative influence of generative AI on startups, businesses, and industries. We will explore how large businesses can benefit in the area of customer operations, where AI-powered chatbots can improve self-service and agent effectiveness, greatly increasing efficiency. In marketing and sales, generative AI could transform businesses by automating content development, data utilization, and personalization, resulting in a substantial increase in marketing and sales productivity. In software engineering-focused startups, generative AI can streamline activities, significantly impacting coding processes and work experiences. It can be extremely useful in product R&D for market analysis, virtual design, simulations, and test preparation, altering old workflows and increasing efficiency. Zooming into the retail and CPG industry, industry findings suggest a 1-2% increase in annual revenues, equating to $400 billion to $660 billion. By automating customer service, marketing, sales, and supply chain management, generative AI can streamline operations, optimizing personalized offerings and presenting itself as a disruptive force. While celebrating economic potential, we acknowledge challenges like external inference and adversarial attacks. Human involvement remains crucial for quality control and security in the era of generative AI-driven transformative innovation. This talk provides a comprehensive exploration of generative AI's pivotal role in reshaping businesses, recognizing its strategic impact on customer interactions, productivity, and operational efficiency.
Keywords: generative AI, digital transformation, LLM, artificial intelligence, startups, businesses
Procedia PDF Downloads 76
273 Progressive Multimedia Collection Structuring via Scene Linking
Authors: Aman Berhe, Camille Guinaudeau, Claude Barras
Abstract:
In order to facilitate information seeking in large collections of multimedia documents with long and progressive content (such as broadcast news or TV series), one can extract the semantic links that exist between semantically coherent parts of documents, i.e., scenes. The links can then create a coherent collection of scenes from which it is easier to perform content analysis, topic extraction, or information retrieval. In this paper, we focus on TV series structuring and propose two approaches for scene linking at different levels of granularity (episode and season): a fuzzy online clustering technique and a graph-based community detection algorithm. When evaluated on the first two seasons of the TV series Game of Thrones, the fuzzy online clustering approach performed better than graph-based community detection at the episode level, while graph-based approaches showed better performance at the season level.
Keywords: multimedia collection structuring, progressive content, scene linking, fuzzy clustering, community detection
Procedia PDF Downloads 100
272 From the “Movement Language” to Communication Language
Authors: Mahmudjon Kuchkarov, Marufjon Kuchkarov
Abstract:
The origin of ‘Human Language’ is still a secret and the most interesting subject of historical linguistics. The core element is the nature of labeling or coding things or processes with symbols and sounds. In this paper, we investigate humans’ involuntary Paired Sounds and Shape Production (PSSP) and its contribution to the development of early human communication. Working with twenty-six volunteers who performed many physical movements of varying difficulty, the research team investigated the natural, repeatable, and paired sounds and shape productions during human activities. The paper claims the involvement of Paired Sounds and Shape Production (PSSP) in the phonetic origin of some modern words and the existence of similarities between elements of PSSP and characters of the classic Latin alphabet. The results may be used not only as a supporting idea for existing theories but also to take a closer look at the fundamental nature of the origin of languages.
Keywords: body shape, body language, coding, Latin alphabet, merging method, movement language, movement sound, natural sound, origin of language, pairing, phonetics, sound and shape production, word origin, word semantic
Procedia PDF Downloads 249
271 Words of Peace in the Speeches of the Egyptian President, Abdulfattah El-Sisi: A Corpus-Based Study
Authors: Mohamed S. Negm, Waleed S. Mandour
Abstract:
The present study aims primarily at investigating words of peace (lexemes of peace) in the formal speeches of the Egyptian president Abdulfattah El-Sisi over a two-year span, from 2018 to 2019. This paper attempts not only to shed light on the contextual use of the antonyms war and peace, but also to underpin this with quantitative analysis through the current methods of corpus linguistics. As such, the researchers have deployed a corpus-based approach in collecting, encoding, and processing 30 presidential speeches over the stated period (23,411 words and 25,541 tokens in total). Further, semantic fields and collocational networks are identified and compared statistically. Results have shown a significant propensity for adopting peace, including its relevant collocation network, textually and therefore ideationally, at the expense of the concept of war, which in most cases surfaces euphemistically through the noun conflict. The president has not justified the action of war with an honorable cause or a valid reason. Such results, so far, have indicated a positive sociopolitical mindset on the part of the Egyptian president and, moreover, reveal national and international fair dealing on arising issues.
Keywords: CADS, collocation network, corpus linguistics, critical discourse analysis
Procedia PDF Downloads 155
270 Development of Fault Diagnosis Technology for Power System Based on Smart Meter
Authors: Chih-Chieh Yang, Chung-Neng Huang
Abstract:
In power systems, improving the fault diagnosis technology for transmission lines has always been a primary goal of power grid operators. In recent years, with the rise of green energy, the addition of all kinds of distributed power sources has also affected the stability of the power system. Smart meters provide data recording and bidirectional transmission, and the adaptive neuro-fuzzy inference system (ANFIS) is an artificial intelligence technique with learning and estimation capabilities. For the transmission network, in order to avoid misjudging the fault type and location because of the input of these unstable power sources, this study combines the above advantages of smart meters and ANFIS to propose a method for identifying fault types and fault locations. In ANFIS training, the bus voltage and current information collected by smart meters is used to train the model through the ANFIS tool in MATLAB, generating fault codes that identify different types of faults and their locations. In addition, because of the uncertainty of distributed generation, a wind power system is added to the transmission network to verify the correctness of the diagnosis. Simulation results show that the proposed method can correctly and more efficiently identify the fault type and fault location, and can deal with the interference caused by the addition of unstable power sources.
Keywords: ANFIS, fault diagnosis, power system, smart meter
Procedia PDF Downloads 139
269 Deep-Learning Coupled with Pragmatic Categorization Method to Classify the Urban Environment of the Developing World
Authors: Qianwei Cheng, A. K. M. Mahbubur Rahman, Anis Sarker, Abu Bakar Siddik Nayem, Ovi Paul, Amin Ahsan Ali, M. Ashraful Amin, Ryosuke Shibasaki, Moinul Zaber
Abstract:
Thomas Friedman, in his famous book, argued that the world in this 21st century is flat and will continue to be flatter. This is attributed to rapid globalization and the interdependence of humanity that engendered tremendous in-flow of human migration towards the urban spaces. In order to keep the urban environment sustainable, policy makers need to plan based on extensive analysis of the urban environment. With the advent of high definition satellite images, high resolution data, computational methods such as deep neural network analysis, and hardware capable of high-speed analysis; urban planning is seeing a paradigm shift. Legacy data on urban environments are now being complemented with high-volume, high-frequency data. However, the first step of understanding urban space lies in useful categorization of the space that is usable for data collection, analysis, and visualization. In this paper, we propose a pragmatic categorization method that is readily usable for machine analysis and show applicability of the methodology on a developing world setting. Categorization to plan sustainable urban spaces should encompass the buildings and their surroundings. However, the state-of-the-art is mostly dominated by classification of building structures, building types, etc. and largely represents the developed world. Hence, these methods and models are not sufficient for developing countries such as Bangladesh, where the surrounding environment is crucial for the categorization. Moreover, these categorizations propose small-scale classifications, which give limited information, have poor scalability and are slow to compute in real time. Our proposed method is divided into two steps-categorization and automation. We categorize the urban area in terms of informal and formal spaces and take the surrounding environment into account. 50 km × 50 km Google Earth image of Dhaka, Bangladesh was visually annotated and categorized by an expert and consequently a map was drawn. 
The categorization is based broadly on two dimensions: the state of urbanization and the architectural form of the urban environment. Consequently, the urban space is divided into four categories: 1) highly informal area; 2) moderately informal area; 3) moderately formal area; and 4) highly formal area. In total, sixteen sub-categories were identified. For semantic segmentation and automatic categorization, Google’s DeeplabV3+ model was used. The model uses the atrous convolution operation to analyze different layers of texture and shape. This allows us to enlarge the field of view of the filters to incorporate larger context. Imagery covering 70% of the urban space was used to train the model, and the remaining 30% was used for testing and validation. The model is able to segment with 75% accuracy and 60% Mean Intersection over Union (mIoU). In this paper, we propose a pragmatic categorization method that is readily applicable for automatic use in both developing and developed world contexts. The method can be augmented for real-time socio-economic comparative analysis among cities. It can be an essential tool for policy makers to plan future sustainable urban spaces.
Keywords: semantic segmentation, urban environment, deep learning, urban building, classification
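The two evaluation figures quoted in this abstract, pixel accuracy and mean Intersection over Union, can both be derived from a confusion matrix. The following is a minimal sketch on a made-up 2 × 4 label map with four classes; the function names and the toy data are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    idx = y_true.ravel() * num_classes + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Average of per-class intersection / union, skipping absent classes."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0
    return (tp[valid] / union[valid]).mean()

# Hypothetical toy label maps (4 classes, e.g. the four formality levels).
y_true = np.array([[0, 0, 1, 1],
                   [2, 2, 3, 3]])
y_pred = np.array([[0, 1, 1, 1],
                   [2, 2, 3, 0]])

cm = confusion_matrix(y_true, y_pred, num_classes=4)
print("pixel accuracy:", pixel_accuracy(cm))
print("mIoU:", mean_iou(cm))
```

mIoU is typically lower than pixel accuracy because every class contributes equally to the mean, so errors on small classes weigh more heavily.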
Procedia PDF Downloads 191
268 Supplier Risk Management: A Multivariate Statistical Modelling and Portfolio Optimization Based Approach for Supplier Delivery Performance Development
Authors: Jiahui Yang, John Quigley, Lesley Walls
Abstract:
In this paper, the authors develop a stochastic model regarding the investment in supplier delivery performance development from a buyer’s perspective. The authors propose a multivariate model through a Multinomial-Dirichlet distribution within an Empirical Bayesian inference framework, representing both the epistemic and aleatory uncertainties in deliveries. A closed-form solution is obtained and the lower and upper bounds for both the optimal investment level and the expected profit under uncertainty are derived. The theoretical properties provide decision makers with useful insights regarding supplier delivery performance improvement problems where multiple delivery statuses are involved. The authors also extend the model from a single supplier investment into a supplier portfolio, using a Lagrangian method to obtain a theoretical expression for an optimal investment level and overall expected profit. The model enables a buyer to know how the marginal expected profit/investment level of each supplier changes with respect to the budget and which supplier should be invested in when additional budget is available. An application of this model is illustrated in a simulation study. Overall, the main contribution of this study is to provide an optimal investment decision-making framework for supplier development, taking into account multiple delivery statuses as well as multiple projects.
Keywords: decision making, empirical Bayesian, portfolio optimization, supplier development, supply chain management
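The Multinomial-Dirichlet pairing mentioned in this abstract is conjugate, so observed delivery counts update the prior by simple addition. The sketch below shows only that generic update; the three delivery statuses, the prior values, and the counts are hypothetical. In an empirical Bayes setting the prior would be estimated from pooled supplier data, whereas here it is fixed by assumption.

```python
import numpy as np

def dirichlet_posterior(alpha_prior, counts):
    """Conjugate update: posterior Dirichlet parameters = prior + observed counts."""
    return np.asarray(alpha_prior, dtype=float) + np.asarray(counts, dtype=float)

def predictive_probs(alpha):
    """Posterior predictive probability of each delivery status."""
    return alpha / alpha.sum()

# Hypothetical statuses: early, on time, late.
alpha_prior = [1.0, 2.0, 1.0]   # assumed prior pseudo-counts
counts = [2, 15, 3]             # assumed deliveries observed for one supplier

alpha_post = dirichlet_posterior(alpha_prior, counts)
probs = predictive_probs(alpha_post)
print(dict(zip(["early", "on time", "late"], probs.round(4))))
```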
Procedia PDF Downloads 288
267 Fast Approximate Bayesian Contextual Cold Start Learning (FAB-COST)
Authors: Jack R. McKenzie, Peter A. Appleby, Thomas House, Neil Walton
Abstract:
Cold-start is a notoriously difficult problem which can occur in recommendation systems, and arises when there is insufficient information to draw inferences for users or items. To address this challenge, a contextual bandit algorithm – the Fast Approximate Bayesian Contextual Cold Start Learning algorithm (FAB-COST) – is proposed, which is designed to provide improved accuracy compared to the traditionally used Laplace approximation in the logistic contextual bandit, while controlling both algorithmic complexity and computational cost. To this end, FAB-COST uses a combination of two moment projection variational methods: Expectation Propagation (EP), which performs well at the cold start, but becomes slow as the amount of data increases; and Assumed Density Filtering (ADF), which has slower growth of computational cost with data size but requires more data to obtain an acceptable level of accuracy. By switching from EP to ADF when the dataset becomes large, it is able to exploit their complementary strengths. The empirical justification for FAB-COST is presented, and the algorithm is systematically compared to other approaches on simulated data. In a benchmark against the Laplace approximation on real data consisting of over 670,000 impressions from autotrader.co.uk, FAB-COST demonstrates, at one point, an increase of over 16% in user clicks. On the basis of these results, it is argued that FAB-COST is likely to be an attractive approach to cold-start recommendation systems in a variety of contexts.
Keywords: cold-start learning, expectation propagation, multi-armed bandits, Thompson Sampling, variational inference
Procedia PDF Downloads 108
266 A Guide to User-Friendly Bash Prompt: Adding Natural Language Processing Plus Bash Explanation to the Command Interface
Authors: Teh Kean Kheng, Low Soon Yee, Burra Venkata Durga Kumar
Abstract:
As the world becomes increasingly computer-driven, more individuals in 2022 are attempting to learn coding on their own or in school, having discovered the value of learning to code and the benefits it will provide them. But learning to code is difficult for most people. Even senior programmers with a decade of experience still need help from online sources while coding. The reason is that coding is not like talking to other people; it has a specific syntax that makes the computer understand what we want it to do, so coding is hard for ordinary people who have had no previous contact with the field. Learning Bash through the Bash prompt is harder still: the prompt is just an empty box waiting for the user to tell the computer what to do, and without referring to the internet, a user will not know what can be done with it. From this, we conclude that the Bash prompt is not user-friendly for new users who are learning Bash. Our goal in this paper is to propose an implementation of a user-friendly Bash prompt in Ubuntu OS that uses Artificial Intelligence (AI) to lower the threshold of learning Bash, allowing users to write and learn Bash code using their own words and concepts.
Keywords: user-friendly, bash code, artificial intelligence, threshold, semantic similarity, lexical similarity
Procedia PDF Downloads 142
265 Programming without Code: An Approach and Environment to Conditions-On-Data Programming
Authors: Philippe Larvet
Abstract:
This paper presents the concept of an object-based programming language where tests (if... then... else) and control structures (while, repeat, for...) disappear and are replaced by conditions on data. According to the object paradigm, by using this concept, data are still embedded inside objects, as variable-value couples, but object methods are expressed in the form of logical propositions (‘conditions on data’, or CODs). For instance: variable1 = value1 AND variable2 > value2 => variable3 = value3. Implementing this approach, a central inference engine cycles through the objects, examining them one after another and collecting all CODs of each object. CODs are considered as rules in a rule-based system: the left part of each proposition (left side of the ‘=>’ sign) is the premise and the right part is the conclusion. So, premises are evaluated and conclusions are fired. Conclusions modify the variable-value couples of the object and the engine goes on to examine the next object. The paper develops the principles of writing CODs instead of complex algorithms. Through samples, the paper also presents several hints for implementing a simple mechanism able to process this ‘COD language’. The proposed approach can be used within the context of simulation, process control, industrial systems validation, etc. By writing simple and rigorous conditions on data, instead of using classical and long-to-learn languages, engineers and specialists can easily simulate and validate the functioning of complex systems.
Keywords: conditions on data, logical proposition, programming without code, object-oriented programming, system simulation, system validation
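The engine loop described in this abstract (visit each object, evaluate each premise, fire the conclusion, move on) can be sketched in a few lines. This is only an illustrative reading of the abstract, not the author's implementation: the class names `COD` and `DataObject`, the stop-when-nothing-changes rule, and the tank example are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

State = Dict[str, Any]

@dataclass
class COD:
    """One 'condition on data': premise => conclusion."""
    premise: Callable[[State], bool]
    conclusion: Callable[[State], None]

@dataclass
class DataObject:
    name: str
    state: State                       # the variable-value couples
    cods: List[COD] = field(default_factory=list)

def run_engine(objects: List[DataObject], max_cycles: int = 100) -> int:
    """Visit objects in turn, firing every COD whose premise holds.

    Stops when a full cycle changes no variable (an assumed stopping rule)
    and returns the number of cycles performed.
    """
    for cycle in range(1, max_cycles + 1):
        changed = False
        for obj in objects:
            for cod in obj.cods:
                if cod.premise(obj.state):
                    before = dict(obj.state)
                    cod.conclusion(obj.state)
                    changed = changed or obj.state != before
        if not changed:
            return cycle
    return max_cycles

# Hypothetical example: a tank that fills until it reaches level 3.
tank = DataObject(
    name="tank",
    state={"valve": "open", "level": 0},
    cods=[
        # valve = open AND level < 3 => level = level + 1
        COD(lambda s: s["valve"] == "open" and s["level"] < 3,
            lambda s: s.update(level=s["level"] + 1)),
        # level >= 3 => valve = closed
        COD(lambda s: s["level"] >= 3,
            lambda s: s.update(valve="closed")),
    ],
)

cycles = run_engine([tank])
print(cycles, tank.state)
```

Note that no `if`/`while` appears in the rules themselves; all control flow lives in the engine, which is the point the abstract makes.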
Procedia PDF Downloads 221
264 Application of RS and GIS Technique for Identifying Groundwater Potential Zone in Gomukhi Nadhi Sub Basin, South India
Authors: Punitha Periyasamy, Mahalingam Sudalaimuthu, Sachikanta Nanda, Arasu Sundaram
Abstract:
India holds 17.5% of the world’s population but has only 2% of the world’s total geographical area, and 27.35% of that area is categorized as wasteland due to scarce or absent groundwater. There is therefore heavy demand for groundwater for agricultural and non-agricultural activities to keep pace with its growth rate. With this in mind, an attempt is made to find the groundwater potential zones in the Gomukhi river sub-basin of the Vellar River basin, Tamil Nadu, India. The sub-basin covers an area of 1146.6 sq. km and comprises 9 blocks, from Peddanaickanpalayam to Villupuram. Thematic maps of geology, geomorphology, lineament, land use and land cover, and drainage are prepared for the study area using IRS P6 data. Collateral data, including rainfall, water level, and a soil map, are collected for analysis and inference. The digital elevation model (DEM) is generated using Shuttle Radar Topography Mission (SRTM) data, and the slope of the study area is obtained. ArcGIS 10.1 serves as a powerful spatial analysis tool to find the groundwater potential zones in the study area by means of weighted overlay analysis. Each parameter of the thematic maps is ranked and weighted in accordance with its influence on raising the groundwater level. The potential zones in the study area are classified as Very Good, Good, Moderate, and Poor, with areal extents of 15.67, 381.06, 575.38, and 174.49 sq. km, respectively.
Keywords: ArcGIS, DEM, groundwater, recharge, weighted overlay
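Weighted overlay analysis, as named in this abstract, combines ranked raster layers into a single score per cell and then bins the score into classes. The sketch below illustrates the idea with NumPy rather than ArcGIS; the two layers, their ranks (1 = least favourable, 4 = most favourable), the 60/40 weights, and the class breaks are all hypothetical values chosen for illustration.

```python
import numpy as np

def weighted_overlay(rank_layers, weights):
    """Weighted sum of ranked layers; weights are normalised to sum to 1."""
    total = float(sum(weights.values()))
    return sum((weights[name] / total) * rank_layers[name] for name in rank_layers)

# Hypothetical 2 x 2 rasters of ranks for two thematic layers.
rank_layers = {
    "geomorphology": np.array([[1, 2], [3, 4]], dtype=float),
    "slope":         np.array([[1, 4], [3, 4]], dtype=float),
}
weights = {"geomorphology": 60, "slope": 40}   # assumed percentage influence

score = weighted_overlay(rank_layers, weights)

# Assumed class breaks on the 1..4 score range.
labels = np.array(["Poor", "Moderate", "Good", "Very Good"])
classes = np.digitize(score, bins=[1.75, 2.5, 3.25])

print(score)
print(labels[classes])
```

Multiplying the cell count of each class by the cell area would give areal extents of the kind reported in the abstract.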
Procedia PDF Downloads 444
263 Geographic Information System for District Level Energy Performance Simulations
Authors: Avichal Malhotra, Jerome Frisch, Christoph van Treeck
Abstract:
The utilization of semantic, cadastral and topological data from geographic information systems (GIS) has exponentially increased for building and urban-scale energy performance simulations. Urban planners, simulation scientists, and researchers use virtual 3D city models for energy analysis, algorithms and simulation tools. For dynamic energy simulations at city and district level, this paper provides an overview of the available GIS data models and their levels of detail. Adhering to different norms and standards, these models also intend to describe building and construction industry data. For further investigations, CityGML data models are considered for simulations. Though geographical information modelling has many different implementations, extensions of virtual city data can also be made for domain-specific applications. Highlighting the use of the extended CityGML models for energy research, a brief introduction to the Energy Application Domain Extension (ADE) along with its significance is made. Consequently, addressing specific input simulation data, a workflow using Modelica underlining the usage of GIS information and the quantification of its significance over annual heating energy demand is presented in this paper.
Keywords: CityGML, EnergyADE, energy performance simulation, GIS
Procedia PDF Downloads 168
262 ISMARA: Completely Automated Inference of Gene Regulatory Networks from High-Throughput Data
Authors: Piotr J. Balwierz, Mikhail Pachkov, Phil Arnold, Andreas J. Gruber, Mihaela Zavolan, Erik van Nimwegen
Abstract:
Understanding the key players and interactions in the regulatory networks that control gene expression and chromatin state across different cell types and tissues in metazoans remains one of the central challenges in systems biology. Our laboratory has pioneered a number of methods for automatically inferring core gene regulatory networks directly from high-throughput data by modeling gene expression (RNA-seq) and chromatin state (ChIP-seq) measurements in terms of genome-wide computational predictions of regulatory sites for hundreds of transcription factors and micro-RNAs. These methods have now been completely automated in an integrated webserver called ISMARA that allows researchers to analyze their own data by simply uploading RNA-seq or ChIP-seq data sets and provides results in an integrated web interface as well as in downloadable flat form. For any data set, ISMARA infers the key regulators in the system, their activities across the input samples, the genes and pathways they target, and the core interactions between the regulators. We believe that by empowering experimental researchers to apply cutting-edge computational systems biology tools to their data in a completely automated manner, ISMARA can play an important role in developing our understanding of regulatory networks across metazoans.
Keywords: gene expression analysis, high-throughput sequencing analysis, transcription factor activity, transcription regulation
Procedia PDF Downloads 65
261 An Intelligent Scheme Switching for MIMO Systems Using Fuzzy Logic Technique
Authors: Robert O. Abolade, Olumide O. Ajayi, Zacheaus K. Adeyemo, Solomon A. Adeniran
Abstract:
Link adaptation is an important strategy for achieving robust wireless multimedia communications based on quality of service (QoS) demand. Scheme switching in multiple-input multiple-output (MIMO) systems is an aspect of link adaptation, and it involves selecting among different MIMO transmission schemes or modes so as to adapt to the varying radio channel conditions for the purpose of achieving QoS delivery. However, finding the most appropriate switching method in MIMO links is still a challenge as existing methods are either computationally complex or not always accurate. This paper presents an intelligent switching method for a MIMO system consisting of two schemes, transmit diversity (TD) and spatial multiplexing (SM), using a fuzzy logic technique. In this method, two channel quality indicators (CQI), namely the average received signal-to-noise ratio (RSNR) and the received signal strength indicator (RSSI), are measured and passed as inputs to the fuzzy logic system, which then gives a decision (an inference). The switching decision of the fuzzy logic system is fed back to the transmitter to switch between the TD and SM schemes. Simulation results show that the proposed fuzzy logic-based switching technique outperforms the conventional static switching technique in terms of bit error rate and spectral efficiency.
Keywords: channel quality indicator, fuzzy logic, link adaptation, MIMO, spatial multiplexing, transmit diversity
Procedia PDF Downloads 152
260 Ambivalence as Ethical Practice: Methodologies to Address Noise, Bias in Care, and Contact Evaluations
Authors: Anthony Townsend, Robyn Fasser
Abstract:
While complete objectivity is a desirable scientific position from which to conduct a care and contact evaluation (CCE), it is precisely the recognition that we are inherently incapable of operating objectively that is the foundation of ethical practice and skilled assessment. Drawing upon recent research from Daniel Kahneman (2021) on the differences between noise and bias, as well as different inherent biases collectively termed “The Elephant in the Brain” by Kevin Simler and Robin Hanson (2019) from Oxford University, this presentation not only addresses the various ways in which our judgments, perceptions and even procedures can be distorted and contaminated while conducting a CCE, but also considers the value of second-order cybernetics and the psychodynamic concept of ‘ambivalence’ as a conceptual basis to inform our assessment methodologies to limit such errors or at least better identify them. Both a conceptual framework for ambivalence, our higher-order capacity to allow for the convergence and consideration of multiple emotional experiences and cognitive perceptions to inform our reasoning, and a practical methodology for assessment relying on data triangulation, Bayesian inference and hypothesis testing are presented as a means of promoting ethical practice for health care professionals conducting CCEs. An emphasis on widening awareness and perspective, limiting ‘splitting’, is demonstrated both in how this form of emotional processing plays out in alienating dynamics in families as well as the assessment thereof. In addressing this concept, this presentation aims to illuminate the value of ambivalence as foundational to ethical practice for assessors.
Keywords: ambivalence, forensic, psychology, noise, bias, ethics
Procedia PDF Downloads 86
259 The Influence of Screen Translation on Creative Audiovisual Writing: A Corpus-Based Approach
Authors: John D. Sanderson
Abstract:
The popularity of American cinema worldwide has contributed to the development of sociolects related to specific film genres in other cultural contexts by means of screen translation, in many cases eluding norms of usage in the target language, a process whose result has come to be known as 'dubbese'. A consequence for the reception in countries where local audiovisual fiction consumption is far lower than American imported productions is that this linguistic construct is preferred, even though it differs from common everyday speech. The iconography of film genres such as science-fiction, western or sword-and-sandal films, for instance, generates linguistic expectations in international audiences who will accept more easily the sociolects assimilated by the continuous reception of American productions, even if the themes, locations, characters, etc., portrayed on screen may belong in origin to other cultures. And the non-normative language (e.g., calques, semantic loans) used in the preferred mode of linguistic transfer, whether it is translation for dubbing or subtitling, has diachronically evolved in many cases into a status of canonized sociolect, not only accepted but also required, by foreign audiences of American films. However, a remarkable step forward is taken when this typology of artificial linguistic constructs starts being used creatively by nationals of these target cultural contexts. In the case of Spain, the success of American sitcoms such as Friends in the 1990s led Spanish television scriptwriters to include in national productions lexical and syntactical indirect borrowings (Anglicisms not formally identifiable as such because they include elements from their own language) in order to target audiences of the former. However, this commercial strategy had already taken place decades earlier when Spain became a favored location for the shooting of foreign films in the early 1960s. 
The international popularity of the then newly developed sub-genre known as Spaghetti-Western encouraged Spanish investors to produce their own movies, and local scriptwriters made use of the dubbese developed nationally since the advent of sound in film instead of using normative language. As a result, direct Anglicisms, as well as lexical and syntactical borrowings, made up the creative writing of these Spanish productions, which also became commercially successful. Interestingly enough, some of these films were even marketed in English-speaking countries as original westerns (some of the names of actors and directors were anglicized for that purpose) dubbed into English. The analysis of these 'back translations' will also foreground some semantic distortions that arose in the process. In order to perform the research on these issues, a wide corpus of American films has been used, which chronologically range from Stagecoach (John Ford, 1939) to Django Unchained (Quentin Tarantino, 2012), together with a shorter corpus of Spanish films produced during the golden age of Spaghetti Westerns, from Una tumba para el sheriff (Mario Caiano; in English Lone and Angry Man, William Hawkins) to Tu fosa será la exacta, amigo (Juan Bosch, 1972; in English My Horse, My Gun, Your Widow, John Wood). The methodology of analysis and the conclusions reached could be applied to other genres and other cultural contexts.
Keywords: dubbing, film genre, screen translation, sociolect
Procedia PDF Downloads 171
258 Corpus-Based Description of Core English Nouns of Pakistani English, an EFL Learner Perspective at Secondary Level
Authors: Abrar Hussain Qureshi
Abstract:
Vocabulary has been highlighted as a key indicator in any foreign language learning program, especially English as a foreign language (EFL). It is often considered a potent tool in the foreign language curriculum, and its deficiency impedes successful communication in the target language. Knowledge of the lexicon is very significant in attaining communicative competence and performance. Nouns constitute a considerable bulk of English vocabulary. Indeed, they are the bones of the English language and the main semantic carriers in spoken and written discourse. As nouns dominate the English lexicon, their role becomes all the more important. The present research is a systematic effort to work out a list of highly frequent Pakistani English nouns for EFL learners at the secondary level. It will encourage learner autonomy and save learners time. The corpus used for the research has been developed locally from leading English newspapers of Pakistan. Wordsmith Tools has been used to process the research data and to retrieve a word list of frequent Pakistani English nouns. The retrieved list of core Pakistani English nouns is expected to be useful for English language learners at the secondary level as it covers a wide range of speech events.
Keywords: corpus, EFL, frequency list, nouns
Procedia PDF Downloads 103
257 Non-Linear Regression Modeling for Composite Distributions
Authors: Mostafa Aminzadeh, Min Deng
Abstract:
Modeling loss data is an important part of actuarial science. Actuaries use models to predict future losses and manage financial risk, which can be beneficial for marketing purposes. In the insurance industry, small claims happen frequently while large claims are rare. Traditional distributions such as Normal, Exponential, and inverse-Gaussian are not suitable for describing insurance data, which often show skewness and fat tails. Several authors have studied classical and Bayesian inference for parameters of composite distributions, such as Exponential-Pareto, Weibull-Pareto, and Inverse Gamma-Pareto. These models separate small to moderate losses from large losses using a threshold parameter. This research introduces a computational approach using a nonlinear regression model for loss data that relies on multiple predictors. Simulation studies were conducted to assess the accuracy of the proposed estimation method. The simulations confirmed that the proposed method provides precise estimates for regression parameters. It is important to note that this approach can be applied to a dataset provided that goodness-of-fit tests confirm that the composite distribution under study fits the data well. To demonstrate the computations, a real data set from the insurance industry is analyzed. Mathematica code uses Fisher scoring as an iterative method to obtain the maximum likelihood estimates (MLEs) of the regression parameters.
Keywords: maximum likelihood estimation, Fisher scoring method, non-linear regression models, composite distributions
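Fisher scoring, listed in the keywords of this abstract, updates the parameter vector by solving a linear system built from the score and the expected information. The sketch below applies it to a plain Poisson regression with a log link, chosen only because its score and information have simple closed forms. It is a stand-in model, not the composite-distribution likelihood of the paper, and it is written in Python rather than Mathematica; the data are made up.

```python
import numpy as np

def fisher_scoring_poisson(X, y, n_iter=25, tol=1e-10):
    """Fisher scoring for Poisson regression with log link.

    Update: beta <- beta + I(beta)^{-1} U(beta), where
    U = X^T (y - mu) is the score and I = X^T diag(mu) X is the
    expected (Fisher) information. Assumes column 0 of X is an intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())          # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)
        info = X.T @ (mu[:, None] * X)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical toy data: one predictor plus an intercept column.
x = np.arange(5, dtype=float)
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.0, 2.0, 4.0, 7.0, 12.0])

beta_hat = fisher_scoring_poisson(X, y)
print("estimated coefficients:", beta_hat)
```

At the maximum likelihood estimate the score vector is zero, which gives a simple convergence check for any model fitted this way.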
Procedia PDF Downloads 33