Search results for: semantic data
25164 Efficient Layout-Aware Pretraining for Multimodal Form Understanding
Authors: Armineh Nourbakhsh, Sameena Shah, Carolyn Rose
Abstract:
Layout-aware language models have been used to create multimodal representations for documents that are in image form, achieving relatively high accuracy in document understanding tasks. However, the large number of parameters in the resulting models makes building and using them prohibitive without access to high-performing processing units with large memory capacity. We propose an alternative approach that can create efficient representations without the need for a neural visual backbone. This leads to an 80% reduction in the number of parameters compared to the smallest SOTA model, widely expanding applicability. In addition, our layout embeddings are pre-trained on spatial and visual cues alone and only fused with text embeddings in downstream tasks, which can facilitate applicability to low-resource or multi-lingual domains. Despite using only 2.5% of the training data, we show competitive performance on two form understanding tasks: semantic labeling and link prediction.
Keywords: layout understanding, form understanding, multimodal document understanding, bias-augmented attention
Procedia PDF Downloads 148
25163 Gender Bias in Natural Language Processing: Machines Reflect Misogyny in Society
Authors: Irene Yi
Abstract:
Machine learning, natural language processing, and neural network models of language are becoming more and more prevalent in the fields of technology and linguistics today. Training data for machines are, at best, large corpora of human literature and, at worst, a reflection of the ugliness in society. Machines have been trained on millions of human books, only to find that over the course of human history, derogatory and sexist adjectives are used significantly more frequently when describing females in history and literature than when describing males. This is extremely problematic, both as training data and as the outcome of natural language processing. As machines start to handle more responsibilities, it is crucial to ensure that they do not carry forward historical sexist and misogynistic notions. This paper gathers data and algorithms from neural network models of language dealing with syntax, semantics, sociolinguistics, and text classification. Results are significant in showing the existing intentional and unintentional misogynistic notions used to train machines, as well as in developing better technologies that take into account the semantics and syntax of text to be more mindful and reflect gender equality. Further, this paper deals with the idea of non-binary gender pronouns and how machines can process these pronouns correctly, given their semantic and syntactic context. This paper also delves into the implications of gendered grammar and its effect, cross-linguistically, on natural language processing. Languages such as French or Spanish not only have rigid gendered grammar rules, but also historically patriarchal societies. The progression of society goes hand in hand not only with its language, but with how machines process those natural languages.
These ideas are all extremely vital to the development of natural language models in technology, and they must be taken into account immediately.
Keywords: gendered grammar, misogynistic language, natural language processing, neural networks
Procedia PDF Downloads 120
25162 Research of Data Cleaning Methods Based on Dependency Rules
Authors: Yang Bao, Shi Wei Deng, WangQun Lin
Abstract:
This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and proposes the key steps of a typical cleaning process. It puts forward a data cleaning framework with good scalability and versatility. For data with attribute dependency relations, it designs several violation-data discovery algorithms expressed as formal formulas, which can identify inconsistent data across all target columns with conditional attribute dependencies, whether the data is structured (SQL) or unstructured (NoSQL), and gives six data cleaning methods based on these algorithms.
Keywords: data cleaning, dependency rules, violation data discovery, data repair
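As a rough illustration of the kind of violation discovery the abstract describes, the sketch below checks a single functional dependency (the `zip -> city` rule, the attribute names, and the sample records are all hypothetical, not taken from the paper):

```python
from collections import defaultdict

def fd_violations(rows, lhs, rhs):
    """Group rows by the determinant attribute (lhs) and flag groups
    whose dependent attribute (rhs) takes more than one value."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[lhs]].add(row[rhs])
    return {key: vals for key, vals in groups.items() if len(vals) > 1}

records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},  # violates zip -> city
    {"zip": "94105", "city": "San Francisco"},
]
print(fd_violations(records, "zip", "city"))
```

The flagged groups are candidates for the repair methods the paper proposes; the same grouping idea applies whether the rows come from a SQL table or a NoSQL document store.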
Procedia PDF Downloads 564
25161 Provenance in Scholarly Publications: Introducing the provCite Ontology
Authors: Maria Joseph Israel, Ahmed Amer
Abstract:
Our work aims to broaden the application of provenance technology beyond its traditional domains of scientific workflow management and database systems by offering a general provenance framework to capture richer and extensible metadata in unstructured textual data sources such as literary texts, commentaries, translations, and digital humanities. Specifically, we demonstrate the feasibility of capturing and representing expressive provenance metadata, including more of the context for citing scholarly works (e.g., the authors’ explicit or inferred intentions at the time of developing their research content for publication), while also supporting subsequent augmentation with similar additional metadata (by third parties, be they human or automated). To better capture the nature and types of possible citations, in our proposed provenance scheme metaScribe, we extend standard provenance conceptual models to form our proposed provCite ontology. This provides a conceptual framework which can accurately capture and describe more of the functional and rhetorical properties of a citation than can be achieved with any current model.
Keywords: knowledge representation, provenance architecture, ontology, metadata, bibliographic citation, semantic web annotation
Procedia PDF Downloads 117
25160 Text Localization in Fixed-Layout Documents Using Convolutional Networks in a Coarse-to-Fine Manner
Authors: Beier Zhu, Rui Zhang, Qi Song
Abstract:
Text contained within fixed-layout documents such as ID cards, invoices, cheques, and passports can be of great semantic value and so requires high localization accuracy. Recently, algorithms based on deep convolutional networks have achieved high performance on text detection tasks. However, for text localization in fixed-layout documents, such algorithms detect word bounding boxes individually, which ignores the layout information. This paper presents a novel architecture built on convolutional neural networks (CNNs). A global text localization network and a regional bounding-box regression network are introduced to tackle the problem in a coarse-to-fine manner. The text localization network simultaneously locates word bounding points, which takes the layout information into account. The bounding-box regression network takes as input the features pooled from arbitrarily sized RoIs and refines the localizations. These two networks share their convolutional features and are trained jointly. ID cards, a typical type of fixed-layout document, are selected to evaluate the effectiveness of the proposed system. These networks are trained on data cropped from natural scene images and on synthetic data produced by a synthetic text generation engine. Experiments show that our approach locates word bounding boxes with high accuracy and achieves state-of-the-art performance.
Keywords: bounding box regression, convolutional networks, fixed-layout documents, text localization
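Localization accuracy for word bounding boxes is conventionally scored by intersection-over-union (IoU) between a predicted and a ground-truth box; the paper does not spell out its metric, but a standard IoU check looks like this (box format `(x1, y1, x2, y2)` is an assumption):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlapping boxes -> 1/7
```

A detection is typically counted correct when IoU exceeds a threshold such as 0.5.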
Procedia PDF Downloads 194
25159 Translation and Adaptation of the Assessment Instrument “Kiddycat” for European Portuguese
Authors: Elsa Marta Soares, Ana Rita Valente, Cristiana Rodrigues, Filipa Gonçalves
Abstract:
Background: The assessment of the feelings and attitudes of preschool children in relation to stuttering is crucial. Negative experiences can lead to anxiety, worry, or frustration. To avoid the worsening of attitudes and feelings related to stuttering, early detection is important in order to intervene as soon as possible through an individualized intervention plan. It is therefore important to have Portuguese instruments that allow this assessment. Aims: The aim of the present study is to carry out the translation and adaptation of the Communication Attitude Test for Children in Preschool Age and Kindergarten (KiddyCat) for European Portuguese (EP). Methodology: For the translation and adaptation process, a methodological study was carried out with the following steps: translation, back-translation, assessment by a committee of experts, and pre-test. This abstract describes the results of the first two phases of this process. The translation was accomplished by two bilingual individuals with no background in health and no prior knowledge of the instrument; one was an English teacher and the other a translator. The back-translation was conducted by two senior class teachers living in the United Kingdom, likewise with no background in health or knowledge of the instrument. Results and Discussion: In the translation, there were differences in the semantic equivalences of various expressions and concepts. A discussion between the two translators, mediated by the researchers, made it possible to reach a consensus version of the translated instrument. Taking into account the original version of KiddyCAT, the results demonstrated that the back-translated versions were similar to the original version of this assessment instrument. Although the back-translators used different words, these were synonymous, maintaining the semantic and idiomatic equivalences of the instrument’s items.
Conclusion: This project contributes an important resource that can be used in the assessment of the feelings and attitudes of preschool children who stutter. This was the first phase of the research; the expert panel and pre-test are being developed. It is therefore expected that this instrument will contribute to a holistic therapeutic intervention, taking into account the individual characteristics of each child.
Keywords: assessment, feelings and attitudes, preschool children, stuttering
Procedia PDF Downloads 149
25158 Structural Balance and Creative Tensions in New Product Development Teams
Authors: Shankaran Sitarama
Abstract:
New Product Development involves team members coming together and working in teams to come up with innovative solutions to problems, resulting in new products. Thus, a core attribute of a successful NPD team is its creativity and innovation. The team needs to be creative as a group, generating a breadth of ideas and innovative solutions that solve or address the problem it is targeting and meet the user’s needs. It also needs to be very efficient in its teamwork as it works through the various stages of the development of these ideas, resulting in a POC (proof-of-concept) implementation or a prototype of the product. There are two distinctive traits that the teams need to have: one is ideational creativity, and the other is effective and efficient teamwork. Each of these traits causes multiple types of tensions in the teams, and these tensions are reflected in the team dynamics. Ideational conflicts arising out of debates and deliberations increase the collective knowledge and affect team creativity positively. However, the same trait of challenging each other’s viewpoints might lead the team members to be disruptive, resulting in interpersonal tensions, which in turn lead to less than efficient teamwork. Teams that foster and effectively manage these creative tensions are successful, and teams that are not able to manage these tensions show poor team performance. In this paper, we explore these tensions as they manifest in the team communication social network and propose a Creative Tension Balance index along the lines of the degree of balance in social networks, which has the potential to distinguish successful (and unsuccessful) NPD teams. Team communication reflects the team dynamics among team members and is the data set for analysis.
The emails between the members of the NPD teams are processed through a semantic analysis algorithm (LSA) to analyze the content of communication, and a semantic similarity analysis is used to arrive at a social network graph that depicts the communication amongst team members based on the content of communication. This social network is subjected to traditional social network analysis methods to arrive at established metrics as well as structural balance analysis metrics. Traditional structural balance is extended to include team interaction pattern metrics to arrive at a creative tension balance metric that effectively captures the creative tensions and tension balance in teams. This CTB (Creative Tension Balance) metric truly captures the signatures of successful and unsuccessful (dissonant) NPD teams. The dataset for this research study includes 23 NPD teams spread over multiple semesters; we compute this CTB metric and use it to identify the most successful and unsuccessful teams by classifying them into low-, medium-, and high-performing teams. The results are correlated with the team reflections (for team dynamics and interaction patterns), the team self-evaluation feedback surveys (for teamwork metrics), and team performance as measured by a comprehensive team grade (for high- and low-performing team signatures).
Keywords: team dynamics, social network analysis, new product development teamwork, structural balance, NPD teams
Procedia PDF Downloads 79
25157 Metaphor Institutionalization as Phase Transition: Case Studies of Chinese Metaphors
Abstract:
Metaphor institutionalization refers to the propagation of a metaphor that leads to its acceptance in a speech community as a norm of the language. Such knowledge is important both to theoretical studies of metaphor and to practical disciplines such as lexicography and language generation. This paper reports an empirical study of the institutionalization of 14 Chinese metaphors. It first explores the pattern of metaphor institutionalization by fitting the logistic function (or S-shaped curve) to time series data on the conventionality of the metaphors, obtained automatically from a large-scale diachronic Chinese corpus. It then reports a questionnaire-based survey on the propagation scale of each metaphor, measured by the average number of subjects who can easily understand the metaphorical expressions. The study provides two pieces of evidence supporting the hypothesis that metaphor institutionalization is a phase transition: (1) the pattern of metaphor institutionalization follows an S-shaped curve, and (2) institutionalized metaphors generally do not propagate to the whole community but remain in an equilibrium state. This conclusion helps distinguish metaphor institutionalization from topicalization and other types of semantic change.
Keywords: metaphor institutionalization, phase transition, propagation scale, S-shaped curve
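The S-shaped growth the abstract fits can be sketched with the standard three-parameter logistic function; the parameter values below are invented for illustration (a metaphor saturating at a propagation scale of 0.6, i.e. short of the whole community), not taken from the paper:

```python
import math

def logistic(t, K, r, t0):
    """S-shaped growth: K = saturation level (propagation scale),
    r = growth rate, t0 = inflection time."""
    return K / (1.0 + math.exp(-r * (t - t0)))

# Hypothetical conventionality trajectory over 11 time steps,
# stabilizing at an equilibrium below full community adoption (K = 0.6).
traj = [round(logistic(t, K=0.6, r=1.2, t0=5), 3) for t in range(11)]
print(traj)
```

Fitting `K`, `r`, and `t0` to corpus-derived conventionality counts is what would yield evidence (1) in the abstract; an estimated `K` below 1 corresponds to evidence (2).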
Procedia PDF Downloads 171
25156 Product Form Bionic Design Based on Eye Tracking Data: A Case Study of Desk Lamp
Authors: Huan Lin, Liwen Pang
Abstract:
In order to reduce the ambiguity and uncertainty of product form bionic design, a product form bionic design method based on eye tracking is proposed. The eye-tracking experiment is designed to calculate the average-time ranking of the specific parts of the bionic shape that the subjects look at. The key bionic shape is identified through the experiment and then applied to a desk lamp bionic design. During the design case, the FAHP (Fuzzy Analytic Hierarchy Process) and SD (Semantic Differential) methods are first used to identify a consumer emotional perception model for the desk lamp before product design. By investigating different desk lamp design elements and consumer views, the form design factors of the desk lamp product are derived, and all design schemes are ranked after calculation. The desk lamp form bionic design method combines the key bionic shape extracted from the eye-tracking experiment with the prioritization of the desk lamp design schemes. This study provides an objective and rational method for product form bionic design.
Keywords: bionic design, form, eye tracking, FAHP, desk lamp
Procedia PDF Downloads 224
25155 Computational Team Dynamics and Interaction Patterns in New Product Development Teams
Authors: Shankaran Sitarama
Abstract:
New Product Development (NPD) is invariably a team effort and involves effective teamwork. An NPD team has members from different disciplines coming together and working through the different phases, all the way from the conceptual design phase to production and product roll-out. Creativity and innovation are among the key factors of successful NPD. Team members going through the different phases of NPD interact and work closely, yet challenge each other during the design phases to brainstorm ideas and later converge to work together. These two traits require the teams to exercise divergent and convergent thinking simultaneously, and there needs to be a good balance between the two. The team dynamics invariably result in conflicts among team members. While some amount of conflict (ideational conflict) is desirable in NPD teams for the group to be creative, relational conflicts (discords among members) can be detrimental to teamwork. Team communication truly reflects these tensions and team dynamics. In this research, team communication (emails) between the members of the NPD teams is considered for analysis. The email communication is processed through a semantic analysis algorithm (LSA) to analyze the content of communication, and a semantic similarity analysis is used to arrive at a social network graph that depicts the communication amongst team members based on the content of communication. The amount of communication (content, not frequency of communication) defines the interaction strength between the members. A social network adjacency matrix is thus obtained for the team. Standard social network analysis techniques based on the Adjacency Matrix (AM) and the Dichotomized Adjacency Matrix (DAM), based on network density, yield network graphs and network metrics such as centrality.
The social network graphs are then rendered for visual representation using a Metric Multi-Dimensional Scaling (MMDS) algorithm for node placement, and arcs connecting the nodes (representing team members) are drawn. The distance between the nodes in the placement represents the tie strength between the members: stronger tie strengths render nodes closer. The overall visual representation of the social network graph provides a clear picture of the team’s interactions. This research reveals four distinct patterns of team interaction that are clearly identifiable in the visual representation of the social network graph and have a clearly defined computational scheme. The four computational patterns of team interaction defined are the Central Member Pattern (CMP), the Subgroup and Aloof member Pattern (SAP), the Isolate Member Pattern (IMP), and the Pendant Member Pattern (PMP). Each of these patterns has a team dynamics implication in terms of the conflict level in the team. For instance, the isolate member pattern clearly points to a near breakdown in communication with the member, and hence a possibly high conflict level, whereas the subgroup or aloof member pattern points to a non-uniform information flow in the team and a moderate level of conflict. These pattern classifications of teams are then compared and correlated with the real level of conflict in the teams, as indicated by the team members through an elaborate self-evaluation, team reflection, and feedback form; the results show a good correlation.
Keywords: team dynamics, team communication, team interactions, social network analysis, SNA, new product development, latent semantic analysis, LSA, NPD teams
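The AM-to-DAM step and the centrality metric mentioned above can be sketched as follows; the threshold value, the 4-member team, and the tie strengths are hypothetical, and density-based threshold selection is simplified to a fixed cutoff:

```python
def dichotomize(am, threshold):
    """Binarize a weighted adjacency matrix (AM -> DAM) at a tie-strength cutoff."""
    return [[1 if w >= threshold and i != j else 0
             for j, w in enumerate(row)] for i, row in enumerate(am)]

def degree_centrality(dam):
    """Fraction of the other n-1 members each member is tied to."""
    n = len(dam)
    return [sum(row) / (n - 1) for row in dam]

# Hypothetical 4-member team: members 0-2 communicate strongly,
# member 3 is an isolate (near breakdown in communication).
am = [
    [0.0, 0.8, 0.6, 0.1],
    [0.8, 0.0, 0.7, 0.0],
    [0.6, 0.7, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0],
]
dam = dichotomize(am, threshold=0.5)
print(degree_centrality(dam))  # member 3 scores 0.0 -> Isolate Member Pattern
```

A zero row in the DAM is exactly the computational signature of the Isolate Member Pattern described above.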
Procedia PDF Downloads 70
25154 Optimizing the Capacity of a Convolutional Neural Network for Image Segmentation and Pattern Recognition
Authors: Yalong Jiang, Zheru Chi
Abstract:
In this paper, we study the factors which determine the capacity of a Convolutional Neural Network (CNN) model and propose ways to evaluate and adjust the capacity of a CNN model to best match a specific pattern recognition task. Firstly, a scheme is proposed to adjust the number of independent functional units within a CNN model to make it better fitted to a task. Secondly, the number of independent functional units in the capsule network is adjusted to fit it to the training dataset. Thirdly, a method based on Bayesian GAN is proposed to enrich the variances in the current dataset to increase its complexity. Experimental results on the PASCAL VOC 2010 Person Part dataset and the MNIST dataset show that, in both conventional CNN models and capsule networks, the number of independent functional units is an important factor that determines the capacity of a network model. By adjusting the number of functional units, the capacity of a model can better match the complexity of a dataset.
Keywords: CNN, convolutional neural network, capsule network, capacity optimization, character recognition, data augmentation, semantic segmentation
Procedia PDF Downloads 153
25153 Semantic Differential Technique as a Kansei Engineering Tool to Enquire Public Space Design Requirements: The Case of Parks in Tehran
Authors: Nasser Koleini Mamaghani, Sara Mostowfi
Abstract:
The complexity of public space design makes it difficult for designers to simultaneously consider all the issues needed for thorough decision-making. Among public spaces, the public space around people’s houses is the most prominent space affecting their daily life. For recreational public spaces in cities, the main purpose is to design for experiences that enable a deep feeling of peace and a moment away from hectic daily life. Respecting human emotions and restoring natural environments, although difficult and to some extent out of reach, are key issues in designing such spaces. In this paper, we propose to analyse the structure of recreational public spaces and the related emotional impressions. Furthermore, we suggest investigating how these structures influence people’s choice of public spaces by using differential semantics. According to the Kansei methodology, in order to evaluate a situation appropriately, the assessment variables must be adapted to the user’s mental scheme; the first step is therefore the identification of a space’s conceptual scheme. In our case study, 32 Kansei words and 4 different locations, each offering a different sensory experience, were selected. The 4 locations were all parks in the city of Tehran (Iran), each with a unique combination and structure of environmental and artificial elements such as fountains, lighting, sculptures, and music (sound). The first was Park No. 1, a park with a natural environment; the selected space was a fountain with motion lighting and a sculpture. The second was Park No. 2, which features different styles of park construction from different countries; the selected space was traditional Iranian architecture with a fountain and trees.
The third was Park No. 3, a park with a modern environment and spaces, which included a fountain that moved according to music and lighting. The fourth was Park No. 4, a park combining the four elements of water, fire, earth, and wind; the selected space was fountains squirting water from the ground up. 80 participants (55 males and 25 females) aged 20-60 years took part in this experiment. Each person filled in the questionnaire in the park he/she was in. A five-point semantic differential scale was used to determine the relation between space details and adjectives (Kansei words). The data received were analyzed by a multivariate statistical technique (factor analysis using SPSS Statistics). Finally, the results of this analysis provide criteria that can serve as inspiration in future space design for creating pleasant feelings in users.
Keywords: environmental design, differential semantics, Kansei engineering, subjective preferences, space
Procedia PDF Downloads 407
25152 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity
Authors: Hoda A. Abdel Hafez
Abstract:
Mining big data represents a big challenge nowadays. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance, and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity, and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from them.
Keywords: mining big data, big data, machine learning, telecommunication
Procedia PDF Downloads 410
25151 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications: A Software Testing Approach
Authors: Theertha Chandroth
Abstract:
This paper presents a comparative study of how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating, comparing, and integrating JSON data with XML data and vice versa. By understanding the process of checking JSON data against XML data, testers, developers, and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.
Keywords: XML, JSON, data comparison, integration testing, Python, SQL
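A minimal sketch of such a JSON-against-XML check, using only the Python standard library (the flat one-level record shape and the field names are assumptions for illustration, not the paper's test design):

```python
import json
import xml.etree.ElementTree as ET

def xml_to_dict(element):
    """Flatten a simple one-level XML record into a dict of tag -> text."""
    return {child.tag: child.text for child in element}

def check_json_against_xml(json_str, xml_str):
    """Return the set of field names whose values differ between the two formats."""
    json_rec = json.loads(json_str)
    xml_rec = xml_to_dict(ET.fromstring(xml_str))
    keys = set(json_rec) | set(xml_rec)
    return {k for k in keys if str(json_rec.get(k)) != xml_rec.get(k)}

json_data = '{"id": "42", "name": "Ada"}'
xml_data = "<record><id>42</id><name>Ada</name></record>"
print(check_json_against_xml(json_data, xml_data))  # set() -> no mismatches
```

In an integration test, a non-empty result set would fail an assertion, pointing the tester at exactly the fields where the two representations disagree.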
Procedia PDF Downloads 140
25150 Interacting with Multi-Scale Structures of Online Political Debates by Visualizing Phylomemies
Authors: Quentin Lobbe, David Chavalarias, Alexandre Delanoe
Abstract:
The ICT revolution has given birth to an unprecedented world of digital traces and has impacted a wide range of knowledge-driven domains such as science, education, and policy making. Nowadays, we are daily fueled by unlimited flows of articles, blogs, messages, tweets, etc. The internet itself can thus be considered an unsteady hyper-textual environment where websites emerge and expand every day. But there are structures inside knowledge. A given text can always be studied in relation to others or in light of a specific socio-cultural context. By way of their textual traces, human beings are calling each other out: hypertext citations, retweets, vocabulary similarity, etc. We are in fact the architects of a giant web of elements of knowledge whose structures and shapes convey their own information. The global shapes of these digital traces represent a source of collective knowledge, and the question of their visualization remains an open challenge. How can we explore, browse, and interact with such shapes? In order to navigate across these growing constellations of words and texts, interdisciplinary innovations are emerging at the crossroads between the social and computational sciences. In particular, complex systems approaches now make it possible to reconstruct the hidden structures of textual knowledge by means of multi-scale objects of research such as semantic maps and phylomemies. Phylomemy reconstruction is a generic method related to the co-word analysis framework. Phylomemies aim to reveal the temporal dynamics of large corpora of textual content by performing inter-temporal matching on extracted knowledge domains in order to identify their conceptual lineages. This study addresses the question of visualizing the global shapes of online political discussions related to the French presidential and legislative elections of 2017.
We aim to build phylomemies on top of a dedicated collection of thousands of French political tweets enriched with archived contemporary news web articles. Our goal is to reconstruct the temporal evolution of the online debates fueled by each political community during the elections. To that end, we introduce an iterative data exploration methodology implemented and tested within the free software Gargantext. There we combine synchronic and diachronic axes of visualization to reveal the dynamics of our corpora of tweets and web pages as well as their inner syntagmatic and paradigmatic relationships. In doing so, we aim to provide researchers with innovative methodological means to explore online semantic landscapes in a collaborative and reflective way.
Keywords: online political debate, French election, hyper-text, phylomemy
Procedia PDF Downloads 186
25149 Argument Representation in Non-Spatial Motion Bahasa Melayu Based Conceptual Structure Theory
Authors: Nurul Jamilah Binti Rosly
Abstract:
The typology of motion is usually understood as a change from one location to another. From a conceptual point of view, however, motion can also occur in non-spatial contexts associated with human and social factors. Therefore, from the conceptual point of view, the concept of non-spatial motion involves the movement of time, ownership, identity, state, and existence. Accordingly, this study focuses on the lexical items shared, accept, be, store, and exist as the study material. The data in this study were extracted from the Database of Languages and Literature Corpus Database, Malaysia, and were analyzed using semantic and syntactic concepts within Conceptual Structure Theory (Ray Jackendoff, 2002). Semantic representations are represented in the form of conceptual structures in argument functions that include the functions [events], [situations], [objects], [paths], and [places]. The findings show that the mapping of these arguments comprises three main stages, namely mapping the argument structure, mapping the tree, and mapping the roles of thematic items. Accordingly, this study shows the representation of non-spatial motion in the Malay language.
Keywords: arguments, concepts, constituencies, events, situations, thematics
Procedia PDF Downloads 129
25148 Using Machine Learning Techniques to Extract Useful Information from Dark Data
Authors: Nigar Hussain
Abstract:
Dark data is a subset of big data: data that we fail to use for future decisions. There are many issues in existing work, and powerful tools are needed for utilizing dark data. Sufficient techniques are needed to deal with dark data, enabling users to exploit its excellence, adaptability, speed, low time consumption, execution, and accessibility. Another issue is how to utilize dark data to extract helpful information in order to make better choices. In this paper, we propose upgraded strategies to remove the dark side from dark data. Using a supervised model and machine learning techniques, we utilized dark data and achieved an F1 score of 89.48%.
Keywords: big data, dark data, machine learning, heatmap, random forest
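The F1 score reported above is the harmonic mean of precision and recall on the positive class; a small self-contained computation of the metric (the toy labels below are illustrative, not the paper's data):

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
print(f1_score(y_true, y_pred))  # precision 0.75, recall 0.75 -> F1 = 0.75
```

Unlike plain accuracy, F1 is not inflated by a large negative class, which matters when the useful records hidden in dark data are a small minority.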
Procedia PDF Downloads 28
25147 Multi-Source Data Fusion for Urban Comprehensive Management
Authors: Bolin Hua
Abstract:
City governance involves various kinds of data, including city component data, demographic data, housing data, and all kinds of business data. These data reflect different aspects of people, events, and activities. Data generated from various systems differ in form, and data sources differ because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need to be fused together. Data from different sources, collected in different ways, raise several issues which need to be resolved. Problems in data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis, and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data, and space data. We need to use metadata to be referred to and read when an application needs to access, manipulate, and display the data. Uniform metadata management ensures the effectiveness and consistency of data in the processes of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search, and data delivery.
Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data
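A minimal sketch of fusing records from two sources into a uniform central store, keyed by a shared identifier; the key field `pid`, the two source datasets, and the "first source wins" conflict rule are all assumptions made for illustration:

```python
def fuse_records(sources, key):
    """Merge records from multiple sources into a central store keyed by `key`.
    Later sources fill in missing fields but never overwrite existing values."""
    central = {}
    for source in sources:
        for rec in source:
            merged = central.setdefault(rec[key], {})
            for field, value in rec.items():
                merged.setdefault(field, value)  # keep the first value seen
    return central

# Hypothetical sector databases sharing a person identifier.
population = [{"pid": "P1", "name": "Li Wei"}]
housing = [{"pid": "P1", "address": "12 Chang'an Ave"},
           {"pid": "P2", "address": "3 Jianguo Rd"}]
print(fuse_records([population, housing], "pid"))
```

In practice, the conflict rule (which source wins, how duplicates are compared) is exactly the kind of policy a uniform metadata layer would record so that every consuming application resolves it the same way.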
Procedia PDF Downloads 393
25146 Logic of Appearance vs Explanatory Logic: A Systemic Functional Linguistics Approach to the Evolution of Communicative Strategies in the European Union Institutional Discourse
Authors: Antonio Piga
Abstract:
The issue of European cultural identity has become a prominent topic of discussion among political actors in the wake of the unsuccessful referenda held in France and the Netherlands in May and June 2005. The 'period of reflection' announced by the European Council at the end of June 2005 has provided an opportunity for the implementation of several initiatives and programmes designed to 'bridge the gap' between the EU institutions and its citizens. Specific programmes were designed with the objective of enhancing the European Commission's external communication of its activities. Subsequently, further plans for democracy, debate, and dialogue were devised with the objective of fostering open and extensive discourse between EU institutions and citizens. Further documentation on communication policy emphasised the necessity of developing linguistic techniques to re-engage disenchanted or uninformed citizens with the European project. It was observed that the European Union is perceived as a 'faceless' entity, which is attributed to the absence of a distinct public identity vis-à-vis its institutions. This contribution presents an analysis of a collection of informative publications regarding the European Union, entitled "Europe on the Move". This collection of booklets provides comprehensive information about the European Union, including its historical origins, core values, and historical development, as well as its achievements, strategic objectives, policies, and operational procedures. The theoretical framework adopted for the longitudinal linguistic analysis of EU discourse is that of Systemic Functional Linguistics (SFL). In more detail, this study considers two basic systems of relations between clauses: firstly, the degree of interdependency (or taxis) and secondly, the logico-semantic relation of expansion. 
The former refers to the structural markers of grammatical relations between clauses within sentences, namely paratactic, hypotactic and embedded relations. The latter pertains to the various logico-semantic relationships existing between the primary and secondary members of the clause nexus, that is, the ways in which the secondary clause expands the primary clause by (a) elaborating it, (b) extending it or (c) enhancing it. This study examines the impact of the European Commission's post-referendum communication methods on the portrayal of Europe, its role in facilitating the EU institutional process, and its articulation of a specific EU identity linked to distinct values. The research reveals that the language employed by the EU is evidently grounded in an explanatory logic, elucidating the rationale behind its institutionalised acts. Nevertheless, the minimal use of hypotaxis in the post-referendum booklets, coupled with the inconsistent yet increasing ratio of parataxis to hypotaxis, may suggest a potential shift towards a logic of appearance, characterised by a predominant reliance on coordination and on additive and elaborative logico-semantic relations. Keywords: systemic functional linguistics, logic of appearance, explanatory logic, interdependency, logico-semantic relation
Procedia PDF Downloads 8
25145 Reviewing Privacy Preserving Distributed Data Mining
Authors: Sajjad Baghernezhad, Saeideh Baghernezhad
Abstract:
Nowadays, given human involvement in ever-increasing data production, methods such as data mining for extracting knowledge are unavoidable. One issue in data mining is the inherently distributed nature of the data: the databases creating or receiving such data usually belong to corporate or private parties who do not give their information freely to others. Yet there is no guarantee that someone can mine particular data without intruding on the owner's privacy. Schemes that send data and then gather it again, over vertically or horizontally partitioned databases, depend on the type of privacy preservation adopted and are executed to improve data privacy. In this study, privacy-preserving data mining methods are compared comprehensively; general methods such as data randomization and encoding are examined, along with the strong and weak points of each. Keywords: data mining, distributed data mining, privacy protection, privacy preserving
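The random-data approach the survey mentions can be illustrated with a classic randomized-response sketch, in which each data holder perturbs a sensitive bit before sharing it and the miner debiases the aggregate without ever seeing true values. The parameters and data here are illustrative, not from the survey:

```python
# Sketch of the "random data" (randomized-response) idea: each
# record-holder keeps a sensitive bit with probability p_keep and
# flips it otherwise; the miner inverts the expected distortion.
import random

def perturb(bit, p_keep=0.75, rng=random):
    """Report the true bit with probability p_keep, else its flip."""
    return bit if rng.random() < p_keep else 1 - bit

def debias(reported_mean, p_keep=0.75):
    # E[reported] = (2p - 1) * true_mean + (1 - p), so invert linearly.
    return (reported_mean - (1 - p_keep)) / (2 * p_keep - 1)

# Invented population: 60% of 1000 holders have the sensitive attribute.
rng = random.Random(0)
true_bits = [1] * 600 + [0] * 400
reported = [perturb(b, rng=rng) for b in true_bits]
est = debias(sum(reported) / len(reported))
```

The estimate recovers the population proportion approximately while no individual report reveals its holder's true value with certainty, which is the trade-off such perturbation methods make between privacy and mining accuracy.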
Procedia PDF Downloads 525
25144 The Right to Data Portability and Its Influence on the Development of Digital Services
Authors: Roman Bieda
Abstract:
The General Data Protection Regulation (GDPR) will come into force on 25 May 2018, creating a new legal framework for the protection of personal data in the European Union. Article 20 of the GDPR introduces a right to data portability. This right allows data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit those data to another data controller. The right to data portability, by facilitating the transfer of personal data between IT environments (e.g., applications), will also facilitate changing the provider of a service (e.g., changing a bank or a cloud computing service provider). It will therefore contribute to the development of competition and of the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services. Keywords: data portability, digital market, GDPR, personal data
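Article 20's requirement of a "structured, commonly used and machine-readable format" is commonly satisfied in practice with JSON. A hypothetical sketch of such an export (the field names and payload shape are invented for illustration, not prescribed by the Regulation):

```python
# Illustrative Article 20 export: serialize the data a subject
# provided to a controller in a machine-readable format suitable
# for transmission to another controller. All field names invented.
import json

def export_portable_data(subject):
    payload = {
        "format_version": "1.0",
        "subject_id": subject["id"],
        # Article 20 covers data the subject provided, not data
        # the controller derived about them.
        "provided_data": subject["provided"],
    }
    return json.dumps(payload, indent=2, sort_keys=True)

subject = {
    "id": "u-42",
    "provided": {"email": "a@example.org", "preferences": ["news"]},
}
blob = export_portable_data(subject)
```

The receiving controller can parse the same format, which is exactly the interoperability that the paper argues drives competition between digital services.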
Procedia PDF Downloads 473
25143 Analyzing Emerging Scientific Domains in Biomedical Discourse: Case Study Comparing Microbiome, Metabolome, and Metagenome Research in Scientific Articles
Authors: Kenneth D. Aiello, M. Simeone, Manfred Laubichler
Abstract:
It is increasingly difficult to analyze emerging scientific fields, as contemporary fields are more dynamic, their boundaries are more porous, and the relational possibilities have multiplied with Big Data and new information sources. In biomedicine, where funding, medical categories, and medical jurisdiction depend on distinct boundaries between research fields and on definitions of concepts, ambiguity persists between the microbiome, metabolome, and metagenome research fields. This ambiguity continues despite efforts by institutions and organizations to establish parameters around the core concepts and research discourses. Further, the explosive growth of microbiome, metabolome, and metagenome research has led to unknown variation and covariation, making it difficult to apply findings across subfields or to reach consensus. This study explores the evolution and variation of knowledge within the microbiome, metabolome, and metagenome research fields in relation to ambiguous scholarly language and commensurable theoretical frameworks, via a semantic analysis of key concepts and narratives. A computational historical framework of cultural evolution and large-scale publication data highlight the boundaries and overlaps between the competing scientific discourses surrounding the three research areas. The results of this study highlight how discourse and language distribute power within scholarly and scientific networks, specifically the power to set and define norms, central questions, methods, and knowledge. Keywords: biomedicine, conceptual change, history of science, philosophy of science, science of science, sociolinguistics, sociology of knowledge
Procedia PDF Downloads 130
25142 Validation of Mapping Historical Linked Data to International Committee for Documentation (CIDOC) Conceptual Reference Model Using Shapes Constraint Language
Authors: Ghazal Faraj, András Micsik
Abstract:
Shapes Constraint Language (SHACL), a World Wide Web Consortium (W3C) language, defines "shape graphs": well-defined RDF graphs that validate other resource description framework (RDF) graphs, called "data graphs". The structural features of SHACL permit generating a variety of conditions that evaluate string-matching patterns, value types, and other constraints. Moreover, the SHACL framework supports high-level validation by expressing more complex conditions in languages such as the SPARQL Protocol and RDF Query Language (SPARQL). SHACL comprises two parts: SHACL Core, which includes the shapes covering the most frequent constraint components, and SHACL-SPARQL, an extension that allows SHACL to express more complex, customized constraints. Validating the efficacy of dataset mapping is an essential component of data reconciliation mechanisms, as enhancing the linking of different datasets is an ongoing process. The conventional validation methods are the semantic reasoner and SPARQL queries: the former checks formalization errors and data type inconsistency, while the latter detects data contradictions. After executing SPARQL queries, the retrieved information needs to be checked manually by an expert. However, this methodology is time-consuming and inaccurate, as it does not test the mapping model comprehensively. Therefore, there is a serious need for a new methodology that covers all validation aspects of linking and mapping diverse datasets. Our goal is to develop a new approach that achieves optimal validation outcomes. The first step towards this goal is implementing SHACL to validate the mapping between the International Committee for Documentation (CIDOC) conceptual reference model (CRM) and one of its ontologies. To initiate this project successfully, a thorough understanding of both source and target ontologies was required. 
Subsequently, the proper environment in which to run SHACL and its shape graphs was determined. As a case study, we applied SHACL to a CIDOC-CRM dataset after running the Pellet reasoner in the Protégé program. The applied validation falls into multiple categories: a) data type validation, which checks whether the source data is mapped to the correct data type, for instance whether a birthdate is typed as xsd:dateTime and linked to a Person entity via the crm:P82a_begin_of_the_begin property; b) data integrity validation, which detects inconsistent data, for instance by inspecting whether a person's birthdate occurred before the creation dates of any linked events. The expected results of our work are: 1) highlighting validation techniques and categories, and 2) selecting the most suitable techniques for the various categories of validation tasks. The next step is to establish a comprehensive validation model and generate SHACL shapes automatically. Keywords: SHACL, CIDOC-CRM, SPARQL, validation of ontology mapping
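The two validation categories can be mimicked, outside SHACL itself, by a plain-Python sketch; the property name echoes the CIDOC-CRM one cited above, but the records and checks are invented for illustration:

```python
# Not SHACL itself: a plain-Python sketch of the two validation
# categories the abstract lists, (a) datatype checks and (b) integrity
# checks such as "birthdate precedes linked event dates".
from datetime import date

def validate_person(person):
    violations = []
    # (a) Datatype validation: the birthdate must be a date value
    # (the analogue of requiring xsd:dateTime on the mapped literal).
    birth = person.get("P82a_begin_of_the_begin")
    if not isinstance(birth, date):
        violations.append("birthdate is not a date literal")
        return violations
    # (b) Integrity validation: birth must precede linked event dates.
    for event in person.get("events", []):
        if event["created"] < birth:
            violations.append(f"event {event['id']} predates birth")
    return violations

ok = {"P82a_begin_of_the_begin": date(1900, 1, 1),
      "events": [{"id": "e1", "created": date(1930, 5, 2)}]}
bad = {"P82a_begin_of_the_begin": date(1900, 1, 1),
       "events": [{"id": "e2", "created": date(1880, 1, 1)}]}
```

A real SHACL shape would express the datatype constraint declaratively (e.g., with a datatype constraint component) and the temporal comparison via a SHACL-SPARQL constraint, which is precisely the division between SHACL Core and SHACL-SPARQL described above.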
Procedia PDF Downloads 253
25141 Efficacy of Clickers in L2 Interaction
Authors: Ryoo Hye Jin Agnes
Abstract:
This study aims to investigate the efficacy of clickers in fostering L2 classroom interaction. In an L2 classroom, active learner-to-learner and learner-to-teacher interactions play an important role in language acquisition. In light of this, introducing learning tools that promote such interactions would benefit the L2 classroom. The anonymity of clickers allows learners to express their needs without the social risks associated with speaking up in class; clickers therefore efficiently help learners express their level of understanding during the learning process itself. This enables an evaluative feedback loop in which both learners and teachers understand the learners' progress, better allowing classrooms to adapt to learners' needs. Ultimately the tool promotes participation from learners, which in turn is believed to be effective in fostering classroom interaction, allowing learning to take place in a more comfortable yet vibrant way. The study concludes by presenting the results of an experiment conducted to verify the effectiveness of this approach when teaching pragmatic aspects of Korean expressions with similar semantic functions. The learning achievement of learners in the experimental group was found to be higher than that of learners in a control group. A survey was distributed to the learners, asking how the clickers contributed to their learning in areas such as motivation, self-assessment, increasing participation, and giving feedback to teachers. Analysis of the questionnaire data suggests that this approach increased the scope of interactivity in the classroom, not only increasing the amount of participation but also enhancing the type of classroom participation among learners. 
This participation in turn led to a marked improvement in their communicative abilities. Keywords: second language acquisition, interaction, clickers, learner response system, output from learners, learner's cognitive process
Procedia PDF Downloads 521
25140 Formalizing the Sense Relation of Hyponymy from Logical Point of View: A Study of Mathematical Linguistics in Farsi
Authors: Maryam Ramezankhani
Abstract:
The present research studies the possibility of formalizing the sense relation of hyponymy. It applies mathematical tools and concepts from mathematical logic, especially propositional logic. To do so, it first reviews the definitions of hyponymy presented in linguistic dictionaries and semantics textbooks. It then introduces a formal translation of the sense relation of hyponymy. Lastly, it examines the efficiency of the suggested formula against examples from natural language. Keywords: sense relations, hyponymy, formalizing, words' sense relation, formalizing sense relations
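A standard logical rendering of hyponymy, which may differ from the propositional formulation the paper proposes, treats the hyponym as one-way entailment of its hypernym:

```latex
% Hyponymy as one-way entailment: every sparrow is a bird,
% but not every bird is a sparrow.
\forall x\,\bigl(\mathrm{sparrow}(x) \rightarrow \mathrm{bird}(x)\bigr),
\qquad
\neg\,\forall x\,\bigl(\mathrm{bird}(x) \rightarrow \mathrm{sparrow}(x)\bigr)
```

On this rendering, testing a candidate formula against natural-language examples amounts to checking that the entailment holds in one direction only.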
Procedia PDF Downloads 239
25139 Commercial Management vs. Quantity Surveying: Hoax or Harmonization
Authors: Zelda Jansen Van Rensburg
Abstract:
Purpose: This study investigates the perceived disparities between Quantity Surveying and Commercial Management in the construction industry, questioning if these differences are substantive or merely semantic. It aims to challenge the conventional notion of Commercial Managers’ superiority by critically evaluating QS and CM roles, exploring CM integration possibilities, examining qualifications for aspiring Commercial Managers, assessing regulatory frameworks, and considering terminology redefinition for global QS professional enhancement. Design: Utilizing mixed methods like literature reviews, surveys, interviews, and document analyses, this research examines the QS-CM relationship. Insights from industry professionals, academics, and regulatory bodies inform the investigation into changing QS roles. Findings: Empirical data highlight evolving roles, showcasing areas of convergence and divergence between QSs and CM. Potential CM integration into QS practice and qualifications for aspiring Commercial Managers are identified. Limitations/Implications: Limitations include potential bias in self-reported data and findings. Nevertheless, the research informs future practices and educational approaches in QS and CM, reflecting the changing roles and responsibilities of Quantity Surveyors. Practical Implications: Findings inform industry practitioners, educators, and regulators, stressing the need to adapt to changing QS roles and integrate CM principles where applicable. Value to the Conference Theme: Aligned with ‘Evolving roles and responsibilities of Quantity Surveyors,’ this research offers insights crucial for understanding the changing dynamics within the QS profession and informs strategies to navigate these shifts effectively.Keywords: quantity surveying, commercial management, cost engineering, quantity survey
Procedia PDF Downloads 40
25138 Investigating Complement Clause Choice in Written Educated Nigerian English (ENE)
Authors: Juliet Udoudom
Abstract:
Inappropriate complement selection is one of the major features of non-standard complementation in the sentence constructions produced by Nigerian users of English. This paper investigates complement clause choice in Written Educated Nigerian English (ENE) and offers some results. It aims at determining the preferred and dispreferred patterns of complement clause selection, with respect to English verb heads, among selected Nigerian users of English. The complementation data analyzed in this investigation were obtained from experimental tasks designed to elicit the complement categories of verb, noun, adjective and prepositional heads in English. Insights from Government and Binding theory were employed in analyzing the data, which comprised responses obtained from one hundred subjects to a picture elicitation exercise, a grammaticality judgement test, and a free composition task. The findings indicate a general tendency for clausal complements (CPs) introduced by the complementizer that to be preferred by the subjects studied. Of the 235 tokens of clausal complements in our corpus, 128, representing 54.46%, were CPs headed by that, while whether- and if-clauses recorded 31.07% and 8.94%, respectively. The complement clause type with the lowest incidence of choice was the CP headed by the complementiser for, with a 5.53% incidence of occurrence. Further findings indicate that the semantic features of the relevant embedding verb heads were not taken into consideration in the choice of complementisers introducing the respective complement clauses; hence the that-clause was chosen to complement verbs like prefer. In addition, the dispreferred choice of the for-clause is explicable in terms of the fact that the respondents studied regard 'for' as a preposition rather than a complementiser. Keywords: complement, complement clause, complement selection, complementisers, government-binding
Procedia PDF Downloads 188
25137 The Lexical Eidos as an Invariant of a Polysemantic Word
Authors: S. Pesina, T. Solonchak
Abstract:
Phenomenological analysis is based not on natural language but on an ideal language capable of carrying ideal meanings (eidos) that represent typical structures or essences. For this purpose, it is necessary to detach a subject from its spatio-temporal definiteness and then state its noetic essence (eidos) by means of free fantasy generation. In this way, a totally new objectness is created: the universal, confirming the thesis that the thinking process proceeds in generalizations, passing by numerous means through the specific to the general and from the general through the specific to the singular. Keywords: lexical eidos, phenomenology, noema, polysemantic word, semantic core
Procedia PDF Downloads 277
25136 Recent Advances in Data Warehouse
Authors: Fahad Hanash Alzahrani
Abstract:
This paper describes recent advances in the quickly developing area of data storage and processing based on data warehouses and data mining techniques. These advances span the software, hardware, data mining algorithms, and visualisation techniques that share common features across the specific problems and tasks of their implementation. Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing
Procedia PDF Downloads 404
25135 How to Use Big Data in Logistics Issues
Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy
Abstract:
Big Data stands for today's cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their adversaries. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what Big Data is and how it is used in both military and commercial logistics. Keywords: big data, logistics, operational efficiency, risk management
Procedia PDF Downloads 641