Search results for: link data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25165

Search results for: link data

24895 The Markers -mm and dämmo in Amharic: Developmental Approach

Authors: Hayat Omar

Abstract:

Languages provide speakers with a wide range of linguistic units to organize and deliver information. There are several ways to verbally express the mental representations of events. According to the linguistic tools they have acquired, speakers select the one that brings out the most communicative effect to convey their message. Our study focuses on two markers, -mm and dämmo, in Amharic (Ethiopian Semitic language). Our aim is to examine, from a developmental perspective, how they are used by speakers. We seek to distinguish the communicative and pragmatic functions indicated by means of these markers. To do so, we created a corpus of sixty narrative productions of children from 5-6, 7-8 to 10-12 years old and adult Amharic speakers. The experimental material we used to collect our data is a series of pictures without text 'Frog, Where are you?'. Although -mm and dämmo are each used in specific contexts, they are sometimes analyzed as being interchangeable. The suffix -mm is complex and multifunctional. It marks the end of the negative verbal structure, it is found in the relative structure of the imperfect, it creates new words such as adverbials or pronouns, it also serves to coordinate words, sentences and to mark the link between macro-propositions within a larger textual unit. -mm was analyzed as marker of insistence, topic shift marker, element of concatenation, contrastive focus marker, 'bisyndetic' coordinator. On the other hand, dämmo has limited function and did not attract the attention of many authors. The only approach we could find analyzes it in terms of 'monosyndetic' coordinator. The paralleling of these two elements made it possible to understand their distinctive functions and refine their description. When it comes to marking a referent, the choice of -mm or dämmo is not neutral, depending on whether the tagged argument is newly introduced, maintained, promoted or reintroduced. The presence of these morphemes explains the inter-phrastic link. The information is seized by anaphora or presupposition: -mm goes upstream while dämmo arrows downstream, the latter requires new information. The speaker uses -mm or dämmo according to what he assumes to be known to his interlocutors. The results show that -mm and dämmo, although all the speakers use them both, do not always have the same scope according to the speaker and vary according to the age. dämmo is mainly used to mark a contrastive topic to signal the concomitance of events. It is more commonly used in young children’s narratives (F(3,56) = 3,82, p < .01). Some values of -mm (additive) are acquired very early while others are rather late and increase with age (F(3,56) = 3,2, p < .03). The difficulty is due not only because of its synthetic structure but primarily because it is multi-purpose and requires a memory work. It highlights the constituent on which it operates to clarify how the message should be interpreted.

Keywords: acquisition, cohesion, connection, contrastive topic, contrastive focus, discourse marker, pragmatics

Procedia PDF Downloads 122
24894 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before taking a decision about money. The aim of this work is to analyze the factors that influence the final price of the houses through data mining algorithms. To our best knowledge, previous work was researched just to compare results. Furthermore, before using the data of the data set, the Z-Transformation were used to standardize the data in the same range. Hence, the data was classified into two groups to visualize them in a readability format. A decision tree was built, and graphical data is displayed where clearly is easy to see the results and the factors' influence in these graphics. The definitions of these methods are described, as well as the descriptions of the results. Finally, conclusions and recommendations are presented related to the released results that our research showed making it easier to apply these algorithms using a customized data set.

Keywords: algorithms, data, decision tree, transformation

Procedia PDF Downloads 356
24893 Application of Blockchain Technology in Geological Field

Authors: Mengdi Zhang, Zhenji Gao, Ning Kang, Rongmei Liu

Abstract:

Management and application of geological big data is an important part of China's national big data strategy. With the implementation of a national big data strategy, geological big data management becomes more and more critical. At present, there are still a lot of technology barriers as well as cognition chaos in many aspects of geological big data management and application, such as data sharing, intellectual property protection, and application technology. Therefore, it’s a key task to make better use of new technologies for deeper delving and wider application of geological big data. In this paper, we briefly introduce the basic principle of blockchain technology at the beginning and then make an analysis of the application dilemma of geological data. Based on the current analysis, we bring forward some feasible patterns and scenarios for the blockchain application in geological big data and put forward serval suggestions for future work in geological big data management.

Keywords: blockchain, intellectual property protection, geological data, big data management

Procedia PDF Downloads 67
24892 Innovation and Performance of Very Small Agri-Food Enterprises in Cameroon

Authors: Ahmed Moustapha Mfokeu

Abstract:

Agri-food VSEs in Cameroon are facing a succession of crises, lack of security, particularly in the Far North, South West, and North West regions, the consequences of the Covid 19 crisis, and the war in Ukraine . These multiple crises have benefited the reception of the prices of the raw materials. Moreover, the exacerbation of competitive pressures is driven by the technological acceleration of productive systems in emerging countries which increase the demands imposed on the markets. The Cameroonian VSE must therefore be able to meet the new challenges of international competition, especially through innovation. The objective of this research is to contribute to the knowledge of the effects of innovation on the performance of very small agribusinesses in Cameroon. On the methodological level, the data were provided from a sample of 153 companies in the cities of Douala and Yaoundé. This research uses structural equation models with latent variables. The main results show that there is a positive and significant link between innovation and the performance of very small agri-food companies, so if it is important for entrepreneurs to encourage and practice innovation, it is also necessary to make them understand and make them like this aspect in their strategic function.

Keywords: innovation, performance, very small enterprise, agrifood

Procedia PDF Downloads 84
24891 Performance Evaluation of Routing Protocol in Cognitive Radio with Multi Technological Environment

Authors: M. Yosra, A. Mohamed, T. Sami

Abstract:

Over the past few years, mobile communication technologies have seen significant evolution. This fact promoted the implementation of many systems in a multi-technological setting. From one system to another, the Quality of Service (QoS) provided to mobile consumers gets better. The growing number of normalized standards extends the available services for each consumer, moreover, most of the available radio frequencies have already been allocated, such as 3G, Wifi, Wimax, and LTE. A study by the Federal Communications Commission (FCC) found that certain frequency bands are partially occupied in particular locations and times. So, the idea of Cognitive Radio (CR) is to share the spectrum between a primary user (PU) and a secondary user (SU). The main objective of this spectrum management is to achieve a maximum rate of exploitation of the radio spectrum. In general, the CR can greatly improve the quality of service (QoS) and improve the reliability of the link. The problem will reside in the possibility of proposing a technique to improve the reliability of the wireless link by using the CR with some routing protocols. However, users declared that the links were unreliable and that it was an incompatibility with QoS. In our case, we choose the QoS parameter "bandwidth" to perform a supervised classification. In this paper, we propose a comparative study between some routing protocols, taking into account the variation of different technologies on the existing spectral bandwidth like 3G, WIFI, WIMAX, and LTE. Due to the simulation results, we observe that LTE has significantly higher availability bandwidth compared with other technologies. The performance of the OLSR protocol is better than other on-demand routing protocols (DSR, AODV and DSDV), in LTE technology because of the proper receiving of packets, less packet drop and the throughput. Numerous simulations of routing protocols have been made using simulators such as NS3.

Keywords: cognitive radio, multi technology, network simulator (NS3), routing protocol

Procedia PDF Downloads 44
24890 Frequent Item Set Mining for Big Data Using MapReduce Framework

Authors: Tamanna Jethava, Rahul Joshi

Abstract:

Frequent Item sets play an essential role in many data Mining tasks that try to find interesting patterns from the database. Typically it refers to a set of items that frequently appear together in transaction dataset. There are several mining algorithm being used for frequent item set mining, yet most do not scale to the type of data we presented with today, so called “BIG DATA”. Big Data is a collection of large data sets. Our approach is to work on the frequent item set mining over the large dataset with scalable and speedy way. Big Data basically works with Map Reduce along with HDFS is used to find out frequent item sets from Big Data on large cluster. This paper focuses on using pre-processing & mining algorithm as hybrid approach for big data over Hadoop platform.

Keywords: frequent item set mining, big data, Hadoop, MapReduce

Procedia PDF Downloads 399
24889 The Role Of Data Gathering In NGOs

Authors: Hussaini Garba Mohammed

Abstract:

Background/Significance: The lack of data gathering is affecting NGOs world-wide in general to have good data information about educational and health related issues among communities in any country and around the world. For example, HIV/AIDS smoking (Tuberculosis diseases) and COVID-19 virus carriers is becoming a serious public health problem, especially among old men and women. But there is no full details data survey assessment from communities, villages, and rural area in some countries to show the percentage of victims and patients, especial with this world COVID-19 virus among the people. These data are essential to inform programming targets, strategies, and priorities in getting good information about data gathering in any society.

Keywords: reliable information, data assessment, data mining, data communication

Procedia PDF Downloads 162
24888 Research on Spatial Distribution of Service Facilities Based on Innovation Function: A Case Study of Zhejiang University Zijin Co-Maker Town

Authors: Zhang Yuqi

Abstract:

Service facilities are the boosters for the cultivation and development of innovative functions in innovative cluster areas. At the same time, reasonable service facilities planning can better link the internal functional blocks. This paper takes Zhejiang University Zijin Co-Maker Town as the research object, based on the combination of network data mining and field research and verification, combined with the needs of its internal innovative groups. It studies the distribution characteristics and existing problems of service facilities and then proposes a targeted planning suggestion. The main conclusions are as follows: (1) From the perspective of view, the town is rich in general life-supporting services, but lacking of provision targeted and distinctive service facilities for innovative groups; (2) From the perspective of scale structure, small-scale street shops are the main business form, lack of large-scale service center; (3) From the perspective of spatial structure, service facilities layout of each functional block is too fragile to fit the characteristics of 2aggregation- distribution' of innovation and entrepreneurial activities; (4) The goal of optimizing service facilities planning should be guided for fostering function of innovation and entrepreneurship and meet the actual needs of the innovation and entrepreneurial groups.

Keywords: the cultivation of innovative function, Zhejiang University Zijin Co-Maker Town, service facilities, network data mining, space optimization advice

Procedia PDF Downloads 95
24887 Social Business Process Management and Business Process Management Maturity

Authors: Dalia Suša Vugec, Vesna Bosilj Vukšić, Ljubica Milanović Glavan

Abstract:

Business process management (BPM) is a well-known holistic discipline focused on managing business processes with the intention of achieving higher level of BPM maturity and better organizational performance. In recent period, traditional BPM faced some of its limitations like model-reality divide and lost innovation. Following latest trends, as an attempt to overcome the issues of traditional BPM, there has been an introduction of applying the principles of social software in managing business processes which led to the development of social BPM. However, there are not many authors or studies dealing with this topic so this study aims to contribute to that literature gap and to examine the link between the level of BPM maturity and the usage of social BPM. To meet these objectives, a survey within the companies with more than 50 employees has been conducted. The results reveal that the usage of social BPM is higher within the companies which achieved higher level of BPM maturity. This paper provides an overview, analysis and discussion of collected data regarding BPM maturity and social BPM within the observed companies and identifies the main social BPM principles.

Keywords: business process management, BPM maturity, process performance index, social BPM

Procedia PDF Downloads 305
24886 Quantifying Stability of Online Communities and Its Impact on Disinformation

Authors: Victor Chomel, Maziyar Panahi, David Chavalarias

Abstract:

Misinformation has taken an increasingly worrying place in social media. Propagation patterns are closely linked to the structure of communities. This study proposes a method of community analysis based on a combination of centrality indicators for the network and its main communities. The objective is to establish a link between the stability of the communities over time, the social ascension of its members internally, and the propagation of information in the community. To this end, data from the debates about global warming and political communities on Twitter have been collected, and several tens of millions of tweets and retweets have helped us better understand the structure of these communities. The quantification of this stability allows for the study of the propagation of information of any kind, including disinformation. Our results indicate that the most stable communities over time are the ones that enable the establishment of nodes capturing a large part of the information and broadcasting its opinions. Conversely, communities with a high turnover and social ascendancy only stabilize themselves strongly in the face of adversity and external events but seem to offer a greater diversity of opinions most of the time.

Keywords: community analysis, disinformation, misinformation, Twitter

Procedia PDF Downloads 124
24885 The Application of Data Mining Technology in Building Energy Consumption Data Analysis

Authors: Liang Zhao, Jili Zhang, Chongquan Zhong

Abstract:

Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into the data mining technology to determine its application in the analysis of building energy consumption data including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature are reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.

Keywords: data mining, data analysis, prediction, optimization, building operational performance

Procedia PDF Downloads 833
24884 To Handle Data-Driven Software Development Projects Effectively

Authors: Shahnewaz Khan

Abstract:

Machine learning (ML) techniques are often used in projects for creating data-driven applications. These tasks typically demand additional research and analysis. The proper technique and strategy must be chosen to ensure the success of data-driven projects. Otherwise, even exerting a lot of effort, the necessary development might not always be possible. In this post, an effort to examine the workflow of data-driven software development projects and its implementation process in order to describe how to manage a project successfully. Which will assist in minimizing the added workload.

Keywords: data, data-driven projects, data science, NLP, software project

Procedia PDF Downloads 63
24883 Efficient Layout-Aware Pretraining for Multimodal Form Understanding

Authors: Armineh Nourbakhsh, Sameena Shah, Carolyn Rose

Abstract:

Layout-aware language models have been used to create multimodal representations for documents that are in image form, achieving relatively high accuracy in document understanding tasks. However, the large number of parameters in the resulting models makes building and using them prohibitive without access to high-performing processing units with large memory capacity. We propose an alternative approach that can create efficient representations without the need for a neural visual backbone. This leads to an 80% reduction in the number of parameters compared to the smallest SOTA model, widely expanding applicability. In addition, our layout embeddings are pre-trained on spatial and visual cues alone and only fused with text embeddings in downstream tasks, which can facilitate applicability to low-resource of multi-lingual domains. Despite using 2.5% of training data, we show competitive performance on two form understanding tasks: semantic labeling and link prediction.

Keywords: layout understanding, form understanding, multimodal document understanding, bias-augmented attention

Procedia PDF Downloads 131
24882 The Relationship Between Artificial Intelligence, Data Science, and Privacy

Authors: M. Naidoo

Abstract:

Artificial intelligence often requires large amounts of good quality data. Within important fields, such as healthcare, the training of AI systems predominately relies on health and personal data; however, the usage of this data is complicated by various layers of law and ethics that seek to protect individuals’ privacy rights. This research seeks to establish the challenges AI and data sciences pose to (i) informational rights, (ii) privacy rights, and (iii) data protection. To solve some of the issues presented, various methods are suggested, such as embedding values in technological development, proper balancing of rights and interests, and others.

Keywords: artificial intelligence, data science, law, policy

Procedia PDF Downloads 91
24881 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.

Keywords: simulation data, data summarization, spatial histograms, exploration, visualization

Procedia PDF Downloads 162
24880 The Effect of Evil Eye in the Individuals' Journey for Personhood within a Christian Orthodox Society

Authors: Nikolaos Souvlakis

Abstract:

The present paper negotiates the effect of 'the evil eye' on individuals' mental health while at the same time poses the problem of how the evil eye fits into the anthropological arena as a key question that forges a fundamental link between religion, anthropology and mental health professions. It is the argument of the paper that the evil eye is an essential and fundamental human phenomenon and therefore any scholarly field involved in its study must consider the insight it provides into the development of personhood. The study was an anthropological study in the geographical area of Corfu, a Greek Orthodox society uninfluenced by the Ottoman Islamic Culture. The paper aims to deepen our understanding of the evil eye as it analyses the interaction between the evil eye and gaze and how they affect the development of personhood; based on the empirical data collected from the fieldwork. Therefore, the paper adopts a psychoanalytic anthropology approach to facilitate a better understanding of the evil eye through the accounts of individuals’ journeys in the process of their development of personhood. Finally, the paper aims to offer a detailed analysis of the particular element of eye (‘I’) and, more specifically, of ‘the others’, as they relate to the phenomenon of the evil eye.

Keywords: gaze, evil eye, mental health, personhood

Procedia PDF Downloads 114
24879 Teacher's Gender and Primary School Pupils Achievement in Social Studies and Its Educational Implications on Pupils

Authors: Elizabeth Oyenike Abegunrin

Abstract:

This study is borne out of the dire need to improve the academic achievement of pupils in social studies. The paper attempted to reconcile the lacuna in teacher’s gender and primary school pupils’ achievement. With specific reference to Social Studies classroom, the aim of this study was to detail how pupils’ achievement is a function of the teacher’s gender as well as to establish the link (if any) between teacher’s gender and pupils’ educational achievement. The significance of this was to create gender-template standard for teachers, school owners, administrators and policy makers to follow in the course of engendering pupils’ achievement in Social Studies. By adopting a quasi-experimental research design, a sample of two hundred pupils was selected across five primary schools in Education District I, Lagos State and assigned to experimental and control groups. A 40-item Gender and Social Studies Achievement Test (GSSAT) was used to obtain data from the pupils. Having analyzed the data collected using Pearson Product Moment Correlation (PPMC), a reliability of 0.78 was obtained. Result revealed that teacher’s gender (male/female) had no significant effect on pupils’ achievement in Social Studies and that there was significant interaction effect of teacher’s commitment devoid of gender on the general education output of pupils in Social Studies. Taken together, the results revealed that there is a high degree correlation between teacher’s commitment and pupils academic achievement in social studies, and not gender-based. The study recommended that social studies teachers should re-assess their classroom instructional strategies and use more innovative instructional methods and techniques that will give the pupils equal opportunities to excel in social studies, rather than their gender differences.

Keywords: gender, academic achievement, social studies, primary school

Procedia PDF Downloads 195
24878 Management as a Proxy for Firm Quality

Authors: Petar Dobrev

Abstract:

There is no agreed-upon definition of firm quality. While profitability and stock performance often qualify as popular proxies of quality, in this project, we aim to identify quality without relying on a firm’s financial statements or stock returns as selection criteria. Instead, we use firm-level data on management practices across small to medium-sized U.S. manufacturing firms from the World Management Survey (WMS) to measure firm quality. Each firm in the WMS dataset is assigned a mean management score from 0 to 5, with higher scores identifying better-managed firms. This management score serves as our proxy for firm quality and is the sole criteria we use to separate firms into portfolios comprised of high-quality and low-quality firms. We define high-quality (low-quality) firms as those firms with a management score of one standard deviation above (below) the mean. To study whether this proxy for firm quality can identify better-performing firms, we link this data to Compustat and The Center for Research in Security Prices (CRSP) to obtain firm-level data on financial performance and monthly stock returns, respectively. We find that from 1999 to 2019 (our sample data period), firms in the high-quality portfolio are consistently more profitable — higher operating profitability and return on equity compared to low-quality firms. In addition, high-quality firms also exhibit a lower risk of bankruptcy — a higher Altman Z-score. Next, we test whether the stocks of the firms in the high-quality portfolio earn superior risk-adjusted excess returns. We regress the monthly excess returns on each portfolio on the Fama-French 3-factor, 4-factor, and 5-factor models, the betting-against-beta factor, and the quality-minus-junk factor. We find no statistically significant differences in excess returns between both portfolios, suggesting that stocks of high-quality (well managed) firms do not earn superior risk-adjusted returns compared to low-quality (poorly managed) firms. In short, our proxy for firm quality, the WMS management score, can identify firms with superior financial performance (higher profitability and reduced risk of bankruptcy). However, our management proxy cannot identify stocks that earn superior risk-adjusted returns, suggesting no statistically significant relationship between managerial quality and stock performance.

Keywords: excess stock returns, management, profitability, quality

Procedia PDF Downloads 77
24877 Algorithms used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data like GIS data is important to reduce the data and extract information. Therefore, the development of new techniques and tools that support the human in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. Thus, we introduce a set of database primitives or basic operations for spatial data mining which are sufficient to express most of the spatial data mining algorithms from the literature. This approach has several advantages. Similar to the relational standard language SQL, the use of standard primitives will speed-up the development of new data mining algorithms and will also make them more portable. We introduced a database-oriented framework for spatial data mining which is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths were defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives by a commercial DBMS were presented.

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 439
24876 Local Tax Map Software System Development

Authors: Smithinun Thairoongrojana

Abstract:

This research is a qualitative research with three main purposes: (1) to develop the local tax map software system to be linked to the main Local Tax Map System (LTAX3000) system; (2) to design and develop a program for tax data fieldwork on wireless devices and link it to LTAX3000 database of Surat Thani Municipality; (3) to develop the human resource responsible for the fieldwork to be able to use the program and maintain the system and also to be able to work with the dynamic of technologies. In-depth interviews with the two groups of samples, the board of Surat Thani Municipality and operation staff responsible for observing and taxing fieldworks were conducted. The result of this study demonstrates the new developed fieldworks system that can be used both stand-alone usage and networking usage. The fieldworks system to collect and store the variety of taxing information within Surat Thani Municipality will be explained. Then the fieldwork operation process development and the replacement of transferring and storing the information via the network communication.

Keywords: Local tax map, software system development, wireless devices, human resource

Procedia PDF Downloads 181
24875 Data Stream Association Rule Mining with Cloud Computing

Authors: B. Suraj Aravind, M. H. M. Krishna Prasad

Abstract:

There exist emerging applications of data streams that require association rule mining, such as network traffic monitoring, web click streams analysis, sensor data, data from satellites etc. Data streams typically arrive continuously in high speed with huge amount and changing data distribution. This raises new issues that need to be considered when developing association rule mining techniques for stream data. This paper proposes to introduce an improved data stream association rule mining algorithm by eliminating the limitation of resources. For this, the concept of cloud computing is used. Inclusion of this may lead to additional unknown problems which needs further research.

Keywords: data stream, association rule mining, cloud computing, frequent itemsets

Procedia PDF Downloads 482
24874 A Comprehensive Survey and Improvement to Existing Privacy Preserving Data Mining Techniques

Authors: Tosin Ige

Abstract:

Ethics must be a condition of the world, like logic. (Ludwig Wittgenstein, 1889-1951). As important as data mining is, it possess a significant threat to ethics, privacy, and legality, since data mining makes it difficult for an individual or consumer (in the case of a company) to control the accessibility and usage of his data. This research focuses on Current issues and the latest research and development on Privacy preserving data mining methods as at year 2022. It also discusses some advances in those techniques while at the same time highlighting and providing a new technique as a solution to an existing technique of privacy preserving data mining methods. This paper also bridges the wide gap between Data mining and the Web Application Programing Interface (web API), where research is urgently needed for an added layer of security in data mining while at the same time introducing a seamless and more efficient way of data mining.

Keywords: data, privacy, data mining, association rule, privacy preserving, mining technique

Procedia PDF Downloads 145
24873 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: big data, big data analytics, Hadoop, cloud

Procedia PDF Downloads 291
24872 Semantic Data Schema Recognition

Authors: Aïcha Ben Salem, Faouzi Boufares, Sebastiao Correia

Abstract:

The subject covered in this paper aims at assisting the user in its quality approach. The goal is to better extract, mix, interpret and reuse data. It deals with the semantic schema recognition of a data source. This enables the extraction of data semantics from all the available information, inculding the data and the metadata. Firstly, it consists of categorizing the data by assigning it to a category and possibly a sub-category, and secondly, of establishing relations between columns and possibly discovering the semantics of the manipulated data source. These links detected between columns offer a better understanding of the source and the alternatives for correcting data. This approach allows automatic detection of a large number of syntactic and semantic anomalies.

Keywords: schema recognition, semantic data profiling, meta-categorisation, semantic dependencies inter columns

Procedia PDF Downloads 404
24871 Access Control System for Big Data Application

Authors: Winfred Okoe Addy, Jean Jacques Dominique Beraud

Abstract:

Access control systems (ACs) are some of the most important components in safety areas. Inaccuracies of regulatory frameworks make personal policies and remedies more appropriate than standard models or protocols. This problem is exacerbated by the increasing complexity of software, such as integrated Big Data (BD) software for controlling large volumes of encrypted data and resources embedded in a dedicated BD production system. This paper proposes a general access control strategy system for the diffusion of Big Data domains since it is crucial to secure the data provided to data consumers (DC). We presented a general access control circulation strategy for the Big Data domain by describing the benefit of using designated access control for BD units and performance and taking into consideration the need for BD and AC system. We then presented a generic of Big Data access control system to improve the dissemination of Big Data.

Keywords: access control, security, Big Data, domain

Procedia PDF Downloads 118
24870 A Data Envelopment Analysis Model in a Multi-Objective Optimization with Fuzzy Environment

Authors: Michael Gidey Gebru

Abstract:

Most of Data Envelopment Analysis models operate in a static environment with input and output parameters that are chosen by deterministic data. However, due to ambiguity brought on shifting market conditions, input and output data are not always precisely gathered in real-world scenarios. Fuzzy numbers can be used to address this kind of ambiguity in input and output data. Therefore, this work aims to expand crisp Data Envelopment Analysis into Data Envelopment Analysis with fuzzy environment. In this study, the input and output data are regarded as fuzzy triangular numbers. Then, the Data Envelopment Analysis model with fuzzy environment is solved using a multi-objective method to gauge the Decision Making Units' efficiency. Finally, the developed Data Envelopment Analysis model is illustrated with an application on real data 50 educational institutions.

Keywords: efficiency, Data Envelopment Analysis, fuzzy, higher education, input, output

Procedia PDF Downloads 30
24869 DNA Methylation Score Development for In utero Exposure to Paternal Smoking Using a Supervised Machine Learning Approach

Authors: Cristy Stagnar, Nina Hubig, Diana Ivankovic

Abstract:

The epigenome is a compelling candidate for mediating long-term responses to environmental effects modifying disease risk. The main goal of this research is to develop a machine learning-based DNA methylation score, which will be valuable in delineating the unique contribution of paternal epigenetic modifications to the germline impacting childhood health outcomes. It will also be a useful tool in validating self-reports of nonsmoking and in adjusting epigenome-wide DNA methylation association studies for this early-life exposure. Using secondary data from two population-based methylation profiling studies, our DNA methylation score is based on CpG DNA methylation measurements from cord blood gathered from children whose fathers smoked pre- and peri-conceptually. Each child’s mother and father fell into one of three class labels in the accompanying questionnaires -never smoker, former smoker, or current smoker. By applying different machine learning algorithms to the accessible resource for integrated epigenomic studies (ARIES) sub-study of the Avon longitudinal study of parents and children (ALSPAC) data set, which we used for training and testing of our model, the best-performing algorithm for classifying the father smoker and mother never smoker was selected based on Cohen’s κ. Error in the model was identified and optimized. The final DNA methylation score was further tested and validated in an independent data set. This resulted in a linear combination of methylation values of selected probes via a logistic link function that accurately classified each group and contributed the most towards classification. The result is a unique, robust DNA methylation score which combines information on DNA methylation and early life exposure of offspring to paternal smoking during pregnancy and which may be used to examine the paternal contribution to offspring health outcomes.

Keywords: epigenome, health outcomes, paternal preconception environmental exposures, supervised machine learning

Procedia PDF Downloads 172
24868 Exploring the Spatial Characteristics of Mortality Map: A Statistical Area Perspective

Authors: Jung-Hong Hong, Jing-Cen Yang, Cai-Yu Ou

Abstract:

The analysis of geographic inequality heavily relies on the use of location-enabled statistical data and quantitative measures to present the spatial patterns of the selected phenomena and analyze their differences. To protect the privacy of individual instance and link to administrative units, point-based datasets are spatially aggregated to area-based statistical datasets, where only the overall status for the selected levels of spatial units is used for decision making. The partition of the spatial units thus has dominant influence on the outcomes of the analyzed results, well known as the Modifiable Areal Unit Problem (MAUP). A new spatial reference framework, the Taiwan Geographical Statistical Classification (TGSC), was recently introduced in Taiwan based on the spatial partition principles of homogeneous consideration of the number of population and households. Comparing to the outcomes of the traditional township units, TGSC provides additional levels of spatial units with finer granularity for presenting spatial phenomena and enables domain experts to select appropriate dissemination level for publishing statistical data. This paper compares the results of respectively using TGSC and township unit on the mortality data and examines the spatial characteristics of their outcomes. For the mortality data between the period of January 1st, 2008 and December 31st, 2010 of the Taitung County, the all-cause age-standardized death rate (ASDR) ranges from 571 to 1757 per 100,000 persons, whereas the 2nd dissemination area (TGSC) shows greater variation, ranged from 0 to 2222 per 100,000. The finer granularity of spatial units of TGSC clearly provides better outcomes for identifying and evaluating the geographic inequality and can be further analyzed with the statistical measures from other perspectives (e.g., population, area, environment.). The management and analysis of the statistical data referring to the TGSC in this research is strongly supported by the use of Geographic Information System (GIS) technology. An integrated workflow that consists of the tasks of the processing of death certificates, the geocoding of street address, the quality assurance of geocoded results, the automatic calculation of statistic measures, the standardized encoding of measures and the geo-visualization of statistical outcomes is developed. This paper also introduces a set of auxiliary measures from a geographic distribution perspective to further examine the hidden spatial characteristics of mortality data and justify the analyzed results. With the common statistical area framework like TGSC, the preliminary results demonstrate promising potential for developing a web-based statistical service that can effectively access domain statistical data and present the analyzed outcomes in meaningful ways to avoid wrong decision making.

Keywords: mortality map, spatial patterns, statistical area, variation

Procedia PDF Downloads 238
24867 Women’s Leadership for Sustainable Outcomes: On the Road to Gender Equality for a Better Tomorrow

Authors: Deepika Faugoo

Abstract:

Gender equality stands as the cornerstone of societal progress, intricately woven into the very essence of the 2030 Sustainable Development Goals (SDGs). Yet, the gender leadership gap remains a formidable obstacle hindering global equality. Despite women's educational advancements, their underrepresentation in senior roles persists as a baffling anomaly. Drawing from contemporary research, empirical evidence, and secondary data, this paper underscores the imperative of advancing women in leadership to drive SDGs related to empowerment and gender equality by 2030. It highlights the undeniable link between women leaders and sustainable outcomes, citing case studies and examples of their contributions to financial performance, prosperity, economic growth, and societal well-being. Exploring persistent barriers and emerging challenges, it offers actionable strategies to enhance women's representation in leadership, promising transformative benefits for organizations and societies. Amidst societal upheavals, gender equality emerges as a potent solution, catalyzing change toward a future where every voice resonates, ensuring no one is left behind.

Keywords: senior leadership, empowerment, SDGs, gender equality

Procedia PDF Downloads 33
24866 Fueling Efficient Reporting And Decision-Making In Public Health With Large Data Automation In Remote Areas, Neno Malawi

Authors: Wiseman Emmanuel Nkhomah, Chiyembekezo Kachimanga, Julia Huggins, Fabien Munyaneza

Abstract:

Background: Partners In Health – Malawi introduced one of Operational Researches called Primary Health Care (PHC) Surveys in 2020, which seeks to assess progress of delivery of care in the district. The study consists of 5 long surveys, namely; Facility assessment, General Patient, Provider, Sick Child, Antenatal Care (ANC), primarily conducted in 4 health facilities in Neno district. These facilities include Neno district hospital, Dambe health centre, Chifunga and Matope. Usually, these annual surveys are conducted from January, and the target is to present final report by June. Once data is collected and analyzed, there are a series of reviews that take place before reaching final report. In the first place, the manual process took over 9 months to present final report. Initial findings reported about 76.9% of the data that added up when cross-checked with paper-based sources. Purpose: The aim of this approach is to run away from manually pulling the data, do fresh analysis, and reporting often associated not only with delays in reporting inconsistencies but also with poor quality of data if not done carefully. This automation approach was meant to utilize features of new technologies to create visualizations, reports, and dashboards in Power BI that are directly fished from the data source – CommCare hence only require a single click of a ‘refresh’ button to have the updated information populated in visualizations, reports, and dashboards at once. Methodology: We transformed paper-based questionnaires into electronic using CommCare mobile application. We further connected CommCare Mobile App directly to Power BI using Application Program Interface (API) connection as data pipeline. This provided chance to create visualizations, reports, and dashboards in Power BI. Contrary to the process of manually collecting data in paper-based questionnaires, entering them in ordinary spreadsheets, and conducting analysis every time when preparing for reporting, the team utilized CommCare and Microsoft Power BI technologies. We utilized validations and logics in CommCare to capture data with less errors. We utilized Power BI features to host the reports online by publishing them as cloud-computing process. We switched from sharing ordinary report files to sharing the link to potential recipients hence giving them freedom to dig deep into extra findings within Power BI dashboards and also freedom to export to any formats of their choice. Results: This data automation approach reduced research timelines from the initial 9 months’ duration to 5. It also improved the quality of the data findings from the original 76.9% to 98.9%. This brought confidence to draw conclusions from the findings that help in decision-making and gave opportunities for further researches. Conclusion: These results suggest that automating the research data process has the potential of reducing overall amount of time spent and improving the quality of the data. On this basis, the concept of data automation should be taken into serious consideration when conducting operational research for efficiency and decision-making.

Keywords: reporting, decision-making, power BI, commcare, data automation, visualizations, dashboards

Procedia PDF Downloads 104