Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24243

Search results for: clustered data

24183 Modified Active (MA) Algorithm to Generate Semantic Web Related Clustered Hierarchy for Keyword Search

Authors: G. Leena Giri, Archana Mathur, S. H. Manjula, K. R. Venugopal, L. M. Patnaik

Abstract:

Keyword search in XML documents is based on the notion of lowest common ancestors in the labelled trees model of XML documents and has recently gained a lot of research interest in the database community. In this paper, we propose the Modified Active (MA) algorithm which is an improvement over the active clustering algorithm by taking into consideration the entity aspect of the nodes to find the level of the node pertaining to a particular keyword input by the user. A portion of the bibliography database is used to experimentally evaluate the modified active algorithm and results show that it performs better than the active algorithm. Our modification improves the response time of the system and thereby increases the efficiency of the system.

Keywords: keyword matching patterns, MA algorithm, semantic search, knowledge management

Procedia PDF Downloads 376

24182 Integrating Data Mining with Case-Based Reasoning for Diagnosing Sorghum Anthracnose

Authors: Mariamawit T. Belete

Abstract:

Cereal production and marketing are the means of livelihood for millions of households in Ethiopia. However, cereal production is constrained by technical and socio-economic factors. Among the technical factors, cereal crop diseases are the major contributors to low yield. The aim of this research is to integrate data mining with a knowledge-based system for sorghum anthracnose diagnosis that assists agriculture experts and development agents in making timely decisions. The anthracnose diagnosis system gathers information from the Melkassa agricultural research center and attempts to score anthracnose severity. Empirical research is designed for data exploration, modeling, and confirmatory procedures for hypothesis testing and prediction to draw a sound conclusion. WEKA (Waikato Environment for Knowledge Analysis) was employed for the modeling. Knowledge-based systems follow a variety of approaches depending on the knowledge representation method; case-based reasoning (CBR) is one of the popular approaches. CBR is a problem-solving strategy that uses previous cases to solve new problems. The system utilizes hidden knowledge extracted by clustering algorithms, specifically K-means clustering, from a sampled anthracnose dataset. Clustered cases with centroid values are mapped to jCOLIBRI, and then the integrator application is created using NetBeans with JDK 8.0.2. The main stages of a case-based reasoning model are retrieve, the similarity-measuring stage; reuse, which allows the domain expert to adapt the retrieved case solution to the current case; revise, to test the solution; and retain, to store the confirmed solution in the case base for future use. The system was evaluated for both system performance and user acceptance. For testing the prototype, seven test cases were used. 
Experimental results show that the system achieves average precision and recall values of 70% and 83%, respectively. User acceptance testing was also performed with five domain experts, and an average acceptance of 83% was achieved. Although the results of this study are promising, further investigation of hybrid approaches, such as rule-based reasoning and a pictorial retrieval process, is recommended.
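The retrieve and retain stages of the CBR cycle described in this abstract can be illustrated with a minimal sketch. It does not use jCOLIBRI or the authors' data: the feature pairs, the severity labels, and the distance-based similarity function are all hypothetical choices made only for illustration.

```python
import math


def similarity(a, b):
    """Similarity in (0, 1]; equals 1 for identical feature vectors."""
    return 1.0 / (1.0 + math.dist(a, b))


def retrieve(case_base, query, k=1):
    """Retrieve: return the k stored cases most similar to the query."""
    ranked = sorted(
        case_base,
        key=lambda case: similarity(case["features"], query),
        reverse=True,
    )
    return ranked[:k]


def retain(case_base, features, solution):
    """Retain: store a confirmed solution so later queries can reuse it."""
    case_base.append({"features": tuple(features), "solution": solution})


# Hypothetical case base: (lesion area %, humidity %) -> severity label.
CASE_BASE = [
    {"features": (5.0, 40.0), "solution": "severity 1"},
    {"features": (30.0, 70.0), "solution": "severity 3"},
    {"features": (60.0, 85.0), "solution": "severity 5"},
]

best = retrieve(CASE_BASE, (28.0, 72.0))[0]
print(best["solution"])
```

In a full cycle, the retrieved solution would be adapted (reuse) and checked by a domain expert (revise) before `retain` is called.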

Keywords: sorghum anthracnose, data mining, case based reasoning, integration

Procedia PDF Downloads 59

24181 Toward an Appropriate Index for Corporate Governance

Authors: Bita Mashayekhi, Farzaneh Jalali, Alemeh Yazdanian

Abstract:

This study contributes to identifying the corporate governance indices used in previous research by applying content analysis to relevant papers published in the 20 top accounting journals according to the Google Scholar ranking, dated from 1990 to 2016. For this purpose, 65 papers are scrutinized in depth, and the concepts of corporate governance are coded and categorized. The extracted indices are then clustered into 10 categories and 51 subcategories, and their frequencies are determined. Results show that the board of directors’ characteristics are employed most frequently in the reviewed papers, and the board of directors’ independence is the most frequent index, appearing in 97 percent of our sample. Duality, board size, and ownership structure are more frequent than the other extracted corporate governance indices.

Keywords: corporate governance, content analysis, corporate governance index, top accounting journals

Procedia PDF Downloads 314

24180 Productive Engagements and Psychological Wellbeing of Older Adults; An Analysis of HRS Dataset

Authors: Mohammad Didar Hossain

Abstract:

Background/Purpose: The purpose of this study was to examine the associations between productive engagements and the psychological well-being of older adults in the U.S. by analyzing cross-sectional data from a secondary dataset. Specifically, this paper analyzed the associations of four types of productive engagement (current work status, caregiving to family members, volunteering, and religious strength) with psychological well-being as the outcome variable. Methods: Data and sample: The study used data from the Health and Retirement Study (HRS). The HRS is a nationally representative prospective longitudinal cohort study that has conducted biennial surveys since 1992 of community-dwelling individuals 50 years of age or older on diverse issues. This analysis was based on the 2016 wave (cross-sectional) of the HRS dataset; the data collection period was April 2016 through August 2017. The samples were recruited from a multistage, national area-clustered probability sampling frame. Measures: Four variables were considered as predictors in this analysis. First, current working status was a binary variable measured as 0 = Yes and 1 = No. The second and third variables were caregiving and volunteering, respectively, both measured as 0 = Regularly and 1 = Irregularly. Finally, finding strength in religion was measured as 0 = Agree and 1 = Disagree. The outcome (well-being) variable was measured as 0 = High level of well-being and 1 = Low level of well-being. The control variables were age, measured in years; education, in the categories 0 = Low level of education and 1 = Higher level of education; and sex, in the categories 0 = male and 1 = female. Analysis and Results: Besides descriptive statistics, binary logistic regression analyses were applied to examine the association between the independent and dependent variables. 
The results showed that among the four independent variables, three of them including working status (OR: .392, p<.001), volunteering (OR: .471, p<.003) and strengths in religion (OR .588, p<.003), were significantly associated with psychological well-being while controlling for age, gender and education factors. Also, no significant association was found between the caregiving engagement of older adults and their psychological well-being outcome. Conclusions and Implications: The findings of this study are mostly consistent with the previous studies except for the caregiving engagements and their impact on older adults’ well-being outcomes. Therefore, the findings support the proactive initiatives from different micro to macro levels to facilitate opportunities for productive engagements for the older adults, and all of these may ultimately benefit their psychological well-being and life satisfaction in later life.

Keywords: productive engagements, older adults, psychological wellbeing, productive aging

Procedia PDF Downloads 135

24179 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many studies are concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from it.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 369

24178 Magnetic Survey for the Delineation of Concrete Pillars in Geotechnical Investigation for Site Characterization

Authors: Nuraddeen Usman, Khiruddin Abdullah, Mohd Nawawi, Amin Khalil Ismail

Abstract:

A magnetic survey is carried out in order to locate the remains of construction items, specifically concrete pillars. The conventional Euler deconvolution technique can perform the task, but it requires a fixed structural index (SI), whereas the construction items are made of materials with different shapes that require different (unknown) SIs. A Euler deconvolution technique that estimates the background, horizontal coordinates (xo and yo), depth and structural index (SI) simultaneously is prepared and used for this task. The synthetic model study carried out indicated that the new methodology can give a good estimate of location and does not depend on magnetic latitude. For field data, both the total magnetic field and gradiometer readings were collected simultaneously. The computed vertical derivatives and gradiometer readings are compared, and they show good correlation, signifying the effectiveness of the method. The filtering is carried out using an automated procedure, the analytic signal and other traditional techniques. The clustered depth solutions coincided with the high amplitudes of the analytic signal, and these are the possible target positions of the concrete pillars being sought. The targets under investigation are interpreted to be located at depths between 2.8 and 9.4 meters. Further follow-up surveys are recommended, as this marks the preliminary stage of the work.

Keywords: concrete pillar, magnetic survey, geotechnical investigation, Euler Deconvolution

Procedia PDF Downloads 234

24177 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications: A Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating, comparing, and integrating JSON data with XML data and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.
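One way such a check could look in practice is sketched below, using only the Python standard library. This is not the paper's method: the mapping rules (repeated XML tags treated as a JSON array, JSON scalars compared as strings, XML attributes ignored) and the helper names are assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Convert an XML element to nested dicts and strings (attributes ignored)."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = xml_to_dict(child)
        if child.tag in result:
            # A repeated tag is treated like a JSON array.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def normalize(value):
    """Turn JSON scalars into strings so they compare equal to XML text."""
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    if isinstance(value, bool):
        return "true" if value else "false"
    if value is None:
        return ""
    return str(value)


def json_matches_xml(json_text, xml_text):
    """Return True when both documents carry the same data under these rules."""
    root = ET.fromstring(xml_text)
    return normalize(json.loads(json_text)) == {root.tag: xml_to_dict(root)}


JSON_DOC = '{"order": {"id": 7, "item": ["pen", "ink"], "paid": true}}'
XML_DOC = "<order><id>7</id><item>pen</item><item>ink</item><paid>true</paid></order>"
print(json_matches_xml(JSON_DOC, XML_DOC))
```

A real test suite would also have to decide how to treat XML attributes, namespaces, single-element arrays, and number formatting (for example `7` versus `7.0`), none of which this sketch handles.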

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 85

24176 Automatic Moment-Based Texture Segmentation

Authors: Tudor Barbu

Abstract:

An automatic moment-based texture segmentation approach is proposed in this paper. First, we describe the related work in this computer vision domain. Our texture feature extraction, the first part of the texture recognition process, produces a set of moment-based feature vectors. For each image pixel, a texture feature vector is computed as a sequence of area moments. Second, an automatic pixel classification approach is proposed. The feature vectors are clustered using an unsupervised classification algorithm, the optimal number of clusters being determined using a measure based on validation indexes. From the resulting pixel classes, one easily determines the desired texture regions of the image.

Keywords: image segmentation, moment-based, texture analysis, automatic classification, validation indexes

Procedia PDF Downloads 386

24175 A Comparative Analysis of Clustering Approaches for Understanding Patterns in Health Insurance Uptake: Evidence from Sociodemographic Kenyan Data

Authors: Nelson Kimeli Kemboi Yego, Juma Kasozi, Joseph Nkruzinza, Francis Kipkogei

Abstract:

The study investigated the low uptake of health insurance in Kenya despite efforts to achieve universal health coverage through various health insurance schemes. Unsupervised machine learning techniques were employed to identify patterns in health insurance uptake based on sociodemographic factors among Kenyan households. The aim was to identify key demographic groups that are underinsured and to provide insights for the development of effective policies and outreach programs. Using the 2021 FinAccess Survey, the study clustered Kenyan households based on their health insurance uptake and sociodemographic features to reveal patterns in health insurance uptake across the country. The effectiveness of k-prototypes clustering, hierarchical clustering, and agglomerative hierarchical clustering in clustering based on sociodemographic factors was compared. The k-prototypes approach was found to be the most effective at uncovering distinct and well-separated clusters in the Kenyan sociodemographic data related to health insurance uptake based on silhouette, Calinski-Harabasz, Davies-Bouldin, and Rand indices. Hence, it was utilized in uncovering the patterns in uptake. The results of the analysis indicate that inclusivity in health insurance is greatly related to affordability. The findings suggest that targeted policy interventions and outreach programs are necessary to increase health insurance uptake in Kenya, with the ultimate goal of achieving universal health coverage. The study provides important insights for policymakers and stakeholders in the health insurance sector to address the low uptake of health insurance and to ensure that healthcare services are accessible and affordable to all Kenyans, regardless of their socio-demographic status. The study highlights the potential of unsupervised machine learning techniques to provide insights into complex health policy issues and improve decision-making in the health sector.
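To make one of the validation indices named above concrete, here is a minimal sketch of the silhouette coefficient. It assumes purely numeric features and Euclidean distance, which is a simplification: the study clusters mixed sociodemographic data with k-prototypes, so its dissimilarity measure would differ. The toy points are invented for illustration.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_distance(point, members):
        return sum(math.dist(point, m) for m in members) / len(members)

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so divide by len(own) - 1).
        a = sum(math.dist(point, m) for m in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean_distance(point, members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


POINTS = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_score(POINTS, [0, 0, 1, 1]))  # compact, well-separated grouping
print(silhouette_score(POINTS, [0, 1, 0, 1]))  # grouping that splits the natural pairs
```

Values near 1 indicate compact, well-separated clusters; values below 0 suggest points sit closer to another cluster than to their own.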

Keywords: health insurance, unsupervised learning, clustering algorithms, machine learning

Procedia PDF Downloads 85

24174 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflect different aspects of people, events and activities. Data generated by various systems differ in form, and their sources differ because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need to be fused. Data from different sources, collected in different ways, raise several issues that need to be resolved. Problems of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data and space data. Metadata must be available for reference whenever an application needs to access, manipulate and display the data. Uniform metadata management ensures the effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 349

24173 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays, given the human role in the rapid growth of data, methods such as data mining for extracting knowledge are unavoidable. One topic in data mining is the inherent distribution of data: the databases that create or receive such data usually belong to corporate or non-corporate entities that do not share their information freely with others. Yet there is no guarantee that someone can mine particular data without intruding on the owner’s privacy. Sending data and then gathering it, under either vertical or horizontal partitioning, depends on the type of privacy preservation used and is carried out so as to improve data privacy. This study attempts a comprehensive comparison of privacy-preserving methods; general methods such as data randomization and coding are also examined, along with the strengths and weaknesses of each.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 489

24172 Low Density Lipoprotein: The Culprit in the Development of Obesity

Authors: Ojiegbe Ikenna Nathan

Abstract:

Obesity is a medical condition in which excess body fat has accumulated to the extent that it leads to reduced life expectancy and/or increased health problems. Obesity is a worldwide problem that clusters in families and passes from generation to generation. It causes disability, morbidity and mortality if left unattended. The predisposing factors to obesity are either genetic or environmental in origin. Nevertheless, the main predisposing factor to obesity is the excessive consumption of food rich in low-density lipoprotein (LDL), such as organ meats and saturated fats. This low-density lipoprotein causes an increase in adipose tissue, which leads to obesity. There are varieties of obesity that one needs to take appropriate measures to avoid, such as android, gynoid and morbid obesity. Nonetheless, studies have shown that there is hope for obese individuals, regardless of the cause, type and degree of their obesity. This is through the different available treatment measures, which include increased physical activity, caloric restriction, drug therapy and surgical intervention.

Keywords: low-density, lipoprotein, culprit, obesity

Procedia PDF Downloads 373

24171 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR) will come into force on 25 May 2018, creating a new legal framework for the protection of personal data in the European Union. Article 20 of the GDPR introduces a right to data portability. This right allows data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating the transfer of personal data between IT environments (e.g. applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 442

24170 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 367

24169 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their adversaries. Among the many areas of Big Data usage, logistics has a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 609

24168 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, using a board to collect data is difficult, particularly when the user is not a computer programmer or is using it for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.
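The entry's keywords list DBSCAN among the clustering methods, so a small sketch of that algorithm applied to sensor readings may help show what such a library's analysis step could involve. This is an illustrative, unoptimized implementation and not the authors' library; the function name, the parameters `eps` and `min_pts`, and the temperature readings are assumptions.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE for outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Hypothetical temperature readings (one value per sample).
READINGS = [(20.0,), (20.5,), (21.0,), (35.0,), (35.4,), (35.9,), (80.0,)]
print(dbscan(READINGS, eps=1.0, min_pts=2))
```

Unlike k-means, this approach needs no preset number of clusters and flags isolated readings as noise, which can be convenient for spotting faulty sensor samples.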

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 344

24167 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver (big) data-related integrated services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services, such as data storage and hosting, to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 128

24166 Evidence for Replication of an Unusual G8P[14] Human Rotavirus Strain in the Feces of an Alpine Goat: Zoonotic Transmission from Caprine Species

Authors: Amine Alaoui Sanae, Tagjdid Reda, Loutfi Chafiqa, Melloul Merouane, Laloui Aziz, Touil Nadia, El Fahim, E. Mostafa

Abstract:

Background: Rotavirus group A (RVA) strains with G8P[14] specificities are usually detected in calves and goats. However, these strains have been reported globally in humans and have often been characterized as originating from zoonotic transmissions, particularly in areas where ruminants and humans live side-by-side. Whether transmission of human P[14] genotypes is two-way, so that they can be transmitted to animal species, remains to be established. Here we describe the VP4 deduced amino-acid relationships of three Moroccan P[14] genotypes originating from different species and the receptiveness of an alpine goat to a human G8P[14] strain through an experimental infection. Material/methods: The human MA31 RVA strain was originally identified in a four-year-old girl with acute gastroenteritis hospitalized at the pediatric care unit in Rabat Hospital in 2011. The virus was isolated and propagated in MA104 cells in the presence of trypsin. The Ch_10S and 8045_S animal RVA strains were identified in fecal samples of a 2-week-old native goat and a 3-week-old calf with diarrhea in 2011 in Bouaarfa and My Bousselham, respectively. Genomic RNAs of all strains were subjected to a two-step RT-PCR and sequenced using the VP4 consensus primers. The phylogenetic tree for MA31, Ch_10S and 8045_S VP4 and a set of published P[14] genotypes was constructed using MEGA6 software. The receptivity of an eight-month-old alpine goat to the MA31 strain was assayed. The animal was orally and intraperitoneally inoculated with a dose of 8.5 TCID50 of virus stock at passage level 3. The shedding of the virus was tested by a real-time RT-PCR assay. Results: The phylogenetic tree showed that the three Moroccan strains MA31, Ch_10S and 8045_S VP4 were highly related to each other (100% similar at the nucleotide level). They clustered together with the B10925, Sp813, PA77 and P169 strains isolated in Belgium, Spain and Italy, respectively. The Belgian strain B10925 was the most closely related to the Moroccan strains. 
In contrast, the 8045_S and Ch_10S strains clustered distantly from the Tunisian calf strain B137 and the goat strain cap455 isolated in South Africa, respectively. The human MA31 RVA strain was able to induce bloody diarrhea at 2 days post infection (dpi) in the alpine goat kid. RVA shedding started by 2 dpi (Ct value of 28) and continued until 5 dpi (Ct value of 25) with a concomitant elevation in body temperature. Conclusions: Our study, while limited to one animal, is the first to prove experimentally that a human P[14] genotype causes diarrhea and virus shedding in the goat. This result reinforces the potential role of interspecies transmission in generating novel and rare rotavirus strains, such as G8P[14], which infect humans.

Keywords: interspecies transmission, rotavirus, goat, human

Procedia PDF Downloads 255

24165 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, and veracity and comes from a variety of sources is generated in all sectors, including the government sector. Globally, public administrations are pursuing (big) data as a new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can lead to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of the government (big) data ecosystem. The essential aspects of the government (big) data ecosystem include definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on the government (big) data ecosystem and therefore focused our research on various relevant areas like humanitarian data, open government data, scientific research data, industry data, etc.

Keywords: applications of big data, big data, big data types, big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 180

24164 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, only about 1% of generated data is ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which are a subset of data analytics. The developed framework is designed to assist data analysts with little experience in choosing the appropriate machine learning algorithm given the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 137

24163 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we are generating a lot of data, and we need a specific device to store all these data. Generally, we store data on pen drives, hard drives, etc. Sometimes we may lose the data due to the corruption of devices. To overcome these issues, we implemented a cloud space for storing the data, which provides more security to the data. We can access the data from anywhere in the world using just the internet. We implemented all of this in Java using the NetBeans IDE. Once a user uploads the data, they do not have any rights to change it. Users' uploaded files are stored in the cloud with the system time as the file name, and the directory is created with some random words. The cloud accepts the data only if the size of the file is less than 2MB.
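The storage rules stated in this abstract (size limit, system-time file name, randomly named directory) can be sketched as follows. The authors implemented their system in Java; this Python version is only an illustration, it omits the AES encryption and FTP transfer mentioned in the title and keywords, and the word list, the three-word directory name, and the reading of "2MB" as 2 MiB are assumptions.

```python
import secrets
import time
from pathlib import Path

MAX_BYTES = 2 * 1024 * 1024  # assumption: "2MB" is read as 2 MiB

# Hypothetical vocabulary for the randomly named directories.
WORDS = ["amber", "birch", "cedar", "delta", "ember", "fjord"]


def store_upload(data: bytes, root: Path) -> Path:
    """Store an upload under a random-word directory, named by system time."""
    if len(data) >= MAX_BYTES:
        raise ValueError("file must be smaller than 2 MB")
    directory = root / "-".join(secrets.choice(WORDS) for _ in range(3))
    directory.mkdir(parents=True, exist_ok=True)
    # The file name is the current system time in nanoseconds.
    target = directory / str(time.time_ns())
    target.write_bytes(data)
    return target
```

A call such as `store_upload(b"report", Path("cloud_root"))` would return the path of the stored file. A production design would also need to handle name collisions and to record which user owns which path.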

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 174

24162 Identification of Clinical Characteristics from Persistent Homology Applied to Tumor Imaging

Authors: Eashwar V. Somasundaram, Raoul R. Wadhwa, Jacob G. Scott

Abstract:

The use of radiomics in measuring geometric properties of tumor images such as size, surface area, and volume has been invaluable in assessing cancer diagnosis, treatment, and prognosis. In addition to analyzing geometric properties, radiomics would benefit from measuring topological properties using persistent homology. Intuitively, features uncovered by persistent homology may correlate to tumor structural features. One example is necrotic cavities (corresponding to 2D topological features), which are markers of very aggressive tumors. We develop a data pipeline in R that clusters tumor images based on persistent homology, which is used to identify meaningful clinical distinctions between tumors and possibly new relationships not captured by established clinical categorizations. A preliminary analysis was performed on 16 Magnetic Resonance Imaging (MRI) breast tissue segments downloaded from the 'Investigation of Serial Studies to Predict Your Therapeutic Response with Imaging and Molecular Analysis' (I-SPY TRIAL or ISPY1) collection in The Cancer Imaging Archive. Each segment represents a patient’s breast tumor prior to treatment. The ISPY1 dataset also provided the estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) status data. A persistent homology matrix up to 2-dimensional features was calculated for each of the MRI segmentations. Wasserstein distances were then calculated between all pairwise tumor image persistent homology matrices to create a distance matrix for each feature dimension. Since Wasserstein distances were calculated for 0, 1, and 2-dimensional features, three hierarchical clusterings were constructed. The adjusted Rand Index was used to see how well the clusters corresponded to the ER/PR/HER2 status of the tumors. Triple-negative cancers (negative status for all three receptors) significantly clustered together in the 2-dimensional features dendrogram (Adjusted Rand Index of .35, p = .031). 
It is known that having a triple-negative breast tumor is associated with aggressive tumor growth and poor prognosis when compared to non-triple-negative breast tumors. The aggressive tumor growth associated with triple-negative tumors may produce a unique structure in an MRI segmentation, which persistent homology is able to identify. This preliminary analysis shows promising results in the use of persistent homology on tumor imaging to assess the severity of breast tumors. The next step is to apply this pipeline to other tumor segment images from The Cancer Imaging Archive at different sites such as the lung, kidney, and brain. In addition, we will assess whether other clinical parameters, such as overall survival, tumor stage, and tumor genotype data, are captured well in persistent homology clusters. If analyzing tumor MRI segments using persistent homology consistently identifies clinical relationships, this could enable clinicians to use persistent homology data as a noninvasive way to inform clinical decision making in oncology.
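The agreement measure used in this abstract, the adjusted Rand Index, can be illustrated with a minimal sketch. The code below is written in Python rather than the authors' R pipeline, and the function name `adjusted_rand_index` and the example labels are hypothetical; they are not taken from the study and do not reproduce its data or its reported value.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must describe the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Count pairs of items that share a group in both partitions, in A, and in B.
    together_in_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Expected pair agreement under random labelling with the same group sizes.
    expected = together_in_a * together_in_b / comb(n, 2)
    maximum = (together_in_a + together_in_b) / 2
    if maximum == expected:
        return 1.0  # both partitions are trivial (e.g. one single group each)
    return (together_in_both - expected) / (maximum - expected)


# Made-up illustration: dendrogram cut into two clusters vs. triple-negative status.
cluster_labels = [0, 0, 0, 1, 1, 1, 1, 1]
triple_negative = [1, 1, 0, 0, 0, 0, 0, 0]
ari = adjusted_rand_index(cluster_labels, triple_negative)
```

A value of 1 indicates identical partitions, values near 0 indicate agreement no better than chance, and negative values indicate less agreement than chance. The p-value reported in the abstract would need a separate permutation or similar test, which this sketch does not cover.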

Keywords: cancer biology, oncology, persistent homology, radiomics, topological data analysis, tumor imaging

Procedia PDF Downloads 104

24161 Circulating Public Perception on Agroforestry: Discourse Networks Analysis Using Social Media and Online News Media in Four Countries of the Sahel Region

Authors: Luisa Müting, Wisnu Harto Adiwijoyo

Abstract:

Agroforestry systems transform the agricultural landscapes in the Sahel region of Africa, providing food and farming products consumed for subsistence or sold for income. In the increasingly dry climate of the Sahel region, the spread of agroforestry practices is integral to policymakers' efforts to counteract land degradation and restore soil in the region. Several measures on agroforestry practices have been implemented in the region by governmental and non-governmental institutions in recent years. However, despite these efforts, past research shows that awareness of how policies and interventions are being consumed and perceived by the public remains low. Therefore, interpreting public policy dilemmas by analyzing public perception of agroforestry concepts and practices is necessary. Public perceptions and discourses can be an essential driver of or constraint on the adoption of agroforestry practices in the region. Thus, understanding the public discourse behavior of crucial stakeholders could assist policymakers in developing inclusive policies that are relevant to the context of agroforestry adoption in the Sahel region. To answer how information about agroforestry spreads and is perceived by the public, this research is based on online conversation data, as internet usage has increased drastically over the past decade, with 33 percent of the population now connected to the internet. Social media data from Facebook were gathered daily between April 2021 and April 2022 in Djibouti, Senegal, Mali, and Nigeria, which were chosen based on their share of active internet users compared to other countries in the Sahel region. A systematic methodology was applied to the extracted social media data using discourse network analysis (DNA). This study then clustered the data by type of agroforestry practice, sentiment, and country.
Additionally, this research extracted text data from online news media during the same period to pinpoint events related to the topic of agroforestry. The preliminary results indicate that tree management, crop and livestock integration, diversifying species and genetic resources, and focusing on interactions and productivity across the agricultural system are the most notable keywords in agroforestry-related conversations within the four countries in the Sahel region. Additionally, approximately 84 percent of the discussions were still dominated by big actors, such as NGOs or government actors. Furthermore, as a subject of communication within agroforestry discourse, the Great Green Wall initiative generates almost 60 percent positive sentiment within the captured social media data, effectively having a more significant outreach than general agroforestry topics. This study provides scholars and policymakers with a springboard for further research or policy design on agroforestry in the four countries of the Sahel region, using novel internet data that had not previously been captured systematically.

Keywords: sahel, djibouti, senegal, mali, nigeria, social networks analysis, public discourse analysis, sentiment analysis, content analysis, social media, online news, agroforestry, land restoration

Procedia PDF Downloads 66

24160 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business intelligence is a methodology that systematically exploits data to produce information and knowledge, and it can thereby support the decision-making process. Methods in business intelligence include data warehousing and data mining. A data warehouse stores historical data derived from transactional data. For data modelling in the data warehouse, we apply Kimball's dimensional modelling. Data mining is used to extract patterns and gain insight from the data. Data mining has many techniques, one of which is segmentation. For profiling telecommunication customers, we segment customers according to their usage of services, invoices, and payments. Customers can be grouped according to their characteristics, and the profitable customers can be identified. We apply the K-Means clustering algorithm for segmentation, using the RFM (Recency, Frequency, and Monetary) model as its input variables. All data mining processes are carried out with IBM SPSS Modeler.
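The segmentation step described above can be sketched as follows. This is only an illustrative Python sketch under stated assumptions, not the authors' IBM SPSS Modeler workflow: the transaction records, the helper names (`rfm_features`, `min_max_scale`, `kmeans`), the min-max scaling step, and the choice of two segments with hand-picked starting centroids are all hypothetical.

```python
from datetime import date


def rfm_features(transactions, today):
    """Build {customer: [recency_days, frequency, monetary]} from (customer, date, amount) rows."""
    rfm = {}
    for customer, day, amount in transactions:
        recency = (today - day).days
        if customer not in rfm:
            rfm[customer] = [recency, 1, amount]
        else:
            row = rfm[customer]
            row[0] = min(row[0], recency)  # days since the most recent purchase
            row[1] += 1                    # number of purchases
            row[2] += amount               # total spend
    return rfm


def min_max_scale(rows):
    """Rescale each column to [0, 1] so no single RFM variable dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iterations=50):
    """Plain K-Means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iterations):
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)] if members else old)
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return labels, centroids


# Hypothetical transactions: (customer id, purchase date, amount).
today = date(2024, 1, 31)
transactions = [
    ("a", date(2024, 1, 30), 100.0), ("a", date(2024, 1, 20), 120.0), ("a", date(2024, 1, 10), 90.0),
    ("b", date(2024, 1, 29), 110.0), ("b", date(2024, 1, 15), 95.0), ("b", date(2024, 1, 5), 105.0),
    ("c", date(2023, 10, 1), 10.0),
    ("d", date(2023, 9, 15), 12.0),
]
rfm = rfm_features(transactions, today)
customers = list(rfm)
scaled = min_max_scale([rfm[c] for c in customers])
labels, centroids = kmeans(scaled, centroids=[scaled[0], scaled[2]])
segments = dict(zip(customers, labels))
```

In this toy data, the recent, frequent, high-spending customers fall into one segment and the lapsed, low-spending customers into the other. In practice the number of segments and the initial centroids would need to be chosen and validated rather than fixed by hand.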

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 443

24159 The Efficiency of Cytochrome Oxidase Subunit 1 Gene (cox1) in Reconstruction of Phylogenetic Relations among Some Crustacean Species

Authors: Yasser M. Saad, Heba El-Sebaie Abd El-Sadek

Abstract:

Some Metapenaeus monoceros cox1 gene fragments were isolated, purified, sequenced, and comparatively analyzed with other crustacean cox1 gene sequences (obtained from the National Center for Biotechnology Information). This work was designed to test the efficiency of this system in reconstructing phylogenetic relations among crustacean species belonging to four genera (Metapenaeus, Artemia, Daphnia, and Calanus). The single nucleotide polymorphism and haplotype diversity were calculated for all estimated mt-DNA fragments. The genetic distance values were 0.292, 0.015, 0.151, and 0.09 within Metapenaeus species, Calanus species, Artemia species, and Daphnia species, respectively. The reconstructed phylogenetic tree clustered into several unique clades. The cytochrome oxidase subunit 1 gene (cox1) proved to be a powerful system for reconstructing phylogenetic relations among the evaluated crustacean species.

Keywords: crustaceans, genetics, Cox1, phylogeny

Procedia PDF Downloads 329

24158 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of using statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on microarray data sets is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 533

24157 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensor data is important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. Thus, this paper introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer at the sensors. Simulation results show that PDDA outperforms existing tree- and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 302

24156 The Impact of Leadership Style and Sense of Competence on the Performance of Post-Primary School Teachers in Oyo State, Nigeria

Authors: Babajide S. Adeokin, Oguntoyinbo O. Kazeem

Abstract:

The not-so-pleasing state of the nation's quality of education has been a major area of research. Many researchers have looked into various aspects of the educational system and organizational structure in relation to the quality of service delivery of staff members. However, there is a paucity of research in areas relating to the sense of competence and commitment in relation to leadership styles. Against this backdrop, this study investigated the impact of leadership style and sense of competence on the performance of post-primary school teachers in Oyo State, Nigeria. Data were generated across public secondary schools in the city of Ibadan using a survey design. The Ibadan metropolis contains eleven local government areas. Systematic random sampling was applied to the eleven local government areas, and five were selected: Akinyele, Ibadan North, Ibadan North-East, Ibadan South, and Ibadan South-West. Data were obtained from two to three public secondary schools selected in each of these local government areas. These secondary schools represent the variations in the constructs under consideration across the Ibadan metropolis. All secondary school teachers in Ibadan were clustered into the schools selected across the five local government areas. In all, 272 questionnaires were administered to public secondary school teachers, and 241 were returned. Findings revealed that the transformational leadership style makes room for job commitment when compared with transactional and laissez-faire leadership styles. Teachers with a high sense of competence are more likely to demonstrate commitment to their job than those with a low sense of competence. We recommend that an assessment be made of the leadership styles employed by principals and school administrators.
This would give administrators and principals a clear, comprehensive knowledge of the style they currently adopt in managing the staff and the school as a whole, and show them where to begin the adjustment process. Also, to make an impact on student achievement, being attentive to teachers’ levels of commitment may be an important aspect of leadership for school principals.

Keywords: Ibadan, leadership style, sense of competence, teachers, public secondary schools

Procedia PDF Downloads 266

24155 Malaysian ESL Writing Process: A Comparison with England’s

Authors: Henry Nicholas Lee, George Thomas, Juliana Johari, Carmilla Freddie, Caroline Val Madin

Abstract:

Research in comparative and international education often provides value-laden views of an education system within and between countries. These views are frequently used by policy makers or educators to explore similarities and differences for, among others, benchmarking purposes. In this study, a comparison is made between Malaysia and England, focusing on the writing process children went through to create a text and using a multimodal theoretical framework to analyse this comparison. The main purpose is political in nature, as the study serves as an answer to Malaysia’s call for benchmarking of best practices for language learning. Furthermore, the focus on writing in this study adds to empirical findings about early writers’ writing development and writing improvement, especially for children aged 5-9. In research, comparative studies in English as a Second Language (ESL) writing pedagogy have not been carried out, particularly in Malaysia since the introduction of the Standard-based English Language Curriculum (KSSR) as a draft in 2011 and its full implementation in 2017 (reviewed in 2018 as the CEFR-aligned KSSR). In theory, a multimodal theoretical framework allows a logical comparison between first language and ESL, which would provide useful insights to illuminate the writing process in Malaysia and England. The comparisons are not representative because of the different school systems in the two countries. So far, the literature informs us that the curriculum for language learning places heavy emphasis on children’s linguistic abilities, which include their proficiency and mastery of the language, its conventions, and technicalities. However, recent empirical findings suggest that literacy, in its concepts and character, needs to change. In view of this suggestion, the comparison will look at how the process of writing is implemented through the five modes of communication: linguistic, visual, aural, spatial, and gestural.
This project draws on data from Malaysia and England, involving 10 teachers, 26 classroom observations, 20 lesson plans, 20 interviews, and 20 brief conversations with teachers. The research focused upon 20 primary children of different genders aged 5-9, and in addition to primary data descriptions, 40 children’s works, 40 brief classroom conversations, 30 classroom photographs, and 30 school compound photographs were collected to investigate teachers’ and children’s use of modes and semiotic resources to design a text. The data were analysed by means of within-case analysis, cross-case analysis, and constant comparative analysis, with an initial stage of data categorisation, followed by general and specific coding, which clustered the data into thematic groups. The study highlights the importance of teachers’ and children’s engagement and interaction with various modes of communication, of adapting the English approaches to teaching writing within the KSSR framework, and of providing ‘voice’ to ESL writers, to ensure that both have access to the knowledge and skills required to make decisions in developing multimodal texts and artefacts.

Keywords: comparative education, early writers, KSSR, multimodal theoretical framework, writing development

Procedia PDF Downloads 38

24154 Clustered Regularly Interspaced Short Palindromic Repeats Interference (CRISPRi): An Approach to Inhibit Microbial Biofilm

Authors: Azna Zuberi

Abstract:

Biofilm is a sessile bacterial accretion in which bacteria adopt physiological and morphological behavior different from the planktonic form. It is the root cause of about 80% of microbial infections in humans. Among them, E. coli biofilms are most prevalent in medical device-associated nosocomial infections. The objective of this study was to inhibit biofilm formation by using CRISPRi to target the luxS gene, which is involved in quorum sensing. luxS encodes a synthase involved in the synthesis of Autoinducer-2 (AI-2), which in turn guides the initial stage of biofilm formation. To implement the CRISPRi system, we synthesized an sgRNA complementary to the target gene sequence and co-expressed it with dCas9. Suppression of luxS was confirmed through qRT-PCR. The effect of luxS suppression on biofilm formation was studied through crystal violet assay, XTT reduction assay, and scanning electron microscopy. We conclude that the CRISPRi system could be a potential strategy to inhibit bacterial biofilm through a mechanism-based approach.

Keywords: biofilm, CRISPRi, luxS, microbial

Procedia PDF Downloads 155