Search results for: panel data analysis
41872 A Study on Big Data Analytics, Applications, and Challenges
Authors: Chhavi Rana
Abstract:
The aim of the paper is to highlight existing developments in the field of big data analytics. Applications such as bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data that is hard to organise and analyse and that can be dealt with using the frameworks and models in this field of study. An organisation's decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets, which will consequently benefit society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.
Keywords: big data, big data analytics, machine learning, review
Procedia PDF Downloads 95
41871 Development of Generally Applicable Intravenous to Oral Antibiotic Switch Therapy Criteria
Authors: H. Akhloufi, M. Hulscher, J. M. Prins, I. H. Van Der Sijs, D. Melles, A. Verbon
Abstract:
Background: A timely switch from intravenous to oral antibiotic therapy has many advantages, such as reduced incidence of IV-line related infections, a decreased hospital length of stay and less workload for healthcare professionals with equivalent patient safety. Additionally, numerous studies have demonstrated significant decreases in costs of a timely intravenous to oral antibiotic therapy switch, while maintaining efficacy and safety. However, a considerable variation in IV to oral antibiotic switch therapy criteria has been described in the literature. Here, we report the development of a set of IV to oral switch criteria that are generally applicable in all hospitals. Material/methods: A RAND-modified Delphi procedure, which was composed of 3 rounds, was used. This Delphi procedure is a widely used structured process to develop consensus using multiple rounds of questionnaires within a qualified panel of selected experts. The international expert panel was multidisciplinary and composed of clinical microbiologists, infectious disease consultants and clinical pharmacists. This panel of 19 experts appraised 6 major intravenous to oral antibiotic switch therapy criteria and operationalized these criteria using 41 measurable conditions extracted from the literature. The procedure to select a concise set of IV to oral switch criteria included 2 questionnaire rounds and a face-to-face meeting. Results: The procedure resulted in the selection of 16 measurable conditions, which operationalize 6 major intravenous to oral antibiotic switch therapy criteria. The following 6 major switch therapy criteria were selected: (1) Vital signs should be good or improving when bad. (2) Signs and symptoms related to the infection have to be resolved or improved. (3) The gastrointestinal tract has to be intact and functioning. (4) The oral route should not be compromised. (5) Absence of contra-indicated infections. (6) An oral variant of the antibiotic with good bioavailability has to exist. Conclusions: This systematic stepwise method, which combined evidence and expert opinion, resulted in a feasible set of 6 major intravenous to oral antibiotic switch therapy criteria operationalized by 16 measurable conditions. This set of early antibiotic IV to oral switch criteria can be used in daily practice in all adult hospital patients. Future use in audits and as rules in computer-assisted decision support systems will lead to improvement of antimicrobial stewardship programs.
Keywords: antibiotic resistance, antibiotic stewardship, intravenous to oral, switch therapy
Procedia PDF Downloads 356
41870 Analysis and Prediction of Netflix Viewing History Using Netflixlatte as an Enriched Real Data Pool
Authors: Amir Mabhout, Toktam Ghafarian, Amirhossein Farzin, Zahra Makki, Sajjad Alizadeh, Amirhossein Ghavi
Abstract:
The high number of Netflix subscribers makes it attractive for data scientists to extract valuable knowledge from analyses of viewers' behaviour. This paper presents a set of statistical insights into viewers' viewing history. After that, a deep learning model is used to predict the future watching behaviour of the users based on previous watching history within the Netflixlatte data pool. Netflixlatte is an aggregated and anonymized data pool of 320 Netflix viewers with a length of 250,000 data points recorded between 2008 and 2022. We observe insightful correlations between the distribution of viewing time and the COVID-19 pandemic outbreak. The presented deep learning model predicts future movie and TV series viewing habits with an average loss of 0.175.
Keywords: data analysis, deep learning, LSTM neural network, netflix
Procedia PDF Downloads 251
41869 EnumTree: An Enumerative Biclustering Algorithm for DNA Microarray Data
Authors: Haifa Ben Saber, Mourad Elloumi
Abstract:
In a number of domains, such as DNA microarray data analysis, we need to cluster simultaneously rows (genes) and columns (conditions) of a data matrix to identify groups of constant rows with a group of columns. This kind of clustering is called biclustering. Biclustering algorithms are extensively used in DNA microarray data analysis. More effective biclustering algorithms are highly desirable and needed. We introduce a new algorithm called Enumerative Tree (EnumTree) for biclustering of binary microarray data. EnumTree is an algorithm adopting the approach of enumerating biclusters. This algorithm extracts all biclusters of consistently good quality. The main idea of EnumTree is the construction of a new tree structure to represent adequately different biclusters discovered during the process of enumeration. This algorithm adopts the strategy of all biclusters at a time. The performance of the proposed algorithm is assessed using both synthetic and real DNA microarray data; our algorithm outperforms other biclustering algorithms for binary microarray data. Biclusters with different numbers of rows. Moreover, we test the biological significance using a gene annotation web tool to show that our proposed method is able to produce biologically relevant biclusters.
Keywords: DNA microarray, biclustering, gene expression data, tree, datamining.
Procedia PDF Downloads 372
41868 Series Network-Structured Inverse Models of Data Envelopment Analysis: Pitfalls and Solutions
Authors: Zohreh Moghaddas, Morteza Yazdani, Farhad Hosseinzadeh
Abstract:
Nowadays, data envelopment analysis (DEA) models featuring network structures have gained widespread usage for evaluating the performance of production systems and activities (Decision-Making Units (DMUs)) across diverse fields. By examining the relationships between the internal stages of the network, these models offer valuable insights to managers and decision-makers regarding the performance of each stage and its impact on the overall network. To further empower system decision-makers, the inverse data envelopment analysis (IDEA) model has been introduced. This model allows the estimation of crucial information for estimating parameters while keeping the efficiency score unchanged or improved, enabling analysis of the sensitivity of system inputs or outputs according to managers' preferences. This empowers managers to apply their preferences and policies on resources, such as inputs and outputs, and analyze various aspects like production, resource allocation processes, and resource efficiency enhancement within the system. The results obtained can be instrumental in making informed decisions in the future. The main result of this study is an analysis of infeasibility and incorrect estimation that may arise in the theory and application of the inverse model of data envelopment analysis with network structures. To address these pitfalls, novel protocols are proposed that circumvent these shortcomings effectively. Subsequently, several theoretical and applied problems are examined and resolved through insightful case studies.
Keywords: inverse models of data envelopment analysis, series network, estimation of inputs and outputs, efficiency, resource allocation, sensitivity analysis, infeasibility
Procedia PDF Downloads 52
41867 Modelling the Education Supply Chain with Network Data Envelopment Analysis
Authors: Sourour Ramzi, Claudia Sarrico
Abstract:
Little has been done on network DEA in education, and nobody has attempted to model the whole education supply chain using network DEA. As such, the contribution of the present paper is to propose a model for measuring the efficiency of education supply chains using network DEA. First, we use a general survey of data envelopment analysis (DEA) to establish the emergent themes for research in DEA, and focus on the theme of Network DEA. Second, we use a survey on two-stage DEA models and Network DEA to describe the state of the art on Network DEA, particularly as applied to supply chain management. Third, we use a survey on DEA applications to establish the most influential papers on DEA education applications, in order to establish the state of the art on applications of DEA in education, in general, and applications of DEA to education using network DEA, in particular. Finally, we propose a model for measuring the performance of education supply chains of different education systems (countries or states within a country, for instance). We then use this model on some empirical data.
Keywords: supply chain, education, data envelopment analysis, network DEA
Procedia PDF Downloads 368
41866 Strategic Citizen Participation in Applied Planning Investigations: How Planners Use Etic and Emic Community Input Perspectives to Fill-in the Gaps in Their Analysis
Authors: John Gaber
Abstract:
Planners regularly use citizen input as empirical data to help them better understand community issues they know very little about. This type of community data is based on the lived experiences of local residents and is known as "emic" data. What is becoming more common practice for planners is their use of data from local experts and stakeholders (known as "etic" data or the outsider perspective) to help them fill in the gaps in their analysis of applied planning research projects. Utilizing international Health Impact Assessment (HIA) data, I look at who planners invite to their citizen input investigations. Research presented in this paper shows that planners access a wide range of emic and etic community perspectives in their search for the “community’s view.” The paper concludes with how planners can chart out a new empirical path in their execution of emic/etic citizen participation strategies in their applied planning research projects.
Keywords: citizen participation, emic data, etic data, Health Impact Assessment (HIA)
Procedia PDF Downloads 484
41865 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic
Authors: Fei Gao, Rodolfo C. Raga Jr.
Abstract:
This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. Using the Diabetes Health Indicators Dataset from Kaggle as the research data, the phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess their performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle
Procedia PDF Downloads 75
41864 Risk, Capital Buffers, and Bank Lending: The Adjustment of Euro Area Banks
Authors: Laurent Maurin, Mervi Toivanen
Abstract:
This paper estimates euro area banks’ internal target capital ratios and investigates whether banks’ adjustment to the targets has an impact on credit supply and holding of securities during the financial crisis in 2005-2011. Using data on listed banks and country-specific macro-variables, a partial adjustment model is estimated in a panel context. The results indicate, firstly, that an increase in the riskiness of banks’ balance sheets positively influences the target capital ratios. Secondly, the adjustment towards higher equilibrium capital ratios has a significant impact on banks’ assets. The impact is found to be more sizeable on security holdings than on loans, thereby suggesting a pecking order.
Keywords: Euro area, capital ratios, credit supply, partial adjustment model
Procedia PDF Downloads 448
41863 Dataset Quality Index: Development of Composite Indicator Based on Standard Data Quality Indicators
Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros
Abstract:
Nowadays, poor data quality is considered one of the major costs of a data project. A data project with data quality awareness spends much of its time on data quality processes, while a data project without data quality awareness suffers negative impacts on financial resources, efficiency, productivity, and credibility. One of the processes that takes a long time is defining the expectations and measurements of data quality, because the expectations differ according to the purpose of each data project. This is especially true of big data projects, which may involve many datasets and stakeholders and therefore take a long time to discuss and define quality expectations and measurements. Therefore, this study aimed at developing meaningful indicators to describe the overall data quality of each dataset for quick comparison and prioritization. The objectives of this study were to: (1) develop practical data quality indicators and measurements, (2) develop data quality dimensions based on statistical characteristics, and (3) develop a Composite Indicator that can describe the overall data quality of each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After the datasets were collected, five steps were followed to develop the Dataset Quality Index (SDQI). First, we define standard data quality expectations. Second, we find indicators that can be measured directly on the data within datasets. Third, the indicators are aggregated into dimensions using factor analysis. Next, the indicators and dimensions are weighted by the effort required for the data preparation process and by usability. Finally, the dimensions are aggregated into the Composite Indicator. The results of these analyses showed that: (1) the developed indicators and measurements comprised ten indicators; (2) for the data quality dimensions based on statistical characteristics, the ten indicators can be reduced to 4 dimensions; and (3) for the Composite Indicator, the SDQI can describe the overall quality of each dataset and can separate datasets into 3 levels: Good Quality, Acceptable Quality, and Poor Quality. In conclusion, the SDQI provides an overall description of data quality within datasets and a meaningful composition. We can use the SDQI to assess all data in a data project and to support effort estimation and prioritization. The SDQI also works well with the Agile method, where it can be used for assessment in the first sprint. After passing the initial evaluation, we can add more specific data quality indicators in the next sprint.
Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis
Procedia PDF Downloads 139
41862 Analyzing Global User Sentiments on Laptop Features: A Comparative Study of Preferences Across Economic Contexts
Authors: Mohammadreza Bakhtiari, Mehrdad Maghsoudi, Hamidreza Bakhtiari
Abstract:
The widespread adoption of laptops has become essential to modern lifestyles, supporting work, education, and entertainment. Social media platforms have emerged as key spaces where users share real-time feedback on laptop performance, providing a valuable source of data for understanding consumer preferences. This study leverages aspect-based sentiment analysis (ABSA) on 1.5 million tweets to examine how users from developed and developing countries perceive and prioritize 16 key laptop features. The analysis reveals that consumers in developing countries express higher satisfaction overall, emphasizing affordability, durability, and reliability. Conversely, users in developed countries demonstrate more critical attitudes, especially toward performance-related aspects such as cooling systems, battery life, and chargers. The study employs a mixed-methods approach, combining ABSA using the PyABSA framework with expert insights gathered through a Delphi panel of ten industry professionals. Data preprocessing included cleaning, filtering, and aspect extraction from tweets. Universal issues such as battery efficiency and fan performance were identified, reflecting shared challenges across markets. However, priorities diverge between regions, while users in developed countries demand high-performance models with advanced features, those in developing countries seek products that offer strong value for money and long-term durability. The findings suggest that laptop manufacturers should adopt a market-specific strategy by developing differentiated product lines. For developed markets, the focus should be on cutting-edge technologies, enhanced cooling solutions, and comprehensive warranty services. In developing markets, emphasis should be placed on affordability, versatile port options, and robust designs. Additionally, the study highlights the importance of universal charging solutions and continuous sentiment monitoring to adapt to evolving consumer needs. 
This research offers practical insights for manufacturers seeking to optimize product development and marketing strategies for global markets, ensuring enhanced user satisfaction and long-term competitiveness. Future studies could explore multi-source data integration and conduct longitudinal analyses to capture changing trends over time.
Keywords: consumer behavior, durability, laptop industry, sentiment analysis, social media analytics
Procedia PDF Downloads 15
41861 Study the Relationship amongst Digital Finance, Renewable Energy, and Economic Development of Least Developed Countries
Authors: Fatima Sohail, Faizan Iftikhar
Abstract:
This paper studies the relationship between digital finance, renewable energy, and the economic development of Pakistan and least developed countries from 2000 to 2022. The paper used panel analysis and the Arellano-Bond generalized method of moments approach. The findings show that under the growth model, renewable energy (RE) has a strong and favorable link with fixed broadband and mobile subscribers. However, FB and MD have a strong but negative association with the uptake of renewable energy (RE) in the average and simple model. This paper provides valuable insights for policymakers and investors in the digital economy.
Keywords: digital finance, renewable energy, economic development, mobile subscription, fixed broadband
Procedia PDF Downloads 40
41860 Development of a Multi-Locus DNA Metabarcoding Method for Endangered Animal Species Identification
Authors: Meimei Shi
Abstract:
Objectives: The identification of endangered species, especially simultaneous detection of multiple species in complex samples, plays a critical role in investigating alleged wildlife crime incidents and preventing illegal trade. This study aimed to develop a multi-locus DNA metabarcoding method for endangered animal species identification. Methods: Several pairs of universal primers were designed according to conserved mitochondrial gene regions. Experimental mixtures were artificially prepared by mixing well-defined species, including endangered species, e.g., forest musk deer, bear, tiger, pangolin, and sika deer. The artificial samples were prepared with 1-16 well-characterized species at 1% to 100% DNA concentrations. After multiplex-PCR amplification and parameter modification, the amplified products were analyzed by capillary electrophoresis and used for NGS library preparation. The DNA metabarcoding was carried out based on Illumina MiSeq amplicon sequencing. The data were processed with quality trimming, read filtering, and OTU clustering; representative sequences were searched using BLASTn. Results: According to the parameter modification and multiplex-PCR amplification results, five primer sets targeting COI, Cytb, 12S, and 16S were selected as the NGS library amplification primer panel. High-throughput sequencing data analysis showed that the established multi-locus DNA metabarcoding method was sensitive and could accurately identify all species in artificial mixtures, including the endangered animal species Moschus berezovskii, Ursus thibetanus, Panthera tigris, Manis pentadactyla, and Cervus nippon at 1% (DNA concentration). In conclusion, the established species identification method provides technical support for customs and forensic scientists to prevent the illegal trade of endangered animals and their products.
Keywords: DNA metabarcoding, endangered animal species, mitochondria nucleic acid, multi-locus
Procedia PDF Downloads 140
41859 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault
Authors: Lakshmi Prayaga, Chandra Prayaga, Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola
Abstract:
Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI have offered a platform that is user-friendly and accessible to non-technical professionals to generate synthetic data to augment existing data for further analysis. The SDV also provides for additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC and AUC curves for the data generated by adding the layer of Gaussian copula are much higher than the data generated by the CTGAN.
Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula
Procedia PDF Downloads 82
41858 Factors That Determine International Competitiveness of Agricultural Products in Latin America 1990-2020
Authors: Oluwasefunmi Eunice Irewole, Enrique Armas Arévalos
Abstract:
Agriculture has played a crucial role in the economy and the development of many countries. Moreover, the basic needs for human survival (food, shelter, and clothing) are linked to agricultural production. Most developed countries see that agriculture provides them with food and raw materials for different goods (such as shelter, medicine, fuel and clothing), which has led to an increase in incomes, livelihoods and standard of living. This study aimed at analysing the relationship between the international competitiveness of agricultural products and area, fertilizer, labour force, economic growth, foreign direct investment, exchange rate and inflation rate in Latin America during the period 1991 to 2019. In this study, panel data econometric methods were used, including tests for cross-section dependence (Pesaran test), unit roots (cross-sectional Augmented Dickey-Fuller and cross-sectional Im, Pesaran, and Shin tests), cointegration (Pedroni and Fisher-Johansen tests), and heterogeneous causality (Hurlin and Dumitrescu test). The results reveal that the model has cross-sectional dependency and that the variables are integrated of order one, I(1). The fully modified OLS and dynamic OLS estimators were used to examine the existence of a long-term relationship, and it was found that a long-term relationship existed between the selected variables. The study revealed a positive significant relationship between the international competitiveness of agricultural raw materials and area, fertilizer, labour force, economic growth, and foreign direct investment, while international competitiveness has a negative relationship with the exchange rate and inflation. The economic policy recommendation deduced from this investigation is that foreign direct investment and the labour force make a positive contribution to increasing the international competitiveness of agricultural products.
Keywords: revealed comparative advantage, agricultural products, area, fertilizer, economic growth, granger causality, panel unit root
Procedia PDF Downloads 100
41857 Real-Time Mine Safety System with the Internet of Things
Authors: Şakir Bingöl, Bayram İslamoğlu, Ebubekir Furkan Tepeli, Fatih Mehmet Karakule, Fatih Küçük, Merve Sena Arpacık, Mustafa Taha Kabar, Muhammet Metin Molak, Osman Emre Turan, Ömer Faruk Yesir, Sıla İnanır
Abstract:
This study introduces an IoT-based real-time safety system for mining, addressing global safety challenges. The wearable device, seamlessly integrated into miners' jackets, employs LoRa technology for communication and offers real-time monitoring of vital health and environmental data. Unique features include an LCD panel for immediate information display and sound-based location tracking for emergency response. The methodology involves sensor integration, data transmission, and ethical testing. Validation confirms the system's effectiveness in diverse mining scenarios. The study calls for ongoing research to adapt the system to different mining contexts, emphasizing its potential to significantly enhance safety standards in the industry.
Keywords: mining safety, internet of things, wearable technology, LoRa, RFID tracking, real-time safety system, safety alerts, safety measures
Procedia PDF Downloads 63
41856 Enabling Participation of Deaf People in the Co-Production of Services: An Example in Service Design, Commissioning and Delivery in a London Borough
Authors: Stephen Bahooshy
Abstract:
Co-producing services with the people that access them is considered best practice in the United Kingdom, with the Care Act 2014 arguing that people who access services and their carers should be involved in the design, commissioning and delivery of services. Co-production is a way of working with the community, breaking down barriers of access and providing meaningful opportunity for people to engage. Unfortunately, owing to a number of reported factors such as time constraints, practitioner experience and departmental budget restraints, this process is not always followed. In 2019, in a south London borough, d/Deaf people who access services were engaged in the design, commissioning and delivery of an information and advice service that would support their community to access local government services. To do this, sensory impairment social workers and commissioners collaborated to host a series of engagement events with the d/Deaf community. Interpreters were used to enable communication between the commissioners and d/Deaf participants. Initially, the community’s opinions, ideas and requirements were noted. This was then summarized and fed back to the community to ensure accuracy. Subsequently, a service specification was developed which included performance metrics, inclusive of qualitative and quantitative indicators, such as ‘I statements’, whereby participants respond on an adapted Likert scale how much they agree or disagree with a particular statement in relation to their experience of the service. The service specification was reviewed by a smaller group of d/Deaf residents and social workers, to ensure that it met the community’s requirements. The service was then tendered using the local authority’s e-tender process. Bids were evaluated and scored in two parts; part one was by commissioners and social workers and part two was a presentation by prospective providers to an evaluation panel formed of four d/Deaf residents. 
The internal evaluation panel formed 75% of the overall score, whilst the d/Deaf resident evaluation panel formed 25% of the overall tender score. Co-producing the evaluation panel with social workers and the d/Deaf community meant that commissioners were able to meet the requirements of this community by developing evaluation questions and tools that were easily understood and used by this community. For example, the wording of questions was reviewed, and the scoring mechanism consisted of three faces to reflect the d/Deaf residents’ scores instead of traditional numbering. These faces were a happy face, a neutral face and a sad face. By making simple changes to the commissioning and tender evaluation process, d/Deaf people were able to have meaningful involvement in the design and commissioning process for a service that would benefit their community. Co-produced performance metrics mean that it is incumbent on the successful provider to continue to engage with people accessing the service and ensure that the feedback is utilized. d/Deaf residents were grateful to have been involved in this process as this was not an opportunity that they had previously been afforded. In recognition of their time, each d/Deaf resident evaluator received a £40 gift voucher, bringing the total cost of this co-production to £160.
Keywords: co-production, community engagement, deaf and hearing impaired, service design
Procedia PDF Downloads 271
41855 An Automated Approach to Consolidate Galileo System Availability
Authors: Marie Bieber, Fabrice Cosson, Olivier Schmitt
Abstract:
Europe's Global Navigation Satellite System, Galileo, provides worldwide positioning and navigation services. The satellites in space are only one part of the Galileo system. An extensive ground infrastructure is essential to oversee the satellites and ensure accurate navigation signals. High reliability and availability of the entire Galileo system are crucial to continuously provide positioning information of high quality to users. Outages are tracked, and operational availability is regularly assessed. A highly flexible and adaptive tool has been developed to automate the Galileo system availability analysis. Not only does it enable a quick availability consolidation, but it also provides first steps towards improving the data quality of maintenance tickets used for the analysis. This includes data import and data preparation, with a focus on processing strings used for classification and identifying faulty data. Furthermore, the tool can handle small amounts of data, which is a major constraint when the aim is to provide accurate statistics.
Keywords: availability, data quality, system performance, Galileo, aerospace
Procedia PDF Downloads 167
41854 Impact of Ownership Structure on Financial Performance of Listed Industrial Goods Firms in Nigeria
Authors: Muhammad Shehu Garba
Abstract:
The study aims to investigate the effect of ownership structure on the financial performance of listed industrial goods companies in Nigeria. The financial statements of the firms for the period 2013 to 2022 were collected as secondary data, and 10 firms were used as the study's sample. The study used panel data. Ownership structure is measured with managerial ownership (MO), institutional ownership (IO) and foreign ownership (FO), while financial performance is measured with return on assets (ROA) and return on equity (ROE); leverage (LEV) and firm size (FS) are used as control variables. The result shows the multivariate relationship that exists between the variables of the study. ROA has a positive correlation with ROE (0.4053), MO (0.2001), and FS (0.3048). It has a negative correlation with FO (-0.1933), IO (-0.0919), and LEV (-0.3367). ROE has a positive correlation with ROA (0.4053), MO (0.2001), and FS (0.2640). It has a negative correlation with FO (-0.1864), IO (-0.1847), and LEV (-0.0319). It is recommended that firms should focus on increasing their ROA. Firms should also consider increasing their MO, as this can help to align the interests of managers and shareholders. Firms should also be aware of the potential impact of FO and IO on their ROA.
Keywords: firm size, ownership structure, financial performance, leverage
Procedia PDF Downloads 66
41853 Processing Big Data: An Approach Using Feature Selection
Authors: Nikat Parveen, M. Ananthi
Abstract:
Big data is one of the emerging technologies; it collects data from various sensors, and those data are used in many fields. Data retrieval is one of the major issues, since exactly the data required must be extracted. In this paper, a large data set is processed using feature selection. Feature selection helps to choose the data that are actually needed to process and execute the task. The key value is what helps to point out the exact data available in the storage space. Here, the available data are streamed, and R-Center is proposed to achieve this task.
Keywords: big data, key value, feature selection, retrieval, performance
Procedia PDF Downloads 341
41852 Improved K-Means Clustering Algorithm Using RHadoop with Combiner
Authors: Ji Eun Shin, Dong Hoon Lim
Abstract:
Data clustering is a common technique in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known algorithm that partitions a set of data points into a predefined number of clusters. In this paper, we implement the K-means algorithm on the MapReduce framework with RHadoop to make the clustering method applicable to large-scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to apply a combiner to the map output in order to decrease the amount of data that the reducers need to process. The experimental results demonstrated that the K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with a combiner was faster than the regular algorithm without a combiner as the size of the data set increases.
Keywords: big data, combiner, K-means clustering, RHadoop
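The combiner idea in the abstract above can be illustrated with a minimal sketch. It is written in plain Python rather than RHadoop, simulates the map, combine and reduce phases in a single process, and is not the authors' implementation; the function names (`map_phase`, `combine`, `reduce_phase`, `kmeans_step`) are hypothetical. The combiner sums points and counts per cluster on each input split, so every mapper forwards at most one record per cluster to the reducer.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_phase(split, centroids):
    """Emit (cluster_id, (point, 1)) for every point in one input split."""
    return [(nearest(p, centroids), (p, 1)) for p in split]


def combine(pairs):
    """Sum vectors and counts per cluster id (an associative partial sum)."""
    acc = {}
    for k, (vec, n) in pairs:
        if k in acc:
            s, m = acc[k]
            acc[k] = ([a + b for a, b in zip(s, vec)], m + n)
        else:
            acc[k] = (list(vec), n)
    return list(acc.items())


def reduce_phase(pairs, centroids):
    """Merge partial sums and return the updated centroids."""
    new_centroids = list(centroids)
    for k, (s, n) in combine(pairs):
        new_centroids[k] = tuple(x / n for x in s)
    return new_centroids


def kmeans_step(splits, centroids, use_combiner=True):
    """One K-means iteration; also returns how many records reach the reducer."""
    shuffled = []
    for split in splits:
        out = map_phase(split, centroids)
        if use_combiner:
            out = combine(out)  # local aggregation before the shuffle
        shuffled.extend(out)
    return reduce_phase(shuffled, centroids), len(shuffled)
```

Because the per-cluster sum and count are associative, the updated centroids are the same with and without the combiner; only the number of records sent to the reducer changes.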
Procedia PDF Downloads 438
41851 Design and Simulation of Low Cost Boost-Half-Bridge Microinverter with Grid Connection
Authors: P. Bhavya, P. R. Jayasree
Abstract:
This paper presents a low-cost, transformer-isolated boost half-bridge micro-inverter for a single-phase grid-connected PV system. Since the output voltage of a single PV panel is as low as 20 to 50 V, a high-voltage-gain inverter is required for the PV panel to connect to the single-phase grid. The micro-inverter has two stages: an isolated dc-dc converter stage and an inverter stage with a dc link. To achieve MPPT and to step up the PV voltage to the dc link voltage, a transformer-isolated boost half-bridge dc-dc converter is used. To deliver a synchronised sinusoidal current with unity power factor to the grid, a pulse-width-modulated full-bridge inverter with an LCL filter is used. A variable step size Maximum Power Point Tracking (MPPT) method is adopted so that both fast tracking and high MPPT efficiency are obtained. AC voltage as per the grid requirement is obtained at the output of the inverter. A high power factor (>0.99) is obtained at both heavy and light loads. This paper presents the results of a computer simulation of a grid-connected solar PV system using MATLAB/Simulink and the SIM Power System tool.
Keywords: boost-half-bridge, micro-inverter, maximum power point tracking, grid connection, MATLAB/Simulink
Procedia PDF Downloads 341
41850 Joint Probability Distribution of Extreme Water Level with Rainfall and Temperature: Trend Analysis of Potential Impacts of Climate Change
Authors: Ali Razmi, Saeed Golian
Abstract:
Climate change is known to have the potential to adversely affect hydrologic patterns for variables such as rainfall, maximum and minimum temperature and sea level rise. The long-term average of these climate variables could change over time due to climate change impacts. In this study, trend analysis was performed on rainfall, maximum and minimum temperature and water level data from the Central Park and Battery Park stations in a coastal area of Manhattan, New York City, to investigate whether there is a significant change in the data mean. The partial Mann-Kendall test was used for trend analysis. Frequency analysis was then performed on the data using common probability distribution functions such as Generalized Extreme Value (GEV), normal, log-normal and log-Pearson. Goodness-of-fit tests such as Kolmogorov-Smirnov were used to determine the most appropriate distributions. In flood frequency analysis, rainfall and water level data are often investigated separately. However, in determining flood zones, simultaneous consideration of rainfall and water level in frequency analysis could have a considerable effect on floodplain delineation (flood extent and depth). The present study aims to perform flood frequency analysis considering the joint probability distribution of rainfall and storm surge. First, the correlation between the considered variables was investigated. The joint probability distribution of extreme water level and temperature was also investigated to examine how global warming could affect sea level flooding impacts. Copula functions were fitted to the data, and the joint probability of water level with rainfall and temperature for recurrence intervals of 2, 5, 25, 50, 100, 200, 500, 600 and 1000 years was determined and compared with the severity of individual events. Results of the trend analysis showed an increase in the long-term average of the data that could be attributed to climate change impacts. 
The GEV distribution was found to be the most appropriate function to fit to the extreme climate variables. The results of the joint probability distribution analysis confirmed the necessity of incorporating both rainfall and water level data in flood frequency analysis.
Keywords: climate change, climate variables, copula, joint probability
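As an illustration of the trend-testing step mentioned in the abstract above, here is a minimal sketch of the standard (not the partial) Mann-Kendall test. It assumes a series without tied values and uses the normal approximation for the two-sided p-value; the function name `mann_kendall` is hypothetical, and nothing here reproduces the authors' code or data.

```python
import math


def mann_kendall(series):
    """Standard Mann-Kendall trend test (no tie correction).

    Returns (S, Z, p) where S is the sum of signs of all pairwise
    differences, Z the continuity-corrected normal score and p the
    two-sided p-value under the normal approximation.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the later-minus-earlier difference

    # Variance of S when there are no ties in the data.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0

    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p
```

A positive Z with a small p-value points to an increasing monotonic trend, a negative Z to a decreasing one. The partial variant named in the abstract additionally conditions on a covariate and is not covered by this sketch.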
Procedia PDF Downloads 360
41849 Model of Optimal Centroids Approach for Multivariate Data Classification
Authors: Pham Van Nha, Le Cam Binh
Abstract:
Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered a multidisciplinary optimization model that can be applied to various optimization problems. PSO’s ideas are simple and easy to understand, but PSO has only been applied to simple model problems. We think that, in order to expand the applicability of PSO to complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO as a mathematical model and apply it to multivariate data classification. First, PSO’s general mathematical model (MPSO) is analyzed as a universal optimization model. Then, the Model of Optimal Centroids (MOC) is proposed for multivariate data classification. Experiments were conducted on several benchmark data sets to demonstrate the effectiveness of MOC compared with several previously proposed schemes.
Keywords: analysis of optimization, artificial intelligence based optimization, optimization for learning and data analysis, global optimization
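For readers unfamiliar with PSO, the following is a minimal sketch of the textbook global-best variant, minimizing a simple test function. It is not the MPSO or MOC model described in the abstract; the parameter values (`w`, `c1`, `c2`), the swarm size and the function name `pso` are illustrative assumptions.

```python
import random


def pso(objective, dim, bounds, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the swarm-wide (global) best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

Calling `pso(sphere, dim=2, bounds=(-5.0, 5.0))` should drive the best value close to zero; a centroid-based classifier would replace `sphere` with an objective defined over candidate centroids.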
Procedia PDF Downloads 208
41848 Understanding Innovation by Analyzing the Pillars of the Global Competitiveness Index
Authors: Ujjwala Bhand, Mridula Goel
Abstract:
The Global Competitiveness Index (GCI), prepared by the World Economic Forum, has become a benchmark for studying the competitiveness of countries and for understanding the factors that enable competitiveness. Innovation is a key pillar of competitiveness and has the unique property of enabling exponential economic growth. This paper attempts to analyze how the pillars comprising the Global Competitiveness Index affect innovation and whether GDP growth can directly affect innovation outcomes for a country. The key objective of the study is to identify areas on which governments of developing countries can focus policies and programs to improve their country’s innovativeness. We have compiled a panel data set for the top innovating countries and the large emerging economies known as BRICS, covering 2007-08 to 2014-15, in order to find the significant factors that affect innovation. The results of the regression analysis suggest that governments should make policies to improve labor market efficiency, establish sophisticated business networks, provide basic health and primary education to their people and strengthen the quality of higher education and training services in the economy. The achievements of smaller economies in innovation suggest that concerted efforts by governments can counter any size-related disadvantage and can in fact provide greater flexibility and speed in encouraging innovation.
Keywords: innovation, global competitiveness index, BRICS, economic growth
Procedia PDF Downloads 268
41847 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model
Authors: Alam Ali, Ashok Kumar Pathak
Abstract:
Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the best fit to the data. Sometimes, exogenous variables do not show significant direct and indirect effects because the assumptions of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main aim of this article is to investigate the efficacy of the copula-based regression approach relative to the classical regression approach and to calculate the direct and indirect effects of variables when the data violate the OLS assumptions and the variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is presented to demonstrate the superiority of the copula approach.
Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique
Procedia PDF Downloads 72
41846 Sentiment Analysis: An Enhancement of Ontological-Based Features Extraction Techniques and Word Equations
Authors: Mohd Ridzwan Yaakub, Muhammad Iqbal Abu Latiffi
Abstract:
Online business has become popular recently due to the massive amount of information and media available on the Internet. This has resulted in a huge number of reviews in which consumers share their opinions, criticisms, and satisfaction regarding the products they have purchased on websites or on social media such as Facebook and Twitter. Analyzing customer behavior has therefore become very important for organizations seeking new market trends and insights. The reviews from websites and social media consist of structured and unstructured data that require a sentiment analysis approach in order to analyze customer reviews. In this article, the techniques used are defined. The ontology is defined, and its possible usage in sentiment analysis is described. This leads to empirical research related to mobile phones and to the ontology used in the experiment. The researchers also explore the role of data preprocessing and feature selection methodology. As a result, an ontology-based approach to sentiment analysis can help to achieve high accuracy in the classification task.
Keywords: feature selection, ontology, opinion, preprocessing data, sentiment analysis
Procedia PDF Downloads 200
41845 Social Network Analysis as a Research and Pedagogy Tool in Problem-Focused Undergraduate Social Innovation Courses
Authors: Sean McCarthy, Patrice M. Ludwig, Will Watson
Abstract:
This exploratory case study examines the deployment of Social Network Analysis (SNA) in mapping community assets in an interdisciplinary, undergraduate, team-taught course focused on income-insecure populations in a rural area in the US. Specifically, it analyzes how students were taught to collect data on community assets and to visualize the connections between those assets using Kumu, an SNA data visualization tool. Further, the case study shows how social network data were also collected about student teams via their written communications in Slack, an enterprise messaging tool, which enabled instructors to manage and guide student research activity throughout the semester. The discussion presents how SNA methods can simultaneously inform both community-based research and social innovation pedagogy through the use of data visualization and collaboration-focused communication technologies.
Keywords: social innovation, social network analysis, pedagogy, problem-based learning, data visualization, information communication technologies
Procedia PDF Downloads 147
41844 Characterization of Agroforestry Systems in Burkina Faso Using an Earth Observation Data Cube
Authors: Dan Kanmegne
Abstract:
Africa will become the most populated continent by the end of the century, with around 4 billion inhabitants. Food security and climate change will become continental issues, since agricultural practices depend on climate but also contribute to global emissions and land degradation. Agroforestry has been identified as a cost-efficient and reliable strategy to address these two issues. It is defined as the integrated management of trees and crops/animals in the same land unit. Agroforestry provides benefits in terms of goods (fruits, medicine, wood, etc.) and services (windbreaks, fertility, etc.), and is acknowledged to have great potential for carbon sequestration; it can therefore be integrated into mechanisms for reducing carbon emissions. Particularly in sub-Saharan Africa, the constraint lies in the lack of information about both the areas under agroforestry and the characterization (composition, structure, and management) of each agroforestry system at the country level. This study describes and quantifies “what is where?”, prior to the quantification of carbon stock in different systems. Remote sensing (RS) is the most efficient approach to map such a dynamic technology as agroforestry, since it gives relatively adequate and consistent information over a large area at nearly no cost. RS data fulfill the good practice guidelines of the Intergovernmental Panel on Climate Change (IPCC) for data to be used in carbon estimation. Satellite data are becoming more and more accessible, and the archives are growing exponentially. To retrieve useful information to support decision-making from this large amount of data, satellite data need to be organized so as to ensure fast processing, quick accessibility, and ease of use. A new solution is a data cube, which can be understood as a multi-dimensional stack (space, time, data type) of spatially aligned pixels used for efficient access and analysis. 
A data cube for Burkina Faso has been set up through the cooperation project between the international service provider WASCAL and Germany, which provides an accessible exploitation architecture for multi-temporal satellite data. The aim of this study is to map and characterize agroforestry systems using the Burkina Faso earth observation data cube. The approach in its initial stage is based on an unsupervised image classification of a normalized difference vegetation index (NDVI) time series from 2010 to 2018, to stratify the country based on the vegetation. Fifteen strata were identified, and four samples per location were randomly assigned to define the sampling units. For safety reasons, the northern part will not be part of the fieldwork. A total of 52 locations will be visited by the end of the dry season in February-March 2020. The field campaigns will consist of identifying and describing different agroforestry systems and conducting qualitative interviews. A multi-temporal supervised image classification will be done with a random forest algorithm, and the field data will be used both for training the algorithm and for accuracy assessment. The expected outputs are (i) map(s) of agroforestry dynamics; (ii) characteristics of different systems (main species, management, area, etc.); (iii) an assessment report of the Burkina Faso data cube.
Keywords: agroforestry systems, Burkina Faso, earth observation data cube, multi-temporal image classification
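The NDVI time series mentioned in the abstract above rests on a simple per-pixel formula, NDVI = (NIR - Red) / (NIR + Red). The sketch below computes it for one pixel over several dates. The reflectance values used with it are hypothetical, the helper names are illustrative, and returning 0.0 when both bands are zero is a convention assumed here, not something stated in the abstract.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one observation.

    nir and red are surface reflectances of the near-infrared and red
    bands. Returns 0.0 when both are zero (assumed convention).
    """
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def ndvi_series(observations):
    """NDVI for one pixel across a list of (nir, red) pairs ordered in time."""
    return [ndvi(nir, red) for nir, red in observations]


def seasonal_amplitude(series):
    """Range of NDVI over the series, a simple feature for stratification."""
    return max(series) - min(series)
```

Features such as the seasonal amplitude, computed per pixel over the whole time axis of a data cube, are the kind of input an unsupervised classifier could use to stratify vegetation.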
Procedia PDF Downloads 145
41843 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering
Authors: K. Umbleja, M. Ichino
Abstract:
Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases where entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets that need analyzing become ever larger. One type of symbolic data is the histogram, which makes it possible to store huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, analyzing histograms is complicated. This paper proposes a method that makes it possible to compare two histogram-valued variables and thereby find the dissimilarity between two histograms. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which makes it possible to compare histograms with different numbers of bins and different bin widths (so-called general histograms). The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, together with the concept of cluster compactness to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.
Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis
Procedia PDF Downloads 162