Search results for: data sets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25035

24435 Investigating the Effectiveness of Multilingual NLP Models for Sentiment Analysis

Authors: Othmane Touri, Sanaa El Filali, El Habib Benlahmar

Abstract:

Natural Language Processing (NLP) has gained significant attention lately and has proved able to analyze and extract insights from unstructured text in many languages. One of the most popular NLP applications is sentiment analysis, which aims to identify the sentiment expressed in a piece of text (positive, negative, or neutral) in multiple languages. Although several multilingual NLP models are available for sentiment analysis, their effectiveness in different contexts and applications still needs to be investigated. In this study, we investigate the effectiveness of different multilingual NLP models for sentiment analysis on a dataset of online product reviews in multiple languages. We compare the performance of several NLP models, including Google Cloud Natural Language API, Microsoft Azure Cognitive Services, Amazon Comprehend, Stanford CoreNLP, spaCy, and Hugging Face Transformers. The models are evaluated on several metrics, including accuracy, precision, recall, and F1 score, and their performance is compared across different categories of product reviews. The dataset was preprocessed by cleaning and tokenizing the text in each language. Each model was then trained and tested using a cross-validation approach in which the dataset is randomly divided into training and testing sets and the process is repeated multiple times. A grid search was applied to optimize the hyperparameters of each model and to select the best-performing model for each category of product reviews and each language. The findings provide insights into the effectiveness of different multilingual NLP models for multilingual sentiment analysis and their suitability for different languages and applications. The strengths and limitations of each model were identified, and recommendations are provided for selecting the best-performing model based on the specific requirements of a project. This study contributes to the advancement of research methods in multilingual NLP and provides a practical guide for researchers and practitioners in the field.
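
The evaluation metrics named in this abstract can be illustrated with a minimal sketch. The labels and predictions below are made-up toy values, not data from the study, and the function names are hypothetical; the study's cross-validation and grid search are not reproduced here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, label):
    """One-vs-rest precision, recall and F1 for a single sentiment label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example (invented labels, for illustration only).
y_true = ["pos", "neg", "neu", "pos", "neg", "pos"]
y_pred = ["pos", "neg", "pos", "pos", "neu", "neg"]

print(accuracy(y_true, y_pred))                    # 0.5
print(precision_recall_f1(y_true, y_pred, "pos"))  # each value is 2/3
```

In a multi-class setting such as positive/negative/neutral, these per-label scores are commonly averaged across labels; which averaging the authors used is not stated in the abstract.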

Keywords: NLP, multilingual, sentiment analysis, texts

Procedia PDF Downloads 77
24434 Survey of Selected Pathogenic Bacteria in Chickens from Rural Households in Limpopo Province

Authors: M. Lizzy Madiwani, Ignatious Ncube, Evelyn Madoroba

Abstract:

This study was designed to determine the distribution of pathogenic bacteria in household-raised chickens and to study their virulence and antibiotic profiles. For this purpose, 40 chickens were purchased from families in the Capricorn district and sacrificed for sampling. Tissues were cultured on different bacteriological media, followed by biotyping using matrix-assisted laser desorption ionization-time of flight (MALDI-TOF). A disk diffusion test was performed to determine the antibiotic susceptibility profiles of these bacteria. E. coli and Salmonella were detected among the 160 tissue samples evaluated. Furthermore, the pathogenic E. coli and Salmonella strains were determined at the species level by a polymerase chain reaction (PCR) assay using primer sets that target selected genes of interest. The invA gene, a confirmatory gene of Salmonella, was detected in all the Salmonella isolates. The study revealed a high distribution of Salmonella and pathogenic E. coli in these chickens. Therefore, further studies on identification at the species level are highly recommended to inform management and sanitation practices that lower this prevalence. The antimicrobial susceptibility data generated from this study can be a valuable reference for veterinarians treating bacterial diseases in poultry.

Keywords: antimicrobial, Escherichia coli, pathogens, Salmonella

Procedia PDF Downloads 104
24433 The Role of Interpersonal and Institutional Trusts for the Public Support of Welfare State

Authors: Nazim Habibov, Alena Auchynnikava, Lida Fan

Abstract:

The relationship between social trust and support for the welfare system in transitional countries has attracted growing interest in recent decades. This study estimates the effects of interpersonal and institutional trust on support for the welfare system in 27 countries in Eastern Europe and the former Soviet Union. We analyze the data sets from the Life-in-Transition Survey 2010 and 2016 with binomial regression models. The results indicate that both interpersonal and institutional trust have positive effects on support for the welfare system in all three areas under investigation (helping the needy, public healthcare and public education), both in the less developed countries of the former Soviet Union and in the more developed Eastern European countries. Furthermore, the positive effects of interpersonal and institutional trust on support for helping the needy, public healthcare and public education were found to grow over time. In conclusion, this study confirms that interpersonal and institutional trust have positive effects on public support for the welfare system in the transitional countries under investigation, regardless of their level of development.

Keywords: central and eastern Europe, former Soviet union, international social welfare policy, comparative social welfare policy

Procedia PDF Downloads 116
24432 The Influence of Ecologically-Valid High- and Low-Volume Resistance Training on Muscle Strength and Size in Trained Men

Authors: Jason Dellatolla, Scott Thomas

Abstract:

Much of the current literature pertaining to resistance training (RT) volume prescription lacks ecological validity, and very few studies investigate true high-volume ranges. Purpose: The present study sought to investigate the effects of ecologically-valid high- vs low-volume RT on muscular size and strength in trained men. Methods: This study systematically randomized trained, college-aged men into two groups: low-volume (LV; n = 4) and high-volume (HV; n = 5). The sample size was affected by COVID-19 limitations. Subjects followed an ecologically-valid 6-week RT program targeting both muscle size and strength. RT occurred 3x/week on non-consecutive days. Over the course of six weeks, LVR and HVR gradually progressed from 15 to 23 sets/week and 30 to 46 sets/week of lower-body RT, respectively. Muscle strength was assessed via 3RM tests in the squat, stiff-leg deadlift (SL DL), and leg press. Muscle hypertrophy was evaluated through a combination of DXA, BodPod, and ultrasound (US) measurements. Results: Two-way repeated-measures ANOVAs indicated that strength in all 3 compound lifts increased significantly among both groups (p < 0.01); between-group differences only occurred in the squat (p = 0.02) and SL DL (p = 0.03), both of which favored HVR. Significant pre-to-post-study increases in indicators of hypertrophy were discovered for lean body mass in the legs via DXA, overall fat-free mass via BodPod, and US measures of muscle thickness (MT) for the rectus femoris, vastus intermedius, vastus medialis, vastus lateralis, long-head of the biceps femoris, and total MT. Between-group differences were only found for MT of the vastus medialis – favoring HVR. Moreover, each additional weekly set of lower-body RT was associated with an average increase in MT of 0.39% in the thigh muscles. Conclusion: We conclude that ecologically-valid RT regimens significantly improve muscular strength and indicators of hypertrophy. 
When HVR is compared to LVR, HVR provides significantly greater gains in muscular strength but has no greater effect on hypertrophy over the course of 6 weeks in trained, college-aged men.

Keywords: ecological validity, hypertrophy, resistance training, strength

Procedia PDF Downloads 97
24431 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations base their operations and decision making on the data at their disposal, so the quality of these data is critically important. Data Quality (DQ) is currently a relevant issue, and the literature is unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM). These were presented to a panel of experts, who ordered them by degree of importance using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, an organizational culture focused on data quality, and top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 202
24430 Investigating the Effects of Psychological and Socio-Cultural Factors on the Tendency of Villagers to Use E-Banking Services: Case Study of Agricultural Bank Branches in Ilam

Authors: Nahid Ehsani, Amir Hossein Rezvanfar

Abstract:

The main objective of this study is to investigate the psychological and socio-cultural factors affecting the tendency of villagers to use e-banking services. In terms of its objectives, this is an applied study. The main data gathering tool is a researcher-made questionnaire, designed and administered on the basis of the conceptual background of the subject matter and the objectives and hypotheses of the study. The statistical population includes all customers of rural branches of the Agricultural Bank in Ilam Province (N = 82,885). Of these, 120 participants were chosen using a sample size determination formula and selected by stratified random sampling. At the level of analytical statistics, Spearman's correlation coefficient showed that socio-cultural and psychological factors were significantly correlated with the villagers' tendency to use the e-banking services of the Agricultural Bank at the 99% confidence level. Furthermore, stepwise multiple regression analysis showed that the psychological factors and the socio-economic factors together explained 50 percent of the variance of the dependent variable, namely the tendency of villagers to use e-banking services.
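
Spearman's coefficient, as used in this abstract, is the Pearson correlation computed on ranks, with tied values sharing an average rank. A minimal sketch follows; the function names are hypothetical and the sample values are invented for illustration, not taken from the questionnaire data.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


print(average_ranks([10, 20, 20, 30]))         # [1.0, 2.5, 2.5, 4.0]
print(spearman([1, 2, 3, 4], [10, 20, 30, 40]))  # close to 1.0 (monotone increasing)
```

Because it works on ranks, the coefficient suits ordinal questionnaire responses, which is presumably why it was chosen here; the abstract does not say so explicitly.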

Keywords: e-banking, agricultural bank, tendency, socio-economic factors, psychological factors

Procedia PDF Downloads 518
24429 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we use data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated with statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. We therefore use two learning algorithms: decision trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms, decision tree, knowledge extraction

Procedia PDF Downloads 536
24428 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyzing data to predict useful information. It is a field of research that solves various types of problems. In data mining, classification is an important technique for categorizing different kinds of data. Diabetes is one of the most common diseases. This paper applies different classification techniques to a diabetes dataset using the Waikato Environment for Knowledge Analysis (WEKA) and determines which algorithm is most suitable. The best classification algorithm on the diabetes data is Naïve Bayes, which achieves an accuracy of 76.31% and takes 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 132
24427 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is heavily dependent on technology that uses data as its fuel. The present study covers innovations and developments in data science and gives an idea of how to use the available data efficiently. It will help readers understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing; its main principle was to create an artificial system that can run independently of human-given programs and can function by analyzing data to understand the requirements of its users. Data science comprises business understanding, data analysis, ethical concerns, programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we cover one part of data science, namely machine learning. Machine learning uses data science for its work. Machines learn through experience, which helps them do their work more efficiently. This article includes a comparative illustration of human understanding and machine understanding, along with advantages, applications, and real-time examples of machine learning. Data science is an important game changer in human life. Since its advent, we have seen its benefits: how it leads to a better understanding of people and how it caters to individual needs. It has improved business strategies, the services businesses provide, forecasting, the ability to attain sustainable development, etc. This study also focuses on a better understanding of data science, which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 63
24426 Regional Dynamics of Innovation and Entrepreneurship in the Optics and Photonics Industry

Authors: Mustafa İlhan Akbaş, Özlem Garibay, Ivan Garibay

Abstract:

The economic entities in innovation ecosystems form various industry clusters, in which they compete and cooperate to survive and grow. Within a successful and stable industry cluster, the entities acquire different roles that complement each other in the system. Universities and research centers are widely accepted to play a critical role in these systems in the creation and development of innovations. However, the real effect of research institutions on regional economic growth is difficult to assess. In this paper, we present our approach for identifying the impact of research activities on regional entrepreneurship in a specific high-tech industry: optics and photonics. Optics and photonics has been defined as an enabling industry, which combines high-tech photonics technology with the developing optics industry. The recent literature suggests that the growth of optics and photonics firms depends on three important factors: the embedded regional specializations in the labor market, the research and development infrastructure, and a dynamic small-firm network capable of absorbing new technologies, products and processes. Therefore, the role of each factor and the dynamics among them must be understood to identify the requirements of entrepreneurship activities in the optics and photonics industry. There are three main contributions of our approach. Recent studies show that innovation in the optics and photonics industry is mostly located around metropolitan areas. There are also studies mentioning the importance of research center locations and universities in the regional development of the optics and photonics industry. These studies are mostly limited to the number of patents received within a short period of time or to limited survey results. Therefore, the first contribution of our approach is a comprehensive analysis of the state and recent history of photonics and optics research in the US. 
For this purpose, both the research centers specialized in optics and photonics and the related research groups in various departments of institutions (e.g. Electrical Engineering, Materials Science) are identified and a geographical study of their locations is presented. The second contribution of the paper is the analysis of regional entrepreneurship activities in optics and photonics in recent years. We use the membership data of the International Society for Optics and Photonics (SPIE) and the regional photonics clusters to identify the optics and photonics companies in the US. Then the profiles and activities of these companies are gathered by extracting and integrating the related data from the National Establishment Time Series (NETS) database, ES-202 database and the data sets from the regional photonics clusters. The number of start-ups, their employee numbers and sales are some examples of the extracted data for the industry. Our third contribution is the utilization of collected data to investigate the impact of research institutions on the regional optics and photonics industry growth and entrepreneurship. In this analysis, the regional and periodical conditions of the overall market are taken into consideration while discovering and quantifying the statistical correlations.

Keywords: entrepreneurship, industrial clusters, optics, photonics, emerging industries, research centers

Procedia PDF Downloads 394
24425 Structural Design Optimization of Reinforced Thin-Walled Vessels under External Pressure Using Simulation and Machine Learning Classification Algorithm

Authors: Lydia Novozhilova, Vladimir Urazhdin

Abstract:

An optimization problem for reinforced thin-walled vessels under uniform external pressure is considered. Conventional approaches to optimization generally start with pre-defined geometric parameters of the vessels and then employ analytic or numeric calculations and/or experimental testing to verify functionality, such as stability under the projected conditions. The proposed approach consists of two steps. First, the feasibility domain is identified in the multidimensional parameter space. Every point in the feasibility domain defines a design satisfying both geometric and functional constraints. Second, an objective function defined in this domain is formulated and optimized. The broader applicability of the suggested methodology is maximized by using the Support Vector Machine (SVM) classification algorithm from machine learning to identify the feasible design region. Training data for the SVM classifier are obtained using the Simulation package of SOLIDWORKS®. Based on these data, the SVM algorithm produces a curvilinear boundary separating admissible and inadmissible sets of design parameters with maximal margins. The vessel parameters are then optimized in the feasibility domain using standard constrained-optimization algorithms. As an example, optimization of a ring-stiffened closed cylindrical thin-walled vessel with semi-spherical caps under high external pressure is implemented. The von Mises stress criterion is used as a functional constraint, but any other stability constraint admitting a mathematical formulation can be incorporated into the proposed approach. The suggested methodology has good potential for reducing the design time needed to find optimal parameters of thin-walled vessels under uniform external pressure.

Keywords: design parameters, feasibility domain, von Mises stress criterion, Support Vector Machine (SVM) classifier

Procedia PDF Downloads 312
24424 Estimating Knowledge Flow Patterns of Business Method Patents with a Hidden Markov Model

Authors: Yoonjung An, Yongtae Park

Abstract:

Knowledge flows are a critical source of faster technological progress and stronger economic growth. Knowledge flows have accelerated dramatically with the establishment of a patent system in which each patent is required by law to disclose sufficient technical information for the invention to be recreated. Patent analysis has thus been widely used to investigate technological knowledge flows. However, the existing research is limited in terms of both subject and approach. In particular, most previous studies did not cover business method (BM) patents, although they are important drivers of knowledge flows, like other patents. In addition, these studies usually focus on the static analysis of knowledge flows. Some use approaches that incorporate the time dimension, yet they still fail to trace a truly dynamic process of knowledge flows. Therefore, we investigate dynamic patterns of knowledge flows driven by BM patents using a Hidden Markov Model (HMM). An HMM is a popular statistical tool for modeling a wide range of time series data, with no general theoretical limit in regard to statistical pattern classification. Accordingly, it enables the characterization of knowledge patterns that may differ by patent, sector, country and so on. We run the model on sets of backward citations and forward citations to compare the patterns of knowledge utilization and knowledge dissemination.
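
As a rough illustration of how an HMM scores a time series, the sketch below implements the standard forward algorithm, which computes the likelihood of an observation sequence under given model parameters. It is a building block only and not the authors' estimation procedure; the state names, the "low"/"high" citation-activity symbols, and all probabilities are invented assumptions.

```python
def forward_likelihood(observations, states, start_p, trans_p, emit_p):
    """Likelihood of an observation sequence under an HMM (forward algorithm)."""
    # alpha[s]: probability of the observations so far, ending in state s.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state model: hidden citation "regimes" emitting a
# discretized yearly citation level. All numbers are illustrative.
states = ["dormant", "active"]
start_p = {"dormant": 0.6, "active": 0.4}
trans_p = {
    "dormant": {"dormant": 0.7, "active": 0.3},
    "active": {"dormant": 0.4, "active": 0.6},
}
emit_p = {
    "dormant": {"low": 0.9, "high": 0.1},
    "active": {"low": 0.2, "high": 0.8},
}

print(forward_likelihood(["low", "high"], states, start_p, trans_p, emit_p))
# approximately 0.209
```

Fitting the parameters to citation data would require a training step such as Baum-Welch, which is omitted here.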

Keywords: business method patents, dynamic pattern, Hidden-Markov Model, knowledge flow

Procedia PDF Downloads 317
24423 Application of Artificial Neural Network Technique for Diagnosing Asthma

Authors: Azadeh Bashiri

Abstract:

Introduction: Lack of proper diagnosis and inadequate treatment of asthma lead to physical and financial complications. This study aimed to use data mining techniques and to create an intelligent neural network system for the diagnosis of asthma. Methods: The study population comprised patients who had visited one of the lung clinics in Tehran. Data were analyzed using the SPSS statistical tool, and Pearson's chi-square coefficient was the basis of decision making for data ranking. The neural network was trained using the backpropagation learning technique. Results: Based on the SPSS analysis, 13 effective factors were selected as the top factors. The data were then combined in various ways to build different models for training and testing the networks, and in all of these configurations the network correctly predicted 100% of cases. Conclusion: Applying data mining methods before designing the structure of the system, in order to reduce the data dimension and select the optimal data, leads to a more accurate system. Therefore, given the nature of medical data, data mining approaches are necessary.

Keywords: asthma, data mining, Artificial Neural Network, intelligent system

Procedia PDF Downloads 258
24422 Combined Effect of Gender Differences and Fatiguing Task on Unipedal Postural Balance and Functional Mobility in Adults with Multiple Sclerosis

Authors: Sonda Jallouli, Omar Hammouda, Imen Ben Dhia, Salma Sakka, Chokri Mhiri, Mohamed Habib Elleuch, Abedlmoneem Yahia, Sameh Ghroubi

Abstract:

Multiple sclerosis (MS) is characterized by gender differences: it affects women two to four times more often than men, but disease progression is faster and more severe in men. Fatigue is one of the most frequent and disabling symptoms of MS. Results of previous studies regarding gender differences in fatigue perception in persons with MS are contradictory. In addition, fatigue has been shown to negatively affect postural balance and functional mobility in persons with MS. However, no study has taken into account gender differences in the response of these physical parameters to a fatiguing protocol in persons with MS. Given the reduction of autonomy caused by the fatigue-induced alteration of these parameters and the importance of gender differences for postural balance training programs in fatigued men and women with MS, the aim of this study was to investigate the effect of gender on unipedal postural balance and functional mobility after a fatiguing task in adults with MS. Methods: Eleven women (30.29 ± 7.99 years) and seven men (30.91 ± 8.19 years) with relapsing-remitting MS performed a fatiguing protocol: three sets of the 5×sit-to-stand test (5-STST), a six-minute walk test (6MWT), and then three more sets of the 5-STST. Unipedal balance, functional mobility, and fatigue perception were measured pre-fatigue (T0) and post-fatigue (T3) using a clinical unipedal balance test, the timed up and go test (TUGT), and a visual analogue scale of fatigue (VASF), respectively. Heart rate (HR) and rate of perceived exertion (RPE) were recorded before, during and after the fatiguing task. Results: Compared to women, men showed an impairment of unipedal balance on the dominant leg (p<0.001, d=0.52) and of mobility (p<0.001, d=3), reflected in reduced unipedal stance time and increased TUGT duration, respectively. No gender differences were observed in 6MWT, 5-STST, HR, RPE and VASF scores. 
Conclusion: Fatiguing protocol negatively affected unipedal postural balance and mobility only in men. These gender differences were inconclusive but can be taken into account in postural balance rehabilitation programs for persons with MS.

Keywords: functional mobility, fatiguing exercises, multiple sclerosis, sex differences, unipedal balance

Procedia PDF Downloads 119
24421 Bringing Together Student Collaboration and Research Opportunities to Promote Scientific Understanding and Outreach Through a Seismological Community

Authors: Michael Ray Brunt

Abstract:

China has been the site of some of the most significant earthquakes in history; however, earthquake monitoring has long been the province of universities and research institutions. The China Digital Seismographic Network (CDSN) was initiated in 1983 and improved significantly during 1992-1993. Data from the CDSN are widely used by government and research institutions but are generally not readily accessible to middle and high school students. An educational seismic network in China is needed to provide collaboration and research opportunities for students and to engage students around the country in the scientific understanding of earthquake hazards and risks while promoting community awareness. In 2022, the Tsinghua International School (THIS) Seismology Team, made up of enthusiastic students and facilitated by two experienced teachers, was established. The team's objective is to install seismographs in schools throughout China, thus creating an educational seismic network, the THIS Educational Seismic Network (THIS-ESN), that shares data and facilitates collaboration. The THIS-ESN initiative will enhance education and outreach in China about earthquake risks and hazards, introduce seismology to a wider audience, stimulate interest in research among students, and develop students' programming, data collection and analysis skills. It will also encourage and inspire young minds to pursue science, technology, engineering, the arts, and math (STEAM) career fields. The THIS-ESN utilizes small, low-cost RaspberryShake seismographs as a powerful tool linked into a global network, giving schools and the public access to real-time seismic data from across China, increasing earthquake monitoring capabilities in the respective areas, and adding to the available data sets regionally and worldwide, which helps create a denser seismic network. 
The RaspberryShake seismograph is compatible with free seismic data viewing platforms such as SWARM, and the RaspberryShake web programs and mobile apps are designed specifically for teaching seismology and seismic data interpretation, providing opportunities to enhance understanding. The RaspberryShake is powered by an operating system embedded in the Raspberry Pi, which makes it an easy platform for teaching students basic computer communication concepts by using processing tools to investigate, plot, and manipulate data. The THIS Seismology Team believes strongly in creating opportunities for committed students to become part of the seismological community by engaging in analysis of real-time scientific data with tangible outcomes. Students will feel proud of the important work they are doing to understand the world around them and become advocates spreading their knowledge back into their homes and communities, helping to improve overall community resilience. We trust that, in studying the results that seismograph stations yield, students will grasp how subjects like physics and computer science apply in real life. By spreading information, we hope that students across the country can appreciate how and why earthquakes bear on their lives, develop practical skills in STEAM, and engage in the global seismic monitoring effort. By providing such an opportunity to schools across the country, we are confident that we will be an agent of change for society.

Keywords: collaboration, outreach, education, seismology, earthquakes, public awareness, research opportunities

Procedia PDF Downloads 52
24420 Utilization of Informatics to Transform Clinical Data into a Simplified Reporting System to Examine the Analgesic Prescribing Practices of a Single Urban Hospital’s Emergency Department

Authors: Rubaiat S. Ahmed, Jemer Garrido, Sergey M. Motov

Abstract:

Clinical informatics (CI) enables the transformation of data into a systematic organization that improves the quality of care and the generation of positive health outcomes. Innovative technology through informatics that compiles accurate data on analgesic utilization in the emergency department (ED) can enhance pain management in this important clinical setting. We aim to establish a simplified reporting system through CI to examine and assess the analgesic prescribing practices in the ED by executing a U.S. federal grant project on opioid reduction initiatives. Queried data points of interest from a level-one trauma ED's electronic medical records were used to create data sets and develop informational/visual reporting dashboards (in Microsoft Excel and Google Sheets) concerning analgesic usage across several pre-defined parameters and performance metrics using CI. The data were then qualitatively analyzed by departmental clinicians and leadership to evaluate ED analgesic prescribing trends. During a 12-month reporting period (Dec. 1, 2020 – Nov. 30, 2021) for the ongoing project, about 41% of all ED patient visits (N = 91,747) were for pain conditions, of which 81.6% received analgesics in the ED and at discharge (D/C). Of those treated with analgesics, 24.3% received opioids compared to 75.7% receiving opioid alternatives in the ED and at D/C, including non-pharmacological modalities. Demographics showed that among patients receiving analgesics, 56.7% were aged 18-64, 51.8% were male, 51.7% were white, and 66.2% had government-funded health insurance. Ninety-one percent of all opioids prescribed were in the ED, with intravenous (IV) morphine, IV fentanyl, and morphine sulfate immediate release (MSIR) tablets accounting for 88.0% of ED-dispensed opioids. Of the 9.3% of all opioids prescribed at D/C, MSIR was dispensed 72.1% of the time. Usage of hydrocodone, oxycodone, and tramadol was limited to only 10-15% of the time, and hydromorphone to 0%. 
Among opioid alternatives, the primary non-opioid drug categories prescribed by ED providers were non-steroidal anti-inflammatory drugs (60.3% of the time), local anesthetics and ultrasound-guided nerve blocks (23.5%), and acetaminophen (7.9%). Non-pharmacological analgesia included virtual reality and other modalities. An average of 18.5 ED opioid orders and 1.9 opioid D/C prescriptions per 102.4 daily ED patient visits was observed for the period. Within our institution, ED providers give 2.0% of the opioid D/C prescriptions issued across specialties, compared to the national average of 4.8%. Opioid alternatives accounted for 69.7% and 30.3% usage, versus 90.7% and 9.3% for opioids, in the ED and at D/C, respectively. There is a pressing need for concise, relevant, and reliable clinical data on analgesic utilization for ED providers and leadership to evaluate prescribing practices and make data-driven decisions. Basic computer software can be used to create effective visual reporting dashboards with indicators that convey relevant and timely information in an easy-to-digest manner. We accurately examined our ED's analgesic prescribing practices using CI through dashboard reporting. Such reporting tools can quickly identify key performance indicators and prioritize data to enhance pain management and promote safe prescribing practices in the emergency setting.
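
The reported ED versus D/C opioid split can be cross-checked against the reported daily averages with simple arithmetic. The sketch below uses only figures stated in the abstract (18.5 ED orders and 1.9 D/C prescriptions per day); the function name is hypothetical, and treating the two daily averages as the full set of opioid orders is an assumption.

```python
def opioid_split(ed_orders_per_day, dc_prescriptions_per_day):
    """Percentage of opioid orders placed in the ED versus at discharge."""
    total = ed_orders_per_day + dc_prescriptions_per_day
    ed_share = round(100 * ed_orders_per_day / total, 1)
    dc_share = round(100 * dc_prescriptions_per_day / total, 1)
    return ed_share, dc_share


# Daily averages as reported in the abstract.
print(opioid_split(18.5, 1.9))  # (90.7, 9.3)
```

Under that assumption, the daily averages reproduce the 90.7% and 9.3% shares quoted for opioids in the ED and at D/C.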

Keywords: clinical informatics, dashboards, emergency department, health informatics, healthcare informatics, medical informatics, opioids, pain management, technology

Procedia PDF Downloads 126
24419 Kernel-Based Double Nearest Proportion Feature Extraction for Hyperspectral Image Classification

Authors: Hung-Sheng Lin, Cheng-Hsuan Li

Abstract:

Over the past few years, kernel-based algorithms have been widely used to extend linear feature extraction methods such as principal component analysis (PCA), linear discriminant analysis (LDA), and nonparametric weighted feature extraction (NWFE) to their nonlinear versions: kernel principal component analysis (KPCA), generalized discriminant analysis (GDA), and kernel nonparametric weighted feature extraction (KNWFE), respectively. These nonlinear feature extraction methods can detect nonlinear directions with the largest nonlinear variance or the largest class separability based on the given kernel function. Moreover, they have been applied to improve target detection and image classification for hyperspectral images. Double nearest proportion feature extraction (DNP) can effectively reduce the overlap effect and performs well in hyperspectral image classification. The DNP structure is an extension of the k-nearest neighbor technique. For each sample, there are two corresponding nearest proportions of samples: the self-class nearest proportion and the other-class nearest proportion. The term “nearest proportion” used here considers both local information and more global information. With these settings, the effect of the overlap between the sample distributions can be reduced. Usually, the maximum likelihood estimator and the related unbiased estimator are not ideal estimators in high-dimensional inference problems, particularly in small data-size situations. Hence, an improved estimator based on shrinkage estimation (regularization) is proposed. Based on the DNP structure, LDA is included as a special case. In this paper, the kernel method is applied to extend DNP to kernel-based DNP (KDNP). In addition to retaining the advantages of DNP, KDNP surpasses DNP in the experimental results.
According to the experiments on the real hyperspectral image data sets, the classification performance of KDNP is better than that of PCA, LDA, NWFE, and their kernel versions, KPCA, GDA, and KNWFE.

Keywords: feature extraction, kernel method, double nearest proportion feature extraction, kernel double nearest feature extraction

Procedia PDF Downloads 327
24418 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With the growth of Internet Communication Technology (ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has come in tandem with an increase in data misuse and data breaches. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in United States courts for failure to prove direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance negates the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical harm perspective negates the fact that data insecurity may result in harms that run counter to the functions of privacy in our lives: the promotion of liberty, selfhood, and autonomy, the promotion of human social relations, and the furtherance of a free society. There is no economic value that can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 176
24417 Genomic Prediction Reliability Using Haplotypes Defined by Different Methods

Authors: Sohyoung Won, Heebal Kim, Dajeong Lim

Abstract:

Genomic prediction is an effective way to measure the breeding abilities of livestock based on genomic estimated breeding values, which are statistically predicted from genotype data using best linear unbiased prediction (BLUP). Using haplotypes, clusters of linked single nucleotide polymorphisms (SNPs), as markers instead of individual SNPs can improve the reliability of genomic prediction, since the probability of a quantitative trait locus being in strong linkage disequilibrium (LD) with markers is higher. To use haplotypes efficiently in genomic prediction, optimal ways to define haplotypes must be found. In this study, 770K SNP chip data were collected from a Hanwoo (Korean cattle) population consisting of 2506 cattle. Haplotypes were first defined in three different ways using the 770K SNP chip data: based on 1) the length of haplotypes (bp), 2) the number of SNPs, and 3) k-medoids clustering by LD. To compare the methods in parallel, haplotypes defined by all methods were set to have comparable sizes; in each method, haplotypes defined to have an average of 5, 10, 20, or 50 SNPs were tested. A modified GBLUP method using haplotype alleles as predictor variables was implemented to test the prediction reliability of each haplotype set. The conventional genomic BLUP (GBLUP) method, which uses individual SNPs, was also tested to evaluate the performance of the haplotype sets in genomic prediction. Carcass weight was used as the phenotype for testing. Haplotypes defined by all three methods showed increased reliability compared to conventional GBLUP. There was little difference in reliability between the haplotype-defining methods. The reliability of genomic prediction was highest when the average number of SNPs per haplotype was 20 in all three methods, implying that haplotypes including around 20 SNPs can be optimal to use as markers for genomic prediction.
When the number of alleles generated by each haplotype-defining method was compared, clustering by LD generated the smallest number of alleles. Using haplotype alleles for genomic prediction showed better performance, suggesting improved accuracy in genomic selection. The number of predictor variables decreased when the LD-based method was used, while all three haplotype-defining methods showed similar performance. This suggests that defining haplotypes based on LD can reduce computational costs and allow efficient prediction. Finding optimal ways to define haplotypes and using the haplotype alleles as markers can provide improved performance and efficiency in genomic prediction.
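To make the link between haplotype definition and the number of predictor variables concrete, here is a minimal sketch of the second definition (a fixed number of SNPs per haplotype). The function names and the toy phased haplotypes are hypothetical and not taken from the study; the sketch only illustrates that each distinct allele observed in a block becomes one predictor.

```python
from collections import Counter

def haplotype_blocks(n_snps, block_size):
    """Split SNP indices 0..n_snps-1 into consecutive blocks of block_size SNPs."""
    return [range(start, min(start + block_size, n_snps))
            for start in range(0, n_snps, block_size)]

def haplotype_alleles(haplotypes, block):
    """Count each distinct allele (SNP pattern) seen in one block."""
    return Counter(tuple(h[i] for i in block) for h in haplotypes)

# Toy phased haplotypes coded 0/1 (hypothetical data, 6 SNPs each).
haplotypes = [
    [0, 1, 1, 0, 0, 1],
    [0, 1, 1, 0, 1, 1],
    [1, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 1],
]

blocks = haplotype_blocks(n_snps=6, block_size=3)
allele_counts = [haplotype_alleles(haplotypes, b) for b in blocks]

# Each distinct haplotype allele would be one predictor variable.
n_predictors = sum(len(counts) for counts in allele_counts)
print(n_predictors)
```

A definition that yields fewer distinct alleles per block gives a smaller predictor matrix, which is the sense in which the abstract links the LD-based method to lower computational cost.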

Keywords: best linear unbiased predictor, genomic prediction, haplotype, linkage disequilibrium

Procedia PDF Downloads 129
24416 Approach for Demonstrating Reliability Targets for Rail Transport during Low Mileage Accumulation in the Field: Methodology and Case Study

Authors: Nipun Manirajan, Heeralal Gargama, Sushil Guhe, Manoj Prabhakaran

Abstract:

In the railway industry, train sets are designed based on contractual requirements (mission profile), where reliability targets are measured in terms of mean distance between failures (MDBF). However, at the beginning of revenue service, trains often do not achieve the designed mission profile distance (mileage) within the timeframe due to infrastructure constraints, scarcity of commuters, or other operational challenges, and thereby do not respect the original design inputs. Since trains do not run sufficiently and do not achieve the designed mileage within the specified time, the car builder risks not achieving the contractual MDBF target. This paper proposes a constant failure rate based model to deal with situations where mileage accumulation does not match the design mission profile. The model provides an appropriate MDBF target to be demonstrated based on the actual accumulated mileage. A case study of rolling stock running in the field is undertaken to analyze the failure data and MDBF target demonstration during low mileage accumulation. The results of the case study show that, with the proposed method, reliability targets are achieved under low mileage accumulation.
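The abstract does not give the model's equations. As a rough sketch of what a constant failure rate implies, the snippet below assumes that failures along distance follow a Poisson process with rate 1/MDBF. The mileage and target values are hypothetical, and this is not the authors' target-appropriation method.

```python
import math

def mdbf_point_estimate(distance_km, failures):
    """Observed MDBF: accumulated distance divided by the failure count."""
    return distance_km / failures

def expected_failures(distance_km, mdbf_km):
    """Mean failure count over a distance under a constant failure rate."""
    return distance_km / mdbf_km

def prob_at_most(failures, distance_km, mdbf_km):
    """P(N <= failures) when N ~ Poisson(distance_km / mdbf_km)."""
    mean = expected_failures(distance_km, mdbf_km)
    return sum(math.exp(-mean) * mean ** k / math.factorial(k)
               for k in range(failures + 1))

# Hypothetical figures: contractual target and low accumulated mileage.
target_mdbf_km = 100_000
accumulated_km = 250_000

print(expected_failures(accumulated_km, target_mdbf_km))
print(prob_at_most(2, accumulated_km, target_mdbf_km))
```

With little accumulated mileage the expected failure count is small, so a handful of failures moves the point estimate a great deal. That sensitivity is why a demonstration target tied to actual mileage is of interest.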

Keywords: mean distance between failures, mileage-based reliability, reliability target appropriations, rolling stock reliability

Procedia PDF Downloads 248
24415 Most Recent Lifespan Estimate for the Itaipu Hydroelectric Power Plant Computed by Using Borland and Miller Method and Mass Balance in Brazil, Paraguay

Authors: Anderson Braga Mendes

Abstract:

Itaipu Hydroelectric Power Plant is located on the Paraná River, which is a natural boundary between Brazil and Paraguay; thus, the facility is shared by both countries. Itaipu Power Plant is the biggest hydroelectric generator in the world, and provides a clean and renewable electrical energy supply for 17% and 76% of Brazil and Paraguay, respectively. The plant started generation in 1984. It has 20 Francis turbines and an installed capacity of 14,000 MW. Its historic generation record occurred in 2016 (103,098,366 MWh), and from the beginning of its operation until the last day of 2016 the plant generated a total of 2,415,789,823 MWh. The distinct sedimentologic aspects of the drainage area of Itaipu Power Plant, from its upstream stretch (Porto Primavera and Rosana dams) to downstream (Itaipu dam itself), were taken into account in order to best estimate the increase/decrease in the sediment yield using data from 2001 to 2016. Such data are collected through a network of 14 automatic sedimentometric stations managed by the company itself and operating on an hourly basis, covering an area of around 136,000 km² (92% of the incremental drainage area of the undertaking). Since 1972, a series of lifespan studies for the Itaipu Power Plant have been made, the first by Sir Hans Albert Einstein at the time of the feasibility studies for the enterprise. From that date onwards, eight further studies were made over the last 44 years, aiming to make the estimates more precise based on more up-to-date data sets. From the analysis of each monitoring station, strong increasing tendencies in the sediment yield were clearly noticed over the last 14 years, mainly in the Iguatemi, Ivaí, São Francisco Falso and Carapá Rivers, the latter situated in Paraguay, whereas the others are entirely in Brazilian territory.
Five lifespan scenarios considering different sediment yield tendencies were simulated with the aid of the software programs SEDIMENT and DPOSIT, both developed by the author of the present work. These programs thoroughly follow the Borland & Miller methodology (empirical area-reduction method). The soundest of the five scenarios under analysis indicated a lifespan forecast of 168 years, with the reservoir only 1.8% silted by the end of 2016, after 32 years of operation. In addition, the mass balance in the reservoir (water inflows minus outflows) between 1986 and 2016 shows that 2% of the whole Itaipu lake is now silted. Owing to the convergence of both results, which were obtained using different methodologies and independent input data, it is reasonable to conclude that the mathematical modeling is satisfactory and calibrated, thus lending credibility to this most recent lifespan estimate.

Keywords: Borland and Miller method, hydroelectricity, Itaipu Power Plant, lifespan, mass balance

Procedia PDF Downloads 260
24414 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.
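For readers unfamiliar with multi-class ROC-AUC, the sketch below shows one common way to obtain a mean AUC: compute a one-vs-rest AUC per grade and take the unweighted average. The abstract does not state which averaging scheme was used, so the unweighted mean is an assumption, and the three-grade toy data are hypothetical.

```python
def binary_auc(scores, labels):
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_ovr_auc(probabilities, true_grades, grades):
    """Unweighted mean of one-vs-rest AUCs, plus the per-grade values."""
    aucs = {}
    for j, grade in enumerate(grades):
        scores = [row[j] for row in probabilities]
        labels = [t == grade for t in true_grades]
        aucs[grade] = binary_auc(scores, labels)
    return sum(aucs.values()) / len(aucs), aucs

# Hypothetical predicted probabilities for three grades, four students.
grades = ["A", "B", "C"]
probabilities = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
true_grades = ["A", "B", "C", "B"]

mean_auc, per_grade = mean_ovr_auc(probabilities, true_grades, grades)
print(mean_auc, per_grade)
```

A mean AUC near 0.5 corresponds to ranking no better than chance, which gives context for the 0.4377 and 0.6149 figures reported above.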

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 29
24413 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila, V. Mahesh

Abstract:

Pulmonary function tests are important non-invasive diagnostic tests for assessing respiratory impairments and provide quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients, resulting in incomplete data sets. This paper presents nonlinear models built using multivariate adaptive regression splines (MARS) and random forest (RF) regression to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N = 198 subjects. The ranked order of the feature importance index calculated by the random forest model shows that the spirometric features FVC, FEF 25, PEF, FEF 25-75, and FEF 50 and the demographic parameter height are the important descriptors. A comparison of the performance of both models shows that the prediction ability of MARS with the top two ranked features, namely FVC and FEF 25, is higher, yielding a model fit of R² = 0.96 and R² = 0.99 for normal and abnormal subjects, respectively. The root mean square error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with an intelligent support system for medical diagnosis and the improvement of clinical care.
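The two fit measures quoted above, R² and root mean square error, can be computed as in the following sketch. The FEV1 values are hypothetical and serve only to show the formulas; they are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observed and predicted FEV1 values in litres.
fev1_observed = [2.0, 3.0, 4.0]
fev1_predicted = [2.1, 2.9, 4.0]

print(rmse(fev1_observed, fev1_predicted))
print(r_squared(fev1_observed, fev1_predicted))
```

RMSE is expressed in the units of the predicted quantity, whereas R² is unitless, so the two measures complement each other when comparing models.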

Keywords: FEV, multivariate adaptive regression splines, pulmonary function test, random forest

Procedia PDF Downloads 293
24412 Design of Large Parallel Underground Openings in Himalayas: A Case Study of Desilting Chambers for Punatsangchhu-I, Bhutan

Authors: Kanupreiya, Rajani Sharma

Abstract:

Construction of a single underground structure is itself a challenging task, and it becomes more critical in tectonically active young mountains such as the Himalayas, which are highly anisotropic. The Himalayan geology mostly comprises incompetent and sheared rock mass, in addition to folds/faults, rock burst, and water ingress. Underground tunnels form the most essential structures in run-of-river hydroelectric projects. The Punatsangchhu-I hydroelectric project (PHEP-I), Bhutan (1200 MW), is a run-of-river scheme which has four parallel underground desilting chambers. The Punatsangchhu River carries a large silt load during the monsoon season. Desilting chambers were provided to remove silt particles of size greater than or equal to 0.2 mm with 90% efficiency, thereby minimizing the rate of damage to turbines. These chambers are 330 m long, 18 m wide at the center and 23.87 m high, with a 5.87 m hopper portion. The geology of the desilting chambers was known from an exploratory drift which exposed a low-dipping foliation joint and six joint sets. The RMR and Q values in this reach varied from 40 to 60 and 1 to 6, respectively. This paper describes the different rock engineering principles applied for safe excavation and rock support of the moderately jointed, blocky and thinly foliated biotite gneiss. For the design of the rock support system of the desilting chambers, empirical and numerical analyses were adopted. Finite element analysis was carried out for cavern design and finalization of pillar width using Phase2. Phase2 is a powerful tool for simulation of stage-wise excavation with simultaneous provision of the support system. As the geology of the region had 7 sets of joints, in addition to the FEM-based approach, safety factors for potentially unstable wedges were checked using UnWedge. The final support recommendations were based on continuous face mapping, numerical modelling, empirical calculations, and practical experience.

Keywords: dam siltation, Himalayan geology, hydropower, rock support, numerical modelling

Procedia PDF Downloads 78
24411 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that, ceteris paribus, countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows in order to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 51
24410 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays, multimedia data are used to store secure information. All previous methods allocate space in the image for data embedding after encryption. In this paper, we propose a novel method that reserves space in the image, surrounded by a boundary, before encryption with a traditional reversible data hiding (RDH) algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance; that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency tenfold compared to the other processes discussed.
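The abstract does not detail the embedding step. The sketch below illustrates only the general idea of reversible least-significant-bit (LSB) embedding in a reserved region: the original LSBs are set aside so the pixels can be restored exactly after extraction. It omits encryption and the boundary handling entirely, the function names are hypothetical, and it should not be read as the authors' algorithm.

```python
def embed_bits(pixels, bits):
    """Write bits into the LSBs of the first len(bits) pixels.

    Returns the marked pixels and the original LSBs needed for recovery.
    """
    if len(bits) > len(pixels):
        raise ValueError("not enough reserved pixels")
    saved = [p & 1 for p in pixels[:len(bits)]]
    marked = [(p & ~1) | b for p, b in zip(pixels, bits)] + pixels[len(bits):]
    return marked, saved

def extract_and_restore(marked, saved):
    """Read the embedded bits back and restore the original pixel values."""
    n = len(saved)
    bits = [p & 1 for p in marked[:n]]
    restored = [(p & ~1) | s for p, s in zip(marked, saved)] + marked[n:]
    return bits, restored

# Hypothetical 8-bit grayscale pixels and a 3-bit payload.
pixels = [200, 13, 255, 64, 7]
payload = [1, 0, 1]

marked, saved = embed_bits(pixels, payload)
recovered_bits, restored = extract_and_restore(marked, saved)
print(marked, recovered_bits, restored == pixels)
```

In a reserve-room-before-encryption scheme the saved LSBs would themselves have to be carried inside the image rather than kept as separate side information; that part is left out here.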

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 396
24409 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We use gene data that were initially collected for autism detection to determine whether, and how accurately, these data can be used for identification applications. Our main goal is to find out whether our data preprocessing technique yields data that are useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that the optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and a standard deviation of 1. The classification rate remains close to optimal at higher noise standard deviations, up to 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
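The evaluation described above can be pictured with the following sketch: each enrolled subject's feature vector is perturbed with zero-mean Gaussian noise and then matched back to the enrolled set with a nearest neighbor classifier. The feature vectors are hypothetical, and the paper's preprocessing and actual test protocol are not described in the abstract, so this setup is an assumption.

```python
import math
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k=1):
    """Majority label among the k training vectors closest to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def add_gaussian_noise(vector, sigma, rng):
    """Corrupt each component with zero-mean noise of standard deviation sigma."""
    return [v + rng.gauss(0.0, sigma) for v in vector]

def classification_rate(train_x, train_y, sigma, rng, k=1):
    """Fraction of noisy copies that are matched to the correct subject."""
    hits = 0
    for x, y in zip(train_x, train_y):
        noisy = add_gaussian_noise(x, sigma, rng)
        hits += knn_predict(train_x, train_y, noisy, k) == y
    return hits / len(train_x)

# Hypothetical enrolled feature vectors, one per subject.
enrolled = [[0.0, 0.0, 0.0], [100.0, 0.0, 0.0], [0.0, 100.0, 0.0]]
subject_ids = ["s1", "s2", "s3"]

rate = classification_rate(enrolled, subject_ids, sigma=1.0,
                           rng=random.Random(0))
print(rate)
```

Repeating the call with larger `sigma` values shows how the classification rate degrades as the noise grows relative to the spacing between subjects.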

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 235
24408 Vertical and Lateral Vibration Analysis of Conventional Elevator

Authors: Mohammadreza Saviz, Sina Najafian

Abstract:

This paper presents an analytical study of the vibration of a moving elevator and presents a 2D dynamic model of the elevator to evaluate its vertical and lateral motion. Most elevators installed in tall buildings include compensating ropes to balance the rope tension between the car and the counterweight. The elasticity of these ropes and of the spring sets that connect the cabin to the ropes causes the elevator car to vibrate. A two-dimensional model is derived to calculate vibrations and displacements. The simulation results were validated against the results of similar works.

Keywords: elevator, vibration, simulation, analytical solution, 2D modeling

Procedia PDF Downloads 290
24407 A Review on Intelligent Systems for Geoscience

Authors: R. Palson Kennedy, P. Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and the geosciences. To meet that need, it presents a review from the data life cycle perspective. Numerous facets of the geosciences present unique difficulties for the study of intelligent systems. Geoscience data are notoriously difficult to analyze since they are frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first section addresses data science’s essential concepts and theoretical underpinnings, while the second section covers key themes and shared experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 123
24406 Toward a Measure of Appropriateness of User Interfaces Adaptations Solutions

Authors: Abderrahim Siam, Ramdane Maamri, Zaidi Sahnoun

Abstract:

The development of adaptive user interfaces (UIs) has long been an important research area in which researchers attempt to call upon the full resources and skills of several disciplines. The adaptive UI community holds thorough knowledge regarding the adaptation of UIs to users and to contexts of use. Several solutions, models, formalisms, techniques, and mechanisms have been proposed to develop adaptive UIs. In this paper, we propose an approach based on fuzzy set theory for modeling the appropriateness of different UI adaptation solutions to the different situations for which interactive systems have to adapt their UIs.
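The abstract does not specify the membership functions or how they are combined. As a purely illustrative sketch of how fuzzy sets can grade appropriateness, the snippet below uses triangular membership functions and the standard minimum operator for fuzzy intersection. The adaptation solution, its attributes, and all parameter values are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def appropriateness(solution, situation):
    """Fuzzy AND (minimum) of the per-attribute membership degrees."""
    return min(triangular(situation[name], *params)
               for name, params in solution.items())

# Hypothetical solution: each attribute maps to (left, peak, right).
large_font_layout = {
    "ambient_light_lux": (0, 200, 1000),
    "screen_width_px": (300, 800, 2000),
}
situation = {"ambient_light_lux": 100, "screen_width_px": 550}

score = appropriateness(large_font_layout, situation)
print(score)
```

A degree between 0 and 1 lets a system rank several candidate adaptations for the same situation instead of making a binary applicable/not-applicable choice.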

Keywords: adaptive user interfaces, adaptation solution’s appropriateness, fuzzy sets

Procedia PDF Downloads 466