Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 26376

Search results for: data exchange

24306 Emotions in Health Tweets: Analysis of American Government Official Accounts

Abstract:

The Government Departments of Health have the task of informing and educating citizens about public health issues. For this, they use channels like Twitter, key in the search for health information and the propagation of content. The tweets, important in the virality of the content, may contain emotions that influence the contagion and exchange of knowledge. The goal of this study is to perform an analysis of the emotional projection of health information shared on Twitter by official American accounts: the disease control account CDCgov, National Institutes of Health, NIH, the government agency HHSGov, and the professional organization PublicHealth. For this, we used Tone Analyzer, an International Business Machines Corporation (IBM) tool specialized in emotion detection in text, corresponding to the categorical model of emotion representation. For 15 days, all tweets from these accounts were analyzed with the emotional analysis tool in text. The results showed that their tweets contain an important emotional load, a determining factor in the success of their communications. This exposes that official accounts also use subjective language and contain emotions. The predominance of emotion joy over sadness and the strong presence of emotions in their tweets stimulate the virality of content, a key in the work of informing that government health departments have.

Keywords: emotions in tweets, emotion detection in the text, health information on Twitter, American health official accounts, emotions on Twitter, emotions and content

Procedia PDF Downloads 148

24305 Mining Scientific Literature to Discover Potential Research Data Sources: An Exploratory Study in the Field of Haemato-Oncology

Authors: A. Anastasiou, K. S. Tingay

Abstract:

Background: Discovering suitable datasets is an important part of health research, particularly for projects working with clinical data from patients organized in cohorts (cohort data), but with the proliferation of so many national and international initiatives, it is becoming increasingly difficult for research teams to locate real world datasets that are most relevant to their project objectives. We present a method for identifying healthcare institutes in the European Union (EU) which may hold haemato-oncology (HO) data. A key enabler of this research was the bibInsight platform, a scientometric data management and analysis system developed by the authors at Swansea University. Method: A PubMed search was conducted using HO clinical terms taken from previous work. The resulting XML file was processed using the bibInsight platform, linking affiliations to the Global Research Identifier Database (GRID). GRID is an international, standardized list of institutions, including the city and country in which the institution exists, as well as a category of the main business type, e.g., Academic, Healthcare, Government, Company. Countries were limited to the 28 current EU members, and institute type to 'Healthcare'. An article was considered valid if at least one author was affiliated with an EU-based healthcare institute. Results: The PubMed search produced 21,310 articles, consisting of 9,885 distinct affiliations with correspondence in GRID. Of these articles, 760 were from EU countries, and 390 of these were healthcare institutes. One affiliation was excluded as being a veterinary hospital. Two EU countries did not have any publications in our analysis dataset. The results were analysed by country and by individual healthcare institute. Networks both within the EU and internationally show institutional collaborations, which may suggest a willingness to share data for research purposes. Geographical mapping can ensure that data has broad population coverage. Collaborations with industry or government may exclude healthcare institutes that may have embargos or additional costs associated with data access. Conclusions: Data reuse is becoming increasingly important both for ensuring the validity of results, and economy of available resources. The ability to identify potential, specific data sources from over twenty thousand articles in less than an hour could assist in improving knowledge of, and access to, data sources. As our method has not yet specified if these healthcare institutes are holding data, or merely publishing on that topic, future work will involve text mining of data-specific concordant terms to identify numbers of participants, demographics, study methodologies, and sub-topics of interest.

Keywords: data reuse, data discovery, data linkage, journal articles, text mining

Procedia PDF Downloads 120

24304 NaOH/Pumice and LiOH/Pumice as Heterogeneous Solid Base Catalysts for Biodiesel Production from Soybean Oil: An Optimization Study

Authors: Joy Marie Mora, Mark Daniel De Luna, Tsair-Wang Chung

Abstract:

Transesterification reaction of soybean oil with methanol was carried out to produce fatty acid methyl esters (FAME) using calcined alkali metal (Na and Li) supported by pumice silica as the solid base catalyst. Pumice silica catalyst was activated by loading alkali metal ions to its surface via an ion-exchange method. Response surface methodology (RSM) in combination with Box-Behnken design (BBD) was used to optimize the operating parameters in biodiesel production, namely: reaction temperature, methanol to oil molar ratio, reaction time, and catalyst concentration. Using the optimized sets of parameters, FAME yields using sodium and lithium silicate catalysts were 98.80% and 98.77%, respectively. A pseudo-first order kinetic equation was applied to evaluate the kinetic parameters of the reaction. The prepared catalysts were characterized by several techniques such as X-ray diffraction (XRD), Fourier transform infrared spectroscopy (FTIR), Brunauer-Emmett-Teller (BET) sorptometer, and scanning electron microscopy (SEM). In addition, the reusability of the catalysts was successfully tested in two subsequent cycles.

Keywords: alkali metal, biodiesel, Box-Behnken design, heterogeneous catalyst, kinetics, optimization, pumice, transesterification

Procedia PDF Downloads 307

24303 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: classification, data mining, decision tree, scholarship

Procedia PDF Downloads 382

24302 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest

Authors: Bharatendra Rai

Abstract:

Predictive data analysis and modeling involving machine learning techniques become challenging in presence of too many explanatory variables or features. Presence of too many features in machine learning is known to not only cause algorithms to slow down, but they can also lead to decrease in model prediction accuracy. This study involves housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. Boruta algorithm that supports feature selection using a wrapper approach build around random forest is used in this study. This feature selection process leads to 49 confirmed features which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios and their impact on model accuracy are captured using coefficient of determination (r-square) and root mean square error (rsme).

Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error

Procedia PDF Downloads 327

24301 Examining the Effects of National Disaster on the Performance of Hospitality Industry in Korea

Authors: Kim Sang Hyuck, Y. Park Sung

Abstract:

The outbreak of national disasters stimulates the decrease of the both internal and domestic tourism demands, causing bad effects on the hospitality industry. The effective and efficient risk management regarding national disasters are being increasingly required from the hospitality industry practitioners and the tourism policymakers. To establish the effective and efficient risk management strategy on national disasters, the most essential prerequisite condition is the correct estimation of national disasters’ effects in terms of the size and duration of the damages occurred from national disaster on hospitality industry. More specifically, the national disasters are twofold: natural disaster and social disaster. In addition, the hospitality industry has consisted of several types of business, such as hotel, restaurant, travel agency, etc. As reasons of the above, it is important to consider how each type of national disasters differently influences on the performance of each type of hospitality industry. Therefore, the purpose of this study is examining the effects of national disaster on hospitality industry in Korea based on the types of national disasters as well as the types of hospitality business. The monthly data was collected from Jan. 2000 to Dec. 2016. The indexes of industrial production for each hospitality industry in Korea were used with the proxy variable for the performance of each hospitality industry. Two national disaster variables (natural disaster and social disaster) were treated as dummy variables. In addition, the exchange rate, industrial production index, and consumer price index were used as control variables in the research model. The impulse response analysis was used to examine the size and duration of the damages occurred from each type of national disaster on each type of hospitality industries. The results of this study show that the natural disaster and the social disaster differently influenced on each type of hospitality industry. More specifically, the performance of airline industry is negatively influenced by the natural disaster at the time of 3 months later from the incidence. However, the negative impacts of social disaster on airline industry occurred not significantly over the time periods. For the hotel industry, both natural disaster and social disaster negatively influence the performance of hotel industry at the time of 5 months and 6 months later, respectively. Also, the negative impact of natural disaster on the performance of restaurant industry occurred at the time of 5 months later, as well as for both 3 months and 6 months later for the social disaster. Finally, both natural disaster and social disaster negatively influence the performance of travel agency at the time of 3 months and 4 months later, respectively. In conclusion, the types of national disasters differently influence the performance of each type of hospitality industry in Korea. These results would provide an important information to establish the effective and efficient risk management strategy for the national disasters.

Keywords: impulse response analysis, Korea, national disaster, performance of hospitality industry

Procedia PDF Downloads 187

24300 Image-Based (RBG) Technique for Estimating Phosphorus Levels of Different Crops

Authors: M. M. Ali, Ahmed Al- Ani, Derek Eamus, Daniel K. Y. Tan

Abstract:

In this glasshouse study, we developed the new image-based non-destructive technique for detecting leaf P status of different crops such as cotton, tomato and lettuce. Plants were allowed to grow on nutrient media containing different P concentrations, i.e. 0%, 50% and 100% of recommended P concentration (P0 = no P, L; P1 = 2.5 mL 10 L-1 of P and P2 = 5 mL 10 L-1 of P as NaH2PO4). After 10 weeks of growth, plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method and at the same time leaf images were collected by a handheld crop image sensor. We calculated leaf area, leaf perimeter and RGB (red, green and blue) values of these images. This data was further used in the linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified these plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using the image and morphological data. Our proposed non-destructive imaging method is precise in estimating P requirements of different crop species.

Keywords: image-based techniques, leaf area, leaf P contents, linear discriminant analysis

Procedia PDF Downloads 386

24299 Design of Visual Repository, Constraint and Process Modeling Tool Based on Eclipse Plug-Ins

Authors: Rushiraj Heshi, Smriti Bhandari

Abstract:

Master Data Management requires creation of Central repository, applying constraints on Repository and designing processes to manage data. Designing of Repository, constraints on repository and business processes is very tedious and time consuming task for large Enterprise. Hence Visual Repository, constraints and Process (Workflow) modeling is the most critical step in Master Data Management.In this paper, we realize a Visual Modeling tool for implementing Repositories, Constraints and Processes based on Eclipse Plugin using GMF/EMF which follows principles of Model Driven Engineering (MDE).

Keywords: EMF, GMF, GEF, repository, constraint, process

Procedia PDF Downloads 501

24298 Chemical and Mineralogical Properties of Soils from an Arid Region of Misurata-Libya: Treated Wastewater Irrigation Impacts

Authors: Khalifa Alatresh, Mirac Aydin

Abstract:

This research explores the impacts of irrigation by treated wastewater (TWW) on the mineralogical and chemical attributes of sandy calcareous soils in the Southern region of Misurata. Soil samples obtained from three horizons (A, B, and C) of six TWW-irrigated pedons (29years) and six other pedons from nearby non-irrigated areas (dry-control). The results demonstrated that the TWW-irrigated pedons had significantly higher salinity (EC), sodium adsorption ratio (SAR), exchangeable sodium percentage (ESP), cation exchange capacity (CEC), available phosphor (AP), total nitrogen (TN), and organic matter (OM) relative to the control pedons. Nonetheless, all the values of interest (EC < 4000 µs/cm < SAR < 13, pH < 8.5 and ESP < 15) remained lower than the thresholds, showing no issues with sodicity or salinity. Irrigated pedons contained significantly higher amounts of total clay and showed an altered distribution of particle sizes and minerals identified (quartz, calcite, microcline, albite, anorthite, and dolomite) within the profile. The observed results included the occurrence of Margarite, Anorthite, Chabazite, and Tridymite minerals after the application of TWW in small quantities that are not enough to influence soil genesis and classification.0,51 cm.

Keywords: treated wastewater, sandy calcareous soils, soil mineralogy, and chemistry

Procedia PDF Downloads 119

24297 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class- Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: The problems of unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many research papers found that the performance of existing classifier tends to be biased towards the majority class. The k -nearest neighbors’ nonparametric discriminant analysis is one method that was proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest to us in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification application of class-imbalanced data of diabetes risk groups. Methods: Data from a healthy project for 599 staffs in a government hospital in Bangkok were obtained for the classification problem. The staffs were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data along with the variables; diabetes risk group, age, gender, cholesterol, and BMI was analyzed and bootstrapped up to 50 and 100 samples, 599 observations per sample, for additional estimation of misclassification error rate. Each data set was explored for the departure of multivariate normality and the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. In finding the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33: 0.33: 0.33) and unequal proportions with three choices of (0.90:0.05:0.05), (0.80: 0.10: 0.10) or (0.70, 0.15, 0.15). Results: The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach when k = 3 or k = 4 and the prior probabilities of {non-risk:risk:diabetic} as {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest error rate of misclassification. Conclusion: The k-nearest neighbors approach would be suggested for classifying a three-class-imbalanced data of diabetes risk groups.

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

Procedia PDF Downloads 439

24296 BFDD-S: Big Data Framework to Detect and Mitigate DDoS Attack in SDN Network

Authors: Amirreza Fazely Hamedani, Muzzamil Aziz, Philipp Wieder, Ramin Yahyapour

Abstract:

Software-defined networking in recent years came into the sight of so many network designers as a successor to the traditional networking. Unlike traditional networks where control and data planes engage together within a single device in the network infrastructure such as switches and routers, the two planes are kept separated in software-defined networks (SDNs). All critical decisions about packet routing are made on the network controller, and the data level devices forward the packets based on these decisions. This type of network is vulnerable to DDoS attacks, degrading the overall functioning and performance of the network by continuously injecting the fake flows into it. This increases substantial burden on the controller side, and the result ultimately leads to the inaccessibility of the controller and the lack of network service to the legitimate users. Thus, the protection of this novel network architecture against denial of service attacks is essential. In the world of cybersecurity, attacks and new threats emerge every day. It is essential to have tools capable of managing and analyzing all this new information to detect possible attacks in real-time. These tools should provide a comprehensive solution to automatically detect, predict and prevent abnormalities in the network. Big data encompasses a wide range of studies, but it mainly refers to the massive amounts of structured and unstructured data that organizations deal with on a regular basis. On the other hand, it regards not only the volume of the data; but also that how data-driven information can be used to enhance decision-making processes, security, and the overall efficiency of a business. This paper presents an intelligent big data framework as a solution to handle illegitimate traffic burden on the SDN network created by the numerous DDoS attacks. The framework entails an efficient defence and monitoring mechanism against DDoS attacks by employing the state of the art machine learning techniques.

Keywords: apache spark, apache kafka, big data, DDoS attack, machine learning, SDN network

Procedia PDF Downloads 173

24295 Welding Process Selection for Storage Tank by Integrated Data Envelopment Analysis and Fuzzy Credibility Constrained Programming Approach

Authors: Rahmad Wisnu Wardana, Eakachai Warinsiriruk, Sutep Joy-A-Ka

Abstract:

Selecting the most suitable welding process usually depends on experiences or common application in similar companies. However, this approach generally ignores many criteria that can be affecting the suitable welding process selection. Therefore, knowledge automation through knowledge-based systems will significantly improve the decision-making process. The aims of this research propose integrated data envelopment analysis (DEA) and fuzzy credibility constrained programming approach for identifying the best welding process for stainless steel storage tank in the food and beverage industry. The proposed approach uses fuzzy concept and credibility measure to deal with uncertain data from experts' judgment. Furthermore, 12 parameters are used to determine the most appropriate welding processes among six competitive welding processes.

Keywords: welding process selection, data envelopment analysis, fuzzy credibility constrained programming, storage tank

Procedia PDF Downloads 171

24294 On the Estimation of Crime Rate in the Southwest of Nigeria: Principal Component Analysis Approach

Authors: Kayode Balogun, Femi Ayoola

Abstract:

Crime is at alarming rate in this part of world and there are many factors that are contributing to this antisocietal behaviour both among the youths and old. In this work, principal component analysis (PCA) was used as a tool to reduce the dimensionality and to really know those variables that were crime prone in the study region. Data were collected on twenty-eight crime variables from National Bureau of Statistics (NBS) databank for a period of fifteen years, while retaining as much of the information as possible. We use PCA in this study to know the number of major variables and contributors to the crime in the Southwest Nigeria. The results of our analysis revealed that there were eight principal variables have been retained using the Scree plot and Loading plot which implies an eight-equation solution will be appropriate for the data. The eight components explained 93.81% of the total variation in the data set. We also found that the highest and commonly committed crimes in the Southwestern Nigeria were: Assault, Grievous Harm and Wounding, theft/stealing, burglary, house breaking, false pretence, unlawful arms possession and breach of public peace.

Keywords: crime rates, data, Southwest Nigeria, principal component analysis, variables

Procedia PDF Downloads 450

24293 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve safety and reliability of manufacturing operations and reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods are evaluated using real data. The results shows that the calibration model based on supervised probabilistic model yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 360

24292 A Quantitative Analysis for the Correlation between Corporate Financial and Social Performance

Authors: Wafaa Salah, Mostafa A. Salama, Jane Doe

Abstract:

Recently, the corporate social performance (CSP) is not less important than the corporate financial performance (CFP). Debate still exists about the nature of the relationship between the CSP and CFP, whether it is a positive, negative or a neutral correlation. The objective of this study is to explore the relationship between corporate social responsibility (CSR) reports and CFP. The study uses the accounting-based and market-based quantitative measures to quantify the financial performance of seven organizations listed on the Egyptian Stock Exchange in 2007-2014. Then uses the information retrieval technologies to quantify the contribution of each of the three dimensions of the corporate social responsibility report (environmental, social and economic). Finally, the correlation between these two sets of variables is viewed together in a model to detect the correlations between them. This model is applied on seven firms that generate social responsibility reports. The results show a positive correlation between the Earnings per share (market based measure) and the economical dimension in the CSR report. On the other hand, total assets and property, plant and equipment (accounting-based measure) are positively correlated to the environmental and social dimensions of the CSR reports. While there is not any significant relationship between ROA, ROE, Operating income and corporate social responsibility. This study contributes to the literature by providing more clarification of the relationship between CFP and the isolated CSR activities in a developing country.

Keywords: financial, social, machine learning, corporate social performance, corporate social responsibility

Procedia PDF Downloads 315

24291 Multilevel Gray Scale Image Encryption through 2D Cellular Automata

Authors: Rupali Bhardwaj

Abstract:

Cryptography is the science of using mathematics to encrypt and decrypt data; the data are converted into some other gibberish form, and then the encrypted data are transmitted. The primary purpose of this paper is to provide two levels of security through a two-step process, rather than transmitted the message bits directly, first encrypted it using 2D cellular automata and then scrambled with Arnold Cat Map transformation; it provides an additional layer of protection and reduces the chance of the transmitted message being detected. A comparative analysis on effectiveness of scrambling technique is provided by scrambling degree measurement parameters i.e. Gray Difference Degree (GDD) and Correlation Coefficient.

Keywords: scrambling, cellular automata, Arnold cat map, game of life, gray difference degree, correlation coefficient

Procedia PDF Downloads 382

24290 Survey Based Data Security Evaluation in Pakistan Financial Institutions against Malicious Attacks

Authors: Naveed Ghani, Samreen Javed

Abstract:

In today’s heterogeneous network environment, there is a growing demand for distrust clients to jointly execute secure network to prevent from malicious attacks as the defining task of propagating malicious code is to locate new targets to attack. Residual risk is always there no matter what solutions are implemented or whet so ever security methodology or standards being adapted. Security is the first and crucial phase in the field of Computer Science. The main aim of the Computer Security is gathering of information with secure network. No one need wonder what all that malware is trying to do: It's trying to steal money through data theft, bank transfers, stolen passwords, or swiped identities. From there, with the help of our survey we learn about the importance of white listing, antimalware programs, security patches, log files, honey pots, and more used in banks for financial data protection but there’s also a need of implementing the IPV6 tunneling with Crypto data transformation according to the requirements of new technology to prevent the organization from new Malware attacks and crafting of its own messages and sending them to the target. In this paper the writer has given the idea of implementing IPV6 Tunneling Secessions on private data transmission from financial organizations whose secrecy needed to be safeguarded.

Keywords: network worms, malware infection propagating malicious code, virus, security, VPN

Procedia PDF Downloads 360

24289 Interactive IoT-Blockchain System for Big Data Processing

Authors: Abdallah Al-ZoubI, Mamoun Dmour

Abstract:

The spectrum of IoT devices is becoming widely diversified, entering almost all possible fields and finding applications in industry, health, finance, logistics, education, to name a few. The IoT active endpoint sensors and devices exceeded the 12 billion mark in 2021 and are expected to reach 27 billion in 2025, with over $34 billion in total market value. This sheer rise in numbers and use of IoT devices bring with it considerable concerns regarding data storage, analysis, manipulation and protection. IoT Blockchain-based systems have recently been proposed as a decentralized solution for large-scale data storage and protection. COVID-19 has actually accelerated the desire to utilize IoT devices as it impacted both demand and supply and significantly affected several regions due to logistic reasons such as supply chain interruptions, shortage of shipping containers and port congestion. An IoT-blockchain system is proposed to handle big data generated by a distributed network of sensors and controllers in an interactive manner. The system is designed using the Ethereum platform, which utilizes smart contracts, programmed in solidity to execute and manage data generated by IoT sensors and devices. such as Raspberry Pi 4, Rasbpian, and add-on hardware security modules. The proposed system will run a number of applications hosted by a local machine used to validate transactions. It then sends data to the rest of the network through InterPlanetary File System (IPFS) and Ethereum Swarm, forming a closed IoT ecosystem run by blockchain where a number of distributed IoT devices can communicate and interact, thus forming a closed, controlled environment. A prototype has been deployed with three IoT handling units distributed over a wide geographical space in order to examine its feasibility, performance and costs. Initial results indicated that big IoT data retrieval and storage is feasible and interactivity is possible, provided that certain conditions of cost, speed and thorough put are met.

Keywords: IoT devices, blockchain, Ethereum, big data

Procedia PDF Downloads 152

24288 Keynote Talk: The Role of Internet of Things in the Smart Cities Power System

Authors: Abdul-Rahman Al-Ali

Abstract:

As the number of mobile devices is growing exponentially, it is estimated to connect about 50 million devices to the Internet by the year 2020. At the end of this decade, it is expected that an average of eight connected devices per person worldwide. The 50 billion devices are not mobile phones and data browsing gadgets only, but machine-to-machine and man-to-machine devices. With such growing numbers of devices the Internet of Things (I.o.T) concept is one of the emerging technologies as of recently. Within the smart grid technologies, smart home appliances, Intelligent Electronic Devices (IED) and Distributed Energy Resources (DER) are major I.o.T objects that can be addressable using the IPV6. These objects are called the smart grid internet of things (SG-I.o.T). The SG-I.o.T generates big data that requires high-speed computing infrastructure, widespread computer networks, big data storage, software, and platforms services. A company’s utility control and data centers cannot handle such a large number of devices, high-speed processing, and massive data storage. Building large data center’s infrastructure takes a long time, it also requires widespread communication networks and huge capital investment. To maintain and upgrade control and data centers’ infrastructure and communication networks as well as updating and renewing software licenses which collectively, requires additional cost. This can be overcome by utilizing the emerging computing paradigms such as cloud computing. This can be used as a smart grid enabler to replace the legacy of utilities data centers. The talk will highlight the role of I.o.T, cloud computing services and their development models within the smart grid technologies.

Keywords: intelligent electronic devices (IED), distributed energy resources (DER), internet, smart home appliances

Procedia PDF Downloads 329

24287 Statistical Analysis of Interferon-γ for the Effectiveness of an Anti-Tuberculous Treatment

Authors: Shishen Xie, Yingda L. Xie

Abstract:

Tuberculosis (TB) is a potentially serious infectious disease that remains a health concern. The Interferon Gamma Release Assay (IGRA) is a blood test to find out if an individual is tuberculous positive or negative. This study applies statistical analysis to the clinical data of interferon-gamma levels of seventy-three subjects who diagnosed pulmonary TB in an anti-tuberculous treatment. Data analysis is performed to determine if there is a significant decline in interferon-gamma levels for the subjects during a period of six months, and to infer if the anti-tuberculous treatment is effective.

Keywords: data analysis, interferon gamma release assay, statistical methods, tuberculosis infection

Procedia PDF Downloads 308

24286 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about learning process, it can hold information about advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such a feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end of unit general textual feedback using part of speech feature (PoS) with four machine learning algorithms: support vector machines, decision tree, random forest, and naive bays. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one subset is tailored to assessment practices (assessment related), and the other one is the non-assessment related data. Second task to use the same algorithms to build an optimal model for whole data set, and the new data subsets to automatically detect their sentiment. The significance of this paper is to compare the performance of the above four algorithms using part of speech feature to the performance of the same algorithms using n-grams feature. The paper follows Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which is understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The results of this paper experiments show that both models which used both features performed very well regarding first task. But regarding the second task, models that used part of speech feature has underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 148

24285 The Roman Fora in North Africa Towards a Supportive Protocol to the Decision for the Morphological Restitution

Authors: Dhouha Laribi Galalou, Najla Allani Bouhoula, Atef Hammouda

Abstract:

This research delves into the fundamental question of the morphological restitution of built archaeology in order to place it in its paradigmatic context and to seek answers to it. Indeed, the understanding of the object of the study, its analysis, and the methodology of solving the morphological problem posed, are manageable aspects only by means of a thoughtful strategy that draws on well-defined epistemological scaffolding. In this stream, the crisis of natural reasoning in archaeology has generated multiple changes in this field, ranging from the use of new tools to the integration of an archaeological information system where urbanization involves the interplay of several disciplines. The built archaeological topic is also an architectural and morphological object. It is also a set of articulated elementary data, the understanding of which is about to be approached from a logicist point of view. Morphological restitution is no exception to the rule, and the inter-exchange between the different disciplines uses the capacity of each to frame the reflection on the incomplete elements of a given architecture or on its different phases and multiple states of existence. The logicist sequence is furnished by the set of scattered or destroyed elements found, but also by what can be called a rule base which contains the set of rules for the architectural construction of the object. The knowledge base built from the archaeological literature also provides a reference that enters into the game of searching for forms and articulations. The choice of the Roman Forum in North Africa is justified by the great urban and architectural characteristics of this entity. The research on the forum involves both a fairly large knowledge base but also provides the researcher with material to study - from a morphological and architectural point of view - starting from the scale of the city down to the architectural detail. The experimentation of the knowledge deduced on the paradigmatic level, as well as the deduction of an analysis model, is then carried out on the basis of a well-defined context which contextualises the experimentation from the elaboration of the morphological information container attached to the rule base and the knowledge base. The use of logicist analysis and artificial intelligence has allowed us to first question the aspects already known in order to measure the credibility of our system, which remains above all a decision support tool for the morphological restitution of Roman Fora in North Africa. This paper presents a first experimentation of the model elaborated during this research, a model framed by a paradigmatic discussion and thus trying to position the research in relation to the existing paradigmatic and experimental knowledge on the issue.

Keywords: classical reasoning, logicist reasoning, archaeology, architecture, roman forum, morphology, calculation

Procedia PDF Downloads 152

24284 Fast Fourier Transform-Based Steganalysis of Covert Communications over Streaming Media

Authors: Jinghui Peng, Shanyu Tang, Jia Li

Abstract:

Steganalysis seeks to detect the presence of secret data embedded in cover objects, and there is an imminent demand to detect hidden messages in streaming media. This paper shows how a steganalysis algorithm based on Fast Fourier Transform (FFT) can be used to detect the existence of secret data embedded in streaming media. The proposed algorithm uses machine parameter characteristics and a network sniffer to determine whether the Internet traffic contains streaming channels. The detected streaming data is then transferred from the time domain to the frequency domain through FFT. The distributions of power spectra in the frequency domain between original VoIP streams and stego VoIP streams are compared in turn using t-test, achieving the p-value of 7.5686E-176 which is below the threshold. The results indicate that the proposed FFT-based steganalysis algorithm is effective in detecting the secret data embedded in VoIP streaming media.

Keywords: steganalysis, security, Fast Fourier Transform, streaming media

Procedia PDF Downloads 152

24283 Privacy-Preserving Model for Social Network Sites to Prevent Unwanted Information Diffusion

Authors: Sanaz Kavianpour, Zuraini Ismail, Bharanidharan Shanmugam

Abstract:

Social Network Sites (SNSs) can be served as an invaluable platform to transfer the information across a large number of individuals. A substantial component of communicating and managing information is to identify which individual will influence others in propagating information and also whether dissemination of information in the absence of social signals about that information will be occurred or not. Classifying the final audience of social data is difficult as controlling the social contexts which transfers among individuals are not completely possible. Hence, undesirable information diffusion to an unauthorized individual on SNSs can threaten individuals’ privacy. This paper highlights the information diffusion in SNSs and moreover it emphasizes the most significant privacy issues to individuals of SNSs. The goal of this paper is to propose a privacy-preserving model that has urgent regards with individuals’ data in order to control availability of data and improve privacy by providing access to the data for an appropriate third parties without compromising the advantages of information sharing through SNSs.

Keywords: anonymization algorithm, classification algorithm, information diffusion, privacy, social network sites

Procedia PDF Downloads 323

24282 Application Difference between Cox and Logistic Regression Models

Authors: Idrissa Kayijuka

Abstract:

The logistic regression and Cox regression models (proportional hazard model) at present are being employed in the analysis of prospective epidemiologic research looking into risk factors in their application on chronic diseases. However, a theoretical relationship between the two models has been studied. By definition, Cox regression model also called Cox proportional hazard model is a procedure that is used in modeling data regarding time leading up to an event where censored cases exist. Whereas the Logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the resultant variable is binary (dichotomous). Arguments and findings of many researchers focused on the overview of Cox and Logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is SPSS exercise data on BREAST CANCER with a sample size of 1121 women where the main objective is to show the application difference between Cox regression model and logistic regression model based on factors that cause women to die due to breast cancer. Thus we did some analysis manually i.e. on lymph nodes status, and SPSS software helped to analyze the mentioned data. This study found out that there is an application difference between Cox and Logistic regression models which is Cox regression model is used if one wishes to analyze data which also include the follow-up time whereas Logistic regression model analyzes data without follow-up-time. Also, they have measurements of association which is different: hazard ratio and odds ratio for Cox and logistic regression models respectively. A similarity between the two models is that they are both applicable in the prediction of the upshot of a categorical variable i.e. a variable that can accommodate only a restricted number of categories. In conclusion, Cox regression model differs from logistic regression by assessing a rate instead of proportion. The two models can be applied in many other researches since they are suitable methods for analyzing data but the more recommended is the Cox, regression model.

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

Procedia PDF Downloads 462

24281 Bicycle Tourism and Sharing Economy (C2C-Tourism): Analysis of the Reciprocity Behavior in the Case of Warmshowers

Authors: Jana Heimel, Franziska Drescher, Lauren Ugur, Graciela Kuchle

Abstract:

Sharing platforms are a widely investigated field. However, there is a research gap with a lack of focus on ‘real’ (non-profit-orientated) sharing platforms. The research project addresses this gap by conducting an empirical study on a private peer-to-peer (P2P) network to investigate cooperative behavior from a socio-psychological perspective. In recent years the conversion from possession to accessing is increasingly influencing different sectors, particularly the traveling industry. The number of people participating in hospitality exchange platforms like Airbnb, Couchsurfing, and Warmshowers (WS) is rapidly growing. WS is an increasingly popular online community that is linking cycling tourists and locals. It builds on the idea of the “sharing economy” as a not-for-profit hospitality network for bicycle tourists. Hosts not only provide a sleeping berth and warm shower free of charge but also offer additional services to their guests, such as cooking and washing clothes for them. According to previous studies, they are motivated by the idea of promoting cultural experience and forming new friendships. Trust and reciprocity are supposed to play major roles in the success of such platforms. The objective of this research project is to analyze the reciprocity behavior within the WS community. Reciprocity is the act of giving and taking among each other. Individuals feel obligated to return a favor and often expect to increase their own chances of receiving future benefits for themselves. Consequently, the drivers that incite giving and taking, as well as the motivation for hosts and guests, are examined. Thus, the project investigates a particular tourism offer that contributes to sustainable tourism by analyzing P2P resp. cyclist-to-cyclist, C2C) tourism. C2C tourism is characterized by special hospitality and generosity. To find out what motivations drive the hosts and which determinants drive the sharing cycling economy, an empirical study has been conducted globally through an online survey. The data was gathered through the WS community and comprised responses from more than 10,000 cyclists around the globe. Next to general information mostly comprising quantitative data on bicycle tourism (year/tour distance, duration and budget), qualitative information on traveling with WS as well as hosting was collected. The most important motivations for a traveler is to explore the local culture, to save money, and to make friends. The main reasons to host a guest are to promote the use of bicycles and to make friends, but also to give back and pay forward. WS members prefer to stay with/host cyclists. The results indicate that C2C tourists share homogenous characteristics and a similar philosophy, which is crucial for building mutual trust. Members of WS are generally extremely trustful. The study promotes an ecological form of tourism by combining sustainability, regionality, health, experience and the local communities' cultures. The empirical evidence found and analyzed, despite evident limitations, enabled us to shed light, especially on the issue of motivations and social capital, and on the functioning of ‘sharing’ platforms. Final research results are intended to promote C2C tourism around the globe to further replace conventional by sustainable tourism.

Keywords: bicycle tourism, homogeneity, reciprocity, sharing economy, trust

Procedia PDF Downloads 120

24280 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 181

24279 Navigating Government Finance Statistics: Effortless Retrieval and Comparative Analysis through Data Science and Machine Learning

Authors: Kwaku Damoah

Abstract:

This paper presents a methodology and software application (App) designed to empower users in accessing, retrieving, and comparatively exploring data within the hierarchical network framework of the Government Finance Statistics (GFS) system. It explores the ease of navigating the GFS system and identifies the gaps filled by the new methodology and App. The GFS, embodies a complex Hierarchical Network Classification (HNC) structure, encapsulating institutional units, revenues, expenses, assets, liabilities, and economic activities. Navigating this structure demands specialized knowledge, experience, and skill, posing a significant challenge for effective analytics and fiscal policy decision-making. Many professionals encounter difficulties deciphering these classifications, hindering confident utilization of the system. This accessibility barrier obstructs a vast number of professionals, students, policymakers, and the public from leveraging the abundant data and information within the GFS. Leveraging R programming language, Data Science Analytics and Machine Learning, an efficient methodology enabling users to access, navigate, and conduct exploratory comparisons was developed. The machine learning Fiscal Analytics App (FLOWZZ) democratizes access to advanced analytics through its user-friendly interface, breaking down expertise barriers.

Keywords: data science, data wrangling, drilldown analytics, government finance statistics, hierarchical network classification, machine learning, web application.

Procedia PDF Downloads 74

24278 Multifunctional 1D α-Fe2O3/ZnO Core/Shell Semiconductor Nano-Heterostructures: Heterojunction Engineering

Authors: Gobinda Gopal Khan, Ashutosh K. Singh, Debasish Sarkar

Abstract:

This study reports the facile fabrication of 1D ZnO/α-Fe2O3 semiconductor nano-heterostructures (SNHs), and we investigate the strong interfacial interactions at the heterojunction, resulting in novel multifunctionality in the hybrid structure. ZnO-coated α-Fe2O3 nanowires (NWs) have been prepared by combining electrodeposition and wet chemical methods. Significant improvement in electrical conductivity, photoluminescence, and room temperature magnetic properties have been observed for the ZnO/α-Fe2O3 SNHs over the pristine α-Fe2O3 NWs because of the contribution of the ZnO nanolayer. The increase in electrical conductivity in ZnO/α-Fe2O3 SNHs is because of the increase in free electrons in the conduction band of the SNHs due to the formation of type-II n-n band configuration at the heterojunction. The SNHs are found to exhibit enhanced visible green photoluminescence along with the UV emission at room temperature. The band-gap emission of the α-Fe2O3 NWs coupled to the defect emissions of the ZnO in SNHs can be attributed to the profound enhancement of the visible green luminescence. Ferromagnetism of the SNHs is found to be increased nearly five times in magnitude over the primeval α-Fe2O3 NWs, which can be ascribed to the exchange coupling of the interfacial spin at ZnO/α-Fe2O3 interface, the surface spin of ZnO nanolayer, along with the structural defects like the cation vacancies (VZn) and the singly ionized oxygen vacancies (Vo•) present in SNHs.

Keywords: nano-heterostructures, photoluminescence, electrical property, magnetism

Procedia PDF Downloads 261

24277 Geo Spatial Database for Railway Assets Management

Authors: Muhammad Umar

Abstract:

Safety and Assets management is considering a backbone of every department. GIS in the Railway become very important to Manage Assets and Security through Digital Maps and Web based GIS Maps. It provides a complete frame of work to the organization for the management of assets. Pakistan Railway is the most common and safest mode of traveling in Pakistan. Due to ever-increasing demand of transporting huge amount of information generated from various sources and this information must be accurate. This creates problems for Passengers and Administration that causes finical and time loss. GIS Solve this problem by Digital Maps & Database. It provides you a real time Spatial and Statistical analysis that helps you to communicate and exchange the information in a sophisticated way to the users. GIS Based Web system provides a facility to different end user to make query at a time as per requirements. This GIS System provides an advancement in an organization for a complete Monitoring, Safety and Decision System for tracks, Stations and Junctions that further use for the Analysis of different areas i.e. analysis of tracks, junctions and Stations in case of reconstruction, Rescue for rail accidents and Natural disasters .This Research work helps to reduce the financial loss and reduce human mistakes helps you provide a complete security and Management system of assets.

Keywords: Geographical Information System (GIS) for assets management, geo spatial database, railway assets management, Pakistan

Procedia PDF Downloads 494