Search results for: data integrity and privacy
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25828

22798 Human Resource Management Practices, Person-Environment Fit and Financial Performance in Brazilian Publicly Traded Companies

Authors: Bruno Henrique Rocha Fernandes, Amir Rezaee, Jucelia Appio

Abstract:

The relation between Human Resource Management (HRM) practices and organizational performance remains the subject of a substantial literature. Though many studies have demonstrated a positive relationship, the major influencing variables are not yet clear. This study considers the Person-Environment Fit (PE Fit) and its components, Person-Supervisor (PS), Person-Group (PG), Person-Organization (PO) and Person-Job (PJ) Fit, as possible explanatory variables. We analyzed PE Fit as a moderator between HRM practices and financial performance in the “best companies to work for” in Brazil. Data on HRM practices were classified through the High Performance Working Systems (HPWS) construct, and data on PE Fit were obtained through surveys among employees. Financial data, consisting of return on invested capital (ROIC) and price-earnings ratio (PER), were collected for the publicly traded best companies to work for. Findings show that PO Fit and PJ Fit play a significant moderating role for PER but not for ROIC.

Keywords: financial performance, human resource management, high performance working systems, person-environment fit

22797 Flow Duration Curves and Recession Curves Connection through a Mathematical Link

Authors: Elena Carcano, Mirzi Betasolo

Abstract:

This study helps Public Water Bureaus give reliable answers to water concession requests. Rapidly increasing water requests can be supported provided that further uses of a river course are not totally compromised and environmental features are protected as well. Strictly speaking, a water concession can be considered a continuous withdrawal from the source that causes a reduction in mean annual streamflow. Therefore, deciding whether a water concession is appropriate seems to be easily solved by comparing the generic demand to the mean annual streamflow available. Still, the immediate shortcoming of such a comparison is that streamflow data are available only for a few catchments and, most often, limited to specific sites. Moreover, comparing the generic water demand to the mean daily discharge is far from completely satisfactory, since the mean daily streamflow is greater than the water withdrawal for a long period of the year. Consequently, such a comparison appears to be of little significance for preserving the quality and quantity of the river. To overcome this limit, this study aims to complete the information provided by Flow Duration Curves (FDCs) by introducing a link between FDCs and recession curves, and to show the chronological sequence of flows with a particular focus on low-flow data. The analysis is carried out on 25 catchments located in North-Eastern Italy for which daily data are provided. The results identify groups of catchments as hydrologically homogeneous, having the lower part of the FDCs (the streamflow interval between Q(300) and Q(335)) smoothly reproduced by a common recession curve. In conclusion, the results are useful for providing more reliable answers to water requests, especially for those catchments which show a similar hydrological response, and can be used for a focused regionalization approach on low-flow data.
A mathematical link between streamflow duration curves and recession curves is herein provided, thus furnishing streamflow duration curves with information on the temporal sequence of data. In this way, by introducing assumptions on recession curves, the chronological sequence of low-flow data can also be attributed to FDCs, which are known to lack this information by nature.
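The two ingredients named in this abstract can be illustrated with a minimal Python sketch. It builds a flow duration curve from one year of daily flows, reads off Q(300) and Q(335), and fits a single exponential recession between them. The exponential form Q(t) = Q0·exp(-k·t), the function names, and the synthetic flow record are all assumptions made for illustration; the abstract does not state the authors' actual link or recession model.

```python
import math

def flow_duration_curve(daily_flows):
    """Return the daily flows sorted from highest to lowest.

    For a 365-value record, element d-1 is Q(d): the discharge
    equalled or exceeded on d days of the year.
    """
    return sorted(daily_flows, reverse=True)

def q_duration(fdc, days):
    """Q(days): discharge equalled or exceeded on `days` days."""
    return fdc[days - 1]

def recession_flow(q0, k, t):
    """Assumed exponential recession: Q(t) = Q0 * exp(-k * t)."""
    return q0 * math.exp(-k * t)

def fit_recession_constant(q_start, q_end, elapsed_days):
    """Recession constant k that carries q_start down to q_end."""
    return math.log(q_start / q_end) / elapsed_days

# Synthetic one-year record (hypothetical values, not data from the study).
flows = [50.0 * math.exp(-0.01 * day) + 1.0 for day in range(365)]
fdc = flow_duration_curve(flows)
q300 = q_duration(fdc, 300)
q335 = q_duration(fdc, 335)

# Treat the lower part of the FDC as one recession lasting 35 days.
k = fit_recession_constant(q300, q335, 35)
print(f"Q(300)={q300:.3f}  Q(335)={q335:.3f}  k={k:.5f} per day")
```

Catchments whose fitted constants are close could then be candidates for a common recession curve, which is the kind of grouping the abstract describes.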

Keywords: chronological sequence of discharges, recession curves, streamflow duration curves, water concession

22796 Alternating Expectation-Maximization Algorithm for a Bilinear Model in Isoform Quantification from RNA-Seq Data

Authors: Wenjiang Deng, Tian Mou, Yudi Pawitan, Trung Nghia Vu

Abstract:

Estimation of isoform-level gene expression from RNA-seq data depends on simplifying assumptions, such as a uniform read distribution, that are easily violated in real data. Such violations typically lead to biased estimates. Most existing methods provide one or more bias correction steps, which are based on biological considerations, such as GC content, and applied to single samples separately. The main problem is that not all biases are known. For example, new technologies such as single-cell RNA-seq (scRNA-seq) may introduce new sources of bias not seen in bulk-cell data. This study introduces a method called XAEM based on a more flexible and robust statistical model. Existing methods are essentially based on a linear model Xβ, where the design matrix X is known and derived from the simplifying assumptions. In contrast, XAEM considers Xβ as a bilinear model with both X and β unknown. Joint estimation of X and β is made possible by simultaneous analysis of multi-sample RNA-seq data. Compared to existing methods, XAEM automatically performs empirical correction of potentially unknown biases. XAEM implements an alternating expectation-maximization (AEM) algorithm, alternating between estimation of X and β. For speed, XAEM utilizes quasi-mapping for read alignment, thus leading to a fast algorithm. Overall, XAEM performs favorably compared to other recent advanced methods. For simulated datasets, XAEM obtains higher accuracy for multiple-isoform genes, particularly for paralogs. In a differential-expression analysis of a real scRNA-seq dataset, XAEM achieves substantially greater rediscovery rates in an independent validation set.

Keywords: alternating EM algorithm, bias correction, bilinear model, gene expression, RNA-seq

22795 A New Distribution and Application on the Lifetime Data

Authors: Gamze Ozel, Selen Cakmakyapan

Abstract:

We introduce a new model called the Marshall-Olkin Rayleigh distribution, which extends the Rayleigh distribution using the Marshall-Olkin transformation and has increasing and decreasing shapes for the hazard rate function. Various structural properties of the new distribution are derived, including explicit expressions for the moments, generating and quantile functions, some entropy measures, and order statistics. The model parameters are estimated by the method of maximum likelihood and the observed information matrix is determined. The potential of the new model is illustrated by means of a real-life data set.
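As a rough illustration of the construction, the sketch below applies the standard Marshall-Olkin transformation, survival G(x) = αS(x) / (1 − (1 − α)S(x)), to a Rayleigh baseline with survival S(x) = exp(−x²/(2σ²)). The parametrization (tilt parameter `alpha`, scale `sigma`) and the function names are assumptions; the abstract does not give the authors' notation, so this may differ from the paper in detail.

```python
import math

def rayleigh_survival(x, sigma):
    """Baseline Rayleigh survival function S(x) = exp(-x^2 / (2 sigma^2))."""
    return math.exp(-x * x / (2.0 * sigma * sigma))

def mo_rayleigh_survival(x, alpha, sigma):
    """Marshall-Olkin survival: alpha * S / (1 - (1 - alpha) * S)."""
    s = rayleigh_survival(x, sigma)
    return alpha * s / (1.0 - (1.0 - alpha) * s)

def mo_rayleigh_pdf(x, alpha, sigma):
    """Density: alpha * f / (1 - (1 - alpha) * S)^2, f the Rayleigh density."""
    s = rayleigh_survival(x, sigma)
    f = (x / sigma**2) * s
    return alpha * f / (1.0 - (1.0 - alpha) * s) ** 2

def mo_rayleigh_hazard(x, alpha, sigma):
    """Hazard: Rayleigh hazard x / sigma^2 divided by (1 - (1 - alpha) * S)."""
    s = rayleigh_survival(x, sigma)
    return (x / sigma**2) / (1.0 - (1.0 - alpha) * s)

def mo_rayleigh_quantile(p, alpha, sigma):
    """Inverse CDF for 0 < p < 1, obtained by solving survival(x) = 1 - p."""
    s = (1.0 - p) / (alpha + (1.0 - alpha) * (1.0 - p))
    return sigma * math.sqrt(-2.0 * math.log(s))

# With alpha = 1 the transformation leaves the Rayleigh distribution unchanged.
print(mo_rayleigh_survival(1.3, 1.0, 2.0), rayleigh_survival(1.3, 2.0))
```

The closed-form quantile function also gives a simple way to draw random samples by inverse transform: feed uniform random numbers in (0, 1) to `mo_rayleigh_quantile`.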

Keywords: Marshall-Olkin distribution, Rayleigh distribution, estimation, maximum likelihood

22794 Comparative Study between the Absorbed Dose of 67Ga-ECC and 68Ga-ECC

Authors: H. Yousefnia, S. Zolghadri, S. Shanesazzadeh, A. Lahooti, A. R. Jalilian

Abstract:

In this study, 68Ga-ECC and 67Ga-ECC were both prepared with a radiochemical purity higher than 97% in less than 30 min. The biodistribution data for 68Ga-ECC showed the extraction of most of the activity from the urinary tract. The absorbed dose was estimated from biodistribution data in mice by the medical internal radiation dose (MIRD) method. Comparison of the human absorbed dose estimates for these two agents indicated values approximately ten-fold higher after injection of 67Ga-ECC than of 68Ga-ECC in most organs. The results showed that 68Ga-ECC can be considered a more promising agent for renal imaging than 67Ga-ECC.

Keywords: effective absorbed dose, ethylenecysteamine cysteine, Ga-67, Ga-68

22793 Logistic Regression Model versus Additive Model for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model, using the same covariates that were used in the analysis with the additive model. The model gives results reasonably similar to those obtained with the additive regression model. In addition, the problem of the estimated conditional probabilities not being constrained between zero and one in the additive model is solved here. Also, martingale residuals, which have been used to judge the goodness of fit of the additive model, are shown to be useful for judging the goodness of fit of the logistic model.

Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event

22792 From Industry 4.0 to Agriculture 4.0: A Framework to Manage Product Data in Agri-Food Supply Chain for Voluntary Traceability

Authors: Angelo Corallo, Maria Elena Latino, Marta Menegoli

Abstract:

The agri-food value chain involves various stakeholders with different roles. All of them abide by national and international rules and leverage marketing strategies to advance their products. Food products and the related processing phases carry with them a large amount of data that are often not used to inform the final customer. Some data, if fittingly identified and used, can enhance the single company and/or the whole supply chain, creating a match between marketing techniques and voluntary traceability strategies. Moreover, the world has recently seen a change in buying models: customers are attentive to wellbeing and food quality. Food citizenship and food democracy were born, leveraging transparency, sustainability and food information needs. The Internet of Things (IoT) and Analytics, some of the innovative technologies of Industry 4.0, have a significant impact on the market and will act as a main thrust towards a genuine ‘4.0 change’ for agriculture. However, realizing a traceability system is not simple because of the complexity of the agri-food supply chain, the many actors involved, different business models, environmental variations impacting products and/or processes, and extraordinary climate changes. To support companies involved in a traceability path, a Framework to Manage Product Data in Agri-Food Supply Chain for Voluntary Traceability was conceived, starting from business model analysis and the related business processes. Studying each process task and leveraging modeling techniques leads to identifying the information held by the different actors along the agri-food supply chain. IoT technologies for data collection and Analytics techniques for data processing supply information useful for increasing intra-company efficiency and competitiveness in the market. All the information recovered can be shown through IT solutions and mobile applications and made accessible to the company, the entire supply chain and the consumer, with a view to guaranteeing transparency and quality.

Keywords: agriculture 4.0, agri-food supply chain, industry 4.0, voluntary traceability

22791 Investigation of the Relationship between Personality Components and Tendency to Addiction to Domestic Violence

Authors: Mohamad Reza Khodabakhsh

Abstract:

Violence against women is a historical phenomenon. Its forms and types are common to various societies and cultures, and this type of violence occurs in physical, psychological, financial, and sexual dimensions. It is the cause of many social deviations and endangers the core of the family, the most important institution. This research seeks to investigate the relationship between personality characteristics and the tendency to addiction to domestic violence. One hundred fifty women and one hundred fifty men were selected by the available (convenience) sampling method. The men had been admitted to drug addiction camps, and the women included domestic violence cases. Questionnaires on addiction tendency, the five personality traits (NEO), and attitudes toward violence against women were used. Data were analyzed with descriptive and inferential statistics: at the descriptive level using the mean and standard deviation, and at the inferential level using correlation and analysis of variance in SPSS 20, at the p≤0.05 significance level. The results showed that there is a significant relationship between personality traits and a tendency to addiction and domestic violence.

Keywords: personality, addiction, domestic violence, family

22790 Artificial Intelligence Assisted Sentiment Analysis of Hotel Reviews Using Topic Modeling

Authors: Sushma Ghogale

Abstract:

With the surge in user-generated content, feedback and reviews on the internet, it has become both possible and important to know consumers' opinions about products and services. This data is important for both potential customers and businesses providing the services. Data from social media is attracting significant attention, and social media has become the most prominent channel for expressing unregulated opinions. Prospective customers look for reviews from experienced customers before deciding to buy a product or service. Several websites provide a platform for users to post their feedback for the provider and potential customers. However, the biggest challenge in analyzing such data lies in extracting latent features and providing term-level analysis of the data. This paper proposes an approach that uses topic modeling to classify the reviews into topics and sentiment analysis to mine the opinions. The approach can analyze and classify latent topics mentioned by reviewers on business sites, review sites, or social media, using topic modeling to identify the importance of each topic. This is followed by sentiment analysis to assess the satisfaction level for each topic. The approach provides a classification of hotel reviews using multiple machine learning techniques and compares different classifiers to mine the opinions in user reviews through sentiment analysis. The experiment concludes that the Multinomial Naïve Bayes classifier produces higher accuracy than the other classifiers.

Keywords: latent Dirichlet allocation, topic modeling, text classification, sentiment analysis

22789 Modelling Fluoride Pollution of Groundwater Using Artificial Neural Network in the Western Parts of Jharkhand

Authors: Neeta Kumari, Gopal Pathak

Abstract:

Artificial neural networks have proved to be an efficient tool for non-parametric modeling of data in various applications where the output is non-linearly associated with the input. They are a preferred tool for many predictive data mining applications because of their power, flexibility, and ease of use. A standard feed-forward network (FFN) is used to predict the groundwater fluoride content. The ANN model is trained using the backpropagation algorithm with Tansig and Logsig activation functions and a varying number of neurons. The models are evaluated on the basis of statistical performance criteria such as root mean squared error (RMSE), regression coefficient (R²), bias (mean error), coefficient of variation (CV), Nash-Sutcliffe efficiency (NSE), and the index of agreement (IOA). The results of the study indicate that an artificial neural network (ANN) can be used for groundwater fluoride prediction with sufficiently good accuracy in limited-data situations in hard rock regions such as the western parts of Jharkhand.
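Several of the performance criteria listed above have short closed forms, sketched below in plain Python. The definitions follow the commonly used formulas (NSE as one minus the ratio of squared error to the variance of the observations, IOA in Willmott's form); whether the paper used exactly these variants is an assumption, and the sample values are made up for illustration.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mean_error(observed, predicted):
    """Bias: average of (predicted - observed)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def nash_sutcliffe(observed, predicted):
    """NSE = 1 - sum((o - p)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / spread

def index_of_agreement(observed, predicted):
    """Willmott's IOA = 1 - sum((o - p)^2) / sum((|p - m| + |o - m|)^2)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - sse / potential

# Hypothetical fluoride concentrations in mg/L (not data from the study).
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(rmse(obs, pred), mean_error(obs, pred),
      nash_sutcliffe(obs, pred), index_of_agreement(obs, pred))
```

An NSE of 1 indicates a perfect fit, while an NSE of 0 means the model does no better than predicting the mean of the observations.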

Keywords: Artificial neural network (ANN), FFN (Feed-forward network), backpropagation algorithm, Levenberg-Marquardt algorithm, groundwater fluoride contamination

22788 Translanguaging in Preschools: New Evidence from Polish-English Bilingual Children

Authors: Judyta Pawliszko

Abstract:

The study draws on the theoretical framework of translanguaging. It investigates translanguaging patterns and how meaning-making processes among bilingual children in preschool are affected by using two different languages. Eight months of observation and 200 hours of vocal recordings of children (3-6 years old) provide data on bilingual children’s linguistic repertoire, why children translanguage, and how they achieve understanding through the strategic use of the two languages. The data gathered point to translanguaging as a practice that maximizes meaning-making processes among preschool bilingual children.

Keywords: translanguaging, bilingualism, preschool, Polish-English bilingual children

22787 Towards a Framework for Embedded Weight Comparison Algorithm with Business Intelligence in the Plantation Domain

Authors: M. Pushparani, A. Sagaya

Abstract:

Embedded systems have emerged as important elements in various domains, with extensive applications in automotive, commercial, consumer, healthcare and transportation markets, as there is an emphasis on intelligent devices. Business Intelligence (BI) has also been extensively used in a range of applications, especially in the agriculture domain, which is the area of this research. The aim of this research is to create a framework for an Embedded Weight Comparison Algorithm with Business Intelligence (EWCA-BI). The weight comparison algorithm will be embedded within the plantation management system and the weighbridge system. The algorithm will be used to estimate the weight at the site, which will be compared with the actual weight at the plantation. It will also be used to raise the necessary alerts when there is a discrepancy in the weight, thus enabling better decision making. In current practice, data are collected from various locations in various forms, and it is a challenge to consolidate them to obtain timely and accurate information for effective decision making. In addition, unstable network connections make it difficult to get timely, accurate information. To overcome these challenges, the weight comparison algorithm is embedded on a portable device that also assists in data capture and synchronizes data across locations, overcoming the network shortcomings at collection points. The EWCA-BI will provide real-time information at any given point in time, thus enabling non-latent BI reports that provide crucial information for efficient operational decision making. This research has high potential for bringing embedded systems into the agriculture industry. EWCA-BI will provide BI reports with accurate information from uncompromised data using an embedded system and will provide alerts, thereby enabling effective operation management decision-making at the site.
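The comparison-and-alert step described above could look roughly like the following Python sketch. The record fields, the relative-difference rule, and the 5% tolerance are all hypothetical choices made for illustration; the abstract does not specify how EWCA-BI measures a discrepancy or when it raises an alert.

```python
from dataclasses import dataclass

@dataclass
class WeightRecord:
    load_id: str         # hypothetical identifier for one load
    estimated_kg: float  # weight estimated by the embedded algorithm
    actual_kg: float     # weight actually measured

def discrepancy_ratio(record):
    """Relative difference between estimated and actual weight."""
    if record.actual_kg <= 0:
        raise ValueError("actual weight must be positive")
    return abs(record.estimated_kg - record.actual_kg) / record.actual_kg

def build_alerts(records, tolerance=0.05):
    """Return the IDs of loads whose discrepancy exceeds the tolerance.

    The 5% default is an assumed threshold, not a value from the paper.
    """
    return [r.load_id for r in records if discrepancy_ratio(r) > tolerance]

# Hypothetical loads (not data from the study).
loads = [
    WeightRecord("T-001", estimated_kg=1000.0, actual_kg=1020.0),
    WeightRecord("T-002", estimated_kg=900.0, actual_kg=1000.0),
]
alerts = build_alerts(loads)
print(alerts)
```

Because the check needs only the two weights for a load, it can run on a portable device without a network connection and synchronize the alerts later.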

Keywords: embedded business intelligence, weight comparison algorithm, oil palm plantation, embedded systems

22786 R Statistical Software Applied in Reliability Analysis: Case Study of Diesel Generator Fans

Authors: Jelena Vucicevic

Abstract:

Reliability analysis is a very important task in many areas of work. In any industry, it is crucial for maintenance, efficiency, safety and monetary costs. There are several ways to calculate reliability, unreliability, failure density and failure rate. This paper introduces another way of calculating reliability, using R statistical software. R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS. The R programming environment is a widely used open-source system for statistical analysis and statistical programming. It includes thousands of functions for the implementation of both standard and new statistical methods, and it does not limit the user to operations related to these functions. R has many benefits over similar programs: it is free and, being open source, constantly updated; it has a built-in help system; and the R language is easy to extend with user-written functions. The significance of the work is the calculation of time to failure, or reliability, in a new way, using statistics. Another advantage of this calculation is that it requires no technical details and can be applied to any part for which the time to failure must be known in order to schedule appropriate maintenance, maximize usage and minimize costs. In this case, calculations were made for diesel generator fans, but the same principle can be applied to any other part. The data for this paper came from a field engineering study of the time to failure of diesel generator fans. The ultimate goal was to decide whether or not to replace the working fans with higher-quality fans to prevent future failures. Seventy generators were studied. For each one, the number of hours of running time from its first being put into service until fan failure or until the end of the study (whichever came first) was recorded. The dataset consists of two variables: hours and status. Hours gives the running time of each fan, and status gives the event: 1 for failed, 0 for censored. Censored data represent cases that could not be tracked to failure, so the fan may have failed or survived. Obtaining the results with R was easy and quick. The program takes censored data into consideration and includes them in the results, which is not so easy in hand calculation. For the purposes of the paper, results from R were compared to hand calculations in two different cases: censored data taken as failures and censored data taken as successes. In all three cases, the results are significantly different. If the user decides to use R for further calculations, it will give more precise results when working with censored data than hand calculation.
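To make the treatment of censored observations concrete, here is a minimal sketch of a Kaplan-Meier reliability estimate on an hours/status dataset of the shape described above. It is written in Python rather than R, the choice of the Kaplan-Meier estimator is an assumption (the abstract does not name the method applied in R), and the four-fan dataset is a toy example, not the study's data.

```python
def kaplan_meier(hours, status):
    """Kaplan-Meier reliability estimate.

    hours  : running time of each unit
    status : 1 if the unit failed at that time, 0 if it was censored
    Returns a list of (time, reliability) pairs, one per distinct failure time.
    """
    data = sorted(zip(hours, status))
    n_at_risk = len(data)
    reliability = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Group all units that leave observation at the same time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            removed += 1
            i += 1
        if failures > 0:
            reliability *= 1.0 - failures / n_at_risk
            curve.append((t, reliability))
        # Censored units leave the risk set without counting as failures.
        n_at_risk -= removed
    return curve

# Toy dataset (hypothetical): four fans, one censored at 2 hours.
toy_hours = [1, 2, 2, 3]
toy_status = [1, 0, 1, 1]
km_curve = kaplan_meier(toy_hours, toy_status)
print(km_curve)

# Hand-calculation alternative mentioned in the abstract: censored = failure.
as_failures = kaplan_meier(toy_hours, [1, 1, 1, 1])
print(as_failures)
```

Comparing the two printed curves shows how relabelling censored units as failures lowers the estimated reliability at the time of censoring.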

Keywords: censored data, R statistical software, reliability analysis, time to failure

22785 Distributed Automation System Based Remote Monitoring of Power Quality Disturbance on LV Network

Authors: Emmanuel D. Buedi, K. O. Boateng, Griffith S. Klogo

Abstract:

Electrical distribution networks are prone to power quality (PQ) disturbances originating from the complexity of the distribution network, the mode of distribution (overhead or underground) and the types of loads used by customers. Data on the types of disturbances present and their frequency of occurrence are needed for economic evaluation and hence for finding a solution to the problem. Utility companies have resorted to using secondary power quality devices such as smart meters to help gather the required data. Even though this approach is easier to adopt, data gathered from these devices may not serve the required purpose, since the installation of these devices in the electrical network usually does not conform to available power quality monitor (PQM) placement methods. This paper presents a design of a PQM that is capable of integrating into an existing distributed automation system (DAS) infrastructure to take advantage of available placement methodologies. The monitoring component of the design is implemented and installed to monitor an existing LV network. Data from the monitor are analyzed and presented. A portion of the LV network of the Electricity Company of Ghana is modeled in MATLAB-Simulink and analyzed under various earth fault conditions. The results presented show the ability of the PQM to detect and analyze PQ disturbances such as voltage sag and overvoltage. By adopting a placement methodology and installing these nodes, utilities are assured of accurate and reliable information with respect to the quality of power delivered to consumers.

Keywords: power quality, remote monitoring, distributed automation system, economic evaluation, LV network

22784 Doing Cause-and-Effect Analysis Using an Innovative Chat-Based Focus Group Method

Authors: Timothy Whitehill

Abstract:

This paper presents an innovative chat-based focus group method for collecting qualitative data to construct a cause-and-effect analysis in business research. The method was developed in response to the research and data collection challenges posed by the Covid-19 outbreak in the United Kingdom during 2020-21. This paper discusses the methodological approaches and builds a contemporary argument for the method's effectiveness in exploring cause-and-effect relationships in the context of focus group research, systems thinking and problem structuring methods. The pilot for this method was conducted between October 2020 and March 2021 and collected more than 7,000 words of chat-based data, which were used to construct a consensus-drawn cause-and-effect analysis. The method was developed in support of an ongoing Doctorate in Business Administration (DBA) thesis, which uses Design Science Research methodology to operationalize organisational resilience in UK construction sector firms.

Keywords: cause-and-effect analysis, focus group research, problem structuring methods, qualitative research, systems thinking

22783 Performance Evaluation of Grid Connected Photovoltaic System

Authors: Abdulkadir Magaji

Abstract:

This study analyzes and compares the measured and simulated performance of a 3.2 kWp grid-connected photovoltaic system. The system is located at the Outdoor Facility of Government Day Secondary School, Katsina State, at approximately 12°15′N 7°30′E. The system consists of 14 monocrystalline silicon modules connected in two strings of 7 series-connected modules, each facing north at a fixed tilt of 34°. The data presented in this study were measured in the year 2015, when the system supplied a total of 4628 kWh to the local electric utility grid. The performance of the system was simulated with PVsyst software using measured and Meteonorm-derived climate data sets (solar radiation, ambient temperature and wind speed). The comparison between measured and simulated energy yield is discussed. Although both simulation results were similar, better agreement between measured and predicted monthly energy yield is observed with the simulation performed using weather data measured at the site. The measured performance ratio in the present study, 58.4%, is higher than those reported elsewhere, as compared in the study.
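The yield figures quoted above can be related through the standard definitions used in PV performance monitoring: final yield is energy delivered per unit of rated power, reference yield is in-plane irradiation divided by the 1 kW/m² reference irradiance, and the performance ratio is their quotient. The Python sketch below applies the first of these to the reported 4628 kWh and 3.2 kWp. The function names are illustrative, and the in-plane irradiation is left as an input because the abstract does not report it.

```python
def final_yield(energy_kwh, rated_power_kwp):
    """Final yield in kWh per kWp: energy delivered per unit of rated power."""
    return energy_kwh / rated_power_kwp

def reference_yield(irradiation_kwh_m2, g_stc_kw_m2=1.0):
    """Reference yield in hours: in-plane irradiation over reference irradiance."""
    return irradiation_kwh_m2 / g_stc_kw_m2

def performance_ratio(energy_kwh, rated_power_kwp, irradiation_kwh_m2):
    """Performance ratio = final yield / reference yield (dimensionless)."""
    return final_yield(energy_kwh, rated_power_kwp) / reference_yield(irradiation_kwh_m2)

# Values taken from the abstract: 4628 kWh delivered in 2015 by a 3.2 kWp array.
annual_final_yield = final_yield(4628.0, 3.2)
print(f"Final yield: {annual_final_yield:.2f} kWh/kWp")

# performance_ratio(...) additionally needs the measured in-plane irradiation
# for the year, which is not given in the abstract.
```

With the site's measured irradiation supplied, `performance_ratio` would return the quantity that the abstract reports as 58.4%.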

Keywords: performance, evaluation, grid connection, photovoltaic system

22782 Comparative Study of the Earth Land Surface Temperature Signatures over Ota, South-West Nigeria

Authors: Moses E. Emetere, M. L. Akinyemi

Abstract:

Agricultural activities in South-West Nigeria are affected by the global increase in temperature. The unpredictable surface temperature of the area has increased health challenges, among other social effects. Satellite data on surface temperature were compared with data from a ground-based Davis weather station. The differential heating of the lower atmosphere was represented mathematically. A numerical predictive model was propounded to forecast future surface temperature.

Keywords: numerical predictive model, surface temperature, satellite data, ground data

22781 Advanced Magnetic Field Mapping Utilizing Vertically Integrated Deployment Platforms

Authors: John E. Foley, Martin Miele, Raul Fonda, Jon Jacobson

Abstract:

This paper presents the development and implementation of new and innovative data collection and analysis methodologies based on the deployment of total field magnetometer arrays. Our research has focused on the development of a vertically integrated suite of platforms, all utilizing common data acquisition, data processing and analysis tools. These survey platforms include low-altitude helicopters and ground-based vehicles, including robots, for terrestrial mapping applications. For marine settings, the sensor arrays are deployed from either a hydrodynamic bottom-following wing towed from a surface vessel or from a towed floating platform for shallow-water settings. Additionally, sensor arrays are deployed from tethered remotely operated vehicles (ROVs) for underwater settings where high maneuverability is required. While the primary application of these systems is the detection and mapping of unexploded ordnance (UXO), these systems are also used for various infrastructure mapping and geologic investigations. For each application, success is driven by the integration of magnetometer arrays, accurate geo-positioning, system noise mitigation, and stable deployment of the system in appropriate proximity to expected targets or features. Each of the systems collects geo-registered data compatible with a web-enabled data management system providing immediate access to data and metadata for remote processing, analysis and delivery of results. This approach allows highly sophisticated magnetic processing methods, including classification based on dipole modeling and remanent magnetization, to be efficiently applied to many projects. This paper also briefly describes the initial development of magnetometer-based detection systems deployed from low-altitude helicopter platforms and the subsequent successful transition of this technology to the marine environment. Additionally, we present examples from a range of terrestrial and marine settings as well as ongoing research efforts related to sensor miniaturization for unmanned aerial vehicle (UAV) magnetic field mapping applications.

Keywords: dipole modeling, magnetometer mapping systems, sub-surface infrastructure mapping, unexploded ordnance detection

22780 New NIR System for Detecting the Internal Disorder and Quality of Apple Fruit

Authors: Eid Alharbi, Yaser Miaji

Abstract:

Fruit quality and freshness are important in today’s life. Recent studies show that an automatic online sorting system for fresh apple fruit, based on internal disorder, has been developed using near-infrared (NIR) spectroscopic technology. An automatic conveyor belt system with a sorting mechanism was constructed. To check the internal quality of the apple fruit, each apple was exposed to NIR radiation in the range 650-1300 nm, and the data were collected in the form of absorption spectra. The collected data were compared to the reference (data from a known sample) and analyzed, and an electronic signal was passed to the sorting system. The sorting system separated the apple fruit samples according to the electronic signal it received. It was found that absorption of NIR radiation in the range 930-950 nm was higher in the internally defective samples than in healthy samples. The online sorting system was constructed on the basis of this high absorption of NIR radiation in the 930-950 nm region.

Keywords: mechatronics design, NIR, fruit quality, spectroscopic technology

22779 Novel NIR System for Detection of Internal Disorder and Quality of Apple Fruit

Authors: Eid Alharbi, Yaser Miaji

Abstract:

Fruit quality and freshness are important in today’s life. Recent studies show that an automatic online sorting system for fresh apple fruit, based on internal disorder, has been developed using near-infrared (NIR) spectroscopic technology. An automatic conveyor belt system with a sorting mechanism was constructed. To check the internal quality of the apple fruit, each apple was exposed to NIR radiation in the range 650-1300 nm, and the data were collected in the form of absorption spectra. The collected data were compared to the reference (data from a known sample) and analyzed, and an electronic signal was passed to the sorting system. The sorting system separated the apple fruit samples according to the electronic signal it received. It was found that absorption of NIR radiation in the range 930-950 nm was higher in the internally defective samples than in healthy samples. The online sorting system was constructed on the basis of this high absorption of NIR radiation in the 930-950 nm region.

Keywords: mechatronics design, NIR, fruit quality, spectroscopic technology

Procedia PDF Downloads 386
22778 Active Features Determination: A Unified Framework

Authors: Meenal Badki

Abstract:

We address the issue of active feature determination, where the objective is to determine the set of examples on which additional data (such as lab tests) needs to be gathered, given a large number of examples with some features (such as demographics) and some examples with all the features (such as the complete Electronic Health Record). We note that certain features may be more costly, unique, or laborious to gather. Our proposal is a general active learning approach that is independent of classifiers and similarity metrics. It allows us to identify examples that differ from the full data set and obtain all the features for the examples that match. Our comprehensive evaluation shows the efficacy of this approach, which is driven by four authentic clinical tasks.

Keywords: feature determination, classification, active learning, sample-efficiency

Procedia PDF Downloads 75
22777 Charter versus District Schools and Student Achievement: Implications for School Leaders

Authors: Kara Rosenblatt, Kevin Badgett, James Eldridge

Abstract:

There is a preponderance of information regarding the overall effectiveness of charter schools and their ability to increase academic achievement compared to traditional district schools. Most research on the topic focuses on comparing long- and short-term outcomes, academic achievement in mathematics and reading, and locale (i.e., urban vs. rural). While unanswered questions regarding effectiveness continue to loom for school leaders, data on charter schools suggest that enrollment increases by 10% annually and that charter schools educate more than 2 million U.S. students across 40 states each year. Given the increasing share of U.S. students educated in charter schools, it is important to better understand possible differences in student achievement, defined in multiple ways, between students in charter schools and those in Independent School District (ISD) settings in the state of Texas. Data were retrieved from the Texas Education Agency’s (TEA) repository, which is organized annually and available on the TEA website. Specific data points and definitions of achievement were based on characterizations of achievement found in the relevant literature. Specific data points included, but were not limited to, graduation rate, student performance on standardized testing, and teacher-related factors such as experience and longevity in the district. Initial findings indicate some similarities with the current literature on long-term student achievement in English/Language Arts; however, the findings differ substantially from other recent research related to long-term student achievement in social studies. There are also a number of interesting findings related to differences in achievement between students in charters and ISDs and within different types of charter schools in Texas. In addition to findings, implications for leadership in different settings will be explored.

Keywords: charter schools, ISDs, student achievement, implications for PK-12 school leadership

Procedia PDF Downloads 128
22776 Next-Generation Laser-Based Transponder and 3D Switch for Free Space Optics in Nanosatellite

Authors: Nadir Atayev, Mehman Hasanov

Abstract:

Future spacecraft will require a structural change in the way data is transmitted due to the increase in the volume of data required for space communication. Current radio frequency (RF) communication systems already face a bottleneck in the volume of data sent to the ground segment due to their technological and regulatory characteristics. To overcome these issues, free space optics (FSO) communication plays an important role in the integrated terrestrial-space network. Its advantages include a significantly improved data rate compared to traditional RF technology, low cost, improved security, and inter-satellite free space communication; it uses a laser beam as the optical signal carrier to establish satellite-to-ground and ground-to-satellite links. This approach requires high-speed and energy-efficient systems as a base platform for sending high-volume video and audio data. Nanosatellite platforms, and CubeSat platforms in particular, have more technical functionality than large satellites and cover an important part of the space sector; their Low Earth Orbit applications combine low-cost design with the technical functionality needed to build networks using different communication topologies. Within this research theme, the output parameters of the FSO transceiver subsystem on existing CubeSat platforms were studied. To improve these parameters, a 3D optical switch and a laser-beam-controlled optical transponder with 2U CubeSat structural subsystems were also studied, together with their application in a Low Earth Orbit satellite network topology and their functional performance and structural parameters.

Keywords: CubeSat, free space optics, nano satellite, optical laser communication

Procedia PDF Downloads 89
22775 Cloud-Based Multiresolution Geodata Cube for Efficient Raster Data Visualization and Analysis

Authors: Lassi Lehto, Jaakko Kahkonen, Juha Oksanen, Tapani Sarjakoski

Abstract:

The use of raster-formatted data sets in geospatial analysis is increasing rapidly. At the same time, geographic data are being introduced into disciplines outside the traditional domain of geoinformatics, such as climate change, intelligent transport, and immigration studies. These developments call for better methods to deliver raster geodata in an efficient and easy-to-use manner. Data cube technologies have traditionally been used in the geospatial domain for managing Earth Observation data sets that have strict requirements for effective handling of time series. The same approach and methodologies can also be applied in managing other types of geospatial data sets. A cloud service-based geodata cube, called GeoCubes Finland, has been developed to support online delivery and analysis of the most important geospatial data sets with national coverage. The main target group of the service is the academic research institutes in the country. The most significant aspects of the GeoCubes data repository include the use of multiple resolution levels, a cloud-optimized file structure, and a customized, flexible content access API. Input data sets are pre-processed while being ingested into the repository to bring them into a harmonized form in aspects such as georeferencing, sampling resolutions, spatial subdivision, and value encoding. All the resolution levels are created using an appropriate generalization method, selected depending on the nature of the source data set. Multiple pre-processed resolutions enable new kinds of online analysis approaches to be introduced. Analysis processes based on interactive visual exploration can be carried out effectively, as the resolution level closest to the visual scale can always be used. In the same way, statistical analysis can be carried out on resolution levels that best reflect the scale of the phenomenon being studied. Access times remain close to constant, independent of the scale applied in the application.
The cloud service-based approach applied in the GeoCubes Finland repository enables analysis operations to be performed on the server platform, thus making high-performance computing facilities easily accessible. The developed GeoCubes API supports this kind of approach for online analysis. The use of cloud-optimized file structures in data storage enables the fast extraction of subareas. The access API allows vector-formatted administrative areas and user-defined polygons to be used as definitions of subareas for data retrieval. Administrative areas of the country at four levels are readily available from the GeoCubes platform. In addition to direct delivery of raster data, the service also supports the so-called virtual file format, in which only a small text file is first downloaded. The text file contains links to the raster content on the service platform. The actual raster data is downloaded on demand, for the spatial area and at the resolution level required at each stage of the application. Through the geodata cube approach, pre-harmonized geospatial data sets are made accessible to new categories of inexperienced users in an easy-to-use manner. At the same time, the multiresolution nature of the GeoCubes repository helps expert users introduce new kinds of interactive online analysis operations.
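
The idea of always using the stored resolution level closest to the visual scale can be sketched in a few lines of Python. This is only an illustration: the list of resolution levels and the function name are hypothetical and are not taken from the GeoCubes API, and comparing on a logarithmic scale is an assumed design choice.

```python
import math

# Hypothetical pre-processed resolution levels, in metres per pixel.
LEVELS_M = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]


def closest_level(target_m, levels_m=LEVELS_M):
    """Return the stored resolution nearest to the requested one.

    Distances are compared on a log scale, so 5 m and 20 m are
    treated as equally far from 10 m.
    """
    if target_m <= 0:
        raise ValueError("target resolution must be positive")
    return min(levels_m, key=lambda level: abs(math.log(level / target_m)))


# A map view where one screen pixel covers about 40 m on the ground.
level_for_view = closest_level(40)
```

Because only a fixed set of levels is ever read, the amount of data fetched per view stays roughly the same at any zoom, which is consistent with the near-constant access times the abstract reports.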

Keywords: cloud service, geodata cube, multiresolution, raster geodata

Procedia PDF Downloads 135
22774 Wind Velocity Climate Zonation Based on Observation Data in Indonesia Using Cluster and Principal Component Analysis

Authors: I Dewa Gede Arya Putra

Abstract:

Principal Component Analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of possibly correlated variables into components that are uncorrelated with each other. This can affect the clustering of wind speed characteristics in Indonesia. This study uses 30 years of daily wind speed observations from the Site Meteorological Station network. Multicollinearity tests were performed on all of these data before clustering with PCA. The results show that the four leading principal components together explain more than 80% of the total variance, and these components were used for clustering. Clustering with Ward's method yielded three clusters. Cluster 1 covers the central part of Sumatra Island, northern Kalimantan, northern Sulawesi, and northern Maluku, where the climatological wind speed pattern has no annual cycle and speeds are weak throughout the year, ranging from 0 to 1.5 m/s. Cluster 2 covers the northern part of Sumatra Island, South Sulawesi, Bali, and northern Papua, where the climatological wind speed pattern has an annual cycle with low speeds ranging from 1 to 3 m/s. Cluster 3 covers the eastern part of Java Island, the Southeast Nusa Islands, and the southern Maluku Islands, where the climatological wind speed pattern has an annual cycle with high speeds ranging from 1 to 4.5 m/s.
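
The workflow described above (PCA, retaining the components that reach 80% of the variance, then Ward clustering into three groups) can be sketched as follows. This is a minimal illustration on synthetic data, assuming NumPy and SciPy are available; the station matrix, the function names, and the 12-column monthly layout are assumptions, not details from the study.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def pca_scores(X, var_target=0.80):
    """Project standardized data onto the fewest principal components
    whose cumulative explained variance reaches var_target."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), var_target)) + 1
    return Z @ Vt[:k].T, explained[:k]


def ward_clusters(scores, n_clusters=3):
    """Cut a Ward-linkage tree into at most n_clusters groups."""
    tree = linkage(scores, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic example: 30 hypothetical stations x 12 monthly mean speeds (m/s),
# drawn around three well-separated base speeds.
rng = np.random.default_rng(0)
X = np.vstack([base + rng.normal(0.0, 0.05, size=(10, 12))
               for base in (1.0, 3.0, 6.0)])

scores, explained = pca_scores(X)
labels = ward_clusters(scores)
```

Clustering on the retained component scores rather than on the raw variables keeps correlated inputs from being counted several times in the Ward distance.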

Keywords: PCA, cluster, Ward's method, wind speed

Procedia PDF Downloads 195
22773 Capturing Public Voices: The Role of Social Media in Heritage Management

Authors: Mahda Foroughi, Bruno de Anderade, Ana Pereira Roders

Abstract:

Social media platforms have been increasingly used by locals and tourists to express their opinions about buildings, cities, and built heritage in particular. Most recently, scholars have been using social media to conduct innovative research on built heritage and heritage management. Still, the application of artificial intelligence (AI) methods to analyze social media data for heritage management is seldom explored. This paper investigates the potential of short texts (sentences and hashtags) shared through social media as a data source, and of artificial intelligence methods for data analysis, for revealing the cultural significance (values and attributes) of built heritage. The city of Yazd, Iran, was taken as a case study, with a particular focus on windcatchers, key attributes conveying outstanding universal values, as inscribed on the UNESCO World Heritage List. This paper has three successive phases: 1) state of the art on the intersection of public participation in heritage management and social media research; 2) methodology of data collection and data analysis related to coding people's voices from Instagram and Twitter into values of windcatchers over the last ten years; 3) preliminary findings on the comparison between opinions of locals and tourists, sentiment analysis, and its association with the values and attributes of windcatchers. Results indicate that the age value is recognized as the most important value by all interest groups, while the political value is the least acknowledged. In addition, negative sentiments (e.g., critiques) are scarcely reflected in social media. Results confirm the potential of social media for heritage management in terms of (de)coding and measuring the cultural significance of built heritage for windcatchers in Yazd. The methodology developed in this paper can be applied to other attributes in Yazd and also to other case studies.

Keywords: social media, artificial intelligence, public participation, cultural significance, heritage, sentiment analysis

Procedia PDF Downloads 115
22772 Hybrid Knowledge Approach for Determining Health Care Provider Specialty from Patient Diagnoses

Authors: Erin Lynne Plettenberg, Jeremy Vickery

Abstract:

In an access-control situation, the role of a user determines whether a data request is appropriate. This paper combines vetted web mining and logic modeling to build a lightweight system for determining the role of a health care provider based only on their prior authorized requests. The model identifies provider roles with 100% recall from very little data. This shows the value of vetted web mining in AI systems, and suggests the impact of the ICD classification on medical practice.

Keywords: electronic medical records, information extraction, logic modeling, ontology, vetted web mining

Procedia PDF Downloads 172
22771 Relationship between Gender and Performance with Respect to a Basic Math Skills Quiz in Statistics Courses in Lebanon

Authors: Hiba Naccache

Abstract:

The present research investigated whether gender differences affect performance on a simple math quiz in a statistics course. Participants in this study comprised a sample of 567 statistics students at two different universities in Lebanon. Data were collected through a simple math quiz. Analysis of the quantitative data indicated that there was no significant difference in math performance between males and females. The results suggest that improvements in student performance may depend on improved mastery of basic algebra, especially for females. The implications of these findings and further recommendations were discussed.

Keywords: gender, education, math, statistics

Procedia PDF Downloads 377
22770 INCIPIT-CRIS: A Research Information System Combining Linked Data Ontologies and Persistent Identifiers

Authors: David Nogueiras Blanco, Amir Alwash, Arnaud Gaudinat, René Schneider

Abstract:

At a time when access to and sharing of information are crucial in the world of research, technologies such as persistent identifiers (PIDs), Current Research Information Systems (CRIS), and ontologies may create platforms for information sharing if they respond to the need to disambiguate their data by assuring interoperability within and between systems. INCIPIT-CRIS is a continuation of the former INCIPIT project, whose goal was to set up an infrastructure for low-cost attribution of PIDs with high granularity based on Archival Resource Keys (ARKs). INCIPIT-CRIS can be interpreted as a logical consequence of that project and proposes a research information management system developed from scratch. The system has been created on and around the Schema.org ontology, with a further articulation of the use of ARKs. It is thus built upon the previously implemented infrastructure (i.e., INCIPIT) in order to enhance the persistence of URIs. As a consequence, INCIPIT-CRIS aims to be the hinge between previously separate aspects such as CRIS, ontologies, and PIDs. The result is intended to be a powerful system that resolves disambiguation problems by combining an ontology such as Schema.org with unique persistent identifiers such as ARKs, that allows information to be shared through a dedicated platform, and that is interoperable because the entirety of the data is represented as RDF triples. This paper aims to present the implemented solution as well as its simulation in real life. We will describe the underlying ideas and inspirations while going through the logic and the different functionalities implemented and their links with ARKs and Schema.org. Finally, we will discuss the tests performed with our project partner, the Swiss Institute of Bioinformatics (SIB), using large, real-world data sets.
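
To make the combination of ARKs, Schema.org, and RDF triples concrete, the following Python sketch builds a few triples and serializes them as N-Triples. It is an illustration only and does not reproduce the INCIPIT-CRIS data model: the resolver base URL, the identifier values, and the helper names are hypothetical, while the Schema.org terms used (`Person`, `ScholarlyArticle`, `name`, `author`) are existing vocabulary terms.

```python
SCHEMA = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ARK_BASE = "https://example.org/"  # hypothetical resolver, for illustration


def ark_uri(naan, name):
    """Build a resolvable URI of the form <base>ark:/<NAAN>/<name>."""
    return f"{ARK_BASE}ark:/{naan}/{name}"


def iri(uri):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{uri}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical identifiers; 99999 is used here as a placeholder NAAN.
person = ark_uri("99999", "person1")
article = ark_uri("99999", "article1")

triples = [
    (iri(person), iri(RDF_TYPE), iri(SCHEMA + "Person")),
    (iri(person), iri(SCHEMA + "name"), literal("Jane Doe")),
    (iri(article), iri(RDF_TYPE), iri(SCHEMA + "ScholarlyArticle")),
    (iri(article), iri(SCHEMA + "author"), iri(person)),
]

document = to_ntriples(triples)
```

Because the article refers to its author by the ARK-based URI rather than by a name string, two researchers with the same name remain distinct, which is the disambiguation role the abstract assigns to persistent identifiers.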

Keywords: current research information systems, linked data, ontologies, persistent identifier, schema.org, semantic web

Procedia PDF Downloads 135
22769 MIMIC: A Multi Input Micro-Influencers Classifier

Authors: Simone Leonardi, Luca Ardito

Abstract:

Micro-influencers are effective elements in the marketing strategies of companies and institutions because of their capability to create a hyper-engaged audience around a specific topic of interest. In recent years, many scientific approaches and commercial tools have handled the task of detecting this type of social media user. These strategies adopt solutions ranging from rule-based machine learning models to deep neural networks and graph analysis on text, images, and account information. This work compares the existing solutions and proposes an ensemble method to generalize them with different input data and social media platforms. The deployed solution combines deep learning models on unstructured data with statistical machine learning models on structured data. We retrieve both social media account information and multimedia posts on Twitter and Instagram. These data are mapped into feature vectors for an eXtreme Gradient Boosting (XGBoost) classifier. Sixty different topics have been analyzed to build a rule-based gold standard dataset and to compare the performance of our approach against baseline classifiers. We demonstrate the effectiveness of our work by comparing the accuracy, precision, recall, and F1 score of our model with different configurations and architectures. We obtained an accuracy of 0.91 with our best-performing model.
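
For reference, the four evaluation metrics named in this abstract can be computed from binary predictions as in the short Python sketch below. The labels and predictions are made-up toy values and do not reproduce the paper's results; the function name is an assumption.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example: 1 = micro-influencer, 0 = other account.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters when micro-influencers are a minority of accounts, since accuracy alone can look high for a classifier that rarely predicts the positive class.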

Keywords: deep learning, gradient boosting, image processing, micro-influencers, NLP, social media

Procedia PDF Downloads 183