Search results for: data standardization
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25250

Search results for: data standardization

23930 Big Data Applications for Transportation Planning

Authors: Antonella Falanga, Armando Cartenì

Abstract:

"Big data" refers to extremely vast and complex sets of data, encompassing extraordinarily large and intricate datasets that require specific tools for meaningful analysis and processing. These datasets can stem from diverse origins like sensors, mobile devices, online transactions, social media platforms, and more. The utilization of big data is pivotal, offering the chance to leverage vast information for substantial advantages across diverse fields, thereby enhancing comprehension, decision-making, efficiency, and fostering innovation in various domains. Big data, distinguished by its remarkable attributes of enormous volume, high velocity, diverse variety, and significant value, represent a transformative force reshaping the industry worldwide. Their pervasive impact continues to unlock new possibilities, driving innovation and advancements in technology, decision-making processes, and societal progress in an increasingly data-centric world. The use of these technologies is becoming more widespread, facilitating and accelerating operations that were once much more complicated. In particular, big data impacts across multiple sectors such as business and commerce, healthcare and science, finance, education, geography, agriculture, media and entertainment and also mobility and logistics. Within the transportation sector, which is the focus of this study, big data applications encompass a wide variety, spanning across optimization in vehicle routing, real-time traffic management and monitoring, logistics efficiency, reduction of travel times and congestion, enhancement of the overall transportation systems, but also mitigation of pollutant emissions contributing to environmental sustainability. Meanwhile, in public administration and the development of smart cities, big data aids in improving public services, urban planning, and decision-making processes, leading to more efficient and sustainable urban environments. Access to vast data reservoirs enables deeper insights, revealing hidden patterns and facilitating more precise and timely decision-making. Additionally, advancements in cloud computing and artificial intelligence (AI) have further amplified the potential of big data, enabling more sophisticated and comprehensive analyses. Certainly, utilizing big data presents various advantages but also entails several challenges regarding data privacy and security, ensuring data quality, managing and storing large volumes of data effectively, integrating data from diverse sources, the need for specialized skills to interpret analysis results, ethical considerations in data use, and evaluating costs against benefits. Addressing these difficulties requires well-structured strategies and policies to balance the benefits of big data with privacy, security, and efficient data management concerns. Building upon these premises, the current research investigates the efficacy and influence of big data by conducting an overview of the primary and recent implementations of big data in transportation systems. Overall, this research allows us to conclude that big data better provide to enhance rational decision-making for mobility choices and is imperative for adeptly planning and allocating investments in transportation infrastructures and services.

Keywords: big data, public transport, sustainable mobility, transport demand, transportation planning

Procedia PDF Downloads 60
23929 Implementing Fault Tolerance with Proxy Signature on the Improvement of RSA System

Authors: H. El-Kamchouchi, Heba Gaber, Fatma Ahmed, Dalia H. El-Kamchouchi

Abstract:

Fault tolerance and data security are two important issues in modern communication systems. During the transmission of data between the sender and receiver, errors may occur frequently. Therefore, the sender must re-transmit the data to the receiver in order to correct these errors, which makes the system very feeble. To improve the scalability of the scheme, we present a proxy signature scheme with fault tolerance over an efficient and secure authenticated key agreement protocol based on the improved RSA system. Authenticated key agreement protocols have an important role in building a secure communications network between the two parties.

Keywords: fault tolerance, improved RSA, key agreement, proxy signature

Procedia PDF Downloads 425
23928 The Necessity to Standardize Procedures of Providing Engineering Geological Data for Designing Road and Railway Tunneling Projects

Authors: Atefeh Saljooghi Khoshkar, Jafar Hassanpour

Abstract:

One of the main problems of the design stage relating to many tunneling projects is the lack of an appropriate standard for the provision of engineering geological data in a predefined format. In particular, this is more reflected in highway and railroad tunnel projects in which there is a number of tunnels and different professional teams involved. In this regard, comprehensive software needs to be designed using the accepted methods in order to help engineering geologists to prepare standard reports, which contain sufficient input data for the design stage. Regarding this necessity, applied software has been designed using macro capabilities and Visual Basic programming language (VBA) through Microsoft Excel. In this software, all of the engineering geological input data, which are required for designing different parts of tunnels, such as discontinuities properties, rock mass strength parameters, rock mass classification systems, boreability classification, the penetration rate, and so forth, can be calculated and reported in a standard format.

Keywords: engineering geology, rock mass classification, rock mechanic, tunnel

Procedia PDF Downloads 80
23927 The Underestimate of the Annual Maximum Rainfall Depths Due to Coarse Time Resolution Data

Authors: Renato Morbidelli, Carla Saltalippi, Alessia Flammini, Tommaso Picciafuoco, Corrado Corradini

Abstract:

A considerable part of rainfall data to be used in the hydrological practice is available in aggregated form within constant time intervals. This can produce undesirable effects, like the underestimate of the annual maximum rainfall depth, Hd, associated with a given duration, d, that is the basic quantity in the development of rainfall depth-duration-frequency relationships and in determining if climate change is producing effects on extreme event intensities and frequencies. The errors in the evaluation of Hd from data characterized by a coarse temporal aggregation, ta, and a procedure to reduce the non-homogeneity of the Hd series are here investigated. Our results indicate that: 1) in the worst conditions, for d=ta, the estimation of a single Hd value can be affected by an underestimation error up to 50%, while the average underestimation error for a series with at least 15-20 Hd values, is less than or equal to 16.7%; 2) the underestimation error values follow an exponential probability density function; 3) each very long time series of Hd contains many underestimated values; 4) relationships between the non-dimensional ratio ta/d and the average underestimate of Hd, derived from continuous rainfall data observed in many stations of Central Italy, may overcome this issue; 5) these equations should allow to improve the Hd estimates and the associated depth-duration-frequency curves at least in areas with similar climatic conditions.

Keywords: central Italy, extreme events, rainfall data, underestimation errors

Procedia PDF Downloads 191
23926 Objective Evaluation on Medical Image Compression Using Wavelet Transformation

Authors: Amhimmid Mohammed Saffour, Mustafa Mohamed Abdullah

Abstract:

The use of computers for handling image data in the healthcare is growing. However, the amount of data produced by modern image generating techniques is vast. This data might be a problem from a storage point of view or when the data is sent over a network. This paper using wavelet transform technique for medical images compression. MATLAB program, are designed to evaluate medical images storage and transmission time problem at Sebha Medical Center Libya. In this paper, three different Computed Tomography images which are abdomen, brain and chest have been selected and compressed using wavelet transform. Objective evaluation has been performed to measure the quality of the compressed images. For this evaluation, the results show that the Peak Signal to Noise Ratio (PSNR) which indicates the quality of the compressed image is ranging from (25.89db to 34.35db for abdomen images, 23.26db to 33.3db for brain images and 25.5db to 36.11db for chest images. These values shows that the compression ratio is nearly to 30:1 is acceptable.

Keywords: medical image, Matlab, image compression, wavelet's, objective evaluation

Procedia PDF Downloads 285
23925 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

The quality of a product to be usable has become the basic requirement in consumer’s perspective while failing the requirement ends up the customer from not using the product. Identifying usability issues from analyzing quantitative and qualitative data collected from usability testing and evaluation activities aids in the process of product design, yet the lack of studies and researches regarding analysis methodologies in qualitative text data of usability field inhibits the potential of these data for more useful applications. While the possibility of analyzing qualitative text data found with the rapid development of data analysis studies such as natural language processing field in understanding human language in computer, and machine learning field in providing predictive model and clustering tool. Therefore, this research aims to study the application capability of text processing algorithm in analysis of qualitative text data collected from usability activities. This research utilized datasets collected from LG neckband headset usability experiment in which the datasets consist of headset survey text data, subject’s data and product physical data. In the analysis procedure, which integrated with the text-processing algorithm, the process includes training of comments onto vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of comment vector clustering. The result shows 'volume and music control button' as the usability feature that matches best with the cluster of comment vectors where centroid comments of a cluster emphasized more on button positions, while centroid comments of the other cluster emphasized more on button interface issues. When volume and music control buttons are designed separately, the participant experienced less confusion, and thus, the comments mentioned only about the buttons' positions. While in the situation where the volume and music control buttons are designed as a single button, the participants experienced interface issues regarding the buttons such as operating methods of functions and confusion of functions' buttons. The relevance of the cluster centroid comments with the extracted feature explained the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.

Keywords: usability, qualitative data, text-processing algorithm, natural language processing

Procedia PDF Downloads 285
23924 Differentiation between Different Rangeland Sites Using Principal Component Analysis in Semi-Arid Areas of Sudan

Authors: Nancy Ibrahim Abdalla, Abdelaziz Karamalla Gaiballa

Abstract:

Rangelands in semi-arid areas provide a good source for feeding huge numbers of animals and serving environmental, economic and social importance; therefore, these areas are considered economically very important for the pastoral sector in Sudan. This paper investigates the means of differentiating between different rangelands sites according to soil types using principal component analysis to assist in monitoring and assessment purposes. Three rangeland sites were identified in the study area as flat sandy sites, sand dune site, and hard clay site. Principal component analysis (PCA) was used to reduce the number of factors needed to distinguish between rangeland sites and produce a new set of data including the most useful spectral information to run satellite image processing. It was performed using selected types of data (two vegetation indices, topographic data and vegetation surface reflectance within the three bands of MODIS data). Analysis with PCA indicated that there is a relatively high correspondence between vegetation and soil of the total variance in the data set. The results showed that the use of the principal component analysis (PCA) with the selected variables showed a high difference, reflected in the variance and eigenvalues and it can be used for differentiation between different range sites.

Keywords: principal component analysis, PCA, rangeland sites, semi-arid areas, soil types

Procedia PDF Downloads 186
23923 The Use of Optical-Radar Remotely-Sensed Data for Characterizing Geomorphic, Structural and Hydrologic Features and Modeling Groundwater Prospective Zones in Arid Zones

Authors: Mohamed Abdelkareem

Abstract:

Remote sensing data contributed on predicting the prospective areas of water resources. Integration of microwave and multispectral data along with climatic, hydrologic, and geological data has been used here. In this article, Sentinel-2, Landsat-8 Operational Land Imager (OLI), Shuttle Radar Topography Mission (SRTM), Tropical Rainfall Measuring Mission (TRMM), and Advanced Land Observing Satellite (ALOS) Phased Array Type L‐band Synthetic Aperture Radar (PALSAR) data were utilized to identify the geological, hydrologic and structural features of Wadi Asyuti which represents a defunct tributary of the Nile basin, in the eastern Sahara. The image transformation of Sentinel-2 and Landsat-8 data allowed characterizing the different varieties of rock units. Integration of microwave remotely-sensed data and GIS techniques provided information on physical characteristics of catchments and rainfall zones that are of a crucial role for mapping groundwater prospective zones. A fused Landsat-8 OLI and ALOS/PALSAR data improved the structural elements that difficult to reveal using optical data. Lineament extraction and interpretation indicated that the area is clearly shaped by the NE-SW graben that is cut by NW-SE trend. Such structures allowed the accumulation of thick sediments in the downstream area. Processing of recent OLI data acquired on March 15, 2014, verified the flood potential maps and offered the opportunity to extract the extent of the flooding zone of the recent flash flood event (March 9, 2014), as well as revealed infiltration characteristics. Several layers including geology, slope, topography, drainage density, lineament density, soil characteristics, rainfall, and morphometric characteristics were combined after assigning a weight for each using a GIS-based knowledge-driven approach. The results revealed that the predicted groundwater potential zones (GPZs) can be arranged into six distinctive groups, depending on their probability for groundwater, namely very low, low, moderate, high very, high, and excellent. Field and well data validated the delineated zones.

Keywords: GIS, remote sensing, groundwater, Egypt

Procedia PDF Downloads 98
23922 Intelligent Production Machine

Authors: A. Şahinoğlu, R. Gürbüz, A. Güllü, M. Karhan

Abstract:

This study in production machines, it is aimed that machine will automatically perceive cutting data and alter cutting parameters. The two most important parameters have to be checked in machine control unit are progress feed rate and speeds. These parameters are aimed to be controlled by sounds of machine. Optimum sound’s features introduced to computer. During process, real time data is received and converted by Matlab software. Data is converted into numerical values. According to them progress and speeds decreases/increases at a certain rate and thus optimum sound is acquired. Cutting process is made in respect of optimum cutting parameters. During chip remove progress, features of cutting tools, kind of cut material, cutting parameters and used machine; affects on various parameters. Instead of required parameters need to be measured such as temperature, vibration, and tool wear that emerged during cutting process; detailed analysis of the sound emerged during cutting process will provide detection of various data that included in the cutting process by the much more easy and economic way. The relation between cutting parameters and sound is being identified.

Keywords: cutting process, sound processing, intelligent late, sound analysis

Procedia PDF Downloads 334
23921 The Effectiveness and Accuracy of the Schulte Holt IOL Toric Calculator Processor in Comparison to Manually Input Data into the Barrett Toric IOL Calculator

Authors: Gabrielle Holt

Abstract:

This paper is looking to prove the efficacy of the Schulte Holt IOL Toric Calculator Processor (Schulte Holt ITCP). It has been completed using manually inputted data into the Barrett Toric Calculator and comparing the number of minutes taken to complete the Toric calculations, the number of errors identified during completion, and distractions during completion. It will then compare that data to the number of minutes taken for the Schulte Holt ITCP to complete also, using the Barrett method, as well as the number of errors identified in the Schulte Holt ITCP. The data clearly demonstrate a momentous advantage to the Schulte Holt ITCP and notably reduces time spent doing Toric Calculations, as well as reducing the number of errors. With the ever-growing number of cataract surgeries taking place around the world and the waitlists increasing -the Schulte Holt IOL Toric Calculator Processor may well demonstrate a way forward to increase the availability of ophthalmologists and ophthalmic staff while maintaining patient safety.

Keywords: Toric, toric lenses, ophthalmology, cataract surgery, toric calculations, Barrett

Procedia PDF Downloads 93
23920 Change Point Detection Using Random Matrix Theory with Application to Frailty in Elderly Individuals

Authors: Malika Kharouf, Aly Chkeir, Khac Tuan Huynh

Abstract:

Detecting change points in time series data is a challenging problem, especially in scenarios where there is limited prior knowledge regarding the data’s distribution and the nature of the transitions. We present a method designed for detecting changes in the covariance structure of high-dimensional time series data, where the number of variables closely matches the data length. Our objective is to achieve unbiased test statistic estimation under the null hypothesis. We delve into the utilization of Random Matrix Theory to analyze the behavior of our test statistic within a high-dimensional context. Specifically, we illustrate that our test statistic converges pointwise to a normal distribution under the null hypothesis. To assess the effectiveness of our proposed approach, we conduct evaluations on a simulated dataset. Furthermore, we employ our method to examine changes aimed at detecting frailty in the elderly.

Keywords: change point detection, hypothesis tests, random matrix theory, frailty in elderly

Procedia PDF Downloads 52
23919 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: A Research Developed through Data Mining Use

Authors: Isaura Esther Solano Núñez, David Suarez

Abstract:

The main purpose of this research is to discover what causes death in children of the Wayuu community, and deeply analyze those results in order to take corrective measures to properly control infant mortality. We consider important to determine the reasons that are producing early death in this specific type of population, since they are the most vulnerable to high risk environmental conditions. In this way, the government, through competent authorities, may develop prevention policies and the right measures to avoid an increase of this tragic fact. The methodology used to develop this investigation is data mining, which consists in gaining and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, this technique has been very useful to develop this study; it has allowed us to transform large amounts of information into a conclusive and important statement, which has made it easier to take appropriate steps to resolve a particular situation.

Keywords: malnutrition, data mining, analytical, descriptive, population, Wayuu, indigenous

Procedia PDF Downloads 159
23918 Application of the Mobile Phone for Occupational Self-Inspection Program in Small-Scale Industries

Authors: Jia-Sin Li, Ying-Fang Wang, Cheing-Tong Yan

Abstract:

In this study, an integrated approach of Google Spreadsheet and QR code which is free internet resources was used to improve the inspection procedure. The mobile phone Application(App)was also designed to combine with a web page to create an automatic checklist in order to provide a new integrated information of inspection management system. By means of client-server model, the client App is developed for Android mobile OS and the back end is a web server. It can set up App accounts including authorized data and store some checklist documents in the website. The checklist document URL could generate QR code first and then print and paste on the machine. The user can scan the QR code by the app and filled the checklist in the factory. In the meanwhile, the checklist data will send to the server, it not only save the filled data but also executes the related functions and charts. On the other hand, it also enables auditors and supervisors to facilitate the prevention and response to hazards, as well as immediate report data checks. Finally, statistics and professional analysis are performed using inspection records and other relevant data to not only improve the reliability, integrity of inspection operations and equipment loss control, but also increase plant safety and personnel performance. Therefore, it suggested that the traditional paper-based inspection method could be replaced by the APP which promotes the promotion of industrial security and reduces human error.

Keywords: checklist, Google spreadsheet, APP, self-inspection

Procedia PDF Downloads 118
23917 Industry 4.0 and Supply Chain Integration: Case of Tunisian Industrial Companies

Authors: Rym Ghariani, Ghada Soltane, Younes Boujelbene

Abstract:

Industry 4.0, a set of emerging smart and digital technologies, has been the main focus of operations management researchers and practitioners in recent years. The objective of this research paper is to study the impact of Industry 4.0 on the integration of the supply chain (SCI) in Tunisian industrial companies. A conceptual model to study the relationship between Industry 4.0 technologies and supply chain integration was designed. This model contains three explained variables (Big data, Internet of Things, and Robotics) and one variable to be explained (supply chain integration). In order to answer our research questions and investigate the research hypotheses, principal component analysis and discriminant analysis were used using SPSS26 software. The results reveal that there is a statistically positive impact significant impact of Industry 4.0 (Big data, Internet of Things and Robotics) on the integration of the supply chain. Interestingly, big data has a greater positive impact on supply chain integration than the Internet of Things and robotics.

Keywords: industry 4.0 (I4.0), big data, internet of things, robotics, supply chain integration

Procedia PDF Downloads 59
23916 Analysing Competitive Advantage of IoT and Data Analytics in Smart City Context

Authors: Petra Hofmann, Dana Koniel, Jussi Luukkanen, Walter Nieminen, Lea Hannola, Ilkka Donoghue

Abstract:

The Covid-19 pandemic forced people to isolate and become physically less connected. The pandemic has not only reshaped people’s behaviours and needs but also accelerated digital transformation (DT). DT of cities has become an imperative with the outlook of converting them into smart cities in the future. Embedding digital infrastructure and smart city initiatives as part of normal design, construction, and operation of cities provides a unique opportunity to improve the connection between people. The Internet of Things (IoT) is an emerging technology and one of the drivers in DT. It has disrupted many industries by introducing different services and business models, and IoT solutions are being applied in multiple fields, including smart cities. As IoT and data are fundamentally linked together, IoT solutions can only create value if the data generated by the IoT devices is analysed properly. Extracting relevant conclusions and actionable insights by using established techniques, data analytics contributes significantly to the growth and success of IoT applications and investments. Companies must grasp DT and be prepared to redesign their offerings and business models to remain competitive in today’s marketplace. As there are many IoT solutions available today, the amount of data is tremendous. The challenge for companies is to understand what solutions to focus on and how to prioritise and which data to differentiate from the competition. This paper explains how IoT and data analytics can impact competitive advantage and how companies should approach IoT and data analytics to translate them into concrete offerings and solutions in the smart city context. The study was carried out as a qualitative, literature-based research. A case study is provided to validate the preservation of company’s competitive advantage through smart city solutions. The results of the research contribution provide insights into the different factors and considerations related to creating competitive advantage through IoT and data analytics deployment in the smart city context. Furthermore, this paper proposes a framework that merges the factors and considerations with examples of offerings and solutions in smart cities. The data collected through IoT devices, and the intelligent use of it, can create competitive advantage to companies operating in smart city business. Companies should take into consideration the five forces of competition that shape industries and pay attention to the technological, organisational, and external contexts which define factors for consideration of competitive advantages in the field of IoT and data analytics. Companies that can utilise these key assets in their businesses will most likely conquer the markets and have a strong foothold in the smart city business.

Keywords: data analytics, smart cities, competitive advantage, internet of things

Procedia PDF Downloads 133
23915 Best Season for Seismic Survey in Zaria Area, Nigeria: Data Quality and Implications

Authors: Ibe O. Stephen, Egwuonwu N. Gabriel

Abstract:

Variations in seismic P-wave velocity and depth resolution resulting from variations in subsurface water saturation were investigated in this study in order to determine the season of the year that gives the most reliable P-wave velocity and depth resolution of the subsurface in Zaria Area, Nigeria. A 2D seismic refraction tomography technique involving an ABEM Terraloc MK6 Seismograph was used to collect data across a borehole of standard log with the centre of the spread situated at the borehole site. Using the same parameters this procedure was repeated along the same spread for at least once in a month for at least eight months in a year for four years. The choice for each survey time depended on when there was significant variation in rainfall data. The seismic data collected were tomographically inverted. The results suggested that the average P-wave velocity ranges of the subsurface in the area are generally higher when the ground was wet than when it was dry. The results also suggested that the overburden of about 9.0 m in thickness, the weathered basement of about 14.0 m in thickness and the fractured basement at a depth of about 23.0 m best fitted the borehole log. This best fit was consistently obtained in the months between March and May when the average total rainfall was about 44.8 mm in the area. The results had also shown that the velocity ranges in both dry and wet formations fall within the standard ranges as provided in literature. In terms of velocity, this study has not in any way clearly distinguished the quality of the results of the seismic data obtained when the subsurface was dry from the results of the data collected when the subsurface was wet. It was concluded that for more detailed and reliable seismic studies in Zaria Area and its environs with similar climatic condition, the surveys are best conducted between March and May. The most reliable seismic data for depth resolution are most likely obtainable in the area between March and May.

Keywords: best season, variations in depth resolution, variations in P-wave velocity, variations in subsurface water saturation, Zaria area

Procedia PDF Downloads 289
23914 Quick Sequential Search Algorithm Used to Decode High-Frequency Matrices

Authors: Mohammed M. Siddeq, Mohammed H. Rasheed, Omar M. Salih, Marcos A. Rodrigues

Abstract:

This research proposes a data encoding and decoding method based on the Matrix Minimization algorithm. This algorithm is applied to high-frequency coefficients for compression/encoding. The algorithm starts by converting every three coefficients to a single value; this is accomplished based on three different keys. The decoding/decompression uses a search method called QSS (Quick Sequential Search) Decoding Algorithm presented in this research based on the sequential search to recover the exact coefficients. In the next step, the decoded data are saved in an auxiliary array. The basic idea behind the auxiliary array is to save all possible decoded coefficients; this is because another algorithm, such as conventional sequential search, could retrieve encoded/compressed data independently from the proposed algorithm. The experimental results showed that our proposed decoding algorithm retrieves original data faster than conventional sequential search algorithms.

Keywords: matrix minimization algorithm, decoding sequential search algorithm, image compression, DCT, DWT

Procedia PDF Downloads 150
23913 Structuring and Visualizing Healthcare Claims Data Using Systems Architecture Methodology

Authors: Inas S. Khayal, Weiping Zhou, Jonathan Skinner

Abstract:

Healthcare delivery systems around the world are in crisis. The need to improve health outcomes while decreasing healthcare costs have led to an imminent call to action to transform the healthcare delivery system. While Bioinformatics and Biomedical Engineering have primarily focused on biological level data and biomedical technology, there is clear evidence of the importance of the delivery of care on patient outcomes. Classic singular decomposition approaches from reductionist science are not capable of explaining complex systems. Approaches and methods from systems science and systems engineering are utilized to structure healthcare delivery system data. Specifically, systems architecture is used to develop a multi-scale and multi-dimensional characterization of the healthcare delivery system, defined here as the Healthcare Delivery System Knowledge Base. This paper is the first to contribute a new method of structuring and visualizing a multi-dimensional and multi-scale healthcare delivery system using systems architecture in order to better understand healthcare delivery.

Keywords: health informatics, systems thinking, systems architecture, healthcare delivery system, data analytics

Procedia PDF Downloads 348
23912 Cleaning of Scientific References in Large Patent Databases Using Rule-Based Scoring and Clustering

Authors: Emiel Caron

Abstract:

Patent databases contain patent related data, organized in a relational data model, and are used to produce various patent statistics. These databases store raw data about scientific references cited by patents. For example, Patstat holds references to tens of millions of scientific journal publications and conference proceedings. These references might be used to connect patent databases with bibliographic databases, e.g. to study to the relation between science, technology, and innovation in various domains. Problematic in such studies is the low data quality of the references, i.e. they are often ambiguous, unstructured, and incomplete. Moreover, a complete bibliographic reference is stored in only one attribute. Therefore, a computerized cleaning and disambiguation method for large patent databases is developed in this work. The method uses rule-based scoring and clustering. The rules are based on bibliographic metadata, retrieved from the raw data by regular expressions, and are transparent and adaptable. The rules in combination with string similarity measures are used to detect pairs of records that are potential duplicates. Due to the scoring, different rules can be combined, to join scientific references, i.e. the rules reinforce each other. The scores are based on expert knowledge and initial method evaluation. After the scoring, pairs of scientific references that are above a certain threshold, are clustered by means of single-linkage clustering algorithm to form connected components. The method is designed to disambiguate all the scientific references in the Patstat database. The performance evaluation of the clustering method, on a large golden set with highly cited papers, shows on average a 99% precision and a 95% recall. The method is therefore accurate but careful, i.e. it weighs precision over recall. Consequently, separate clusters of high precision are sometimes formed, when there is not enough evidence for connecting scientific references, e.g. in the case of missing year and journal information for a reference. The clusters produced by the method can be used to directly link the Patstat database with bibliographic databases as the Web of Science or Scopus.

Keywords: clustering, data cleaning, data disambiguation, data mining, patent analysis, scientometrics

Procedia PDF Downloads 194
23911 A Human Centered Design of an Exoskeleton Using Multibody Simulation

Authors: Sebastian Kölbl, Thomas Reitmaier, Mathias Hartmann

Abstract:

Trial and error approaches to adapt wearable support structures to human physiology are time consuming and elaborate. However, during preliminary design, the focus lies on understanding the interaction between exoskeleton and the human body in terms of forces and moments, namely body mechanics. For the study at hand, a multi-body simulation approach has been enhanced to evaluate actual forces and moments in a human dummy model with and without a digital mock-up of an active exoskeleton. Therefore, different motion data have been gathered and processed to perform a musculosceletal analysis. The motion data are ground reaction forces, electromyography data (EMG) and human motion data recorded with a marker-based motion capture system. Based on the experimental data, the response of the human dummy model has been calibrated. Subsequently, the scalable human dummy model, in conjunction with the motion data, is connected with the exoskeleton structure. The results of the human-machine interaction (HMI) simulation platform are in particular resulting contact forces and human joint forces to compare with admissible values with regard to the human physiology. Furthermore, it provides feedback for the sizing of the exoskeleton structure in terms of resulting interface forces (stress justification) and the effect of its compliance. A stepwise approach for the setup and validation of the modeling strategy is presented and the potential for a more time and cost-effective development of wearable support structures is outlined.

Keywords: assistive devices, ergonomic design, inverse dynamics, inverse kinematics, multibody simulation

Procedia PDF Downloads 162
23910 Evaluation of the Nursing Management Course in Undergraduate Nursing Programs of State Universities in Turkey

Authors: Oznur Ispir, Oya Celebi Cakiroglu, Esengul Elibol, Emine Ceribas, Gizem Acikgoz, Hande Yesilbas, Merve Tarhan

Abstract:

This study was conducted to evaluate the academic staff teaching the 'Nursing Management' course in the undergraduate nursing programs of the state universities in Turkey and to assess the current content of the course. Design of the study is descriptive. Population of the study consists of seventy-eight undergraduate nursing programs in the state universities in Turkey. The questionnaire/survey prepared by the researchers was used as a data collection tool. The data were obtained by screening the content of the websites of nursing education programs between March and May 2016. Descriptive statistics were used to analyze the data. The research performed within the study indicated that 58% of the undergraduate nursing programs from which the data were derived were included in the school of health, 81% of the academic staff graduated from the undergraduate nursing programs, 40% worked as a lecturer and 37% specialized in a field other than the nursing. The research also implied that the above-mentioned course was included in 98% of the programs from which it was possible to obtain data. The full name of the course was 'Nursing Management' in 95% of the programs and 98% stated that the course was compulsory. Theory and application hours were 3.13 and 2.91, respectively. Moreover, the content of the course was not shared in 65% of the programs reviewed. This study demonstrated that the experience and expertise of the academic staff teaching the 'Nursing Management' course was not sufficient in the management area, and the schedule and content of the course were not sufficient although many nursing education programs provided the course. Comparison between the curricula of the course revealed significant differences.

Keywords: nursing, nursing management, nursing management course, undergraduate program

Procedia PDF Downloads 358
23909 The DAQ Debugger for iFDAQ of the COMPASS Experiment

Authors: Y. Bai, M. Bodlak, V. Frolov, S. Huber, V. Jary, I. Konorov, D. Levit, J. Novy, D. Steffen, O. Subrt, M. Virius

Abstract:

In general, state-of-the-art Data Acquisition Systems (DAQ) in high energy physics experiments must satisfy high requirements in terms of reliability, efficiency and data rate capability. This paper presents the development and deployment of a debugging tool named DAQ Debugger for the intelligent, FPGA-based Data Acquisition System (iFDAQ) of the COMPASS experiment at CERN. Utilizing a hardware event builder, the iFDAQ is designed to be able to readout data at the average maximum rate of 1.5 GB/s of the experiment. In complex softwares, such as the iFDAQ, having thousands of lines of code, the debugging process is absolutely essential to reveal all software issues. Unfortunately, conventional debugging of the iFDAQ is not possible during the real data taking. The DAQ Debugger is a tool for identifying a problem, isolating the source of the problem, and then either correcting the problem or determining a way to work around it. It provides the layer for an easy integration to any process and has no impact on the process performance. Based on handling of system signals, the DAQ Debugger represents an alternative to conventional debuggers provided by most integrated development environments. Whenever problem occurs, it generates reports containing all necessary information important for a deeper investigation and analysis. The DAQ Debugger was fully incorporated to all processes in the iFDAQ during the run 2016. It helped to reveal remaining software issues and improved significantly the stability of the system in comparison with the previous run. In the paper, we present the DAQ Debugger from several insights and discuss it in a detailed way.

Keywords: DAQ Debugger, data acquisition system, FPGA, system signals, Qt framework

Procedia PDF Downloads 284
23908 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in the data-driven analysis in major areas of medicine, such as clinical decision support system, survival analysis, patient similarity analysis, image analytics etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature which can be found in the form of discharge summaries, clinical notes, procedural notes which are in human written narrative format and neither have any relational model nor any standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, which is a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique which is based on a string matching algorithm that is indexed on curated knowledge sources, that is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concepts retrieval and present the advantages the former displays over the latter.

Keywords: information retrieval, unified medical language system, syntax based analysis, natural language processing, medical informatics

Procedia PDF Downloads 133
23907 Developing Logistics Indices for Turkey as an an Indicator of Economic Activity

Authors: Gizem İntepe, Eti Mizrahi

Abstract:

Investment and financing decisions are influenced by various economic features. Detailed analysis should be conducted in order to make decisions not only by companies but also by governments. Such analysis can be conducted either at the company level or on a sectoral basis to reduce risks and to maximize profits. Sectoral disaggregation caused by seasonality effects, subventions, data advantages or disadvantages may appear in sectors behaving parallel to BIST (Borsa Istanbul stock exchange) Index. Proposed logistic indices could serve market needs as a decision parameter in sectoral basis and also helps forecasting activities in import export volume changes. Also it is an indicator of logistic activity, which is also a sign of economic mobility at the national level. Publicly available data from “Ministry of Transport, Maritime Affairs and Communications” and “Turkish Statistical Institute” is utilized to obtain five logistics indices namely as; exLogistic, imLogistic, fLogistic, dLogistic and cLogistic index. Then, efficiency and reliability of these indices are tested.

Keywords: economic activity, export trade data, import trade data, logistics indices

Procedia PDF Downloads 336
23906 Using Non-Negative Matrix Factorization Based on Satellite Imagery for the Collection of Agricultural Statistics

Authors: Benyelles Zakaria, Yousfi Djaafar, Karoui Moussa Sofiane

Abstract:

Agriculture is fundamental and remains an important objective in the Algerian economy, based on traditional techniques and structures, it generally has a purpose of consumption. Collection of agricultural statistics in Algeria is done using traditional methods, which consists of investigating the use of land through survey and field survey. These statistics suffer from problems such as poor data quality, the long delay between collection of their last final availability and high cost compared to their limited use. The objective of this work is to develop a processing chain for a reliable inventory of agricultural land by trying to develop and implement a new method of extracting information. Indeed, this methodology allowed us to combine data from remote sensing and field data to collect statistics on areas of different land. The contribution of remote sensing in the improvement of agricultural statistics, in terms of area, has been studied in the wilaya of Sidi Bel Abbes. It is in this context that we applied a method for extracting information from satellite images. This method is called the non-negative matrix factorization, which does not consider the pixel as a single entity, but will look for components the pixel itself. The results obtained by the application of the MNF were compared with field data and the results obtained by the method of maximum likelihood. We have seen a rapprochement between the most important results of the FMN and those of field data. We believe that this method of extracting information from satellite data leads to interesting results of different types of land uses.

Keywords: blind source separation, hyper-spectral image, non-negative matrix factorization, remote sensing

Procedia PDF Downloads 423
23905 Estimation of Coefficient of Discharge of Side Trapezoidal Labyrinth Weir Using Group Method of Data Handling Technique

Authors: M. A. Ansari, A. Hussain, A. Uddin

Abstract:

A side weir is a flow diversion structure provided in the side wall of a channel to divert water from the main channel to a branch channel. The trapezoidal labyrinth weir is a special type of weir in which crest length of the weir is increased to pass higher discharge. Experimental and numerical studies related to the coefficient of discharge of trapezoidal labyrinth weir in an open channel have been presented in the present study. Group Method of Data Handling (GMDH) with the transfer function of quadratic polynomial has been used to predict the coefficient of discharge for the side trapezoidal labyrinth weir. A new model is developed for coefficient of discharge of labyrinth weir by regression method. Generalized models for predicting the coefficient of discharge for labyrinth weir using Group Method of Data Handling (GMDH) network have also been developed. The prediction based on GMDH model is more satisfactory than those given by traditional regression equations.

Keywords: discharge coefficient, group method of data handling, open channel, side labyrinth weir

Procedia PDF Downloads 160
23904 Integration of Resistivity and Seismic Refraction Using Combine Inversion for Ancient River Findings at Sungai Batu, Lembah Bujang, Malaysia

Authors: Rais Yusoh, Rosli Saad, Mokhtar Saidin, Fauzi Andika, Sabiu Bala Muhammad

Abstract:

Resistivity and seismic refraction profiling have become a common method in pre-investigations for visualizing subsurface structure. The integration of the methods could reduce an interpretation ambiguity. Both methods have their individual software packages for data inversion, but potential to combine certain geophysical methods are restricted; however, the research algorithms that have this functionality was existed and are evaluated personally. The interpretation of subsurface were improve by combining inversion data from both methods by influence each other models using closure coupling; thus, by implementing both methods to support each other which could improve the subsurface interpretation. These methods were applied on a field dataset from a pre-investigation for archeology in finding the ancient river. There were no major changes in the inverted model by combining data inversion for this archetype which probably due to complex geology. The combine data analysis provides an additional technique for interpretation such as an alluvium, which can have strong influence on the ancient river findings.

Keywords: ancient river, combine inversion, resistivity, seismic refraction

Procedia PDF Downloads 333
23903 Data Mining in Healthcare for Predictive Analytics

Authors: Ruzanna Muradyan

Abstract:

Medical data mining is a crucial field in contemporary healthcare that offers cutting-edge tactics with enormous potential to transform patient care. This abstract examines how sophisticated data mining techniques could transform the healthcare industry, with a special focus on how they might improve patient outcomes. Healthcare data repositories have dynamically evolved, producing a rich tapestry of different, multi-dimensional information that includes genetic profiles, lifestyle markers, electronic health records, and more. By utilizing data mining techniques inside this vast library, a variety of prospects for precision medicine, predictive analytics, and insight production become visible. Predictive modeling for illness prediction, risk stratification, and therapy efficacy evaluations are important points of focus. Healthcare providers may use this abundance of data to tailor treatment plans, identify high-risk patient populations, and forecast disease trajectories by applying machine learning algorithms and predictive analytics. Better patient outcomes, more efficient use of resources, and early treatments are made possible by this proactive strategy. Furthermore, data mining techniques act as catalysts to reveal complex relationships between apparently unrelated data pieces, providing enhanced insights into the cause of disease, genetic susceptibilities, and environmental factors. Healthcare practitioners can get practical insights that guide disease prevention, customized patient counseling, and focused therapies by analyzing these associations. The abstract explores the problems and ethical issues that come with using data mining techniques in the healthcare industry. In order to properly use these approaches, it is essential to find a balance between data privacy, security issues, and the interpretability of complex models. Finally, this abstract demonstrates the revolutionary power of modern data mining methodologies in transforming the healthcare sector. Healthcare practitioners and researchers can uncover unique insights, enhance clinical decision-making, and ultimately elevate patient care to unprecedented levels of precision and efficacy by employing cutting-edge methodologies.

Keywords: data mining, healthcare, patient care, predictive analytics, precision medicine, electronic health records, machine learning, predictive modeling, disease prognosis, risk stratification, treatment efficacy, genetic profiles, precision health

Procedia PDF Downloads 62
23902 ROOP: Translating Sequential Code Fragments to Distributed Code Fragments Using Deep Reinforcement Learning

Authors: Arun Sanjel, Greg Speegle

Abstract:

Every second, massive amounts of data are generated, and Data Intensive Scalable Computing (DISC) frameworks have evolved into effective tools for analyzing such massive amounts of data. Since the underlying architecture of these distributed computing platforms is often new to users, building a DISC application can often be time-consuming and prone to errors. The automated conversion of a sequential program to a DISC program will consequently significantly improve productivity. However, synthesizing a user’s intended program from an input specification is complex, with several important applications, such as distributed program synthesizing and code refactoring. Existing works such as Tyro and Casper rely entirely on deductive synthesis techniques or similar program synthesis approaches. Our approach is to develop a data-driven synthesis technique to identify sequential components and translate them to equivalent distributed operations. We emphasize using reinforcement learning and unit testing as feedback mechanisms to achieve our objectives.

Keywords: program synthesis, distributed computing, reinforcement learning, unit testing, DISC

Procedia PDF Downloads 106
23901 A Novel Technological Approach to Maintaining the Cold Chain during Transportation

Authors: Philip J. Purnell

Abstract:

Innovators propose to use the Internet of Things to solve the problem of maintaining the cold chain during the transport of biopharmaceutical products. Sending a data logger with refrigerated goods is only useful to inform the recipient of the goods that they have either breached the cold chain and are therefore potentially spoiled or that they have not breached it and are therefore assumed to be in good condition. Connecting the data logger to the Internet of Things means that the supply chain manager will be informed in real-time of the exact location and the precise temperature of the material at any point on earth. Readable using a simple online interface, the supply chain manager will watch the progress of their material on a Google map together with accurate and crucially real-time temperature readings. The data logger will also send alarms to the supply chain manager if a cold chain breach becomes imminent allowing them time to contact the transporter and restore the cold chain before the material is affected. This development is expected to save billions of dollars in wasted biologics that currently arrive either spoiled or in an unreliable condition.

Keywords: internet of things, cold chain, data logger, transportation

Procedia PDF Downloads 442