Search results for: data acquisition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25636

Search results for: data acquisition

24556 Sourcing and Compiling a Maltese Traffic Dataset MalTra

Authors: Gabriele Borg, Alexei De Bono, Charlie Abela

Abstract:

There on a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is also constantly experiencing ongoing population growth and an increase in mobilization demand. This research takes advantage of data which is continuously being sourced and converting it into useful information related to the traffic problem on the Maltese roads. The scope of this paper is to provide a methodology to create a custom dataset (MalTra - Malta Traffic) compiled from multiple participants from various locations across the island to identify the most common routes taken to expose the main areas of activity. This use of big data is seen being used in various technologies and is referred to as ITSs (Intelligent Transportation Systems), which has been concluded that there is significant potential in utilising such sources of data on a nationwide scale.

Keywords: Big Data, vehicular traffic, traffic management, mobile data patterns

Procedia PDF Downloads 109
24555 Comparative Study of Accuracy of Land Cover/Land Use Mapping Using Medium Resolution Satellite Imagery: A Case Study

Authors: M. C. Paliwal, A. K. Jain, S. K. Katiyar

Abstract:

Classification of satellite imagery is very important for the assessment of its accuracy. In order to determine the accuracy of the classified image, usually the assumed-true data are derived from ground truth data using Global Positioning System. The data collected from satellite imagery and ground truth data is then compared to find out the accuracy of data and error matrices are prepared. Overall and individual accuracies are calculated using different methods. The study illustrates advanced classification and accuracy assessment of land use/land cover mapping using satellite imagery. IRS-1C-LISS IV data were used for classification of satellite imagery. The satellite image was classified using the software in fourteen classes namely water bodies, agricultural fields, forest land, urban settlement, barren land and unclassified area etc. Classification of satellite imagery and calculation of accuracy was done by using ERDAS-Imagine software to find out the best method. This study is based on the data collected for Bhopal city boundaries of Madhya Pradesh State of India.

Keywords: resolution, accuracy assessment, land use mapping, satellite imagery, ground truth data, error matrices

Procedia PDF Downloads 507
24554 Effect of Genuine Missing Data Imputation on Prediction of Urinary Incontinence

Authors: Suzan Arslanturk, Mohammad-Reza Siadat, Theophilus Ogunyemi, Ananias Diokno

Abstract:

Missing data is a common challenge in statistical analyses of most clinical survey datasets. A variety of methods have been developed to enable analysis of survey data to deal with missing values. Imputation is the most commonly used among the above methods. However, in order to minimize the bias introduced due to imputation, one must choose the right imputation technique and apply it to the correct type of missing data. In this paper, we have identified different types of missing values: missing data due to skip pattern (SPMD), undetermined missing data (UMD), and genuine missing data (GMD) and applied rough set imputation on only the GMD portion of the missing data. We have used rough set imputation to evaluate the effect of such imputation on prediction by generating several simulation datasets based on an existing epidemiological dataset (MESA). To measure how well each dataset lends itself to the prediction model (logistic regression), we have used p-values from the Wald test. To evaluate the accuracy of the prediction, we have considered the width of 95% confidence interval for the probability of incontinence. Both imputed and non-imputed simulation datasets were fit to the prediction model, and they both turned out to be significant (p-value < 0.05). However, the Wald score shows a better fit for the imputed compared to non-imputed datasets (28.7 vs. 23.4). The average confidence interval width was decreased by 10.4% when the imputed dataset was used, meaning higher precision. The results show that using the rough set method for missing data imputation on GMD data improve the predictive capability of the logistic regression. Further studies are required to generalize this conclusion to other clinical survey datasets.

Keywords: rough set, imputation, clinical survey data simulation, genuine missing data, predictive index

Procedia PDF Downloads 168
24553 Drones, Rebels and Bombs: Explaining the Role of Private Security and Expertise in a Post-piratical Indian Ocean

Authors: Jessica Kate Simonds

Abstract:

The last successful hijacking perpetrated by Somali pirates in 2012 represented a critical turning point for the identity and brand of Indian Ocean (IO) insecurity, coined in this paper as the era of the post-piratical. This paper explores the broadening of the PMSC business model to account and contribute to the design of a new IO security environment that prioritises foreign and insurgency drone activity and Houthi rebel operations as the main threat to merchant shipping in the post-2012 era. This study is situated within a longer history of analysing maritime insecurity and also contributes a bespoke conceptual framework that understands the sea as a space that is produced and reproduced relative to existing and emerging threats to merchant shipping based on bespoke models of information sharing and intelligence acquisition. This paper also makes a prominent empirical contribution by drawing on a post-positivist methodology, data drawn from original semi-structured interviews with senior maritime insurers and active merchant seafarers that is triangulated with industry-produced guidance such as the BMP series as primary data sources. Each set is analysed through qualitative discourse and content analysis and supported by the quantitative data sets provided by the IMB Piracy Reporting center and intelligence networks. This analysis reveals that mechanisms such as the IGP&I Maritime Security Committee and intelligence divisions of PMSC’s have driven the exchanges of knowledge between land and sea and thus the reproduction of the maritime security environment through new regulations and guidance to account dones, rebels and bombs as the key challenges in the IO, beyond piracy. A contribution of this paper is the argument that experts who may not be in the highest-profile jobs are the architects of maritime insecurity based on their detailed knowledge and connections to vessels in transit. This paper shares the original insights of those who have served in critical decision making spaces to demonstrate that the development and refinement of industry produced deterrence guidance that has been accredited to the mitigation of piracy, have shaped new editions such as BMP 5 that now serve to frame a new security environment that prioritises the mitigation of risks from drones and WBEID’s from both state and insurgency risk groups. By highlighting the experiences and perspectives of key players on both land and at sea, the key finding of this paper is outlining that as pirates experienced a financial boom by profiteering from their bespoke business model during the peak of successful hijackings, the private security market encountered a similar level of financial success and guaranteed risk environment in which to prospect business. Thus, the reproduction of the Indian Ocean as a maritime security environment reflects a new found purpose for PMSC’s as part of the broader conglomerate of maritime insurers, regulators, shipowners and managers who continue to redirect the security consciousness and IO brand of insecurity.

Keywords: maritime security, private security, risk intelligence, political geography, international relations, political economy, maritime law, security studies

Procedia PDF Downloads 184
24552 Database Management System for Orphanages to Help Track of Orphans

Authors: Srivatsav Sanjay Sridhar, Asvitha Raja, Prathit Kalra, Soni Gupta

Abstract:

Database management is a system that keeps track of details about a person in an organisation. Not a lot of orphanages these days are shifting to a computer and program-based system, but unfortunately, most have only pen and paper-based records, which not only consumes space but it is also not eco-friendly. It comes as a hassle when one has to view a record of a person as they have to search through multiple records, and it will consume time. This program will organise all the data and can pull out any information about anyone whose data is entered. This is also a safe way of storage as physical data gets degraded over time or, worse, destroyed due to natural disasters. In this developing world, it is only smart enough to shift all data to an electronic-based storage system. The program comes with all features, including creating, inserting, searching, and deleting the data, as well as printing them.

Keywords: database, orphans, programming, C⁺⁺

Procedia PDF Downloads 156
24551 New Two-Way Map-Reduce Join Algorithm: Hash Semi Join

Authors: Marwa Hussein Mohamed, Mohamed Helmy Khafagy, Samah Ahmed Senbel

Abstract:

Map Reduce is a programming model used to handle and support massive data sets. Rapidly increasing in data size and big data are the most important issue today to make an analysis of this data. map reduce is used to analyze data and get more helpful information by using two simple functions map and reduce it's only written by the programmer, and it includes load balancing , fault tolerance and high scalability. The most important operation in data analysis are join, but map reduce is not directly support join. This paper explains two-way map-reduce join algorithm, semi-join and per split semi-join, and proposes new algorithm hash semi-join that used hash table to increase performance by eliminating unused records as early as possible and apply join using hash table rather than using map function to match join key with other data table in the second phase but using hash tables isn't affecting on memory size because we only save matched records from the second table only. Our experimental result shows that using a hash table with hash semi-join algorithm has higher performance than two other algorithms while increasing the data size from 10 million records to 500 million and running time are increased according to the size of joined records between two tables.

Keywords: map reduce, hadoop, semi join, two way join

Procedia PDF Downloads 513
24550 Using Implicit Data to Improve E-Learning Systems

Authors: Slah Alsaleh

Abstract:

In the recent years and with popularity of internet and technology, e-learning became a major part of majority of education systems. One of the advantages the e-learning systems provide is the large amount of information available about the students' behavior while communicating with the e-learning system. Such information is very rich and it can be used to improve the capability and efficiency of e-learning systems. This paper discusses how e-learning can benefit from implicit data in different ways including; creating homogeneous groups of student, evaluating students' learning, creating behavior profiles for students and identifying the students through their behaviors.

Keywords: e-learning, implicit data, user behavior, data mining

Procedia PDF Downloads 309
24549 Enabling Quantitative Urban Sustainability Assessment with Big Data

Authors: Changfeng Fu

Abstract:

Sustainable urban development has been widely accepted a common sense in the modern urban planning and design. However, the measurement and assessment of urban sustainability, especially the quantitative assessment have been always an issue obsessing planning and design professionals. This paper will present an on-going research on the principles and technologies to develop a quantitative urban sustainability assessment principles and techniques which aim to integrate indicators, geospatial and geo-reference data, and assessment techniques together into a mechanism. It is based on the principles and techniques of geospatial analysis with GIS and statistical analysis methods. The decision-making technologies and methods such as AHP and SMART are also adopted to address overall assessment conclusions. The possible interfaces and presentation of data and quantitative assessment results are also described. This research is based on the knowledge, situations and data sources of UK, but it is potentially adaptable to other countries or regions. The implementation potentials of the mechanism are also discussed.

Keywords: urban sustainability assessment, quantitative analysis, sustainability indicator, geospatial data, big data

Procedia PDF Downloads 357
24548 Educational Audit and Curricular Reforms in the Arabian Context

Authors: Irum Naz

Abstract:

In the Arabian higher education context, linguistic proficiency in the English language is considered crucial for the developmental sustainability, economic growth, and stability of communities and societies. Qatar’s educational reforms package, through the 2030 vision, identifies the acquisition of English at K-12 as an essential survival communication tool for globalization, believing that Qatari students need better preparation to take on the responsibilities of leadership and to participate effectively in the country’s surging economy. The idea of introducing Qatari students to modern curricula benchmarked to high-student-performance curricula in developed countries is one of the components of reformatory design principles of Education for New Era reform project that is mutually consented to and supported by the Office of Shared Services, Communications Office, and Supreme Education Council. In appreciation of the government’s vision, the English Language Centre (ELC) at the Community College of Qatar ran an internal educational audit and conducted evaluative research to understand and appraise the value, impact, and practicality of the existing ELC language development program. This study sought to identify the type of change that could identify and improve the quality of Foundation Program courses and the manners in which second language learners could be assisted to transit smoothly between (ELC) levels. Following the interpretivist paradigm and mixed research method, the data was gathered through a bicyclic research model and a triangular design. The analyses of the data suggested that there was a need for improvement in the ELC program as a whole, and particularly in terms of curriculum, student learning outcomes, and the general learning environment in the department. Key findings suggest that the target program would benefit from significant revisions, which would include narrowing the focus of the courses, providing sets of specific learning objectives, and preventing repetition between levels. Another promising finding was about the assessment tools and process. The data suggested that a set of standardized assessments that more closely suited the programs of study should be devised. It was also recommended that students undergo a more comprehensive placement process to ensure that they begin the program at an appropriate level and get the maximum benefit from their learning experience. Although this ties into the idea of curriculum revamp, it was expected that students could leave the ELC having had exposure to courses in English for specific purposes. The idea of a more reliable exit assessment for students was raised frequently so ELC could regulate itself and ensure optimum learning outcomes. Another important recommendation was the provision of a Student Learning Center for students that would help them to receive personalized tuition, differentiated instruction, and self-driven and self-evaluated learning experience. In addition, an extra study level was recommended to be added to the program to accommodate the different levels of English language proficiency represented among ELC students. The evidence collected in the course of conducting the study suggests that significant change is needed in the structure of the ELC program, specifically about curriculum, the program learning outcomes, and the learning environment in general.

Keywords: educational audit, ESL, optimum learning outcomes, Qatar’s educational reforms, self-driven and self-evaluated learning experience, Student Learning Center

Procedia PDF Downloads 185
24547 PlayTrain: A Research and Intervention Project for Early Childhood Teacher Education

Authors: Dalila Lino, Maria Joao Hortas, Carla Rocha, Clarisse Nunes, Natalia Vieira, Marina Fuertes, Kátia Sa

Abstract:

The value of play is recognized worldwide and is considered a fundamental right of all children, as defined in Article 31 of the United Nations Children’s Rights. It is consensual among the scientific community that play, and toys are of vital importance for children’s learning and development. Play promotes the acquisition of language, enhances creativity and improves social, affective, emotional, cognitive and motor development of young children. Young children ages 0 to 6 who have had many opportunities to get involved in play show greater competence to adapt to new and unexpected situations and more easily overcome the pain and suffering caused by traumatic situations. The PlayTrain Project aims to understand the places/spaces of play in the education of children from 0 to 6 years and promoting the training of preschool teachers to become capable of developing practices that enhance children’s agency, experimentation in the physical and social world and the development of imagination and creativity. This project follows the Design-Based-Research (DBR) and has two dimensions: research and intervention. The participants are 120 students from the Master in Pre-school Education of the Higher School of Education, Polytechnic Institute of Lisbon enrolled in the academic year 2018/2019. The development of workshops focused on the role of play and toys for young children’s learning promotes the participants reflection and the development of skills and knowledge to construct developmentally appropriated practices in early childhood education. Data was collected through an online questionnaire and focal groups. Results show that the PlayTrain Project contribute to the development of a body of knowledge about the role of play for early childhood education. It was possible to identify the needs of preschool teacher education and to enhance the discussion among the scientific and academic community about the importance of deepening the role of play and toys in the study plans of the masters in pre-school education.

Keywords: children's learning, early childhood education, play, teacher education, toys

Procedia PDF Downloads 144
24546 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin

Authors: A. Ishag Mohamed, A. A. Rabah

Abstract:

The objective of this research is to develop a generalized correlation for the prediction of thermal conductivity of n-Alkanes and Alkenes. There is a minority of research and lack of correlation for thermal conductivity of liquids in the open literature. The available experimental data are collected covering the groups of n-Alkanes and Alkenes.The data were assumed to correlate to temperature using Filippov correlation. Nonparametric regression of Grace Algorithm was used to develop the generalized correlation model. A spread sheet program based on Microsoft Excel was used to plot and calculate the value of the coefficients. The results obtained were compared with the data that found in Perry's Chemical Engineering Hand Book. The experimental data correlated to the temperature ranged "between" 273.15 to 673.15 K, with R2 = 0.99.The developed correlation reproduced experimental data that which were not included in regression with absolute average percent deviation (AAPD) of less than 7 %. Thus the spread sheet was quite accurate which produces reliable data.

Keywords: N-Alkanes, N-Alkenes, nonparametric, regression

Procedia PDF Downloads 654
24545 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Keywords: big data, social networks, sentiment analysis, twitter

Procedia PDF Downloads 576
24544 Estimating Current Suicide Rates Using Google Trends

Authors: Ladislav Kristoufek, Helen Susannah Moat, Tobias Preis

Abstract:

Data on the number of people who have committed suicide tends to be reported with a substantial time lag of around two years. We examine whether online activity measured by Google searches can help us improve estimates of the number of suicide occurrences in England before official figures are released. Specifically, we analyse how data on the number of Google searches for the terms “depression” and “suicide” relate to the number of suicides between 2004 and 2013. We find that estimates drawing on Google data are significantly better than estimates using previous suicide data alone. We show that a greater number of searches for the term “depression” is related to fewer suicides, whereas a greater number of searches for the term “suicide” is related to more suicides. Data on suicide related search behaviour can be used to improve current estimates of the number of suicide occurrences.

Keywords: nowcasting, search data, Google Trends, official statistics

Procedia PDF Downloads 357
24543 Moderating Influence of Environmental Hostility and External Relational Capital on the Effect of Entrepreneurial Orientation on Performance

Authors: Peter Ugbedeojo Nelson

Abstract:

Despite the tremendous advancements and knowledge acquisition around entrepreneurship orientation (EO) research, there may still be more to learn on how environmental dynamics would permute organizational processes and determine the extent to which success would be achieved. Using the contingency theory, we test a model that proposes a moderating influence of external relational capital and environmental hostility on the EO-performance effect of 423 managers/owners of small and medium scale enterprises. The hypotheses were tested using Hayes simultaneous regression, and the results showed that all EO dimensions (risk-taking, innovation, and performance) had a main effect on performance while the moderating variables interacted well with risk-taking (more than other EO dimensions) to improve performance. However, external relational capital, more than environmental hostility, influences the EO-performance relationship. Our findings highlight the differential ways that EO dimensions interact with environmental contingencies to influence performance. Further studies can examine how competitive aggressiveness and autonomy are moderated by external relational capital and environmental hostility.

Keywords: external relational capital, entrepreneurial orientation, risk-taking, innovation, proactiveness

Procedia PDF Downloads 55
24542 On the Network Packet Loss Tolerance of SVM Based Activity Recognition

Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir

Abstract:

In this study, data loss tolerance of Support Vector Machines (SVM) based activity recognition model and multi activity classification performance when data are received over a lossy wireless sensor network is examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi hop wireless sensor network, which introduces high data loss. The effect of number of nodes on the reliability and multi activity classification success is demonstrated in simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on activity detection success rate of an SVM based classification algorithm has not been studied before.

Keywords: activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss

Procedia PDF Downloads 475
24541 GIS Data Governance: GIS Data Submission Process for Build-in Project, Replacement Project at Oman electricity Transmission Company

Authors: Rahma Saleh Hussein Al Balushi

Abstract:

Oman Electricity Transmission Company's (OETC) vision is to be a renowned world-class transmission grid by 2025, and one of the indications of achieving the vision is obtaining Asset Management ISO55001 certification, which required setting out a documented Standard Operating Procedures (SOP). Hence, documented SOP for the Geographical information system data process has been established. Also, to effectively manage and improve OETC power transmission, asset data and information need to be governed as such by Asset Information & GIS department. This paper will describe in detail the current GIS data submission process and the journey for developing it. The methodology used to develop the process is based on three main pillars, which are system and end-user requirements, Risk evaluation, data availability, and accuracy. The output of this paper shows the dramatic change in the used process, which results subsequently in more efficient, accurate, and updated data. Furthermore, due to this process, GIS has been and is ready to be integrated with other systems as well as the source of data for all OETC users. Some decisions related to issuing No objection certificates (NOC) for excavation permits and scheduling asset maintenance plans in Computerized Maintenance Management System (CMMS) have been made consequently upon GIS data availability. On the Other hand, defining agreed and documented procedures for data collection, data systems update, data release/reporting and data alterations has also contributed to reducing the missing attributes and enhance data quality index of GIS transmission data. A considerable difference in Geodatabase (GDB) completeness percentage was observed between the years 2017 and year 2022. Overall, concluding that by governance, asset information & GIS department can control the GIS data process; collect, properly record, and manage asset data and information within the OETC network. This control extends to other applications and systems integrated with/related to GIS systems.

Keywords: asset management ISO55001, standard procedures process, governance, CMMS

Procedia PDF Downloads 125
24540 Efects of Data Corelation in a Sparse-View Compresive Sensing Based Image Reconstruction

Authors: Sajid Abas, Jon Pyo Hong, Jung-Ryun Le, Seungryong Cho

Abstract:

Computed tomography and laminography are heavily investigated in a compressive sensing based image reconstruction framework to reduce the dose to the patients as well as to the radiosensitive devices such as multilayer microelectronic circuit boards. Nowadays researchers are actively working on optimizing the compressive sensing based iterative image reconstruction algorithm to obtain better quality images. However, the effects of the sampled data’s properties on reconstructed the image’s quality, particularly in an insufficient sampled data conditions have not been explored in computed laminography. In this paper, we investigated the effects of two data properties i.e. sampling density and data incoherence on the reconstructed image obtained by conventional computed laminography and a recently proposed method called spherical sinusoidal scanning scheme. We have found that in a compressive sensing based image reconstruction framework, the image quality mainly depends upon the data incoherence when the data is uniformly sampled.

Keywords: computed tomography, computed laminography, compressive sending, low-dose

Procedia PDF Downloads 464
24539 Fuzzy Wavelet Model to Forecast the Exchange Rate of IDR/USD

Authors: Tri Wijayanti Septiarini, Agus Maman Abadi, Muhammad Rifki Taufik

Abstract:

The exchange rate of IDR/USD can be the indicator to analysis Indonesian economy. The exchange rate as a important factor because it has big effect in Indonesian economy overall. So, it needs the analysis data of exchange rate. There is decomposition data of exchange rate of IDR/USD to be frequency and time. It can help the government to monitor the Indonesian economy. This method is very effective to identify the case, have high accurate result and have simple structure. In this paper, data of exchange rate that used is weekly data from December 17, 2010 until November 11, 2014.

Keywords: the exchange rate, fuzzy mamdani, discrete wavelet transforms, fuzzy wavelet

Procedia PDF Downloads 570
24538 Humanising Digital Healthcare to Build Capacity by Harnessing the Power of Patient Data

Authors: Durhane Wong-Rieger, Kawaldip Sehmi, Nicola Bedlington, Nicole Boice, Tamás Bereczky

Abstract:

Patient-generated health data should be seen as the expression of the experience of patients, including the outcomes reflecting the impact a treatment or service had on their physical health and wellness. We discuss how the healthcare system can reach a place where digital is a determinant of health - where data is generated by patients and is respected and which acknowledges their contribution to science. We explore the biggest barriers facing this. The International Experience Exchange with Patient Organisation’s Position Paper is based on a global patient survey conducted in Q3 2021 that received 304 responses. Results were discussed and validated by the 15 patient experts and supplemented with literature research. Results are a subset of this. Our research showed patient communities want to influence how their data is generated, shared, and used. Our study concludes that a reasonable framework is needed to protect the integrity of patient data and minimise abuse, and build trust. Results also demonstrated a need for patient communities to have more influence and control over how health data is generated, shared, and used. The results clearly highlight that the community feels there is a lack of clear policies on sharing data.

Keywords: digital health, equitable access, humanise healthcare, patient data

Procedia PDF Downloads 82
24537 Use of Machine Learning in Data Quality Assessment

Authors: Bruno Pinto Vieira, Marco Antonio Calijorne Soares, Armando Sérgio de Aguiar Filho

Abstract:

Nowadays, a massive amount of information has been produced by different data sources, including mobile devices and transactional systems. In this scenario, concerns arise on how to maintain or establish data quality, which is now treated as a product to be defined, measured, analyzed, and improved to meet consumers' needs, which is the one who uses these data in decision making and companies strategies. Information that reaches low levels of quality can lead to issues that can consume time and money, such as missed business opportunities, inadequate decisions, and bad risk management actions. The step of selecting, identifying, evaluating, and selecting data sources with significant quality according to the need has become a costly task for users since the sources do not provide information about their quality. Traditional data quality control methods are based on user experience or business rules limiting performance and slowing down the process with less than desirable accuracy. Using advanced machine learning algorithms, it is possible to take advantage of computational resources to overcome challenges and add value to companies and users. In this study, machine learning is applied to data quality analysis on different datasets, seeking to compare the performance of the techniques according to the dimensions of quality assessment. As a result, we could create a ranking of approaches used, besides a system that is able to carry out automatically, data quality assessment.

Keywords: machine learning, data quality, quality dimension, quality assessment

Procedia PDF Downloads 148
24536 Exploring Data Leakage in EEG Based Brain-Computer Interfaces: Overfitting Challenges

Authors: Khalida Douibi, Rodrigo Balp, Solène Le Bars

Abstract:

In the medical field, applications related to human experiments are frequently linked to reduced samples size, which makes the training of machine learning models quite sensitive and therefore not very robust nor generalizable. This is notably the case in Brain-Computer Interface (BCI) studies, where the sample size rarely exceeds 20 subjects or a few number of trials. To address this problem, several resampling approaches are often used during the data preparation phase, which is an overly critical step in a data science analysis process. One of the naive approaches that is usually applied by data scientists consists in the transformation of the entire database before the resampling phase. However, this can cause model’ s performance to be incorrectly estimated when making predictions on unseen data. In this paper, we explored the effect of data leakage observed during our BCI experiments for device control through the real-time classification of SSVEPs (Steady State Visually Evoked Potentials). We also studied potential ways to ensure optimal validation of the classifiers during the calibration phase to avoid overfitting. The results show that the scaling step is crucial for some algorithms, and it should be applied after the resampling phase to avoid data leackage and improve results.

Keywords: data leackage, data science, machine learning, SSVEP, BCI, overfitting

Procedia PDF Downloads 153
24535 Nuclear Decay Data Evaluation for 217Po

Authors: S. S. Nafee, A. M. Al-Ramady, S. A. Shaheen

Abstract:

Evaluated nuclear decay data for the 217Po nuclide ispresented in the present work. These data include recommended values for the half-life T1/2, α-, β--, and γ-ray emission energies and probabilities. Decay data from 221Rn α and 217Bi β—decays are presented. Q(α) has been updated based on the recent published work of the Atomic Mass Evaluation AME2012. In addition, the logft values were calculated using the Logft program from the ENSDF evaluation package. Moreover, the total internal conversion electrons has been calculated using Bricc program. Meanwhile, recommendation values or the multi-polarities have been assigned based on recently measurement yield a better intensity balance at the 254 keV and 264 keV gamma transitions.

Keywords: nuclear decay data evaluation, mass evaluation, total converison coefficients, atomic mass evaluation

Procedia PDF Downloads 433
24534 Amplitude Versus Offset (AVO) Modeling as a Tool for Seismic Reservoir Characterization of the Semliki Basin

Authors: Hillary Mwongyera

Abstract:

The Semliki basin has become a frontier for petroleum exploration in recent years. Exploration efforts have resulted into extensive seismic data acquisition and drilling of three wells namely; Turaco 1, Turaco 2 and Turaco 3. A petrophysical analysis of the Turaco 1 well was carried out to identify two reservoir zones on which AVO modeling was performed. A combination of seismic modeling and rock physics modeling was applied during reservoir characterization and monitoring to determine variations of seismic responses with amplitude characteristics. AVO intercept gradient analysis applied on AVO synthetic CDP gathers classified AVO anomalies associated with both reservoir zones as Class 1 AVO anomalies. Fluid replacement modeling was carried out on both reservoir zones using homogeneous mixing and patchy saturation patterns to determine effects of fluid substitution on rock property interactions. For both homogeneous mixing and saturation patterns, density (ρ) showed an increasing trend with increasing brine substitution while Shear wave velocity (Vs) decreased with increasing brine substitution. A study of compressional wave velocity (Vp) with increasing brine substitution for both homogeneous mixing and patchy saturation gave quite interesting results. During patchy saturation, Vp increased with increasing brine substitution. During homogeneous mixing however, Vp showed a slightly decreasing trend with increasing brine substitution but increased tremendously towards and at full brine saturation. A sensitivity analysis carried out showed that density was a very sensitive rock property responding to brine saturation except at full brine saturation during homogeneous mixing where Vp showed greater sensitivity with brine saturation. Rock physics modeling was performed to predict diagnostics of reservoir quality using an inverse deterministic approach which showed low shale content and a high degree of shale stiffness within reservoir zones.

Keywords: Amplitude Versus Offset (AVO), fluid replacement modelling, reservoir characterization, AVO attributes, rock physics modelling, reservoir monitoring

Procedia PDF Downloads 531
24533 Geographic Information System Using Google Fusion Table Technology for the Delivery of Disease Data Information

Authors: I. Nyoman Mahayasa Adiputra

Abstract:

Data in the field of health can be useful for the purposes of data analysis, one example of health data is disease data. Disease data is usually in a geographical plot in accordance with the area. Where the data was collected, in the city of Denpasar, Bali. Disease data report is still published in tabular form, disease information has not been mapped in GIS form. In this research, disease information in Denpasar city will be digitized in the form of a geographic information system with the smallest administrative area in the form of district. Denpasar City consists of 4 districts of North Denpasar, East Denpasar, West Denpasar and South Denpasar. In this research, we use Google fusion table technology for map digitization process, where this technology can facilitate from the administrator and from the recipient information. From the administrator side of the input disease, data can be done easily and quickly. From the receiving end of the information, the resulting GIS application can be published in a website-based application so that it can be accessed anywhere and anytime. In general, the results obtained in this study, divided into two, namely: (1) Geolocation of Denpasar and all of Denpasar districts, the process of digitizing the map of Denpasar city produces a polygon geolocation of each - district of Denpasar city. These results can be utilized in subsequent GIS studies if you want to use the same administrative area. (2) Dengue fever mapping in 2014 and 2015. Disease data used in this study is dengue fever case data taken in 2014 and 2015. Data taken from the profile report Denpasar Health Department 2015 and 2016. This mapping can be useful for the analysis of the spread of dengue hemorrhagic fever in the city of Denpasar.

Keywords: geographic information system, Google fusion table technology, delivery of disease data information, Denpasar city

Procedia PDF Downloads 129
24532 Inclusive Practices in Health Sciences: Equity Proofing Higher Education Programs

Authors: Mitzi S. Brammer

Abstract:

Given that the cultural make-up of programs of study in institutions of higher learning is becoming increasingly diverse, much has been written about cultural diversity from a university-level perspective. However, there are little data in the way of specific programs and how they address inclusive practices when teaching and working with marginalized populations. This research study aimed to discover baseline knowledge and attitudes of health sciences faculty, instructional staff, and students related to inclusive teaching/learning and interactions. Quantitative data were collected via an anonymous online survey (one designed for students and another designed for faculty/instructional staff) using a web-based program called Qualtrics. Quantitative data were analyzed amongst the faculty/instructional staff and students, respectively, using descriptive and comparative statistics (t-tests). Additionally, some participants voluntarily engaged in a focus group discussion in which qualitative data were collected around these same variables. Collecting qualitative data to triangulate the quantitative data added trustworthiness to the overall data. The research team analyzed collected data and compared identified categories and trends, comparing those data between faculty/staff and students, and reported results as well as implications for future study and professional practice.

Keywords: inclusion, higher education, pedagogy, equity, diversity

Procedia PDF Downloads 67
24531 An Analysis of Sequential Pattern Mining on Databases Using Approximate Sequential Patterns

Authors: J. Suneetha, Vijayalaxmi

Abstract:

Sequential Pattern Mining involves applying data mining methods to large data repositories to extract usage patterns. Sequential pattern mining methodologies used to analyze the data and identify patterns. The patterns have been used to implement efficient systems can recommend on previously observed patterns, in making predictions, improve usability of systems, detecting events, and in general help in making strategic product decisions. In this paper, identified performance of approximate sequential pattern mining defines as identifying patterns approximately shared with many sequences. Approximate sequential patterns can effectively summarize and represent the databases by identifying the underlying trends in the data. Conducting an extensive and systematic performance over synthetic and real data. The results demonstrate that ApproxMAP effective and scalable in mining large sequences databases with long patterns.

Keywords: multiple data, performance analysis, sequential pattern, sequence database scalability

Procedia PDF Downloads 340
24530 Optimizing Exposure Parameters in Digital Mammography: A Study in Morocco

Authors: Talbi Mohammed, Oustous Aziz, Ben Messaoud Mounir, Sebihi Rajaa, Khalis Mohammed

Abstract:

Background: Breast cancer is the leading cause of death for women around the world. Screening mammography is the reference examination, due to its sensitivity for detecting small lesions and micro-calcifications. Therefore, it is essential to ensure quality mammographic examinations with the most optimal dose. These conditions depend on the choice of exposure parameters. Clinically, practices must be evaluated in order to determine the most appropriate exposure parameters. Material and Methods: We performed our measurements on a mobile mammography unit (PLANMED Sofie-classic.) in Morocco. A solid dosimeter (AGMS Radcal) and a MTM 100 phantom allow to quantify the delivered dose and the image quality. For image quality assessment, scores are defined by the rate of visible inserts (MTM 100 phantom), obtained and compared for each acquisition. Results: The results show that the parameters of the mammography unit on which we have made our measurements can be improved in order to offer a better compromise between image quality and breast dose. The last one can be reduced up from 13.27% to 22.16%, while preserving comparable image quality.

Keywords: Mammography, Breast Dose, Image Quality, Phantom

Procedia PDF Downloads 172
24529 Architectural Engineering and Executive Design: Modelling Procedures, Scientific Tools, Simulation Processing

Authors: Massimiliano Nastri

Abstract:

The study is part of the scientific references on executive design in engineering and architecture, understood as an interdisciplinary field aimed at anticipating and simulating, planning and managing, guiding and instructing construction operations on site. On this basis, the study intends to provide an analysis of a theoretical, methodological, and guiding character aimed at constituting the disciplinary sphere of the executive design, often in the absence of supporting methodological and procedural guidelines in engineering and architecture. The basic methodologies of the study refer to the investigation of the theories and references that can contribute to constituting the scenario of the executive design as the practice of modelling, visualization, and simulation of the construction phases, through the practices of projection of the pragmatic issues of the building. This by proposing a series of references, interrelations, and openings intended to support (for intellectual, procedural, and applicative purposes) the executive definition of the project, aimed at activating the practices of cognitive acquisition and realization intervention within reality.

Keywords: modelling and simulation technology, executive design, discretization of the construction, engineering design for building

Procedia PDF Downloads 78
24528 Medical Knowledge Management since the Integration of Heterogeneous Data until the Knowledge Exploitation in a Decision-Making System

Authors: Nadjat Zerf Boudjettou, Fahima Nader, Rachid Chalal

Abstract:

Knowledge management is to acquire and represent knowledge relevant to a domain, a task or a specific organization in order to facilitate access, reuse and evolution. This usually means building, maintaining and evolving an explicit representation of knowledge. The next step is to provide access to that knowledge, that is to say, the spread in order to enable effective use. Knowledge management in the medical field aims to improve the performance of the medical organization by allowing individuals in the care facility (doctors, nurses, paramedics, etc.) to capture, share and apply collective knowledge in order to make optimal decisions in real time. In this paper, we propose a knowledge management approach based on integration technique of heterogeneous data in the medical field by creating a data warehouse, a technique of extracting knowledge from medical data by choosing a technique of data mining, and finally an exploitation technique of that knowledge in a case-based reasoning system.

Keywords: data warehouse, data mining, knowledge discovery in database, KDD, medical knowledge management, Bayesian networks

Procedia PDF Downloads 395
24527 Texture Identification Using Vision System: A Method to Predict Functionality of a Component

Authors: Varsha Singh, Shraddha Prajapati, M. B. Kiran

Abstract:

Texture identification is useful in predicting the functionality of a component. Many of the existing texture identification methods are of contact in nature, which limits its measuring speed. These contact measurement techniques use a diamond stylus and the diamond stylus being sharp going to damage the surface under inspection and hence these techniques can be used in statistical sampling. Though these contact methods are very accurate, they do not give complete information for full characterization of surface. In this context, the presented method assumes special significance. The method uses a relatively low cost vision system for image acquisition. Software is developed based on wavelet transform, for analyzing texture images. Specimens are made using different manufacturing process (shaping, grinding, milling etc.) During experimentation, the specimens are illuminated using proper lighting and texture images a capture using CCD camera connected to the vision system. The software installed in the vision system processes these images and subsequently identify the texture of manufacturing processes.

Keywords: diamond stylus, manufacturing process, texture identification, vision system

Procedia PDF Downloads 289