Search results for: data mining analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25015

22885 A Survey in Techniques for Imbalanced Intrusion Detection System Datasets

Authors: Najmeh Abedzadeh, Matthew Jacobs

Abstract:

An intrusion detection system (IDS) is a software application that monitors for malicious activities and generates alerts if any are detected. However, most network activities in IDS datasets are normal, and the relatively small number of attacks makes the available data imbalanced. Consequently, cyber-attacks can hide among a large number of normal activities, and machine learning algorithms have difficulty learning and classifying the data correctly. In this paper, a comprehensive literature review is conducted on the different types of algorithms used both for implementing an IDS and for correcting imbalanced IDS datasets. The most widely used approaches are machine learning (ML), deep learning (DL), the synthetic minority over-sampling technique (SMOTE), and reinforcement learning (RL). Most of the studies use the CSE-CIC-IDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets to evaluate their algorithms.
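
As an illustration of the over-sampling idea named in this abstract, the following is a minimal SMOTE-style sketch, not the surveyed papers' implementation. It assumes purely numeric features; the function name, the toy "attack" records, and the parameter defaults are hypothetical.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical toy data: three "attack" records with two numeric features each.
attacks = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4)]
new_points = smote_like_oversample(attacks, n_new=4)
```

Each synthetic record lies on the segment between two real minority records, so the minority class grows without exact duplication.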

Keywords: IDS, imbalanced datasets, sampling algorithms, big data

Procedia PDF Downloads 303
22884 Tourism Satellite Account: Approach and Information System Development

Authors: Pappas Theodoros, Mihail Diakomihalis

Abstract:

Measuring the economic impact of tourism in a benchmark economy is a global concern, with previous measurements being partial and not fully integrated. Tourism is a phenomenon driven by the individual consumption of visitors, which should be observed and measured to reveal the overall contribution of tourism to an economy. The Tourism Satellite Account (TSA) is a critical tool for assessing the annual growth of tourism, providing reliable measurements. This article introduces a TSA information system that encompasses all the tasks of the TSA, including the input, storage, management, and analysis of data, as well as additional future functions, and that enhances the efficiency of tourism data management and the utility of TSA data collection. The methodology and results presented offer insights into the development and implementation of the TSA.

Keywords: tourism satellite account, information system, data-based tourist account, relational database

Procedia PDF Downloads 62
22883 Interoperable Platform for Internet of Things at Home Applications

Authors: Fabiano Amorim Vaz, Camila Gonzaga de Araujo

Abstract:

With the growing number of personal devices such as smartphones, tablets, and smart watches, in addition to recent devices designed for the IoT, the residential environment has the potential to generate important information about our daily lives. This work therefore focuses on presenting and evaluating a system that integrates all these technologies in the context of a smart house. To achieve this, we define an architecture capable of supporting the amount of data generated and consumed at a residence and, above all, the variety of that data. We organize the data in a particular cloud containing information about robots, recreational vehicles, and weather, in addition to data from the house itself, such as lighting, energy, and security. The proposed architecture can be extrapolated to various scenarios and applications. Building on the core of this work, new functionality can be defined for residences, integrating them with more resources.

Keywords: cloud computing, IoT, robotics, smart house

Procedia PDF Downloads 364
22882 Visualization Tool for EEG Signal Segmentation

Authors: Sweeti, Anoop Kant Godiyal, Neha Singh, Sneh Anand, B. K. Panigrahi, Jayasree Santhosh

Abstract:

This work develops a tool for the visualization and segmentation of electroencephalograph (EEG) signals based on frequency domain features. Changes in the frequency domain characteristics are correlated with changes in the mental state of the subject under study. The proposed algorithm provides a way to represent changes in mental state using the powers of different frequency bands in the form of a segmented EEG signal. Many segmentation algorithms used for data classification have been suggested in the literature, with applications in brain-computer interfaces, epilepsy, and cognition studies. The proposed method, however, focuses mainly on better presentation of the signal, and could therefore be a useful tool for clinicians. The algorithm performs basic filtering using band-pass and notch filters in the range of 0.1-45 Hz. Advanced filtering is then performed by principal component analysis and a wavelet-transform-based de-noising method. Frequency domain features are used for segmentation, since the spectral power of different frequency bands describes the mental state of the subject. Two sliding windows are then used for segmentation: one provides the time scale and the other assigns the segmentation rule. The segmented data are displayed successively, second by second, with different color codes. The segment length can be selected according to the objective. The proposed algorithm has been tested on an EEG data set obtained from the online data repository of the University of California, San Diego. The proposed tool gives a better visualization of the signal in the form of segmented epochs of the desired length, representing the power spectrum variation in the data. The algorithm is designed to take the data points with respect to the sampling frequency for each time frame, so it can be extended for use in real-time visualization with the desired epoch length.

Keywords: de-noising, multi-channel data, PCA, power spectra, segmentation

Procedia PDF Downloads 384
22881 Identification of Factors and Impacts on the Success of Implementing Extended Enterprise Resource Planning: Case Study of Manufacturing Industries in East Java, Indonesia

Authors: Zeplin Jiwa Husada Tarigan, Sautma Ronni Basana, Widjojo Suprapto

Abstract:

ERP integrates all data from the various departments within a company into one database. One department inputs the data, and many other departments can access and use the data through the connected information system. As many manufacturing companies in Indonesia implement ERP technology, many adjustments have to be made to align it with the business processes of the companies, especially management policy and competitive advantages. Companies that are successful in the initial implementation still have to maintain the system so that the initial success keeps pace with business development and with changes in the business processes of the company. The continued success of the extended ERP implementation aims to achieve efficient and effective performance for the company. This research distributed 100 questionnaires to manufacturing companies in East Java, Indonesia, that had implemented ERP and had been live for over five years. Ninety questionnaires were returned, of which ten were disqualified because they came from companies that had implemented ERP for less than five years. The remaining 80 questionnaires were used as the data, a response rate of 80%. Based on the results of the analysis with PLS (Partial Least Squares), organization commitment has an impact on user effectiveness and on the provision of adequate IT infrastructure. User effectiveness has an impact on adequate IT infrastructure. The information quality of the company increases the implementation of the extended ERP in manufacturing companies in East Java, Indonesia.

Keywords: organization commitment, adequate IT infrastructure, information quality, extended ERP implementation

Procedia PDF Downloads 153
22880 Enhancement Method of Network Traffic Anomaly Detection Model Based on Adversarial Training With Category Tags

Authors: Zhang Shuqi, Liu Dan

Abstract:

Intelligent network traffic anomaly detection models suffer from problems such as low detection accuracy caused by a lack of training samples and poor detection of attacks with few samples. To address these problems, a classification model enhancement method, F-ACGAN (Flow Auxiliary Classifier Generative Adversarial Network), which introduces a generative adversarial network and adversarial training, is proposed. Generating adversarial data with category labels can enhance the training effect and improve classification accuracy and model robustness. F-ACGAN consists of three steps. First, feature preprocessing is performed, including data type conversion, dimensionality reduction, and normalization. Second, a generative adversarial network model with feature learning ability is designed, and its sample generation is improved through adversarial iterations between the generator and the discriminator. Third, an adversarial disturbance factor in the gradient direction of the classification model is added to improve the diversity and adversarial nature of the generated data and to encourage the model to learn adversarial classification features. An experiment constructing a classification model on the UNSW-NB15 dataset shows that, when the basic model is enhanced with F-ACGAN, classification accuracy improves by 8.09% and the F1 score improves by 6.94%.

Keywords: data imbalance, GAN, ACGAN, anomaly detection, adversarial training, data augmentation

Procedia PDF Downloads 88
22879 Strategies Used by the Saffron Producers of Taliouine (Morocco) to Adapt to Climate Change

Authors: Aziz Larbi, Widad Sadok

Abstract:

In Morocco, the mountainous regions extend over about 26% of the national territory where 30% of the total population live. They contain opportunities for agriculture, forestry, pastureland and mining. The production systems in these zones are characterised by crop diversification. However, these areas have become vulnerable to the effects of climate change. To understand these effects in relation to the population living in these areas, a study was carried out in the zone of Taliouine, in the Anti-Atlas. The vulnerability of crop productions to climate change was analysed and the different ways of adaptation adopted by farmers were identified. The work was done on saffron, the most profitable crop in the target area even though it requires much water. Our results show that the majority of the farmers surveyed had noticed variations in the climate of the region: irregularity of precipitation leading to a decrease in quantity and an uneven distribution throughout the year; rise in temperature; reduction in the cold period and less snow. These variations had impacts on the cropping system of saffron and its productivity. To cope with these effects, the farmers adopted various strategies: better management and use of water; diversification of agricultural activities; increase in the contribution of non-agricultural activities to their gross income; and seasonal migration.

Keywords: climate change, Taliouine, saffron, perceptions, adaptation strategies

Procedia PDF Downloads 47
22878 IoT Based Monitoring Temperature and Humidity

Authors: Jay P. Sipani, Riki H. Patel, Trushit Upadhyaya

Abstract:

Today there is a demand to monitor environmental factors in almost all research institutes and industries, and even for domestic use. Analog data measurement requires manual effort to note readings and is therefore prone to human error. Such systems fail to provide and store precise parameter values with high accuracy, and analog systems also have limited storage/memory. There is therefore a need for a smart system that is fully automated, accurate, and capable of monitoring all the environmental parameters with the highest possible accuracy. It should also be cost-effective and portable. This paper presents Wireless Sensor (WS) data communication using a DHT11 sensor, an Arduino, a SIM900A GSM module, a mobile device, and a Liquid Crystal Display (LCD). The experimental setup includes a heating arrangement for the DHT11 and transmission of its data using the Arduino and the SIM900A GSM shield. The mobile device receives the data via the Arduino and GSM shield, and the data are also displayed on the LCD. The heating arrangement is used to heat and cool the temperature sensor in order to study its characteristics.

Keywords: wireless communication, Arduino, DHT11, LCD, SIM900A GSM module, mobile phone SMS

Procedia PDF Downloads 265
22877 Detect Cable Force of Cable Stayed Bridge from Accelerometer Data of SHM as Real Time

Authors: Nguyen Lan, Le Tan Kien, Nguyen Pham Gia Bao

Abstract:

The cable-stayed bridge is a combined structural system in which the cables are a major structural element. Cable-stayed bridges with large spans are often equipped with structural health monitoring systems to collect data for bridge health diagnosis. Cable tension monitoring is one component of structural monitoring. Cable tension is commonly measured either with a direct force sensor or with a cable vibration accelerometer, in which case the tension is inferred indirectly from the cable vibration frequency. Translating cable vibration acceleration data into real-time tension requires some calculation and programming. This paper introduces an algorithm and a LabVIEW program that convert cable vibration acceleration data into real-time tension. The research results are applied to the monitoring systems of the Tran Thi Ly and Song Hieu cable-stayed bridges in Vietnam.
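
The abstract does not state which frequency-to-tension relation the program uses. As a hedged sketch, the classical taut-string approximation T = 4 m L^2 (f_n / n)^2 is shown below; it ignores bending stiffness and sag, and assumes the natural frequency f_n has already been picked from the acceleration spectrum (for example, as an FFT peak). The function name and the numeric values are hypothetical.

```python
def cable_tension(freq_hz, mode, length_m, mass_per_m):
    """Taut-string estimate of cable tension in newtons.

    freq_hz    -- measured natural frequency of the given mode (Hz)
    mode       -- mode number n (1 for the fundamental)
    length_m   -- free cable length L (m)
    mass_per_m -- cable mass per unit length m (kg/m)
    """
    if mode < 1:
        raise ValueError("mode number must be >= 1")
    return 4.0 * mass_per_m * length_m ** 2 * (freq_hz / mode) ** 2

# Hypothetical cable: 100 m long, 50 kg/m, fundamental frequency 1.0 Hz.
tension_n = cable_tension(freq_hz=1.0, mode=1, length_m=100.0, mass_per_m=50.0)
```

Under this approximation the ratio f_n / n is the same for every mode, so several spectral peaks can be combined to cross-check a single tension estimate.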

Keywords: cable-stayed bridge, cable force, structural health monitoring (SHM), fast Fourier transform (FFT), real time, vibrations

Procedia PDF Downloads 53
22876 Impacts of Building Design Factors on Auckland School Energy Consumptions

Authors: Bin Su

Abstract:

This study focuses on the impact of school building design factors on winter extra energy consumption, which mainly includes space heating, water heating, and other appliances related to winter indoor thermal conditions. A number of Auckland schools were randomly selected for the study, which introduces a method of using real monthly energy consumption data for a year to calculate the winter extra energy data of school buildings. The study seeks to identify the relationships between the winter extra energy data and school building design data related to the main architectural features, building envelope, and elements of the sample schools. The relationships can be used to estimate the approximate saving in winter extra energy consumption that would result from a changed design datum in future school development, and to identify any major energy-efficiency design problems. The relationships are also valuable for developing passive design guides for school energy efficiency.

Keywords: building energy efficiency, building thermal design, building thermal performance, school building design

Procedia PDF Downloads 428
22875 The Meta–Evaluation of Master Degree Theses in Science Program of Evaluation Methodology, Srinakharinwirot University

Authors: Panwasn Mahalawalert

Abstract:

The objective of this study was to conduct a meta-evaluation of Master Degree theses in the Science Program of Evaluation Methodology at Srinakharinwirot University published during 2008-2011. The study was a summative meta-evaluation that evaluated all theses of the program. Data were collected using a thesis characteristics recording form and a meta-evaluation checklist. The collected data were analyzed in two parts: 1) quantitative data were analyzed by descriptive statistics presented as frequencies, percentages, means, and standard deviations, and 2) qualitative data were analyzed by content analysis. Regarding thesis characteristics, the results revealed that most of the theses were published in 2011. The largest group of thesis researchers were female and came from government offices. All theses used the Decision-Oriented Evaluation Model, and the objective of all theses was to evaluate a project or curriculum. The most commonly used sampling technique was multistage random sampling, and the most commonly used data-gathering tool was the questionnaire. All of the theses were analysed by descriptive statistics. The meta-evaluation results revealed that most of the theses were rated fair on the Utility Standards and Feasibility Standards, and good on the Propriety Standards and Accuracy Standards.

Keywords: meta-evaluation, evaluation, master degree theses, Srinakharinwirot University

Procedia PDF Downloads 523
22874 Re-Stating the Origin of Tetrapod Using Measures of Phylogenetic Support for Phylogenomic Data

Authors: Yunfeng Shan, Xiaoliang Wang, Youjun Zhou

Abstract:

Whole-genome data from two lungfish species, along with other species, present a valuable opportunity to re-investigate the longstanding debate regarding the evolutionary relationships among tetrapods, lungfishes, and coelacanths. However, the use of bootstrap support has become outdated for large-scale phylogenomic data. Without robust phylogenetic support, the phylogenetic trees become meaningless. Therefore, it is necessary to re-evaluate the phylogenies of tetrapods, lungfishes, and coelacanths using novel measures of phylogenetic support specifically designed for phylogenomic data, as the previous phylogenies were based on 100% bootstrap support. Our findings consistently provide strong evidence favoring lungfish as the closest living relative of tetrapods. This conclusion is based on high internode certainty, relative gene support, and high gene concordance factor. The evidence stems from five previous datasets derived from lungfish transcriptomes. These results yield fresh insights into the three hypotheses regarding the phylogenies of tetrapods, lungfishes, and coelacanths. Importantly, these hypotheses are not mere conjectures but are substantiated by a significant number of genes. Analyzing real biological data further demonstrates that the inclusion of additional taxa leads to more diverse tree topologies. Consequently, gene trees and species trees may not be identical even when whole-genome sequencing data is utilized. However, it is worth noting that many gene trees can accurately reflect the species tree if an appropriate number of taxa, typically ranging from six to ten, are sampled. Therefore, it is crucial to carefully select the number of taxa and an appropriate outgroup, such as slow-evolving species, while excluding fast-evolving taxa as outgroups to mitigate the adverse effects of long-branch attraction and achieve an accurate reconstruction of the species tree. This is particularly important as more whole-genome sequencing data becomes available.

Keywords: novel measures of phylogenetic support for phylogenomic data, gene concordance factor confidence, relative gene support, internode certainty, origin of tetrapods

Procedia PDF Downloads 47
22873 Predicting Daily Patient Hospital Visits Using Machine Learning

Authors: Shreya Goyal

Abstract:

The study aims to build user-friendly software that uses a machine learning algorithm to understand patient arrival patterns and compute the number of potential patients who will visit a particular health facility in a given period. The underlying machine learning algorithm used in this study is the Support Vector Machine (SVM). Accurate prediction of patient arrivals allows hospitals to operate more effectively, providing timely and efficient care while optimizing resources and improving patient experience. It allows for better allocation of staff, equipment, and other resources. If a surge in patients is projected, additional staff or resources can be allocated to handle the influx, preventing bottlenecks or delays in care. Understanding patient arrival patterns can also help streamline processes to minimize waiting times and ensure timely access to care for patients in need. Another major advantage of this software is that it helps hospitals adhere to strict data protection regulations such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States: the hospital does not have to share the data with any third party or upload it to the cloud, because the software reads data locally from the machine. The data needs to be arranged in a particular format, and the software will then be able to read the data and provide meaningful output. Using software that operates locally can facilitate compliance with these regulations by minimizing data exposure. Keeping patient data within the hospital's local systems reduces the risk of unauthorized access or breaches associated with transmitting data over networks or storing it on external servers. This can help maintain the confidentiality and integrity of sensitive patient information. Historical patient data is used in this study. The input variables used to train the model include patient age, time of day, day of the week, seasonal variations, and local events. The algorithm uses a supervised learning method to optimize the objective function and find the global minimum. It stores the value of the local minimum after each iteration and, at the end, compares all the local minima to find the global minimum. The strength of this study is the transfer function used to calculate the number of patients. The model has an output accuracy of >95%. The method proposed in this study could be used for better management planning of personnel and medical resources.

Keywords: machine learning, SVM, HIPAA, data

Procedia PDF Downloads 54
22872 Analyzing Keyword Networks for the Identification of Correlated Research Topics

Authors: Thiago M. R. Dias, Patrícia M. Dias, Gray F. Moita

Abstract:

The production and publication of scientific works have increased significantly in recent years, with the Internet being the main channel for access to and distribution of these works. As a result, there is a growing interest in understanding how scientific research has evolved, so that this knowledge can be used to encourage research groups to become more productive. The objective of this work is therefore to explore repositories containing data from scientific publications and to characterize the keyword networks of these publications, in order to identify the most relevant keywords and to highlight those that have the greatest impact on the network. To do this, the keywords of each article in the study repository are extracted and used to characterize the network, after which several social network analysis metrics are applied to identify the prominent keywords.
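
One simple way to picture the keyword network described here is a co-occurrence graph, in which two keywords are linked when they appear on the same article. The sketch below is an assumption about the construction, not the authors' code; degree is used as a stand-in for the unspecified social network analysis metrics, and the article data is invented for illustration.

```python
from collections import Counter
from itertools import combinations

def keyword_network(articles):
    """Return a Counter mapping each keyword pair to its co-occurrence count."""
    edges = Counter()
    for keywords in articles:
        # Sort so each unordered pair is always stored in the same order.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges

def degree(edges):
    """Count the distinct neighbours of each keyword in the network."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Hypothetical keyword lists, one per article.
articles = [
    ["bibliometrics", "data analysis", "scientometrics"],
    ["data analysis", "data integration"],
    ["bibliometrics", "data analysis"],
]
edges = keyword_network(articles)
ranking = degree(edges).most_common()
```

Here "data analysis" ranks first because it is linked to three other keywords, which is the kind of highlighted keyword the abstract refers to.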

Keywords: bibliometrics, data analysis, extraction and data integration, scientometrics

Procedia PDF Downloads 242
22871 A New Approach towards the Development of Next Generation CNC

Authors: Yusri Yusof, Kamran Latif

Abstract:

Computer Numerical Control (CNC) machines have been widely used in industry since their inception. Currently, CNC technology is used for various operations such as milling, drilling, packing, and welding. With the rapid growth of the manufacturing world, the demand for flexibility in CNC machines has rapidly increased. Previously, commercial CNC failed to provide flexibility because its closed structure did not give access to the inner features of the CNC. The ISO data interface model on which CNC operates was also found to be limited. To overcome these problems, Open Architecture Control (OAC) technology and the STEP-NC data interface model were introduced. At present, the Personal Computer (PC) is the best platform for the development of open-CNC systems. In this paper, the interpretation, verification, and execution of the ISO data interface model are highlighted with the introduction of new techniques. The proposed system is composed of ISO data interpretation, 3D simulation, and machine motion control modules. The system is tested on an old 3-axis CNC milling machine, and the results are found to be satisfactory in performance. This implementation has successfully enabled a sustainable manufacturing environment.

Keywords: CNC, ISO 6983, ISO 14649, LabVIEW, open architecture control, reconfigurable manufacturing systems, sustainable manufacturing, Soft-CNC

Procedia PDF Downloads 503
22870 A Study on the Establishment of a 4-Joint Based Motion Capture System and Data Acquisition

Authors: Kyeong-Ri Ko, Seong Bong Bae, Jang Sik Choi, Sung Bum Pan

Abstract:

A simple method for testing the posture imbalance of the human body is to check for differences in the bilateral shoulder and pelvic height of the target. In this paper, to check for spinal disorders, the authors study ways to establish a motion capture system that obtains and expresses the motions of 4 joints, and to acquire data based on this system. The 4 sensors are attached to both shoulders and the pelvis. To verify the established system, the normal and abnormal postures of targets listening to a lecture were obtained using the established 4-joint based motion capture system. The results confirmed that the motions made by the targets were identical to the 3-dimensional simulation.

Keywords: inertial sensor, motion capture, motion data acquisition, posture imbalance

Procedia PDF Downloads 503
22869 Urban Change Detection and Pattern Analysis Using Satellite Data

Authors: Shivani Jha, Klaus Baier, Rafiq Azzam, Ramakar Jha

Abstract:

In India, people generally migrate from rural areas to urban areas for better infrastructure, a higher standard of living, good job opportunities, and advanced transport and communication. However, unplanned urban development due to this migration causes serious damage to land use and available water resources, as well as water pollution. In the present work, an attempt has been made to use satellite data from different years for urban change detection in Chennai metropolitan city, along with pattern analysis to generate future scenarios of urban development using buffer zoning in a GIS environment. The analysis uses SRTM (30 m) elevation data and IRS-1C satellite data for the years 1990, 2000, and 2014. The flow accumulation, aspect, flow direction, and slope maps developed using the SRTM 30 m data are very useful for finding suitable locations for industrial setup and urban settlements. The normalized difference vegetation index (NDVI) and Principal Component Analysis (PCA) have been used in ERDAS Imagine software for change detection in the land use of Chennai metropolitan city. It has been observed that the urban area of Chennai metropolitan city has increased exponentially, with a significant decrease in agricultural and barren lands. However, the water bodies located in the study region are protected and are used as freshwater for drinking purposes. Using buffer zone analysis in a GIS environment, it has been observed that development has taken place significantly in the south-west direction and will continue to do so in future.
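
For readers unfamiliar with NDVI, it is computed per pixel as (NIR - Red) / (NIR + Red), and differencing it between two dates is one common way to flag vegetation loss. The sketch below only illustrates that arithmetic; it is not the ERDAS Imagine workflow used in the study, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total

def ndvi_change(before, after):
    """Per-pixel NDVI difference (after minus before); pixels are (nir, red)."""
    return [ndvi(*a) - ndvi(*b) for b, a in zip(before, after)]

# Hypothetical reflectances: a vegetated pixel that later becomes built-up.
before = [(0.5, 0.1)]
after = [(0.3, 0.3)]
change = ndvi_change(before, after)
```

A strongly negative change, as for this pixel, indicates a drop in vegetation cover between the two dates.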

Keywords: urban change, satellite data, the Chennai metropolis, change detection

Procedia PDF Downloads 389
22868 HelpMeBreathe: A Web-Based System for Asthma Management

Authors: Alia Al Rayssi, Mahra Al Marar, Alyazia Alkhaili, Reem Al Dhaheri, Shayma Alkobaisi, Hoda Amer

Abstract:

We present in this paper a web-based system called “HelpMeBreathe” for managing asthma. The proposed system provides analytical tools that allow a better understanding of the environmental triggers of asthma and hence better support for data-driven decision making. The system sends warning messages to a specific asthma patient if the weather in his/her area might cause any difficulty in breathing or could trigger an asthma attack. HelpMeBreathe collects, stores, and analyzes individuals’ moving trajectories and health conditions as well as environmental data. It then processes and displays the patients’ data through an analytical tool that supports effective decision making by physicians and other decision makers.

Keywords: asthma, environmental triggers, map interface, web-based systems

Procedia PDF Downloads 287
22867 Geographic Information Systems and Remotely Sensed Data for the Hydrological Modelling of Mazowe Dam

Authors: Ellen Nhedzi Gozo

Abstract:

Unavailability of adequate hydro-meteorological data has always limited the analysis and understanding of the hydrological behaviour of several dam catchments, including Mazowe Dam in Zimbabwe. The problem of insufficient data for Mazowe Dam catchment analysis was solved by extracting catchment characteristics and aerial hydro-meteorological data from ASTER, LANDSAT, and Shuttle Radar Topography Mission (SRTM) remote sensing (RS) images using ILWIS, ArcGIS, and ERDAS Imagine geographic information systems (GIS) software. Available observed hydrological and meteorological data complemented the remotely sensed information. Ground truth land cover was mapped using a Garmin Etrex global positioning system (GPS) device. This information was then used to validate the land cover classification obtained from the remote sensing images. A bathymetry survey was conducted using a SONAR system connected to GPS. Hydrological modelling using the HBV model was then performed to simulate the hydrological process of the catchment in an effort to verify the reliability of the derived parameters. The model output shows a high Nash-Sutcliffe coefficient, close to 1, indicating that the parameters derived from remote sensing and GIS can be applied with confidence in the analysis of Mazowe Dam catchment.
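
The Nash-Sutcliffe coefficient cited above is defined as 1 minus the ratio of the squared simulation error to the variance of the observations about their mean. A minimal sketch follows; the function name and the flow values are illustrative only and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - error / spread

# Hypothetical observed flows compared with two simulations.
obs = [1.0, 2.0, 3.0, 4.0]
perfect = nash_sutcliffe(obs, obs)
mean_only = nash_sutcliffe(obs, [2.5, 2.5, 2.5, 2.5])
```

A simulation that merely reproduces the observed mean scores 0, which is why values close to 1 are read as evidence that the derived parameters are reliable.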

Keywords: geographic information systems, hydrological modelling, remote sensing, water resources management

Procedia PDF Downloads 319
22866 A Bayesian Model with Improved Prior in Extreme Value Problems

Authors: Eva L. Sanjuán, Jacinto Martín, M. Isabel Parra, Mario M. Pizarro

Abstract:

In Extreme Value Theory, inference on the parameters of the distribution is made using only a small part of the observed values. When block maxima are taken, many data are discarded. We developed a new Bayesian inference model that makes use of all the information provided by the data, introducing informative priors and using the relations between baseline and limit parameters. Firstly, we studied the accuracy of the new model for three baseline distributions that lead to a Gumbel extreme distribution: Exponential, Normal and Gumbel. Secondly, we considered mixtures of Normal variables to simulate practical situations in which data do not fit pure distributions because of perturbations (noise).
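
To make the two ideas in this abstract concrete, the sketch below shows how block maxima discard most observations, and one textbook baseline-to-limit relation: the maximum of n independent Exponential(rate) variables is approximately Gumbel with location ln(n)/rate and scale 1/rate. Whether the authors use exactly this relation is an assumption; the function names and data are hypothetical.

```python
import math

def block_maxima(data, block_size):
    """Split data into consecutive full blocks and keep each block's maximum."""
    n_blocks = len(data) // block_size
    return [max(data[i * block_size:(i + 1) * block_size]) for i in range(n_blocks)]

def gumbel_params_from_exponential(rate, block_size):
    """Approximate Gumbel (location, scale) for the maximum of block_size
    independent Exponential(rate) variables."""
    return math.log(block_size) / rate, 1.0 / rate

# Eight observations collapse to two maxima; the trailing partial block is dropped.
maxima = block_maxima([3, 1, 4, 1, 5, 9, 2, 6], block_size=3)
loc, scale = gumbel_params_from_exponential(rate=2.0, block_size=100)
```

A relation of this kind lets information about the baseline rate, estimated from all the data, inform a prior on the Gumbel parameters rather than relying on the maxima alone.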

Keywords: Bayesian inference, extreme value theory, Gumbel distribution, highly informative prior

Procedia PDF Downloads 183
22865 Quantitative, Preservative Methodology for Review of Interview Transcripts Using Natural Language Processing

Authors: Rowan P. Martnishn

Abstract:

During the execution of a National Endowment of the Arts grant, approximately 55 interviews were collected from professionals across various fields. These interviews were used to create deliverables – historical connections for creations that began as art and evolved entirely into computing technology. With dozens of hours’ worth of transcripts to be analyzed by qualitative coders, a quantitative methodology was created to sift through the documents. The initial step was to both clean and format all the data. First, a basic spelling and grammar check was applied, as well as a Python script for normalized formatting which used an open-source grammatical formatter to make the data as coherent as possible. 10 documents were randomly selected to manually review, where words often incorrectly translated during the transcription were recorded and replaced throughout all other documents. Then, to remove all banter and side comments, the transcripts were spliced into paragraphs (separated by change in speaker) and all paragraphs with less than 300 characters were removed. Secondly, a keyword extractor, a form of natural language processing where significant words in a document are selected, was run on each paragraph for all interviews. Every proper noun was put into a data structure corresponding to that respective interview. From there, a Bidirectional and Auto-Regressive Transformer (B.A.R.T.) summary model was then applied to each paragraph that included any of the proper nouns selected from the interview. At this stage the information to review had been sent from about 60 hours’ worth of data to 20. The data was further processed through light, manual observation – any summaries which proved to fit the criteria of the proposed deliverable were selected, as well their locations within the document. This narrowed that data down to about 5 hours’ worth of processing. 
The qualitative researchers were then able to find 8 more connections in addition to our previous 4, exceeding our minimum quota of 3 to satisfy the grant. The study and the subsequent curation of this methodology raised a conceptual finding crucial to working with qualitative data of this magnitude. In the use of artificial intelligence, there is a general trade-off in a model between breadth of knowledge and specificity. If the model has too much knowledge, the user risks leaving out important data (too general). If the tool is too specific, it has not seen enough data to be useful. This methodology proposes a solution to this trade-off. The data is never altered outside of grammatical and spelling checks. Instead, the important information is marked, creating an indicator of where the significant data is without compromising its purity. In addition, the data is chunked into smaller paragraphs, giving specificity, and then cross-referenced with the keywords (allowing generalization over the whole document). This way, no data is harmed, and qualitative experts can go over the raw data instead of using highly manipulated results. Given the success in deliverable creation as well as the circumvention of this trade-off, this methodology should stand as a model for synthesizing qualitative data while maintaining its original form.
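
The paragraph-filtering and marking steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' script: the `Name: text` transcript format, the function names, and the simple substring match standing in for the keyword extractor are all assumptions made for illustration. Only the 300-character threshold comes from the abstract.

```python
import re

MIN_CHARS = 300  # threshold stated in the abstract

# Assumed transcript format: each speaker turn starts with a "Name:" label.
SPEAKER_LABEL = re.compile(r"^[A-Z][\w .'-]*:")


def split_by_speaker(transcript):
    """Group lines into paragraphs, starting a new one at each speaker label."""
    paragraphs = []
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        if SPEAKER_LABEL.match(line) or not paragraphs:
            paragraphs.append(line)
        else:
            paragraphs[-1] += " " + line
    return paragraphs


def mark_paragraphs(transcript, keywords):
    """Return (index, paragraph) pairs that are long enough and mention a keyword.

    The text itself is never modified; only positions are recorded,
    mirroring the "mark, don't alter" idea in the abstract.
    """
    marked = []
    for i, para in enumerate(split_by_speaker(transcript)):
        if len(para) < MIN_CHARS:
            continue  # treated as banter or a side comment
        if any(k.lower() in para.lower() for k in keywords):
            marked.append((i, para))
    return marked
```

In a fuller pipeline, the marked paragraphs would then be passed to a summarization model; that step is omitted here because the abstract does not specify the implementation.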

Keywords: B.A.R.T. model, keyword extractor, natural language processing, qualitative coding

Procedia PDF Downloads 6
22864 Culture and Commodification: A Study of William Gibson's The Bridge Trilogy

Authors: Aruna Bhat

Abstract:

Culture can be placed within the social structure that embodies both the creation of social groups and the manner in which they interact with each other. As many critics have pointed out, culture in the Postmodern context has often been considered a commodity, and indeed it shares many attributes with commercial products. Popular culture follows many patterns of behavior derived from Economics, from the simple principle of supply and demand to the creation of marketable demographics which fit certain criteria. This trend is clearly visible in contemporary fiction, especially in contemporary science fiction, and in Cyberpunk fiction in particular, which is an offshoot of pure science fiction. William Gibson is one such author whose works portray such a scenario, and in his The Bridge Trilogy he adds another level of interpretation to this state of affairs by describing a world that is centered on industrialization of a new kind, one that revolves around data in cyberspace. In this new world, data has become the most important commodity, and man has become nothing but a nodal point in a vast ocean of raw data, resulting in the commodification of everything, including Culture. This paper will attempt to study the presence of the above-mentioned elements in William Gibson’s The Bridge Trilogy. The theories applied will be Postmodernism and Cultural Studies.

Keywords: culture, commodity, cyberpunk, data, postmodern

Procedia PDF Downloads 489
22863 Impact of Safety and Quality Considerations of Housing Clients on the Construction Firms’ Intention to Adopt Quality Function Deployment: A Case of Construction Sector

Authors: Saif Ul Haq

Abstract:

The current study examines the safety and quality considerations of clients of housing projects and their impact on the adoption of Quality Function Deployment (QFD) by construction firms. A mixed-method research technique was used to collect and analyze the data. First, a survey was conducted to collect data from 220 clients of housing projects in Saudi Arabia. Then, telephone and Skype interviews were conducted to collect data from 15 professionals working in the top ten real estate companies of Saudi Arabia. Data were analyzed using partial least squares (PLS) and thematic analysis techniques. Findings reveal that today’s customers prioritize the safety and quality requirements of their houses and, as a result, construction firms adopt QFD to address the needs of customers. The findings are of great importance for the clients of housing projects as well as for construction firms, as they could apply QFD in housing projects to address the safety and quality concerns of their clients.

Keywords: construction industry, quality considerations, quality function deployment, safety considerations

Procedia PDF Downloads 110
22862 Customers’ Acceptability of Islamic Banking: Employees’ Perspective in Peshawar

Authors: Tahira Imtiaz, Karim Ullah

Abstract:

This paper aims to incorporate bank employees’ perspective on the acceptability of Islamic banking among customers in Peshawar. A qualitative approach is adopted, for which six in-depth interviews with employees of Islamic banks were conducted. The employees were asked to share their experience of customers’ attitudes towards the acceptability of Islamic banking. The collected data were analyzed through thematic analysis and synthesized with the current literature. Through data analysis, a theoretical framework is developed which highlights the factors that drive customers towards Islamic banking, as witnessed by the employees. The practical implication of the analyzed data is that a new model could be developed on the basis of four determinants of human preference, namely inner satisfaction, time, faith, and market forces.

Keywords: customers’ attraction, employees’ perspective, Islamic banking, Riba

Procedia PDF Downloads 319
22861 Customized Design of Amorphous Solids by Generative Deep Learning

Authors: Yinghui Shang, Ziqing Zhou, Rong Han, Hang Wang, Xiaodi Liu, Yong Yang

Abstract:

The design of advanced amorphous solids, such as metallic glasses, with targeted properties through artificial intelligence signifies a paradigmatic shift in physical metallurgy and materials technology. Here, we developed a machine-learning architecture that facilitates the generation of metallic glasses with targeted multifunctional properties. Our architecture integrates the state-of-the-art unsupervised generative adversarial network model with supervised models, allowing the incorporation of general prior knowledge derived from thousands of data points across a vast range of alloy compositions, into the creation of data points for a specific type of composition, which overcame the common issue of data scarcity typically encountered in the design of a given type of metallic glasses. Using our generative model, we have successfully designed copper-based metallic glasses, which display exceptionally high hardness or a remarkably low modulus. Notably, our architecture can not only explore uncharted regions in the targeted compositional space but also permits self-improvement after experimentally validated data points are added to the initial dataset for subsequent cycles of data generation, hence paving the way for the customized design of amorphous solids without human intervention.

Keywords: metallic glass, artificial intelligence, mechanical property, automated generation

Procedia PDF Downloads 34
22860 R Data Science for Technology Management

Authors: Sunghae Jun

Abstract:

Technology management (TM) is an important issue for a company seeking to improve its competitiveness. Among the many activities of TM, technology analysis (TA) is an important factor, because most decisions in the management of technology are based on the results of TA. TA analyzes the developed results of a target technology using statistics or Delphi. TA based on Delphi depends on the experts’ domain knowledge; in comparison, TA by statistics and machine learning algorithms uses objective data such as patents or papers instead of the experts’ knowledge. Many quantitative TA methods based on statistics and machine learning have been studied, and these have been used for technology forecasting, technological innovation, and management of technology. They have applied diverse computing tools and many analytical methods case by case. It is not easy to select suitable software and a suitable statistical method for a given TA task. In this paper, we therefore propose a methodology for quantitative TA using the statistical computing software R and data science to construct a general framework of TA. Using the results of a case study, we also show how our methodology is applied in a real field. This research contributes to R&D planning and technology valuation in TM areas.

Keywords: technology management, R system, R data science, statistics, machine learning

Procedia PDF Downloads 446
22859 Mixture Statistical Modeling for Predicting Mortality in Human Immunodeficiency Virus (HIV) and Tuberculosis (TB) Infection Patients

Authors: Mohd Asrul Affendi Bi Abdullah, Nyi Nyi Naing

Abstract:

The purpose of this study was to compare the negative binomial death rate (NBDR) model and the zero-inflated negative binomial death rate (ZINBDR) model for patients who died with (HIV+TB+) and (HIV+TB−). HIV and TB are a serious worldwide problem in developing countries. Data were analyzed by applying NBDR and ZINBDR to determine which model is better to use. The ZINBDR model is able to account for the disproportionately large number of zeros within the data and is shown to be a consistently better fit than the NBDR model. Hence, the ZINBDR model is a superior fit to the data and provides additional information regarding the death mechanisms of HIV+TB. The ZINBDR model is shown to be a useful tool for the analysis of death rates by age category.
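
As a rough illustration of how a negative binomial model and its zero-inflated counterpart can be compared, the Python sketch below defines the two log-probability functions and the AIC and BIC criteria named in the keywords. The parameterization (mean `mu`, dispersion `r`, zero-inflation probability `pi`) and all function names are assumptions for illustration; the abstract does not state how the authors fitted their models, and no fitting routine is shown here.

```python
import math


def nb_logpmf(y, mu, r):
    """Log-probability of count y under a negative binomial with mean mu > 0
    and dispersion r > 0 (assumed parameterization)."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))


def zinb_logpmf(y, mu, r, pi):
    """Zero-inflated negative binomial: with probability pi (0 <= pi < 1) the
    count is a structural zero, otherwise it follows the negative binomial."""
    if y == 0:
        return math.log(pi + (1.0 - pi) * math.exp(nb_logpmf(0, mu, r)))
    return math.log1p(-pi) + nb_logpmf(y, mu, r)


def log_likelihood(counts, logpmf, *params):
    """Sum of log-probabilities over the observed counts."""
    return sum(logpmf(y, *params) for y in counts)


def aic(loglik, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * loglik
```

With fitted parameters for each model, the one with the lower AIC or BIC would be preferred. The zero-inflated model carries one extra parameter, so it wins only if the gain in log-likelihood from the excess zeros outweighs that penalty.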

Keywords: zero inflated negative binomial death rate, HIV and TB, AIC and BIC, death rate

Procedia PDF Downloads 413
22858 Efficient Reuse of Exome Sequencing Data for Copy Number Variation Callings

Authors: Chen Wang, Jared Evans, Yan Asmann

Abstract:

With the rapid evolution of next-generation sequencing techniques, whole-exome or exome-panel data have become a cost-effective way to detect small exonic mutations, but there has been a growing desire to accurately detect copy number variations (CNVs) as well. To address this research and clinical need, we developed a sequencing coverage pattern-based method not only for copy number detection but also for data integrity checks, CNV calling, and visualization reports. The developed methodologies include complete automation to increase usability, genome content-coverage bias correction, CNV segmentation, data quality reports, and publication-quality images. Poor-quality outlier samples were identified and removed automatically. Multiple experimental batches were routinely detected and further reduced to a clean subset of samples before analysis. Algorithm improvements were also made to improve somatic CNV detection as well as germline CNV detection in family trios. Additionally, a set of utilities was included to help users produce CNV plots for focused genes of interest. We demonstrate the somatic CNV enhancements by accurately detecting CNVs in whole-exome data from The Cancer Genome Atlas cancer samples and in a lymphoma case study with paired tumor and normal samples. We also show efficient reuse of existing exome sequencing data for improved germline CNV calling in a family trio from the phase-III study of the 1000 Genomes Project, detecting CNVs with various modes of inheritance. The performance of the developed method is evaluated by comparing CNV calling results with results from other orthogonal copy number platforms.
Through our case studies, we show that reusing exome sequencing data for CNV calling offers several notable benefits, including better quality control for exome sequencing data, improved joint analysis with single nucleotide variant calls, and novel genomic discovery from under-utilized existing whole-exome and custom exome panel data.
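
The general idea behind coverage-based CNV detection in paired tumor and normal samples can be sketched in a few lines of Python. This is a simplified illustration and not the authors' method: it omits the bias correction, segmentation, and batch handling the abstract describes, and the function names, the pseudocount, and the gain and loss thresholds are hypothetical choices.

```python
import math


def log2_ratios(tumor, normal, pseudocount=0.5):
    """Per-exon log2 ratio of tumor to normal read depth.

    Each sample is first scaled by its total depth so that differences in
    sequencing yield do not look like copy number changes. The pseudocount
    (an assumed value) avoids division by zero for uncovered exons.
    """
    t_total = sum(tumor)
    n_total = sum(normal)
    ratios = []
    for t, n in zip(tumor, normal):
        t_frac = (t + pseudocount) / t_total
        n_frac = (n + pseudocount) / n_total
        ratios.append(math.log2(t_frac / n_frac))
    return ratios


def call_cnv(ratios, gain=0.4, loss=-0.6):
    """Label each exon as gain, loss, or neutral using fixed thresholds.

    The thresholds are illustrative only; real callers segment the ratios
    and model noise rather than thresholding exon by exon.
    """
    calls = []
    for r in ratios:
        if r >= gain:
            calls.append("gain")
        elif r <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls
```

For example, `call_cnv(log2_ratios(tumor_depths, normal_depths))` returns one label per exon, where `tumor_depths` and `normal_depths` are equal-length lists of read counts over the same exons.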

Keywords: bioinformatics, computational genetics, copy number variations, data reuse, exome sequencing, next generation sequencing

Procedia PDF Downloads 246
22857 [Keynote]: No-Trust-Zone Architecture for Securing Supervisory Control and Data Acquisition

Authors: Michael Okeke, Andrew Blyth

Abstract:

Supervisory Control And Data Acquisition (SCADA) systems, as state-of-the-art Industrial Control Systems (ICS), are used in many different critical infrastructures, from smart homes to energy systems and from locomotive train systems to planes. Security of SCADA systems is vital, since many lives depend on them for daily activities, and deviation from normal operation could be disastrous to the environment as well as to lives. This paper describes how a No-Trust-Zone (NTZ) architecture could be incorporated into SCADA systems in order to reduce the chances of malicious intent. The architecture is made up of two distinct parts. The first comprises the field devices, such as sensors, PLCs, pumps, and actuators. The second part is designed following the lambda architecture and is made up of a detection algorithm based on Particle Swarm Optimization (PSO) and the Hadoop framework for data processing and storage. Apache Spark will be part of the lambda architecture for real-time analysis of packets for anomaly detection.

Keywords: industrial control system (ICS), no-trust-zone (NTZ), particle swarm optimisation (PSO), supervisory control and data acquisition (SCADA), swarm intelligence (SI)

Procedia PDF Downloads 327
22856 A Study on the Correlation Analysis between the Pre-Sale Competition Rate and the Apartment Unit Plan Factor through Machine Learning

Authors: Seongjun Kim, Jinwooung Kim, Sung-Ah Kim

Abstract:

The development of information and communication technology also affects human cognition and thinking, and new techniques are being tried, especially in the field of design. In architecture, new design methodologies such as machine learning or data-driven design are being applied. In particular, these methodologies are used to analyze the factors related to the value of real estate or the feasibility of apartment housing in the early planning stage. However, since the value of apartment buildings is often determined by external factors such as location and traffic conditions rather than by the interior elements of buildings, data is rarely used in the design process. Therefore, although the technical conditions are in place, it is difficult to apply data-driven design to the internal elements of the apartment during the design process. As a result, the designers of apartment housing have been forced to rely on experience or modular design alternatives rather than data-driven design at the design stage, resulting in a uniform arrangement of space in apartment housing. The purpose of this study is to propose a methodology that supports designers in designing apartment unit plans with high consumer preference, by deriving the correlation and importance of the floor plan elements preferred by consumers through machine learning and reflecting this information from the early design process. Data on the pre-sale competition rate and the elements of the floor plan are collected, and the correlation between the pre-sale competition rate and the independent variables is analyzed through machine learning. This analytical model can be used to review the apartment unit plan produced by the designer and to assist the designer.
Therefore, a floor plan with high consumer preference can be produced, because the trained model can provide feedback on the apartment unit plan when it is used in the floor plan design of apartment housing.

Keywords: apartment unit plan, data-driven design, design methodology, machine learning

Procedia PDF Downloads 254