Search results for: data acquisition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9633

8853 Quantitative, Preservative Methodology for Review of Interview Transcripts Using Natural Language Processing

Authors: Rowan P. Martnishn

Abstract:

During the execution of a National Endowment for the Arts grant, approximately 55 interviews were collected from professionals across various fields. These interviews were used to create deliverables: historical connections for creations that began as art and evolved entirely into computing technology. With dozens of hours’ worth of transcripts to be analyzed by qualitative coders, a quantitative methodology was created to sift through the documents. The initial step was to clean and format all the data. First, a basic spelling and grammar check was applied, along with a Python script for normalized formatting, which used an open-source grammatical formatter to make the data as coherent as possible. Ten documents were randomly selected for manual review; words that were often transcribed incorrectly were recorded and replaced throughout all other documents. Then, to remove banter and side comments, the transcripts were split into paragraphs (separated by change in speaker), and all paragraphs with fewer than 300 characters were removed. Second, a keyword extractor, a form of natural language processing that selects the significant words in a document, was run on each paragraph of every interview. Every proper noun was put into a data structure corresponding to its interview. From there, a Bidirectional and Auto-Regressive Transformer (B.A.R.T.) summary model was applied to each paragraph that included any of the proper nouns selected from the interview. At this stage, the information to review had been reduced from about 60 hours’ worth of data to 20. The data was further processed through light manual observation: any summaries that fit the criteria of the proposed deliverable were selected, as well as their locations within the document. This narrowed the data down to about 5 hours’ worth of processing. 
The qualitative researchers were then able to find 8 more connections in addition to our previous 4, exceeding our minimum quota of 3 to satisfy the grant. Major findings of the study and subsequent curation of this methodology raised a conceptual finding crucial to working with qualitative data of this magnitude. In the use of artificial intelligence there is a general trade off in a model between breadth of knowledge and specificity. If the model has too much knowledge, the user risks leaving out important data (too general). If the tool is too specific, it has not seen enough data to be useful. Thus, this methodology proposes a solution to this tradeoff. The data is never altered outside of grammatical and spelling checks. Instead, the important information is marked, creating an indicator of where the significant data is without compromising the purity of it. Secondly, the data is chunked into smaller paragraphs, giving specificity, and then cross-referenced with the keywords (allowing generalization over the whole document). This way, no data is harmed, and qualitative experts can go over the raw data instead of using highly manipulated results. Given the success in deliverable creation as well as the circumvention of this tradeoff, this methodology should stand as a model for synthesizing qualitative data while maintaining its original form.
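
The paragraph-splitting and length-filtering step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' script: it assumes each speaker turn starts with a "Name: text" line, and the names SPEAKER_LINE, split_by_speaker and drop_short are hypothetical. Only the 300-character threshold comes from the abstract.

```python
import re

MIN_CHARS = 300  # threshold stated in the abstract

# Assumed transcript convention: a turn begins with "Speaker name: text".
# A continuation line that happens to contain an early colon would be
# misread as a new speaker; real transcripts may need a stricter pattern.
SPEAKER_LINE = re.compile(r"^([^:]{1,40}):\s*(.*)$")


def split_by_speaker(transcript):
    """Split a transcript into (speaker, text) paragraphs at each speaker change."""
    paragraphs = []
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = SPEAKER_LINE.match(line)
        if match:
            speaker, text = match.group(1), match.group(2)
            if paragraphs and paragraphs[-1][0] == speaker:
                # Same speaker continues: extend the current paragraph.
                paragraphs[-1][1] += " " + text
            else:
                paragraphs.append([speaker, text])
        elif paragraphs:
            # Unlabelled line: treat it as a continuation of the current turn.
            paragraphs[-1][1] += " " + line
    return [(speaker, text) for speaker, text in paragraphs]


def drop_short(paragraphs, min_chars=MIN_CHARS):
    """Keep only paragraphs with at least min_chars characters of text."""
    return [(s, t) for s, t in paragraphs if len(t) >= min_chars]
```

Because the filter only selects paragraphs and never rewrites them, it is consistent with the abstract's point that the raw data stays unaltered.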

Keywords: B.A.R.T. model, keyword extractor, natural language processing, qualitative coding

Procedia PDF Downloads 29
8852 Customized Design of Amorphous Solids by Generative Deep Learning

Authors: Yinghui Shang, Ziqing Zhou, Rong Han, Hang Wang, Xiaodi Liu, Yong Yang

Abstract:

The design of advanced amorphous solids, such as metallic glasses, with targeted properties through artificial intelligence signifies a paradigmatic shift in physical metallurgy and materials technology. Here, we developed a machine-learning architecture that facilitates the generation of metallic glasses with targeted multifunctional properties. Our architecture integrates the state-of-the-art unsupervised generative adversarial network model with supervised models, allowing the incorporation of general prior knowledge derived from thousands of data points across a vast range of alloy compositions, into the creation of data points for a specific type of composition, which overcame the common issue of data scarcity typically encountered in the design of a given type of metallic glasses. Using our generative model, we have successfully designed copper-based metallic glasses, which display exceptionally high hardness or a remarkably low modulus. Notably, our architecture can not only explore uncharted regions in the targeted compositional space but also permits self-improvement after experimentally validated data points are added to the initial dataset for subsequent cycles of data generation, hence paving the way for the customized design of amorphous solids without human intervention.

Keywords: metallic glass, artificial intelligence, mechanical property, automated generation

Procedia PDF Downloads 56
8851 Efficient Reuse of Exome Sequencing Data for Copy Number Variation Callings

Authors: Chen Wang, Jared Evans, Yan Asmann

Abstract:

With the rapid evolution of next-generation sequencing techniques, whole-exome or exome-panel data have become a cost-effective way to detect small exonic mutations, but there has been a growing desire to accurately detect copy number variations (CNVs) as well. To address these research and clinical needs, we developed a sequencing coverage pattern-based method not only for copy number detection but also for data integrity checks, CNV calling, and visualization reports. The developed methodologies include complete automation to increase usability, genome content-coverage bias correction, CNV segmentation, data quality reports, and publication-quality images. Poor-quality outlier samples were identified and removed automatically. Multiple experimental batches were routinely detected and further reduced to a clean subset of samples before analysis. Algorithm improvements were also made to improve somatic CNV detection as well as germline CNV detection in family trios. Additionally, a set of utilities was included to help users produce CNV plots for focused genes of interest. We demonstrate the somatic CNV enhancements by accurately detecting CNVs in whole-exome data from The Cancer Genome Atlas cancer samples and in a lymphoma case study with paired tumor and normal samples. We also show efficient reuse of existing exome sequencing data for improved germline CNV calling in a family trio from the phase III study of the 1000 Genomes Project, detecting CNVs with various modes of inheritance. The performance of the developed method is evaluated by comparing CNV calling results with results from other orthogonal copy number platforms. 
Through our case studies, reusing exome sequencing data for calling CNVs shows several notable benefits, including better quality control for exome sequencing data, improved joint analysis with single nucleotide variant calls, and novel genomic discovery from under-utilized existing whole-exome and custom exome panel data.

Keywords: bioinformatics, computational genetics, copy number variations, data reuse, exome sequencing, next generation sequencing

Procedia PDF Downloads 257
8850 Impact of Protean Career Attitude on Career Success with the Mediating Effect of Career Insight

Authors: Prabhashini Wijewantha

Abstract:

This study examines the impact of employees’ protean career attitude on their career success, and then the mediating effect of career insight on that relationship. Career success is defined as the accomplishment of desirable work-related outcomes at any point in a person’s work experiences over time, and it comprises two sub-variables, namely career satisfaction and perceived employability. Protean career attitude was measured using the eight items from the Self-Directedness subscale of the Protean Career Attitude scale developed by Briscoe and Hall, whereas career satisfaction was measured by the three-item scale developed by Martins, Eddleston, and Veiga. Perceived employability was also evaluated using three items, and career insight was measured using fourteen items that were adapted and used by De Vos and Soens. Data were collected through a survey from a sample of 300 mid-career executives in Sri Lanka and were analyzed using SPSS and AMOS software version 20.0. A preliminary analysis was performed first, in which the data were screened and reliability and validity were ensured. Next, a simple regression analysis was performed to test the direct impact of protean career attitude on career success, and the hypothesis was supported. Baron and Kenny’s four-step, three-regression approach for mediator testing was used to calculate the mediation effect of career insight on the above relationship, and a partial mediation was supported by the data. Finally, theoretical and practical implications are discussed.

Keywords: career success, career insight, mid career MBAs, protean career attitude

Procedia PDF Downloads 360
8849 A Study of Students’ Perceptions of Technology in Petaling District

Authors: Ahmad Masduki Bin Selamat

Abstract:

Malaysia aims to become a developed country by the year 2020. The problem is that little is known about the perceptions and curricular values of Malaysian high school students who have taken Living Skills as a subject in regular public schools. How these students perceive technology in their daily lives, in the country’s development, and in a global context is not known. The study involved Form 4 students from four public schools in Petaling District. The study found that the Petaling District students’ knowledge of technology was good: 76.6% of them scored 50% or above on the achievement test. In addition, it was found that only excellent and squatter students perceived technology education as an important school subject, compared to students from the urban area. It was also found that students preferred business and entrepreneurship topics to the other parts of the Living Skills curriculum. The study suggests that students should be exposed to technology education from the early years of schooling (preschool to secondary). In addition, the acquisition of skills and the evaluation, revision, and modification of the instruction as well as the curriculum should be enforced.

Keywords: technology education, living skills, curricular values, public schools

Procedia PDF Downloads 451
8848 An Analysis of Oil Price Changes and Other Factors Affecting Iranian Food Basket: A Panel Data Method

Authors: Niloofar Ashktorab, Negar Ashktorab

Abstract:

Oil exports fund nearly half of Iran’s government expenditures, and for many years other countries have imposed various sanctions on Iran. Sanctions that primarily target Iran’s key energy sector have harmed Iran’s economy. The strategic effects of sanctions might diminish as Iran adjusts to them economically. In this study, we evaluate the impact of oil prices and sanctions against Iran on food commodity prices using a panel data method. We find that food commodity prices, the oil price, and the real exchange rate are stationary. The results show a positive effect of oil price changes, the real exchange rate, and sanctions on food commodity prices.

Keywords: oil price, food basket, sanctions, panel data, Iran

Procedia PDF Downloads 356
8847 A Proposed Framework for Software Redocumentation Using Distributed Data Processing Techniques and Ontology

Authors: Laila Khaled Almawaldi, Hiew Khai Hang, Sugumaran A. l. Nallusamy

Abstract:

Legacy systems are crucial for organizations, but their intricacy and lack of documentation pose challenges for maintenance and enhancement. Redocumentation of legacy systems is vital for automatically or semi-automatically creating documentation for software lacking sufficient records. It aims to enhance system understandability, maintainability, and knowledge transfer. However, existing redocumentation methods need improvement in data processing performance and document generation efficiency. This stems from the necessity to efficiently handle the extensive and complex code of legacy systems. This paper proposes a method for semi-automatic legacy system re-documentation using semantic parallel processing and ontology. Leveraging parallel processing and ontology addresses current challenges by distributing the workload and creating documentation with logically interconnected data. The paper outlines challenges in legacy system redocumentation and suggests a method of redocumentation using parallel processing and ontology for improved efficiency and effectiveness.

Keywords: legacy systems, redocumentation, big data analysis, parallel processing

Procedia PDF Downloads 46
8846 Building Green Infrastructure Networks Based on Cadastral Parcels Using Network Analysis

Authors: Gon Park

Abstract:

Seoul, South Korea, established the 2030 Seoul City Master Plan, which contains green-link projects to connect critical green areas within the city. However, the plan does not include detailed analyses of green infrastructure that incorporate land-cover information into many structural classes. This study maps the green infrastructure networks of Seoul to complement the city's green plans by identifying and ranking green areas. Hubs and links, the main elements of green infrastructure, were identified by incorporating cadastral data for 967,502 parcels into 135 land use maps using a geographic information system. Network analyses were used to rank the hubs and links of a green infrastructure map by applying a force-directed algorithm, weighted values, and binary relationships that have metrics of density, distance, and centrality. The results indicate that network analyses using cadastral parcel data can serve as a framework to identify and rank hubs, links, and networks for green infrastructure planning under variable scenarios of green areas in cities.

Keywords: cadastral data, green infrastructure, network analysis, parcel data

Procedia PDF Downloads 206
8845 The Effect of CPU Location in Total Immersion of Microelectronics

Authors: A. Almaneea, N. Kapur, J. L. Summers, H. M. Thompson

Abstract:

Meeting the growth in demand for digital services such as social media, telecommunications, and business and cloud services requires large scale data centres, which has led to an increase in their end use energy demand. Generally, over 30% of data centre power is consumed by the necessary cooling overhead. Thus energy can be reduced by improving the cooling efficiency. Air and liquid can both be used as cooling media for the data centre. Traditional data centre cooling systems use air, however liquid is recognised as a promising method that can handle the more densely packed data centres. Liquid cooling can be classified into three methods; rack heat exchanger, on-chip heat exchanger and full immersion of the microelectronics. This study quantifies the improvements of heat transfer specifically for the case of immersed microelectronics by varying the CPU and heat sink location. Immersion of the server is achieved by filling the gap between the microelectronics and a water jacket with a dielectric liquid which convects the heat from the CPU to the water jacket on the opposite side. Heat transfer is governed by two physical mechanisms, which is natural convection for the fixed enclosure filled with dielectric liquid and forced convection for the water that is pumped through the water jacket. The model in this study is validated with published numerical and experimental work and shows good agreement with previous work. The results show that the heat transfer performance and Nusselt number (Nu) is improved by 89% by placing the CPU and heat sink on the bottom of the microelectronics enclosure.

Keywords: CPU location, data centre cooling, heat sink in enclosures, immersed microelectronics, turbulent natural convection in enclosures

Procedia PDF Downloads 272
8844 A Macroeconomic Analysis of Defense Industry: Comparisons, Trends and Improvements in Brazil and in the World

Authors: J. Fajardo, J. Guerra, E. Gonzales

Abstract:

This paper outlines a study of Brazil's industrial base of defense (IDB) through a bibliographic research method combined with an analysis of macroeconomic data from several publicly available data platforms. The paper begins with a brief study of Brazilian national industry, including analyses of productivity, income, outcome, and jobs. Next, the research presents a study of the defense industry in Brazil, presenting the main national companies that operate in the aeronautical, army, and naval branches. After establishing the main points of the Brazilian defense industry, data on the productivity of the defense industry in the main countries and among companies competing with the Brazilian industry were analyzed, in order to summarize major cases in Brazil with a comparative analysis. Regarding the methodology, bibliographic research and the exploration of historical data series were used to analyze information, identify trends, and make comparisons over time. The research concludes with the main trends for the development of the Brazilian defense industry, comparing the current situation with the point of view of several countries.

Keywords: economics of defence, industry, trends, market

Procedia PDF Downloads 155
8843 A Statistical Approach to Classification of Agricultural Regions

Authors: Hasan Vural

Abstract:

Turkey is well suited to producing a great variety of agricultural products because of its varied geographic and climatic conditions, which have been used to divide the country into four main regions and seven sub-regions. This classification into seven regions has traditionally been used for data collection and publication, especially in relation to agricultural production. Afterwards, nine agricultural regions were considered. Recently, the governmental body responsible for data collection and dissemination (Turkish Institute of Statistics, TIS) has used 12 classes, which include 11 sub-regions and Istanbul province. This study aims to evaluate these classification efforts based on the acreage of ten main crops over a ten-year period (1996-2005). The panel data, grouped into 11 sub-regions, were evaluated by cluster and multivariate statistical methods. It was concluded that, from the agricultural production point of view, it would be more meaningful to consider three main and eight sub-agricultural regions throughout the country.

Keywords: agricultural region, factorial analysis, cluster analysis

Procedia PDF Downloads 416
8842 Data Confidentiality in Public Cloud: A Method for Inclusion of ID-PKC Schemes in OpenStack Cloud

Authors: N. Nalini, Bhanu Prakash Gopularam

Abstract:

The term data security refers to the degree of resistance or protection given to information from unintended or unauthorized access. The core principles of information security are the confidentiality, integrity and availability, also referred as CIA triad. Cloud computing services are classified as SaaS, IaaS and PaaS services. With cloud adoption the confidential enterprise data are moved from organization premises to untrusted public network and due to this the attack surface has increased manifold. Several cloud computing platforms like OpenStack, Eucalyptus, Amazon EC2 offer users to build and configure public, hybrid and private clouds. While the traditional encryption based on PKI infrastructure still works in cloud scenario, the management of public-private keys and trust certificates is difficult. The Identity based Public Key Cryptography (also referred as ID-PKC) overcomes this problem by using publicly identifiable information for generating the keys and works well with decentralized systems. The users can exchange information securely without having to manage any trust information. Another advantage is that access control (role based access control policy) information can be embedded into data unlike in PKI where it is handled by separate component or system. In OpenStack cloud platform the keystone service acts as identity service for authentication and authorization and has support for public key infrastructure for auto services. In this paper, we explain OpenStack security architecture and evaluate the PKI infrastructure piece for data confidentiality. We provide method to integrate ID-PKC schemes for securing data while in transit and stored and explain the key measures for safe guarding data against security attacks. The proposed approach uses JPBC crypto library for key-pair generation based on IEEE P1636.3 standard and secure communication to other cloud services.

Keywords: data confidentiality, identity based cryptography, secure communication, open stack key stone, token scoping

Procedia PDF Downloads 384
8841 Improved Distance Estimation in Dynamic Environments through Multi-Sensor Fusion with Extended Kalman Filter

Authors: Iffat Ara Ebu, Fahmida Islam, Mohammad Abdus Shahid Rafi, Mahfuzur Rahman, Umar Iqbal, John Ball

Abstract:

The application of multi-sensor fusion for enhanced distance estimation accuracy in dynamic environments is crucial for advanced driver assistance systems (ADAS) and autonomous vehicles. Limitations of single sensors such as cameras or radar in adverse conditions motivate the use of combined camera and radar data to improve reliability, adaptability, and object recognition. A multi-sensor fusion approach using an extended Kalman filter (EKF) is proposed to combine sensor measurements with a dynamic system model, achieving robust and accurate distance estimation. The research utilizes the Mississippi State University Autonomous Vehicular Simulator (MAVS) to create a controlled environment for data collection. Data analysis is performed using MATLAB. Qualitative (visualization of fused data vs ground truth) and quantitative metrics (RMSE, MAE) are employed for performance assessment. Initial results with simulated data demonstrate accurate distance estimation compared to individual sensors. The optimal sensor measurement noise variance and plant noise variance parameters within the EKF are identified, and the algorithm is validated with real-world data from a Chevrolet Blazer. In summary, this research demonstrates that multi-sensor fusion with an EKF significantly improves distance estimation accuracy in dynamic environments. This is supported by comprehensive evaluation metrics, with validation transitioning from simulated to real-world data, paving the way for safer and more reliable autonomous vehicle control.
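
The fusion idea in this abstract can be illustrated with a deliberately simplified sketch. The abstract uses an extended Kalman filter in MATLAB; the code below is instead a scalar linear Kalman filter with a random-walk distance model, which is what an EKF reduces to when both the motion and measurement models are linear. All function names, parameters and noise values are illustrative assumptions and are not taken from the paper. RMSE and MAE helpers are included because the abstract names them as the quantitative metrics.

```python
import math


def fuse_distance(camera, radar, r_cam, r_radar, q, x0, p0):
    """Fuse two distance streams with a scalar Kalman filter.

    camera, radar: equally long lists of distance measurements
    r_cam, r_radar: measurement noise variances (assumed known)
    q: plant (process) noise variance of the random-walk model
    x0, p0: initial distance estimate and its variance
    """
    x, p = x0, p0
    estimates = []
    for z_cam, z_radar in zip(camera, radar):
        # Predict: the random-walk model keeps x and inflates the variance.
        p += q
        # Update sequentially with each sensor's measurement.
        for z, r in ((z_cam, r_cam), (z_radar, r_radar)):
            gain = p / (p + r)
            x += gain * (z - x)
            p *= 1.0 - gain
        estimates.append(x)
    return estimates


def rmse(estimates, truth):
    """Root-mean-square error between estimates and ground truth."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth))


def mae(estimates, truth):
    """Mean absolute error between estimates and ground truth."""
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)
```

In this sketch, r_cam, r_radar and q play the role of the sensor measurement noise variance and plant noise variance that the abstract reports tuning.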

Keywords: sensor fusion, EKF, MATLAB, MAVS, autonomous vehicle, ADAS

Procedia PDF Downloads 43
8840 A User Identification Technique to Access Big Data Using Cloud Services

Authors: A. R. Manu, V. K. Agrawal, K. N. Balasubramanya Murthy

Abstract:

Authentication is required in stored database systems so that only authorized users can access the data and related cloud infrastructures. This paper proposes an authentication technique using multi-factor and multi-dimensional authentication system with multi-level security. The proposed technique is likely to be more robust as the probability of breaking the password is extremely low. This framework uses a multi-modal biometric approach and SMS to enforce additional security measures with the conventional Login/password system. The robustness of the technique is demonstrated mathematically using a statistical analysis. This work presents the authentication system along with the user authentication architecture diagram, activity diagrams, data flow diagrams, sequence diagrams, and algorithms.

Keywords: design, implementation algorithms, performance, biometric approach

Procedia PDF Downloads 476
8839 Carbon Footprint Assessment Initiative and Trees: Role in Reducing Emissions

Authors: Omar Alelweet

Abstract:

Carbon emissions are quantified in terms of carbon dioxide equivalents, generated through a specific activity or accumulated throughout the life stages of a product or service. Given the growing concern about climate change and the role of carbon dioxide emissions in global warming, this initiative aims to create awareness and understanding of the impact of human activities and identify potential areas for improvement regarding the management of the carbon footprint on campus. Given that trees play a vital role in reducing carbon emissions by absorbing CO₂ during the photosynthesis process, this paper evaluated the contribution of each tree to reducing those emissions. Collecting data over an extended period of time is essential to monitoring carbon dioxide levels. This will help capture changes at different times and identify any patterns or trends in the data. By linking the data to specific activities, events, or environmental factors, it is possible to identify sources of emissions and areas where carbon dioxide levels are rising. Analyzing the collected data can provide valuable insights into ways to reduce emissions and mitigate the impact of climate change.

Keywords: sustainability, green building, environmental impact, CO₂

Procedia PDF Downloads 69
8838 Detection of Change Points in Earthquakes Data: A Bayesian Approach

Authors: F. A. Al-Awadhi, D. Al-Hulail

Abstract:

In this study, we applied the Bayesian hierarchical model to detect single and multiple change points for daily earthquake body wave magnitude. The change point analysis is used in both backward (off-line) and forward (on-line) statistical research. In this study, it is used with the backward approach. Different types of change parameters are considered (mean, variance or both). The posterior model and the conditional distributions for single and multiple change points are derived and implemented using BUGS software. The model is applicable for any set of data. The sensitivity of the model is tested using different prior and likelihood functions. Using Mb data, we concluded that during January 2002 and December 2003, three changes occurred in the mean magnitude of Mb in Kuwait and its vicinity.
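
As a rough illustration of backward (off-line) detection of a single change in the mean, the sketch below computes a posterior over the change location. It is much simpler than the hierarchical model and BUGS implementation described in the abstract: it assumes Gaussian data with a known standard deviation sigma, flat priors on the two segment means, and a uniform prior on the change index. The function names are hypothetical.

```python
import math


def _sum_sq_dev(values):
    """Sum of squared deviations from the segment mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def change_point_posterior(data, sigma):
    """Posterior probability that the mean changes between data[k-1] and data[k].

    Under the stated assumptions, integrating out the two segment means gives
    p(k | data) proportional to (n1 * n2) ** -0.5 * exp(-(SS1 + SS2) / (2 sigma^2)),
    where n1, n2 are the segment lengths and SS1, SS2 their sums of squares.
    """
    n = len(data)
    log_post = {}
    for k in range(1, n):
        left, right = data[:k], data[k:]
        ss = _sum_sq_dev(left) + _sum_sq_dev(right)
        log_post[k] = -ss / (2.0 * sigma ** 2) - 0.5 * math.log(len(left) * len(right))
    # Normalise in a numerically stable way.
    peak = max(log_post.values())
    weights = {k: math.exp(v - peak) for k, v in log_post.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}
```

Multiple change points, unknown variance, and changes in variance, all of which the abstract considers, would need a richer model and typically MCMC sampling.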

Keywords: multiple change points, Markov Chain Monte Carlo, earthquake magnitude, hierarchical Bayesian model

Procedia PDF Downloads 456
8837 Productivity and Structural Design of Manufacturing Systems

Authors: Ryspek Usubamatov, Tan San Chin, Sarken Kapaeva

Abstract:

Productivity of manufacturing systems depends on the technological processes, the technical data of the machines, and the structure of the systems. Technology is represented by the machining modes and data, while the technical data comprise reliability parameters and auxiliary time for discrete production processes. The structure of a manufacturing system includes the number of serial and parallel production machines and the links between them. Structures of manufacturing systems depend on the complexity of the technological processes. Mathematical models of the productivity rate for manufacturing systems make it possible to define the best structure by the criterion of productivity rate. These models are an important tool for evaluating the economic efficiency of production systems.

Keywords: productivity, structure, manufacturing systems, structural design

Procedia PDF Downloads 585
8836 The Effect of Tacit Knowledge for Intelligence Cycle

Authors: Bahadir Aydin

Abstract:

It is difficult to access accurate knowledge because of the mass of data. This huge volume of data makes the environment more and more chaotic. Data are the main pillar of intelligence. The affiliation between intelligence and knowledge is significant for understanding underlying truths. The data gathered from different sources can be modified, interpreted, and classified by using the intelligence cycle process. This process is applied in order to progress to wisdom as well as intelligence. Within this process, the effect of tacit knowledge is crucial. Knowledge, which is classified as explicit or tacit, is the key element for any purpose. Tacit knowledge can be seen as "the tip of the iceberg". This tacit knowledge accounts for much more than we guess throughout the intelligence cycle. If the concept of the intelligence cycle is scrutinized, it can be seen that it contains risks and threats as well as success. The main purpose of all organizations is to be successful by eliminating risks and threats. Therefore, there is a need to connect or fuse existing information and the processes that can be used to develop it. Thanks to this process, decision-makers can be presented with a clear, holistic understanding as early as possible in the decision-making process. Changing from the current traditional reactive approach to a proactive intelligence cycle approach would reduce extensive duplication of work in the organization. By applying a new result-oriented cycle and tacit knowledge, intelligence can be procured and utilized more effectively and in a more timely manner.

Keywords: information, intelligence cycle, knowledge, tacit knowledge

Procedia PDF Downloads 514
8835 Implementation Association Rule Method in Determining the Layout of Qita Supermarket as a Strategy in the Competitive Retail Industry in Indonesia

Authors: Dwipa Rizki Utama, Hanief Ibrahim

Abstract:

The retail industry in Indonesia is developing very quickly, and various strategies are undertaken to boost customer satisfaction and purchase productivity in order to raise profit, one of which is a layout strategy. The purpose of this study is to determine the layout of Qita supermarket, a retailer in Indonesia, in order to improve customer satisfaction and to maximize overall product sales, so that infrequently purchased products will also be purchased. This research uses a literature study method together with association rules, a data mining method applied in market basket analysis. After pre-processing, 100 of the 160 data records were tested, giving the distribution across the 26 departments of the previous layout. From those data, the association rule method makes it possible to study customer behavior when items are purchased together, so that the layout of the supermarket can be determined based on customer behavior. Using RapidMiner software with a minimum support of 25% and a minimum confidence of 30%, the results showed that department 14 was purchased at the same time as department 10, department 21 at the same time as department 13, department 15 at the same time as department 12, department 14 at the same time as department 12, and department 10 at the same time as department 14. From those results, a better supermarket layout than the previous one can be arranged.
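
Support and confidence for pairwise rules between departments can be computed as in the sketch below. This is a generic illustration in plain Python rather than the RapidMiner workflow the abstract describes, it is restricted to rules with one department on each side, and the function name and any basket data fed to it are hypothetical.

```python
from itertools import permutations


def pair_rules(baskets, min_support, min_confidence):
    """Return rules (a, b, support, confidence) meaning "a is bought with b".

    support    = share of baskets containing both a and b
    confidence = share of baskets containing a that also contain b
    """
    sets = [set(basket) for basket in baskets]
    n = len(sets)
    items = sorted(set().union(*sets))
    rules = []
    for a, b in permutations(items, 2):
        has_a = sum(1 for s in sets if a in s)
        both = sum(1 for s in sets if a in s and b in s)
        support = both / n
        confidence = both / has_a if has_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules
```

Calling pair_rules(baskets, 0.25, 0.30) applies the thresholds quoted in the abstract. Departments that appear together in a surviving rule are candidates for adjacent placement in the layout.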

Keywords: industry retail, strategy, association rule, supermarket

Procedia PDF Downloads 188
8834 Transforming Data Science Curriculum Through Design Thinking

Authors: Samar Swaid

Abstract:

Today, corporates are moving toward the adoption of Design-Thinking techniques to develop products and services, putting their consumer as the heart of the development process. One of the leading companies in Design-Thinking, IDEO (Innovation, Design, Engineering Organization), defines Design-Thinking as an approach to problem-solving that relies on a set of multi-layered skills, processes, and mindsets that help people generate novel solutions to problems. Design thinking may result in new ideas, narratives, objects or systems. It is about redesigning systems, organizations, infrastructures, processes, and solutions in an innovative fashion based on the users' feedback. Tim Brown, president and CEO of IDEO, sees design thinking as a human-centered approach that draws from the designer's toolkit to integrate people's needs, innovative technologies, and business requirements. The application of design thinking has been witnessed to be the road to developing innovative applications, interactive systems, scientific software, healthcare application, and even to utilizing Design-Thinking to re-think business operations, as in the case of Airbnb. Recently, there has been a movement to apply design thinking to machine learning and artificial intelligence to ensure creating the "wow" effect on consumers. The Association of Computing Machinery task force on Data Science program states that" Data scientists should be able to implement and understand algorithms for data collection and analysis. They should understand the time and space considerations of algorithms. They should follow good design principles developing software, understanding the importance of those principles for testability and maintainability" However, this definition hides the user behind the machine who works on data preparation, algorithm selection and model interpretation. 
Thus, the Data Science program includes design thinking to ensure meeting the user demands, generating more usable machine learning tools, and developing ways of framing computational thinking. Here, we describe the fundamentals of Design-Thinking and teaching modules for data science programs.

Keywords: data science, design thinking, AI, curriculum, transformation

Procedia PDF Downloads 81
8833 Exchange Rate Forecasting by Econometric Models

Authors: Zahid Ahmad, Nosheen Imran, Nauman Ali, Farah Amir

Abstract:

The objective of the study is to forecast the US Dollar to Pak Rupee exchange rate using time series models. For this purpose, daily exchange rates between the US Dollar and the Pak Rupee for the period January 1, 2007 to June 2, 2017 are employed. The data set is divided into in-sample and out-of-sample sets, where the in-sample data are used to estimate the models and the out-of-sample data are used to evaluate the exchange rate forecasts. The ADF and PP tests are used to check the stationarity of the time series. To forecast the exchange rate, ARIMA and GARCH models are applied. Among the different Autoregressive Integrated Moving Average (ARIMA) models, the best model is selected on the basis of selection criteria. Due to volatility clustering and the ARCH effect, the GARCH(1, 1) model is also applied. The results show that ARIMA(0, 1, 1) and GARCH(1, 1) are the most suitable models to forecast the future exchange rate. Further, the GARCH(1, 1) model captured the non-constant conditional variance (volatility) of the exchange rate with good forecasting performance. This study is useful for researchers, policymakers, and businesses, who can base decisions and policies on accurate and timely exchange rate forecasts.
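As an illustration of the GARCH(1, 1) model named in the abstract, the following minimal sketch computes the conditional variance recursion sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The function names, the zero-mean-return assumption, the choice of the sample variance as the starting value, and the parameter values in the usage note are all illustrative assumptions; the abstract does not report fitted coefficients.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances of a GARCH(1, 1) model for a list of returns.

    Recursion: sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1].
    Assumption: returns are zero-mean, and sigma2[0] is set to their
    sample variance (one common initialisation, not the only one).
    """
    if not returns:
        return []
    sigma2 = [sum(r * r for r in returns) / len(returns)]
    for previous_return in returns[:-1]:
        sigma2.append(omega + alpha * previous_return ** 2 + beta * sigma2[-1])
    return sigma2


def garch11_forecast(last_return, last_sigma2, omega, alpha, beta):
    """One-step-ahead variance forecast from the last return and variance."""
    return omega + alpha * last_return ** 2 + beta * last_sigma2
```

For example, `garch11_variance([0.01, -0.02, 0.015], omega=1e-6, alpha=0.1, beta=0.85)` returns one variance per observation; large squared returns raise the next period's variance, which is how the model reproduces volatility clustering.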

Keywords: exchange rate, ARIMA, GARCH, PAK/USD

Procedia PDF Downloads 561
8832 The Best Prediction Data Mining Model for Breast Cancer Probability in Women Residents in Kabul

Authors: Mina Jafari, Kobra Hamraee, Saied Hossein Hosseini

Abstract:

The prediction of breast cancer is one of the challenges in medicine. In this paper, we collected 528 records of women living in Kabul, including demographic, lifestyle, diet and pregnancy data. There are many classification algorithms for breast cancer prediction, and we tried to find the best model, with the most accurate result and the lowest error rate. We evaluated some common supervised data mining algorithms to find the best model for predicting breast cancer among Afghan women living in Kabul, with the mammography result as the target variable. To evaluate these algorithms we used cross-validation, which is a reliable method for measuring the performance of models. After comparing the error rate and accuracy of three models (Decision Tree, Naive Bayes and Rule Induction), Decision Tree, with an accuracy of 94.06% and an error rate of 15%, was found to be the best model for predicting breast cancer based on the health care records.
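The cross-validation procedure mentioned in the abstract can be sketched as follows. This is a generic k-fold loop, not the authors' tooling: the function names are hypothetical, the folds are contiguous and unshuffled for simplicity, and the majority-class model is only a placeholder standing in for a real classifier such as a decision tree.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(features, labels, k, fit, predict):
    """Mean held-out accuracy over k folds for a fit/predict function pair."""
    accuracies = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


def fit_majority(train_features, train_labels):
    """Placeholder model: remember the most frequent training label."""
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, feature_row):
    """Placeholder prediction: always return the remembered label."""
    return model
```

Each record is held out exactly once, so the averaged accuracy estimates performance on unseen data; the error rate is one minus that accuracy.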

Keywords: decision tree, breast cancer, probability, data mining

Procedia PDF Downloads 138
8831 Image Steganography Using Least Significant Bit Technique

Authors: Preeti Kumari, Ridhi Kapoor

Abstract:

In today's world, security is the most important issue in any communication. Steganography is the process of hiding important data inside other data, such as text, audio, video, and images. The interest in this topic is to provide availability, confidentiality, integrity, and authenticity of data. A steganographic technique embeds hidden content in unremarkable cover media so as not to arouse the suspicion of eavesdroppers, third parties, or hackers. Many applications of compression, encryption, decryption, and embedding methods are used for digital image steganography. Compression produces noise in the image; to cope with this noise, the LSB insertion technique is used. The performance of the proposed embedding system with respect to the security of the secret message and robustness is discussed. We also demonstrate the maximum steganography capacity and visual distortion.
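A minimal sketch of LSB insertion, assuming the cover image is available as a flat list of 8-bit channel values and the secret message is a byte string. The function names are hypothetical, and the sketch omits the compression and encryption stages the abstract mentions.

```python
def embed_lsb(pixels, message):
    """Hide message bytes in the least significant bits of pixel values.

    Each pixel value (0-255) carries one message bit, most significant
    bit of each byte first. Returns a new list; the cover is unchanged.
    """
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for the message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Recover n_bytes of message from the pixel least significant bits."""
    message = bytearray()
    for b in range(n_bytes):
        value = 0
        for pixel in pixels[8 * b: 8 * b + 8]:
            value = (value << 1) | (pixel & 1)
        message.append(value)
    return bytes(message)
```

Because only the lowest bit of each value changes, every pixel differs from the cover by at most 1, which is why the distortion is visually negligible. With one bit per value, the capacity is one message byte per eight cover values.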

Keywords: steganography, LSB, encoding, information hiding, color image

Procedia PDF Downloads 474
8830 Multiple Query Optimization in Wireless Sensor Networks Using Data Correlation

Authors: Elaheh Vaezpour

Abstract:

Data sensing in wireless sensor networks is performed by users declaring queries to the network. In many applications of wireless sensor networks, many users send queries to the network simultaneously. If the queries are processed separately, the network's energy consumption will increase significantly. Therefore, it is very important to aggregate the queries before sending them to the network. In this paper, we propose a multiple query optimization framework based on the sensors' physical and temporal correlation. In the proposed method, queries are merged and sent to the network by considering the correlation among the sensors in order to reduce the communication cost between the sensors and the base station.

Keywords: wireless sensor networks, multiple query optimization, data correlation, reducing energy consumption

Procedia PDF Downloads 334
8829 Efficient Tuning Parameter Selection by Cross-Validated Score in High Dimensional Models

Authors: Yoonsuh Jung

Abstract:

As DNA microarray data have a relatively small sample size compared to the number of genes, high dimensional models are often employed. In high dimensional models, the selection of the tuning parameter (or penalty parameter) is often one of the crucial parts of the modeling. Cross-validation is one of the most common methods for tuning parameter selection; it selects the parameter value with the smallest cross-validated score. However, selecting a single value as an "optimal" value for the parameter can be very unstable due to sampling variation, since the sample sizes of microarray data are often small. Our approach is to first choose multiple candidates for the tuning parameter, then average the candidates with different weights depending on their performance. The additional step of estimating the weights and averaging the candidates rarely increases the computational cost, while it can considerably improve on traditional cross-validation. Using real and simulated data sets, we show that the value selected by the suggested methods often leads to stable parameter selection as well as improved detection of significant genetic variables compared to traditional cross-validation.
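The idea of averaging several candidate tuning parameters by their cross-validated performance can be sketched as below. The abstract does not specify how the weights are estimated, so the exponential weighting, the `top_k` cutoff, and the function name are all assumptions made for illustration, not the author's method.

```python
import math


def averaged_tuning_parameter(candidates, cv_scores, top_k=3):
    """Weighted average of the top_k candidates with the smallest CV scores.

    Assumed weighting (illustrative only): each candidate gets weight
    exp(-(score - best_score)), so a lower cross-validated score means
    a larger weight, and the weights are normalised to sum to one.
    """
    if len(candidates) != len(cv_scores) or not candidates:
        raise ValueError("need one CV score per candidate")
    ranked = sorted(zip(cv_scores, candidates))[:top_k]
    best_score = ranked[0][0]
    weights = [math.exp(-(score - best_score)) for score, _ in ranked]
    total = sum(weights)
    return sum(w * value for w, (_, value) in zip(weights, ranked)) / total
```

With `top_k=1` the function reduces to traditional cross-validation, which picks the single candidate with the smallest score; larger `top_k` values smooth the choice across near-optimal candidates.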

Keywords: cross validation, parameter averaging, parameter selection, regularization parameter search

Procedia PDF Downloads 415
8828 Digital Image Steganography with Multilayer Security

Authors: Amar Partap Singh Pharwaha, Balkrishan Jindal

Abstract:

In this paper, a new method is developed for hiding an image in a digital image with multilayer security. In the proposed method, the secret image is first encrypted using a flexible matrix based symmetric key to add the first layer of security. Then another layer of security is added to the secret data by encrypting the ciphered data using the Pythagorean Theorem method. The ciphered data bits (4 bits) produced after double encryption are then embedded within the digital image in the spatial domain using Least Significant Bits (LSBs) substitution. To improve the image quality of the stego-image, an improved form of the pixel adjustment process is proposed. To evaluate the effectiveness of the proposed method, image quality metrics including Peak Signal-to-Noise Ratio (PSNR), Mean Square Error (MSE), entropy, correlation, mean value and Universal Image Quality Index (UIQI) are measured. It has been found experimentally that the proposed method provides higher security as well as robustness. In fact, the results of this study are quite promising.
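Two of the quality metrics listed in the abstract, MSE and PSNR, follow standard definitions and can be sketched as below. The sketch assumes 8-bit images supplied as flat lists of equal length; the function names are illustrative.

```python
import math


def mse(cover, stego):
    """Mean Square Error between two equally sized lists of pixel values."""
    if len(cover) != len(stego) or not cover:
        raise ValueError("images must be non-empty and the same size")
    return sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)


def psnr(cover, stego, max_value=255):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(max_value**2 / MSE)."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(max_value ** 2 / error)
```

A lower MSE and a higher PSNR indicate that the stego-image is closer to the cover image, so embedding schemes are usually compared on PSNR at a fixed payload.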

Keywords: Pythagorean theorem, pixel adjustment, ciphered data, image hiding, least significant bit, flexible matrix

Procedia PDF Downloads 337
8827 Efficient Sampling of Probabilistic Program for Biological Systems

Authors: Keerthi S. Shetty, Annappa Basava

Abstract:

In recent years, modelling of biological systems represented by biochemical reactions has become increasingly important in Systems Biology. Biological systems represented by biochemical reactions are highly stochastic in nature, and probabilistic models are often used to describe them. One of the main challenges in Systems Biology is to incorporate absolute experimental data into a probabilistic model. This challenge arises because (1) some molecules may be present in relatively small quantities, (2) there is switching between individual elements present in the system, and (3) the process is inherently stochastic at the level at which observations are made. In this paper, we describe a novel idea for incorporating absolute experimental data into a probabilistic model using the tool R2. Through a case study of the transcription process in prokaryotes, we explain how biological systems can be written as probabilistic programs to incorporate experimental data into the model. The model developed is then analysed in terms of intrinsic noise and exact sampling of switching times between individual elements in the system. We have mainly concentrated on inferring the number of genes in the ON and OFF states from experimental data.

Keywords: systems biology, probabilistic model, inference, biology, model

Procedia PDF Downloads 349
8826 Experimental Investigation of Cutting Forces and Temperature in Bone Drilling

Authors: Vishwanath Mali, Hemant Warhatkar, Raju Pawade

Abstract:

Drilling of bone has always been challenging for surgeons due to the adverse effects it may have on bone tissue. In conventional bone drilling, force has to be applied manually by the surgeon, which may lead to permanent death of bone tissue and nerves. During bone drilling, the temperature of the bone tissue rises to values above 47 °C, which causes thermal osteonecrosis, resulting in screw loosening and subsequent implant failure. An attempt has been made here to study the input drilling parameters and surgical drill bit geometry affecting bone health during bone drilling. A One Factor At a Time (OFAT) method is used to plan the experiments. The input drilling parameters studied include spindle speed and feed rate. The drill bit geometry parameters studied include point angle and helix angle. The output variables are drilling thrust force and bone temperature. The experiments were conducted on goat femur bone at a room temperature of 30 °C. For measurement of thrust forces, a KISTLER cutting force dynamometer Type 9257BA was used. For continuous data acquisition of temperature, NI LabVIEW software was used. A fixture was made on an RPT machine for holding the bone specimen while performing the drilling operation. Bone specimens were preserved in a deep freezer (LABTOP make) under -40 °C. For the drilling parameters, it is observed that at constant feed rate, when spindle speed increases, thrust force as well as temperature decreases, and at constant spindle speed, when feed rate increases, thrust force as well as temperature increases. The effect of drill bit geometry shows that at constant helix angle, when point angle increases, thrust force as well as temperature increases, and at constant point angle, when helix angle increases, thrust force as well as temperature decreases. Hence it is concluded that as the thrust force increases, temperature increases. For the drilling parameters, the lowest thrust force and temperature, i.e., 35.55 N and 36.04 °C respectively, were recorded at spindle speed 2000 rpm and feed rate 0.04 mm/rev. For the drill bit geometry parameters, the lowest thrust force and temperature, i.e., 40.81 N and 34 °C respectively, were recorded at point angle 70° and helix angle 25°. Hence, to avoid thermal necrosis of bone, it is recommended to use higher spindle speed, lower feed rate, low point angle and high helix angle. The hard nature of cortical bone contributes to a greater rise in temperature, whereas a considerable drop in temperature is observed during cancellous bone drilling.

Keywords: bone drilling, helix angle, point angle, thrust force, temperature, thermal necrosis

Procedia PDF Downloads 309
8825 The Role of Waqf Forestry for Sustainable Economic Development: A Panel Logit Analysis

Authors: Patria Yunita

Abstract:

Kuznets’ environmental curve analysis suggests sacrificing economic development to reduce environmental problems. However, we hope to achieve sustainable economic development. In this case, Islamic social finance, especially that of waqf in Indonesia, can be used as a solution to bridge the problem of environmental damage to the sustainability of economic development. The Panel Logit Regression method was used to analyze the probability of increasing economic growth and the role of waqf in the environmental impact of CO₂ emissions. This study uses panel data from 33 Indonesian provinces. The data used were the National Waqf Index, Forest Area, Waqf Land Area, Growth Rate of Regional Gross Domestic Product (YoY), and CO₂ Emissions for 2018-2022. Data were obtained from the Indonesian Waqf Board, Climate World Data, the Ministry of the Environment, and the Bank of Indonesia. The results prove that CO₂ emissions have a negative effect on regional economic growth and that waqf governance in the waqf index has a positive effect on regional economic growth in 33 provinces.

Keywords: waqf, CO₂ emissions, panel logit analysis, sustainable economic development

Procedia PDF Downloads 41
8824 Intelligent Human Pose Recognition Based on EMG Signal Analysis and Machine 3D Model

Authors: Si Chen, Quanhong Jiang

Abstract:

Posture recognition technology is increasingly mature, and human movement information is now widely used in sports rehabilitation, human-computer interaction, medical health, human posture assessment, and other fields. This project takes an original approach: it proposes using acquisition equipment to collect myoelectric data, reflecting the change in muscle posture along one degree of freedom through data processing, jointly adjusting the data with a three-dimensional muscle model, and thereby realizing basic pose recognition. On this basis, bionic aids or medical rehabilitation equipment can be further developed with the help of robotic arms and cutting-edge technology, which offers promising prospects for future development.

Keywords: pose recognition, 3D animation, electromyography, machine learning, bionics

Procedia PDF Downloads 79