Search results for: Data science
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26307

25617 A Cross-Disciplinary Educational Model in Biomanufacturing to Sustain a Competitive Workforce Ecosystem

Authors: Rosa Buxeda, Lorenzo Saliceti-Piazza, Rodolfo J. Romañach, Luis Ríos, Sandra L. Maldonado-Ramírez

Abstract:

Biopharmaceutical manufacturing is one of the major economic activities worldwide. Ninety-three percent of the workforce in a biomanufacturing environment is concentrated in production-related areas. As a result, strategic collaborations between industry and academia are crucial to ensure the availability of the knowledgeable workforce an economic region needs to become competitive in biomanufacturing. In the past decade, our institution has been a key strategic partner with multinational biotechnology companies in supplying science and engineering graduates in the field of industrial biotechnology. Initiatives addressing all levels of the educational pipeline, from K-12 to college to continuing education for company employees, have been established over a ten-year span. The Amgen BioTalents Program was designed to provide undergraduate science and engineering students with training in biomanufacturing. The areas targeted by this educational program enhance their academic development, since these topics are not part of their traditional science and engineering curricula. The curriculum covered the full process of producing a biomolecule, from the genetic engineering of cells and the expression and purification of a specifically targeted polypeptide to quality control and validation. This paper reports and describes the implementation details and outcomes of the first sessions of the program.

Keywords: biomanufacturing curriculum, interdisciplinary learning, workforce development, industry-academia partnering

Procedia PDF Downloads 280
25616 Podcasting: A Tool for an Enhanced Learning Experience of Introductory Courses to Science and Engineering Students

Authors: Yaser E. Greish, Emad F. Hindawy, Maryam S. Al Nehayan

Abstract:

Introductory courses such as General Chemistry I, General Physics I, and General Biology need special attention, as students taking these courses are usually in their first year at the university. In addition to the language barrier most of them face, they also encounter other difficulties if these elementary courses are taught in the traditional way. A change to the routine method of teaching these courses is therefore warranted. In this regard, podcasting of chemistry lectures was used as an add-on to the traditional and non-traditional methods of teaching chemistry to science and non-science students. Podcasts are video files distributed in a digital format through the Internet to personal computers or mobile devices. Podcasts can also be characterized by their pedagogical strategy: three distinct teaching approaches are evident in the current literature, namely receptive viewing, problem-solving, and created video podcasts. While the digital format and distribution of video podcasts have stabilized over the past eight years, the types of podcasts vary considerably according to their purpose, degree of segmentation, pedagogical strategy, and academic focus. In this regard, the whole syllabus of the 'General Chemistry I' course was developed as podcasts and delivered to students throughout the semester. Students used the podcasted files extensively during their studies, especially as part of their preparation for exams. Student feedback strongly supported the idea of podcasting, reflecting its effect on the overall understanding of the subject and a consequent improvement in their grades.

Keywords: podcasting, introductory course, interactivity, flipped classroom

Procedia PDF Downloads 257
25615 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has seen major advances in recent years, driven by the spread of the Internet, which generates a tremendous volume of data every day, and by immense advances in the technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determines to which group each instance in a given dataset belongs; they are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine-learning based, and each type has its own limitations. Nowadays, data are becoming increasingly heterogeneous; consequently, current classification techniques encounter many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed most common classification techniques, with an F-measure above 97% on the Iris dataset.
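The abstract does not spell out the measure functions or the reliability computation, so the following is only a minimal sketch of the general family of methods it describes: a similarity-weighted classifier on the Iris dataset whose prediction comes with a crude reliability score. The inverse-distance similarity and the agreement-ratio reliability are assumptions for illustration, not the paper's definitions.

```python
# Minimal sketch of a similarity-based classifier with a reliability score.
# The paper's actual measure functions are not given in the abstract; an
# inverse-distance similarity and a simple agreement ratio are assumed here.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def classify(x, k=5):
    d = np.linalg.norm(X_tr - x, axis=1)           # distances to training set
    sim = 1.0 / (1.0 + d)                          # assumed similarity measure
    nearest = np.argsort(-sim)[:k]                 # k most similar instances
    votes = np.bincount(y_tr[nearest], weights=sim[nearest], minlength=3)
    reliability = votes.max() / votes.sum()        # agreement among neighbours
    return votes.argmax(), reliability

preds = [classify(x)[0] for x in X_te]
print("accuracy:", np.mean(np.array(preds) == y_te))
```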

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 459
25614 Seismic Data Scaling: Uncertainties, Potential and Applications in Workstation Interpretation

Authors: Ankur Mundhra, Shubhadeep Chakraborty, Y. R. Singh, Vishal Das

Abstract:

Seismic data scaling affects the dynamic range of the data; with present-day low storage costs and the high reliability of hard disks, scaling is generally not suggested. However, when dealing with data of different vintages, which may have been processed in 16 or even 8 bits and need to be combined with 32-bit data, scaling is performed. Scaling also amplifies low-amplitude events in the deeper region, which otherwise disappear because high-amplitude shallow events saturate the amplitude scale. We have focused on the significance of scaling data to aid interpretation. This study elucidates a proper seismic loading procedure in workstations that does not rely on the default preset parameters available in most software suites. Differences and distributions of amplitude values at different depths in seismic data are probed in this exercise. Proper loading parameters are identified, and the associated steps to be taken while loading data are explained. Finally, the exercise interprets the uncertainties which might arise when correlating scaled and unscaled versions of seismic data with synthetics. As a seismic well tie correlates seismic reflection events with well markers, it is used in our study to identify regions which are enhanced and/or affected by scaling parameter(s).
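To make the dynamic-range argument concrete, here is a toy numeric sketch (synthetic amplitudes, not the study's data) showing how scaling driven by the strongest shallow event quantises weak deep events to zero in an 8-bit representation:

```python
# Toy illustration with synthetic amplitudes (not the study's data):
# scaling driven by the strongest shallow event quantises deep events away.
import numpy as np

trace = np.array([2000.0, -1500.0, 5.0, -3.0, 1.0])   # shallow vs. deep events

span_db = 20 * np.log10(np.abs(trace).max() / np.abs(trace).min())
print(f"amplitude span: {span_db:.0f} dB")             # ~66 dB here
# Rule of thumb: ~6 dB of dynamic range per bit, i.e. roughly 48 dB for
# 8-bit and 96 dB for 16-bit samples -- too little for the span above.

scale = 127.0 / np.abs(trace).max()                    # map peak to int8 max
print(np.round(trace * scale).astype(np.int8))         # -> [127 -95 0 0 0]
```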

Keywords: clipping, compression, resolution, seismic scaling

Procedia PDF Downloads 463
25613 The Relationship between Competency-Based Learning and Learning Efficiency of Media Communication Students at Suan Sunandha Rajabhat University

Authors: Somtop Keawchuer

Abstract:

This research aims to study (1) the relationship between competency-based learning and the learning efficiency of new media communication students at Suan Sunandha Rajabhat University and (2) the effect of demographic factors on the learning efficiency of students at the university. This is a quantitative study; data were collected through questionnaires distributed to a purposive sample of 1,340 new media communication students in the Faculty of Management Science at Suan Sunandha Rajabhat University. Data were analyzed using descriptive statistics, including percentage, mean, and standard deviation, and inferential statistics, including the t-test, ANOVA, and Pearson correlation, for hypothesis testing. The results showed that competency-based learning, in terms of the ability to communicate, the ability to think and solve problems, life skills, and the ability to use technology, has a significant relationship with learning efficiency in the cognitive, psychomotor, and affective domains at the 0.05 level, in line with the research hypotheses.

Keywords: competency-based learning, learning efficiency, new media communication students, Suan Sunandha Rajabhat University

Procedia PDF Downloads 239
25612 Psychological Perspectives on Modern Restaurant Interior Design Based on Traditional Elements (Case Study: Interior Design of the Mesineh Restaurant, Tehran, Iran)

Authors: Raheleh Saifiabolhassan

Abstract:

In the post-industrial era, when a wide variety of foods and drinks is readily available everywhere, the motive for dining has shifted from meeting basic nutritional needs to enjoying the eating experience. Today, behavioral environmental studies are an essential branch of science when it comes to understanding, analyzing, and evaluating how humans react to the environment. Similarly, these studies explore customer-influencing factors and the effectiveness of restaurant designs. To facilitate a pleasant dining experience, the authors focused on acoustics, flexibility, and lighting. In this study, 2,700 square feet of surface area was used to plan a restaurant (called Mesineh) based on behavioral science, considering many factors related to the interaction between the building and its users, such as flexibility and privacy, acoustics, and light. Environmental psychology considerations have been lacking in architectural design for several decades. To fill this gap, the author evaluated environmental psychology standards and applied them to Mesineh's design. Customers of the Mesineh restaurant will feel a sense of nostalgia thanks to its interior design, which combines historical and contemporary elements. Additionally, vernacular Persian architectural elements were incorporated into a modern context to fulfill the behavioral science component of the interior design.

Keywords: Mesineh restaurant, interior design, behavioral sciences, environmental psychology, traditional Persian architecture

Procedia PDF Downloads 204
25611 Integrating Evidence Into Health Policy: Navigating Cross-Sector and Interdisciplinary Collaboration

Authors: Tessa Heeren

Abstract:

The following proposal pertains to the complex process of successfully implementing health policies that are based on public health research. Background: Using evidence to inform public health decision-making has been proven effective; however, it is not clear how research is applied in practice. Aims: The objective of the current study was to assess the extent to which evidence is used in the public health decision-making process. Methods: A systematic review was conducted by the author and faculty at the Cluj School of Public Health in Romania. To identify eligible studies, seven bibliographic databases, specifically PubMed, Scopus, Cochrane Library, Science Direct, Web of Science, ClinicalKey, and Health and Safety Science Abstracts, were screened (search dates: 1990 – September 2015); a general internet search was also conducted. Primary research and systematic reviews about the use of evidence in public health policy in Europe were included. The studies considered for inclusion were assessed by two reviewers, and data on objective, methods, population, and results were extracted. Data were synthesized as a narrative review. Results: Of 2,564 articles initially identified, 2,525 titles and abstracts were screened. Ultimately, 30 articles fit the research criteria by describing how or why evidence is or is not used in public health policy. The majority of included studies involved interviews and surveys (N=17). Study participants were policy makers, health care professionals, researchers, community members, service users, and experts in public health. The reviewed articles covered a wide range of topics, such as barriers and facilitators to multi-sector collaboration, differences in professional cultures, and systemic obstacles; they focused on identifying barriers and facilitators that arise in cross-sector collaboration rather than on the process and impact of integrating evidence into policy. In addition, the type of evidence used in policy was rarely specified, and widely varying interpretations of the definition of evidence complicated overall conclusions. The reviewed literature identified communication, collaboration, user-friendly dissemination, and documentation of processes in the execution of applied research as important themes for the promotion of evidence in the public health decision-making process. In general, the evidence base in the field of integrating research into policy lacks critical details of the actual process of developing evidence-based policy; this shortcoming creates a barrier to replicating the collaborative efforts described in studies. Recommendations made on the basis of these findings include using politically relevant levers to promote research (e.g., campaign donors, lobbies, established parties), modernizing dissemination practices, and reforms in which the involvement of external stakeholders is facilitated without relying on invitations from individual policy makers. This proposal fits the AcademyHealth National Health Policy Conference because it identifies and examines differences between the worlds of research and politics.

Keywords: cross-sector, dissemination, health policy, policy implementation

Procedia PDF Downloads 217
25610 Association of Social Data as a Tool to Support Government Decision Making

Authors: Diego Rodrigues, Marcelo Lisboa, Elismar Batista, Marcos Dias

Abstract:

Based on data on child labor, this work raises questions about how to understand and locate the factors that make up child labor rates, and which properties are important for analyzing these cases. Using data mining techniques to discover valid patterns in Brazilian social databases, data on child labor in the State of Tocantins (in northern Brazil, with a territory of 277,000 km2 comprising 139 counties) were evaluated. This work aims to detect factors that are deterministic for the practice of child labor and their relationships with financial, educational, regional, and social indicators, generating information that is not explicit in the government databases and thus enabling better monitoring and updating of policies for this purpose.
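The abstract does not name a specific association-mining algorithm, so the following minimal sketch only illustrates the underlying idea with plain support/confidence counting over invented boolean county-level indicators:

```python
# Plain support/confidence counting over invented boolean county-level
# indicators; the real algorithm and variables are not given in the abstract.
records = [                      # one synthetic record per county
    {"low_income", "low_school_attendance", "child_labor"},
    {"low_income", "child_labor"},
    {"low_income", "low_school_attendance", "child_labor"},
    {"low_school_attendance"},
    {"low_income"},
]

def support(itemset):
    """Fraction of records containing every item in the set."""
    return sum(itemset <= r for r in records) / len(records)

for factor in ("low_income", "low_school_attendance"):
    lhs = {factor}
    confidence = support(lhs | {"child_labor"}) / support(lhs)
    print(f"{factor} -> child_labor: support={support(lhs):.2f}, "
          f"confidence={confidence:.2f}")
```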

Keywords: social data, government decision making, association of social data, data mining

Procedia PDF Downloads 365
25609 Outlier Detection in Stock Market Data using Tukey Method and Wavelet Transform

Authors: Sadam Alwadi

Abstract:

Outlier values are a problem that frequently occurs in the data observation or recording process, so the need for data imputation has become an essential matter. This work makes use of the methods described in prior work to detect outlier values in a collection of stock market data. In order to implement the detection and find solutions that may be helpful for investors, real closing price data were obtained from the Amman Stock Exchange (ASE). The Tukey method and the Maximal Overlap Discrete Wavelet Transform (MODWT) are used to detect and impute the outlier values.
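The Tukey half of the method is simple enough to sketch; the MODWT half needs a wavelet library and is omitted. In this minimal sketch the prices are synthetic (not ASE data), and the imputation rule, replacing flagged points with the median of the remaining ones, is an assumption for illustration:

```python
# Tukey's fences on a synthetic closing-price series; the median-based
# imputation rule is an assumed choice, not necessarily the paper's.
import numpy as np

prices = np.array([10.2, 10.4, 10.3, 25.0, 10.5, 10.1, 0.9, 10.6])

q1, q3 = np.percentile(prices, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr     # Tukey's fences

outliers = (prices < lower) | (prices > upper)
print("flagged:", prices[outliers])                # -> [25.   0.9]

cleaned = prices.copy()
cleaned[outliers] = np.median(prices[~outliers])   # simple imputation
print("imputed:", cleaned)
```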

Keywords: outlier values, imputation, stock market data, detecting, estimation

Procedia PDF Downloads 77
25608 Automatic and High Precise Modeling for System Optimization

Authors: Stephanie Chen, Mitja Echim, Christof Büskens

Abstract:

To describe and predict the behavior of a system, mathematical models are formulated, and parameter identification is used to adapt the coefficients of the underlying laws of science. For complex systems this approach can be incomplete, and hence imprecise, and moreover too slow to compute efficiently. Such models may therefore not be applicable to the numerical optimization of real systems, since these techniques require numerous evaluations of the models. Moreover, not all quantities necessary for the identification may be available, so the model must be adapted manually. This paper therefore describes an approach that generates models which overcome the aforementioned limitations by focusing not on physical laws but on measured (sensor) data from real systems. The approach is more general, since it generates models for any system regardless of the scientific background. Additionally, it can be used in a broader sense, since it is able to automatically identify correlations in the data. The method can be classified as a multivariate data regression analysis. In contrast to many other data regression methods, this variant is also able to identify correlations among products of variables, not only among single variables. This enables a far more precise representation of causal correlations. The basis and explanation of this method come from an analytical background: the series expansion. Another advantage of this technique is the possibility of real-time adaptation of the generated models during operation. Herewith, system changes due to aging, wear, or perturbations from the environment can be taken into account, which is indispensable for realistic scenarios. Since these data-driven models can be evaluated very efficiently and with high precision, they can be used in mathematical optimization algorithms that minimize a cost function, e.g., time, energy consumption, operational costs, or a mixture of them, subject to additional constraints. The proposed method has been tested successfully in several complex applications with strong industrial requirements. The generated models were able to simulate the given systems with an error in precision of less than one percent, and the automatic identification of correlations discovered previously unknown relationships. In summary, the approach efficiently computes high-precision, real-time-adaptive, data-based models in different fields of industry. Combined with an effective mathematical optimization algorithm like WORHP (We Optimize Really Huge Problems), complex systems can be represented by a high-precision model and optimized according to the user's wishes. The proposed methods are illustrated with different examples.
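The core idea of fitting correlations of products of variables, in the spirit of a truncated series expansion, can be sketched with an ordinary least-squares fit; the basis and the synthetic "true" system below are assumptions for illustration, not the authors' algorithm:

```python
# Least-squares fit over a basis that includes a product term, so the model
# captures correlations of products of variables, not only single variables.
import numpy as np

rng = np.random.default_rng(0)
x1, x2 = rng.uniform(-1, 1, (2, 200))             # two measured quantities
y = 1.5 + 2.0 * x1 - 0.5 * x2 + 3.0 * x1 * x2     # hidden "true" system

A = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])  # [1, x1, x2, x1*x2]
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coeffs, 3))                        # -> [ 1.5  2.  -0.5  3. ]
```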

Keywords: adaptive modeling, automatic identification of correlations, data based modeling, optimization

Procedia PDF Downloads 400
25607 PEINS: A Generic Compression Scheme Using Probabilistic Encoding and Irrational Number Storage

Authors: P. Jayashree, S. Rajkumar

Abstract:

With social networks and smart devices generating a multitude of data, effective data management is the need of the hour for networks and cloud applications. Some applications need effective storage, while others need effective communication over networks; data reduction is a handy solution that meets both requirements. Most data compression techniques are based on data statistics and may result in either lossy or lossless data reduction. Though lossy reduction produces better compression ratios than lossless methods, many applications require data accuracy, with fine details preserved. A variety of data compression algorithms exist in the literature for different forms of data, such as text, image, and multimedia data. In the proposed work, a generic progressive compression algorithm based on probabilistic encoding, called PEINS, is presented as an enhancement of the irrational-number-storage coding technique, to address the storage demands of increasing data volumes cost-effectively; it also offers a degree of data security as a secondary outcome. The proposed work achieves cost-effectiveness in terms of a better compression ratio with no deterioration in compression time.

Keywords: compression ratio, generic compression, irrational number storage, probabilistic encoding

Procedia PDF Downloads 285
25606 IoT Device Cost-Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Omobayo Esan, Muienge Mbodila, Patrick Bowe

Abstract:

This paper focuses on a cost-effective storage architecture using a fog and cloud data storage gateway and presents the design of a data privacy model and a framework for real-time data analytics using machine learning methods. The paper begins with the system analysis, the system architecture and its component design, and the overall system operations. The results on the data privacy model show that combining two or more privacy models yields stronger protection of the data. The results also show that a fog storage gateway has several advantages over traditional cloud storage: reduced latency, lower bandwidth consumption, and lower energy usage, so fog storage helps to lessen excessive cost. The paper dwells on the system description, with a focus on the research design and the framework design for the data privacy model, data storage, and real-time analytics; it also presents the major system components and their framework specification, and finally the overall system architecture, its structure, and its interrelationships.

Keywords: IoT, fog, cloud, data analysis, data privacy

Procedia PDF Downloads 91
25605 Comparison of Selected Pier-Scour Equations for Wide Piers Using Field Data

Authors: Nordila Ahmad, Thamer Mohammad, Bruce W. Melville, Zuliziana Suif

Abstract:

Current methods for predicting local scour at wide bridge piers were developed on the basis of laboratory studies, and very few scour predictions have been tested against field data. A laboratory wide-pier scour equation from previous findings is compared here with field data. A wide range of field data, comprising both live-bed and clear-water scour, was used. A method for assessing the quality of the data was developed and applied to the data set. Three other wide-pier scour equations from the literature were used to compare the performance of each predictive method, and the best-performing scour equation was identified using statistical analysis. Comparisons of computed and observed scour depths indicate that the equation from the previous publication produced the smallest discrepancy ratio and RMSE value when compared with the large body of laboratory and field data.
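The two comparison statistics named above are easy to compute; in this sketch the scour depths are invented, and the log-ratio form of the discrepancy ratio is one common definition, assumed here rather than taken from the paper:

```python
# RMSE and a (log-ratio) discrepancy ratio on invented scour depths in
# metres; not the study's field data.
import numpy as np

observed = np.array([1.8, 2.4, 3.1, 0.9])    # measured local scour depths
computed = np.array([2.0, 2.1, 3.6, 1.1])    # depths from a scour equation

rmse = np.sqrt(np.mean((computed - observed) ** 2))
discrepancy = np.log10(computed / observed)  # 0 would mean perfect agreement

print(f"RMSE = {rmse:.2f} m")
print("discrepancy ratios:", np.round(discrepancy, 3))
```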

Keywords: field data, local scour, scour equation, wide piers

Procedia PDF Downloads 398
25604 Eye Tracking Syntax in Language Education

Authors: Marcus Maia

Abstract:

The present study reports and discusses the use of qualitative eye tracking data in reading workshops in Brazilian middle and high schools and in Generative Syntax and Sentence Processing courses at the undergraduate and graduate levels at the Federal University of Rio de Janeiro. Both endeavors take the sentential level as the proper object to be metacognitively explored in language education (cf. Chomsky, Gallego & Ott, 2019) to develop the innate science-forming capacity and knowledge of language. In both projects, non-discrepant qualitative eye tracking data, collected and quantitatively analyzed in experimental syntax and psycholinguistic studies carried out in Lapex (Experimental Psycholinguistics Laboratory of the Federal University of Rio de Janeiro), were displayed to students as a point of departure, triggering discussions. Classes would generally start with the display of videos showing eye tracking data, such as gaze plots and heatmaps from several studies in psycholinguistics and experimental syntax already developed in our laboratory. The videos usually triggered discussions with students about linguistic and psycholinguistic issues, such as reading sentences for gist, garden-path sentences, syntactic and semantic anomalies, the filled-gap effect, island effects, direct and indirect causation, and recursive constructions, among other topics. Active, problem-solving-based methodologies were employed with the objective of stimulating student participation. The communication also discusses the importance of developing full literacy, epistemic vigilance, and intellectual self-defense in an infodemic world, along the lines of Maia (2022).

Keywords: reading, educational psycholinguistics, eye-tracking, active methodology

Procedia PDF Downloads 59
25603 The Maximum Throughput Analysis of UAV Datalink 802.11b Protocol

Authors: Inkyu Kim, SangMan Moon

Abstract:

The IEEE 802.11b protocol provides a data rate of up to 11 Mbps, whereas the aerospace industry seeks higher-data-rate COTS data link systems for UAVs. The total maximum throughput (TMT) and delay time have been studied by many researchers in past years. This paper provides the theoretical data throughput performance of a UAV formation-flight data link using existing 802.11b performance theory. We operate a UAV formation flight of more than 30 quadcopters using the 802.11b protocol. The results suggest that the size of a UAV formation is bounded by the performance limitations of the data link protocol.
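As a point of reference, the theoretical maximum throughput of a single 802.11b link can be estimated from the per-frame overheads; the sketch below uses textbook DSSS timing values (long preamble, no RTS/CTS, ACK at 2 Mbps), which are assumptions, not figures from the paper:

```python
# Back-of-the-envelope maximum throughput for a single 802.11b link,
# using textbook DSSS timing constants (assumed, not from the paper).
SLOT, SIFS = 20e-6, 10e-6
DIFS = SIFS + 2 * SLOT                   # 50 us
PLCP = 192e-6                            # long PLCP preamble + header
MAC_OVERHEAD = 28                        # MAC header + FCS, bytes
ACK = PLCP + 14 * 8 / 2e6                # 14-byte ACK at 2 Mbps
BACKOFF = (31 / 2) * SLOT                # mean backoff, CWmin = 31

def throughput(payload_bytes, rate=11e6):
    data = PLCP + (MAC_OVERHEAD + payload_bytes) * 8 / rate
    frame = DIFS + BACKOFF + data + SIFS + ACK
    return payload_bytes * 8 / frame

print(f"{throughput(1500) / 1e6:.2f} Mbps")   # ~6.2 Mbps of the nominal 11
```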

Keywords: UAV datalink, UAV formation flight datalink, UAV WLAN datalink application, UAV IEEE 802.11b datalink application

Procedia PDF Downloads 386
25602 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction based on models derived from existing data. The data can present identification patterns, which are used to classify instances into groups. The result of the analysis is a pattern that can be used to identify a data set without needing the input data used to create the pattern. An important requirement in this process is careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. In the case of missing pedigree information, other methods can be used to trace an animal's origin. The genetic diversity written in genetic data holds relatively useful information for identifying animals originating from individual countries. We can conclude that the application of data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and for identifying an individual.

Keywords: genetic data, Pinzgau cattle, supervised learning, machine learning

Procedia PDF Downloads 541
25601 Router 1X3 - RTL Design and Verification

Authors: Nidhi Gopal

Abstract:

Routing is the process of moving a packet of data from source to destination; it enables messages to pass from one computer to another and eventually reach the target machine. A router is a networking device that forwards data packets between computer networks. It is connected to two or more data lines from different networks (as opposed to a network switch, which connects data lines from one single network). This paper mainly emphasizes the study of the router device and its top-level architecture, and shows how the various sub-modules of the router, i.e., register, FIFO, FSM, and synchronizer, are synthesized, simulated, and finally connected to the top module.

Keywords: data packets, networking, router, routing

Procedia PDF Downloads 797
25600 Transdisciplinary Methodological Innovation: Connecting Natural and Social Sciences Research through a Training Toolbox

Authors: Jessica M. Black

Abstract:

Although much of natural and social science research aims to enhance human flourishing and address social problems, the training within the two fields differs significantly across theory, methodology, and implementation of results. Social scientists are trained in social, psychological, and, to the extent that it is relevant to their discipline, spiritual development, theory, and the accompanying methodologies. They tend not to receive training in interrogating human development and social problems from a biological perspective, or in the accompanying methodology. On the other hand, those in the natural sciences, and for the purposes of this work the human biological sciences specifically (biology, neuroscience, genetics, epigenetics, and physiology), are often trained first to consider cellular development and related methodologies, and may not receive formal training in many of the foundational principles that guide human development, such as systems theory or the person-in-environment framework, in methodology for tapping both proximal and distal psycho-social-spiritual influences on human development, or in foundational principles of equity, justice, and inclusion in research design. There is a need for disciplines heretofore siloed to know one another, to receive streamlined, easy-to-access training in theory and methods from one another, and to learn how to build interdisciplinary teams that can speak and act upon a shared research language. Team science is more essential than ever, as are transdisciplinary approaches to training and research design. This study explores the use of a methodological toolbox that natural and social scientists can navigate through a decision-making tree covering project aims, costs, and participants, among other important study variables. The decision tree begins with a choice between learning more about social science approaches or biological approaches to study design. The toolbox and platform are flexible, so users can also choose among modules, for instance reviewing epigenetics or community-based participatory research even if those are already part of their home field. To start, both natural and social scientists would receive training on systems science, team science, transdisciplinary approaches, and translational science. Next, social scientists would receive training on grounding biological theory and the following methodological approaches and tools: physiology, (epi)genetics, non-invasive neuroimaging, invasive neuroimaging, endocrinology, and the gut-brain connection. Natural scientists would receive training on grounding social science theory and on measurement, including variables, assessments, and surveys of human development related to the developing person (e.g., temperament and identity), microsystems (e.g., systems that directly interact with the person, such as family and peers), mesosystems (e.g., systems that interact with one another but not directly with the person, such as the relationship between a parent and a teacher), exosystems (e.g., settings that affect the individual without direct interaction, such as a parent's work environment), macrosystems (e.g., wider culture and policy), and the chronosystem (e.g., historical time, such as the generational impact of trauma). Participants will be able to engage with the toolbox and with one another to foster increased transdisciplinary work.

Keywords: methodology, natural science, social science, transdisciplinary

Procedia PDF Downloads 102
25599 National System of Innovation in Zambia: Towards Socioeconomic Development

Authors: Ephraim Daka, Maxim Kotsemir

Abstract:

National systems of innovation (NSI) have recently proliferated as vehicles for addressing poverty and national competitiveness in developing countries. While several governments in Sub-Saharan Africa have adapted developed countries' models of innovation to local conditions, the Zambian case is rather unique. This study highlights conceptual and socioeconomic challenges affecting the performance of the NSI. The paper analyses science and technology strategies that include 'innovation' and their effect on improving socioeconomic conditions. The authors reviewed STI policy and national strategy documents, conducted interviews, and compared the findings with regional and national economic data sets. The NSI, its inter-linkages, and its support mechanisms for socioeconomic development were explored.

Keywords: national system of innovation, socioeconomics, development, Zambia

Procedia PDF Downloads 214
25598 Performance Evaluation and Comparison between the Empirical Mode Decomposition, Wavelet Analysis, and Singular Spectrum Analysis Applied to the Time Series Analysis in Atmospheric Science

Authors: Olivier Delage, Hassan Bencherif, Alain Bourdier

Abstract:

Signal decomposition approaches represent an important step in time series analysis, providing useful knowledge and insight into the data and the characteristics of the underlying dynamics, while also facilitating tasks such as noise removal and feature extraction. As most observational time series are nonlinear and nonstationary, resulting from the interaction of several physical processes at different time scales, experimental time series fluctuate at all time scales and require specific signal decomposition techniques. The most commonly used techniques are data-driven, enabling well-behaved signal components to be obtained without any prior assumptions on the input data. Among the most popular time series decomposition techniques, and the most cited in the literature, are the empirical mode decomposition and its variants, the empirical wavelet transform, and singular spectrum analysis. With the increasing popularity and utility of these methods in wide-ranging applications, it is imperative to gain a good understanding of and insight into the operation of these algorithms. In this work, we describe all of the techniques mentioned above as well as their ability to denoise signals, capture trends, identify components corresponding to the physical processes involved in the evolution of the observed system, and deduce the dimensionality of the underlying dynamics. Results obtained with all of these methods on experimental total ozone column and rainfall time series are discussed and compared.
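Of the three families, singular spectrum analysis is compact enough to sketch end to end; the minimal version below (embedding, SVD, diagonal averaging) runs on a synthetic trend-plus-cycle series, not the ozone or rainfall data, and the component groupings in the last lines are assumptions:

```python
# Minimal singular spectrum analysis (SSA): embed, decompose with SVD,
# and reconstruct each component by diagonal averaging.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(200)
x = 0.02 * t + np.sin(2 * np.pi * t / 20) + 0.3 * rng.standard_normal(200)

L = 50                                                 # embedding window
K = len(x) - L + 1
X = np.column_stack([x[i:i + L] for i in range(K)])    # trajectory matrix
U, s, Vt = np.linalg.svd(X, full_matrices=False)

def reconstruct(indices):
    """Diagonal-average selected elementary matrices back into a series."""
    Xr = (U[:, indices] * s[indices]) @ Vt[indices]
    out, counts = np.zeros(len(x)), np.zeros(len(x))
    for i in range(L):
        for j in range(K):
            out[i + j] += Xr[i, j]
            counts[i + j] += 1
    return out / counts

trend = reconstruct([0])            # leading component ~ the slow trend
oscillation = reconstruct([1, 2])   # paired components ~ the 20-step cycle
```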

Keywords: denoising, empirical mode decomposition, singular spectrum analysis, time series, underlying dynamics, wavelet analysis

Procedia PDF Downloads 101
25597 Noise Reduction in Web Data: A Learning Approach Based on Dynamic User Interests

Authors: Julius Onyancha, Valentina Plekhanova

Abstract:

One of the significant issues facing web users is the amount of noise in web data, which hinders the process of finding useful information related to their dynamic interests. Current research works consider noise to be any data that does not form part of the main web page and propose noise web data reduction tools that mainly focus on eliminating noise in relation to the content and layout of web data. This paper argues that not all data that form part of the main web page are of interest to a user, and that not all noise data are actually noise to a given user. Therefore, learning the noise web data allocated to a user's requests ensures not only a reduction of the noisiness level in a web user profile but also a decrease in the loss of useful information, hence improving the quality of the web user profile. A Noise Web Data Learning (NWDL) tool/algorithm capable of learning noise web data in a web user profile is proposed. The proposed work considers the elimination of noise data in relation to dynamic user interest. In order to validate the performance of the proposed work, an experimental design setup is presented, and the results obtained are compared with current algorithms applied in the noise web data reduction process. The experimental results show that the proposed work considers the dynamic change of user interest prior to the elimination of noise data. The proposed work contributes towards improving the quality of a web user profile by reducing the amount of useful information eliminated as noise.

Keywords: web log data, web user profile, user interest, noise web data learning, machine learning

Procedia PDF Downloads 259
25596 The Development of Student Core Competencies through the STEM Education Opportunities in Classroom

Authors: Z. Dedovets, M. Rodionov

Abstract:

The goal of the modern education system is to prepare students to be able to adapt to ever-changing life situations. They must be able to acquire required knowledge independently; apply such knowledge in practice to solve various problems by using modern technologies; think critically and creatively; use information competently; be communicative and work in a team; and develop their own moral values, intellect, and cultural awareness. As a result, the status of education significantly increases, and new requirements for its quality have formed. In recent years, the competency-based approach in education has attracted significant interest. This approach strengthens the applied and practical character of school education and leads to the formation of the key student competencies that define success in later life. In this article, the authors focus on a range of key competencies (educational, informational, and communicative) and on the possibility of developing such competencies via STEM education. This research shows the change in students' attitudes towards scientific disciplines such as mathematics, general science, technology, and engineering as a result of STEM education. A two-stage analysis of questionnaires completed by students in forms II to IV in the Republic of Trinidad and Tobago allowed the authors to place students in two levels representing their attitude towards the various disciplines. The significance of differences between the selected levels was confirmed using Pearson's chi-squared test. In summary, the analysis of the obtained data makes it possible to conclude that STEM education has great potential for the development of core student competencies and encourages positive student attitudes towards the above-mentioned scientific disciplines.
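The significance test named above is standard; a minimal sketch with scipy, on an invented 2x2 contingency table of attitude level versus STEM exposure, looks like this:

```python
# Pearson's chi-squared test on a 2x2 contingency table of attitude level
# vs. exposure to STEM activities; the counts are invented for illustration.
from scipy.stats import chi2_contingency

#                 positive  negative   (attitude level)
table = [[48, 12],          # students in a STEM programme
         [30, 30]]          # students not in a STEM programme

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```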

Keywords: STEM, science, technology, engineering, mathematics, students’ competency, Pearson's chi-squared test

Procedia PDF Downloads 381
25595 Data Mining and Knowledge Management Application to Enhance Business Operations: An Exploratory Study

Authors: Zeba Mahmood

Abstract:

Modern business organizations are adopting technological advancements to achieve a competitive edge and satisfy their consumers. Developments in the field of information technology systems have changed the way business is conducted today. Business operations now rely more on the data they obtain, and this data is continuously increasing in volume. Data stored in different locations is difficult to find and use without the effective implementation of data mining and knowledge management techniques. Organizations that smartly identify, obtain, and convert data into useful formats for decision making and operational improvement create additional value for their customers and enhance their operational capabilities. Marketers and customer relationship departments of firms use data mining techniques to make relevant decisions; this paper emphasizes the identification of the different data mining and knowledge management techniques that are applied across business industries. The challenges and issues in executing these techniques are also discussed and critically analyzed.

Keywords: knowledge, knowledge management, knowledge discovery in databases, business, operational, information, data mining

Procedia PDF Downloads 525
25594 Climate Change and Sustainable Development among Agricultural Communities in Tanzania; An Analysis of Southern Highland Rural Communities

Authors: Paschal Arsein Mugabe

Abstract:

This paper examines sustainable development planning in the context of environmental concerns in rural areas of Tanzania. It challenges mainstream approaches to development, focusing instead upon transformative action for environmental justice. The goal is to help shape future sustainable development agendas in local government, international agencies, and civil society organisations. Research methods: The approach of the study is geographical, but it also involves various transdisciplinary elements, particularly from development studies, sociology and anthropology, management, geography, agriculture, and environmental science. The research methods included thematic and questionnaire interviews and participatory tools such as focus group discussions, participatory research appraisal, and expert interviews for primary data. Secondary data were gathered through the analysis of land use/cover data and official documents on climate, agriculture, marketing, and health. Several earlier studies made in the area also provided an important reference base. Findings: The findings show that agricultural sustainability in Tanzania appears likely to deteriorate as a consequence of climate change. Noteworthy differences in impacts across households are also present, both by district and by income category. Food security cannot be explained by climate as the only influencing factor; the combined economic, political, and socio-cultural context of the community is crucial. In conclusion, it is worth noting that people understand the relationship between climate change and their livelihoods.

Keywords: agriculture, climate change, environment, sustainable development

Procedia PDF Downloads 322
25593 Indexing and Incremental Approach Using Map Reduce Bipartite Graph (MRBG) for Mining Evolving Big Data

Authors: Adarsh Shroff

Abstract:

Big data is a collection of datasets so large and complex that it becomes difficult to process them using database management tools. Operations like search, analysis, and visualization on big data are performed using data mining, the process of extracting patterns or knowledge from large data sets. Over time, however, data mining results become stale and obsolete as the underlying data evolve. Incremental processing is a promising approach to refreshing mining results: it utilizes previously saved states to avoid the expense of re-computation from scratch. This project uses i2MapReduce, an incremental processing extension to MapReduce, the most widely used framework for mining big data. i2MapReduce performs key-value pair level incremental processing rather than task-level re-computation, supports not only one-step computation but also the more sophisticated iterative computation widely used in data mining applications, and incorporates a set of novel techniques to reduce the I/O overhead of accessing preserved fine-grain computation states. To assess the mining results, i2MapReduce is evaluated using a one-step algorithm and three iterative algorithms with diverse computation characteristics.
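The key-value-level idea can be illustrated independently of the framework: keep the previous computation state and apply only the delta from new records, touching only the affected keys. The word-count toy below mimics that idea; it is not i2MapReduce's actual API:

```python
# Key-value-level incremental processing in miniature: new records update
# only the affected keys of a preserved state, with no full re-computation.
from collections import Counter

state = Counter()                       # preserved fine-grain state

def update(state, new_records):
    """Apply only the delta; untouched keys keep their old values."""
    for record in new_records:
        for word in record.split():
            state[word] += 1
    return state

update(state, ["big data mining", "data mining tools"])
update(state, ["incremental data processing"])     # touches 3 keys only
print(state.most_common(3))   # [('data', 3), ('mining', 2), ...]
```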

Keywords: big data, map reduce, incremental processing, iterative computation

Procedia PDF Downloads 342
25592 Analyzing Large Scale Recurrent Event Data with a Divide-And-Conquer Approach

Authors: Jerry Q. Cheng

Abstract:

Currently, in analyzing large-scale recurrent event data, there are many challenges, such as memory limitations and unscalable computing time. In this research, a divide-and-conquer method using parametric frailty models is proposed. Specifically, the data are randomly divided into many subsets, and the maximum likelihood estimator is obtained from each individual subset. A weighted method is then proposed to combine these individual estimators into the final estimator. It is shown that this divide-and-conquer estimator is asymptotically equivalent to the estimator based on the full data. Simulation studies are conducted to demonstrate the performance of the proposed method, and the approach is applied to a large real dataset of repeated heart failure hospitalizations.
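On a toy problem the scheme reads as follows; the exponential-rate MLE and the size-proportional weights are stand-ins chosen for illustration, since the paper's frailty-model details are beyond an abstract-level sketch:

```python
# Divide-and-conquer estimation in miniature: per-subset MLEs combined
# with weights proportional to subset size (a stand-in for the paper's
# weighting scheme), compared against the full-data MLE.
import numpy as np

rng = np.random.default_rng(2)
data = rng.exponential(scale=1 / 0.7, size=100_000)   # true rate 0.7

subsets = np.array_split(rng.permutation(data), 20)
estimates = np.array([1 / s.mean() for s in subsets])       # subset MLEs
weights = np.array([len(s) for s in subsets], dtype=float)

combined = np.average(estimates, weights=weights)
full = 1 / data.mean()                                      # full-data MLE
print(f"combined: {combined:.4f}  full-data: {full:.4f}")   # nearly identical
```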

Keywords: big data analytics, divide-and-conquer, recurrent event data, statistical computing

Procedia PDF Downloads 158
25591 Applications of Probabilistic Interpolation via Orthogonal Matrices

Authors: Dariusz Jacek Jakóbczak

Abstract:

Mathematics and computer science are interested in methods of 2D curve interpolation and extrapolation using a set of key points (knots). The proposed method of Hurwitz-Radon matrices (MHR) is one such method. This novel method is based on the family of Hurwitz-Radon (HR) matrices, whose columns are composed of orthogonal vectors. A two-dimensional curve is interpolated via different functions used as probability distribution functions: polynomial, sine, cosine, tangent, cotangent, logarithm, exponential, arcsin, arccos, arctan, arccot, or power functions, as well as their inverses. It is shown how to build the orthogonal matrix operator and how to use it in the process of curve reconstruction.

Keywords: 2D data interpolation, Hurwitz-Radon matrices, MHR method, probabilistic modeling, curve extrapolation

Procedia PDF Downloads 520
25590 Secure Multiparty Computations for Privacy Preserving Classifiers

Authors: M. Sumana, K. S. Hareesha

Abstract:

Secure computations are essential when performing privacy-preserving data mining. Distributed privacy-preserving data mining involves two or more sites that cannot pool their data with a third party without violating laws protecting the individual. Hence, in order to model the private data without compromising privacy or losing information, secure multiparty computations are used. Secure computations of the product, mean, variance, dot product, and sigmoid function using the additive and multiplicative homomorphic properties are discussed. The computations are performed on vertically partitioned data, with a single site holding the class value.
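As one concrete instance of the additive homomorphic property, a two-party secure dot product on vertically partitioned data can be sketched with the Paillier cryptosystem; the `phe` package used here is an illustrative choice, not the paper's implementation:

```python
# Secure dot product via the additive homomorphic property (Paillier):
# site B computes on ciphertexts only, so it never sees site A's values.
from phe import paillier

pub, priv = paillier.generate_paillier_keypair(n_length=1024)

a = [3, 1, 4]          # attribute values held by site A
b = [2, 7, 5]          # attribute values held by site B

# Site A encrypts its values and sends the ciphertexts to site B.
enc_a = [pub.encrypt(x) for x in a]

# Site B scales each ciphertext by its own plaintext and sums them,
# all homomorphically, then returns the encrypted result to A.
enc_dot = enc_a[0] * b[0]
for c, bi in zip(enc_a[1:], b[1:]):
    enc_dot += c * bi

print(priv.decrypt(enc_dot))   # -> 33, revealed only to key holder A
```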

Keywords: homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data

Procedia PDF Downloads 406
25589 Urban Sexual Geographies, Queer Citizenship and the Socio-Economic Status of LGBTIQs in Vienna

Authors: Karin Schoenpflug, Christine M. Klapeer

Abstract:

In a large study for the Vienna City Council's Antidiscrimination Unit (WASt), an interdisciplinary team (in the fields of economics, sociology, and political science) working with urban economics, critical citizenship studies, the sociology of work and inequality, and urban political/human geography conducted an online survey asking LGBTIs (lesbians, gays, bisexuals, transgender and intersex people) in Vienna detailed questions about their quality of life, happiness, and well-being. 3,161 persons responded and provided us with a rich data set concerning: 1) labor market structures, discrimination, working conditions, and employment practices (economic citizenship); 2) access to health care, welfare, education, and safety in public spaces (social citizenship); 3) political participation as well as access to legal institutions (political citizenship). All these fields are important dimensions of 'full' citizenship and the well-being of the LGBTI population, and they are also constitutive for the inclusion of sexual and gender minorities in the city population(s) of Vienna. Our data also allow us to map the sexual geography of Vienna, as LGBTI communities are more likely to live in certain districts; some places are considered safe(r) and 'friendlier'. In this way, our work helps to fill a research gap connecting (urban) spaces and sexuality, and it produces new data and insights on the quality of life of this subpopulation. Our findings support urban (policy) planning aimed at limiting violence and discrimination and improving collective wellbeing and social cohesion.

Keywords: urban sexual geographies, LGBTI, socio-economic status, Vienna, citizenship status

Procedia PDF Downloads 341
25588 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created using the source code, processed metrics from the same or a previous version of the code, and related fault data. Some companies do not store and track all the artifacts required for software fault prediction. To construct a fault prediction model for such a company, training data from other projects can be one potential solution; and the earlier a fault is predicted, the less it costs to correct. The training data consist of metrics data and related fault data at the function/module level. This paper investigates fault prediction at an early stage using cross-project data, focusing on design metrics. In this study, an empirical analysis is carried out to validate design metrics for cross-project fault prediction. The machine learning technique used for evaluation is Naïve Bayes. The design-phase metrics of other projects can be used as an initial guideline for projects where no previous fault data are available. We analyze seven data sets from the NASA Metrics Data Program, which offer design as well as code metrics. Overall, the results of cross-project learning are comparable to within-company learning.
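The evaluation setup, training Naïve Bayes on one project's design metrics and testing on another's, can be sketched with scikit-learn; the metric names and data below are invented stand-ins for the NASA MDP datasets:

```python
# Cross-project fault prediction in miniature: fit Naive Bayes on a source
# project's design metrics and evaluate on a different target project.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)

def fake_project(n, shift=0.0):
    X = rng.normal(shift, 1.0, size=(n, 3))     # e.g. fan-in, fan-out, depth
    y = (X.sum(axis=1) + rng.normal(0, 1, n) > 0).astype(int)  # faulty or not
    return X, y

X_a, y_a = fake_project(500)          # source project (fault labels exist)
X_b, y_b = fake_project(300, 0.3)     # target project (no fault history)

model = GaussianNB().fit(X_a, y_a)    # learn from the other project...
print(accuracy_score(y_b, model.predict(X_b)))   # ...predict on this one
```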

Keywords: software metrics, fault prediction, cross project, within project

Procedia PDF Downloads 334