Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24175

Search results for: data normalization

24085 Recent Advances in Data Warehouse

Abstract:

This paper describes some recent advances in the quickly developing area of data storage and processing based on data warehouses and data mining techniques. These advances involve software, hardware, data mining algorithms and visualisation techniques that share common features across the specific problems and tasks to which they are applied.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 364

24084 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their rivals. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 608

24083 Understanding Gender-Based Violence through an Adolescent Lens: Qualitative Findings from Delhi, India

Authors: Pratishtha Singh

Abstract:

Gender-based violence (GBV), or gendered violence, refers to violence inflicted on a person because of their gender. The majority of men who perpetrate gender-based violence first do so during their teenage years. Further, the first sexual experience of most girls is coerced. In order to reduce the widespread occurrence of GBV, it is vital to intervene and reach people, especially boys, when their attitudes and beliefs about sexuality and gender are developing. This study aims to understand GBV through an adolescent lens, focusing on adolescents' knowledge, attitudes and experiences regarding gendered abuse. This is a cross-sectional, qualitative study. The respondents are Delhi-based students in grades 11 and 12, recruited via snowball sampling. Sixteen in-depth telephone interviews were carried out in April 2020. The data were transcribed verbatim into MS Word, and qualitative coding was undertaken in Atlas.ti 8. Twelve out of sixteen respondents reported experiencing sexual GBV. Of these, a little more than half reported it to somebody. Thematic analysis revealed four key themes: (i) introduction and reinforcement of a patriarchal structure, (ii) violence in teen dating, (iii) acceptability and normalization of violence, and (iv) the justice system. The findings reflect a process wherein GBV becomes an integral part of adolescents’ lives. Participants showed a moderately well-informed understanding of gendered abuse, whereas their attitudes reflected a complex combination of internalized patriarchy and a desire to bring about positive societal reform. The results of this study highlight a need for health-promoting, gender-equitable interventions.

Keywords: adolescents, gender, health, violence

Procedia PDF Downloads 101

24082 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, a wide range of data is being generated in many fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a greater variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it easy to test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various kinds of research and can be useful as input for data mining. However, using a board to collect data is difficult, particularly when the user is not a computer programmer or is using the board for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 342

24081 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver integrated (big) data related services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We found few published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 126

24080 Influence of Bondage Discipline Sadism Masochism (BDSM) on Fashion Industry

Authors: Utkarsh Goley

Abstract:

BDSM, or Bondage Discipline Sadism Masochism, is a controversial and often misunderstood practice that has had a presence in the fashion industry for decades. BDSM-inspired fashion can be seen in various forms, from leather harnesses and corsets to studded collars and latex clothing. BDSM fashion is often associated with edginess, rebellion, and sexuality. It has been embraced by subcultures such as punk, Goth, and fetish, as well as mainstream fashion designers looking to push boundaries and make a statement. However, the use of BDSM imagery in fashion has also been criticized for promoting objectification, exploitation, and the normalization of abusive behavior. Some argue that the fashion industry's depiction of BDSM often reinforces harmful stereotypes and misconceptions about the practice. Despite the controversy, BDSM-inspired fashion continues to have a place in the industry, with designers and consumers alike finding value in its aesthetic appeal and provocative nature. As with any aspect of fashion, the role of BDSM in the industry will continue to evolve and adapt to changing cultural norms and societal attitudes.

Keywords: BDSM, leather, fashion, lycra

Procedia PDF Downloads 143

24079 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, and veracity and comes from a variety of sources is generated in all sectors, including the government sector. Globally, public administrations are pursuing (big) data as a new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can lead to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to its successful implementation. The essential aspects of the government (big) data ecosystem include its definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find articles specifically on the government (big) data ecosystem and therefore focused our research on relevant areas such as humanitarian data, open government data, scientific research data, and industry data.

Keywords: applications of big data, big data, big data types, big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 179

24078 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited, since only about 1% of generated data is ever actually analyzed for value creation. There is a data gap in which data is not explored because of a lack of data analytics infrastructure and of the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which are a subset of data analytics. The developed framework is designed to assist data analysts with little experience in choosing the appropriate machine learning algorithm for the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 134

24077 Text Mining Past Medical History in Electrophysiological Studies

Authors: Roni Ramon-Gonen, Amir Dori, Shahar Shelly

Abstract:

Background and objectives: Healthcare professionals produce abundant textual information in their daily clinical practice. Extracting insights from all the gathered information, which is mainly unstructured and lacks normalization, is one of the major challenges in computational medicine. Text mining brings together different techniques to derive valuable insights from unstructured textual data and has therefore become especially relevant in medicine. A neurological patient’s history allows the clinician to define the patient’s symptoms and, along with the results of the nerve conduction study (NCS) and electromyography (EMG) test, assists in formulating a differential diagnosis. Past medical history (PMH) helps to direct the latter. In this study, we aimed to identify relevant PMH, understand which PMHs are common among patients in the referral cohort and documented by the medical staff, and examine differences by sex and age in a large cohort based on free-text notes. Methods: We retrospectively identified all patients with abnormal NCS between May 2016 and February 2022. Age, gender, and all NCS report attributes were recorded, including the summary text. All patients’ histories were extracted from the text report by a query. Basic text cleansing and data preparation were performed, as well as lemmatization. Very common words (like ‘left’ and ‘right’) were deleted. Several words were replaced with their abbreviations. A bag-of-words approach was used to perform the analyses. Different visualizations that are common in text analysis were created to make the results easy to grasp. Results: We identified 5282 unique patients. Three thousand and five (57%) patients had documented PMH, of whom 60.4% (n=1817) were male. The overall median age was 62 years (range 0.12 – 97.2 years), and the majority of patients (83%) presented after the age of forty years. The top two documented medical histories were diabetes mellitus (DM) and surgery. DM was observed in 16.3% of the patients, and surgery in 15.4%. Other frequent patient histories (among the top 20) were fracture, cancer (ca), motor vehicle accident (MVA), leg, lumbar, discopathy, back and carpal tunnel release (CTR). When separating the data by sex, we can see that DM and MVA are more frequent among males, while cancer and CTR are less frequent. On the other hand, the top medical history in females was surgery and, after that, DM. Other frequent histories among females are breast cancer, fractures, and CTR. In the younger population (ages 18 to 26), the frequent PMHs were surgery, fractures, trauma, and MVA. Discussion: By applying text mining approaches to unstructured data, we were able to better understand which medical histories are more relevant in these circumstances and, in addition, gain insights regarding sex and age differences. These insights might help to collect epidemiological and demographic data as well as raise new hypotheses. One limitation of this work is that each clinician might use different words or abbreviations to describe the same condition, and therefore using a coding system can be beneficial.

Keywords: abnormal studies, healthcare analytics, medical history, nerve conduction studies, text mining, textual analysis

Procedia PDF Downloads 62
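The text preparation and bag-of-words counting described in the abstract above can be illustrated with a minimal Python sketch. The notes, the stopword list, and the abbreviation table are hypothetical stand-ins, and lemmatization is left out; this is not the authors' pipeline, only one plausible reading of the steps they list.

```python
import re
from collections import Counter

# Hypothetical lists; the abstract names only 'left' and 'right' as deleted words.
STOPWORDS = {"left", "right", "of", "and", "the", "with"}
ABBREVIATIONS = {"diabetes mellitus": "dm", "motor vehicle accident": "mva"}

def normalize(note):
    """Lower-case a note, apply abbreviations, and drop very common words."""
    text = note.lower()
    for phrase, abbr in ABBREVIATIONS.items():
        text = text.replace(phrase, abbr)
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOPWORDS]

def document_frequency(notes):
    """Share of notes (in %) in which each term occurs at least once."""
    counts = Counter()
    for note in notes:
        counts.update(set(normalize(note)))
    return {term: 100 * n / len(notes) for term, n in counts.items()}

# Hypothetical past-medical-history notes
notes = [
    "Diabetes mellitus, surgery of the left knee",
    "Motor vehicle accident with fracture",
    "DM and right carpal tunnel surgery",
    "Fracture of the right wrist",
]
freq = document_frequency(notes)
```

Counting each term once per note, rather than once per occurrence, matches the way the abstract reports results as percentages of patients.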

24076 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we generate a lot of data, and we need devices to store all of it. Generally, we store data on pen drives, hard drives, etc. Sometimes we may lose the data because these devices become corrupted. To overcome these issues, we implemented a cloud space for storing the data, which provides more security for the data. The data can be accessed from anywhere in the world using only an internet connection. We implemented all of this in Java using the NetBeans IDE. Once a user uploads data, he does not have any rights to change it. Users' uploaded files are stored in the cloud with the system time as the file name, and the directory is created with some random words. The cloud accepts the data only if the size of the file is less than 2 MB.

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 172

24075 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business intelligence is a methodology that systematically exploits data to produce information and knowledge, and it can support the decision-making process. Two methods in business intelligence are data warehousing and data mining. A data warehouse can store historical data derived from transactional data. For data modelling in the data warehouse, we apply Kimball's dimensional modelling. Data mining is used to extract patterns and gain insight from the data. Data mining has many techniques, one of which is segmentation. To profile telecommunication customers, we segment customers according to their usage of services, invoices, and payments. Customers can be grouped according to their characteristics, and the profitable customers can be identified. We apply the K-Means clustering algorithm for segmentation, using the RFM (Recency, Frequency and Monetary) model as the input variables. For all data mining processes, we use IBM SPSS Modeler.

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 441
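The segmentation step in the abstract above (K-Means on RFM variables) can be sketched in plain Python. The customer records are hypothetical, and the min-max scaling step is an assumption added here so that no single RFM variable dominates the distance; the abstract itself uses IBM SPSS Modeler and does not describe its preprocessing.

```python
import math

def min_max_scale(rows):
    """Rescale each column to [0, 1] (an assumed preprocessing step)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric tuples."""
    # Deterministic start: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids

# Hypothetical customers: (recency in days, invoice frequency, monetary value)
customers = [
    (5, 40, 900.0),
    (200, 2, 30.0),
    (7, 35, 850.0),
    (180, 3, 45.0),
    (3, 50, 1000.0),
    (220, 1, 20.0),
]
labels, centroids = kmeans(min_max_scale(customers), k=2)
```

With this toy data the recent, frequent, high-value customers end up in one cluster and the lapsed, low-value customers in the other. Real data would normally call for several random restarts rather than a fixed initialization.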

24074 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is compounded when the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes for microarray data sets is feature selection, which finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 530

24073 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensor data is important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. Thus, this paper introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer, at the sensors. Simulation results show that PDDA outperforms existing tree-based and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 299

24072 StockTwits Sentiment Analysis on Stock Price Prediction

Authors: Min Chen, Rubi Gupta

Abstract:

Understanding and predicting stock market movements is a challenging problem. It is believed that stock markets are partially driven by public sentiment, which has led to numerous research efforts to predict stock market trends using public sentiment expressed on social media such as Twitter, but with limited success. Recently, the microblogging website StockTwits has become increasingly popular for users to share their discussions and sentiments about stocks and the financial market. In this project, we analyze the text content of StockTwits tweets and extract financial sentiment using text featurization and machine learning algorithms. StockTwits tweets are first pre-processed to remove noise, using techniques including stopword removal, special character removal, and case normalization. Features are extracted from these preprocessed tweets through text featurization using bags of words, N-gram models, TF-IDF (term frequency-inverse document frequency), and latent semantic analysis. Machine learning models are then trained to classify the tweets' sentiment as positive (bullish) or negative (bearish). The correlation between the aggregated daily sentiment and daily stock price movement is then investigated using Pearson’s correlation coefficient. Finally, the sentiment information is applied together with time series stock data to predict stock price movement. Experiments on five companies (Apple, Amazon, General Electric, Microsoft, and Target) over a period of nine months demonstrate the effectiveness of our study in improving prediction accuracy.

Keywords: machine learning, sentiment analysis, stock price prediction, tweet processing

Procedia PDF Downloads 123
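Two of the steps named in the abstract above, tweet pre-processing and Pearson's correlation between daily sentiment and price movement, can be sketched in Python as follows. The stopword list, the example tweet, and both daily series are hypothetical; the featurization and classification stages are not shown.

```python
import math
import re

STOPWORDS = {"the", "is", "a", "to", "and", "of", "on"}  # tiny illustrative list

def preprocess(tweet):
    """Case normalization, special-character removal, stopword removal."""
    lowered = tweet.lower()
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]

def pearson(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

tokens = preprocess("$AAPL is going to the MOON!!! #bullish")

# Hypothetical daily series: share of bullish tweets and price change in %
daily_sentiment = [0.2, 0.5, 0.4, 0.8, 0.6]
daily_return = [-0.5, 0.3, 0.1, 1.2, 0.4]
r = pearson(daily_sentiment, daily_return)
```

Note that stripping all special characters also drops the `$` and `#` markers; whether cashtags and hashtags should be kept as features is a design choice the abstract does not settle.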

24071 The Influence of Directionality on the Giovanelli Illusion

Authors: Michele Sinico

Abstract:

In the Giovanelli illusion, some collinear dots appear misaligned, when each dot lies within a circle and the circles are not collinear. In this illusion, the role of the frame of reference, determined by the circles, is considered a crucial factor. Three experiments were carried out to study the influence of directionality of the circles on the misalignment. The adjustment method was used. Participants changed the orthogonal position of each dot, from the left to the right of the sequence, until a collinear sequence of dots was achieved. The first experiment verified the illusory effect of the misalignment. In the second experiment, the influence of two different directionalities of the circles (-0.58° and +0.58°) on the misalignment was tested. The results show an over-normalization on the sequences of the dots. The third experiment tested the misalignment of the dots without any inclination of the sequence of circles (0°). Only a local illusory effect was found. These results demonstrate that the directionality of the circles, as a global factor, can increase the misalignment. The findings also indicate that directionality and the frame of reference are independent factors in explaining the Giovanelli illusion.

Keywords: Giovanelli illusion, visual illusion, directionality, misalignment, the frame of reference

Procedia PDF Downloads 152

24070 The Effect on Lead Times When Normalizing a Supply Chain Process

Authors: Bassam Istanbouli

Abstract:

Organizations operate in a very competitive and dynamic environment that is constantly changing. In order to achieve a high level of service, the products and processes of these organizations need to be flexible and evolvable. If supply chains are not modular and well designed, changes can bring combinatorial effects to most areas of a company, including its management, finances, documentation, logistics and information structure. Applying the normalized systems concept to segments of the supply chain may help in reducing those ripple effects, but it may also increase lead times. Lead times are important and can become a decisive element in gaining customers. Industries are always under pressure to provide good-quality products, at competitive prices, when and how the customer wants them. Most of the time, customers want their orders now, if not yesterday. The above concept will be demonstrated by examining lead times in a manufacturing example before and after applying the normalized systems concept to that segment of the chain. We will then show that although the combinatorial effects of changes can be minimized, the lead times increase.

Keywords: supply chain, lead time, normalization, modular

Procedia PDF Downloads 94

24069 A Weighted Approach to Unconstrained Iris Recognition

Authors: Yao-Hong Tsai

Abstract:

This paper presents a weighted approach to unconstrained iris recognition. Commercial systems are usually characterized by strong acquisition constraints that rely on the subject’s cooperation. However, such cooperation is not always achievable in real scenarios in daily life. Researchers have focused on reducing these constraints while maintaining system performance with new techniques. To cope with large variation in the environment, the proposed iris recognition system introduces two main improvements. First, to handle extremely uneven lighting conditions, statistics-based illumination normalization is applied to the eye region to increase the accuracy of the iris features. The detection of the iris image is based on the AdaBoost algorithm. Second, the weighted approach is designed with Gaussian functions according to the distance to the center of the iris. A local binary pattern (LBP) histogram is then applied to texture classification with these weights. Experiments showed that the proposed system provided users with a more flexible and feasible way to interact with the verification system through iris recognition.

Keywords: authentication, iris recognition, adaboost, local binary pattern

Procedia PDF Downloads 188
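The combination of a local binary pattern histogram with Gaussian weights, as described in the abstract above, might look like the following Python sketch. The abstract does not give the exact weighting rule, so the choice here (each pixel votes with a weight that decays with its distance from an assumed iris center) and the parameter `sigma` are assumptions for illustration only.

```python
import math

def lbp_code(img, r, c):
    """8-neighbour local binary pattern code of pixel (r, c)."""
    center = img[r][c]
    # Neighbours visited clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def weighted_lbp_histogram(img, center, sigma):
    """256-bin LBP histogram where each pixel votes with a Gaussian weight."""
    hist = [0.0] * 256
    cr, cc = center
    # Border pixels are skipped because they lack a full neighbourhood.
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            dist_sq = (r - cr) ** 2 + (c - cc) ** 2
            weight = math.exp(-dist_sq / (2 * sigma ** 2))  # assumed weighting
            hist[lbp_code(img, r, c)] += weight
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# A flat 5x5 patch: every neighbour equals the center, so every code is 255.
flat = [[10] * 5 for _ in range(5)]
hist = weighted_lbp_histogram(flat, center=(2, 2), sigma=1.5)
```

The histogram is normalized to sum to 1 so that patches of different sizes can be compared with the same distance measure.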

24068 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have increased within a short period of time. Consequently, this fast-increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and their current trends. This paper presents a complete discussion of state-of-the-art big data technologies based on group and stream data processing. Moreover, the strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendors, several open research challenges and the opportunities brought about by big data. The similarities and differences of these techniques and technologies with respect to important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, IT community, industry, big data

Procedia PDF Downloads 157

24067 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the rapid growth of data, many computer science tools have been developed to process and analyze Big Data. High-performance computing architectures have been designed to meet the processing needs of Big Data (from the standpoints of transaction processing and of strategic and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend in high-performance computing architectures, especially as it relates to analytics and data mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 483

24066 Transcriptomic and Translational Regulation of Peroxisome Proliferator-Activated Receptors after Different Feedings in Salmon

Authors: Mahsa Jalili, Essa Ehsan Khan, Signe Dille Lovmo, Augustine Akruwe, Egil Lien, Rolf Erik Olsen, Trygve Sigholt, Atle Magnus Bones

Abstract:

Data from the Norwegian Directorate of Fisheries show that >1.2 million tons of Atlantic salmon were produced by the Norwegian aquaculture industry in 2016. Peroxisome proliferator-activated receptors (PPARs) are one of the key transcription factor families that respond to nutritional ligands. Recent studies have shown a connection between PPARs and lipid and carbohydrate metabolism in aquaculture. To our knowledge, there are no published data on the effects of krill meal, soybean meal, Bactocell® and butyrate feeding, compared to a control group, on PPAR gene and protein expression in Atlantic salmon. Fish (1-year+ postsmolt, average weight 250 g) were cultured for 12 weeks after a 2-week acclimatization on control commercial feed following the hatchery. Water oxygen rate, salinity, and temperature were monitored every second day. At the end of the trial, fish were taken from the tanks randomly, and four replicates per group were collected and stored in -80 °C freezers until analysis. Total RNA was extracted from the posterior part of dorsal fin muscle tissue, and Nanodrop and Bioanalyzer were used to check RNA quality. Gene expression of PPAR α, β and γ was determined by RT-PCR. The expression of the genes of interest was measured relative to the control group after normalization to three reference genes. Total protein concentration was calculated by the Bradford method, and protein expression was determined by western blot with a primary PPARγ antibody. All data were analyzed by ANOVA followed by Benjamini-Hochberg and Bonferroni tests. Probability values <0.05 were considered significant. The Bactocell® and butyrate groups showed significantly lower PPARα expression. PPARβ and γ were not significantly different among groups. PPARγ mRNA expression was approximately consistent with the protein expression pattern, except that the butyrate group showed a lower mRNA level. The order of PPARγ expression was Bactocell® > soy meal > butyrate > krill meal > control. PPARβ gene expression decreased in the order soy meal > butyrate > krill meal > Bactocell® > control. In conclusion, the increased expression of PPARγ and α is proposed to represent a tendency towards reduced lipid storage in fish fed Bactocell®, butyrate, soy and krill meal.

Keywords: aquaculture, blotting western, gene expression, krill protein extract, prebiotics, probiotics, Salmo salar

Procedia PDF Downloads 178
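The abstract above names the Benjamini-Hochberg and Bonferroni procedures for multiple comparisons. As a minimal sketch of what those two corrections compute, the following Python adjusts a list of p-values; the p-values are hypothetical and are not taken from the study.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

p_values = [0.01, 0.04, 0.03, 0.20]  # hypothetical group comparisons
bh = benjamini_hochberg(p_values)
bf = bonferroni(p_values)
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; Benjamini-Hochberg controls the false discovery rate, so its adjusted values are never larger than the Bonferroni ones.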

24065 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories

Authors: Prashant Shrivastava

Abstract:

The purpose of this study is to explore the re3data.org registry in order to identify the registration workflow for research data repositories. A further objective is to chart the present development of research data repositories in India. The study first seeks to understand the framework and schema design of the re3data.org registry and then explores the status of Indian research data repositories in it. Research data repositories are gaining wider relevance due to e-research concepts. The re3data.org registry is now a good tool for users and researchers to identify appropriate research data repositories for their research requirements. In the Indian environment, a compatible National Research Data Policy is needed to boost the management of research data. A registry for research data repositories is a crucial tool for discovering specific information in a specific domain. Research data repositories in India have not previously been studied. Both the re3data.org registry and the status of Indian research data repositories are discussed in this study.

Keywords: research data, research data repositories, research data registry, re3data.org

Procedia PDF Downloads 293

24064 A Study of Cloud Computing Solution for Transportation Big Data Processing

Authors: Ilgin Gökaşar, Saman Ghaffarian

Abstract:

The need for fast processing of big data on transportation ridership (e.g., smartcard data) and traffic operation (e.g., traffic detector data), which requires a lot of computational power, is incontrovertible in Intelligent Transportation Systems. Cloud computing is currently an important subject and a popular information technology solution for data processing. It enables users to process enormous amounts of data without having their own computing power. Thus, it can also be a good choice for transportation big data processing. This paper examines how cloud computing can enhance transportation big data processing by contrasting its advantages and disadvantages and discussing cloud computing features.

Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing

Procedia PDF Downloads 421

24063 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected over three years from an actual harmonic monitoring system in a distribution system in Australia. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes that support a better understanding of the data are presented. Underlying classes in the data are then identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 215

24062 Research Attitude: Its Factor Structure and Determinants in the Graduate Level

Authors: Janet Lynn S. Montemayor

Abstract:

Declining survivability and a rising drop-out rate in graduate school are attributed to the demands that come with research-related requirements. Graduate students tend to withdraw from their studies when confronted with such requirements. This act of succumbing to the challenge is primarily due to a negative mindset. An understanding of students’ views towards research is essential for teachers in facilitating research activities in graduate school. This study aimed to develop a tool that accurately measures attitude towards research. The psychometric properties of the Research Attitude Inventory (RAIn) were assessed. A pool of items (k=50) was initially constructed and administered to a development sample composed of Master’s and Doctorate degree students (n=159). Results show that the RAIn is a reliable measure of research attitude (k=41, αmax = 0.894). Principal component analysis using orthogonal rotation with Kaiser normalization identified four underlying factors of research attitude, namely predisposition, purpose, perspective, and preparation. Research attitude among the respondents was analyzed using this measure.

Keywords: graduate education, principal component analysis, research attitude, scale development

Procedia PDF Downloads 159

24061 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data play an increasingly important role in economic growth, innovation, technical advantage, business strategy, and even competition between countries. Analyzing patent data is crucial since patents cover a large part of the world’s technological information. In this paper, we use the linguistic summarization technique to prove the validity of hypotheses about patent data stated in the literature.

Keywords: data mining, fuzzy sets, linguistic summarization, patent data

Procedia PDF Downloads 244

24060 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In this paper, we describe the security capabilities of data collection. Data are collected by probes located in the near and distant surroundings of the company. Because of numerous obstacles (e.g., forests, hills, urban areas), data collection is realized in several ways: via wireless communication, a LAN, a GSM network, and, in certain areas, by vehicles. To ensure a connection to the server, most of the probes can communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure that the required data are collected, it is necessary to propose algorithms that allow the probes to select a suitable communication channel.

Keywords: communication, computer network, data collection, probe

Procedia PDF Downloads 328

24059 A Review on Big Data Movement with Different Approaches

Authors: Nay Myo Sandar

Abstract:

With the growth of technologies and applications, large amounts of data are being produced at an increasing rate by various sources such as social media networks, sensor devices, and other information-serving devices. Such large, complex, and exponentially growing datasets are called big data. Traditional database systems cannot store and process such data because of their size and complexity. Consequently, cloud computing is a potential solution for data storage and processing since it can provide a pool of server and storage resources. However, moving large amounts of data to and from the cloud is challenging since the large data size can cause high latency. This paper reviews the literature on the big data movement problem, discusses research issues, and identifies approaches for dealing with it.

Keywords: big data, cloud computing, big data movement, network techniques

Procedia PDF Downloads 50

24058 Optimized Approach for Secure Data Sharing in Distributed Database

Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal

Abstract:

In the current age of technology, information is the most precious asset of a company. Today, companies hold large amounts of data. As the data grow larger, access to particular information becomes slower day by day. Processing data quickly to turn it into information is the biggest issue. The major problems in distributed databases are the efficiency and the response time of data distribution. The security of data distribution is also a big issue. To address these problems, we propose a strategy that can maximize the efficiency of data distribution and also improve its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The proposed technique helps companies share data securely, efficiently, and quickly.

Keywords: ER-schema, electronic record, P2P framework, API, query formulation

Procedia PDF Downloads 299

24057 Selection of Suitable Reference Genes for Assessing Endurance Related Traits in a Native Pony Breed of Zanskar at High Altitude

Authors: Prince Vivek, Vijay K. Bharti, Manishi Mukesh, Ankita Sharma, Om Prakash Chaurasia, Bhuvnesh Kumar

Abstract:

High endurance performance in equids requires adaptive changes involving physio-biochemical and molecular responses in an attempt to regain homeostasis. We hypothesized that identifying suitable reference genes would support the assessment of endurance-related traits in ponies at high altitude and might help in identifying individual ponies with a potent endurance trait at high altitude. A total of 12 Zanskar pony mares were divided into three groups by backpack load, group-A (without load), group-B (60 kg), and group-C (80 kg), and subjected to a load-carry protocol on a steep 4 km uphill climb over a gravel, uneven, rocky track at an altitude of 3292 m to 3500 m (endpoint). Blood was collected in sodium heparin anticoagulant before and immediately after the load carry, and peripheral blood mononuclear cells were separated for total RNA isolation and subsequent cDNA synthesis. Real-time PCR reactions were carried out to evaluate the mRNA expression profile of a panel of putative internal control genes (ICGs) from different functional classes, namely glyceraldehyde 3-phosphate dehydrogenase (GAPDH), β₂ microglobulin (β₂M), β-actin (ACTB), ribosomal protein 18 (RS18), hypoxanthine-guanine phosphoribosyltransferase (HPRT), ubiquitin B (UBB), ribosomal protein L32 (RPL32), transferrin receptor protein (TFRC), and succinate dehydrogenase complex subunit A (SDHA), for normalizing the real-time quantitative polymerase chain reaction (qPCR) data of native ponies. Three different algorithms, implemented in the geNorm, NormFinder, and BestKeeper software, were used to evaluate the stability of the reference genes. The results showed that GAPDH was the most stable gene and that the best stability value for a combination of two genes was observed for TFRC and β₂M. In conclusion, the geometric mean of GAPDH, TFRC, and β₂M might be used for accurate normalization of transcriptional data when assessing endurance-related traits in Zanskar ponies during load carrying.
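
The normalization step named in the conclusion can be illustrated with a minimal Python sketch. It assumes that each gene's expression has already been converted to a positive relative quantity, which the abstract does not specify. The function names, the key B2M (standing in for β₂M), and all numbers are hypothetical and are not data from the study.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed in log space."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_expression(target_quantity, reference_quantities):
    """Divide a target gene's relative quantity by the geometric mean
    of the reference genes' relative quantities for the same sample."""
    return target_quantity / geometric_mean(reference_quantities.values())


# Hypothetical relative quantities for one sample (illustrative only).
reference = {"GAPDH": 2.0, "TFRC": 4.0, "B2M": 8.0}

# Normalization factor: cube root of 2 * 4 * 8.
factor = geometric_mean(reference.values())

# Hypothetical target gene quantity divided by that factor.
normalized = normalize_expression(10.0, reference)
```

The geometric mean is used here rather than the arithmetic mean because it is less sensitive to a single reference gene with an unusually large quantity.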

Keywords: endurance exercise, ubiquitin B (UBB), β₂ microglobulin (β₂M), high altitude, Zanskar ponies, reference gene

Procedia PDF Downloads 107

24056 Artificial Neural Networks Application on Nusselt Number and Pressure Drop Prediction in Triangular Corrugated Plate Heat Exchanger

Authors: Hany Elsaid Fawaz Abdallah

Abstract:

This study presents a new artificial neural network (ANN) model to predict the Nusselt number and pressure drop for turbulent flow in a triangular corrugated plate heat exchanger with forced air and turbulent water flow. An experimental investigation was performed to create a new dataset of Nusselt number and pressure drop values in the following ranges of dimensionless parameters: plate corrugation angle (from 0° to 60°), Reynolds number (from 10000 to 40000), pitch-to-height ratio (from 1 to 4), and Prandtl number (from 0.7 to 200). Based on the ANN performance graph, a three-layer structure with {12-8-6} hidden neurons was chosen. The training procedure includes feed-forward propagation of the input parameters, evaluation of the loss function for the training and validation datasets, and back-propagation with adjustment of the biases and weights. A linear activation function was used at the output layer, while the rectified linear unit activation function was used for the hidden layers. To accelerate ANN training, the loss function is minimized with the adaptive moment estimation (Adam) algorithm. The “MinMax” normalization approach was used to avoid an increase in training time due to drastic differences in the loss function gradients with respect to the weights. Since the test dataset is not used for ANN training, a cross-validation technique is applied to the ANN using the new data. This procedure was repeated until the loss function converged or for 4000 epochs with a batch size of 200 points. The program code was written in Python 3.0 using open-source ANN libraries such as scikit-learn, TensorFlow, and Keras. Mean average percent error values of 9.4% for the Nusselt number and 8.2% for the pressure drop were achieved by the ANN model, which is a higher accuracy than that of the generalized correlations. The performance of the obtained model was validated by comparing predicted data with the experimental results, which yielded excellent accuracy.
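
The “MinMax” normalization mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration and not the authors' code: the function names are hypothetical, and the sample rows are made-up values chosen from the parameter ranges quoted above (corrugation angle, Reynolds number, pitch-to-height ratio, Prandtl number).

```python
import numpy as np


def min_max_fit(train):
    """Return the per-feature minimum and range, computed from training data only."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    # Guard against division by zero for constant features.
    span = np.where(span == 0, 1.0, span)
    return lo, span


def min_max_apply(x, lo, span):
    """Scale features so that the training minimum maps to 0 and the maximum to 1."""
    return (x - lo) / span


# Hypothetical inputs; columns are angle (deg), Re, pitch/height, Pr.
train = np.array([
    [0.0, 10000.0, 1.0, 0.7],
    [30.0, 25000.0, 2.5, 100.35],
    [60.0, 40000.0, 4.0, 200.0],
])

lo, span = min_max_fit(train)
scaled = min_max_apply(train, lo, span)
```

Without this step the Reynolds number (order 10⁴) would dominate the other inputs (order 1 to 10²). Fitting `lo` and `span` on the training set and reusing them for validation and test data keeps information from held-out data out of the scaling.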

Keywords: artificial neural networks, corrugated channel, heat transfer enhancement, Nusselt number, pressure drop, generalized correlations

Procedia PDF Downloads 51