Search results for: data filtering and extrapolation
25181 Data Mining Meets Educational Analysis: Opportunities and Challenges for Research
Authors: Carla Silva
Abstract:
Recent development of information and communication technology enables us to acquire, collect, analyse data in various fields of socioeconomic – technological systems. Along with the increase of economic globalization and the evolution of information technology, data mining has become an important approach for economic data analysis. As a result, there has been a critical need for automated approaches to effective and efficient usage of massive amount of educational data, in order to support institutions to a strategic planning and investment decision-making. In this article, we will address data from several different perspectives and define the applied data to sciences. Many believe that 'big data' will transform business, government, and other aspects of the economy. We discuss how new data may impact educational policy and educational research. Large scale administrative data sets and proprietary private sector data can greatly improve the way we measure, track, and describe educational activity and educational impact. We also consider whether the big data predictive modeling tools that have emerged in statistics and computer science may prove useful in educational and furthermore in economics. Finally, we highlight a number of challenges and opportunities for future research.Keywords: data mining, research analysis, investment decision-making, educational research
Procedia PDF Downloads 36225180 A Method of Detecting the Difference in Two States of Brain Using Statistical Analysis of EEG Raw Data
Authors: Digvijaysingh S. Bana, Kiran R. Trivedi
Abstract:
This paper introduces various methods for the alpha wave to detect the difference between two states of brain. One healthy subject participated in the experiment. EEG was measured on the forehead above the eye (FP1 Position) with reference and ground electrode are on the ear clip. The data samples are obtained in the form of EEG raw data. The time duration of reading is of one minute. Various test are being performed on the alpha band EEG raw data.The readings are performed in different time duration of the entire day. The statistical analysis is being carried out on the EEG sample data in the form of various tests.Keywords: electroencephalogram(EEG), biometrics, authentication, EEG raw data
Procedia PDF Downloads 46725179 Reservoir Heterogeneity of the Early Cretaceous Yamama Carbonate Formation: Impact of Depositional Facies and Diagenetic Processes in Southern Iraq
Authors: Abbas Mohammed, Felicitász Velledits
Abstract:
Evaluating carbonate rocks is crucial for the petroleum industry as they often serve as major hydrocarbon reservoirs, where understanding their depositional and diagenetic characteristics directly influences exploration, production, and field development plans. The Early Cretaceous Yamama carbonates Formation in southern Iraq exhibits significant reservoir but heterogeneous. Detailed sedimentological investigations of drill cores and well logs have identified 14 distinct facies/microfacies within both reservoir and non-reservoir units, deposited in a shallow carbonate ramp setting. The grain-supported facies such as intertidal peloidal oncoidal grain/rudstones, backshoal pelletal pack/grainstones, shoal ooidal-peloidal grainstones, cortoidal-peloidal grainstones, in addition to the Lithocodium-Bacinella float/boundstones, and the reefal skeletal rudstones facies display favorable reservoir properties due to preserved interparticle porosity (reaches 25%) at depths greater than 4000 m. This preservation is attributed to early diagenetic circumgranular calcite cementation and limited scattered equant and syntaxial calcite overgrowths, which protected the grains from physical compaction. In contrast, mud-supported facies such as lagoonal skeletal mud/wackestones, skeletal cortoids wacke/packstones, skeletal dasyclads wacke/packstone, middle-ramp miliolidal pack/grainstone, bioturbated dolomitic wackestones, skeletal foraminiferal mud/wackestones, and outer-ramp spiculitic skeletal mud/wackestones exhibit reduced reservoir quality due to a combination of their fine-grain texture, physical and chemical compaction, and substantial amounts of equant calcite cement. This cement fills interparticle and moldic pores, and significantly reducing porosity and permeability in these facies. Reservoir heterogeneity of the formation is attributed to depositional facies, which control the texture of the sediments, and various types of diagenetic alterations. Particular attention to be directed toward the Lithocodium-Bacinella facies deposited as build-ups and the shoal barrier grain-supported facies. These findings underscore the importance of adapting exploration and development strategies to specific facies distributions and diagenetic pathways to mitigate heterogeneity risks. While potential limitations include the extrapolation of core and well log data to field-scale reservoir modeling and the challenge of investigating diagenetic variations at finer scales, the study provides a valuable framework for integrating facies and diagenetic analysis into hydrocarbon exploration workflows.Keywords: depositional facies, grain-supported limestones, lithocoduim-bacinella, reservoir heterogeneity, yamama formation
Procedia PDF Downloads 725178 Low-Cost Embedded Biometric System Based on Fingervein Modality
Authors: Randa Boukhris, Alima Damak, Dorra Sellami
Abstract:
Fingervein biometric authentication is one of the most popular and accurate technologies. However, low cost embedded solution is still an open problem. In this paper, a real-time implementation of fingervein recognition process embedded in Raspberry-Pi has been proposed. The use of Raspberry-Pi reduces overall system cost and size while allowing an easy user interface. Implementation of a target technology has guided to opt some specific parallel and simple processing algorithms. In the proposed system, we use four structural directional kernel elements for filtering finger vein images. Then, a Top-Hat and Bottom-Hat kernel filters are used to enhance the visibility and the appearance of venous images. For feature extraction step, a simple Local Directional Code (LDC) descriptor is applied. The proposed system presents an Error Equal Rate (EER) and Identification Rate (IR), respectively, equal to 0.02 and 98%. Furthermore, experimental results show that real-time operations have good performance.Keywords: biometric, Bottom-Hat, Fingervein, LDC, Rasberry-Pi, ROI, Top-Hat
Procedia PDF Downloads 20725177 Analyzing Global User Sentiments on Laptop Features: A Comparative Study of Preferences Across Economic Contexts
Authors: Mohammadreza Bakhtiari, Mehrdad Maghsoudi, Hamidreza Bakhtiari
Abstract:
The widespread adoption of laptops has become essential to modern lifestyles, supporting work, education, and entertainment. Social media platforms have emerged as key spaces where users share real-time feedback on laptop performance, providing a valuable source of data for understanding consumer preferences. This study leverages aspect-based sentiment analysis (ABSA) on 1.5 million tweets to examine how users from developed and developing countries perceive and prioritize 16 key laptop features. The analysis reveals that consumers in developing countries express higher satisfaction overall, emphasizing affordability, durability, and reliability. Conversely, users in developed countries demonstrate more critical attitudes, especially toward performance-related aspects such as cooling systems, battery life, and chargers. The study employs a mixed-methods approach, combining ABSA using the PyABSA framework with expert insights gathered through a Delphi panel of ten industry professionals. Data preprocessing included cleaning, filtering, and aspect extraction from tweets. Universal issues such as battery efficiency and fan performance were identified, reflecting shared challenges across markets. However, priorities diverge between regions, while users in developed countries demand high-performance models with advanced features, those in developing countries seek products that offer strong value for money and long-term durability. The findings suggest that laptop manufacturers should adopt a market-specific strategy by developing differentiated product lines. For developed markets, the focus should be on cutting-edge technologies, enhanced cooling solutions, and comprehensive warranty services. In developing markets, emphasis should be placed on affordability, versatile port options, and robust designs. Additionally, the study highlights the importance of universal charging solutions and continuous sentiment monitoring to adapt to evolving consumer needs. This research offers practical insights for manufacturers seeking to optimize product development and marketing strategies for global markets, ensuring enhanced user satisfaction and long-term competitiveness. Future studies could explore multi-source data integration and conduct longitudinal analyses to capture changing trends over time.Keywords: consumer behavior, durability, laptop industry, sentiment analysis, social media analytics
Procedia PDF Downloads 2125176 Insect Diversity Potential in Olive Trees in Two Orchards Differently Managed Under an Arid Climate in the Western Steppe Land, Algeria
Authors: Samir Ali-arous, Mohamed Beddane, Khaled Djelouah
Abstract:
This study investigated the insect diversity of olive (Olea europaea Linnaeus (Oleaceae)) groves grown in an arid climate in Algeria. In this context, several sampling methods were used within two orchards differently managed. Fifty arthropod species belonging to diverse orders and families were recorded. Hymenopteran species were quantitatively the most abundant, followed by species associated with Heteroptera, Aranea, Coleoptera and Homoptera orders. Regarding functional feeding groups, phytophagous species were dominant in the weeded and the unweeded orchard; however, higher abundance was recorded in the weeded site. Predators were ranked second, and pollinators were more frequent in the unweeded olive orchard. Two-factor Anova with repeated measures had revealed high significant effect of the weed management system, measures repetition and interaction with measurement repetition on arthropod’s abundances (P < 0.05). Likewise, generalized linear models showed that N/S ratio varied significantly between the two weed management approaches, in contrast, the remaining diversity indices including the Shannon index H’ had no significant correlation. Moreover, diversity parameters of arthropod’s communities in each agro-system highlighted multiples significant correlations (P <0.05). Rarefaction and extrapolation (R/E) sampling curves, evidenced that the survey and monitoring carried out in both sites had a optimum coverage of entomofauna present including scarce and transient species. Overall, calculated diversity and similarity indices were greater in the unweeded orchard than in the weeded orchard, demonstrating spontaneous flora's key role in entomofaunal diversity. Principal Component Analysis (PCA) has defined correlations between arthropod’s abundances and naturally occurring plants in olive orchards, including beneficials.Keywords: Algeria, olive, insects, diversity, wild plants
Procedia PDF Downloads 7925175 A Study on Big Data Analytics, Applications and Challenges
Authors: Chhavi Rana
Abstract:
The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.Keywords: big data, big data analytics, machine learning, review
Procedia PDF Downloads 9025174 A Study on Big Data Analytics, Applications, and Challenges
Authors: Chhavi Rana
Abstract:
The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.Keywords: big data, big data analytics, machine learning, review
Procedia PDF Downloads 9825173 Improved K-Means Clustering Algorithm Using RHadoop with Combiner
Authors: Ji Eun Shin, Dong Hoon Lim
Abstract:
Data clustering is a common technique used in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known clustering algorithm aiming to cluster a set of data points to a predefined number of clusters. In this paper, we implement K-means algorithm based on MapReduce framework with RHadoop to make the clustering method applicable to large scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner as a function of our map output to decrease the amount of data needed to be processed by reducers. The experimental results demonstrated that K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with combiner was faster than regular algorithm without combiner as the size of data set increases.Keywords: big data, combiner, K-means clustering, RHadoop
Procedia PDF Downloads 44425172 Comparative Analysis of Pet-parent Reported Pruritic Symptoms in Cats: Data from Social Media Listening and Surveys Similar
Authors: Georgina Cherry, Taranpreet Rai, Luke Boyden, Sitira Williams, Andrea Wright, Richard Brown, Viva Chu, Alasdair Cook, Kevin Wells
Abstract:
Estimating population-level burden, abilities of pet-parents to identify disease and demand for veterinary services worldwide is challenging. The purpose of this study is to compare a feline pruritus survey with social media listening (SML) data discussing this condition. Surveys are expensive and labour intensive to analyse, but SML data is freeform and requires careful filtering for relevancy. This study considers data from a survey of owner-observed symptoms of 156 pruritic cats conducted using Pet Parade® and SML posts collected through web-scraping to gain insights into the characterisation and management of feline pruritus. SML posts meeting a feline body area, behaviour and symptom were captured and reviewed for relevance representing 1299 public posts collected from 2021 to 2023. The survey involved 1067 pet-parents who reported on pruritic symptoms in their cats. Among the observed cats, approximately 18.37% (n=196) exhibited at least one symptom. The most frequently reported symptoms were hair loss (9.2%), bald spots (7.3%) and infection, crusting, scaling, redness, scabbing, scaling, or bumpy skin (8.2%). Notably, bald spots were the primary symptom reported for short-haired cats, while other symptoms were more prevalent in medium and long-haired cats. Affected body areas, according to pet-parents, were primarily the head, face, chin, neck (27%), and the top of the body, along the spine (22%). 35% of all cats displayed excessive behaviours consistent with pruritic skin disease. Interestingly, 27% of these cats were perceived as non-symptomatic by their owners, suggesting an under-identification of itch-related signs. Furthermore, a significant proportion of symptomatic cats did not receive any skin disease medication, whether prescribed or over the counter (n=41). These findings indicate a higher incidence of pruritic skin disease in cats than recognized by pet owners, potentially leading to a lack of medical intervention for clinically symptomatic cases. The comparison between the survey and social media listening data revealed bald spots were reported in similar proportions in both datasets (25% in the survey and 28% in SML). Infection, crusting, scaling, redness, scabbing, scaling, or bumpy skin accounted for 31% of symptoms in the survey, whereas it represented 53% of relevant SML posts (excluding bumpy skin). Abnormal licking or chewing behaviours were mentioned by pet-parents in 40% of SML posts compared to 38% in the survey. The consistency in the findings of these two disparate data sources, including a complete overlap in affected body areas for the top 80% of social media listening posts, indicates minimal biases in each method, as significant biases would likely yield divergent results. Therefore, the strong agreement across pruritic symptoms, affected body areas, and reported behaviours enhances our confidence in the reliability of the findings. Moreover, the small differences identified between the datasets underscore the valuable insights that arise from utilising multiple data sources. These variations provide additional depth in characterising and managing feline pruritus, allowing for more comprehensive understanding of the condition. By combining survey data and social media listening, researchers can obtain a nuanced perspective and capture a wider range of experiences and perspectives, supporting informed decision-making in veterinary practice.Keywords: social media listening, feline pruritus, surveys, felines, cats, pet owners
Procedia PDF Downloads 13725171 Framework for Integrating Big Data and Thick Data: Understanding Customers Better
Authors: Nikita Valluri, Vatcharaporn Esichaikul
Abstract:
With the popularity of data-driven decision making on the rise, this study focuses on providing an alternative outlook towards the process of decision-making. Combining quantitative and qualitative methods rooted in the social sciences, an integrated framework is presented with a focus on delivering a much more robust and efficient approach towards the concept of data-driven decision-making with respect to not only Big data but also 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector has been illustrated where the framework is put into action to yield insights and leverage business intelligence. An interpretive approach to analyze findings from both kinds of quantitative and qualitative data has been used to glean insights. Using traditional Point-of-sale data as well as an understanding of customer psychographics and preferences, techniques of data mining along with qualitative methods (such as grounded theory, ethnomethodology, etc.) are applied. This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is a modified enhancement in lieu of traditional data-driven decision-making approach, which is mainly dependent on quantitative data for decision-making.Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data
Procedia PDF Downloads 16525170 Oil Contaminate Removal from Wastewater with Novel Nanofiber-Based Membranes
Authors: Zhaoyang Liu
Abstract:
Oil pollution is typically caused by oil and gas-related operations such as vessel accidents, which can pollute waterways as well as the environment and damage the ecosystem. Tanker ship cleaning contributes to oil spills, which have a negative impact on coastal countries due to protracted service disruption. It is critical for coastal countries to develop efficient oil taint cleanup technology. There are various oil/water separation technologies, such as gravity separation, hydrocyclone, air flotation, and membrane filtration, among others. Among these, membrane filtration has been shown to produce high-quality effluent. Commercial membranes, on the other hand, nevertheless face significant practical challenges, such as a high susceptibility for membrane fouling when dealing with greasy effluent. We developed a unique anti-fouling filtering membrane for oil/water separation in this work. The membrane was made of inorganic nanofibers, which possesses the advantages of low membrane fouling, high permeation flux and long-term durability. This results from this study could facilitate to pave a new way for membranes filtration’s practical applications in oil/gas industry.Keywords: oil, contaminate, wastewater, removal
Procedia PDF Downloads 8425169 Incremental Learning of Independent Topic Analysis
Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda
Abstract:
In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.Keywords: text mining, topic extraction, independent, incremental, independent component analysis
Procedia PDF Downloads 31525168 Open Data for e-Governance: Case Study of Bangladesh
Authors: Sami Kabir, Sadek Hossain Khoka
Abstract:
Open Government Data (OGD) refers to all data produced by government which are accessible in reusable way by common people with access to Internet and at free of cost. In line with “Digital Bangladesh” vision of Bangladesh government, the concept of open data has been gaining momentum in the country. Opening all government data in digital and customizable format from single platform can enhance e-governance which will make government more transparent to the people. This paper presents a well-in-progress case study on OGD portal by Bangladesh Government in order to link decentralized data. The initiative is intended to facilitate e-service towards citizens through this one-stop web portal. The paper further discusses ways of collecting data in digital format from relevant agencies with a view to making it publicly available through this single point of access. Further, possible layout of this web portal is presented.Keywords: e-governance, one-stop web portal, open government data, reusable data, web of data
Procedia PDF Downloads 35725167 Resource Framework Descriptors for Interestingness in Data
Authors: C. B. Abhilash, Kavi Mahesh
Abstract:
Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.Keywords: RDF, interestingness, knowledge base, semantic data
Procedia PDF Downloads 16725166 Data Mining Practices: Practical Studies on the Telecommunication Companies in Jordan
Authors: Dina Ahmad Alkhodary
Abstract:
This study aimed to investigate the practices of Data Mining on the telecommunication companies in Jordan, from the viewpoint of the respondents. In order to achieve the goal of the study, and test the validity of hypotheses, the researcher has designed a questionnaire to collect data from managers and staff members from main department in the researched companies. The results shows improvements stages of the telecommunications companies towered Data Mining.Keywords: data, mining, development, business
Procedia PDF Downloads 50025165 Scientific Recommender Systems Based on Neural Topic Model
Authors: Smail Boussaadi, Hassina Aliane
Abstract:
With the rapid growth of scientific literature, it is becoming increasingly challenging for researchers to keep up with the latest findings in their fields. Academic, professional networks play an essential role in connecting researchers and disseminating knowledge. To improve the user experience within these networks, we need effective article recommendation systems that provide personalized content.Current recommendation systems often rely on collaborative filtering or content-based techniques. However, these methods have limitations, such as the cold start problem and difficulty in capturing semantic relationships between articles. To overcome these challenges, we propose a new approach that combines BERTopic (Bidirectional Encoder Representations from Transformers), a state-of-the-art topic modeling technique, with community detection algorithms in a academic, professional network. Experiences confirm our performance expectations by showing good relevance and objectivity in the results.Keywords: scientific articles, community detection, academic social network, recommender systems, neural topic model
Procedia PDF Downloads 10225164 A New Sign Subband Adaptive Filter Based on Dynamic Selection of Subbands
Authors: Mohammad Shams Esfand Abadi, Mehrdad Zalaghi, Reza ebrahimpour
Abstract:
In this paper, we propose a sign adaptive filter algorithm with the ability of dynamic selection of subband filters which leads to low computational complexity compared with conventional sign subband adaptive filter (SSAF) algorithm. Dynamic selection criterion is based on largest reduction of the mean square deviation at each adaption. We demonstrate that this simple proposed algorithm has the same performance of the conventional SSAF and somewhat faster than it. In the presence of impulsive interferences robustness of the simple proposed algorithm as well as the conventional SSAF and outperform the conventional normalized subband adaptive filter (NSAF) algorithm. Therefore, it is preferred for environments under impulsive interferences. Simulation results are presented to verify these above considerations very well have been achieved.Keywords: acoustic echo cancellation (AEC), normalized subband adaptive filter (NSAF), dynamic selection subband adaptive filter (DS-NSAF), sign subband adaptive filter (SSAF), impulsive noise, robust filtering
Procedia PDF Downloads 60225163 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain
Authors: Amal M. Alrayes
Abstract:
Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance.Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.Keywords: data quality, performance, system quality, Kingdom of Bahrain
Procedia PDF Downloads 49925162 Cloud Computing in Data Mining: A Technical Survey
Authors: Ghaemi Reza, Abdollahi Hamid, Dashti Elham
Abstract:
Cloud computing poses a diversity of challenges in data mining operation arising out of the dynamic structure of data distribution as against the use of typical database scenarios in conventional architecture. Due to immense number of users seeking data on daily basis, there is a serious security concerns to cloud providers as well as data providers who put their data on the cloud computing environment. Big data analytics use compute intensive data mining algorithms (Hidden markov, MapReduce parallel programming, Mahot Project, Hadoop distributed file system, K-Means and KMediod, Apriori) that require efficient high performance processors to produce timely results. Data mining algorithms to solve or optimize the model parameters. The challenges that operation has to encounter is the successful transactions to be established with the existing virtual machine environment and the databases to be kept under the control. Several factors have led to the distributed data mining from normal or centralized mining. The approach is as a SaaS which uses multi-agent systems for implementing the different tasks of system. There are still some problems of data mining based on cloud computing, including design and selection of data mining algorithms.Keywords: cloud computing, data mining, computing models, cloud services
Procedia PDF Downloads 48225161 Cross-border Data Transfers to and from South Africa
Authors: Amy Gooden, Meshandren Naidoo
Abstract:
Genetic research and transfers of big data are not confined to a particular jurisdiction, but there is a lack of clarity regarding the legal requirements for importing and exporting such data. Using direct-to-consumer genetic testing (DTC-GT) as an example, this research assesses the status of data sharing into and out of South Africa (SA). While SA laws cover the sending of genetic data out of SA, prohibiting such transfer unless a legal ground exists, the position where genetic data comes into the country depends on the laws of the country from where it is sent – making the legal position less clear.Keywords: cross-border, data, genetic testing, law, regulation, research, sharing, South Africa
Procedia PDF Downloads 12825160 The Study of Security Techniques on Information System for Decision Making
Authors: Tejinder Singh
Abstract:
Information system is the flow of data from different levels to different directions for decision making and data operations in information system (IS). Data can be violated by different manner like manual or technical errors, data tampering or loss of integrity. Security system called firewall of IS is effected by such type of violations. The flow of data among various levels of Information System is done by networking system. The flow of data on network is in form of packets or frames. To protect these packets from unauthorized access, virus attacks, and to maintain the integrity level, network security is an important factor. To protect the data to get pirated, various security techniques are used. This paper represents the various security techniques and signifies different harmful attacks with the help of detailed data analysis. This paper will be beneficial for the organizations to make the system more secure, effective, and beneficial for future decisions making.Keywords: information systems, data integrity, TCP/IP network, vulnerability, decision, data
Procedia PDF Downloads 31125159 Data Integration with Geographic Information System Tools for Rural Environmental Monitoring
Authors: Tamas Jancso, Andrea Podor, Eva Nagyne Hajnal, Peter Udvardy, Gabor Nagy, Attila Varga, Meng Qingyan
Abstract:
The paper deals with the conditions and circumstances of integration of remotely sensed data for rural environmental monitoring purposes. The main task is to make decisions during the integration process when we have data sources with different resolution, location, spectral channels, and dimension. In order to have exact knowledge about the integration and data fusion possibilities, it is necessary to know the properties (metadata) that characterize the data. The paper explains the joining of these data sources using their attribute data through a sample project. The resulted product will be used for rural environmental analysis.Keywords: remote sensing, GIS, metadata, integration, environmental analysis
Procedia PDF Downloads 12425158 Analysis of Genomics Big Data in Cloud Computing Using Fuzzy Logic
Authors: Mohammad Vahed, Ana Sadeghitohidi, Majid Vahed, Hiroki Takahashi
Abstract:
In the genomics field, the huge amounts of data have produced by the next-generation sequencers (NGS). Data volumes are very rapidly growing, as it is postulated that more than one billion bases will be produced per year in 2020. The growth rate of produced data is much faster than Moore's law in computer technology. This makes it more difficult to deal with genomics data, such as storing data, searching information, and finding the hidden information. It is required to develop the analysis platform for genomics big data. Cloud computing newly developed enables us to deal with big data more efficiently. Hadoop is one of the frameworks distributed computing and relies upon the core of a Big Data as a Service (BDaaS). Although many services have adopted this technology, e.g. amazon, there are a few applications in the biology field. Here, we propose a new algorithm to more efficiently deal with the genomics big data, e.g. sequencing data. Our algorithm consists of two parts: First is that BDaaS is applied for handling the data more efficiently. Second is that the hybrid method of MapReduce and Fuzzy logic is applied for data processing. This step can be parallelized in implementation. Our algorithm has great potential in computational analysis of genomics big data, e.g. de novo genome assembly and sequence similarity search. We will discuss our algorithm and its feasibility.Keywords: big data, fuzzy logic, MapReduce, Hadoop, cloud computing
Procedia PDF Downloads 30225157 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data
Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin
Abstract:
Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.Keywords: big data, machine learning, ontology model, urban data model
Procedia PDF Downloads 42325156 Data-driven Decision-Making in Digital Entrepreneurship
Authors: Abeba Nigussie Turi, Xiangming Samuel Li
Abstract:
Data-driven business models are more typical for established businesses than early-stage startups that strive to penetrate a market. This paper provided an extensive discussion on the principles of data analytics for early-stage digital entrepreneurial businesses. Here, we developed data-driven decision-making (DDDM) framework that applies to startups prone to multifaceted barriers in the form of poor data access, technical and financial constraints, to state some. The startup DDDM framework proposed in this paper is novel in its form encompassing startup data analytics enablers and metrics aligning with startups' business models ranging from customer-centric product development to servitization which is the future of modern digital entrepreneurship.Keywords: startup data analytics, data-driven decision-making, data acquisition, data generation, digital entrepreneurship
Procedia PDF Downloads 33225155 Numerical Implementation and Testing of Fractioning Estimator Method for the Box-Counting Dimension of Fractal Objects
Authors: Abraham Terán Salcedo, Didier Samayoa Ochoa
Abstract:
This work presents a numerical implementation of a method for estimating the box-counting dimension of self-avoiding curves on a planar space, fractal objects captured on digital images; this method is named fractioning estimator. Classical methods of digital image processing, such as noise filtering, contrast manipulation, and thresholding, among others, are used in order to obtain binary images that are suitable for performing the necessary computations of the fractioning estimator. A user interface is developed for performing the image processing operations and testing the fractioning estimator on different captured images of real-life fractal objects. To analyze the results, the estimations obtained through the fractioning estimator are compared to the results obtained through other methods that are already implemented on different available software for computing and estimating the box-counting dimension.Keywords: box-counting, digital image processing, fractal dimension, numerical method
Procedia PDF Downloads 8825154 Protein and Mineral Removal from Dairy Waste-Water Using Precipitation Process
Authors: Zahra Akbari, Farzin Zokaee, Talat Ghomashchi
Abstract:
Whey is a by-product of the dairy industry whose major components are lactose (44–52 g/L), proteins (6–8 g/L) and mineral salts (4–9 g/L). Approximately 50% of 121 million tons of whey produced in the world in 1993 were disposed into rivers, lakes or other water bodies, treated in wastewater treatment plants or loaded onto land. This represents a significant loss of resources and causes serious pollution problems since whey is a heavy organic pollutant with high COD and BOD values, 40–60 g/L and 50–80 g/L, respectively. The removal of cheese whey proteins and minerals represent an important task both in environmental and in food sciences. The most important treatments which are considered in this study, have been done by using lime, Al2O3, FeCl3 and AlCl3 along with heating and also acidic-alkaline method. Results show that the best way for removal of protein is accomplished with adding HCl to decrease pH from 6 to 4, boiling for 20 min, and filtering protein aggregates. Also partial demineralization in whey solution for reducing ash is accomplished by adding NaOH to increase pH to 7.2 and heating solution for 20 min.Keywords: whey treatment, dairy industry, precipitation, protein, mineral
Procedia PDF Downloads 41925153 Cryptographic Protocol for Secure Cloud Storage
Authors: Luvisa Kusuma, Panji Yudha Prakasa
Abstract:
Cloud storage, as a subservice of infrastructure as a service (IaaS) in Cloud Computing, is the model of nerworked storage where data can be stored in server. In this paper, we propose a secure cloud storage system consisting of two main components; client as a user who uses the cloud storage service and server who provides the cloud storage service. In this system, we propose the protocol schemes to guarantee against security attacks in the data transmission. The protocols are login protocol, upload data protocol, download protocol, and push data protocol, which implement hybrid cryptographic mechanism based on data encryption before it is sent to the cloud, so cloud storage provider does not know the user's data and cannot analysis user’s data, because there is no correspondence between data and user.Keywords: cloud storage, security, cryptographic protocol, artificial intelligence
Procedia PDF Downloads 36125152 Decentralized Data Marketplace Framework Using Blockchain-Based Smart Contract
Authors: Meshari Aljohani, Stephan Olariu, Ravi Mukkamala
Abstract:
Data is essential for enhancing the quality of life. Its value creates chances for users to profit from data sales and purchases. Users in data marketplaces, however, must share and trade data in a secure and trusted environment while maintaining their privacy. The first main contribution of this paper is to identify enabling technologies and challenges facing the development of decentralized data marketplaces. The second main contribution is to propose a decentralized data marketplace framework based on blockchain technology. The proposed framework enables sellers and buyers to transact with more confidence. Using a security deposit, the system implements a unique approach for enforcing honesty in data exchange among anonymous individuals. Before the transaction is considered complete, the system has a time frame. As a result, users can submit disputes to the arbitrators which will review them and respond with their decision. Use cases are presented to demonstrate how these technologies help data marketplaces handle issues and challenges.Keywords: blockchain, data, data marketplace, smart contract, reputation system
Procedia PDF Downloads 161