Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24592

Search results for: data stream

24232 A Study on Big Data Analytics, Applications and Challenges

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 50

24231 A Study on Big Data Analytics, Applications, and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 65

24230 An Active Solar Energy System to Supply Heating Demands of the Teaching Staff Dormitory of Islamic Azad University Ramhormoz Branch

Authors: M. Talebzadegan, S. Bina, I. Riazi

Abstract:

The purpose of this paper is to present an active solar energy system to supply heating demands of the teaching staff dormitory of the Islamic Azad University of Ramhormoz. The design takes into account the solar radiations and climate data of Ramhormoz town and is based on the daily warm water consumption for health demands of 450 residents of the dormitory, which is equal to 27000 lit of 50-C° water, and building heating requirements with an area of 3500 m² well-protected by heatproof materials. First, heating demands of the building were calculated, then a hybrid system made up of solar and fossil energies was developed and finally, the design was economically evaluated. Since there is only roof space for using 110 flat solar water heaters, the calculations were made to hybridize solar water heating system with heat pumping system in which solar energy contributes 67% of the heat generated. According to calculations, the net present value “N.P.V.” of revenue stream exceeds “N.P.V.” of cash paid off in this project over three years, which makes economically quite promising. The return of investment and payback period of the project is 4 years. Also, the internal rate of return (IRR) of the project was 25%, which exceeds bank rate of interest in Iran and emphasizes the desirability of the project.

Keywords: Solar energy, Heat Demand, Renewable , Pollution

Procedia PDF Downloads 225

24229 Improved K-Means Clustering Algorithm Using RHadoop with Combiner

Authors: Ji Eun Shin, Dong Hoon Lim

Abstract:

Data clustering is a common technique used in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known clustering algorithm aiming to cluster a set of data points to a predefined number of clusters. In this paper, we implement K-means algorithm based on MapReduce framework with RHadoop to make the clustering method applicable to large scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner as a function of our map output to decrease the amount of data needed to be processed by reducers. The experimental results demonstrated that K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with combiner was faster than regular algorithm without combiner as the size of data set increases.

Keywords: big data, combiner, K-means clustering, RHadoop

Procedia PDF Downloads 401

24228 Framework for Integrating Big Data and Thick Data: Understanding Customers Better

Authors: Nikita Valluri, Vatcharaporn Esichaikul

Abstract:

With the popularity of data-driven decision making on the rise, this study focuses on providing an alternative outlook towards the process of decision-making. Combining quantitative and qualitative methods rooted in the social sciences, an integrated framework is presented with a focus on delivering a much more robust and efficient approach towards the concept of data-driven decision-making with respect to not only Big data but also 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector has been illustrated where the framework is put into action to yield insights and leverage business intelligence. An interpretive approach to analyze findings from both kinds of quantitative and qualitative data has been used to glean insights. Using traditional Point-of-sale data as well as an understanding of customer psychographics and preferences, techniques of data mining along with qualitative methods (such as grounded theory, ethnomethodology, etc.) are applied. This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is a modified enhancement in lieu of traditional data-driven decision-making approach, which is mainly dependent on quantitative data for decision-making.

Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data

Procedia PDF Downloads 133

24227 High School Gain Analytics From National Assessment Program – Literacy and Numeracy and Australian Tertiary Admission Rankin Linkage

Authors: Andrew Laming, John Hattie, Mark Wilson

Abstract:

Nine Queensland Independent high schools provided deidentified student-matched ATAR and NAPLAN data for all 1217 ATAR graduates since 2020 who also sat NAPLAN at the school. Graduating cohorts from the nine schools contained a mean 100 ATAR graduates with previous NAPLAN data from their school. Excluded were vocational students (mean=27) and any ATAR graduates without NAPLAN data (mean=20). Based on Index of Community Socio-Educational Access (ICSEA) prediction, all schools had larger that predicted proportions of their students graduating with ATARs. There were an additional 173 students not releasing their ATARs to their school (14%), requiring this data to be inferred by schools. Gain was established by first converting each student’s strongest NAPLAN domain to a statewide percentile, then subtracting this result from final ATAR. The resulting ‘percentile shift’ was corrected for plausible ATAR participation at each NAPLAN level. Strongest NAPLAN domain had the highest correlation with ATAR (R2=0.58). RESULTS School mean NAPLAN scores fitted ICSEA closely (R2=0.97). Schools achieved a mean cohort gain of two ATAR rankings, but only 66% of students gained. This ranged from 46% of top-NAPLAN decile students gaining, rising to 75% achieving gains outside the top decile. The 54% of top-decile students whose ATAR fell short of prediction lost a mean 4.0 percentiles (or 6.2 percentiles prior to correction for regression to the mean). 71% of students in smaller schools gained, compared to 63% in larger schools. NAPLAN variability in each of the 13 ICSEA1100 cohorts was 17%, with both intra-school and inter-school variation of these values extremely low (0.3% to 1.8%). Mean ATAR change between years in each school was just 1.1 ATAR ranks. This suggests consecutive school cohorts and ICSEA-similar schools share very similar distributions and outcomes over time. Quantile analysis of the NAPLAN/ATAR revealed heteroscedasticity, but splines offered little additional benefit over simple linear regression. The NAPLAN/ATAR R2 was 0.33. DISCUSSION Standardised data like NAPLAN and ATAR offer educators a simple no-cost progression metric to analyse performance in conjunction with their internal test results. Change is expressed in percentiles, or ATAR shift per student, which is layperson intuitive. Findings may also reduce ATAR/vocational stream mismatch, reveal proportions of cohorts meeting or falling short of expectation and demonstrate by how much. Finally, ‘crashed’ ATARs well below expectation are revealed, which schools can reasonably work to minimise. The percentile shift method is neither value-add nor a growth percentile. In the absence of exit NAPLAN testing, this metric is unable to discriminate academic gain from legitimate ATAR-maximizing strategies. But by controlling for ICSEA, ATAR proportion variation and student mobility, it uncovers progression to ATAR metrics which are not currently publicly available. However achieved, ATAR maximisation is a sought-after private good. So long as standardised nationwide data is available, this analysis offers useful analytics for educators and reasonable predictivity when counselling subsequent cohorts about their ATAR prospects.

Keywords: NAPLAN, ATAR, analytics, measurement, gain, performance, data, percentile, value-added, high school, numeracy, reading comprehension, variability, regression to the mean

Procedia PDF Downloads 36

24226 Evaluation of an Integrated Supersonic System for Inertial Extraction of CO₂ in Post-Combustion Streams of Fossil Fuel Operating Power Plants

Authors: Zarina Chokparova, Ighor Uzhinsky

Abstract:

Carbon dioxide emissions resulting from burning of the fossil fuels on large scales, such as oil industry or power plants, leads to a plenty of severe implications including global temperature raise, air pollution and other adverse impacts on the environment. Besides some precarious and costly ways for the alleviation of CO₂ emissions detriment in industrial scales (such as liquefaction of CO₂ and its deep-water treatment, application of adsorbents and membranes, which require careful consideration of drawback effects and their mitigation), one physically and commercially available technology for its capture and disposal is supersonic system for inertial extraction of CO₂ in after-combustion streams. Due to the flue gas with a carbon dioxide concentration of 10-15 volume percent being emitted from the combustion system, the waste stream represents a rather diluted condition at low pressure. The supersonic system induces a flue gas mixture stream to expand using a converge-and-diverge operating nozzle; the flow velocity increases to the supersonic ranges resulting in rapid drop of temperature and pressure. Thus, conversion of potential energy into the kinetic power causes a desublimation of CO₂. Solidified carbon dioxide can be sent to the separate vessel for further disposal. The major advantages of the current solution are its economic efficiency, physical stability, and compactness of the system, as well as needlessness of addition any chemical media. However, there are several challenges yet to be regarded to optimize the system: the way for increasing the size of separated CO₂ particles (as they are represented on a micrometers scale of effective diameter), reduction of the concomitant gas separated together with carbon dioxide and provision of CO₂ downstream flow purity. Moreover, determination of thermodynamic conditions of the vapor-solid mixture including specification of the valid and accurate equation of state remains to be an essential goal. Due to high speeds and temperatures reached during the process, the influence of the emitted heat should be considered, and the applicable solution model for the compressible flow need to be determined. In this report, a brief overview of the current technology status will be presented and a program for further evaluation of this approach is going to be proposed.

Keywords: CO₂ sequestration, converging diverging nozzle, fossil fuel power plant emissions, inertial CO₂ extraction, supersonic post-combustion carbon dioxide capture

Procedia PDF Downloads 121

24225 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 277

24224 Open Data for e-Governance: Case Study of Bangladesh

Authors: Sami Kabir, Sadek Hossain Khoka

Abstract:

Open Government Data (OGD) refers to all data produced by government which are accessible in reusable way by common people with access to Internet and at free of cost. In line with “Digital Bangladesh” vision of Bangladesh government, the concept of open data has been gaining momentum in the country. Opening all government data in digital and customizable format from single platform can enhance e-governance which will make government more transparent to the people. This paper presents a well-in-progress case study on OGD portal by Bangladesh Government in order to link decentralized data. The initiative is intended to facilitate e-service towards citizens through this one-stop web portal. The paper further discusses ways of collecting data in digital format from relevant agencies with a view to making it publicly available through this single point of access. Further, possible layout of this web portal is presented.

Keywords: e-governance, one-stop web portal, open government data, reusable data, web of data

Procedia PDF Downloads 323

24223 Unconfined Laminar Nanofluid Flow and Heat Transfer around a Square Cylinder with an Angle of Incidence

Authors: Rafik Bouakkaz

Abstract:

A finite-volume method simulation is used to investigate two dimensional unsteady flow of nanofluids and heat transfer characteristics past a square cylinder inclined with respect to the main flow in the laminar regime. The computations are carried out of nanoparticle volume fractions varying from 0 ≤ ∅ ≤ 5% for an inclination angle in the range 0° ≤ δ ≤ 45° at a Reynolds number of 100. The variation of stream line and isotherm patterns are presented for the above range of conditions. Also, it is noticed that the addition of nanoparticles enhances the heat transfer. Hence, the local Nusselt number is found to increase with increasing value of the concentration of nanoparticles for the fixed value of the inclination angle.

Keywords: copper nanoparticles, heat transfer, square cylinder, inclination angle

Procedia PDF Downloads 162

24222 Capture Zone of a Well Field in an Aquifer Bounded by Two Parallel Streams

Authors: S. Nagheli, N. Samani, D. A. Barry

Abstract:

In this paper, the velocity potential and stream function of capture zone for a well field in an aquifer bounded by two parallel streams with or without a uniform regional flow of any directions are presented. The well field includes any number of extraction or injection wells or a combination of both types with any pumping rates. To delineate the capture envelope, the potential and streamlines equations are derived by conformal mapping method. This method can help us to release constrains of other methods. The equations can be applied as useful tools to design in-situ groundwater remediation systems, to evaluate the surface–subsurface water interaction and to manage the water resources.

Keywords: complex potential, conformal mapping, image well theory, Laplace’s equation, superposition principle

Procedia PDF Downloads 400

24221 Resource Framework Descriptors for Interestingness in Data

Authors: C. B. Abhilash, Kavi Mahesh

Abstract:

Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.

Keywords: RDF, interestingness, knowledge base, semantic data

Procedia PDF Downloads 125

24220 Environmental Interactions in Riparian Vegetation Cover in an Urban Stream Corridor: A Case Study of Duzce Asar Suyu

Authors: Engin Eroğlu, Oktay Yıldız, Necmi Aksoy, Akif Keten, Mehmet Kıvanç Ak, Şeref Keskin, Elif Atmaca, Sertaç Kaya

Abstract:

Nowadays, green spaces in urban areas are under threat and decreasing their percentages in the urban areas because of increasing population, urbanization, migration, and some cultural changes in quality. An important element of the natural landscape water and water-related natural ecosystems are exposed to corruption due to these pressures. A landscape has owned many different types of elements or units, a more dominant structure than other landscapes as good or bad perceptible extent different direction and variable reveals a unique structure and character of the landscape. Whereas landscapes deal with two main groups as urban and rural according to their location on the world, especially intersection areas of urban and rural named semi-urban or semi-rural present variety landscape features. The main components of the landscape are defined as patch-matrix-corridor. The corridors include quite various vegetation types such as riparian, wetland and the others. In urban areas, natural water corridors are an important elements of the diversity of the riparian vegetation cover. In particular, water corridors attract attention with a natural diversity and lack of fragmentation, degradation and artificial results. Thanks to these features, without a doubt, water corridors are the important component of all cities in the world. These corridors not only divide the city into two separate sides, but also assured the ecological connectivity between the two sides of the city. The main objective of this study is to determine the vegetation and habitat features of urban stream corridor according to environmental interactions. Within this context, this study will be realized that 'Asar Suyu' is an important component of the city of Düzce. Moreover, the riparian zone touched contiguous area borders of the city and overlaid the urban development limits of the city, determining of characteristics of the corridor will be carried out as floristic and habitat analysis. Consequently, vegetation structure and habitat features which play an important role between riparian zone vegetation covers and environmental interaction will be determined. This study includes first results of The Scientific and Technological Research Council of Turkey (TUBITAK-116O596; 'Determining of Landscape Character of Urban Water Corridors as Visual and Ecological; A Case Study of Asar Suyu in Duzce').

Keywords: corridor, Duzce, landscape ecology, riparian vegetation

Procedia PDF Downloads 312

24219 Data Mining Practices: Practical Studies on the Telecommunication Companies in Jordan

Authors: Dina Ahmad Alkhodary

Abstract:

This study aimed to investigate the practices of Data Mining on the telecommunication companies in Jordan, from the viewpoint of the respondents. In order to achieve the goal of the study, and test the validity of hypotheses, the researcher has designed a questionnaire to collect data from managers and staff members from main department in the researched companies. The results shows improvements stages of the telecommunications companies towered Data Mining.

Keywords: data, mining, development, business

Procedia PDF Downloads 466

24218 Recovery of Acetonitrile from Aqueous Solutions by Extractive Distillation: The Effect of Entrainer

Authors: Aleksandra Y. Sazonova, Valentina M. Raeva

Abstract:

The aim of this work was to apply extractive distillation for acetonitrile removal from water solutions, to validate thermodynamic criterion based on excess Gibbs energy to entrainer selection process for acetonitrile – water mixture separation and show its potential efficiency at isothermal conditions as well as at isobaric (conditions of real distillation process), to simulate and analyze an extractive distillation process with chosen entrainers: optimize amount of trays and feeds, entrainer/original mixture and reflux ratios. Equimolar composition of the feed stream was chosen for the process, comparison of the energy consumptions was carried out. Glycerol was suggested as the most energetically and ecologically suitable entrainer.

Keywords: acetonitrile, entrainer, extractive distillation, water

Procedia PDF Downloads 242

24217 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain

Authors: Amal M. Alrayes

Abstract:

Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance.Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.

Keywords: data quality, performance, system quality, Kingdom of Bahrain

Procedia PDF Downloads 458

24216 Gis Based Flash Flood Runoff Simulation Model of Upper Teesta River Besin - Using Aster Dem and Meteorological Data

Authors: Abhisek Chakrabarty, Subhraprakash Mandal

Abstract:

Flash flood is one of the catastrophic natural hazards in the mountainous region of India. The recent flood in the Mandakini River in Kedarnath (14-17th June, 2013) is a classic example of flash floods that devastated Uttarakhand by killing thousands of people.The disaster was an integrated effect of high intensityrainfall, sudden breach of Chorabari Lake and very steep topography. Every year in Himalayan Region flash flood occur due to intense rainfall over a short period of time, cloud burst, glacial lake outburst and collapse of artificial check dam that cause high flow of river water. In Sikkim-Derjeeling Himalaya one of the probable flash flood occurrence zone is Teesta Watershed. The Teesta River is a right tributary of the Brahmaputra with draining mountain area of approximately 8600 Sq. km. It originates in the Pauhunri massif (7127 m). The total length of the mountain section of the river amounts to 182 km. The Teesta is characterized by a complex hydrological regime. The river is fed not only by precipitation, but also by melting glaciers and snow as well as groundwater. The present study describes an attempt to model surface runoff in upper Teesta basin, which is directly related to catastrophic flood events, by creating a system based on GIS technology. The main object was to construct a direct unit hydrograph for an excess rainfall by estimating the stream flow response at the outlet of a watershed. Specifically, the methodology was based on the creation of a spatial database in GIS environment and on data editing. Moreover, rainfall time-series data collected from Indian Meteorological Department and they were processed in order to calculate flow time and the runoff volume. Apart from the meteorological data, background data such as topography, drainage network, land cover and geological data were also collected. Clipping the watershed from the entire area and the streamline generation for Teesta watershed were done and cross-sectional profiles plotted across the river at various locations from Aster DEM data using the ERDAS IMAGINE 9.0 and Arc GIS 10.0 software. The analysis of different hydraulic model to detect flash flood probability ware done using HEC-RAS, Flow-2D, HEC-HMS Software, which were of great importance in order to achieve the final result. With an input rainfall intensity above 400 mm per day for three days the flood runoff simulation models shows outbursts of lakes and check dam individually or in combination with run-off causing severe damage to the downstream settlements. Model output shows that 313 Sq. km area were found to be most vulnerable to flash flood includes Melli, Jourthang, Chungthang, and Lachung and 655sq. km. as moderately vulnerable includes Rangpo,Yathang, Dambung,Bardang, Singtam, Teesta Bazarand Thangu Valley. The model was validated by inserting the rain fall data of a flood event took place in August 1968, and 78% of the actual area flooded reflected in the output of the model. Lastly preventive and curative measures were suggested to reduce the losses by probable flash flood event.

Keywords: flash flood, GIS, runoff, simulation model, Teesta river basin

Procedia PDF Downloads 272

24215 Cloud Computing in Data Mining: A Technical Survey

Authors: Ghaemi Reza, Abdollahi Hamid, Dashti Elham

Abstract:

Cloud computing poses a diversity of challenges in data mining operation arising out of the dynamic structure of data distribution as against the use of typical database scenarios in conventional architecture. Due to immense number of users seeking data on daily basis, there is a serious security concerns to cloud providers as well as data providers who put their data on the cloud computing environment. Big data analytics use compute intensive data mining algorithms (Hidden markov, MapReduce parallel programming, Mahot Project, Hadoop distributed file system, K-Means and KMediod, Apriori) that require efficient high performance processors to produce timely results. Data mining algorithms to solve or optimize the model parameters. The challenges that operation has to encounter is the successful transactions to be established with the existing virtual machine environment and the databases to be kept under the control. Several factors have led to the distributed data mining from normal or centralized mining. The approach is as a SaaS which uses multi-agent systems for implementing the different tasks of system. There are still some problems of data mining based on cloud computing, including design and selection of data mining algorithms.

Keywords: cloud computing, data mining, computing models, cloud services

Procedia PDF Downloads 449

24214 Cross-border Data Transfers to and from South Africa

Authors: Amy Gooden, Meshandren Naidoo

Abstract:

Genetic research and transfers of big data are not confined to a particular jurisdiction, but there is a lack of clarity regarding the legal requirements for importing and exporting such data. Using direct-to-consumer genetic testing (DTC-GT) as an example, this research assesses the status of data sharing into and out of South Africa (SA). While SA laws cover the sending of genetic data out of SA, prohibiting such transfer unless a legal ground exists, the position where genetic data comes into the country depends on the laws of the country from where it is sent – making the legal position less clear.

Keywords: cross-border, data, genetic testing, law, regulation, research, sharing, South Africa

Procedia PDF Downloads 104

24213 The Study of Security Techniques on Information System for Decision Making

Authors: Tejinder Singh

Abstract:

Information system is the flow of data from different levels to different directions for decision making and data operations in information system (IS). Data can be violated by different manner like manual or technical errors, data tampering or loss of integrity. Security system called firewall of IS is effected by such type of violations. The flow of data among various levels of Information System is done by networking system. The flow of data on network is in form of packets or frames. To protect these packets from unauthorized access, virus attacks, and to maintain the integrity level, network security is an important factor. To protect the data to get pirated, various security techniques are used. This paper represents the various security techniques and signifies different harmful attacks with the help of detailed data analysis. This paper will be beneficial for the organizations to make the system more secure, effective, and beneficial for future decisions making.

Keywords: information systems, data integrity, TCP/IP network, vulnerability, decision, data

Procedia PDF Downloads 272

24212 Data Integration with Geographic Information System Tools for Rural Environmental Monitoring

Authors: Tamas Jancso, Andrea Podor, Eva Nagyne Hajnal, Peter Udvardy, Gabor Nagy, Attila Varga, Meng Qingyan

Abstract:

The paper deals with the conditions and circumstances of integration of remotely sensed data for rural environmental monitoring purposes. The main task is to make decisions during the integration process when we have data sources with different resolution, location, spectral channels, and dimension. In order to have exact knowledge about the integration and data fusion possibilities, it is necessary to know the properties (metadata) that characterize the data. The paper explains the joining of these data sources using their attribute data through a sample project. The resulted product will be used for rural environmental analysis.

Keywords: remote sensing, GIS, metadata, integration, environmental analysis

Procedia PDF Downloads 92

24211 Analysis of Genomics Big Data in Cloud Computing Using Fuzzy Logic

Authors: Mohammad Vahed, Ana Sadeghitohidi, Majid Vahed, Hiroki Takahashi

Abstract:

In the genomics field, the huge amounts of data have produced by the next-generation sequencers (NGS). Data volumes are very rapidly growing, as it is postulated that more than one billion bases will be produced per year in 2020. The growth rate of produced data is much faster than Moore's law in computer technology. This makes it more difficult to deal with genomics data, such as storing data, searching information, and finding the hidden information. It is required to develop the analysis platform for genomics big data. Cloud computing newly developed enables us to deal with big data more efficiently. Hadoop is one of the frameworks distributed computing and relies upon the core of a Big Data as a Service (BDaaS). Although many services have adopted this technology, e.g. amazon, there are a few applications in the biology field. Here, we propose a new algorithm to more efficiently deal with the genomics big data, e.g. sequencing data. Our algorithm consists of two parts: First is that BDaaS is applied for handling the data more efficiently. Second is that the hybrid method of MapReduce and Fuzzy logic is applied for data processing. This step can be parallelized in implementation. Our algorithm has great potential in computational analysis of genomics big data, e.g. de novo genome assembly and sequence similarity search. We will discuss our algorithm and its feasibility.

Keywords: big data, fuzzy logic, MapReduce, Hadoop, cloud computing

Procedia PDF Downloads 268

24210 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: big data, machine learning, ontology model, urban data model

Procedia PDF Downloads 384

24209 Data-driven Decision-Making in Digital Entrepreneurship

Authors: Abeba Nigussie Turi, Xiangming Samuel Li

Abstract:

Data-driven business models are more typical for established businesses than early-stage startups that strive to penetrate a market. This paper provided an extensive discussion on the principles of data analytics for early-stage digital entrepreneurial businesses. Here, we developed data-driven decision-making (DDDM) framework that applies to startups prone to multifaceted barriers in the form of poor data access, technical and financial constraints, to state some. The startup DDDM framework proposed in this paper is novel in its form encompassing startup data analytics enablers and metrics aligning with startups' business models ranging from customer-centric product development to servitization which is the future of modern digital entrepreneurship.

Keywords: startup data analytics, data-driven decision-making, data acquisition, data generation, digital entrepreneurship

Procedia PDF Downloads 280

24208 Parallel Opportunity for Water Conservation and Habitat Formation on Regulated Streams through Formation of Thermal Stratification in River Pools

Authors: Todd H. Buxton, Yong G. Lai

Abstract:

Temperature management in regulated rivers can involve significant expenditures of water to meet the cold-water requirements of species in summer. For this purpose, flows released from Lewiston Dam on the Trinity River in Northern California are 12.7 cms with temperatures around 11oC in July through September to provide adult spring Chinook cold water to hold in deep pools and mature until spawning in fall. The releases are more than double the flow and 10oC colder temperatures than the natural conditions before the dam was built. The high, cold releases provide springers the habitat they require but may suppress the stream food base and limit future populations of salmon by reducing the juvenile fish size and survival to adults via the positive relationship between the two. Field and modeling research was undertaken to explore whether lowering summer releases from Lewiston Dam may promote thermal stratification in river pools so that both the cold-water needs of adult salmon and warmer water requirements of other organisms in the stream biome may be met. For this investigation, a three-dimensional (3D) computational fluid dynamics (CFD) model was developed and validated with field measurements in two deep pools on the Trinity River. Modeling and field observations were then used to identify the flows and temperatures that may form and maintain thermal stratification under different meteorologic conditions. Under low flows, a pool was found to be well mixed and thermally homogenous until temperatures began to stratify shortly after sunrise. Stratification then strengthened through the day until shading from trees and mountains cooled the inlet flow and decayed the thermal gradient, which collapsed shortly before sunset and returned the pool to a well-mixed state. This diurnal process of stratification formation and destruction was closely predicted by the 3D CFD model. Both the model and field observations indicate that thermal stratification maintained the coldest temperatures of the day at ≥2m depth in a pool and provided water that was around 8oC warmer in the upper 2m of the pool. Results further indicate that the stratified pool under low flows provided almost the same daily average temperatures as when flows were an order of magnitude higher and stratification was prevented, indicating significant water savings may be realized in regulated streams while also providing a diversity in water temperatures the ecosystem requires. With confidence in the 3D CFD model, the model is now being applied to a dozen pools in the Trinity River to understand how pool bathymetry influences thermal stratification under variable flows and diurnal temperature variations. This knowledge will be used to expand the results to 52 pools in a 64 km reach below Lewiston Dam that meet the depth criteria (≥2 m) for spring Chinook holding. From this, rating curves will be developed to relate discharge to the volume of pool habitat that provides springers the temperature (<15.6oC daily average), velocity (0.15 to 0.4 m/s) and depths that accommodate the escapement target for spring Chinook (6,000 adults) under maximum fish densities measured in other streams (3.1 m3/fish) during the holding time of year (May through August). Flow releases that meet these goals will be evaluated for water savings relative to the current flow regime and their influence on indicator species, including the Foothill Yellow-Legged Frog, and aspects of the stream biome that support salmon populations, including macroinvertebrate production and juvenile Chinook growth rates.

Keywords: 3D CFD modeling, flow regulation, thermal stratification, chinook salmon, foothill yellow-legged frogs, water managment

Procedia PDF Downloads 34

24207 Cryptographic Protocol for Secure Cloud Storage

Authors: Luvisa Kusuma, Panji Yudha Prakasa

Abstract:

Cloud storage, as a subservice of infrastructure as a service (IaaS) in Cloud Computing, is the model of nerworked storage where data can be stored in server. In this paper, we propose a secure cloud storage system consisting of two main components; client as a user who uses the cloud storage service and server who provides the cloud storage service. In this system, we propose the protocol schemes to guarantee against security attacks in the data transmission. The protocols are login protocol, upload data protocol, download protocol, and push data protocol, which implement hybrid cryptographic mechanism based on data encryption before it is sent to the cloud, so cloud storage provider does not know the user's data and cannot analysis user’s data, because there is no correspondence between data and user.

Keywords: cloud storage, security, cryptographic protocol, artificial intelligence

Procedia PDF Downloads 308

24206 Decentralized Data Marketplace Framework Using Blockchain-Based Smart Contract

Authors: Meshari Aljohani, Stephan Olariu, Ravi Mukkamala

Abstract:

Data is essential for enhancing the quality of life. Its value creates chances for users to profit from data sales and purchases. Users in data marketplaces, however, must share and trade data in a secure and trusted environment while maintaining their privacy. The first main contribution of this paper is to identify enabling technologies and challenges facing the development of decentralized data marketplaces. The second main contribution is to propose a decentralized data marketplace framework based on blockchain technology. The proposed framework enables sellers and buyers to transact with more confidence. Using a security deposit, the system implements a unique approach for enforcing honesty in data exchange among anonymous individuals. Before the transaction is considered complete, the system has a time frame. As a result, users can submit disputes to the arbitrators which will review them and respond with their decision. Use cases are presented to demonstrate how these technologies help data marketplaces handle issues and challenges.

Keywords: blockchain, data, data marketplace, smart contract, reputation system

Procedia PDF Downloads 134

24205 An Energy Integration Study While Utilizing Heat of Flue Gas: Sponge Iron Process

Authors: Venkata Ramanaiah, Shabina Khanam

Abstract:

Enormous potential for saving energy is available in coal-based sponge iron plants as these are associated with the high percentage of energy wastage per unit sponge iron production. An energy integration option is proposed, in the present paper, to a coal based sponge iron plant of 100 tonnes per day production capacity, being operated in India using SL/RN (Stelco-Lurgi/Republic Steel-National Lead) process. It consists of the rotary kiln, rotary cooler, dust settling chamber, after burning chamber, evaporating cooler, electrostatic precipitator (ESP), wet scrapper and chimney as important equipment. Principles of process integration are used in the proposed option. It accounts for preheating kiln inlet streams like kiln feed and slinger coal up to 170ᴼC using waste gas exiting ESP. Further, kiln outlet stream is cooled from 1020ᴼC to 110ᴼC using kiln air. The working areas in the plant where energy is being lost and can be conserved are identified. Detailed material and energy balances are carried out around the sponge iron plant, and a modified model is developed, to find coal requirement of proposed option, based on hot utility, heat of reactions, kiln feed and air preheating, radiation losses, dolomite decomposition, the heat required to vaporize the coal volatiles, etc. As coal is used as utility and process stream, an iterative approach is used in solution methodology to compute coal consumption. Further, water consumption, operating cost, capital investment, waste gas generation, profit, and payback period of the modification are computed. Along with these, operational aspects of the proposed design are also discussed. To recover and integrate waste heat available in the plant, three gas-solid heat exchangers and four insulated ducts with one FD fan for each are installed additionally. Thus, the proposed option requires total capital investment of $0.84 million. Preheating of kiln feed, slinger coal and kiln air streams reduce coal consumption by 24.63% which in turn reduces waste gas generation by 25.2% in comparison to the existing process. Moreover, 96% reduction in water is also observed, which is the added advantage of the modification. Consequently, total profit is found as $2.06 million/year with payback period of 4.97 months only. The energy efficient factor (EEF), which is the % of the maximum energy that can be saved through design, is found to be 56.7%. Results of the proposed option are also compared with literature and found in good agreement.

Keywords: coal consumption, energy conservation, process integration, sponge iron plant

Procedia PDF Downloads 121

24204 Strategies for E-Waste Management: A Literature Review

Authors: Linh Thi Truc Doan, Yousef Amer, Sang-Heon Lee, Phan Nguyen Ky Phuc

Abstract:

During the last few decades, with the high-speed upgrade of electronic products, electronic waste (e-waste) has become one of the fastest growing wastes of the waste stream. In this context, more efforts and concerns have already been placed on the treatment and management of this waste. To mitigate their negative influences on the environment and society, it is necessary to establish appropriate strategies for e-waste management. Hence, this paper aims to review and analysis some useful strategies which have been applied in several countries to handle e-waste. Future perspectives on e-waste management are also suggested. The key findings found that, to manage e-waste successfully, it is necessary to establish effective reverse supply chains for e-waste, and raise public awareness towards the detrimental impacts of e-waste. The result of the research provides valuable insights to governments, policymakers in establishing e-waste management in a safe and sustainable manner.

Keywords: e-waste, e-waste management, life cycle assessment, recycling regulations

Procedia PDF Downloads 238

24203 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.

Keywords: hybrid storage system, data mining, recurrent neural network, support vector machine

Procedia PDF Downloads 277