Search results for: Data Mining
24501 Findings on Modelling Carbon Dioxide Concentration Scenarios in the Nairobi Metropolitan Region before and during COVID-19
Authors: John Okanda Okwaro
Abstract:
Carbon (IV) oxide (CO₂) is emitted mainly from fossil fuel combustion and industrial production. The sources of CO₂ of interest in the study area are mining activities, transport systems, and industrial processes. This study aims to build models that will help in monitoring emissions within the study area. Three scenarios are discussed: a pessimistic scenario, a business-as-usual (BAU) scenario, and an optimistic scenario. The results show a reduction in CO₂ concentration of approximately 50.5 ppm between March 2020 and January 2021 inclusive, mainly because reduced human activity led to decreased energy consumption. The CO₂ concentration trend also follows the BAU path. According to the models, the pessimistic, BAU, and optimistic scenarios give CO₂ concentrations of about 545.9 ppm, 408.1 ppm, and 360.1 ppm, respectively, on December 31st, 2021. This research helps show policymakers the relationship between energy sources and CO₂ emissions. Since the reduction in CO₂ emissions was due to decreased use of fossil fuel as economic activity declined, greater reliance on green energy than on fossil fuel in Kenya in the post-COVID-19 period would reduce CO₂ emissions further. That is, the CO₂ concentration trend would likely follow the optimistic scenario path, giving a reduction in CO₂ concentration of about 48 ppm by the end of 2021. This research recommends investment in solar energy by energy-intensive companies, maintenance of mine machinery and equipment, investment in electric vehicles, and doubling tree planting efforts to achieve the 10% cover.
Keywords: forecasting, greenhouse gas, green energy, hierarchical data format
Procedia PDF Downloads 168
24500 Digital Revolution a Veritable Infrastructure for Technological Development
Authors: Osakwe Jude Odiakaosa
Abstract:
Today’s digital society is characterized by e-education or e-learning, e-commerce, and so on, all of which have been propelled by the digital revolution. Digital technologies such as computer technology, the Global Positioning System (GPS), and Geographic Information Systems (GIS) have had a tremendous impact on the field of technology. This development has positively affected the scope, methods, and speed of data acquisition, data management, and the rate of delivery of the results of data processing (maps and other map products). This paper addresses the impact of the revolution brought about by digital technology.
Keywords: digital revolution, internet, technology, data management
Procedia PDF Downloads 449
24499 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy
Authors: Abdullah Al Mamun, Talal Alkharobi
Abstract:
As data sizes grow, people have become more accustomed to storing large amounts of secret information in cloud storage, and companies routinely need to transfer massive business files from one end to another. Privacy is lost if such data is transmitted as is, repeatedly, without securing the communication mechanism through proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption, it can only encrypt data of limited size, which makes it inapplicable to large data encryption. In this paper, we propose a probable approach based on Pretty Good Privacy for encrypting big data using both symmetric and asymmetric keys. Our goal is to encrypt huge collections of information and transmit them through a secure communication channel in order to protect business and personal privacy. To justify our method, an experimental dataset from three different platforms is provided. We show that our approach works efficiently and reliably for massive amounts of various data.
Keywords: big data, cloud computing, cryptography, hadoop, public key
Procedia PDF Downloads 320
24498 Implementation of Big Data Concepts Led by the Business Pressures
Authors: Snezana Savoska, Blagoj Ristevski, Violeta Manevska, Zlatko Savoski, Ilija Jolevski
Abstract:
Big data is widely accepted by pharmaceutical companies as a result of business demands created through legal pressure. Pharmaceutical companies face many legal demands as well as demands from standards, and have to adapt their procedures to the legislation. To manage these demands, they have to standardize their use of current information technology and use the latest software tools. This paper highlights some important aspects of experience with implementing big data projects in a Macedonian pharmaceutical company. These projects improved the company’s business processes with the help of new software tools selected to comply with legal and business demands. The company uses IT as a strategic tool to obtain competitive advantage in the market and to reengineer its processes towards the new Internet economy and quality demands. The company is required to manage vast amounts of structured as well as unstructured data. For these reasons, it implements projects for emerging and appropriate software tools that have to deal with the big data concepts accepted in the company.
Keywords: big data, unstructured data, SAP ERP, documentum
Procedia PDF Downloads 271
24497 Data Clustering in Wireless Sensor Network Implemented on Self-Organization Feature Map (SOFM) Neural Network
Authors: Krishan Kumar, Mohit Mittal, Pramod Kumar
Abstract:
Wireless sensor networks are among the most promising communication networks for monitoring remote environmental areas. In such a network, all the sensor nodes communicate with each other via radio signals. The sensor nodes are capable of sensing, data storage, and processing. Information collected by the sensor nodes is passed through neighboring nodes to a particular node. Data collection and processing are done by data aggregation techniques. For data aggregation, a clustering technique is implemented in the sensor network using a self-organizing feature map (SOFM) neural network. Some of the sensor nodes are selected as cluster head nodes. Information is aggregated at the cluster head nodes from the non-cluster head nodes and then transferred to the base station (or sink nodes). The aim of this paper is to manage the huge amount of data with the help of the SOFM neural network. Clustered data, rather than all the information aggregated at the cluster head nodes, is selected for transfer to the base station. This reduces the battery consumption caused by managing the huge amount of data, and the network lifetime is extended to a great extent.
Keywords: artificial neural network, data clustering, self organization feature map, wireless sensor network
Procedia PDF Downloads 517
24496 Hierarchical Checkpoint Protocol in Data Grids
Authors: Rahma Souli-Jbali, Minyar Sassi Hidri, Rahma Ben Ayed
Abstract:
Grids of computing nodes have emerged as a representative means of connecting distributed computers or resources scattered all over the world for the purpose of computing and distributed storage. Since fault tolerance becomes complex due to the availability of resources in a decentralized grid environment, it can be used in connection with replication in data grids. The objective of our work is to present fault tolerance in data grids with a data replication-driven model based on clustering. The performance of the protocol is evaluated with the OMNeT++ simulator. The computational results show the efficiency of our protocol in terms of recovery time and the number of processes in rollback.
Keywords: data grids, fault tolerance, clustering, chandy-lamport
Procedia PDF Downloads 341
24495 Deep Mill Level Zone (DMLZ) of Ertsberg East Skarn System, Papua; Correlation between Structure and Mineralization to Determined Characteristic Orebody of DMLZ Mine
Authors: Bambang Antoro, Lasito Soebari, Geoffrey de Jong, Fernandy Meiriyanto, Michael Siahaan, Eko Wibowo, Pormando Silalahi, Ruswanto, Adi Budirumantyo
Abstract:
The Ertsberg East Skarn System (EESS) is located in the Ertsberg Mining District, Papua, Indonesia. The EESS is a sub-vertical zone of copper-gold mineralization hosted in both diorite (vein-style mineralization) and skarn (disseminated and vein-style mineralization). The Deep Mill Level Zone (DMLZ) is a mining zone in the lower part of the EESS that produces copper and gold. The DMLZ deposit is located below the Deep Ore Zone deposit, between the 3125 m and 2590 m elevations, measures roughly 1,200 m in length, and is between 350 and 500 m in width. Mining of the DMLZ was planned to start in Q2 2015, at an ore extraction rate of about 60,000 tpd by the block cave mining method (the block cave contains 516 Mt). Mineralization and associated hydrothermal alteration in the DMLZ are hosted and enclosed by a large stock (the Main Ertsberg Intrusion) that is barren on all sides and above the DMLZ. Late porphyry dikes that cut through the Main Ertsberg Intrusion are spatially associated with the center of the DMLZ hydrothermal system. The DMLZ orebody is hosted in diorite and skarn, both dominated by vein-style mineralization. The percentages of material mined at the DMLZ, compared with current reserves, are: diorite 46% (0.46% Cu, 0.56 ppm Au, and 0.83% EqCu); skarn 39% (1.4% Cu, 0.95 ppm Au, and 2.05% EqCu); hornfels 8% (0.84% Cu, 0.82 ppm Au, and 1.39% EqCu); and marble 7%, possibly mined as waste. The correlation between the Ertsberg intrusion, major structures, and vein-style mineralization is important for determining the characteristics of the orebody in the DMLZ mine. In general, the DMLZ has two types of vein-filling mineralization, one for each host rock (diorite and skarn): in diorite, the vein system is filled by chalcopyrite-bornite-quartz and pyrite; in skarn, the veins are filled by chalcopyrite-bornite-pyrite and magnetite, without quartz.
Based on orientation, the stockwork veins hosted in diorite and the shallow veins hosted in skarn generally trend NW-SE and NE-SW with shallow to moderate dips. The DMLZ is controlled by two major faults; between these major structures, geologists found and verified local structures trending NW-SE and NE-SW, characterized by slickensides, shearing, gouge, and water-gas channels, some of which have been re-healed.
Keywords: copper-gold, DMLZ, skarn, structure
Procedia PDF Downloads 501
24494 Security in Resource Constraints: Network Energy Efficient Encryption
Authors: Mona Almansoori, Ahmed Mustafa, Ahmad Elshamy
Abstract:
Wireless nodes in a sensor network are designed to gather, process, and communicate critical information. The information flowing through such a network is critical for decision making and data processing, so the integrity of the data is one of the most critical factors in wireless security, and it must be ensured without compromising the processing and transmission capability of the network. This paper presents a mechanism to securely transmit data over a chain of sensor nodes without compromising the throughput of the network, utilizing the battery resources available at the sensor node.
Keywords: hybrid protocol, data integrity, lightweight encryption, neighbor based key sharing, sensor node data processing, Z-MAC
Procedia PDF Downloads 145
24493 Solutions of Thickening the Sludge from the Wastewater Treatment by a Rotor with Bars
Authors: Victorita Radulescu
Abstract:
Introduction: In their second stage, sewage treatment plants consist of tanks whose main purpose is to form suspensions with the highest possible solid concentration. The paper presents a solution that rapidly concentrates slurry and sludge, with the main purpose of minimizing the size of the tanks as much as possible. The solution is based on a rotor with bars, tested in two different areas of industrial activity: the remediation of wastewater from the oil industry and, in the last year, the mining industry. Basic Methods: A thickening system with vertical bars was designed, built, and tested; it reduces the sludge moisture content from 94% to 87%. The design was based on the hypothesis that the streamlines of the vortices detached from the rotor with vertical bars accelerate, under certain conditions, the sludge thickening. The sludge is moved to the lateral sides and, in time, becomes sediment. The vortices formed with a vertical axis in the viscous fluid, under the action of the lift, drag, weight, and inertia forces, contribute to a rapid aggregation of the particles, thus accelerating the sludge concentration. There is an interdependence between the Reynolds (Re) number of the vortex flow induced by the vertical bars and the extent of the hydraulic compaction phenomenon, which results from an accelerated sedimentation process; the rotor’s dimensions are therefore designed according to the physico-chemical characteristics of the resulting sludge. Major Findings/Results: Based on the experimental measurements, a numerical simulation of the hydraulic rotor was performed so as to ensure the necessary vortices. The experimental measurements were performed to determine the optimal height and density of the bars for the sludge thickening system, so as to keep the tank dimensions as small as possible. The thickening/settling time was reduced by 24% compared to conventionally used systems.
At present, thickeners are intended to shorten the intermediate stage of water treatment, using primary and secondary settling, but they take quite a long time, of the order of 10-15 hours. With this system, there are no intermediate steps; the thickening is done automatically when the vortices are created. Conclusions: The experimental tests were carried out in the wastewater treatment plant of the oil refinery at Brazi, near the city of Ploiesti. The results prove the system’s efficiency in reducing the time for compacting the sludge and in lowering the humidity of the evacuated sediments. The use of this equipment has now been extended, and it is being tested in the mining industry, with significant results, at the Lupeni mine in the Jiu Valley.
Keywords: experimental tests, hydrodynamic modeling, rotor efficiency, wastewater treatment
Procedia PDF Downloads 118
24492 Experimental Study on Granulated Steel Slag as an Alternative to River Sand
Authors: K. Raghu, M. N. Vathhsala, Naveen Aradya, Sharth
Abstract:
River sand is the most preferred fine aggregate for mortar and concrete. It is a product of the natural weathering of rocks over millions of years and is mined from river beds. Sand mining has disastrous environmental consequences: the excessive mining of river beds is creating an ecological imbalance, which has led the ministry of environment to impose restrictions on sand mining. Driven by the acute need for sand, stone dust or manufactured sand prepared by crushing and screening coarse aggregate has been used as sand in the recent past. However, manufactured sand is also a natural material and has quarrying and quality issues. To reduce the burden on the environment, alternative materials for use as fine aggregate are being extensively investigated all over the world. Considering the quantities required, quality, and properties, there has been a global consensus on one material: granulated slag. Granulated slag has been proven to be a suitable material for replacing natural sand and crushed fine aggregate. In developed countries, the use of granulated slag as fine aggregate to replace natural sand is well established and is in regular practice. In the present paper, granulated slag is tested for use in mortar. Slags are the main by-products generated during iron and steel production in the steel industry. Over the past decades, steel production has increased and, consequently, so have the volumes of by-products and residues generated, which has driven the reuse of these materials in an increasingly efficient way. In recent years, new technologies have been developed to improve the recovery rates of slags. Increased slag recovery and use in different fields of application, such as cement making, construction, and fertilizers, help in preserving natural resources.
In addition to protecting the environment, these practices produce economic benefits by providing sustainable solutions that can allow the steel industry to achieve its ambitious target of “zero waste” in the coming years. Slags are generated at two different stages of steel production, iron making and steel making, and are known as BF (blast furnace) slag and steel slag, respectively. Slagging agents or fluxes, such as limestone, dolomite, and quartzite, are added to BF or steel making furnaces in order to remove impurities from ore, scrap, and other ferrous charges during smelting. Slag formation is the result of a complex series of physical and chemical reactions between the non-metallic charge (limestone, dolomite, fluxes), the energy sources (coal, coke, oxygen, etc.), and refractory materials. Because of the high temperatures (about 1500 °C) during their generation, slags do not contain any organic substances. Because slags are lighter than the liquid metal, they float and are easily removed. The slags protect the metal bath from the atmosphere and maintain temperature through a kind of liquid formation. These slags are in a liquid state and are solidified in air after being dumped in a pit, or granulated by impinging water systems. Generally, BF slags are granulated and used in cement making due to their highly cementitious properties, while steel slags are mostly dumped due to unfavourable physico-chemical conditions. The increasing dumping of steel slag not only occupies a great deal of land but also wastes resources and can potentially affect the environment through water pollution. BF slag contains little Fe and can be used directly. It has found wide application, such as in cement production, road construction, civil engineering work, fertilizer production, landfill daily cover, and soil reclamation, prior to its application outside the iron and steel making process.
Keywords: steel slag, river sand, granulated slag, environmental
Procedia PDF Downloads 244
24491 Towards Achieving Energy Efficiency in Kazakhstan
Authors: Aigerim Uyzbayeva, Valeriya Tyo, Nurlan Ibrayev
Abstract:
Kazakhstan is currently one of the dynamically developing states in its region. Stable growth in all sectors of the economy leads to a corresponding increase in energy consumption. The country consumes a significant amount of energy due to its high level of industrialisation and the presence of energy-intensive industries such as mining and metallurgy, which in turn leads to low energy efficiency. In view of this, the Government has set several priorities for the transition of the Republic of Kazakhstan to a “green economy”. This article provides an overview of Kazakhstan’s energy efficiency situation for the period 1991-2014. First, the dynamics of production and consumption of conventional energy resources are given. Second, the potential of renewable energy sources is summarised, followed by a description of GHG emission trends in the country. Third, Kazakhstan’s national initiatives, policies, and locally implemented projects in the field of energy efficiency are described.
Keywords: energy efficiency in Kazakhstan, greenhouse gases, renewable energy, sustainable development
Procedia PDF Downloads 583
24490 Anomaly Detection Based on System Log Data
Authors: M. Kamel, A. Hoayek, M. Batton-Hubert
Abstract:
With the increase of network virtualization and the disparity of vendors, the continuous monitoring and detection of anomalies cannot rely on static rules. An advanced analytical methodology is needed to discriminate between ordinary events and unusual anomalies. In this paper, we focus on log data (textual data), which is a crucial source of information for network performance. We then introduce an algorithm, used as a pipeline, to help with the pretreatment of such data, group it into patterns, and dynamically label each pattern as an anomaly or not. Such tools will provide users and experts with a continuous real-time log monitoring capability to detect anomalies and failures in the underlying system that can affect performance. An application to real-world data illustrates the algorithm.
Keywords: logs, anomaly detection, ML, scoring, NLP
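The pipeline described in this abstract (pretreat log lines, group them into patterns, label each pattern) can be sketched roughly as follows. This is only an illustration under assumptions that are not in the abstract: the masking rule in `to_pattern`, the frequency-based labelling, and the `max_share` threshold are all hypothetical stand-ins for the authors' scoring method.

```python
import re
from collections import Counter


def to_pattern(line: str) -> str:
    """Pretreatment: mask numeric tokens (counters, ports, IP addresses)
    so that log lines differing only in such values share one pattern."""
    return re.sub(r"\b\d+(?:\.\d+)*\b", "<NUM>", line).strip()


def label_anomalies(lines, max_share=0.05):
    """Group lines by pattern and flag lines whose pattern is rare.

    A pattern is labelled anomalous when it accounts for at most
    `max_share` of all lines (an assumed rule, not the paper's score).
    Returns a list of (line, is_anomaly) pairs.
    """
    patterns = [to_pattern(line) for line in lines]
    counts = Counter(patterns)
    total = len(patterns)
    return [
        (line, counts[pattern] / total <= max_share)
        for line, pattern in zip(lines, patterns)
    ]


if __name__ == "__main__":
    logs = [f"Connection from 10.0.0.{i} port {5000 + i}" for i in range(30)]
    logs.append("Kernel panic at 12")
    for line, is_anomaly in label_anomalies(logs):
        if is_anomaly:
            print("ANOMALY:", line)
```

Recomputing the counts over a sliding window of recent lines would make the labels change over time, which is one simple way to read "dynamically label".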
Procedia PDF Downloads 94
24489 Review for Identifying Online Opinion Leaders
Authors: Yu Wang
Abstract:
Nowadays, the Internet enables its users to share information online and to interact with others. Faced with a vast amount of information, these Internet users become confused and begin to rely on opinion leaders’ recommendations. Online opinion leaders are individuals who have professional knowledge, who use online channels to spread word-of-mouth information, and who can affect the attitudes or even the behavior of their followers to some degree. Because using online opinion leaders is seen as an important way to influence potential consumers, how to identify them has become one of the hottest topics in the field. Hence, in this article, the concepts and characteristics of online opinion leaders are introduced, and the research related to identifying opinion leaders is collected and divided into three categories. Finally, implications for future studies are provided.
Keywords: online opinion leaders, user attributes analysis, text mining analysis, network structure analysis
Procedia PDF Downloads 223
24488 EnumTree: An Enumerative Biclustering Algorithm for DNA Microarray Data
Authors: Haifa Ben Saber, Mourad Elloumi
Abstract:
In a number of domains, such as DNA microarray data analysis, we need to cluster rows (genes) and columns (conditions) of a data matrix simultaneously, to identify groups of constant rows with a group of columns. This kind of clustering is called biclustering. Biclustering algorithms are extensively used in DNA microarray data analysis, and more effective biclustering algorithms are highly desirable. We introduce a new algorithm, called Enumerative Tree (EnumTree), for biclustering of binary microarray data. EnumTree adopts the approach of enumerating biclusters and extracts all biclusters of consistently good quality. The main idea of EnumTree is the construction of a new tree structure to adequately represent the different biclusters discovered during the enumeration process. The algorithm adopts the strategy of discovering all biclusters at a time. The performance of the proposed algorithm is assessed using both synthetic and real DNA microarray data; our algorithm outperforms other biclustering algorithms for binary microarray data. Biclusters with different numbers of rows. Moreover, we test the biological significance using a gene annotation web tool to show that our proposed method is able to produce biologically relevant biclusters.
Keywords: DNA microarray, biclustering, gene expression data, tree, data mining
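To make the idea of enumerating biclusters in a binary matrix concrete, here is a brute-force sketch. It is not the EnumTree algorithm (the abstract does not give its tree construction); it assumes that a bicluster is a maximal all-ones submatrix, and the function name and the `min_rows`/`min_cols` parameters are illustrative only.

```python
from itertools import combinations


def enumerate_biclusters(matrix, min_rows=2, min_cols=2):
    """Enumerate maximal all-ones biclusters of a binary matrix.

    For every column subset, collect the rows that are 1 in all of its
    columns, then extend the column set to every column shared by those
    rows. Exponential in the number of columns, so only for tiny inputs.
    Returns a sorted list of (row_indices, column_indices) tuples.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    found = set()
    for size in range(min_cols, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            rows = tuple(
                r for r in range(n_rows) if all(matrix[r][c] for c in cols)
            )
            if len(rows) < min_rows:
                continue
            closed_cols = tuple(
                c for c in range(n_cols) if all(matrix[r][c] for r in rows)
            )
            found.add((rows, closed_cols))
    return sorted(found)


if __name__ == "__main__":
    genes_by_conditions = [
        [1, 1, 0, 0],
        [1, 1, 1, 0],
        [0, 1, 1, 1],
        [0, 1, 1, 1],
    ]
    for rows, cols in enumerate_biclusters(genes_by_conditions):
        print("genes", rows, "conditions", cols)
```

On this small matrix the sketch reports three biclusters: genes (0, 1) on conditions (0, 1), genes (1, 2, 3) on conditions (1, 2), and genes (2, 3) on conditions (1, 2, 3).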
Procedia PDF Downloads 372
24487 The Impact of Financial Reporting on Sustainability
Authors: Lynn Ruggieri
Abstract:
The worldwide pandemic has only increased sustainability awareness. The public is demanding that businesses be held accountable for their impact on the environment. While financial data enjoys uniformity in reporting requirements, there are no uniform reporting requirements for non-financial data. Europe is leading the way with some standards being implemented for reporting non-financial sustainability data; however, there is no uniformity globally. And without uniformity, there is not a clear understanding of what information to include and how to disclose it. Sustainability reporting will provide important information to stakeholders and will enable businesses to understand their impact on the environment. Therefore, there is a crucial need for this data. This paper looks at the history of sustainability reporting in the countries of the European Union and throughout the world and makes a case for worldwide reporting requirements for sustainability.
Keywords: financial reporting, non-financial data, sustainability, global financial reporting
Procedia PDF Downloads 178
24486 Methods and Algorithms of Ensuring Data Privacy in AI-Based Healthcare Systems and Technologies
Authors: Omar Farshad Jeelani, Makaire Njie, Viktoriia M. Korzhuk
Abstract:
Recently, the application of AI-powered algorithms in healthcare has continued to flourish. In particular, access to healthcare information, including patient health history, diagnostic data, and PII (Personally Identifiable Information), is paramount in the delivery of efficient patient outcomes. However, as the exchange of healthcare information between patients and healthcare providers through AI-powered solutions increases, protecting a person’s information and privacy has become even more important. Arguably, the increased adoption of healthcare AI has resulted in a significant focus on security risks and on measures to protect the security and privacy of healthcare data, leading to escalated analyses and enforcement. Since these challenges are brought about by the use of AI-based healthcare solutions to manage healthcare data, AI-based data protection measures are used to resolve the underlying problems. Consequently, this project proposes AI-powered safeguards and policies/laws to protect the privacy of healthcare data. The project presents the best-in-class techniques used to preserve the data privacy of AI-powered healthcare applications. Popular privacy-protecting methods such as federated learning, cryptographic techniques, differential privacy methods, and hybrid methods are discussed, together with potential cyber threats, data security concerns, and prospects. The project also discusses some of the relevant data security acts/laws that govern the collection, storage, and processing of healthcare data to guarantee that owners’ privacy is preserved. This inquiry discusses various gaps and uncertainties associated with healthcare AI data collection procedures and identifies potential correction/mitigation measures.
Keywords: data privacy, artificial intelligence (AI), healthcare AI, data sharing, healthcare organizations (HCOs)
Procedia PDF Downloads 93
24485 Mapping Tunnelling Parameters for Global Optimization in Big Data via Dye Laser Simulation
Authors: Sahil Imtiyaz
Abstract:
One of the biggest challenges has emerged from the ever-expanding, dynamic, and instantaneously changing space of Big Data, and to find a data point in this space and derive wisdom from it is a hard task. In this paper, we reduce the space of big data using a Hamiltonian formalism that is in concordance with the Ising model. For this formulation, we simulate the system using a dye laser in FORTRAN and analyse the dynamics of the data point in the energy well of a rhodium atom. After mapping the photon intensity and pulse width to energy and potential, we conclude that as the energy increases, the probability of tunnelling also increases up to some point, after which it starts decreasing and then shows randomizing behaviour. This is due to decoherence with the environment, and hence there is a loss of ‘quantumness’. This interprets the efficiency parameter and the extent of quantum evolution. The results are strongly encouraging in favour of the use of ‘Topological Property’ as a source of information instead of the qubit.
Keywords: big data, optimization, quantum evolution, hamiltonian, dye laser, fermionic computations
Procedia PDF Downloads 194
24484 Applying Different Stenography Techniques in Cloud Computing Technology to Improve Cloud Data Privacy and Security Issues
Authors: Muhammad Muhammad Suleiman
Abstract:
Cloud computing is a versatile concept that refers to a service that allows users to outsource their data without having to worry about local storage issues. However, the most pressing issue to be addressed is maintaining a secure and reliable data repository rather than relying on untrustworthy service providers. In this study, we look at how steganography approaches, in combination with digital watermarking, can greatly improve the system’s effectiveness and data security when used for cloud computing. The main requirement of such frameworks, where data is transferred or exchanged between servers and users, is safe data management in cloud environments. Steganography in the cloud is among the most effective methods for safe communication. Steganography is a method of writing coded messages in such a way that only the sender and recipient can safely interpret and display the information hidden in the communication channel. This study presents a new text steganography method for hiding a secret English text file in a cover English text file to ensure data protection in cloud computing. Data protection, data hiding capability, and time were all improved using the proposed technique.
Keywords: cloud computing, steganography, information hiding, cloud storage, security
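As a rough illustration of text steganography (hiding one text inside a cover text), the sketch below encodes the secret as invisible zero-width characters appended to the cover. This is a generic, well-known trick chosen for brevity, not the method proposed in the paper, which the abstract does not describe; the function names are hypothetical, and the cover text is assumed to contain no zero-width characters of its own.

```python
ZERO = "\u200b"  # zero-width space, stands for bit 0
ONE = "\u200c"   # zero-width non-joiner, stands for bit 1


def hide(cover: str, secret: str) -> str:
    """Append the secret to the cover as a run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    payload = "".join(ONE if bit == "1" else ZERO for bit in bits)
    return cover + payload


def reveal(stego: str) -> str:
    """Recover the secret by reading back the zero-width characters."""
    bits = "".join(
        "1" if ch == ONE else "0" for ch in stego if ch in (ZERO, ONE)
    )
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    cover_text = "The quarterly report is attached."
    stego_text = hide(cover_text, "meet at 9")
    print(stego_text)          # looks like the cover text when rendered
    print(reveal(stego_text))  # the hidden message
```

A scheme like this hides the existence of the message but does not protect its content; in practice the secret would normally be encrypted before being embedded.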
Procedia PDF Downloads 192
24483 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics
Authors: Farhad Asadi, Mohammad Javad Mollakazemi
Abstract:
In this paper, Bayesian online inference for models of data series is performed with a change-point algorithm, which separates the observed time series into independent series and studies the change and variation of the regime of the data together with the related statistical characteristics. Variation in the statistical characteristics of time series data often represents distinct phenomena in a dynamical system, such as a change in the state of brain dynamics reflected in EEG signal measurements, or a change in an important regime of the data in many dynamical systems. In this paper, a prediction algorithm for studying change-point locations in time series data is simulated. It is verified that the pattern of the proposed distribution of the data is an important factor in obtaining simpler and smoother fluctuation of the hazard rate parameter and in better identifying change-point locations. Finally, the conditions under which the time series distribution affects the factors in this approach are explained and validated with different time series databases for several dynamical systems.
Keywords: time series, fluctuation in statistical characteristics, optimal learning, change-point algorithm
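The abstract refers to Bayesian online change-point inference with a hazard rate. A minimal sketch of that general technique is given below, assuming a constant hazard, Gaussian observations with known noise variance, and a conjugate normal prior on the regime mean. These modelling choices, the parameter defaults, and the function names are assumptions made for illustration; the paper's own model is not specified in the abstract.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_run_lengths(data, hazard=0.02, mu0=0.0, var0=100.0, noise_var=1.0):
    """Online run-length filtering with a constant hazard rate.

    Keeps a probability for every possible run length (time since the
    last change point) and returns the most probable run length after
    each observation.
    """
    run_probs = [1.0]                 # run_probs[r] = P(run length = r)
    means, variances = [mu0], [var0]  # posterior of the regime mean per r
    map_runs = []
    for x in data:
        # Predictive probability of x under each run-length hypothesis.
        pred = [normal_pdf(x, m, v + noise_var) for m, v in zip(means, variances)]
        growth = [p * q * (1.0 - hazard) for p, q in zip(run_probs, pred)]
        change = sum(p * q * hazard for p, q in zip(run_probs, pred))
        run_probs = [change] + growth
        total = sum(run_probs)
        run_probs = [p / total for p in run_probs]
        # Conjugate update of the mean; a fresh run restarts from the prior.
        new_means, new_vars = [mu0], [var0]
        for m, v in zip(means, variances):
            precision = 1.0 / v + 1.0 / noise_var
            new_vars.append(1.0 / precision)
            new_means.append((m / v + x / noise_var) / precision)
        means, variances = new_means, new_vars
        map_runs.append(max(range(len(run_probs)), key=run_probs.__getitem__))
    return map_runs


if __name__ == "__main__":
    series = [0.1 * (-1) ** i for i in range(30)]
    series += [10.0 + 0.1 * (-1) ** i for i in range(30)]
    runs = map_run_lengths(series)
    # A sharp drop in the most probable run length marks a change point.
    print(runs)
```

In this toy series the mean jumps from about 0 to about 10 at index 30, so the most probable run length grows steadily and then collapses at the jump before growing again.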
Procedia PDF Downloads 427
24482 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments
Authors: Tahani Aljohani, Jialin Yu, Alexandra. I. Cristea
Abstract:
The more an educational system knows about a learner, the more personalised the interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and such requests are often ignored by learners. Especially in the booming realm of Massive Open Online Course (MOOC) platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics by proposing an approach that uses linguistically motivated deep learning architectures for learner profiling, particularly targeting gender prediction on the FutureLearn MOOC platform. Additionally, we tackle the difficult problem of predicting the gender of learners based only on their comments, which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structure. In this research, rather than considering sentences as plain sequences, we hypothesise that higher-level semantic and syntactic sentence processing based on linguistics will render a richer representation. We thus evaluate the traditional LSTM against other bleeding-edge models that take syntactic structure into account, such as the tree-structured LSTM, the Stack-augmented Parser-Interpreter Neural Network (SPINN), and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on our MOOC dataset, which is the most performant one compared with a public dataset on sentiment analysis that is further used to cross-examine the models’ results.
Keywords: deep learning, data mining, gender prediction, MOOCs
Procedia PDF Downloads 148
24481 Bayesian Borrowing Methods for Count Data: Analysis of Incontinence Episodes in Patients with Overactive Bladder
Authors: Akalu Banbeta, Emmanuel Lesaffre, Reynaldo Martina, Joost Van Rosmalen
Abstract:
Including data from previous studies (historical data) in the analysis of the current study may reduce the sample size requirement and/or increase the power of the analysis. The most common example is incorporating historical control data in the analysis of a current clinical trial. However, this only applies when the historical control data are similar enough to the current control data. Recently, several Bayesian approaches for incorporating historical data have been proposed, such as the meta-analytic-predictive (MAP) prior and the modified power prior (MPP), for both single and multiple historical control arms. Here, we examine the performance of the MAP and the MPP approaches for the analysis of (over-dispersed) count data. To this end, we propose a computational method for the MPP approach for the Poisson and the negative binomial models. We conducted an extensive simulation study to assess the performance of the Bayesian approaches. Additionally, we illustrate our approaches on an overactive bladder data set. For similar data across the control arms, the MPP approach outperformed the MAP approach with respect to statistical power. When the means across the control arms are different, the MPP yielded a slightly inflated type I error (TIE) rate, whereas the MAP did not. In contrast, when the dispersion parameters are different, the MAP gave an inflated TIE rate, whereas the MPP did not. We conclude that the MPP approach is more promising than the MAP approach for incorporating historical count data.
Keywords: count data, meta-analytic prior, negative binomial, Poisson
Procedia PDF Downloads 118
24480 Strategic Citizen Participation in Applied Planning Investigations: How Planners Use Etic and Emic Community Input Perspectives to Fill-in the Gaps in Their Analysis
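For readers unfamiliar with the MPP, the block below gives the textbook form of the modified power prior for a single historical control arm, written out for a Poisson model. It is a sketch only: the abstract does not state the authors' exact formulation, and the symbols ($\theta$, $\delta$, $D_0$, $\pi_0$, $y_{0i}$, $n_0$) are notation assumed here for illustration.

```latex
% Modified power prior: the historical likelihood is raised to a power
% \delta \in [0, 1] and normalised, so that \delta can be given its own prior.
\[
  \pi(\theta, \delta \mid D_0)
    = \frac{L(\theta \mid D_0)^{\delta}\, \pi_0(\theta)}{C(\delta)}\; \pi(\delta),
  \qquad
  C(\delta) = \int L(\theta \mid D_0)^{\delta}\, \pi_0(\theta)\, d\theta .
\]
% Poisson likelihood of the historical counts y_{01}, \dots, y_{0 n_0}
% with rate \theta:
\[
  L(\theta \mid D_0) = \prod_{i=1}^{n_0} \frac{e^{-\theta}\, \theta^{y_{0i}}}{y_{0i}!} .
\]
```

With $\delta = 0$ the historical data are ignored, and with $\delta = 1$ they are pooled fully with the current control data; intermediate values discount the historical information.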
Authors: John Gaber
Abstract:
Planners regularly use citizen input as empirical data to help them better understand community issues they know very little about. This type of community data is based on the lived experiences of local residents and is known as "emic" data. What is becoming more common practice for planners is their use of data from local experts and stakeholders (known as "etic" data or the outsider perspective) to help them fill in the gaps in their analysis of applied planning research projects. Utilizing international Health Impact Assessment (HIA) data, I look at who planners invite to their citizen input investigations. Research presented in this paper shows that planners access a wide range of emic and etic community perspectives in their search for the “community’s view.” The paper concludes with how planners can chart out a new empirical path in their execution of emic/etic citizen participation strategies in their applied planning research projects.
Keywords: citizen participation, emic data, etic data, Health Impact Assessment (HIA)
Procedia PDF Downloads 484
24479 Data Augmentation for Automatic Graphical User Interface Generation Based on Generative Adversarial Network
Authors: Xulu Yao, Moi Hoon Yap, Yanlong Zhang
Abstract:
As a branch of artificial neural networks, deep learning is widely used in the field of image recognition, but a lack of data leads to imperfect model learning. By analysing the data scale requirements of deep learning with a view to its application in GUI generation, it is found that collecting a GUI dataset is a time-consuming and labour-intensive task, which makes it difficult to meet the needs of current deep learning networks. To solve this problem, this paper proposes a semi-supervised deep learning model that relies on the original small-scale datasets to produce a large amount of reliable data. By combining a recurrent neural network with a generative adversarial network (GAN), the recurrent neural network can learn the sequence relationships and characteristics of the data so that the GAN generates reasonable data, thereby expanding the Rico dataset. Relying on the network structure, the characteristics of the collected data can be analysed well, and a large amount of reasonable data can be generated according to these characteristics. After data processing, a reliable dataset for model training can be formed, which alleviates the problem of dataset shortage in deep learning.
Keywords: GUI, deep learning, GAN, data augmentation
Procedia PDF Downloads 184
24478 Modelling Rainfall-Induced Shallow Landslides in the Northern New South Wales
Authors: S. Ravindran, Y. Liu, I. Gratchev, D. Jeng
Abstract:
Rainfall-induced shallow landslides are common in northern New South Wales (NSW), Australia. From 2009 to 2017, around 105 rainfall-induced landslides occurred along the road corridors and caused temporary road closures in northern NSW. The rainfall causing shallow landslides follows different distributions, ranging from uniform and normal to decreasing and increasing rainfall intensity. The duration of rainfall varied from one day to 18 days according to historical data. The objective of this research is to analyse the slope instability of some of the sites in northern NSW by varying the cumulative rainfall using SLOPE/W and SEEP/W, and to compare the results with field data on rainfall causing shallow landslides. The rainfall data and topographical data from public authorities and soil data obtained from laboratory tests will be used for this modelling. According to the field data, shallow landslides are likely if the cumulative rainfall is between 100 mm and 400 mm.
Keywords: landslides, modelling, rainfall, suction
Procedia PDF Downloads 180
24477 Machine Learning-Enabled Classification of Climbing Using Small Data
Authors: Nicholas Milburn, Yu Liang, Dalei Wu
Abstract:
Athlete performance scoring within the climbing domain presents interesting challenges, as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable, as it can be used to mark progress while training and can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and temporal heterogeneity. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross-validation strategies. The investigated solutions to the classification problem included the lightweight machine learning classifiers KNN and SVM as well as deep learning with a CNN. The best-performing model had an accuracy of 80%. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.
Keywords: classification, climbing, data imbalance, data scarcity, machine learning, time sequence
Procedia PDF Downloads 143
24476 Exploring Environmental, Social, and Governance (ESG) Standards for Space Exploration
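The preprocessing steps named in this abstract (temporal normalization, conversion to the spectral domain, then a KNN classifier) can be sketched roughly as follows. This is an illustration using only the Python standard library, not the authors' pipeline: the function names, the resampling length, and the choice of linear interpolation are all assumptions made here.

```python
import cmath
import math
from collections import Counter


def resample(series, n):
    """Temporal normalization (assumed form): linearly interpolate a
    recording of any length onto n >= 2 evenly spaced points."""
    if len(series) == 1:
        return [float(series[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(series) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(series) - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


def spectrum(series):
    """Spectral features: scaled DFT magnitudes for frequencies 0..n/2."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(series))) / n
        for k in range(n // 2 + 1)
    ]


def features(series, n=16):
    """Hypothetical feature extractor: fixed-length resampling, then spectrum."""
    return spectrum(resample(series, n))


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest (feature_vector, label) pairs,
    using Euclidean distance."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

Because every recording is resampled to the same length before the transform, force traces of different durations yield feature vectors of equal size, which is what a distance-based classifier such as KNN requires.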
Authors: Rachael Sullivan, Joshua Berman
Abstract:
The number of satellites orbiting Earth is now in the thousands. Commercial launches are increasing, and civilians are venturing into the outer reaches of the atmosphere. As the space industry continues to grow and evolve, so too will the demand on resources, the disparities amongst socio-economic groups, and space company governance standards. Beyond ensuring that space operations are compliant with government regulations, export controls, and international sanctions, companies should also keep in mind the impact their operations will have on society and the environment. Those looking to expand their operations into outer space should remain mindful of both the opportunities and challenges that they could encounter along the way. From commercial launches promoting civilian space travel (like the recent launches from Blue Origin, Virgin Galactic, and SpaceX) to regulatory and policy shifts, the commercial landscape beyond the Earth's atmosphere is evolving. But practices will also have to become sustainable. Through a review and analysis of space industry trends, international government regulations, and empirical data, this research explores how Environmental, Social, and Governance (ESG) reporting and investing will manifest within a fast-changing space industry. Institutions, regulators, investors, and employees are increasingly relying on ESG. Those working in the space industry will be no exception.
Companies (or investors) that are already engaging or plan to engage in space operations should consider 1) environmental standards and objectives when tackling space debris and space mining, 2) social standards and objectives when considering how such practices may impact access and opportunities for different socioeconomic groups to the benefits of space exploration, and 3) how decision-making and governing boards will function ethically, equitably, and sustainably as we chart new paths and encounter novel challenges in outer space.
Keywords: climate, environment, ESG, law, outer space, regulation
Procedia PDF Downloads 152
24475 Analysis of Expression Data Using Unsupervised Techniques
Authors: M. A. I. Perera, C. R. Wijesinghe, A. R. Weerasinghe
Abstract:
This study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to better identify subtypes of tumors. Identifying subtypes of cancer helps in improving the efficacy and reducing the toxicity of treatments by providing clues for finding targeted therapeutics. The process of gene expression data analysis is described in three steps: preprocessing, clustering, and cluster validation. Feature selection is important since genomic data are high dimensional, with a large number of features compared to samples. Hierarchical clustering and K-means are often used in the analysis of gene expression data. Several cluster validation techniques are used to validate the clusters. Heatmaps are an effective external validation method that allows comparing the identified classes with clinical variables and visual analysis of the classes.
Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation
Procedia PDF Downloads 149
24474 Bidirectional Encoder Representations from Transformers Sentiment Analysis Applied to Three Presidential Pre-Candidates in Costa Rica
Authors: Félix David Suárez Bonilla
Abstract:
A sentiment analysis service to detect polarity (positive, neutral, and negative), based on transfer learning, was built using a Spanish version of BERT and applied to tweets written in Spanish. The dataset used consisted of 11975 reviews, which were extracted from Google Play using the google-play-scraper package. The BETO model was trained using the AdamW optimizer, a batch size of 16, a learning rate of 2×10⁻⁵, and 10 epochs. The system was tested using tweets of three presidential pre-candidates from Costa Rica. The system was finally validated using human-labeled examples, achieving an accuracy of 83.3%.
Keywords: NLP, transfer learning, BERT, sentiment analysis, social media, opinion mining
Procedia PDF Downloads 174
24473 Phytomining for Rare Earth Elements: A Comparative Life Cycle Assessment
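To give a sense of scale for the training setup reported above, the sketch below works out how many optimizer steps those hyperparameters imply. The dataset size, batch size, learning rate, epoch count, and optimizer name come from the abstract; the `FineTuneConfig` class is hypothetical, and the calculation assumes that all 11975 reviews are used for training and that the final partial batch is kept, neither of which the abstract confirms.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters quoted in the abstract."""
    num_examples: int = 11975
    batch_size: int = 16
    learning_rate: float = 2e-5
    epochs: int = 10
    optimizer: str = "AdamW"

    def steps_per_epoch(self) -> int:
        # Assumption: no held-out split, and the last partial batch is not dropped.
        return math.ceil(self.num_examples / self.batch_size)

    def total_steps(self) -> int:
        return self.steps_per_epoch() * self.epochs


config = FineTuneConfig()
print(config.steps_per_epoch(), config.total_steps())  # prints: 749 7490
```

If part of the data were held out for validation, the step counts would be correspondingly lower.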
Authors: Mohsen Rabbani, Trista McLaughlin, Ehsan Vahidi
Abstract:
The remediation of sites polluted with heavy metals, such as rare earth elements (REEs), has been a primary concern of researchers seeking to decontaminate the soil. Among all the methods developed to address this concern, phytoremediation has been established as an efficient, cost-effective, easy-to-use, and environmentally friendly approach, providing a long-term solution to this global concern. Furthermore, this technology has another great potential application in the metals production sector through recovering metals buried in soil via metal cropping. Considering the significant metal concentration in hyperaccumulators, the extraction of bioaccumulated metals from plant matter has been proposed as a sub-economic area called phytomining. As a recent, more advanced technology to eliminate such pollutants from the soil and produce critical metals, bioharvesting (phytomining/agromining) has been considered another promising way to produce metals and meet the global demand for critical/target metals. The bio-ore obtained from phytomining can be safely disposed of or introduced to metal production pathways to obtain the most demanded metals, such as REEs. It is well known that some hyperaccumulators, e.g., the fern Dicranopteris linearis, can be used to absorb REEs from polluted soils and accumulate them in plant organs, such as leaves and stems. After soil remediation, the plant species can be harvested and introduced to the downstream steps, namely crushing/grinding, leaching, and purification processes, to extract REEs from plant matter. This novel interdisciplinary field can fill the gap between agriculture, mining, metallurgy, and the environment. Despite the advantages of agromining for the REE production industry, key issues related to the environmental sustainability of the entire life cycle of this new concept have not been assessed yet.
Hence, a comparative life cycle assessment (LCA) study was conducted to quantify the environmental footprints of REE phytomining. The current LCA study aims to estimate the environmental effects associated with phytomining by considering critical factors, such as climate change, land use, and ozone depletion. The results revealed that phytomining is an easy-to-use and environmentally sustainable approach to either eliminate REEs from polluted sites or produce REEs, offering a new source for the production of such metals. This LCA research provides guidelines for researchers active in developing a reliable relationship between agriculture, mining, metallurgy, and the environment to counter soil pollution and keep the earth green and clean.
Keywords: phytoremediation, phytomining, life cycle assessment, environmental impacts, rare earth elements, hyperaccumulator
Procedia PDF Downloads 68
24472 Learning Analytics in a HiFlex Learning Environment
Authors: Matthew Montebello
Abstract:
Student engagement within a virtual learning environment generates masses of data points that can significantly contribute to the learning analytics that lead to decision support. Ideally, similar data is collected during student interaction with a physical learning space, and as a consequence, data is present at a large scale, even in relatively small classes. In this paper, we report on such an occurrence during classes held in a HiFlex modality as we investigate the advantages of adopting such a methodology. We plan to take full advantage of the learner-generated data in an attempt to further enhance the effectiveness of the adopted learning environment. This could shed crucial light on operating modalities that higher education institutions around the world will switch to in a post-COVID era.
Keywords: HiFlex, big data in higher education, learning analytics, virtual learning environment
Procedia PDF Downloads 201