Search results for: Lanting Xia
3 A Generic Middleware to Instantly Sync Intensive Writes of Heterogeneous Massive Data via Internet
Authors: Haitao Yang, Zhenjiang Ruan, Fei Xu, Lanting Xia
Abstract:
Industry data centers often need to sync data changes reliably and instantly from a large-scale of heterogeneous autonomous relational databases accessed via the not-so-reliable Internet, for which a practical universal sync middle of low maintenance and operation costs is most wanted, but developing such a product and adapting it for various scenarios are a very sophisticated and continuous practice. The authors have been devising, applying, and optimizing a generic sync middleware system, named GSMS since 2006, holding the principles or advantages that the middleware must be SyncML-compliant and transparent to data application layer logic, need not refer to implementation details of databases synced, does not rely on host computer operating systems deployed, and its construction is light weighted and hence, of low cost. A series of ultimate experiments with GSMS sync performance were conducted for a persuasive example of a source relational database that underwent a broad range of write loads, say, from one thousand to one million intensive writes within a few minutes. The tests proved that GSMS has achieved an instant sync level of well below a fraction of millisecond per record sync, and GSMS’ smooth performances under ultimate write loads also showed it is feasible and competent.Keywords: heterogeneous massive data, instantly sync intensive writes, Internet generic middleware design, optimization
Procedia PDF Downloads 1202 A Model for Predicting Organic Compounds Concentration Change in Water Associated with Horizontal Hydraulic Fracturing
Authors: Ma Lanting, S. Eguilior, A. Hurtado, Juan F. Llamas Borrajo
Abstract:
Horizontal hydraulic fracturing is a technology to increase natural gas flow and improve productivity in the low permeability formation. During this drilling operation tons of flowback and produced water which contains many organic compounds return to the surface with a potential risk of influencing the surrounding environment and human health. A mathematical model is urgently needed to represent organic compounds in water transportation process behavior and the concentration change with time throughout the hydraulic fracturing operation life cycle. A comprehensive model combined Organic Matter Transport Dynamic Model with Two-Compartment First-order Model Constant (TFRC) Model has been established to quantify the organic compounds concentration. This algorithm model is composed of two transportation parts based on time factor. For the fast part, the curve fitting technique is applied using flowback water data from the Marcellus shale gas site fracturing and the coefficients of determination (R2) from all analyzed compounds demonstrate a high experimental feasibility of this numerical model. Furthermore, along a decade of drilling the concentration ratio curves have been estimated by the slow part of this model. The result shows that the larger value of Koc in chemicals, the later maximum concentration in water will reach, as well as all the maximum concentrations percentage would reach up to 90% of initial concentration from shale formation within a long sufficient period.Keywords: model, shale gas, concentration, organic compounds
Procedia PDF Downloads 2241 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform
Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu
Abstract:
Given a fixed fund, purchasing fewer hosts of higher capability or inversely more of lower capability is a must-be-made trade-off in practices for building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing is with SQL queries of aggregate, join, and space-time condition selections executed upon massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of host clusters potential for the intended typical big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem HDFS+Hive+Spark. A proper metric was raised to measure the performance of Hadoop clusters in HBDP, which was tested and compared with its predicted counterpart, on executing three kinds of typical SQL query tasks. Tests were conducted with respect to factors of CPU benchmark, memory size, virtual host division, and the number of element physical host in cluster. The research has been applied to practical cluster procurement for housing big data computing.Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance predicting formula, typical SQL query tasks
Procedia PDF Downloads 232