Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 25054

Search results for: thick data analytics

24274 Applying Different Stenography Techniques in Cloud Computing Technology to Improve Cloud Data Privacy and Security Issues

Abstract:

Cloud Computing is a versatile concept that refers to a service that allows users to outsource their data without having to worry about local storage issues. However, the most pressing issues to be addressed are maintaining a secure and reliable data repository rather than relying on untrustworthy service providers. In this study, we look at how stenography approaches and collaboration with Digital Watermarking can greatly improve the system's effectiveness and data security when used for Cloud Computing. The main requirement of such frameworks, where data is transferred or exchanged between servers and users, is safe data management in cloud environments. Steganography is the cloud is among the most effective methods for safe communication. Steganography is a method of writing coded messages in such a way that only the sender and recipient can safely interpret and display the information hidden in the communication channel. This study presents a new text steganography method for hiding a loaded hidden English text file in a cover English text file to ensure data protection in cloud computing. Data protection, data hiding capability, and time were all improved using the proposed technique.

Keywords: cloud computing, steganography, information hiding, cloud storage, security

Procedia PDF Downloads 182

24273 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics

Authors: Farhad Asadi, Mohammad Javad Mollakazemi

Abstract:

In this paper, Bayesian online inference in models of data series are constructed by change-points algorithm, which separated the observed time series into independent series and study the change and variation of the regime of the data with related statistical characteristics. variation of statistical characteristics of time series data often represent separated phenomena in the some dynamical system, like a change in state of brain dynamical reﬂected in EEG signal data measurement or a change in important regime of data in many dynamical system. In this paper, prediction algorithm for studying change point location in some time series data is simulated. It is verified that pattern of proposed distribution of data has important factor on simpler and smother fluctuation of hazard rate parameter and also for better identification of change point locations. Finally, the conditions of how the time series distribution effect on factors in this approach are explained and validated with different time series databases for some dynamical system.

Keywords: time series, fluctuation in statistical characteristics, optimal learning, change-point algorithm

Procedia PDF Downloads 418

24272 Determination of the Risks of Heart Attack at the First Stage as Well as Their Control and Resource Planning with the Method of Data Mining

Authors: İbrahi̇m Kara, Seher Arslankaya

Abstract:

Frequently preferred in the field of engineering in particular, data mining has now begun to be used in the field of health as well since the data in the health sector have reached great dimensions. With data mining, it is aimed to reveal models from the great amounts of raw data in agreement with the purpose and to search for the rules and relationships which will enable one to make predictions about the future from the large amount of data set. It helps the decision-maker to find the relationships among the data which form at the stage of decision-making. In this study, it is aimed to determine the risk of heart attack at the first stage, to control it, and to make its resource planning with the method of data mining. Through the early and correct diagnosis of heart attacks, it is aimed to reveal the factors which affect the diseases, to protect health and choose the right treatment methods, to reduce the costs in health expenditures, and to shorten the durations of patients’ stay at hospitals. In this way, the diagnosis and treatment costs of a heart attack will be scrutinized, which will be useful to determine the risk of the disease at the first stage, to control it, and to make its resource planning.

Keywords: data mining, decision support systems, heart attack, health sector

Procedia PDF Downloads 353

24271 Bayesian Borrowing Methods for Count Data: Analysis of Incontinence Episodes in Patients with Overactive Bladder

Authors: Akalu Banbeta, Emmanuel Lesaffre, Reynaldo Martina, Joost Van Rosmalen

Abstract:

Including data from previous studies (historical data) in the analysis of the current study may reduce the sample size requirement and/or increase the power of analysis. The most common example is incorporating historical control data in the analysis of a current clinical trial. However, this only applies when the historical control dataare similar enough to the current control data. Recently, several Bayesian approaches for incorporating historical data have been proposed, such as the meta-analytic-predictive (MAP) prior and the modified power prior (MPP) both for single control as well as for multiple historical control arms. Here, we examine the performance of the MAP and the MPP approaches for the analysis of (over-dispersed) count data. To this end, we propose a computational method for the MPP approach for the Poisson and the negative binomial models. We conducted an extensive simulation study to assess the performance of Bayesian approaches. Additionally, we illustrate our approaches on an overactive bladder data set. For similar data across the control arms, the MPP approach outperformed the MAP approach with respect to thestatistical power. When the means across the control arms are different, the MPP yielded a slightly inflated type I error (TIE) rate, whereas the MAP did not. In contrast, when the dispersion parameters are different, the MAP gave an inflated TIE rate, whereas the MPP did not.We conclude that the MPP approach is more promising than the MAP approach for incorporating historical count data.

Keywords: count data, meta-analytic prior, negative binomial, poisson

Procedia PDF Downloads 112

24270 Strategic Citizen Participation in Applied Planning Investigations: How Planners Use Etic and Emic Community Input Perspectives to Fill-in the Gaps in Their Analysis

Authors: John Gaber

Abstract:

Planners regularly use citizen input as empirical data to help them better understand community issues they know very little about. This type of community data is based on the lived experiences of local residents and is known as "emic" data. What is becoming more common practice for planners is their use of data from local experts and stakeholders (known as "etic" data or the outsider perspective) to help them fill in the gaps in their analysis of applied planning research projects. Utilizing international Health Impact Assessment (HIA) data, I look at who planners invite to their citizen input investigations. Research presented in this paper shows that planners access a wide range of emic and etic community perspectives in their search for the “community’s view.” The paper concludes with how planners can chart out a new empirical path in their execution of emic/etic citizen participation strategies in their applied planning research projects.

Keywords: citizen participation, emic data, etic data, Health Impact Assessment (HIA)

Procedia PDF Downloads 481

24269 Data Augmentation for Automatic Graphical User Interface Generation Based on Generative Adversarial Network

Authors: Xulu Yao, Moi Hoon Yap, Yanlong Zhang

Abstract:

As a branch of artificial neural network, deep learning is widely used in the field of image recognition, but the lack of its dataset leads to imperfect model learning. By analysing the data scale requirements of deep learning and aiming at the application in GUI generation, it is found that the collection of GUI dataset is a time-consuming and labor-consuming project, which is difficult to meet the needs of current deep learning network. To solve this problem, this paper proposes a semi-supervised deep learning model that relies on the original small-scale datasets to produce a large number of reliable data sets. By combining the cyclic neural network with the generated countermeasure network, the cyclic neural network can learn the sequence relationship and characteristics of data, make the generated countermeasure network generate reasonable data, and then expand the Rico dataset. Relying on the network structure, the characteristics of collected data can be well analysed, and a large number of reasonable data can be generated according to these characteristics. After data processing, a reliable dataset for model training can be formed, which alleviates the problem of dataset shortage in deep learning.

Keywords: GUI, deep learning, GAN, data augmentation

Procedia PDF Downloads 176

24268 Modelling Rainfall-Induced Shallow Landslides in the Northern New South Wales

Authors: S. Ravindran, Y.Liu, I. Gratchev, D.Jeng

Abstract:

Rainfall-induced shallow landslides are more common in the northern New South Wales (NSW), Australia. From 2009 to 2017, around 105 rainfall-induced landslides occurred along the road corridors and caused temporary road closures in the northern NSW. Rainfall causing shallow landslides has different distributions of rainfall varying from uniform, normal, decreasing to increasing rainfall intensity. The duration of rainfall varied from one day to 18 days according to historical data. The objective of this research is to analyse slope instability of some of the sites in the northern NSW by varying cumulative rainfall using SLOPE/W and SEEP/W and compare with field data of rainfall causing shallow landslides. The rainfall data and topographical data from public authorities and soil data obtained from laboratory tests will be used for this modelling. There is a likelihood of shallow landslides if the cumulative rainfall is between 100 mm to 400 mm in accordance with field data.

Keywords: landslides, modelling, rainfall, suction

Procedia PDF Downloads 169

24267 Machine Learning-Enabled Classification of Climbing Using Small Data

Authors: Nicholas Milburn, Yu Liang, Dalei Wu

Abstract:

Athlete performance scoring within the climbing do-main presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.

Keywords: classification, climbing, data imbalance, data scarcity, machine learning, time sequence

Procedia PDF Downloads 138

24266 Analysis of Expression Data Using Unsupervised Techniques

Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe

Abstract:

his study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to identify better subtypes of tumors. Identifying subtypes of cancer help in improving the efficacy and reducing the toxicity of the treatments by identifying clues to find target therapeutics. Process of gene expression data analysis described under three steps as preprocessing, clustering, and cluster validation. Feature selection is important since the genomic data are high dimensional with a large number of features compared to samples. Hierarchical clustering and K Means are often used in the analysis of gene expression data. There are several cluster validation techniques used in validating the clusters. Heatmaps are an effective external validation method that allows comparing the identified classes with clinical variables and visual analysis of the classes.

Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation

Procedia PDF Downloads 144

24265 Building a Transformative Continuing Professional Development Experience for Educators through a Principle-Based, Technological-Driven Knowledge Building Approach: A Case Study of a Professional Learning Team in Secondary Education

Authors: Melvin Chan, Chew Lee Teo

Abstract:

There has been a growing emphasis in elevating the teachers’ proficiency and competencies through continuing professional development (CPD) opportunities. In this era of a Volatile, Uncertain, Complex, Ambiguous (VUCA) world, teachers are expected to be collaborative designers, critical thinkers and creative builders. However, many of the CPD structures are still revolving in the model of transmission, which stands in contradiction to the cultivation of future-ready teachers for the innovative world of emerging technologies. This article puts forward the framing of CPD through a Principle-Based, Technological-Driven Knowledge Building Approach grounded in the essence of andragogy and progressive learning theories where growth is best exemplified through an authentic immersion in a social/community experience-based setting. Putting this Knowledge Building Professional Development Model (KBPDM) in operation via a Professional Learning Team (PLT) situated in a Secondary School in Singapore, research findings reveal that the intervention has led to a fundamental change in the learning paradigm of the teachers, henceforth equipping and empowering them successfully in their pedagogical design and practices for a 21^st century classroom experience. This article concludes with the possibility in leveraging the Learning Analytics to deepen the CPD experiences for educators.

Keywords: continual professional development, knowledge building, learning paradigm, principle-based

Procedia PDF Downloads 124

24264 Li-Fi Technology: Data Transmission through Visible Light

Authors: Shahzad Hassan, Kamran Saeed

Abstract:

People are always in search of Wi-Fi hotspots because Internet is a major demand nowadays. But like all other technologies, there is still room for improvement in the Wi-Fi technology with regards to the speed and quality of connectivity. In order to address these aspects, Harald Haas, a professor at the University of Edinburgh, proposed what we know as the Li-Fi (Light Fidelity). Li-Fi is a new technology in the field of wireless communication to provide connectivity within a network environment. It is a two-way mode of wireless communication using light. Basically, the data is transmitted through Light Emitting Diodes which can vary the intensity of light very fast, even faster than the blink of an eye. From the research and experiments conducted so far, it can be said that Li-Fi can increase the speed and reliability of the transfer of data. This paper pays particular attention on the assessment of the performance of this technology. In other words, it is a 5G technology which uses LED as the medium of data transfer. For coverage within the buildings, Wi-Fi is good but Li-Fi can be considered favorable in situations where large amounts of data are to be transferred in areas with electromagnetic interferences. It brings a lot of data related qualities such as efficiency, security as well as large throughputs to the table of wireless communication. All in all, it can be said that Li-Fi is going to be a future phenomenon where the presence of light will mean access to the Internet as well as speedy data transfer.

Keywords: communication, LED, Li-Fi, Wi-Fi

Procedia PDF Downloads 335

24263 An Analysis of Humanitarian Data Management of Polish Non-Governmental Organizations in Ukraine Since February 2022 and Its Relevance for Ukrainian Humanitarian Data Ecosystem

Authors: Renata Kurpiewska-Korbut

Abstract:

Making an assumption that the use and sharing of data generated in humanitarian action constitute a core function of humanitarian organizations, the paper analyzes the position of the largest Polish humanitarian non-governmental organizations in the humanitarian data ecosystem in Ukraine and their approach to non-personal and personal data management since February of 2022. Both expert interviews and document analysis of non-profit organizations providing a direct response in the Ukrainian crisis context, i.e., the Polish Humanitarian Action, Caritas, Polish Medical Mission, Polish Red Cross, and the Polish Center for International Aid and the applicability of theoretical perspective of contingency theory – with its central point that the context or specific set of conditions determining the way of behavior and the choice of methods of action – help to examine the significance of data complexity and adaptive approach to data management by relief organizations in the humanitarian supply chain network. The purpose of this study is to determine how the existence of well-established and accurate internal procedures and good practices of using and sharing data (including safeguards for sensitive data) by the surveyed organizations with comparable human and technological capabilities are implemented and adjusted to Ukrainian humanitarian settings and data infrastructure. The study also poses a fundamental question of whether this crisis experience will have a determining effect on their future performance. The obtained finding indicate that Polish humanitarian organizations in Ukraine, which have their own unique code of conduct and effective managerial data practices determined by contingencies, have limited influence on improving the situational awareness of other assistance providers in the data ecosystem despite their attempts to undertake interagency work in the area of data sharing.

Keywords: humanitarian data ecosystem, humanitarian data management, polish NGOs, Ukraine

Procedia PDF Downloads 88

24262 An Approach for Estimation in Hierarchical Clustered Data Applicable to Rare Diseases

Authors: Daniel C. Bonzo

Abstract:

Practical considerations lead to the use of unit of analysis within subjects, e.g., bleeding episodes or treatment-related adverse events, in rare disease settings. This is coupled with data augmentation techniques such as extrapolation to enlarge the subject base. In general, one can think about extrapolation of data as extending information and conclusions from one estimand to another estimand. This approach induces hierarchichal clustered data with varying cluster sizes. Extrapolation of clinical trial data is being accepted increasingly by regulatory agencies as a means of generating data in diverse situations during drug development process. Under certain circumstances, data can be extrapolated to a different population, a different but related indication, and different but similar product. We consider here the problem of estimation (point and interval) using a mixed-models approach under an extrapolation. It is proposed that estimators (point and interval) be constructed using weighting schemes for the clusters, e.g., equally weighted and with weights proportional to cluster size. Simulated data generated under varying scenarios are then used to evaluate the performance of this approach. In conclusion, the evaluation result showed that the approach is a useful means for improving statistical inference in rare disease settings and thus aids not only signal detection but risk-benefit evaluation as well.

Keywords: clustered data, estimand, extrapolation, mixed model

Procedia PDF Downloads 129

24261 Detection of Leptospira interrogans in Kidney and Urine of water Buffalo and its Relationship with Histopathological and Serological Findings

Authors: M. R. Haji Hajikolaei, A. A. Nikvand, A. R. Ghadrdan, M. Ghorbanpoor, B. Mohammadian

Abstract:

This study was carried out on water buffalo for detection of Leptospira interrogans in kidney and urine and its relationship with serological findings. Blood, urine and kidney samples were taken immediately after slaughter from 353 water buffalos at Ahvaz abattoir in Khouzestan province, Iran. Sera were initially screened at serum dilution of 1:100 against seven live antigens of Leptospira interrogans: pomona, hardjo, ballum, icterohemorrhagiae, tarasovi, australis and grippotyphosa using the microscopic agglutination test (MAT) and sera with positive results were titrated against reacting antigens in serial twofold dilution from 1:100 to 1:800. The samples of kidney were embedded in paraffin wax and 5µm thick sections were stained routinely with Haematoxylin and Eosin (H&E). Polymerase chain reaction (PCR) examination was done on urine and kidney by using LipL32 gene primers. Antibodies against one or more serovars at dilution >:100 were detected in sera. The most frequent reactor was hardjo (56.2%), followed by pomona (52.3%), australis (9.8%), tarassovi (5.9%), grippotyphosa (4.5%) and icterohaemorrhagiae (3.9%). The L. interrogans were detected in 43 (12.2%) of examined buffaloes, so that 26 (8.2%) of kidney tissues, 14 (4.8%) of urine samples separately and 3 (0.84%) of both kidney and urine samples were positive in PCR. From 153 (43.3%) buffaloes with positive MAT, 24 cases were positive by PCR of kidney and/or urine samples, synchronously. Renal lesions such as interstitial nephritis, acute tubular necrosis (ATN), pyelonephritis, glomerolonephritis, renal fibrosis and hydronephrosis were found in 128 (36.3%) cases. Statistical analysis indicated that there was no significant association between results of MAT, PCR and interstitial nephritis.

Keywords: leptospiral infection, PCR, MAT, histopathology, river buffalo

Procedia PDF Downloads 325

24260 Authorization of Commercial Communication Satellite Grounds for Promoting Turkish Data Relay System

Authors: Celal Dudak, Aslı Utku, Burak Yağlioğlu

Abstract:

Uninterrupted and continuous satellite communication through the whole orbit time is becoming more indispensable every day. Data relay systems are developed and built for various high/low data rate information exchanges like TDRSS of USA and EDRSS of Europe. In these missions, a couple of task-dedicated communication satellites exist. In this regard, for Turkey a data relay system is attempted to be defined exchanging low data rate information (i.e. TTC) for Earth-observing LEO satellites appointing commercial GEO communication satellites all over the world. First, justification of this attempt is given, demonstrating duration enhancements in the link. Discussion of preference of RF communication is, also, given instead of laser communication. Then, preferred communication GEOs – including TURKSAT4A already belonging to Turkey- are given, together with the coverage enhancements through STK simulations and the corresponding link budget. Also, a block diagram of the communication system is given on the LEO satellite.

Keywords: communication, GEO satellite, data relay system, coverage

Procedia PDF Downloads 433

24259 The Development of Encrypted Near Field Communication Data Exchange Format Transmission in an NFC Passive Tag for Checking the Genuine Product

Authors: Tanawat Hongthai, Dusit Thanapatay

Abstract:

This paper presents the development of encrypted near field communication (NFC) data exchange format transmission in an NFC passive tag for the feasibility of implementing a genuine product authentication. We propose a research encryption and checking the genuine product into four major categories; concept, infrastructure, development and applications. This result shows the passive NFC-forum Type 2 tag can be configured to be compatible with the NFC data exchange format (NDEF), which can be automatically partially data updated when there is NFC field.

Keywords: near field communication, NFC data exchange format, checking the genuine product, encrypted NFC

Procedia PDF Downloads 271

24258 Data Hiding by Vector Quantization in Color Image

Authors: Yung Gi Wu

Abstract:

With the growing of computer and network, digital data can be spread to anywhere in the world quickly. In addition, digital data can also be copied or tampered easily so that the security issue becomes an important topic in the protection of digital data. Digital watermark is a method to protect the ownership of digital data. Embedding the watermark will influence the quality certainly. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible which means that the users will not conscious the existing of embedded watermark even though the embedded image has tiny difference compared to the original image. Meanwhile, VQ needs a lot of computation burden so that we adopt a fast VQ encoding scheme by partial distortion searching (PDS) and mean approximation scheme to speed up the data hiding process. The watermarks we hide to the image could be gray, bi-level and color images. Texts are also can be regarded as watermark to embed. In order to test the robustness of the system, we adopt Photoshop to fulfill sharpen, cropping and altering to check if the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist the above three kinds of tampering in general cases.

Keywords: data hiding, vector quantization, watermark, color image

Procedia PDF Downloads 358

24257 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: anomaly detection, autoencoder, data centers, deep learning

Procedia PDF Downloads 185

24256 Integration Process and Analytic Interface of different Environmental Open Data Sets with Java/Oracle and R

Authors: Pavel H. Llamocca, Victoria Lopez

Abstract:

The main objective of our work is the comparative analysis of environmental data from Open Data bases, belonging to different governments. This means that you have to integrate data from various different sources. Nowadays, many governments have the intention of publishing thousands of data sets for people and organizations to use them. In this way, the quantity of applications based on Open Data is increasing. However each government has its own procedures to publish its data, and it causes a variety of formats of data sets because there are no international standards to specify the formats of the data sets from Open Data bases. Due to this variety of formats, we must build a data integration process that is able to put together all kind of formats. There are some software tools developed in order to give support to the integration process, e.g. Data Tamer, Data Wrangler. The problem with these tools is that they need data scientist interaction to take part in the integration process as a final step. In our case we don’t want to depend on a data scientist, because environmental data are usually similar and these processes can be automated by programming. The main idea of our tool is to build Hadoop procedures adapted to data sources per each government in order to achieve an automated integration. Our work focus in environment data like temperature, energy consumption, air quality, solar radiation, speeds of wind, etc. Since 2 years, the government of Madrid is publishing its Open Data bases relative to environment indicators in real time. In the same way, other governments have published Open Data sets relative to the environment (like Andalucia or Bilbao). But all of those data sets have different formats and our solution is able to integrate all of them, furthermore it allows the user to make and visualize some analysis over the real-time data. Once the integration task is done, all the data from any government has the same format and the analysis process can be initiated in a computational better way. So the tool presented in this work has two goals: 1. Integration process; and 2. Graphic and analytic interface. As a first approach, the integration process was developed using Java and Oracle and the graphic and analytic interface with Java (jsp). However, in order to open our software tool, as second approach, we also developed an implementation with R language as mature open source technology. R is a really powerful open source programming language that allows us to process and analyze a huge amount of data with high performance. There are also some R libraries for the building of a graphic interface like shiny. A performance comparison between both implementations was made and no significant differences were found. In addition, our work provides with an Official Real-Time Integrated Data Set about Environment Data in Spain to any developer in order that they can build their own applications.

Keywords: open data, R language, data integration, environmental data

Procedia PDF Downloads 306

24255 Theoretical Approach for Estimating Transfer Length of Prestressing Strand in Pretensioned Concrete Members

Authors: Sun-Jin Han, Deuck Hang Lee, Hyo-Eun Joo, Hyun Kang, Kang Su Kim

Abstract:

In pretensioned concrete members, the transfer length region is existed, in which the stress in prestressing strand is developed due to the bond mechanism with surrounding concrete. The stress of strands in the transfer length zone is smaller than that in the strain plateau zone, so-called effective prestress, therefore the web-shear strength in transfer length region is smaller than that in the strain plateau zone. Although the transfer length is main key factor in the shear design, a few analytical researches have been conducted to investigate the transfer length. Therefore, in this study, a theoretical approach was used to estimate the transfer length. The bond stress developed between the strands and the surrounding concrete was quantitatively calculated by using the Thick-Walled Cylinder Model (TWCM), based on this, the transfer length of strands was calculated. To verify the proposed model, a total of 209 test results were collected from the previous studies. Consequently, the analysis results showed that the main influencing factors on the transfer length are the compressive strength of concrete, the cover thickness of concrete, the diameter of prestressing strand, and the magnitude of initial prestress. In addition, the proposed model predicted the transfer length of collected test specimens with high accuracy. Acknowledgement: This research was supported by a grant(17TBIP-C125047-01) from Technology Business Innovation Program funded by Ministry of Land, Infrastructure and Transport of Korean government.

Keywords: bond, Hoyer effect, prestressed concrete, prestressing strand, transfer length

Procedia PDF Downloads 284

24254 Clinical Implication of Hyper-Intense Signal Thyroid Incidentaloma on Time of Flight Magnetic Resonance Angiography

Authors: Inseon Ryoo, Soo Chin Kim, Hyena Jung, Sangil Suh

Abstract:

Objectives: The purpose of this study is to evaluate the clinical significance of hyper-intense signal thyroid incidentalomas on the time of flight magnetic resonance angiography (TOF-MRA) using correlation study with ultrasound (US). Methods: We retrospectively reviewed 3,505 non-contrast TOF-MRA performed at an institution between September 2014 and May 2017. Two radiologists correlated the thyroid incidentalomas detected on TOF-MRA with US features which was obtained within three months interval between MRA and US examinations in consensus method. Results: The prevalence of hyper-intense signal thyroid nodules incidentally detected on TOF-MRA was 1.2% (43/3505). Among them, 35 people (81.4%) underwent US examinations, and total 45 hyper-intense signal thyroid nodules were detected on US exams. Of these 45 nodules, 35 nodules (72.9%) were categorized as benign (K-TIRADS category 2) on US exams. Fine needle aspiration was performed on 9 nodules according to the indications recommended by Korean Society of Thyroid Radiology. All except one high-suspicious thyroid nodule were confirmed as benign (Bethesda 2) on cytologic exams. One high-suspicious nodule on US showed a non-diagnostic result (Bethesda 1) on cytologic exam. However, this nodule collapsed after aspiration of thick colloid material. Conclusions: Our study showed that the most hyper-intense signal thyroid nodules detected on TOF-MRA were benign. Therefore, if a hyper-intense signal incidentaloma is found on TOF-MRA, further evaluation, especially invasive biopsy of the nodules could be suspended unless the patient had other symptoms or clinical factors suggesting the need for further evaluation.

Keywords: incidentaloma, thyroid nodule, TOF MR angiography, ultrasound

Procedia PDF Downloads 160

24253 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: instance selection, data reduction, MapReduce, kNN

Procedia PDF Downloads 248

24252 Experimental Evaluation of Succinct Ternary Tree

Authors: Dmitriy Kuptsov

Abstract:

Tree data structures, such as binary or in general k-ary trees, are essential in computer science. The applications of these data structures can range from data search and retrieval to sorting and ranking algorithms. Naive implementations of these data structures can consume prohibitively large volumes of random access memory limiting their applicability in certain solutions. Thus, in these cases, more advanced representation of these data structures is essential. In this paper we present the design of the compact version of ternary tree data structure and demonstrate the results for the experimental evaluation using static dictionary problem. We compare these results with the results for binary and regular ternary trees. The conducted evaluation study shows that our design, in the best case, consumes up to 12 times less memory (for the dictionary used in our experimental evaluation) than a regular ternary tree and in certain configuration shows performance comparable to regular ternary trees. We have evaluated the performance of the algorithms using both 32 and 64 bit operating systems.

Keywords: algorithms, data structures, succinct ternary tree, per- formance evaluation

Procedia PDF Downloads 156

24251 Predicting Data Center Resource Usage Using Quantile Regression to Conserve Energy While Fulfilling the Service Level Agreement

Authors: Ahmed I. Alutabi, Naghmeh Dezhabad, Sudhakar Ganti

Abstract:

Data centers have been growing in size and dema nd continuously in the last two decades. Planning for the deployment of resources has been shallow and always resorted to over-provisioning. Data center operators try to maximize the availability of their services by allocating multiple of the needed resources. One resource that has been wasted, with little thought, has been energy. In recent years, programmable resource allocation has paved the way to allow for more efficient and robust data centers. In this work, we examine the predictability of resource usage in a data center environment. We use a number of models that cover a wide spectrum of machine learning categories. Then we establish a framework to guarantee the client service level agreement (SLA). Our results show that using prediction can cut energy loss by up to 55%.

Keywords: machine learning, artificial intelligence, prediction, data center, resource allocation, green computing

Procedia PDF Downloads 103

24250 Prosperous Digital Image Watermarking Approach by Using DCT-DWT

Authors: Prabhakar C. Dhavale, Meenakshi M. Pawar

Abstract:

In this paper, everyday tons of data is embedded on digital media or distributed over the internet. The data is so distributed that it can easily be replicated without error, putting the rights of their owners at risk. Even when encrypted for distribution, data can easily be decrypted and copied. One way to discourage illegal duplication is to insert information known as watermark, into potentially valuable data in such a way that it is impossible to separate the watermark from the data. These challenges motivated researchers to carry out intense research in the field of watermarking. A watermark is a form, image or text that is impressed onto paper, which provides evidence of its authenticity. Digital watermarking is an extension of the same concept. There are two types of watermarks visible watermark and invisible watermark. In this project, we have concentrated on implementing watermark in image. The main consideration for any watermarking scheme is its robustness to various attacks

Keywords: watermarking, digital, DCT-DWT, security

Procedia PDF Downloads 416

24249 Machine Learning Data Architecture

Authors: Neerav Kumar, Naumaan Nayyar, Sharath Kashyap

Abstract:

Most companies see an increase in the adoption of machine learning (ML) applications across internal and external-facing use cases. ML applications vend output either in batch or real-time patterns. A complete batch ML pipeline architecture comprises data sourcing, feature engineering, model training, model deployment, model output vending into a data store for downstream application. Due to unclear role expectations, we have observed that scientists specializing in building and optimizing models are investing significant efforts into building the other components of the architecture, which we do not believe is the best use of scientists’ bandwidth. We propose a system architecture created using AWS services that bring industry best practices to managing the workflow and simplifies the process of model deployment and end-to-end data integration for an ML application. This narrows down the scope of scientists’ work to model building and refinement while specialized data engineers take over the deployment, pipeline orchestration, data quality, data permission system, etc. The pipeline infrastructure is built and deployed as code (using terraform, cdk, cloudformation, etc.) which makes it easy to replicate and/or extend the architecture to other models that are used in an organization.

Keywords: data pipeline, machine learning, AWS, architecture, batch machine learning

Procedia PDF Downloads 58

24248 Risks for Cyanobacteria Harmful Algal Blooms in Georgia Piedmont Waterbodies Due to Land Management and Climate Interactions

Authors: Sam Weber, Deepak Mishra, Susan Wilde, Elizabeth Kramer

Abstract:

The frequency and severity of cyanobacteria harmful blooms (CyanoHABs) have been increasing over time, with point and non-point source eutrophication and shifting climate paradigms being blamed as the primary culprits. Excessive nutrients, warm temperatures, quiescent water, and heavy and less regular rainfall create more conducive environments for CyanoHABs. CyanoHABs have the potential to produce a spectrum of toxins that cause gastrointestinal stress, organ failure, and even death in humans and animals. To promote enhanced, proactive CyanoHAB management, risk modeling using geospatial tools can act as predictive mechanisms to supplement current CyanoHAB monitoring, management and mitigation efforts. The risk maps would empower water managers to focus their efforts on high risk water bodies in an attempt to prevent CyanoHABs before they occur, and/or more diligently observe those waterbodies. For this research, exploratory spatial data analysis techniques were used to identify the strongest predicators for CyanoHAB blooms based on remote sensing-derived cyanobacteria cell density values for 771 waterbodies in the Georgia Piedmont and landscape characteristics of their watersheds. In-situ datasets for cyanobacteria cell density, nutrients, temperature, and rainfall patterns are not widely available, so free gridded geospatial datasets were used as proxy variables for assessing CyanoHAB risk. For example, the percent of a watershed that is agriculture was used as a proxy for nutrient loading, and the summer precipitation within a watershed was used as a proxy for water quiescence. Cyanobacteria cell density values were calculated using atmospherically corrected images from the European Space Agency’s Sentinel-2A satellite and multispectral instrument sensor at a 10-meter ground resolution. Seventeen explanatory variables were calculated for each watershed utilizing the multi-petabyte geospatial catalogs available within the Google Earth Engine cloud computing interface. The seventeen variables were then used in a multiple linear regression model, and the strongest predictors of cyanobacteria cell density were selected for the final regression model. The seventeen explanatory variables included land cover composition, winter and summer temperature and precipitation data, topographic derivatives, vegetation index anomalies, and soil characteristics. Watershed maximum summer temperature, percent agriculture, percent forest, percent impervious, and waterbody area emerged as the strongest predictors of cyanobacteria cell density with an adjusted R-squared value of 0.31 and a p-value ~ 0. The final regression equation was used to make a normalized cyanobacteria cell density index, and a Jenks Natural Break classification was used to assign waterbodies designations of low, medium, or high risk. Of the 771 waterbodies, 24.38% were low risk, 37.35% were medium risk, and 38.26% were high risk. This study showed that there are significant relationships between free geospatial datasets representing summer maximum temperatures, nutrient loading associated with land use and land cover, and the area of a waterbody with cyanobacteria cell density. This data analytics approach to CyanoHAB risk assessment corroborated the literature-established environmental triggers for CyanoHABs, and presents a novel approach for CyanoHAB risk mapping in waterbodies across the greater southeastern United States.

Keywords: cyanobacteria, land use/land cover, remote sensing, risk mapping

Procedia PDF Downloads 207

24247 A Comparison of Image Data Representations for Local Stereo Matching

Authors: André Smith, Amr Abdel-Dayem

Abstract:

The stereo matching problem, while having been present for several decades, continues to be an active area of research. The goal of this research is to find correspondences between elements found in a set of stereoscopic images. With these pairings, it is possible to infer the distance of objects within a scene, relative to the observer. Advancements in this field have led to experimentations with various techniques, from graph-cut energy minimization to artificial neural networks. At the basis of these techniques is a cost function, which is used to evaluate the likelihood of a particular match between points in each image. While at its core, the cost is based on comparing the image pixel data; there is a general lack of consistency as to what image data representation to use. This paper presents an experimental analysis to compare the effectiveness of more common image data representations. The goal is to determine the effectiveness of these data representations to reduce the cost for the correct correspondence relative to other possible matches.

Keywords: colour data, local stereo matching, stereo correspondence, disparity map

Procedia PDF Downloads 365

24246 Smart Mobility Planning Applications in Meeting the Needs of the Urbanization Growth

Authors: Caroline Atef Shoukry Tadros

Abstract:

Massive Urbanization growth threatens the sustainability of cities and the quality of city life. This raised the need for an alternate model of sustainability, so we need to plan the future cities in a smarter way with smarter mobility. Smart Mobility planning applications are solutions that use digital technologies and infrastructure advances to improve the efficiency, sustainability, and inclusiveness of urban transportation systems. They can contribute to meeting the needs of Urbanization growth by addressing the challenges of traffic congestion, pollution, accessibility, and safety in cities. Some example of a Smart Mobility planning application are Mobility-as-a-service: This is a service that integrates different transport modes, such as public transport, shared mobility, and active mobility, into a single platform that allows users to plan, book, and pay for their trips. This can reduce the reliance on private cars, optimize the use of existing infrastructure, and provide more choices and convenience for travelers. MaaS Global is a company that offers mobility-as-a-service solutions in several cities around the world. Traffic flow optimization: This is a solution that uses data analytics, artificial intelligence, and sensors to monitor and manage traffic conditions in real-time. This can reduce congestion, emissions, and travel time, as well as improve road safety and user satisfaction. Waycare is a platform that leverages data from various sources, such as connected vehicles, mobile applications, and road cameras, to provide traffic management agencies with insights and recommendations to optimize traffic flow. Logistics optimization: This is a solution that uses smart algorithms, blockchain, and IoT to improve the efficiency and transparency of the delivery of goods and services in urban areas. This can reduce the costs, emissions, and delays associated with logistics, as well as enhance the customer experience and trust. ShipChain is a blockchain-based platform that connects shippers, carriers, and customers and provides end-to-end visibility and traceability of the shipments. Autonomous vehicles: This is a solution that uses advanced sensors, software, and communication systems to enable vehicles to operate without human intervention. This can improve the safety, accessibility, and productivity of transportation, as well as reduce the need for parking space and infrastructure maintenance. Waymo is a company that develops and operates autonomous vehicles for various purposes, such as ride-hailing, delivery, and trucking. These are some of the ways that Smart Mobility planning applications can contribute to meeting the needs of the Urbanization growth. However, there are also various opportunities and challenges related to the implementation and adoption of these solutions, such as the regulatory, ethical, social, and technical aspects. Therefore, it is important to consider the specific context and needs of each city and its stakeholders when designing and deploying Smart Mobility planning applications.

Keywords: smart mobility planning, smart mobility applications, smart mobility techniques, smart mobility tools, smart transportation, smart cities, urbanization growth, future smart cities, intelligent cities, ICT information and communications technologies, IoT internet of things, sensors, lidar, digital twin, ai artificial intelligence, AR augmented reality, VR virtual reality, robotics, cps cyber physical systems, citizens design science

Procedia PDF Downloads 70

24245 Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System

Authors: Karima Qayumi, Alex Norta

Abstract:

The rapid generation of high volume and a broad variety of data from the application of new technologies pose challenges for the generation of business-intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods for the purposes of developing their business. Therefore, the recently decentralized data management environment is relying on a distributed computing paradigm. While data are stored in highly distributed systems, the implementation of distributed data-mining techniques is a challenge. The aim of this technique is to gather knowledge from every domain and all the datasets stemming from distributed resources. As agent technologies offer significant contributions for managing the complexity of distributed systems, we consider this for next-generation data-mining processes. To demonstrate agent-based business intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.

Keywords: agent-oriented modeling (AOM), business intelligence model (BIM), distributed data mining (DDM), multi-agent system (MAS)

Procedia PDF Downloads 424