Search results for: data transfer optimization
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29617

26017 Accurate HLA Typing at High-Digit Resolution from NGS Data

Authors: Yazhi Huang, Jing Yang, Dingge Ying, Yan Zhang, Vorasuk Shotelersuk, Nattiya Hirankarn, Pak Chung Sham, Yu Lung Lau, Wanling Yang

Abstract:

Human leukocyte antigen (HLA) typing from next generation sequencing (NGS) data has the potential for applications in clinical laboratories and population genetic studies. Here we introduce a novel technique for HLA typing from NGS data based on read-mapping using a comprehensive reference panel containing all known HLA alleles and de novo assembly of the gene-specific short reads. Accurate HLA typing at high-digit resolution was achieved when the technique was tested on publicly available NGS data, outperforming other newly developed tools such as HLAminer and PHLAT.

Keywords: human leukocyte antigens, next generation sequencing, whole exome sequencing, HLA typing

Procedia PDF Downloads 665
26016 Early Childhood Education: Teachers Ability to Assess

Authors: Ade Dwi Utami

Abstract:

Pedagogic competence is the basic competence teachers need to perform their tasks as educators. The ability to assess has become one of the demands of teachers’ pedagogic competence. Teachers’ ability to assess is related to curriculum instructions and applications. This research aims to obtain data on teachers’ ability to assess, which comprises understanding assessment; determining the assessment type, tools and procedure; conducting the assessment process; and using assessment result information. It uses an explanatory mixed-methods design in which qualitative data is used to verify the quantitative data obtained through a survey. Quantitative data is collected by test, whereas qualitative data is collected by observation, interview and documentation. The analyzed data is then processed through a proportion study technique and categorized as high, medium or low. The results show that teachers’ ability to assess falls into three groups: 2% high, 4% medium and 94% low. The data shows that teachers’ ability to assess is still relatively low. Teachers lack knowledge and comprehension of assessment application. This is verified by the qualitative data, which shows that teachers did not state which aspect was assessed in learning, record children’s behavior, or use the resulting data as a consideration in designing a program. Teachers have assessment documents, yet these only serve as a means of completing teachers’ administration for the certification program. Thus, assessment documents were not used on the basis of acquired knowledge. This condition should be considered by teacher education institutions and the government in order to improve teachers’ pedagogic competence, including the ability to assess.

Keywords: assessment, early childhood education, pedagogic competence, teachers

Procedia PDF Downloads 246
26015 Retrofitting Measures for Existing Housing Stock in Kazakhstan

Authors: S. Yessengabulov, A. Uyzbayeva

Abstract:

Kazakhstan's residential building stock was built in the Soviet era, about 35-60 years ago, without considering energy efficiency measures. Currently, most of these buildings are in a rundown condition and fail to meet minimum hygienic, sanitary and comfort requirements for living. The paper examines the reports of recent building energy survey activities in the country and proposes a possible solution for retrofitting existing housing stock built before 1989 that could be applicable to building envelopes in cold climates. The methodology also includes two-dimensional modeling of possible practical solutions and further recommendations.

Keywords: energy audit, energy efficient buildings in Kazakhstan, retrofit, two-dimensional conduction heat transfer analysis

Procedia PDF Downloads 247
26014 Thermal Contact Resistance of Nanoscale Rough Surfaces

Authors: Ravi Prasher

Abstract:

In nanostructured materials, thermal transport is dominated by contact resistance. Theoretical models describing thermal transport at interfaces assume perfectly flat surfaces, whereas in reality surfaces can be rough, with roughness ranging from sub-nanoscale to micron scale. Here we introduce a model which includes both nanoscale contact mechanics and nanoscale heat transfer for rough nanoscale surfaces. This comprehensive model accounts for the effect of phonon acoustic mismatch, mechanical properties, chemical properties and randomness of the rough surface.

Keywords: adhesion and contact resistance, Kapitza resistance of rough surfaces, nanoscale thermal transport

Procedia PDF Downloads 370
26013 APPLE: Providing Absolute and Proportional Throughput Guarantees in Wireless LANs

Authors: Zhijie Ma, Qinglin Zhao, Hongning Dai, Huan Zhang

Abstract:

This paper proposes the APPLE scheme, which aims to provide absolute and proportional throughput guarantees while simultaneously maximizing system throughput for wireless LANs with homogeneous and heterogeneous traffic. We formulate our objectives as an optimization problem, present its exact and approximate solutions, and prove the existence and uniqueness of the approximate solution. Simulations validate that the APPLE scheme is accurate and that the approximate solution achieves the desired objectives well.

Keywords: IEEE 802.11e, throughput guarantee, priority, WLANs

Procedia PDF Downloads 364
26012 Stochastic Matrices and Lp Norms for Ill-Conditioned Linear Systems

Authors: Riadh Zorgati, Thomas Triboulet

Abstract:

In quite diverse application areas such as astronomy, medical imaging, geophysics or nondestructive evaluation, many problems related to calibration, fitting or estimation of a large number of input parameters of a model from a small amount of noisy output data can be cast as inverse problems. Due to noisy data corruption, insufficient data and model errors, most inverse problems are ill-posed in the Hadamard sense, i.e. existence, uniqueness and stability of the solution are not guaranteed. A wide class of inverse problems in physics relates to the Fredholm equation of the first kind. The ill-posedness of such an inverse problem results, after discretization, in a very ill-conditioned linear system of equations; the condition number of the associated matrix can typically range from 10^9 to 10^18. This condition number acts as an amplifier of data uncertainties during inversion and thus renders the inverse problem difficult to handle numerically. Similar problems appear in other areas such as numerical optimization, where using interior point algorithms to solve linear programs leads to ill-conditioned systems of linear equations. Devising efficient solution approaches for such systems of equations is therefore of great practical interest. Efficient iterative algorithms are proposed for solving a system of linear equations. The approach is based on preconditioning the initial matrix of the system with an approximation of a generalized inverse, leading to a stochastic preconditioned matrix. This approach, valid for non-negative matrices, is first extended to Hermitian positive semi-definite matrices and then generalized to any complex rectangular matrix. The main results obtained are as follows: 1) We are able to build a generalized inverse of any complex rectangular matrix which satisfies the convergence condition required in iterative algorithms for solving a system of linear equations.
This completes the (short) list of generalized inverses having this property, after the Kaczmarz and Cimmino matrices. Theoretical results on both the characterization of the type of generalized inverse obtained and the convergence are derived. 2) Thanks to its properties, this matrix can be efficiently used in different solving schemes such as Richardson-Tanabe or preconditioned conjugate gradients. 3) By using Lp norms, we propose generalized Kaczmarz-type matrices. We also show how Cimmino's matrix can be considered as a particular case consisting in choosing the Euclidean norm in an asymmetrical structure. 4) Regarding numerical results obtained on some well-known pathological test cases (Hilbert, Nakasaka, …), some of the proposed algorithms are empirically shown to be more efficient on ill-conditioned problems and more robust to error propagation than the known classical techniques we have tested (Gauss, Moore-Penrose inverse, minimum residue, conjugate gradients, Kaczmarz, Cimmino). We end with a very early prospective application of our approach based on stochastic matrices, aiming at computing some parameters (such as the extreme values, the mean, the variance, …) of the solution of a linear system prior to its resolution. Such an approach, if it were to be efficient, would be a source of information on the solution of a system of linear equations.
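For readers unfamiliar with the Kaczmarz method used as a baseline in this abstract, the following is a minimal sketch of the classical cyclic Kaczmarz iteration in plain Python. It is not the authors' generalized Lp-norm variant or their stochastic preconditioner; the function name `kaczmarz`, the sweep count, and the small 2x2 test system are assumptions chosen purely for illustration.

```python
def kaczmarz(a, b, sweeps=200):
    """Classical cyclic Kaczmarz iteration for a consistent system a x = b.

    Each step projects the current iterate onto the hyperplane defined by
    one row of the system: x <- x + (b_i - <a_i, x>) / ||a_i||^2 * a_i.
    """
    n = len(a[0])
    x = [0.0] * n
    for _ in range(sweeps):
        for row, b_i in zip(a, b):
            norm_sq = sum(v * v for v in row)
            if norm_sq == 0.0:
                continue  # skip empty rows, they carry no constraint
            residual = b_i - sum(r * xi for r, xi in zip(row, x))
            step = residual / norm_sq
            x = [xi + step * r for xi, r in zip(x, row)]
    return x


# Hypothetical well-conditioned test system: 2x + y = 3, x + 3y = 5.
a = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
solution = kaczmarz(a, b)
print(solution)
```

On ill-conditioned systems such as Hilbert matrices, the rows are nearly parallel and this plain iteration converges very slowly, which is the kind of behavior the preconditioning described in the abstract is meant to address.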

Keywords: conditioning, generalized inverse, linear system, norms, stochastic matrix

Procedia PDF Downloads 136
26011 Sheathed Cotton Fibers: Material for Oil-Spill Cleanup

Authors: Benjamin M Dauda, Esther Ibrahim, Sylvester Gadimoh, Asabe Mustapha, Jiyah Mohammed

Abstract:

Despite diverse optimization techniques applied to natural hydrophilic fibers, hydrophobic synthetic fibers are still the best oil sorption materials. However, these hydrophobic fibers are not biodegradable, making their disposal problematic. To this end, this work sets out to develop nonwoven sorbents from epoxy-coated cotton fibers. To improve compatibility with crude oil and reduce moisture absorption, cotton fibers were coated with epoxy resin by immersion in acetone-thinned epoxy solution. A needle-punching machine was used to convert the fibers into coherent nonwoven sheets. An oil sorption experiment was then carried out. The results indicate that the developed epoxy-modified sorbent has a higher crude oil sorption capacity than untreated cotton and commercial polypropylene sorbents. Absorption curves show that the coated fiber and polypropylene sorbents saturated faster than the uncoated cotton fiber pad. The results also show that the coated cotton sorbent adsorbed crude faster than the polypropylene sorbent, and the equilibrium exhaustion was also higher. After a simple mechanical squeezing process, the nonwoven pads could be restored to their original form and repeatedly recycled for oil/water separation. The results indicate that the epoxy-coated cotton nonwoven pads hold promise for the cleanup of oil spills. Our data suggest that the sorption behavior of the epoxy-coated nonwoven pads and their crude oil sorption capacity are relatively stable under various environmental conditions compared to the commercial sheet.

Keywords: oil spill, adsorption, cotton, epoxy, nonwoven

Procedia PDF Downloads 55
26010 Statistical Analysis for Overdispersed Medical Count Data

Authors: Y. N. Phang, E. F. Loh

Abstract:

Many researchers have suggested the use of zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models for over-dispersed medical count data with extra variation caused by excess zeros and unobserved heterogeneity. The studies indicate that ZIP and ZINB always provide a better fit than the standard Poisson and negative binomial models for such data. In this study, we propose the use of zero-inflated inverse trinomial (ZIIT), zero-inflated Poisson inverse Gaussian (ZIPIG) and zero-inflated strict arcsine (ZISA) models for over-dispersed medical count data. These proposed models are not widely used by researchers, especially in the medical field. The results show that these three models can serve as alternatives for modeling over-dispersed medical count data. This is supported by the application of these models to a real-life medical data set. Inverse trinomial, Poisson inverse Gaussian, and strict arcsine are discrete distributions whose variance is a cubic function of the mean. Therefore, ZIIT, ZIPIG and ZISA are able to accommodate data with excess zeros and very heavy tails. They are recommended for modeling over-dispersed medical count data when ZIP and ZINB are inadequate.
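To make the notion of zero inflation concrete, here is a small sketch of the zero-inflated Poisson (ZIP) probability mass function, the baseline model named in the abstract. It does not implement the ZIIT, ZIPIG or ZISA models the authors propose; the parameter values `lam = 2.0` and `pi = 0.3` are arbitrary assumptions for illustration.

```python
import math


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with rate lam.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Hypothetical parameters: Poisson rate 2.0 with 30% structural zeros.
lam, pi = 2.0, 0.3
probs = [zip_pmf(k, lam, pi) for k in range(60)]  # tail beyond 60 is negligible
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
print(f"P(0) = {probs[0]:.4f}, mean = {mean:.4f}, variance = {var:.4f}")
```

The variance exceeds the mean here, which is the over-dispersion that a plain Poisson model (where the two are equal) cannot capture.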

Keywords: zero inflated, inverse trinomial distribution, Poisson inverse Gaussian distribution, strict arcsine distribution, Pearson’s goodness of fit

Procedia PDF Downloads 544
26009 Monotone Rational Trigonometric Interpolation

Authors: Uzma Bashir, Jamaludin Md. Ali

Abstract:

This study is concerned with the visualization of monotone data using a piecewise C1 rational trigonometric interpolating scheme. Four positive shape parameters are incorporated in the structure of the rational trigonometric spline. Conditions on two of these parameters are derived to preserve the monotonicity of monotone data, and the other two are left free. Figures are used extensively to show that the proposed scheme produces graphically smooth monotone curves.

Keywords: trigonometric splines, monotone data, shape preserving, C1 monotone interpolant

Procedia PDF Downloads 271
26008 Electroactive Ferrocenyl Dendrimers as Transducers for Fabrication of Label-Free Electrochemical Immunosensor

Authors: Sudeshna Chandra, Christian Gäbler, Christian Schliebe, Heinrich Lang

Abstract:

Highly branched dendrimers provide structural homogeneity, controlled composition, comparable size to biomolecules, internal porosity and multiple functional groups for conjugating reactions. Electro-active dendrimers containing multiple redox units have generated great interest in their use as electrode modifiers for development of biosensors. The electron transfer between the redox-active dendrimers and the biomolecules plays a key role in developing a biosensor. Ferrocenes have multiple and electrochemically equivalent redox units that can act as an electron “pool” in a system. The ferrocenyl-terminated polyamidoamine dendrimer is capable of transferring multiple electrons under the same applied potential. Therefore, such dendrimers can be used for dual purposes: building a film over the electrode for immunosensors and immobilizing biomolecules for sensing. Electrochemical immunosensors developed in this way offer fast and sensitive analysis, are inexpensive and require no prior sample pre-treatment. Electrochemical amperometric immunosensors are even more promising because they can achieve a very low detection limit with high sensitivity. Detection of cancer biomarkers at an early stage can provide crucial information for foundational research in life science, clinical diagnosis and prevention of disease. An elevated concentration of biomarkers in body fluid is an early indication of some types of cancerous disease and, among all the biomarkers, IgG is the most common and extensively used clinical cancer biomarker. We present an IgG (immunoglobulin) electrochemical immunosensor using a newly synthesized redox-active ferrocenyl dendrimer of generation 2 (G2Fc) as glassy carbon electrode material for immobilizing the antibody. The electrochemical performance of the modified electrodes was assessed in both aqueous and non-aqueous media using varying scan rates to elucidate the reaction mechanism.
The potential shift was found to be higher in an aqueous electrolyte due to the presence of more H-bonds, which reduced the electrostatic attraction within the amido groups of the dendrimers. The cyclic voltammetric studies of the G2Fc-modified GCE in 0.1 M PBS solution of pH 7.2 showed a pair of well-defined redox peaks. The peak current decreased significantly with the immobilization of the anti-goat IgG. After the immunosensor was blocked with BSA, a further decrease in the peak current was observed due to the attachment of the protein BSA to the immunosensor. A significant decrease in the current signal of the BSA/anti-IgG/G2Fc/GCE was observed upon immobilizing IgG, which may be due to the formation of immune-conjugates that block the tunneling of mass and electron transfer. The current signal was found to be directly related to the amount of IgG captured on the electrode surface. With increasing IgG concentration, an increasing amount of immune-conjugates forms, which decreases the peak current. The incubation time and concentration of the antibody were optimized for better analytical performance of the immunosensor. The developed amperometric immunosensor is sensitive to IgG concentrations as low as 2 ng/mL. Tailoring of redox-active dendrimers provides enhanced electroactivity to the system and enlarges the sensor surface for binding the antibodies. It may be assumed that both electron transfer and diffusion contribute to the signal transformation between the dendrimers and the antibody.

Keywords: ferrocenyl dendrimers, electrochemical immunosensors, immunoglobulin, amperometry

Procedia PDF Downloads 337
26007 GPU-Based Back-Projection of Synthetic Aperture Radar (SAR) Data onto 3D Reference Voxels

Authors: Joshua Buli, David Pietrowski, Samuel Britton

Abstract:

Processing SAR data usually requires constraints in extent in the Fourier domain as well as approximations and interpolations onto a planar surface to form an exploitable image. This results in a potential loss of data, requires several interpolative techniques, and restricts visualization to two-dimensional plane imagery. The data can be interpolated into a ground plane projection, with or without terrain as a component, all to better view SAR data in an image domain comparable to what a human would view, to ease interpretation. An alternate but computationally heavy method to make use of more of the data is the basis of this research. Pre-processing of the SAR data is completed first (matched-filtering, motion compensation, etc.), the data is then range compressed, and lastly, the contribution from each pulse is determined for each specific point in space by searching the time history data for the reflectivity values for each pulse summed over the entire collection. This results in a per-3D-point reflectivity using the entire collection domain. New advances in GPU processing have finally allowed this rapid projection of acquired SAR data onto any desired reference surface (called backprojection). Mathematically, the computations are fast and easy to implement, despite limitations in SAR phase history data size and 3D-point cloud size. Backprojection processing algorithms are embarrassingly parallel since each 3D point in the scene has the same reflectivity calculation applied for all pulses, independent of all other 3D points and pulse data under consideration. Therefore, given the simplicity of the single backprojection calculation, the work can be spread across thousands of GPU threads allowing for accurate reflectivity representation of a scene.
Furthermore, because reflectivity values are associated with individual three-dimensional points, a plane is no longer the sole permissible mapping base; a digital elevation model or even a cloud of points (collected from any sensor capable of measuring ground topography) can be used as a basis for the backprojection technique. This technique minimizes any interpolations and modifications of the raw data, maintaining maximum data integrity. This innovative processing will allow for SAR data to be rapidly brought into a common reference frame for immediate exploitation and data fusion with other three-dimensional data and representations.
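The per-point summation described in this abstract can be sketched on the CPU as follows. This is a simplified illustration and not the authors' GPU implementation: it assumes already range-compressed pulses, nearest-neighbor range-bin lookup, a two-way phase correction of exp(j·4πR/λ), and a synthetic single-target collection. All names (`backproject`, `r0`, `dr`, the antenna track) are hypothetical.

```python
import cmath
import math


def backproject(points, antenna_positions, range_profiles, r0, dr, wavelength):
    """Sum the phase-corrected contribution of every pulse at every 3D point.

    range_profiles[p][i] is the complex range-compressed sample of pulse p
    at range r0 + i * dr. Each point is independent of all others, which is
    why the outer loop maps naturally onto one GPU thread per point.
    """
    image = []
    for point in points:
        total = 0j
        for antenna, profile in zip(antenna_positions, range_profiles):
            distance = math.dist(point, antenna)
            index = round((distance - r0) / dr)  # nearest range bin
            if 0 <= index < len(profile):
                phase = cmath.exp(1j * 4.0 * math.pi * distance / wavelength)
                total += profile[index] * phase
        image.append(total)
    return image


# Synthetic collection (assumed values): one ideal point target at the origin,
# observed from 21 antenna positions along a straight track.
wavelength, r0, dr, bins = 0.03, 100.0, 1.0, 64
target = (0.0, 0.0, 0.0)
antennas = [(float(x), -100.0, 50.0) for x in range(-10, 11)]
profiles = []
for antenna in antennas:
    distance = math.dist(target, antenna)
    profile = [0j] * bins
    echo_phase = cmath.exp(-1j * 4.0 * math.pi * distance / wavelength)
    profile[round((distance - r0) / dr)] = echo_phase
    profiles.append(profile)

# Evaluate the target location and one empty location 20 m away.
image = backproject([target, (0.0, 20.0, 0.0)], antennas, profiles, r0, dr, wavelength)
print([abs(value) for value in image])
```

Because the list of points is arbitrary, the same function accepts a planar grid, a digital elevation model, or a point cloud without modification, which mirrors the flexibility the abstract describes.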

Keywords: backprojection, data fusion, exploitation, three-dimensional, visualization

Procedia PDF Downloads 86
26006 Integration of Knowledge and Metadata for Complex Data Warehouses and Big Data

Authors: Jean Christian Ralaivao, Fabrice Razafindraibe, Hasina Rakotonirainy

Abstract:

This paper resumes work carried out in the field of complex data warehouses (DW) relating to the management and formalization of knowledge and metadata. It offers a methodological approach for integrating two concepts, knowledge and metadata, within the framework of a complex DW architecture. The work considers the use of knowledge representation by description logics and the extension of the Common Warehouse Metamodel (CWM) specifications. This is expected to have an impact on the performance of a complex DW. Three essential aspects of this work are expected: the representation of knowledge in description logics, the translation of this knowledge into consistent UML diagrams while respecting or extending the CWM specifications, and the use of XML as a pivot. The field of application is large but will be adapted to systems with heterogeneous, complex and unstructured content that moreover require extensive (re)use of knowledge, such as medical data warehouses.

Keywords: data warehouse, description logics, integration, knowledge, metadata

Procedia PDF Downloads 138
26005 Network Conditioning and Transfer Learning for Peripheral Nerve Segmentation in Ultrasound Images

Authors: Harold Mauricio Díaz-Vargas, Cristian Alfonso Jimenez-Castaño, David Augusto Cárdenas-Peña, Guillermo Alberto Ortiz-Gómez, Alvaro Angel Orozco-Gutierrez

Abstract:

Precise identification of the nerves is a crucial task performed by anesthesiologists for effective Peripheral Nerve Blocking (PNB). Currently, anesthesiologists use ultrasound imaging equipment to guide the PNB and detect nervous structures. However, visual identification of the nerves from ultrasound images is difficult, even for trained specialists, due to artifacts and low contrast. Recent advances in deep learning make neural networks a potential tool for accurate nerve segmentation systems, thus addressing the above issues from raw data. The widely used U-Net network yields pixel-by-pixel segmentation by encoding the input image and decoding the attained feature vector into a semantic image. This work proposes a conditioning approach and encoder pre-training to enhance the nerve segmentation of traditional U-Nets. Conditioning is achieved by one-hot encoding the kind of target nerve at the network input, while the pre-training considers five well-known deep networks for image classification. The proposed approach is tested on a collection of 619 ultrasound (US) images, where the best C-UNet architecture yields an 81% Dice coefficient, outperforming the 74% of the best traditional U-Net. Results show that pre-trained models with the conditional approach outperform their equivalent baselines by supporting the learning of new features and enriching the discriminant capability of the tested networks.
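Two ingredients named in this abstract, the one-hot conditioning vector and the Dice coefficient used for evaluation, are simple enough to sketch directly. The snippet below is illustrative only: the number of nerve classes and the toy masks are assumptions, and it says nothing about how the authors feed the vector into their network.

```python
def one_hot(index, num_classes):
    """Encode a class index (e.g. the kind of target nerve) as a one-hot vector."""
    if not 0 <= index < num_classes:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def dice_coefficient(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat binary masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical example: 4 nerve types, flattened 2x2 masks.
condition = one_hot(2, 4)
score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
print(condition, round(score, 4))
```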

Keywords: nerve segmentation, U-Net, deep learning, ultrasound imaging, peripheral nerve blocking

Procedia PDF Downloads 107
26004 Application of Regularized Low-Rank Matrix Factorization in Personalized Targeting

Authors: Kourosh Modarresi

Abstract:

The Netflix problem has brought the topic of “Recommendation Systems” into the mainstream of computer science, mathematics, and statistics. Though much progress has been made, the available algorithms do not obtain satisfactory results; their success rate is rarely above 5%. This work is based on the belief that the main challenge is to come up with “scalable personalization” models. This paper uses an adaptive regularization of inverse singular value decomposition (SVD) that applies adaptive penalization on the singular vectors. The results show far better matching for recommender systems when compared to those from state-of-the-art models in the industry.

Keywords: convex optimization, LASSO, regression, recommender systems, singular value decomposition, low rank approximation

Procedia PDF Downloads 456
26003 Efficient Frequent Itemset Mining Methods over Real-Time Spatial Big Data

Authors: Hamdi Sana, Emna Bouazizi, Sami Faiz

Abstract:

In recent years, there has been a huge increase in the use of spatio-temporal applications where data and queries are continuously moving. As a result, the need to process real-time spatio-temporal data is clear, and real-time stream data management has become a hot topic. The sliding window model and frequent itemset mining over dynamic data are among the most important problems in the context of data mining. The sliding window model for frequent itemset mining is widely used for data stream mining due to its emphasis on recent data and its bounded memory requirement. Existing methods use the traditional transaction-based sliding window model, where the window size is based on a fixed number of transactions. This model assumes that all transactions arrive at a constant rate, which is not suited to real-time applications, and its use in such applications endangers their performance. Based on these observations, this paper relaxes the notion of window size and proposes the use of a timestamp-based sliding window model. In our proposed frequent itemset mining algorithm, support conditions are used to differentiate frequent and infrequent patterns. Thereafter, a tree is developed to incrementally maintain the essential information. We evaluate our contribution, and the preliminary results are quite promising.
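The difference between a transaction-based and a timestamp-based window can be shown with a short sketch. The class below is a naive illustration, not the tree-based algorithm from the paper: it keeps explicit counts for itemsets of up to two items, assumes timestamps arrive in non-decreasing order, and evicts by age rather than by transaction count. The class name, window duration, and sample transactions are all hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class TimestampWindow:
    """Frequent itemsets over a window defined by elapsed time, not by count."""

    def __init__(self, duration, max_size=2):
        self.duration = duration      # window length in time units
        self.max_size = max_size      # largest itemset size that is counted
        self.transactions = deque()   # (timestamp, items), oldest first
        self.counts = Counter()

    def _itemsets(self, items):
        unique = sorted(set(items))
        for size in range(1, self.max_size + 1):
            yield from combinations(unique, size)

    def add(self, timestamp, items):
        self.transactions.append((timestamp, tuple(items)))
        self.counts.update(self._itemsets(items))
        self._expire(timestamp)

    def _expire(self, now):
        # Drop every transaction that is at least `duration` old.
        while self.transactions and self.transactions[0][0] <= now - self.duration:
            _, old_items = self.transactions.popleft()
            self.counts.subtract(self._itemsets(old_items))

    def frequent(self, min_support):
        n = len(self.transactions)
        return {s for s, c in self.counts.items() if n and c / n >= min_support}


window = TimestampWindow(duration=3.0)
window.add(0.0, ["a", "b"])
window.add(1.0, ["a", "c"])
window.add(2.0, ["a", "b"])
early = window.frequent(0.6)   # three transactions are still inside the window
window.add(5.0, ["b", "c"])    # a quiet period: the older transactions expire
late = window.frequent(0.6)
print(sorted(early), sorted(late))
```

With a fixed-count window of three transactions, the burst at times 0 to 2 would still dominate the result at time 5; with the time-based window only the recent transaction remains.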

Keywords: real-time spatial big data, frequent itemset, transaction-based sliding window model, timestamp-based sliding window model, weighted frequent patterns, tree, stream query

Procedia PDF Downloads 162
26002 The Extent of Big Data Analysis by the External Auditors

Authors: Iyad Ismail, Fathilatul Abdul Hamid

Abstract:

This research investigates the extent of big data analysis by external auditors. This paper adopts grounded theory as a framework for conducting a series of semi-structured interviews with eighteen external auditors. The research findings cover the extent of big data availability and of big data analysis usage by external auditors in the Gaza Strip, Palestine. The study's outcomes point to a series of auditing procedures for improving external auditing techniques, which leads to a high-quality audit process. This research is also important for auditing firms, giving insight into their mechanisms and identifying the most important strategies that help in achieving competitive audit quality. The results aim to guide academic and professional auditing institutions in developing techniques for external auditors with respect to big data analysis. This paper provides appropriate information for the decision-making process and a source of future information that affects technological auditing.

Keywords: big data analysis, external auditors, audit reliance, internal audit function

Procedia PDF Downloads 70
26001 Numerical Study of a 6080HP Open Drip Proof (ODP) Motor

Authors: Feng-Hisang Lai

Abstract:

A computational fluid dynamics (CFD) analysis is conducted to numerically study the flow and heat transfer features of a two-pole, 6,080HP, 60Hz, 3,150V open drip-proof (ODP) motor. The stator and rotor cores in this high-voltage induction motor are segmented with spacers for cooling purposes, which makes meshing difficult when the entire system is to be simulated. The system is therefore divided into 4 parts, meshed separately and then combined using interfaces. The deviation between the CFD and experimental results in temperature and flow rate is less than 10%. The internal flow is further examined and a final design is proposed to reduce the winding temperature by 10 degrees.

Keywords: CFD, open drip proof, induction motor, cooling

Procedia PDF Downloads 197
26000 A Model of Teacher Leadership in History Instruction

Authors: Poramatdha Chutimant

Abstract:

The objective of the research was to propose a model of teacher leadership in history instruction for practical use. Everett M. Rogers’ Diffusion of Innovations Theory is applied as the theoretical framework. A qualitative method is used in the study, with an interview protocol as the instrument for collecting primary data from best-practice teachers awarded by the Office of National Education Commission (ONEC). Open-ended questions are used in the interview protocol in order to gather varied data. Information on the international context of history instruction serves as secondary data to support the summarizing process (content analysis). A dendrogram is the key tool for interpreting and synthesizing the primary data, while the secondary data supports explanation and elaboration. In-depth interviews are used to collect information from seven experts in the educational field. The focal point is, finally, to validate a draft model in terms of future utilization.

Keywords: history study, nationalism, patriotism, responsible citizenship, teacher leadership

Procedia PDF Downloads 280
25999 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion at the individual level. Thus, companies opt to automate this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to bag-of-n-grams representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. The transfer learning paradigm that has been commonly used in computer vision (CV) problems has lately started to shape NLP approaches and language models (LM). This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations obtained by training it on large datasets using self-supervised learning objectives. The PTMs are further fine-tuned on a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT).
The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. BERT is a monolingual model in our case and is based on transformer neural networks. It uses masked language modeling and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction were not used, since the experiments showed that their contribution to the model performance was insignificant even though Turkish is a highly agglutinative and inflective language. The results show that the use of deep learning methods with pre-trained models and fine-tuning achieves about 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy, while the MUSE model achieved around 88% and SVM around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.
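As a brief illustration of the bag-of-n-grams representation used for the SVM baseline in this abstract, the following sketch counts word unigrams and bigrams in a sentence. The whitespace tokenizer, the lowercasing, and the English sample sentence are assumptions; the study itself works on Turkish comments and does not describe its exact feature extraction.

```python
from collections import Counter


def bag_of_ngrams(text, n_max=2):
    """Count all word n-grams of length 1..n_max in a whitespace-tokenized text."""
    tokens = text.lower().split()
    bag = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            bag[" ".join(tokens[i:i + n])] += 1
    return bag


features = bag_of_ngrams("Good service good price")
print(features)
```

Each distinct n-gram becomes one dimension of a sparse feature vector, which is what a classifier such as an SVM would then consume.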

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 146
25998 The Effect of Institutions on Economic Growth: An Analysis Based on Bayesian Panel Data Estimation

Authors: Mohammad Anwar, Shah Waliullah

Abstract:

This study investigates panel data regression models. It uses Bayesian and classical methods to study the impact of institutions on economic growth with data from 1990 to 2014, especially in developing countries. Under both the classical and the Bayesian methodology, two panel data models were estimated: common effects and fixed effects. For the Bayesian approach, prior information is used, and a normal-gamma prior is adopted for the panel data models. The analysis was done with the WinBUGS14 software. The estimated results showed that panel data models are valid models in the Bayesian methodology. In the Bayesian approach, all independent variables had positive and significant effects on the dependent variable. Based on the standard errors of all models, the fixed effect model is the best model in the Bayesian estimation of panel data models, as it has the lowest standard error among the models compared.
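For readers unfamiliar with the two specifications, a textbook form of the common effects and fixed effects models with a normal-gamma prior might be written as follows. The symbols ($y_{it}$, $x_{it}$, $\beta_0$, $V_0$, $a_0$, $b_0$) and the Gamma parametrization are assumptions for illustration; the abstract does not give the authors' notation or prior hyperparameters.

```latex
% Assumed notation: i indexes countries, t indexes years, h is the error precision.
\begin{align}
  \text{Common effects:}\quad y_{it} &= \alpha + x_{it}^{\top}\beta + \varepsilon_{it}, \\
  \text{Fixed effects:}\quad  y_{it} &= \alpha_i + x_{it}^{\top}\beta + \varepsilon_{it}, \\
  \varepsilon_{it} &\sim \mathcal{N}(0, h^{-1}), \\
  \beta \mid h &\sim \mathcal{N}(\beta_0, h^{-1} V_0), \qquad h \sim \mathrm{Gamma}(a_0, b_0).
\end{align}
```

The only structural difference is the intercept: a single $\alpha$ shared by all units in the common effects model versus a unit-specific $\alpha_i$ in the fixed effects model.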

Keywords: Bayesian approach, common effect, fixed effect, random effect, Dynamic Random Effect Model

Procedia PDF Downloads 68
25997 A Sensor Placement Methodology for Chemical Plants

Authors: Omid Ataei Nia, Karim Salahshoor

Abstract:

In this paper, a new precise and reliable sensor network methodology is introduced for unit processes and operations using the Constriction Coefficient Particle Swarm Optimization (CPSO) method. CPSO is introduced as a new search engine for optimal sensor network design purposes. Furthermore, a Square Root Unscented Kalman Filter (SRUKF) algorithm is employed as a new data reconciliation technique to enhance the stability and accuracy of the filter. The proposed design procedure incorporates precision, cost, observability, and reliability, together with importance-of-variables (IVs) as a novel measure in the Instrumentation Criteria (IC). To the best of our knowledge, no comprehensive approach has yet been proposed in the literature to take into account the importance of variables in the sensor network design procedure. In this paper, a specific weight is assigned to each sensor measuring a process variable in the sensor network to indicate the importance of that variable over the others, so as to cater to the ultimate sensor network application requirements. A set of distinct scenarios has been run to evaluate the performance of the proposed methodology in a simulated Continuous Stirred Tank Reactor (CSTR) as a highly nonlinear process plant benchmark. The obtained results reveal the efficacy of the proposed method, leading to significant improvement in accuracy with respect to alternative sensor network design approaches and securing the definite allocation of sensors to the most important process variables in sensor network design as a novel achievement.
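As a rough illustration of the constriction-coefficient idea, the sketch below uses the commonly cited Clerc-Kennedy form of the velocity update. The acceleration constants (c1 = c2 = 2.05) and the scalar, one-dimensional update are assumptions for illustration; the abstract does not give the authors' parameters or how a sensor network is encoded as a particle.

```python
import math
import random


def constriction_coefficient(c1=2.05, c2=2.05):
    """Clerc-Kennedy constriction factor; defined only for phi = c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4:
        raise ValueError("constriction form requires c1 + c2 > 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def update_velocity(v, x, pbest, gbest, c1=2.05, c2=2.05, rng=random):
    """One scalar velocity update: the whole bracket is scaled by chi."""
    chi = constriction_coefficient(c1, c2)
    r1, r2 = rng.random(), rng.random()
    return chi * (v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x))
```

With the assumed constants the factor is close to 0.73, which damps the velocity at every step and removes the need for an explicit velocity clamp in many settings.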

Keywords: constriction coefficient PSO, importance of variable, MRMSE, reliability, sensor network design, square root unscented Kalman filter

Procedia PDF Downloads 160
25996 Area-Efficient FPGA Implementation of an FFT Processor by Reusing Butterfly Units

Authors: Atin Mukherjee, Amitabha Sinha, Debesh Choudhury

Abstract:

A fast Fourier transform (FFT) of a large number of samples requires substantial hardware resources on a field-programmable gate array (FPGA), and therefore more area as well as power. In this paper, an area-efficient architecture of an FFT processor is proposed that reuses the butterfly units more than once. The FFT processor is emulated and the results are validated on a Virtex-6 FPGA. The proposed architecture outperforms the conventional architecture of an N-point FFT processor in terms of area, which is reduced by a factor of log_N(2) with a negligible increase in processing time.
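The reuse idea can be modelled in software with an iterative radix-2 FFT in which one set of N/2 butterfly operations is run once per stage, for log2(N) passes, instead of instantiating a separate set for every stage. This is only a behavioural sketch under that assumption; it is not the authors' hardware architecture, and the function names are hypothetical.

```python
import cmath


def butterfly(a, b, w):
    """Radix-2 butterfly: combine two values using twiddle factor w."""
    t = w * b
    return a + t, a - t


def fft_reused_stage(x):
    """Iterative decimation-in-time FFT; returns (spectrum, number of passes)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    # Bit-reversal permutation puts the inputs in the order the stages expect.
    data = [0j] * n
    for i in range(n):
        j = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        data[j] = complex(x[i])
    half, passes = 1, 0
    while half < n:
        step = 2 * half
        # One pass = the same N/2 butterflies applied to the whole array.
        for start in range(0, n, step):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / step)
                i, j = start + k, start + k + half
                data[i], data[j] = butterfly(data[i], data[j], w)
        half = step
        passes += 1
    return data, passes
```

For an 8-point input the loop makes three passes, each performing four butterflies, which mirrors the trade the abstract describes: less butterfly hardware in exchange for iterating over the stages.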

Keywords: FFT, FPGA, resource optimization, butterfly units

Procedia PDF Downloads 523
25995 On the Study of the Electromagnetic Scattering by Large Obstacle Based on the Method of Auxiliary Sources

Authors: Hidouri Sami, Aguili Taoufik

Abstract:

We consider fast and accurate solutions of scattering problems by large perfect electrically conducting (PEC) objects, formulated by an optimization of the Method of Auxiliary Sources (MAS). We present various techniques used to reduce the total computational cost of the scattering problem. The first technique is based on replacing the object with an array of a finite number of small PEC objects of the same shape. The second technique reduces the problem by considering only half of the object. These two solutions are compared with results from the reference bibliography.

Keywords: method of auxiliary sources, scattering, large object, RCS, computational resources

Procedia PDF Downloads 243
25994 Early Predictive Signs for Kasai Procedure Success

Authors: Medan Isaeva, Anna Degtyareva

Abstract:

Context: Biliary atresia is a common reason for liver transplants in children, and the Kasai procedure can potentially be successful in avoiding the need for transplantation. However, it is important to identify factors that influence surgical outcomes in order to optimize treatment and improve patient outcomes. Research aim: The aim of this study was to develop prognostic models to assess the outcomes of the Kasai procedure in children with biliary atresia. Methodology: This retrospective study analyzed data from 166 children with biliary atresia who underwent the Kasai procedure between 2002 and 2021. The effectiveness of the operation was assessed based on specific criteria, including post-operative stool color, jaundice reduction, and bilirubin levels. The study involved a comparative analysis of various parameters, such as gestational age, birth weight, age at operation, physical development, liver and spleen sizes, and laboratory values including bilirubin, ALT, AST, and others, measured pre- and post-operation. Ultrasonographic evaluations were also conducted pre-operation, assessing the hepatobiliary system and related quantitative parameters. The study was carried out by two experienced specialists in pediatric hepatology. Comparative analysis and multifactorial logistic regression were used as the primary statistical methods. Findings: The study identified several statistically significant predictors of a successful Kasai procedure, including the presence of the gallbladder and levels of cholesterol and direct bilirubin post-operation. A detectable gallbladder was associated with a higher probability of surgical success, while elevated post-operative cholesterol and direct bilirubin levels were indicative of a reduced chance of positive outcomes. Theoretical importance: The findings of this study contribute to the optimization of treatment strategies for children with biliary atresia undergoing the Kasai procedure. 
By identifying early predictive signs of success, clinicians can modify treatment plans and manage patient care more effectively and proactively. Data collection and analysis procedures: Data for this analysis were obtained from the health records of patients who received the Kasai procedure. Comparative analysis and multifactorial logistic regression were employed to analyze the data and identify significant predictors. Question addressed: The study addressed the question of identifying predictive factors for the success of the Kasai procedure in children with biliary atresia. Conclusion: The developed prognostic models serve as valuable tools for early detection of patients who are less likely to benefit from the Kasai procedure. This enables clinicians to modify treatment plans and manage patient care more effectively and proactively. Potential limitations of the study: The study has several limitations. Its retrospective nature may introduce biases and inconsistencies in data collection. Being single centered, the results might not be generalizable to wider populations due to variations in surgical and postoperative practices. Also, other potential influencing factors beyond the clinical, laboratory, and ultrasonographic parameters considered in this study were not explored, which could affect the outcomes of the Kasai operation. Future studies could benefit from including a broader range of factors.

Keywords: biliary atresia, Kasai operation, prognostic model, native liver survival

Procedia PDF Downloads 55
25993 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, the Twitter platform generated more than 12 terabytes of data daily, roughly 4.3 petabytes in a single year. For this reason, Twitter is a great source for big data mining. Many industries, such as telecommunication companies, can leverage the availability of Twitter data to better understand their markets and make appropriate business decisions. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked against another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics from a Twitter dataset containing user tweets on South African telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results, with topic coherence higher by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 151
25992 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers on three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduce a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done using R and Greenplum in-database analytic tools.

Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining

Procedia PDF Downloads 353
25991 Laser Induced Transient Current in Quasi-One-Dimensional Nanostructure

Authors: Tokuei Sako

Abstract:

Light-induced ultrafast charge transfer in low-dimensional nanostructures has been studied with a model of a few electrons confined in a 1D electrostatic potential coupled to electrodes at both ends and subjected to an ultrashort pulsed laser field. The time propagation of the one- and two-electron wave packets has been calculated by integrating the time-dependent Schrödinger equation using the symplectic integrator method with a uniform Fourier grid. The temporal behavior of the resultant light-induced current in the studied systems is discussed with respect to the central frequency and pulse width of the applied laser fields.

Keywords: pulsed laser field, nanowire, wave packet, quantum dots, conductivity

Procedia PDF Downloads 509
25990 Real-Time Big-Data Warehouse a Next-Generation Enterprise Data Warehouse and Analysis Framework

Authors: Abbas Raza Ali

Abstract:

Big Data technology is gradually becoming a pressing need for large enterprises. These enterprises generate massive amounts of off-line and streaming data in both structured and unstructured formats on a daily basis. It is a challenging task to effectively extract useful insights from such large-scale datasets, and managing more than a few months of transactional data history sometimes becomes a technology constraint. This paper presents a framework to efficiently manage massively large and complex datasets. The framework has been tested on a communication service provider producing massively large, complex streaming data in binary format. The communication industry is bound by regulators to manage the history of its subscribers’ call records, where every action of a subscriber generates a record. Also, managing and analyzing transactional data allows service providers to better understand their customers’ behavior; for example, deep packet inspection requires transactional internet usage data to explain the internet usage behaviour of the subscribers. However, current relational database systems limit service providers to maintaining history only at the semantic level, which is aggregated at subscriber level. The framework addresses these challenges by leveraging Big Data technology, which optimally manages complex datasets and allows deep analysis of them. The framework has been applied to offload the service provider's existing Intelligent Network Mediation and relational Data Warehouse onto Big Data. The service provider has a subscriber base of more than 50 million with yearly growth of 7-10%. The end-to-end process, which involves binary-to-ASCII decoding of call detail records, stitching of all the interrogations against a call (transformations), and aggregation of all the call records of a subscriber, takes no more than 10 minutes.

Keywords: big data, communication service providers, enterprise data warehouse, stream computing, Telco IN Mediation

Procedia PDF Downloads 175
25989 Programming with Grammars

Authors: Peter M. Maurer

Abstract:

DGL is a context free grammar-based tool for generating random data. Many types of simulator input data require some computation to be placed in the proper format. For example, it might be necessary to generate ordered triples in which the third element is the sum of the first two elements, or it might be necessary to generate random numbers in some sorted order. Although DGL is universal in computational power, generating these types of data is extremely difficult. To overcome this problem, we have enhanced DGL to include features that permit direct computation within the structure of a context free grammar. The features have been implemented as special types of productions, preserving the context free flavor of DGL specifications.
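The ordered-triple example can be sketched in Python to show why computation inside a grammar is useful. This is not DGL syntax: the grammar dictionary, the `expand` helper, and the way the sum is attached are all hypothetical stand-ins for illustration only.

```python
import random

# A tiny context-free grammar: a number is one or two non-zero digits.
GRAMMAR = {
    "number": [["digit"], ["digit", "digit"]],
    "digit": [[d] for d in "123456789"],
}


def expand(symbol, rng):
    """Randomly expand a symbol; anything not in GRAMMAR is a terminal."""
    if symbol not in GRAMMAR:
        return symbol
    production = rng.choice(GRAMMAR[symbol])
    return "".join(expand(s, rng) for s in production)


def random_triple(rng):
    """Generate two numbers from the grammar, then compute the third element."""
    a = int(expand("number", rng))
    b = int(expand("number", rng))
    # The sum depends on earlier choices, which a pure random expansion
    # cannot express directly; here it is computed as a separate step.
    return a, b, a + b
```

A plain random expansion picks each element independently, so the dependent third element has to come from a computation step; the enhancement described in the abstract moves that kind of step into the grammar itself.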

Keywords: DGL, Enhanced Context Free Grammars, Programming Constructs, Random Data Generation

Procedia PDF Downloads 147
25988 A Model Architecture Transformation with Approach by Modeling: From UML to Multidimensional Schemas of Data Warehouses

Authors: Ouzayr Rabhi, Ibtissam Arrassen

Abstract:

To provide a complete analysis of the organization and to help decision-making, leaders need to have relevant data; Data Warehouses (DW) are designed to meet such needs. However, designing a DW is not trivial, and there is no formal method to derive a multidimensional schema from heterogeneous databases. In this article, we present a model-driven approach to the design of data warehouses. We describe a multidimensional meta-model and also specify a set of transformations starting from a Unified Modeling Language (UML) metamodel. In this approach, the UML metamodel and the multidimensional one are both treated as platform-independent models (PIMs). The first meta-model is mapped into the second one through transformation rules written in the Query/View/Transformation (QVT) language. This proposal is validated by applying our approach to generate a multidimensional schema of a Balanced Scorecard (BSC) DW. We are interested in the BSC perspectives, which are highly linked to the vision and the strategies of an organization.

Keywords: data warehouse, meta-model, model-driven architecture, transformation, UML

Procedia PDF Downloads 160