Search results for: maximal data sets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25280

24710 A Data Envelopment Analysis Model in a Multi-Objective Optimization with Fuzzy Environment

Authors: Michael Gidey Gebru

Abstract:

Most Data Envelopment Analysis models operate in a static environment with input and output parameters that are chosen by deterministic data. However, due to ambiguity brought on by shifting market conditions, input and output data are not always precisely gathered in real-world scenarios. Fuzzy numbers can be used to address this kind of ambiguity in input and output data. Therefore, this work aims to expand crisp Data Envelopment Analysis into Data Envelopment Analysis with fuzzy environment. In this study, the input and output data are regarded as fuzzy triangular numbers. Then, the Data Envelopment Analysis model with fuzzy environment is solved using a multi-objective method to gauge the Decision Making Units' efficiency. Finally, the developed Data Envelopment Analysis model is illustrated with an application to real data from 50 educational institutions.
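The abstract does not spell out its multi-objective formulation, so the sketch below illustrates only the data representation it mentions: a triangular fuzzy number and its alpha-cut, which turns a fuzzy input or output into an ordinary interval. The class name, the function name, and the sample values are hypothetical and do not come from the paper.

```python
from typing import NamedTuple, Tuple


class TriangularFuzzyNumber(NamedTuple):
    """A triangular fuzzy number (low, mode, high); names are illustrative."""
    low: float
    mode: float
    high: float


def alpha_cut(number: TriangularFuzzyNumber, alpha: float) -> Tuple[float, float]:
    """Return the interval of values whose membership degree is at least alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # The membership function rises linearly from low to mode
    # and falls linearly from mode to high.
    lower = number.low + alpha * (number.mode - number.low)
    upper = number.high - alpha * (number.high - number.mode)
    return lower, upper


# Hypothetical fuzzy input for one decision making unit.
budget = TriangularFuzzyNumber(low=80.0, mode=100.0, high=130.0)
```

At alpha = 0 the cut is the full support of the number, and at alpha = 1 it collapses to the single most plausible value.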

Keywords: efficiency, Data Envelopment Analysis, fuzzy, higher education, input, output

Procedia PDF Downloads 34
24709 Navigating Complex Communication Dynamics in Qualitative Research

Authors: Kimberly M. Cacciato, Steven J. Singer, Allison R. Shapiro, Julianna F. Kamenakis

Abstract:

This study examines the dynamics of communication among researchers and participants who have various levels of hearing, use multiple languages, have various disabilities, and who come from different social strata. This qualitative methodological study focuses on the strategies employed in an ethnographic research study examining the communication choices of six sets of parents who have Deaf-Disabled children. The participating families varied in their communication strategies and preferences including the use of American Sign Language (ASL), visual-gestural communication, multiple spoken languages, and pidgin forms of each of these. The research team consisted of two undergraduate students proficient in ASL and a Deaf principal investigator (PI) who uses ASL and speech as his main modes of communication. A third Hard-of-Hearing undergraduate student fluent in ASL served as an objective facilitator of the data analysis. The team created reflexive journals by audio recording, free writing, and responding to team-generated prompts. They discussed interactions between the members of the research team, their evolving relationships, and various social and linguistic power differentials. The researchers reflected on communication during data collection, their experiences with one another, and their experiences with the participating families. Reflexive journals totaled over 150 pages. The outside research assistant reviewed the journals and developed follow up open-ended questions and prods to further enrich the data. The PI and outside research assistant used NVivo qualitative research software to conduct open inductive coding of the data. They chunked the data individually into broad categories through multiple readings and recognized recurring concepts. They compared their categories, discussed them, and decided which they would develop. The researchers continued to read, reduce, and define the categories until they were able to develop themes from the data. 
The research team found that the various communication backgrounds and skills present greatly influenced the dynamics between the members of the research team and with the participants of the study. Specifically, the following themes emerged: (1) students as communication facilitators and interpreters as barriers to natural interaction, (2) varied language use simultaneously complicated and enriched data collection, and (3) ASL proficiency and professional position resulted in a social hierarchy among researchers and participants. In the discussion, the researchers reflected on their backgrounds and internal biases of analyzing the data found and how social norms or expectations affected the perceptions of the researchers in writing their journals. Through this study, the research team found that communication and language skills require significant consideration when working with multiple and complex communication modes. The researchers had to continually assess and adjust their data collection methods to meet the communication needs of the team members and participants. In doing so, the researchers aimed to create an accessible research setting that yielded rich data but learned that this often required compromises from one or more of the research constituents.

Keywords: American Sign Language, complex communication, deaf-disabled, methodology

Procedia PDF Downloads 103
24708 Copper Price Prediction Model for Various Economic Situations

Authors: Haidy S. Ghali, Engy Serag, A. Samer Ezeldin

Abstract:

Copper is an essential raw material used in the construction industry. During the year 2021 and the first half of 2022, the global market suffered from a significant fluctuation in copper raw material prices due to the aftermath of both the COVID-19 pandemic and the Russia-Ukraine war, which exposed its consumers to an unexpected financial risk. Therefore, this paper aims to develop two ANN-LSTM price prediction models, using Python, that can forecast the average monthly copper prices traded in the London Metal Exchange; the first model is a multivariate model that forecasts the copper price of the next month and the second is a univariate model that predicts the copper prices of the upcoming three months. Historical data of average monthly London Metal Exchange copper prices are collected from January 2009 to July 2022, and potential external factors are identified and employed in the multivariate model. These factors lie under three main categories: energy prices and economic indicators of the three major exporting countries of copper, depending on the data availability. Before developing the LSTM models, the collected external parameters are analyzed with respect to the copper prices using correlation and multicollinearity tests in R software; then, the parameters are further screened to select the parameters that influence the copper prices. Then, the two LSTM models are developed, and the dataset is divided into training, validation, and testing sets. The results show that the performance of the 3-Month prediction model is better than the 1-Month prediction model, but still, both models can act as predicting tools for diverse economic situations.
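As a rough illustration of the data preparation such a forecasting model needs, the sketch below cuts a monthly series into (past window, future values) pairs and splits them chronologically. The lookback length, the split fractions, and the function names are assumptions for illustration; the abstract does not report the values the authors used.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], List[float]]


def make_windows(series: Sequence[float], lookback: int, horizon: int) -> List[Sample]:
    """Pair each run of `lookback` values with the `horizon` values that follow it."""
    samples: List[Sample] = []
    for start in range(len(series) - lookback - horizon + 1):
        past = list(series[start:start + lookback])
        future = list(series[start + lookback:start + lookback + horizon])
        samples.append((past, future))
    return samples


def chronological_split(
    samples: Sequence[Sample], train_frac: float = 0.7, val_frac: float = 0.15
) -> Tuple[List[Sample], List[Sample], List[Sample]]:
    """Split without shuffling, so later months never leak into training."""
    train_end = int(len(samples) * train_frac)
    val_end = train_end + int(len(samples) * val_frac)
    return list(samples[:train_end]), list(samples[train_end:val_end]), list(samples[val_end:])


# Toy series standing in for monthly prices (not real data).
toy_prices = [float(month) for month in range(30)]
windows = make_windows(toy_prices, lookback=6, horizon=3)
train, val, test = chronological_split(windows)
```

A horizon of 1 corresponds to the one-month model and a horizon of 3 to the three-month model; keeping the split in time order is the usual choice for time series because a random shuffle would let the model see the future.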

Keywords: copper prices, prediction model, neural network, time series forecasting

Procedia PDF Downloads 99
24707 Combined Automatic Speech Recognition and Machine Translation in Business Correspondence Domain for English-Croatian

Authors: Sanja Seljan, Ivan Dunđer

Abstract:

The paper presents combined automatic speech recognition (ASR) for English and machine translation (MT) for English and Croatian in the domain of business correspondence. The first part presents results of training the commercial ASR system on two English data sets, enriched by error analysis. The second part presents results of machine translation performed by the online tool Google Translate for the English-Croatian and Croatian-English language pairs. Human evaluation in terms of usability is conducted and internal consistency calculated by Cronbach's alpha coefficient, enriched by error analysis. Automatic evaluation is performed by WER (Word Error Rate) and PER (Position-independent word Error Rate) metrics, followed by investigation of Pearson’s correlation with human evaluation.
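The two automatic metrics named above can be sketched as follows. WER is the word-level edit distance divided by the reference length; for PER the sketch uses one common bag-of-words formulation, which is an assumption since the abstract does not state the exact variant. The function names and the example sentence are hypothetical.

```python
from collections import Counter
from typing import List


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(reference)


def position_independent_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Same idea as WER, but word order is ignored (bag-of-words overlap)."""
    if not reference:
        raise ValueError("reference must not be empty")
    matches = sum((Counter(reference) & Counter(hypothesis)).values())
    return (max(len(reference), len(hypothesis)) - matches) / len(reference)


reference = "please send the invoice today".split()
hypothesis = "please the invoice send today".split()
wer = word_error_rate(reference, hypothesis)
per = position_independent_error_rate(reference, hypothesis)
```

In this example the hypothesis contains every reference word but in a different order, so PER is zero while WER still charges for the misplaced word.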

Keywords: automatic machine translation, integrated language technologies, quality evaluation, speech recognition

Procedia PDF Downloads 468
24706 A Fact-Finding Analysis on the Expulsions Made under Title 42 in US

Authors: Avi Shrivastava

Abstract:

Title 42, an emergency health decree, has forced the federal authorities to turn away asylum seekers and all other border crossers since last year. When Title 42 was first deployed in immigration detention centers, where many migrants are held when they arrive at the U.S.-Mexico border, the Trump administration embraced it as a strategy. Expulsions Policy and New Border Challenges will be examined in regard to Title 42 concerns. Humanitarian measures for refugees arriving at the US-Mexico border are the focus of this article. To a large extent, this article addresses the implications of the United States' use of Title 42 in expelling refugees and the possible ramifications of doing away with it. A secondary data collecting strategy was used to gather the information for this study, allowing researchers to examine a large number of previously collected data sets. Information about Title 42 may be found in a variety of places, such as scholarly publications, newspapers, books, and the internet. The inquiry employed qualitative and explanatory research approaches. The claim that 1.7 million individuals were forced to leave the country as a result of it was withdrawn. Since CBP and ICE were limited in their ability to process deportees, it employed a very random patchwork technique in selecting the expelled individuals. As a consequence, repeat offenders, particularly those who were single, got a reduced punishment. The government will be compelled to focus on long-overdue but vital border enhancements if expulsions are halted. Title 42 provisions may help expedite the processing of asylum and other types of humanitarian relief. The government is prepared for an increase in arrivals, but ending the program would lead to a return to arrival levels seen during the Title 42 period.

Keywords: migrants, refugees, title 42, medical, trump administration

Procedia PDF Downloads 79
24705 The Architecture, Engineering and Construction (AEC) New Paradigm Shift: Building Information Modelling Trend in the United Arab Emirates

Authors: Salem B. Abdalla

Abstract:

This study investigated the current Building Information Modelling (BIM) trends and practices in the UAE, particularly to shed light on a recently circulated Dubai BIM mandate. Two sets of surveys were mailed to the AEC industry and the corresponding academic sector within the UAE to collect up-to-date data on BIM awareness and utilization. The surveys showed startling results concerning the academic sector in the UAE where almost 70% of respondents were not aware of the BIM mandate. Among the rest, even when aware, the majority of mechanical and electrical engineering schools felt that BIM is not pertinent to their discipline. Therefore, the response to offering BIM in their curriculum was substantially low (35%). On the other hand, the industrial survey identified a large majority (76.5%) of the AEC industry in the UAE are using BIM. The results clearly indicate that the academia should include BIM in their curriculum to produce qualified graduates to support the market. However, the academia is also faced with several obstacles to implement BIM in their curriculum, where the main pretext is that there is “no room for new courses in existing curriculum”.

Keywords: building information modeling, BIM adoption, UAE BIM industry survey, UAE BIM academia survey, Dubai BIM mandate, UK BIM mandate, BIM education, architecture education, engineering schools, BIM implementation, BIM curriculum

Procedia PDF Downloads 392
24704 The Role of Financial Development and Institutional Quality in Promoting Sustainable Development through Tourism Management

Authors: Hashim Zameer

Abstract:

Effective tourism management plays a vital role in promoting sustainability and supporting ecosystems. A common principle that has been in practice over the years is “first pollute and then clean,” indicating countries need financial resources to promote sustainability. Financial development and tourism management both seem very important for promoting sustainable development. However, without institutional support, it is very difficult to succeed. In this context, it is important to explore how institutional quality, tourism development, and financial development could promote sustainable development. In the past, no research explored the role of tourism development in sustainable development. Moreover, the role of financial development, natural resources, and institutional quality in sustainable development has also been ignored. In this regard, this paper aims to investigate the role of tourism development, natural resources, financial development, and institutional quality in sustainable development in China. The study used time-series data from 2000–2021 and employed the Bayesian linear regression model because it is suitable for small data sets. The robustness of the findings was checked using a quantile regression approach. The results reveal that an increase in tourism expenditures stimulates the economy, creates jobs, encourages cultural exchange, and supports sustainability initiatives. Moreover, financial development and institutional quality have a positive effect on sustainable development. However, reliance on natural resources can result in negative economic, social, and environmental outcomes, highlighting the need for resource diversification and management to reinforce sustainable development. These results highlight the significance of financial development, strong institutions, sustainable tourism, and careful utilization of natural resources for long-term sustainability.
The study holds vital insights for policy formulation to promote sustainable tourism.

Keywords: sustainability, tourism development, financial development, institutional quality

Procedia PDF Downloads 62
24703 Pruning Algorithm for the Minimum Rule Reduct Generation

Authors: Sahin Emrah Amrahov, Fatih Aybar, Serhat Dogan

Abstract:

In this paper, we consider the rule reduct generation problem. The Rule Reduct Generation (RG) and Modified Rule Generation (MRG) algorithms, which are used to solve this problem, are well known. As an alternative to these algorithms, we develop the Pruning Rule Generation (PRG) algorithm. We compare the PRG algorithm with RG and MRG.

Keywords: rough sets, decision rules, rule induction, classification

Procedia PDF Downloads 511
24702 Optimization Modeling of the Hybrid Antenna Array for the DoA Estimation

Authors: Somayeh Komeylian

Abstract:

The direction of arrival (DoA) estimation is the crucial aspect of the radar technologies for detecting and dividing several signal sources. In this scenario, the antenna array output modeling involves numerous parameters including noise samples, signal waveform, signal directions, signal number, and signal to noise ratio (SNR), and thereby the methods of the DoA estimation rely heavily on the generalization characteristic for establishing a large number of the training data sets. Hence, we have analogously represented the two different optimization models of the DoA estimation; (1) the implementation of the decision directed acyclic graph (DDAG) for the multiclass least-squares support vector machine (LS-SVM), and (2) the optimization method of the deep neural network (DNN) radial basis function (RBF). We have rigorously verified that the LS-SVM DDAG algorithm is capable of accurately classifying DoAs for the three classes. However, the accuracy and robustness of the DoA estimation are still highly sensitive to technological imperfections of the antenna arrays such as non-ideal array design and manufacture, array implementation, mutual coupling effect, and background radiation and thereby the method may fail in representing high precision for the DoA estimation. Therefore, this work has a further contribution on developing the DNN-RBF model for the DoA estimation for overcoming the limitations of the non-parametric and data-driven methods in terms of array imperfection and generalization. The numerical results of implementing the DNN-RBF model have confirmed the better performance of the DoA estimation compared with the LS-SVM algorithm. Consequently, we have analogously evaluated the performance of utilizing the two aforementioned optimization methods for the DoA estimation using the concept of the mean squared error (MSE).

Keywords: DoA estimation, Adaptive antenna array, Deep Neural Network, LS-SVM optimization model, Radial basis function, and MSE

Procedia PDF Downloads 82
24701 Chinese Sentence Level Lip Recognition

Authors: Peng Wang, Tigang Jiang

Abstract:

The computer-based lip reading method of different languages cannot be universal. At present, research on Chinese lip reading, whether the work on data sets or on recognition algorithms, is far from mature. In this paper, we study the Chinese lipreading method based on machine learning, and propose a Chinese Sentence-level lip-reading network (CNLipNet) model which consists of a spatio-temporal convolutional neural network (CNN), a recurrent neural network (RNN) and the Connectionist Temporal Classification (CTC) loss function. This model can map a variable-length sequence of video frames to a Chinese Pinyin sequence and is trained end-to-end. Moreover, we create CNLRS, a Chinese Lipreading Dataset, which contains 5948 samples and can be shared through GitHub. The evaluation of CNLipNet on this dataset yielded a 41% word correct rate and a 70.6% character correct rate. This evaluation result is far superior to the professional human lip readers, indicating that CNLipNet performs well in lipreading.
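To show how a CTC-trained model can turn per-frame scores into a shorter label sequence, here is a minimal sketch of greedy (best-path) CTC decoding. It is a generic illustration and not the CNLipNet decoder; the label set, the blank index, and the scores are hypothetical.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], labels: Sequence[str], blank: int = 0
) -> List[str]:
    """Pick the best label per frame, merge repeats, then drop blanks."""
    best_path = [
        max(range(len(scores)), key=lambda k: scores[k]) for scores in frame_scores
    ]
    # Consecutive duplicates stand for one emission held over several frames.
    collapsed = [index for index, _ in groupby(best_path)]
    return [labels[index] for index in collapsed if index != blank]


# Hypothetical Pinyin label set; index 0 is the CTC blank symbol.
labels = ["<blank>", "ni", "hao"]
frame_scores = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
    [0.8, 0.1, 0.1],
]
decoded = ctc_greedy_decode(frame_scores, labels)
```

The blank symbol is what lets six video frames map to two syllables here, and it also allows the same syllable to be emitted twice in a row when a blank separates the repeats.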

Keywords: lipreading, machine learning, spatio-temporal, convolutional neural network, recurrent neural network

Procedia PDF Downloads 110
24700 The Economic Limitations of Defining Data Ownership Rights

Authors: Kacper Tomasz Kröber-Mulawa

Abstract:

This paper will address the topic of data ownership from an economic perspective, and examples of economic limitations of data property rights will be provided, which have been identified using methods and approaches of economic analysis of law. To properly build a background for the economic focus, in the beginning a short perspective of data and data ownership in the EU’s legal system will be provided. It will include a short introduction to its political and social importance and highlight relevant viewpoints. This will stress the importance of a Single Market for data but also far-reaching regulations of data governance and privacy (including the distinction of personal and non-personal data, data held by public bodies and private businesses). The main discussion of this paper will build upon the briefly referred to legal basis as well as methods and approaches of economic analysis of law.

Keywords: antitrust, data, data ownership, digital economy, property rights

Procedia PDF Downloads 63
24699 Protecting the Cloud Computing Data Through the Data Backups

Authors: Abdullah Alsaeed

Abstract:

Virtualized computing and cloud computing infrastructures are no longer buzzwords or marketing terms. They are a core reality in today’s corporate Information Technology (IT) organizations. Hence, developing effective and efficient methodologies for data backup and data recovery is required more than ever. The purpose of data backup and recovery techniques is to assist organizations in strategizing their business continuity and disaster recovery approaches. In order to accomplish this strategic objective, a variety of mechanisms have been proposed in recent years. This research paper will explore and examine the latest techniques and solutions to provide data backup and restoration for cloud computing platforms.

Keywords: data backup, data recovery, cloud computing, business continuity, disaster recovery, cost-effective, data encryption.

Procedia PDF Downloads 71
24698 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristics is an essential factor for planning or operation. But in practical cases, not every link has sensors installed on it. A link that does not have data on it is called a “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially the deep learning process, missing link data can be estimated from present link data. For the deep learning process, this study uses a “Recurrent Neural Network” to take time-series data of the road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area were fed into the learning process. The neural network structure has 17 links with present data as input, 2 hidden layers, and 1 missing link as output. As a result, forecasted data of the target link show about 94% accuracy compared with actual data.

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 499
24697 Customer Data Analysis Model Using Business Intelligence Tools in Telecommunication Companies

Authors: Monica Lia

Abstract:

This article presents a customer data analysis model using business intelligence tools for data modelling, transforming, data visualization and dynamic report building. The analysis of an economic organization's customers is based on the information from the transactional systems of the organization. The paper presents how to develop the data model starting from the data that companies have inside their own operational systems. The owned data can be transformed into useful information about customers using business intelligence tools. For a mature market, knowing the information inside the data and making forecasts for strategic decisions become more important. Business Intelligence tools are used in business organizations as support for decision-making.

Keywords: customer analysis, business intelligence, data warehouse, data mining, decisions, self-service reports, interactive visual analysis, and dynamic dashboards, use cases diagram, process modelling, logical data model, data mart, ETL, star schema, OLAP, data universes

Procedia PDF Downloads 417
24696 Calculation of Electronic Structures of Nickel in Interaction with Hydrogen by Density Functional Theoretical (DFT) Method

Authors: Choukri Lekbir, Mira Mokhtari

Abstract:

Hydrogen-materials interaction and mechanisms can be modeled at the nanoscale by quantum methods. In this work, the effect of hydrogen on the electronic properties of a cluster material model «nickel» has been studied using the density functional theoretical (DFT) method. Two types of clusters are optimized: nickel and the hydrogen-nickel system. In the case of nickel clusters (n = 1-6) without the presence of hydrogen, three types of electronic structures (neutral, cationic and anionic) have been optimized according to three basis set calculations (B3LYP/LANL2DZ, PW91PW91/DGDZVP2, PBE/DGDZVP2). The comparison of binding energies and bond lengths of the three structures of nickel clusters (neutral, cationic and anionic) obtained by those basis sets shows that the results of neutral and anionic nickel clusters are in good agreement with the experimental results. In the case of neutral and anionic nickel clusters, comparing energies and bond lengths obtained by the three bases shows that the basis set PBE/DGDZVP2 agrees best with the experimental results. In the case of anionic nickel clusters (n = 1-6) with the presence of hydrogen, the optimization of the hydrogen-nickel (anionic) structures using the basis set PBE/DGDZVP2 shows that the binding energies and bond lengths increase compared to those obtained in the case of anionic nickel clusters without the presence of hydrogen. This reveals the armor effect exerted by hydrogen on the electronic structure of nickel, which is due to the storing of hydrogen energy within nickel cluster structures. The comparison between the bond lengths for both clusters shows the expansion effect of cluster geometry, which is due to the presence of hydrogen.

Keywords: binding energies, bond lengths, density functional theoretical, geometry optimization, hydrogen energy, nickel cluster

Procedia PDF Downloads 408
24695 Private Coded Computation of Matrix Multiplication

Authors: Malihe Aliasgari, Yousef Nejatbakhsh

Abstract:

The era of Big Data and the immensity of real-life datasets compel computation tasks to be performed in a distributed fashion, where the data is dispersed among many servers that operate in parallel. However, massive parallelization leads to computational bottlenecks due to faulty servers and stragglers. Stragglers refer to a few slow or delay-prone processors that can bottleneck the entire computation because one has to wait for all the parallel nodes to finish. The problem of straggling processors has been well studied in the context of distributed computing. Recently, it has been pointed out that, for the important case of linear functions, it is possible to improve over repetition strategies in terms of the tradeoff between performance and latency by carrying out linear precoding of the data prior to processing. The key idea is that, by employing suitable linear codes operating over fractions of the original data, a function may be completed as soon as a sufficient number of processors, depending on the minimum distance of the code, have completed their operations. Matrix-matrix multiplication on practically large data sets faces computational and memory-related difficulties, so such operations are carried out using distributed computing platforms. In this work, we study the problem of distributed matrix-matrix multiplication W = XY under storage constraints, i.e., when each server is allowed to store a fixed fraction of each of the matrices X and Y, which is a fundamental building block of many science and engineering fields such as machine learning, image and signal processing, wireless communication, and optimization. Non-secure and secure matrix multiplication are studied.
We study the setup in which the identity of the matrix of interest should be kept private from the workers, and then obtain the recovery threshold of the colluding model, that is, the number of workers that need to complete their task before the master server can recover the product W. We consider the problem of secure and private distributed matrix multiplication W = XY in which the matrix X is confidential, while matrix Y is selected in a private manner from a library of public matrices. We present the best currently known trade-off between communication load and recovery threshold. In other words, we design an achievable PSGPD scheme for any arbitrary privacy level by trivially concatenating a robust PIR scheme for arbitrary colluding workers and private databases and the proposed SGPD code that provides a smaller computational complexity at the workers.
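The straggler-tolerance idea behind coded computation can be illustrated with a toy example: split X into two row blocks, give a third worker their sum, and recover W = XY from any two of the three results. This is a textbook-style sketch with a recovery threshold of 2; it is not the PSGPD scheme of the abstract and provides no privacy or security. All function and key names are hypothetical, and X is assumed to have an even number of rows.

```python
from typing import Dict, List

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, standing in for the work done by one worker."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def combine(a: Matrix, b: Matrix, sign: int) -> Matrix:
    """Elementwise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def encode(x: Matrix) -> Dict[str, Matrix]:
    """Create three worker tasks from two row blocks of X (even row count assumed)."""
    half = len(x) // 2
    top, bottom = x[:half], x[half:]
    return {"top": top, "bottom": bottom, "parity": combine(top, bottom, +1)}


def decode(results: Dict[str, Matrix]) -> Matrix:
    """Rebuild XY from the results of any two of the three workers."""
    if "top" in results and "bottom" in results:
        return results["top"] + results["bottom"]
    if "top" in results:
        # (top + bottom) Y - top Y = bottom Y, because the product is linear.
        return results["top"] + combine(results["parity"], results["top"], -1)
    return combine(results["parity"], results["bottom"], -1) + results["bottom"]


X = [[1, 2], [3, 4], [5, 6], [7, 8]]
Y = [[2, 1], [0, 3]]
all_results = {name: matmul(block, Y) for name, block in encode(X).items()}
# Pretend the worker holding the "top" block is a straggler.
recovered = decode({k: v for k, v in all_results.items() if k != "top"})
```

Because multiplication by Y is linear, the parity worker's output is the sum of the other two outputs, so the master never has to wait for the slowest of the three workers.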

Keywords: coded distributed computation, private information retrieval, secret sharing, stragglers

Procedia PDF Downloads 106
24694 Evaluation of the Surface Water Quality Using the Water Quality Index and Discriminant Analysis Method

Authors: Lazhar Belkhiri, Ammar Tiri, Lotfi Mouni

Abstract:

Water resources pose a very important problem worldwide for the protection and management of water quality, given the complexity of water quality data sets. In this study, the water quality index (WQI) and irrigation water quality index (IWQI) were calculated in order to evaluate the surface water quality for drinking and irrigation purposes based on nine hydrochemical parameters. In order to separate the variables that are the most responsible for the spatial differentiation, discriminant analysis (DA) was applied. The results show that the surface water quality for drinking is poor to very poor based on WQI values; however, the values of IWQI reflect that this water is acceptable for irrigation with a restriction for sensitive plants. The DA method has shown that the parameters pH, potassium, chloride, sulfate, and bicarbonate discriminate significantly between the different stations with respect to the spatial variation of the surface water quality. The results obtained in this study therefore provide very useful information to decision-makers.

Keywords: surface water quality, drinking and irrigation purposes, water quality index, discriminant analysis

Procedia PDF Downloads 68
24693 Measurement Technologies for Advanced Characterization of Magnetic Materials Used in Electric Drives and Automotive Applications

Authors: Lukasz Mierczak, Patrick Denke, Piotr Klimczyk, Stefan Siebert

Abstract:

Due to the high complexity of the magnetization in electrical machines and the influence of the manufacturing processes on the magnetic properties of their components, the assessment and prediction of hysteresis and eddy current losses have remained a challenge. In the design process of electric motors and generators, the power losses of stators and rotors are calculated based on the material supplier’s data from standard magnetic measurements. This type of data does not include the additional loss from non-sinusoidal multi-harmonic motor excitation nor the detrimental effects of residual stress remaining in the motor laminations after manufacturing processes, such as punching, housing shrink fitting and winding. Moreover, in production, considerable attention is given to the measurements of mechanical dimensions of stator and rotor cores, whereas verification of their magnetic properties is typically neglected, which can lead to inconsistent efficiency of assembled motors. Therefore, to enable a comprehensive characterization of motor materials and components, Brockhaus Measurements developed a range of in-line and offline measurement technologies for testing their magnetic properties under actual motor operating conditions. Multiple sets of experimental data were obtained to evaluate the influence of various factors, such as elevated temperature, applied and residual stress, and arbitrary magnetization on the magnetic properties of different grades of non-oriented steel. Measured power loss for tested samples and stator cores varied significantly, by more than 100%, compared to standard measurement conditions. Quantitative effects of each of the applied measurements were analyzed. This research and the applied Brockhaus measurement methodologies emphasized the requirement for advanced characterization of magnetic materials used in electric drives and automotive applications.

Keywords: magnetic materials, measurement technologies, permanent magnets, stator and rotor cores

Procedia PDF Downloads 133
24692 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate surveys and polls, often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why these people did this for the last 10 years, but why these people are doing this now and, if this is undesirable, how we can have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released into the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 355
24691 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed "big data." The techniques of big data mining and big data analysis are extremely helpful for business activities such as making decisions, building organisational plans, researching the market efficiently, and improving sales, because typical management tools cannot handle such complicated datasets. Big data bring special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining and its process, along with its problems and difficulties, with a focus on the unique characteristics of big data. Organizations face several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is actually analyzed. This study presents the idea of data mining and analysis and the knowledge discovery techniques that have recently been created with practical application systems. The article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the main big data and data mining challenges faced by management.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 95
24690 Verification of Dosimetric Commissioning Accuracy of Flattening Filter Free Intensity Modulated Radiation Therapy and Volumetric Modulated Therapy Delivery Using Task Group 119 Guidelines

Authors: Arunai Nambi Raj N., Kaviarasu Karunakaran, Krishnamurthy K.

Abstract:

The purpose of this study was to create American Association of Physicists in Medicine (AAPM) Task Group 119 (TG 119) benchmark plans for flattening filter free (FFF) beam deliveries of intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT) in the Eclipse treatment planning system. The planning data were compared with the flattening filter (FF) IMRT and VMAT plan data to verify the dosimetric commissioning accuracy of FFF deliveries. AAPM TG 119 proposed a set of test cases called multi-target, mock prostate, mock head and neck, and C-shape to ascertain the overall accuracy of IMRT planning, measurement, and analysis. We used these test cases to investigate the performance of the Eclipse treatment planning system for the flattening filter free beam deliveries. For these test cases, we generated two sets of treatment plans, the first using 7–9 IMRT fields and the second using a two-arc VMAT technique, for each of the beam deliveries (6 MV FF, 6 MV FFF, 10 MV FF and 10 MV FFF). The planning objectives and dose were set as described in TG 119. The dose prescriptions for multi-target, mock prostate, mock head and neck, and C-shape were taken as 50, 75.6, 50 and 50 Gy, respectively. The point dose (mean dose to the contoured chamber volume) at the specified positions was measured using a compact (CC‑13) ion chamber. The composite planar dose and per-field gamma analysis were measured with an IMatriXX Evaluation 2D array with OmniPro IMRT Software (version 1.7b). FFF beam deliveries of IMRT and VMAT plans were comparable to flattening filter beam deliveries. Our planning and quality assurance results matched the TG 119 data. AAPM TG 119 test cases are useful for generating FFF benchmark plans. From the data obtained in this study, we conclude that the commissioning of FFF IMRT and FFF VMAT delivery was within the limits of TG 119 and that the performance of the Eclipse treatment planning system for FFF plans was satisfactory.

Keywords: flattening filter free beams, intensity modulated radiation therapy, task group 119, volumetric modulated arc therapy

Procedia PDF Downloads 135
24689 Ill-Posed Inverse Problems in Molecular Imaging

Authors: Ranadhir Roy

Abstract:

Inverse problems arise in medical (molecular) imaging. These problems are characterized by being large in three dimensions, and by the diffusion equation, which models the physical phenomena within the media. The inverse problems are posed as a nonlinear optimization in which the unknown parameters are found by minimizing the difference between the predicted data and the measured data. To obtain a unique and stable solution to an ill-posed inverse problem, a priori information must be used. Mathematical conditions for obtaining stable solutions are established in Tikhonov’s regularization method, where the a priori information is introduced via a stabilizing functional, which may be designed to incorporate relevant information about the inverse problem. Effective determination of the Tikhonov regularization parameter requires knowledge of the true solution or, in the case of optical imaging, the true image. Yet in clinically based imaging, the true image is not known. To alleviate these difficulties, we have applied the penalty/modified barrier function (PMBF) method instead of the Tikhonov regularization technique to make the inverse problems well-posed. Unlike in the Tikhonov regularization method, a constrained optimization technique based on simple bounds of the optical parameter properties of the tissue can easily be implemented in the PMBF method. Imposing the constraints on the optical properties of the tissue explicitly restricts the solution sets and can restore uniqueness. Like the Tikhonov regularization method, the PMBF method limits the size of the condition number of the Hessian matrix of the given objective function. The accuracy and rapid convergence of the PMBF method require a good initial guess of the Lagrange multipliers. To obtain the initial guess of the multipliers, we use an unconstrained least squares minimization problem.
Three-dimensional images of fluorescence absorption coefficients and lifetimes were reconstructed from contact and noncontact experimentally measured data.
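
The stabilizing effect of Tikhonov regularization described in this abstract can be illustrated on a toy linear problem. The sketch below is not the authors' PMBF method or their imaging model; it is a minimal, assumed example that solves min ||Ax - b||^2 + lam * ||x||^2 through the normal equations (A^T A + lam * I) x = A^T b. The matrix, the noise values, and the helper names (transpose, matmul, solve, tikhonov) are all hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def tikhonov(a, b, lam):
    """Minimize ||a x - b||^2 + lam * ||x||^2 via (a^T a + lam I) x = a^T b."""
    at = transpose(a)
    ata = matmul(at, a)
    for i in range(len(ata)):
        ata[i][i] += lam  # the penalty term bounds the condition number
    atb = [sum(x * y for x, y in zip(row, b)) for row in at]
    return solve(ata, atb)


# Nearly collinear columns make the problem ill-conditioned (toy data).
A = [[1.0, 1.0], [1.0, 1.0001], [1.0, 0.9999]]
# Noise-free data would be [2.0, 2.0001, 1.9999], i.e. x = [1, 1];
# a perturbation of only 1e-3 is added to two of the measurements.
b_noisy = [2.0, 2.0011, 1.9989]

x_raw = tikhonov(A, b_noisy, 0.0)   # unregularized least squares
x_reg = tikhonov(A, b_noisy, 1e-3)  # Tikhonov-regularized solution

print("unregularized:", x_raw)
print("regularized:  ", x_reg)
```

In this toy setting the unregularized solution moves far from [1, 1] under a tiny perturbation, while the regularized one stays close to it. Choosing lam well is the difficulty the abstract points to, since the true solution is normally unknown.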

Keywords: constrained minimization, ill-conditioned inverse problems, Tikhonov regularization method, penalty modified barrier function method

Procedia PDF Downloads 260
24688 Reasons to Redesign: Teacher Education for a Brighter Tomorrow

Authors: Deborah L. Smith

Abstract:

To review our program and determine the best redesign options, department members gathered feedback and input through focus groups, analysis of data, and a review of the current research to ensure that the changes proposed were not based solely on the state’s new professional standards. In designing course assignments and assessments, we listened to a variety of constituents, including students, other institutions of higher learning, MDE webinars, host teachers, literacy clinic personnel, and other disciplinary experts. As a result, we are designing a program that is more inclusive of a variety of field experiences for growth. We have determined ways to improve our program by connecting academic disciplinary knowledge, educational psychology, and community building both inside and outside the classroom for professional learning communities. The state’s release of new professional standards led my department members to question what is working and what needs improvement in our program. One aspect of our program that continues to be supported by research and data analysis is the function of supervised field experiences with meaningful feedback. We seek to expand in this area. Other data indicate that we have strengths in modeling a variety of approaches such as cooperative learning, discussions, literacy strategies, and workshops. In the new program, field assignments will be connected to multiple courses, and efforts to scaffold student learning to guide them toward best evidence-based practices will be continuous. Despite running a program that meets multiple sets of standards, there are areas of need that we directly address in our redesign proposal. Technology is ever-changing, so it’s inevitable that improving digital skills is a focus. In addition, scaffolding procedures for English Language Learners (ELL) or other students who struggle is imperative. 
Diversity, equity, and inclusion (DEI) has been an integral part of our curriculum, but the research indicates that more self-reflection and a deeper understanding of culturally relevant practices would help the program improve. Connections with professional learning communities will be expanded, as will leadership components, so that teacher candidates understand their role in changing the face of education. A pilot program will run in academic year 22/23, and additional data will be collected each semester through evaluations and continued program review.

Keywords: DEI, field experiences, program redesign, teacher preparation

Procedia PDF Downloads 153
24687 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated vast potential for streamlining operations, supporting decisions, and spotting business trends in fields such as manufacturing, finance, and information technology. This paper gives a multi-disciplinary overview of the research issues in big data and of its procedures, tools, and systems related to privacy, data storage management, network and energy utilization, fault tolerance, and data representation. In addition, the resulting challenges and opportunities in the Big Data platform are discussed.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 291
24686 Modeling Depth Averaged Velocity and Boundary Shear Stress Distributions

Authors: Ebissa Gadissa Kedir, C. S. P. Ojha, K. S. Hari Prasad

Abstract:

In the present study, the depth-averaged velocity and boundary shear stress in non-prismatic compound channels with three different converging floodplain angles ranging from 1.43° to 7.59° have been studied. The analytical solutions were derived by considering the forces acting on the channel beds and walls. Five key parameters, i.e., the non-dimensional coefficient, secondary flow term, secondary flow coefficient, friction factor, and dimensionless eddy viscosity, were considered and discussed. An expression for the non-dimensional coefficient and the integration constants was derived based on the boundary conditions. The model was applied to data sets from the present experiments and from other sources to examine and analyse the influence of floodplain converging angles on depth-averaged velocity and boundary shear stress distributions. The results show that the non-dimensional coefficient plays an important role in portraying the variation of depth-averaged velocity and boundary shear stress distributions with different floodplain converging angles. Thus, the variation of the non-dimensional coefficient needs attention since it affects the secondary flow term and secondary flow coefficient in both the main channel and floodplains. The analysis shows that the depth-averaged velocities are sensitive to the non-dimensional coefficient, a shear stress-dependent model parameter, and that the analytical solutions agree well with experimental data when all five parameters are included. It is inferred that the developed model may encourage the interest of others in complex flow modeling.

Keywords: depth-average velocity, converging floodplain angles, non-dimensional coefficient, non-prismatic compound channels

Procedia PDF Downloads 64
24685 Robust and Dedicated Hybrid Cloud Approach for Secure Authorized Deduplication

Authors: Aishwarya Shekhar, Himanshu Sharma

Abstract:

Data deduplication is one of the important data compression techniques for eliminating duplicate copies of repeating data and has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. In this process, duplicate data is expunged, leaving only a single instance of the data to be stored. However, an index of all data is still maintained. Data deduplication is an approach for minimizing the amount of storage space an organization requires to retain its data. In most companies, the storage systems carry identical copies of numerous pieces of data. Deduplication eliminates these additional copies by saving just one copy of the data and replacing the other copies with pointers that lead back to the primary copy. To avoid this duplication of data and to preserve confidentiality in the cloud, we apply the concept of the hybrid cloud. A hybrid cloud is a fusion of at least one public and one private cloud. As a proof of concept, we implement Java code that provides security and removes duplicated data from the cloud.
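
The single-copy-plus-pointers idea in this abstract can be sketched in a few lines. This is a minimal illustration in Python rather than the authors' Java proof of concept, and it leaves out the security and authorization aspects entirely; the class name DedupStore, its methods, and the file names are hypothetical. Content is identified by its SHA-256 digest, one blob is kept per digest, and every file name is indexed as a pointer to that blob.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: one stored copy per unique content."""

    def __init__(self):
        self._blobs = {}  # digest -> bytes (the single retained copy)
        self._index = {}  # file name -> digest (pointer to the retained copy)

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self._blobs.setdefault(digest, data)
        # The index entry is kept for every file, duplicate or not.
        self._index[name] = digest
        return digest

    def get(self, name):
        return self._blobs[self._index[name]]

    def unique_copies(self):
        return len(self._blobs)


store = DedupStore()
store.put("report_v1.txt", b"quarterly figures")
store.put("report_copy.txt", b"quarterly figures")  # duplicate content
store.put("notes.txt", b"meeting notes")

print(store.unique_copies())         # number of blobs actually stored
print(store.get("report_copy.txt"))  # resolved through the pointer
```

Hashing whole files is the simplest choice; chunk-level deduplication follows the same pattern with the digest computed per chunk.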

Keywords: confidentiality, deduplication, data compression, hybridity of cloud

Procedia PDF Downloads 368
24684 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in engineering, science, and many other domains. The potential of large or massive data is undoubtedly significant, but making sense of it requires new ways of thinking and new learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. This paper reviews the latest advances in research on machine learning for big data processing. First, it surveys the machine learning techniques used in recent studies, such as deep learning, representation learning, transfer learning, active learning, and distributed and parallel learning. It then focuses on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 424
24683 Loading and Unloading Scheduling Problem in a Multiple-Multiple Logistics Network: Modelling and Solving

Authors: Yasin Tadayonrad

Abstract:

Most supply chain networks have many nodes, from the suppliers’ side up to the customers’ side, and each node sends raw materials or products to, or receives them from, the other nodes. One of the major concerns in this kind of supply chain network is finding the best schedule for loading and unloading the shipments through the whole network, such that all the constraints in the source and destination nodes are met and all the shipments are delivered on time. One of the main constraints in this problem is the loading/unloading capacity of each source/destination node in each time slot (e.g., per week/day/hour). Because of the different characteristics of different products or groups of products, the capacity of each node might differ for each group of products. In most supply chain networks (especially in the fast-moving consumer goods industry), different planners or planning teams work separately in different nodes to determine the loading/unloading timeslots for sending and receiving the shipments. In this paper, a mathematical model is proposed to find the best timeslots for loading/unloading the shipments, minimizing the overall delay while respecting the loading/unloading capacity of each node, the required delivery date of each shipment (considering the lead times), and the working days of each node. The model was implemented in Python and solved using Python-MIP on a sample data set. Finally, the idea of a heuristic algorithm is proposed as a way of improving the solution method so that the model can be applied to larger data sets in real business cases, including more nodes and shipments.
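
The structure of the timeslot problem in this abstract (assign each shipment a slot, respect per-slot capacity and working days, minimize total delay) can be shown on a toy instance. The sketch below is not the authors' Python-MIP model: it enumerates every assignment by brute force, which only works for a handful of shipments, and it covers a single node. The function name best_schedule and all data values are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def best_schedule(due, capacity, working_slots):
    """Return (total_delay, assignment) with minimal total delay, or None.

    due:           shipment name -> slot by which it should be loaded
    capacity:      slot -> maximum number of shipments loadable in that slot
    working_slots: slots in which the node operates
    """
    names = sorted(due)
    best = None
    # Enumerate every way of giving each shipment one working slot.
    for slots in product(working_slots, repeat=len(names)):
        load = Counter(slots)
        if any(load[s] > capacity[s] for s in load):
            continue  # capacity constraint violated
        # A shipment is delayed by the number of slots past its due slot.
        delay = sum(max(0, s - due[n]) for n, s in zip(names, slots))
        if best is None or delay < best[0]:
            best = (delay, dict(zip(names, slots)))
    return best


# Toy instance: three shipments, one loading bay per slot.
due = {"A": 1, "B": 1, "C": 2}
capacity = {1: 1, 2: 1, 3: 1}
working_slots = [1, 2, 3]

result = best_schedule(due, capacity, working_slots)
print(result)
```

The search space grows as the number of slots raised to the number of shipments, which is why a mixed integer solver or a heuristic is needed for realistic networks.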

Keywords: supply chain management, transportation, multiple-multiple network, timeslots management, mathematical modeling, mixed integer programming

Procedia PDF Downloads 83
24682 3D Classification Optimization of Low-Density Airborne Light Detection and Ranging Point Cloud by Parameters Selection

Authors: Baha Eddine Aissou, Aichouche Belhadj Aissa

Abstract:

Light detection and ranging (LiDAR) is an active remote sensing technology used for several applications. Airborne LiDAR is becoming an important technology for the acquisition of highly accurate dense point clouds. The classification of an airborne laser scanning (ALS) point cloud is a very important task that remains a real challenge for many scientists. The support vector machine (SVM) is one of the most used kernel-based statistical learning algorithms. SVM is a non-parametric method, and it is recommended in cases where the data distribution cannot be well modeled by a standard parametric probability density function. Using a kernel, it performs a robust non-linear classification of samples. In practice, the data are rarely linearly separable. Through the kernel, SVMs implicitly map the data into a higher-dimensional space where they become linearly separable, while all computations are still performed in the original space. This is one of the main reasons that SVMs are well suited for high-dimensional classification problems. Only a few training samples, called support vectors, are required. SVM has also shown its potential to cope with uncertainty in data caused by noise and fluctuation, and it is computationally efficient compared to several other methods. Such properties are particularly suited to remote sensing classification problems and explain the recent adoption of SVMs. In this poster, the SVM classification of ALS LiDAR data is proposed. Firstly, connected component analysis is applied for clustering the point cloud. Secondly, the resulting clusters are incorporated in the SVM classifier. The radial basis function (RBF) kernel is used because only a few parameters (C and γ) need to be chosen, which decreases the computation time. In order to optimize the classification rates, the parameter selection is explored. It consists of finding the parameters (C and γ) leading to the best overall accuracy using grid search and 5-fold cross-validation.
The exploited LiDAR point cloud is provided by the German Society for Photogrammetry, Remote Sensing, and Geoinformation. The ALS data used are characterized by a low density (4-6 points/m²) and cover an urban area located in residential parts of the city of Vaihingen in southern Germany. The ground class and three other classes belonging to roof superstructures are considered, i.e., a total of 4 classes. The training and test sets are selected randomly several times. The obtained results demonstrate that parameter selection can narrow the search to a restricted interval of (C and γ) that can be further explored, but it does not systematically lead to the optimal rates. The SVM classifier with tuned hyper-parameters is compared with the classifiers most used in the literature for LiDAR data: random forest, AdaBoost, and decision tree. The comparison showed the superiority of the SVM classifier using parameter selection for LiDAR data compared to the other classifiers.
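
The parameter selection step (grid search scored by 5-fold cross-validation) can be sketched without any external library. To keep the example short and self-contained, the classifier below is a stand-in and not an SVM: it votes by summed RBF kernel similarity and therefore has only γ to tune, whereas a real SVM grid would also include C. The data are one-dimensional toy values, not the Vaihingen point cloud, and the names rbf, kernel_vote, cv_accuracy, and grid_search are hypothetical.

```python
import math
from itertools import product


def rbf(a, b, gamma):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def kernel_vote(train_x, train_y, test_x, gamma):
    """Stand-in classifier (NOT an SVM): class with the largest summed similarity."""
    preds = []
    for x in test_x:
        score = {}
        for xi, yi in zip(train_x, train_y):
            score[yi] = score.get(yi, 0.0) + rbf(xi, x, gamma)
        preds.append(max(score, key=score.get))
    return preds


def cv_accuracy(xs, ys, k, fit_predict, **params):
    """Mean accuracy over k folds; fold f holds out samples f, f + k, f + 2k, ..."""
    n = len(xs)
    accs = []
    for f in range(k):
        held = set(range(f, n, k))
        tr_x = [xs[i] for i in range(n) if i not in held]
        tr_y = [ys[i] for i in range(n) if i not in held]
        te_x = [xs[i] for i in sorted(held)]
        te_y = [ys[i] for i in sorted(held)]
        preds = fit_predict(tr_x, tr_y, te_x, **params)
        accs.append(sum(p == t for p, t in zip(preds, te_y)) / len(te_y))
    return sum(accs) / k


def grid_search(xs, ys, grid, fit_predict, k=5):
    """Try every parameter combination; keep the first one with the best CV accuracy."""
    names = sorted(grid)
    best_acc, best_params = -1.0, None
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        acc = cv_accuracy(xs, ys, k, fit_predict, **params)
        if acc > best_acc:
            best_acc, best_params = acc, params
    return best_acc, best_params


# Imbalanced toy data: 15 "ground" samples and 5 "roof" samples.
xs = [(0.1 * i,) for i in range(15)] + [(3.0 + 0.1 * i,) for i in range(5)]
ys = ["ground"] * 15 + ["roof"] * 5

best_acc, best_params = grid_search(
    xs, ys, {"gamma": [0.01, 0.1, 1.0, 10.0]}, kernel_vote, k=5
)
print(best_acc, best_params)
```

With small γ the kernel is so wide that the majority class outvotes the minority class everywhere, so cross-validation steers the search toward larger γ. The same harness accepts a multi-parameter grid such as {"C": [...], "gamma": [...]} when the supplied classifier takes both arguments.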

Keywords: classification, airborne LiDAR, parameters selection, support vector machine

Procedia PDF Downloads 139
24681 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the regulation of data management in Indonesia. It is a descriptive study that uses a qualitative approach and collects data from a literature review. The result is a comprehensive set of data regulations established as technical requirements for data protection by encryption methods. Regulations on encryption and on the protection of personal data reinforce one another in protecting personal data. Indonesia has two important laws, to be enacted immediately, that protect the privacy of information as part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption

Procedia PDF Downloads 166