Search results for: ignorable missing data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24325

Search results for: ignorable missing data

24175 An Ensemble Deep Learning Architecture for Imbalanced Classification of Thoracic Surgery Patients

Authors: Saba Ebrahimi, Saeed Ahmadian, Hedie Ashrafi

Abstract:

Selecting appropriate patients for surgery is one of the main issues in thoracic surgery (TS). Both short-term and long-term risks and benefits of surgery must be considered in the patient selection criteria. There are some limitations in the existing datasets of TS patients because of missing values of attributes and imbalanced distribution of survival classes. In this study, a novel ensemble architecture of deep learning networks is proposed based on stacking different linear and non-linear layers to deal with imbalance datasets. The categorical and numerical features are split using different layers with ability to shrink the unnecessary features. Then, after extracting the insight from the raw features, a novel biased-kernel layer is applied to reinforce the gradient of the minority class and cause the network to be trained better comparing the current methods. Finally, the performance and advantages of our proposed model over the existing models are examined for predicting patient survival after thoracic surgery using a real-life clinical data for lung cancer patients.

Keywords: deep learning, ensemble models, imbalanced classification, lung cancer, TS patient selection

Procedia PDF Downloads 110
24174 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, veracity and comes from a variety of sources is usually generated in all sectors including the government sector. Globally public administrations are pursuing (big) data as new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can be led to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of government (big) data ecosystem. The essential aspects of government (big) data ecosystem include definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on government (big) data ecosystem and therefore focused our research on various relevant areas like humanitarian data, open government data, scientific research data, industry data, etc.

Keywords: applications of big data, big data, big data types. big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 179
24173 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, about 1% of generated data are ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which is a subset of data analytics. The developed framework is designed to assist data analysts with little experience, in choosing the appropriate machine learning algorithm given the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 134
24172 A Partially Accelerated Life Test Planning with Competing Risks and Linear Degradation Path under Tampered Failure Rate Model

Authors: Fariba Azizi, Firoozeh Haghighi, Viliam Makis

Abstract:

In this paper, we propose a method to model the relationship between failure time and degradation for a simple step stress test where underlying degradation path is linear and different causes of failure are possible. It is assumed that the intensity function depends only on the degradation value. No assumptions are made about the distribution of the failure times. A simple step-stress test is used to shorten failure time of products and a tampered failure rate (TFR) model is proposed to describe the effect of the changing stress on the intensities. We assume that some of the products that fail during the test have a cause of failure that is only known to belong to a certain subset of all possible failures. This case is known as masking. In the presence of masking, the maximum likelihood estimates (MLEs) of the model parameters are obtained through an expectation-maximization (EM) algorithm by treating the causes of failure as missing values. The effect of incomplete information on the estimation of parameters is studied through a Monte-Carlo simulation. Finally, a real example is analyzed to illustrate the application of the proposed methods.

Keywords: cause of failure, linear degradation path, reliability function, expectation-maximization algorithm, intensity, masked data

Procedia PDF Downloads 300
24171 An Integrated Label Propagation Network for Structural Condition Assessment

Authors: Qingsong Xiong, Cheng Yuan, Qingzhao Kong, Haibei Xiong

Abstract:

Deep-learning-driven approaches based on vibration responses have attracted larger attention in rapid structural condition assessment while obtaining sufficient measured training data with corresponding labels is relevantly costly and even inaccessible in practical engineering. This study proposes an integrated label propagation network for structural condition assessment, which is able to diffuse the labels from continuously-generating measurements by intact structure to those of missing labels of damage scenarios. The integrated network is embedded with damage-sensitive features extraction by deep autoencoder and pseudo-labels propagation by optimized fuzzy clustering, the architecture and mechanism which are elaborated. With a sophisticated network design and specified strategies for improving performance, the present network achieves to extends the superiority of self-supervised representation learning, unsupervised fuzzy clustering and supervised classification algorithms into an integration aiming at assessing damage conditions. Both numerical simulations and full-scale laboratory shaking table tests of a two-story building structure were conducted to validate its capability of detecting post-earthquake damage. The identifying accuracy of a present network was 0.95 in numerical validations and an average 0.86 in laboratory case studies, respectively. It should be noted that the whole training procedure of all involved models in the network stringently doesn’t rely upon any labeled data of damage scenarios but only several samples of intact structure, which indicates a significant superiority in model adaptability and feasible applicability in practice.

Keywords: autoencoder, condition assessment, fuzzy clustering, label propagation

Procedia PDF Downloads 67
24170 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we are generating a lot of data and we, need a specific device to store all these data. Generally, we store data in pen drives, hard drives, etc. Sometimes we may loss the data due to the corruption of devices. To overcome all these issues, we implemented a cloud space for storing the data, and it provides more security to the data. We can access the data with just using the internet from anywhere in the world. We implemented all these with the java using Net beans IDE. Once user uploads the data, he does not have any rights to change the data. Users uploaded files are stored in the cloud with the file name as system time and the directory will be created with some random words. Cloud accepts the data only if the size of the file is less than 2MB.

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 172
24169 Determination of the Effective Economic and/or Demographic Indicators in Classification of European Union Member and Candidate Countries Using Partial Least Squares Discriminant Analysis

Authors: Esra Polat

Abstract:

Partial Least Squares Discriminant Analysis (PLSDA) is a statistical method for classification and consists a classical Partial Least Squares Regression (PLSR) in which the dependent variable is a categorical one expressing the class membership of each observation. PLSDA can be applied in many cases when classical discriminant analysis cannot be applied. For example, when the number of observations is low and when the number of independent variables is high. When there are missing values, PLSDA can be applied on the data that is available. Finally, it is adapted when multicollinearity between independent variables is high. The aim of this study is to determine the economic and/or demographic indicators, which are effective in grouping the 28 European Union (EU) member countries and 7 candidate countries (including potential candidates Bosnia and Herzegovina (BiH) and Kosova) by using the data set obtained from database of the World Bank for 2014. Leaving the political issues aside, the analysis is only concerned with the economic and demographic variables that have the potential influence on country’s eligibility for EU entrance. Hence, in this study, both the performance of PLSDA method in classifying the countries correctly to their pre-defined groups (candidate or member) and the differences between the EU countries and candidate countries in terms of these indicators are analyzed. As a result of the PLSDA, the value of percentage correctness of 100 % indicates that overall of the 35 countries is classified correctly. Moreover, the most important variables that determine the statuses of member and candidate countries in terms of economic indicators are identified as 'external balance on goods and services (% GDP)', 'gross domestic savings (% GDP)' and 'gross national expenditure (% GDP)' that means for the 2014 economical structure of countries is the most important determinant of EU membership. Subsequently, the model validated to prove the predictive ability by using the data set for 2015. For prediction sample, %97,14 of the countries are correctly classified. An interesting result is obtained for only BiH, which is still a potential candidate for EU, predicted as a member of EU by using the indicators data set for 2015 as a prediction sample. Although BiH has made a significant transformation from a war-torn country to a semi-functional state, ethnic tensions, nationalistic rhetoric and political disagreements are still evident, which inhibit Bosnian progress towards the EU.

Keywords: classification, demographic indicators, economic indicators, European Union, partial least squares discriminant analysis

Procedia PDF Downloads 237
24168 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business Intelligence is a methodology that exploits the data to produce information and knowledge systematically, business intelligence can support the decision-making process. Some methods in business intelligence are data warehouse and data mining. A data warehouse can store historical data from transactional data. For data modelling in data warehouse, we apply dimensional modelling by Kimball. While data mining is used to extracting patterns from the data and get insight from the data. Data mining has many techniques, one of which is segmentation. For profiling of telecommunication customer, we use customer segmentation according to customer’s usage of services, customer invoice and customer payment. Customers can be grouped according to their characteristics and can be identified the profitable customers. We apply K-Means Clustering Algorithm for segmentation. The input variable for that algorithm we use RFM (Recency, Frequency and Monetary) model. All process in data mining, we use tools IBM SPSS modeller.

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 441
24167 The Role Of Digital Technology In Crime Prevention

Authors: Muhammad Ashfaq

Abstract:

Main theme: This prime focus of this study is on the role of digital technology in crime prevention, with special focus on Cellular Forensic Unit, Capital City Police Peshawar-Khyber Pakhtunkhwa-Pakistan. Objective(s) of the study: The prime objective of this study is to provide statistics, strategies and pattern of analysis used for crime prevention in Cellular Forensic Unit of Capital City Police Peshawar, Khyber Pakhtunkhwa-Pakistan. Research Method and Procedure: Qualitative method of research has been used in the study for obtaining secondary data from research wing and Information Technology (IT) section of Peshawar police. Content analysis was the method used for the conduction of the study. This study is delimited to Capital City Police and Cellular Forensic Unit Peshawar-KP, Pakistan. information technologies. Major finding(s): It is evident that the old traditional approach will never provide solutions for better management in controlling crimes. The best way to control crimes and promotion of proactive policing is to adopt new technologies. The study reveals that technology have transformed police more effective and vigilant as compared to traditional policing. The heinous crimes like abduction, missing of an individual, snatching, burglaries and blind murder cases are now traceable with the help of technology. Recommendation(s): From the analysis of the data, it is reflected that Information Technology (IT) expert should be recruited along with research analyst to timely assist and facilitate operational as well as investigation units of police.A mobile locator should be Provided to Cellular Forensic Unit to timely apprehend the criminals .Latest digital analysis software should be provided to equip the Cellular Forensic Unit.

Keywords: crime prevention, digital technology, pakistan, police

Procedia PDF Downloads 34
24166 Recognition of Tifinagh Characters with Missing Parts Using Neural Network

Authors: El Mahdi Barrah, Said Safi, Abdessamad Malaoui

Abstract:

In this paper, we present an algorithm for reconstruction from incomplete 2D scans for tifinagh characters. This algorithm is based on using correlation between the lost block and its neighbors. This system proposed contains three main parts: pre-processing, features extraction and recognition. In the first step, we construct a database of tifinagh characters. In the second step, we will apply “shape analysis algorithm”. In classification part, we will use Neural Network. The simulation results demonstrate that the proposed method give good results.

Keywords: Tifinagh character recognition, neural networks, local cost computation, ANN

Procedia PDF Downloads 304
24165 Mobi Navi Tour for Rescue Operations

Authors: V. R. Sadasivam, M. Vipin, P. Vineeth, M. Sajith, G. Sathiskumar, R. Manikandan, N. Vijayarangan

Abstract:

Global positioning system technology is what leads to such things as navigation systems, GPS tracking devices, GPS surveying and GPS mapping. All that GPS does is provide a set of coordinates which represent the location of GPS units with respect to its latitude, longitude and elevation on planet Earth. It also provides time, which is accurate. The tracking devices themselves come in different flavors. They will contain a GPS receiver, and GPS software, along with some way of transmitting the resulting coordinates. GPS in mobile tend to use radio waves to transmit their location to another GPS device. The purpose of this prototype “Mobi Navi Tour for Rescue Operation” timely communication, and lightning fast decision-making with a group of people located in different places with a common goal. Timely communication and tracking the people are a critical issue in many situations, environments. Expedited can find missing person by sending the location and other related information to them through mobile. Information must be drawn from the caller and entered into the system by the administrator or a group leader and transferred to the group leader. This system will locate the closest available person, a group of people working in an organization/company or vehicle to determine availability and their position to track them. Misinformation cannot lead to the wrong decision in the rapidly paced environment in a normal and an abnormal situation. In “Mobi Navi Tour for Rescue Operation” we use Google Cloud Messaging for android (GCM) which is a service that helps developers send data from servers to their android applications on android devices. The service provides a simple, lightweight mechanism that servers can use to tell mobile applications to contact the server directly, to fetch updated application or user data.

Keywords: android, gps, tour, communication, service

Procedia PDF Downloads 366
24164 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, traffic monitoring and control and hence, play the vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensors data are significantly important to design an efficient big data framework. Current researches do not focus on aggregating and filtering data at multiple layers of sensor-based big data framework. Thus, this paper introduces (i) three layers data aggregation and framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer at sensors. Simulation results show that the PDDA outperforms existing tree and cluster-based data aggregation scheme in terms of overall network energy consumptions and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 298
24163 Identification of New Familial Breast Cancer Susceptibility Genes: Are We There Yet?

Authors: Ian Campbell, Gillian Mitchell, Paul James, Na Li, Ella Thompson

Abstract:

The genetic cause of the majority of multiple-case breast cancer families remains unresolved. Next generation sequencing has emerged as an efficient strategy for identifying predisposing mutations in individuals with inherited cancer. We are conducting whole exome sequence analysis of germ line DNA from multiple affected relatives from breast cancer families, with the aim of identifying rare protein truncating and non-synonymous variants that are likely to include novel cancer predisposing mutations. Data from more than 200 exomes show that on average each individual carries 30-50 protein truncating mutations and 300-400 rare non-synonymous variants. Heterogeneity among our exome data strongly suggest that numerous moderate penetrance genes remain to be discovered, with each gene individually accounting for only a small fraction of families (~0.5%). This scenario marks validation of candidate breast cancer predisposing genes in large case-control studies as the rate-limiting step in resolving the missing heritability of breast cancer. The aim of this study is to screen genes that are recurrently mutated among our exome data in a larger cohort of cases and controls to assess the prevalence of inactivating mutations that may be associated with breast cancer risk. We are using the Agilent HaloPlex Target Enrichment System to screen the coding regions of 168 genes in 1,000 BRCA1/2 mutation-negative familial breast cancer cases and 1,000 cancer-naive controls. To date, our interim analysis has identified 21 genes which carry an excess of truncating mutations in multiple breast cancer families versus controls. Established breast cancer susceptibility gene PALB2 is the most frequently mutated gene (13/998 cases versus 0/1009 controls), but other interesting candidates include NPSR1, GSN, POLD2, and TOX3. These and other genes are being validated in a second cohort of 1,000 cases and controls. Our experience demonstrates that beyond PALB2, the prevalence of mutations in the remaining breast cancer predisposition genes is likely to be very low making definitive validation exceptionally challenging.

Keywords: predisposition, familial, exome sequencing, breast cancer

Procedia PDF Downloads 459
24162 Inverse Heat Conduction Analysis of Cooling on Run-Out Tables

Authors: M. S. Gadala, Khaled Ahmed, Elasadig Mahdi

Abstract:

In this paper, we introduced a gradient-based inverse solver to obtain the missing boundary conditions based on the readings of internal thermocouples. The results show that the method is very sensitive to measurement errors, and becomes unstable when small time steps are used. The artificial neural networks are shown to be capable of capturing the whole thermal history on the run-out table, but are not very effective in restoring the detailed behavior of the boundary conditions. Also, they behave poorly in nonlinear cases and where the boundary condition profile is different. GA and PSO are more effective in finding a detailed representation of the time-varying boundary conditions, as well as in nonlinear cases. However, their convergence takes longer. A variation of the basic PSO, called CRPSO, showed the best performance among the three versions. Also, PSO proved to be effective in handling noisy data, especially when its performance parameters were tuned. An increase in the self-confidence parameter was also found to be effective, as it increased the global search capabilities of the algorithm. RPSO was the most effective variation in dealing with noise, closely followed by CRPSO. The latter variation is recommended for inverse heat conduction problems, as it combines the efficiency and effectiveness required by these problems.

Keywords: inverse analysis, function specification, neural net works, particle swarm, run-out table

Procedia PDF Downloads 209
24161 Artificial Intelligence and Development: The Missing Link

Authors: Driss Kettani

Abstract:

ICT4D actors are naturally attempted to include AI in the range of enabling technologies and tools that could support and boost the Development process, and to refer to these as AI4D. But, doing so, assumes that AI complies with the very specific features of ICT4D context, including, among others, affordability, relevance, openness, and ownership. Clearly, none of these is fulfilled, and the enthusiastic posture that AI4D is a natural part of ICT4D is not grounded and, to certain extent, does not serve the purpose of Technology for Development at all. In the context of Development, it is important to emphasize and prioritize ICT4D, in the national digital transformation strategies, instead of borrowing "trendy" waves of the IT Industry that are motivated by business considerations, with no specific care/consideration to Development.

Keywords: AI, ICT4D, technology for development, position paper

Procedia PDF Downloads 20
24160 Statistical Model to Examine the Impact of the Inflation Rate and Real Interest Rate on the Bahrain Economy

Authors: Ghada Abo-Zaid

Abstract:

Introduction: Oil is one of the most income source in Bahrain. Low oil price influence on the economy growth and the investment rate in Bahrain. For example, the economic growth was 3.7% in 2012, and it reduced to 2.9% in 2015. Investment rate was 9.8% in 2012, and it is reduced to be 5.9% and -12.1% in 2014 and 2015, respectively. The inflation rate is increased to the peak point in 2013 with 3.3 %. Objectives: The objectives here are to build statistical models to examine the effect of the interest rate inflation rate on the growth economy in Bahrain from 2000 to 2018. Methods: This study based on 18 years, and the multiple regression model is used for the analysis. All of the missing data are omitted from the analysis. Results: Regression model is used to examine the association between the Growth national product (GNP), the inflation rate, and real interest rate. We found that (i) Increase the real interest rate decrease the GNP. (ii) Increase the inflation rate does not effect on the growth economy in Bahrain since the average of the inflation rate was almost 2%, and this is considered as a low percentage. Conclusion: There is a positive impact of the real interest rate on the GNP in Bahrain. While the inflation rate does not show any negative influence on the GNP as the inflation rate was not large enough to effect negatively on the economy growth rate in Bahrain.

Keywords: growth national product, egypt, regression model, interest rate

Procedia PDF Downloads 122
24159 Kindergarten Children’s Reactions to the COVID-19 Pandemic: Creating a Sense of Coherence

Authors: Bilha Paryente, Roni Gez Langerman

Abstract:

Background and Objectives: The current study focused on how kindergarten children have experienced the COVID-19 pandemic. The main goals were understanding children’s emotions, coping strategies, and thoughts regarding the presence of the COVID-19 virus in their daily lives, using the salute genic approach to study their sense of coherence, and to promote relevant professional instruction. Design and Method: Semistructured in-depth interviews were held with 130 five- to six-year-old children, with an equal number of boys and girls. All of the children were recruited from kindergartens affiliated with the state's secular education system. Results: Data were structured into three themes: 1) the child’s pandemic perception as manageable through meaningful accompanying and missing figures; 2) the child’s comprehension of the virus as dangerous, age differentiating, and contagious. 3) the child’s emotional processing of the pandemic as arousing fear of death and, through images, as thorny and as a monster. Conclusions: Results demonstrate the young children’s sense of coherence, characterized as extrapersonal perception, interpersonal coping, and intrapersonal emotional processing, and the need for greater acknowledgement of child-parent educators' informed interventions that could give children a partial feeling of the adult’s awareness of their needs.

Keywords: kindergarten children, continuous stress, COVID-19, salutogenic approach

Procedia PDF Downloads 146
24158 Adolescent-Parent Relationship as the Most Important Factor in Preventing Mood Disorders in Adolescents: An Application of Artificial Intelligence to Social Studies

Authors: Elżbieta Turska

Abstract:

Introduction: One of the most difficult times in a person’s life is adolescence. The experiences in this period may shape the future life of this person to a large extent. This is the reason why many young people experience sadness, dejection, hopelessness, sense of worthlessness, as well as losing interest in various activities and social relationships, all of which are often classified as mood disorders. As many as 15-40% adolescents experience depressed moods and for most of them they resolve and are not carried into adulthood. However, (5-6%) of those affected by mood disorders develop the depressive syndrome and as many as (1-3%) develop full-blown clinical depression. Materials: A large questionnaire was given to 2508 students, aged 13–16 years old, and one of its parts was the Burns checklist, i.e. the standard test for identifying depressed mood. The questionnaire asked about many aspects of the student’s life, it included a total of 53 questions, most of which had subquestions. It is important to note that the data suffered from many problems, the most important of which were missing data and collinearity. Aim: In order to identify the correlates of mood disorders we built predictive models which were then trained and validated. Our aim was not to be able to predict which students suffer from mood disorders but rather to explore the factors influencing mood disorders. Methods: The problems with data described above practically excluded using all classical statistical methods. For this reason, we attempted to use the following Artificial Intelligence (AI) methods: classification trees with surrogate variables, random forests and xgboost. All analyses were carried out with the use of the mlr package for the R programming language. Resuts: The predictive model built by classification trees algorithm outperformed the other algorithms by a large margin. As a result, we were able to rank the variables (questions and subquestions from the questionnaire) from the most to least influential as far as protection against mood disorder is concerned. Thirteen out of twenty most important variables reflect the relationships with parents. This seems to be a really significant result both from the cognitive point of view and also from the practical point of view, i.e. as far as interventions to correct mood disorders are concerned.

Keywords: mood disorders, adolescents, family, artificial intelligence

Procedia PDF Downloads 72
24157 Improved Computational Efficiency of Machine Learning Algorithm Based on Evaluation Metrics to Control the Spread of Coronavirus in the UK

Authors: Swathi Ganesan, Nalinda Somasiri, Rebecca Jeyavadhanam, Gayathri Karthick

Abstract:

The COVID-19 crisis presents a substantial and critical hazard to worldwide health. Since the occurrence of the disease in late January 2020 in the UK, the number of infected people confirmed to acquire the illness has increased tremendously across the country, and the number of individuals affected is undoubtedly considerably high. The purpose of this research is to figure out a predictive machine learning archetypal that could forecast COVID-19 cases within the UK. This study concentrates on the statistical data collected from 31st January 2020 to 31st March 2021 in the United Kingdom. Information on total COVID cases registered, new cases encountered on a daily basis, total death registered, and patients’ death per day due to Coronavirus is collected from World Health Organisation (WHO). Data preprocessing is carried out to identify any missing values, outliers, or anomalies in the dataset. The data is split into 8:2 ratio for training and testing purposes to forecast future new COVID cases. Support Vector Machines (SVM), Random Forests, and linear regression algorithms are chosen to study the model performance in the prediction of new COVID-19 cases. From the evaluation metrics such as r-squared value and mean squared error, the statistical performance of the model in predicting the new COVID cases is evaluated. Random Forest outperformed the other two Machine Learning algorithms with a training accuracy of 99.47% and testing accuracy of 98.26% when n=30. The mean square error obtained for Random Forest is 4.05e11, which is lesser compared to the other predictive models used for this study. From the experimental analysis Random Forest algorithm can perform more effectively and efficiently in predicting the new COVID cases, which could help the health sector to take relevant control measures for the spread of the virus.

Keywords: COVID-19, machine learning, supervised learning, unsupervised learning, linear regression, support vector machine, random forest

Procedia PDF Downloads 85
24156 Secondary Charged Fragments Tracking for On-Line Beam Range Monitoring in Particle Therapy

Authors: G. Traini, G. Battistoni, F. Collamati, E. De Lucia, R. Faccini, C. Mancini-Terracciano, M. Marafini, I. Mattei, S. Muraro, A. Sarti, A. Sciubba, E. Solfaroli Camillocci, M. Toppi, S. M. Valle, C. Voena, V. Patera

Abstract:

In Particle Therapy (PT) treatments a large amount of secondary particles, whose emission point is correlated to the dose released in the crossed tissues, is produced. The measurement of the secondary charged fragments component could represent a valid technique to monitor the beam range during the PT treatments, that is a still missing item in the clinical practice. A sub-millimetrical precision on the beam range measurement is required to significantly optimise the technique and to improve the treatment quality. In this contribution, a detector, named Dose Profiler (DP), is presented. It is specifically planned to monitor on-line the beam range exploiting the secondary charged particles produced in PT Carbon ions treatment. In particular, the DP is designed to track the secondary fragments emitted at large angles with respect to the beam direction (mainly protons), with the aim to reconstruct the spatial coordinates of the fragment emission point extrapolating the measured track toward the beam axis. The DP is currently under development within of the INSIDE collaboration (Innovative Solutions for In-beam Dosimetry in hadrontherapy). The tracker is made by six layers (20 × 20 cm²) of BCF-12 square scintillating fibres (500 μm) coupled to Silicon Photo-Multipliers, followed by two plastic scintillator layers of 6 mm thickness. A system of front-end boards based on FPGAs arranged around the detector provides the data acquisition. The detector characterization with cosmic rays is currently undergoing, and a data taking campaign with protons will take place in May 2017. The DP design and the performances measured with using MIPs and protons beam will be reviewed.

Keywords: fragmentation, monitoring, particle therapy, tracking

Procedia PDF Downloads 199
24155 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have within a short period of time. Consequently this fast increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and its current trends. This paper presents a complete discussion on state-of-the-art big data technologies based on group and stream data processing. Moreover, strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendor, several open research challenges and the chances brought about by big data. The similarities and differences of these techniques and technologies based on important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, it community, industry, big data

Procedia PDF Downloads 156
24154 Efficacy and Safety of Electrical Vestibular Stimulation on Adults with Symptoms of Insomnia: A Double-Blind, Randomized, Sham-Controlled Trial

Authors: Teris Cheung, Joyce Yuen Ting Lam, Kwan Hin Fong, Calvin Pak-Wing Cheng, Julie Sittlington, Yu-Tao Xiang, Tim Man Ho Li

Abstract:

Insomnia is one of the most common health problems in the general population. Insomnia can be acute, intermittent, and become chronic, often due to comorbidity with other physical and mental health conditions. Although there are conventional pharmaceutical and psychotherapeutic treatments to treat symptoms of insomnia, however; there is no robust and novel randomized controlled trial (RCT) using transdermal neurostimulation on individuals with insomnia symptoms. This gives us the impetus to execute the first nationwide RCT. Aim: To evaluate the efficacy of Electrical Vestibular Stimulation (VeNS) on individuals with insomnia in Hong Kong. Design: This study was a two-armed, double blinded, randomized, sham-controlled trial. Sampling: 60 community-dwelling adults aged 18 and 60 years with moderate insomnia symptoms or above (Insomnia Severity Index > 14) were recruited. All subjects were computerized randomized into either the active VeNS group or the sham VeNS group on a 1:1 ratio. Intervention: All participants received a home-use VeNS device and used 30-min VeNS sessions during five consecutive days across a 4-week period (total treatment hours: 10). Baseline measurements and post-VeNS evaluation of the psychological outcomes, including 1) insomnia severity, 2) sleep quality, and 3) quality of life were investigated. The short-and long-term sustainability of the VeNS intervention was assessed immediately after poststim and at a 1-month and 3-month follow-up period. Data analysis: A mixed GEE model was used to analyze the repeated measures data. Missing data were managed by multiple imputations. The level of significance was set to p < 0.05. Significance of the study: This is the first trial to examine the efficacy and safety of VeNS among adults with insomnia symptoms in Hong Kong. Findings that emerged were used to determine whether this VeNS device can be considered a self-help technological device to reduce the severity of insomnia in the community setting and to reduce the global disease burden. Clinical Trial Registration: ClinicalTrials.gov, identifier: NCT04452981.

Keywords: adults, insomnia, neuromodulation, rct, vestibular stimulation

Procedia PDF Downloads 46
24153 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the multiplied data growth, many computer science tools have been developed to process and analyze these Big Data. High-performance computing architectures have been designed to meet the treatment needs of Big Data (view transaction processing standpoint, strategic, and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures especially what has a relation with Analytics and Data Mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 483
24152 Ensemble Machine Learning Approach for Estimating Missing Data from CO₂ Time Series

Authors: Atbin Mahabbati, Jason Beringer, Matthias Leopold

Abstract:

To address the global challenges of climate and environmental changes, there is a need for quantifying and reducing uncertainties in environmental data, including observations of carbon, water, and energy. Global eddy covariance flux tower networks (FLUXNET), and their regional counterparts (i.e., OzFlux, AmeriFlux, China Flux, etc.) were established in the late 1990s and early 2000s to address the demand. Despite the capability of eddy covariance in validating process modelling analyses, field surveys and remote sensing assessments, there are some serious concerns regarding the challenges associated with the technique, e.g. data gaps and uncertainties. To address these concerns, this research has developed an ensemble model to fill the data gaps of CO₂ flux to avoid the limitations of using a single algorithm, and therefore, provide less error and decline the uncertainties associated with the gap-filling process. In this study, the data of five towers in the OzFlux Network (Alice Springs Mulga, Calperum, Gingin, Howard Springs and Tumbarumba) during 2013 were used to develop an ensemble machine learning model, using five feedforward neural networks (FFNN) with different structures combined with an eXtreme Gradient Boosting (XGB) algorithm. The former methods, FFNN, provided the primary estimations in the first layer, while the later, XGB, used the outputs of the first layer as its input to provide the final estimations of CO₂ flux. The introduced model showed slight superiority over each single FFNN and the XGB, while each of these two methods was used individually, overall RMSE: 2.64, 2.91, and 3.54 g C m⁻² yr⁻¹ respectively (3.54 provided by the best FFNN). The most significant improvement happened to the estimation of the extreme diurnal values (during midday and sunrise), as well as nocturnal estimations, which is generally considered as one of the most challenging parts of CO₂ flux gap-filling. The towers, as well as seasonality, showed different levels of sensitivity to improvements provided by the ensemble model. For instance, Tumbarumba showed more sensitivity compared to Calperum, where the differences between the Ensemble model on the one hand and the FFNNs and XGB, on the other hand, were the least of all 5 sites. Besides, the performance difference between the ensemble model and its components individually were more significant during the warm season (Jan, Feb, Mar, Oct, Nov, and Dec) compared to the cold season (Apr, May, Jun, Jul, Aug, and Sep) due to the higher amount of photosynthesis of plants, which led to a larger range of CO₂ exchange. In conclusion, the introduced ensemble model slightly improved the accuracy of CO₂ flux gap-filling and robustness of the model. Therefore, using ensemble machine learning models is potentially capable of improving data estimation and regression outcome when it seems to be no more room for improvement while using a single algorithm.

Keywords: carbon flux, Eddy covariance, extreme gradient boosting, gap-filling comparison, hybrid model, OzFlux network

Procedia PDF Downloads 107
24151 The Role of Digital Technology in Crime Prevention: A Case Study of Cellular Forensics Unit, Capital City Police Peshawar

Authors: Muhammad Ashfaq

Abstract:

Main theme: This prime focus of this study is on the role of digital technology in crime prevention, with special focus on Cellular Forensic Unit, Capital City Police Peshawar-Khyber Pakhtunkhwa-Pakistan. Objective(s) of the study: The prime objective of this study is to provide statistics, strategies, and pattern of analysis used for crime prevention in Cellular Forensic Unit of Capital City Police Peshawar, Khyber Pakhtunkhwa-Pakistan. Research Method and Procedure: Qualitative method of research has been used in the study for obtaining secondary data from research wing and Information Technology (IT) section of Peshawar police. Content analysis was the method used for the conduction of the study. This study is delimited to Capital City Police and Cellular Forensic Unit Peshawar-KP, Pakistan. information technologies. Major finding(s): It is evident that the old traditional approach will never provide solutions for better management in controlling crimes. The best way to control crimes and promotion of proactive policing is to adopt new technologies. The study reveals that technology have transformed police more effective and vigilant as compared to traditional policing. The heinous crimes like abduction, missing of an individual, snatching, burglaries, and blind murder cases are now traceable with the help of technology. Recommendation(s): From the analysis of the data, it is reflected that Information Technology (IT) expert should be recruited along with research analyst to timely assist and facilitate operational as well as investigation units of police. A mobile locator should be Provided to Cellular Forensic Unit to timely apprehend the criminals. Latest digital analysis software should be provided to equip the Cellular Forensic Unit.

Keywords: criminology-pakistan, crime prevention-KP, digital forensics, digital technology-pakistan

Procedia PDF Downloads 57
24150 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories

Authors: Prashant Shrivastava

Abstract:

The purpose of this study is to explore re3dat.org registry to identify research data repositories registration workflow process. Further objective is to depict a graph for present development of research data repositories in India. Preliminarily with an approach to understand re3data.org registry framework and schema design then further proceed to explore the status of research data repositories of India in re3data.org registry. Research data repositories are getting wider relevance due to e-research concepts. Now available registry re3data.org is a good tool for users and researchers to identify appropriate research data repositories as per their research requirements. In Indian environment, a compatible National Research Data Policy is the need of the time to boost the management of research data. Registry for Research Data Repositories is a crucial tool to discover specific information in specific domain. Also, Research Data Repositories in India have not been studied. Re3data.org registry and status of Indian research data repositories both discussed in this study.

Keywords: research data, research data repositories, research data registry, re3data.org

Procedia PDF Downloads 293
24149 Products in Early Development Phases: Ecological Classification and Evaluation Using an Interval Arithmetic Based Calculation Approach

Authors: Helen L. Hein, Joachim Schwarte

Abstract:

As a pillar of sustainable development, ecology has become an important milestone in research community, especially due to global challenges like climate change. The ecological performance of products can be scientifically conducted with life cycle assessments. In the construction sector, significant amounts of CO2 emissions are assigned to the energy used for building heating purposes. Therefore, sustainable construction materials for insulating purposes are substantial, whereby aerogels have been explored intensively in the last years due to their low thermal conductivity. Therefore, the WALL-ACE project aims to develop an aerogel-based thermal insulating plaster that would achieve minor thermal conductivities. But as in the early stage of development phases, a lot of information is still missing or not yet accessible, the ecological performance of innovative products bases increasingly on uncertain data that can lead to significant deviations in the results. To be able to predict realistically how meaningful the results are and how viable the developed products may be with regard to their corresponding respective market, these deviations however have to be considered. Therefore, a classification method is presented in this study, which may allow comparing the ecological performance of modern products with already established and competitive materials. In order to achieve this, an alternative calculation method was used that allows computing with lower and upper bounds to consider all possible values without precise data. The life cycle analysis of the considered products was conducted with an interval arithmetic based calculation method. The results lead to the conclusion that the interval solutions describing the possible environmental impacts are so wide that the result usability is limited. Nevertheless, a further optimization in reducing environmental impacts of aerogels seems to be needed to become more competitive in the future.

Keywords: aerogel-based, insulating material, early development phase, interval arithmetic

Procedia PDF Downloads 118
24148 A Study of Cloud Computing Solution for Transportation Big Data Processing

Authors: Ilgin Gökaşar, Saman Ghaffarian

Abstract:

The need for fast processed big data of transportation ridership (eg., smartcard data) and traffic operation (e.g., traffic detectors data) which requires a lot of computational power is incontrovertible in Intelligent Transportation Systems. Nowadays cloud computing is one of the important subjects and popular information technology solution for data processing. It enables users to process enormous measure of data without having their own particular computing power. Thus, it can also be a good selection for transportation big data processing as well. This paper intends to examine how the cloud computing can enhance transportation big data process with contrasting its advantages and disadvantages, and discussing cloud computing features.

Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing

Procedia PDF Downloads 421
24147 The Role of Digital Technology in Crime Prevention: a Case Study of Cellular Forensics Unit, Capital City Police Peshawar-Pakistan

Authors: Muhammad Ashfaq

Abstract:

Main theme: This prime focus of this study is on the role of digital technology in crime prevention, with special focus on Cellular Forensic Unit, Capital City Police Peshawar-Khyber Pakhtunkhwa-Pakistan. Objective(s) of the study: The prime objective of this study is to provide statistics, strategies and pattern of analysis used for crime prevention in Cellular Forensic Unit of Capital City Police Peshawar, Khyber Pakhtunkhwa-Pakistan. Research Method and Procedure: Qualitative method of research has been used in the study for obtaining secondary data from research wing and Information Technology (IT) section of Peshawar police. Content analysis was the method used for the conduction of the study. This study is delimited to Capital City Police and Cellular Forensic Unit Peshawar-KP, Pakistan. information technologies.Major finding(s): It is evident that the old traditional approach will never provide solutions for better management in controlling crimes. The best way to control crimes and promotion of proactive policing is to adopt new technologies. The study reveals that technology have transformed police more effective and vigilant as compared to traditional policing. The heinous crimes like abduction, missing of an individual, snatching, burglaries and blind murder cases are now traceable with the help of technology.Recommendation(s): From the analysis of the data, it is reflected that Information Technology (IT) expert should be recruited along with research analyst to timely assist and facilitate operational as well as investigation units of police .A mobile locator should be Provided to Cellular Forensic Unit to timely apprehend the criminals .Latest digital analysis software should be provided to equip the Cellular Forensic Unit.

Keywords: crime-prevention, cellular-forensic unit-pakistan, crime prevention-digital-pakistan, crminology-pakistan

Procedia PDF Downloads 50
24146 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data to be ready for data mining exploration take up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected from an actual harmonic monitoring system in a distribution system in Australia for three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes to better understanding of the data have been presented. Underlying classes in this data has then been identified using clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 215