Search results for: R data science
24828 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency
Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami
Abstract:
Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means
Procedia PDF Downloads 26024827 PARP1 Links Transcription of a Subset of RBL2-Dependent Genes with Cell Cycle Progression
Authors: Ewelina Wisnik, Zsolt Regdon, Kinga Chmielewska, Laszlo Virag, Agnieszka Robaszkiewicz
Abstract:
Apart from protecting genome, PARP1 has been documented to regulate many intracellular processes inter alia gene transcription by physically interacting with chromatin bound proteins and by their ADP-ribosylation. Our recent findings indicate that expression of PARP1 decreases during the differentiation of human CD34+ hematopoietic stem cells to monocytes as a consequence of differentiation-associated cell growth arrest and formation of E2F4-RBL2-HDAC1-SWI/SNF repressive complex at the promoter of this gene. Since the RBL2 complexes repress genes in a E2F-dependent manner and are widespread in the genome in G0 arrested cells, we asked (a) if RBL2 directly contributes to defining monocyte phenotype and function by targeting gene promoters and (b) if RBL2 controls gene transcription indirectly by repressing PARP1. For identification of genes controlled by RBL2 and/or PARP1,we used primer libraries for surface receptors and TLR signaling mediators, genes were silenced by siRNA or shRNA, analysis of gene promoter occupation by selected proteins was carried out by ChIP-qPCR, while statistical analysis in GraphPad Prism 5 and STATISTICA, ChIP-Seq data were analysed in Galaxy 2.5.0.0. On the list of 28 genes regulated by RBL2, we identified only four solely repressed by RBL2-E2F4-HDAC1-BRM complex. Surprisingly, 24 out of 28 emerged genes controlled by RBL2 were co-regulated by PARP1 in six different manners. In one mode of RBL2/PARP1 co-operation, represented by MAP2K6 and MAPK3, PARP1 was found to associate with gene promoters upon RBL2 silencing, which was previously shown to restore PARP1 expression in monocytes. PARP1 effect on gene transcription was observed only in the presence of active EP300, which acetylated gene promoters and activated transcription. Further analysis revealed that PARP1 binding to MA2K6 and MAPK3 promoters enabled recruitment of EP300 in monocytes, while in proliferating cancer cell lines, which actively transcribe PARP1, this protein maintained EP300 at the promoters of MA2K6 and MAPK3. Genome-wide analysis revealed a similar distribution of PARP1 and EP300 around transcription start sites and the co-occupancy of some gene promoters by PARP1 and EP300 in cancer cells. Here, we described a new RBL2/PARP1/EP300 axis which controls gene transcription regardless of the cell type. In this model cell, cycle-dependent transcription of PARP1 regulates expression of some genes repressed by RBL2 upon cell cycle limitation. Thus, RBL2 may indirectly regulate transcription of some genes by controlling the expression of EP300-recruiting PARP1. Acknowledgement: This work was financed by Polish National Science Centre grants nr DEC-2013/11/D/NZ2/00033 and DEC-2015/19/N/NZ2/01735. L.V. is funded by the National Research, Development and Innovation Office grants GINOP-2.3.2-15-2016-00020 TUMORDNS, GINOP-2.3.2-15-2016-00048-STAYALIVE and OTKA K112336. AR is supported by Polish Ministry of Science and Higher Education 776/STYP/11/2016.Keywords: retinoblastoma transcriptional co-repressor like 2 (RBL2), poly(ADP-ribose) polymerase 1 (PARP1), E1A binding protein p300 (EP300), monocytes
Procedia PDF Downloads 21024826 Structural Equation Modeling Semiparametric Truncated Spline Using Simulation Data
Authors: Adji Achmad Rinaldo Fernandes
Abstract:
SEM analysis is a complex multivariate analysis because it involves a number of exogenous and endogenous variables that are interconnected to form a model. The measurement model is divided into two, namely, the reflective model (reflecting) and the formative model (forming). Before carrying out further tests on SEM, there are assumptions that must be met, namely the linearity assumption, to determine the form of the relationship. There are three modeling approaches to path analysis, including parametric, nonparametric and semiparametric approaches. The aim of this research is to develop semiparametric SEM and obtain the best model. The data used in the research is secondary data as the basis for the process of obtaining simulation data. Simulation data was generated with various sample sizes of 100, 300, and 500. In the semiparametric SEM analysis, the form of the relationship studied was determined, namely linear and quadratic and determined one and two knot points with various levels of error variance (EV=0.5; 1; 5). There are three levels of closeness of relationship for the analysis process in the measurement model consisting of low (0.1-0.3), medium (0.4-0.6) and high (0.7-0.9) levels of closeness. The best model lies in the form of the relationship X1Y1 linear, and. In the measurement model, a characteristic of the reflective model is obtained, namely that the higher the closeness of the relationship, the better the model obtained. The originality of this research is the development of semiparametric SEM, which has not been widely studied by researchers.Keywords: semiparametric SEM, measurement model, structural model, reflective model, formative model
Procedia PDF Downloads 4124825 Quality Assurance for the Climate Data Store
Authors: Judith Klostermann, Miguel Segura, Wilma Jans, Dragana Bojovic, Isadora Christel Jimenez, Francisco Doblas-Reyees, Judit Snethlage
Abstract:
The Climate Data Store (CDS), developed by the Copernicus Climate Change Service (C3S) implemented by the European Centre for Medium-Range Weather Forecasts (ECMWF) on behalf of the European Union, is intended to become a key instrument for exploring climate data. The CDS contains both raw and processed data to provide information to the users about the past, present and future climate of the earth. It allows for easy and free access to climate data and indicators, presenting an important asset for scientists and stakeholders on the path for achieving a more sustainable future. The C3S Evaluation and Quality Control (EQC) is assessing the quality of the CDS by undertaking a comprehensive user requirement assessment to measure the users’ satisfaction. Recommendations will be developed for the improvement and expansion of the CDS datasets and products. User requirements will be identified on the fitness of the datasets, the toolbox, and the overall CDS service. The EQC function of the CDS will help C3S to make the service more robust: integrated by validated data that follows high-quality standards while being user-friendly. This function will be closely developed with the users of the service. Through their feedback, suggestions, and contributions, the CDS can become more accessible and meet the requirements for a diverse range of users. Stakeholders and their active engagement are thus an important aspect of CDS development. This will be achieved with direct interactions with users such as meetings, interviews or workshops as well as different feedback mechanisms like surveys or helpdesk services at the CDS. The results provided by the users will be categorized as a function of CDS products so that their specific interests will be monitored and linked to the right product. Through this procedure, we will identify the requirements and criteria for data and products in order to build the correspondent recommendations for the improvement and expansion of the CDS datasets and products.Keywords: climate data store, Copernicus, quality, user engagement
Procedia PDF Downloads 14624824 Quantifying the Methods of Monitoring Timers in Electric Water Heater for Grid Balancing on Demand-Side Management: A Systematic Mapping Review
Authors: Yamamah Abdulrazaq, Lahieb A. Abrahim, Samuel E. Davies, Iain Shewring
Abstract:
An electric water heater (EWH) is a powerful appliance that uses electricity in residential, commercial, and industrial settings, and the ability to control them properly will result in cost savings and the prevention of blackouts on the national grid. This article discusses the usage of timers in EWH control strategies for demand-side management (DSM). Up to the authors' knowledge, there is no systematic mapping review focusing on the utilisation of EWH control strategies in DSM has yet been conducted. Consequently, the purpose of this research is to identify and examine main papers exploring EWH procedures in DSM by quantifying and categorising information with regard to publication year and source, kind of methods, and source of data for monitoring control techniques. In order to answer the research questions, a total of 31 publications published between 1999 and 2023 were selected depending on specific inclusion and exclusion criteria. The data indicate that direct load control (DLC) has been somewhat more prevalent than indirect load control (ILC). Additionally, the mixing method is much lower than the other techniques, and the proportion of Real-time data (RTD) to non-real-time data (NRTD) is about equal.Keywords: demand side management, direct load control, electric water heater, indirect load control, non real-time data, real-time data
Procedia PDF Downloads 8224823 Internet Economy: Enhancing Information Communication Technology Adaptation, Service Delivery, Content and Digital Skills for Small Holder Farmers in Uganda
Authors: Baker Ssekitto, Ambrose Mbogo
Abstract:
The study reveals that indeed agriculture employs over 70% of Uganda’s population, of which majority are youth and women. The study further reveals that over 70% of the farmers are smallholder farmers based in rural areas, whose operations are greatly affected by; climate change, weak digital skills, limited access to productivity knowledge along value chains, limited access to quality farm inputs, weak logistics systems, limited access to quality extension services, weak business intelligence, limited access to quality markets among others. It finds that the emerging 4th industrial revolution powered by artificial intelligence, 5G and data science will provide possibilities of addressing some of these challenges. Furthermore, the study finds that despite rapid development of ICT4Agric Innovation, their uptake is constrained by a number of factors including; limited awareness of these innovations, low internet and smart phone penetration especially in rural areas, lack of appropriate digital skills, inappropriate programmes implementation models which are project and donor driven, limited articulation of value addition to various stakeholders among others. Majority of farmers and other value chain actors lacked knowledge and skills to harness the power of ICTs, especially their application of ICTs in monitoring and evaluation on quality of service in the extension system and farm level processes.Keywords: artificial intelligence, productivity, ICT4agriculture, value chain, logistics
Procedia PDF Downloads 7824822 Implications of Circular Economy on Users Data Privacy: A Case Study on Android Smartphones Second-Hand Market
Authors: Mariia Khramova, Sergio Martinez, Duc Nguyen
Abstract:
Modern electronic devices, particularly smartphones, are characterised by extremely high environmental footprint and short product lifecycle. Every year manufacturers release new models with even more superior performance, which pushes the customers towards new purchases. As a result, millions of devices are being accumulated in the urban mine. To tackle these challenges the concept of circular economy has been introduced to promote repair, reuse and recycle of electronics. In this case, electronic devices, that previously ended up in landfills or households, are getting the second life, therefore, reducing the demand for new raw materials. Smartphone reuse is gradually gaining wider adoption partly due to the price increase of flagship models, consequently, boosting circular economy implementation. However, along with reuse of communication device, circular economy approach needs to ensure the data of the previous user have not been 'reused' together with a device. This is especially important since modern smartphones are comparable with computers in terms of performance and amount of data stored. These data vary from pictures, videos, call logs to social security numbers, passport and credit card details, from personal information to corporate confidential data. To assess how well the data privacy requirements are followed on smartphones second-hand market, a sample of 100 Android smartphones has been purchased from IT Asset Disposition (ITAD) facilities responsible for data erasure and resell. Although devices should not have stored any user data by the time they leave ITAD, it has been possible to retrieve the data from 19% of the sample. Applied techniques varied from manual device inspection to sophisticated equipment and tools. These findings indicate significant barrier in implementation of circular economy and a limitation of smartphone reuse. Therefore, in order to motivate the users to donate or sell their old devices and make electronic use more sustainable, data privacy on second-hand smartphone market should be significantly improved. Presented research has been carried out in the framework of sustainablySMART project, which is part of Horizon 2020 EU Framework Programme for Research and Innovation.Keywords: android, circular economy, data privacy, second-hand phones
Procedia PDF Downloads 12824821 Development of Muay Thai Competition Management for Promoting Sport Tourism in the next Decade (2015-2024)
Authors: Supasak Ngaoprasertwong
Abstract:
The purpose of this research was to develop a model for Muay Thai competition management for promoting sport tourism in the next decade. Moreover, the model was appropriately initiated for practical use. This study also combined several methodologies, both quantitative research and qualitative research, to entirely cover all aspects of data, especially the tourists’ satisfaction toward Muay Thai competition. The data were collected from 400 tourists watching Muay Thai competition in 4 stadiums to create the model for Muay Thai competition to support the sport tourism in the next decade. Besides, Ethnographic Delphi Futures Research (EDFR) was applied to gather the data from certain experts in boxing industry or having significant role in Muay Thai competition in both public sector and private sector. The first step of data collection was an in-depth interview with 27 experts associated with Muay Thai competition, Muay Thai management, and tourism. The second step and the third step of data collection were conducted to confirm the experts’ opinions toward various elements. When the 3 steps of data collection were completely accomplished, all data were assembled to draft the model. Then the model was proposed to 8 experts to conduct a brainstorming to affirm it. According to the results of quantitative research, it found that the tourists were satisfied with personnel of competition at high level (x=3.87), followed by facilities, services, and safe high level (x=3.67). Furthermore, they were satisfied with operation in competition field at high level (x=3.62).Regarding the qualitative methodology including literature review, theories, concepts and analysis of qualitative research development of the model for Muay Thai competition to promote the sport tourism in the next decade, the findings indicated that there were 2 data sets as follows: The first one was related to Muay Thai competition to encourage the sport tourism and the second one was associated with Muay Thai stadium management to support the sport tourism. After the brain storming, “EE Muay Thai Model” was finally developed for promoting the sport tourism in the next decade (2015-2024).Keywords: Muay Thai competition management, Muay Thai sport tourism, Muay Thai, Muay Thai for sport tourism management
Procedia PDF Downloads 31624820 Interpretation and Clustering Framework for Analyzing ECG Survey Data
Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif
Abstract:
As Indo-Pak has been the victim of heart diseases since many decades. Many surveys showed that percentage of cardiac patients is increasing in Pakistan day by day, and special attention is needed to pay on this issue. The framework is proposed for performing detailed analysis of ECG survey data which is conducted for measuring prevalence of heart diseases statistics in Pakistan. The ECG survey data is evaluated or filtered by using automated Minnesota codes and only those ECGs are used for further analysis which is fulfilling the standardized conditions mentioned in the Minnesota codes. Then feature selection is performed by applying proposed algorithm based on discernibility matrix, for selecting relevant features from the database. Clustering is performed for exposing natural clusters from the ECG survey data by applying spectral clustering algorithm using fuzzy c means algorithm. The hidden patterns and interesting relationships which have been exposed after this analysis are useful for further detailed analysis and for many other multiple purposes.Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix
Procedia PDF Downloads 47024819 LiDAR Based Real Time Multiple Vehicle Detection and Tracking
Authors: Zhongzhen Luo, Saeid Habibi, Martin v. Mohrenschildt
Abstract:
Self-driving vehicle require a high level of situational awareness in order to maneuver safely when driving in real world condition. This paper presents a LiDAR based real time perception system that is able to process sensor raw data for multiple target detection and tracking in dynamic environment. The proposed algorithm is nonparametric and deterministic that is no assumptions and priori knowledge are needed from the input data and no initializations are required. Additionally, the proposed method is working on the three-dimensional data directly generated by LiDAR while not scarifying the rich information contained in the domain of 3D. Moreover, a fast and efficient for real time clustering algorithm is applied based on a radially bounded nearest neighbor (RBNN). Hungarian algorithm procedure and adaptive Kalman filtering are used for data association and tracking algorithm. The proposed algorithm is able to run in real time with average run time of 70ms per frame.Keywords: lidar, segmentation, clustering, tracking
Procedia PDF Downloads 42324818 Brain Connectome of Glia, Axons, and Neurons: Cognitive Model of Analogy
Authors: Ozgu Hafizoglu
Abstract:
An analogy is an essential tool of human cognition that enables connecting diffuse and diverse systems with physical, behavioral, principal relations that are essential to learning, discovery, and innovation. The Cognitive Model of Analogy (CMA) leads and creates patterns of pathways to transfer information within and between domains in science, just as happens in the brain. The connectome of the brain shows how the brain operates with mental leaps between domains and mental hops within domains and the way how analogical reasoning mechanism operates. This paper demonstrates the CMA as an evolutionary approach to science, technology, and life. The model puts forward the challenges of deep uncertainty about the future, emphasizing the need for flexibility of the system in order to enable reasoning methodology to adapt to changing conditions in the new era, especially post-pandemic. In this paper, we will reveal how to draw an analogy to scientific research to discover new systems that reveal the fractal schema of analogical reasoning within and between the systems like within and between the brain regions. Distinct phases of the problem-solving processes are divided thusly: stimulus, encoding, mapping, inference, and response. Based on the brain research so far, the system is revealed to be relevant to brain activation considering each of these phases with an emphasis on achieving a better visualization of the brain’s mechanism in macro context; brain and spinal cord, and micro context: glia and neurons, relative to matching conditions of analogical reasoning and relational information, encoding, mapping, inference and response processes, and verification of perceptual responses in four-term analogical reasoning. Finally, we will relate all these terminologies with these mental leaps, mental maps, mental hops, and mental loops to make the mental model of CMA clear.Keywords: analogy, analogical reasoning, brain connectome, cognitive model, neurons and glia, mental leaps, mental hops, mental loops
Procedia PDF Downloads 16524817 On-line Control of the Natural and Anthropogenic Safety in Krasnoyarsk Region
Authors: T. Penkova, A. Korobko, V. Nicheporchuk, L. Nozhenkova, A. Metus
Abstract:
This paper presents an approach of on-line control of the state of technosphere and environment objects based on the integration of Data Warehouse, OLAP and Expert systems technologies. It looks at the structure and content of data warehouse that provides consolidation and storage of monitoring data. There is a description of OLAP-models that provide a multidimensional analysis of monitoring data and dynamic analysis of principal parameters of controlled objects. The authors suggest some criteria of emergency risk assessment using expert knowledge about danger levels. It is demonstrated now some of the proposed solutions could be adopted in territorial decision making support systems. Operational control allows authorities to detect threat, prevent natural and anthropogenic emergencies and ensure a comprehensive safety of territory.Keywords: decision making support systems, emergency risk assessment, natural and anthropogenic safety, on-line control, territory
Procedia PDF Downloads 40624816 Geomagnetic Jerks Observed in Geomagnetic Observatory Data Over Southern Africa Between 2017 and 2023
Authors: Sanele Lionel Khanyile, Emmanuel Nahayo
Abstract:
Geomagnetic jerks are jumps observed in the second derivative of the main magnetic field that occurs on annual to decadal timescales. Understanding these jerks is crucial as they provide valuable insights into the complex dynamics of the Earth’s outer liquid core. In this study, we investigate the occurrence of geomagnetic jerks in geomagnetic observatory data collected at southern African magnetic observatories, Hermanus (HER), Tsumeb (TSU), Hartebeesthoek (HBK) and Keetmanshoop (KMH) between 2017 and 2023. The observatory data was processed and analyzed by retaining quiet night-time data recorded during quiet geomagnetic activities with the help of Kp, Dst, and ring current RC indices. Results confirm the occurrence of the 2019-2020 geomagnetic jerk in the region and identify the recent 2021 jerk detected with V-shaped secular variation changes in X and Z components at all four observatories. The highest estimated 2021 jerk secular acceleration amplitudes in X and Z components were found at HBK, 12.7 nT/year² and 19. 1 nT/year², respectively. Notably, the global CHAOS-7 model aptly identifies this 2021 jerk in the Z component at all magnetic observatories in the region.Keywords: geomagnetic jerks, secular variation, magnetic observatory data, South Atlantic Anomaly
Procedia PDF Downloads 7424815 Social Work Advocacy Regarding Equitable Hiring Of Latinos
Authors: Roberto Lorenzo
Abstract:
Much has been said about the dynamics of the Latin American experience in the United States, however, there seems to be very little data regarding the perception of career identity. Although we do have some Latinos within the professional ranks, there is not nearly enough to claim that we have practiced enough cultural competence to create equity in the professional sphere in the United States. In this thesis, data will be provided regarding labor force statistics highlighting the industries that Latin Americans frequent. Also provided will be the citing of data that suggests further necessity of cultural competence within the professional realm regarding Latin Americans. In addition, methods that were spoken about over the course of our social work education will be discussed in order to connect to possible solutions to this issue.Keywords: hiring, Latinos, professional equity, cultural competence
Procedia PDF Downloads 2224814 Handling, Exporting and Archiving Automated Mineralogy Data Using TESCAN TIMA
Authors: Marek Dosbaba
Abstract:
Within the mining sector, SEM-based Automated Mineralogy (AM) has been the standard application for quickly and efficiently handling mineral processing tasks. Over the last decade, the trend has been to analyze larger numbers of samples, often with a higher level of detail. This has necessitated a shift from interactive sample analysis performed by an operator using a SEM, to an increased reliance on offline processing to analyze and report the data. In response to this trend, TESCAN TIMA Mineral Analyzer is designed to quickly create a virtual copy of the studied samples, thereby preserving all the necessary information. Depending on the selected data acquisition mode, TESCAN TIMA can perform hyperspectral mapping and save an X-ray spectrum for each pixel or segment, respectively. This approach allows the user to browse through elemental distribution maps of all elements detectable by means of energy dispersive spectroscopy. Re-evaluation of the existing data for the presence of previously unconsidered elements is possible without the need to repeat the analysis. Additional tiers of data such as a secondary electron or cathodoluminescence images can also be recorded. To take full advantage of these information-rich datasets, TIMA utilizes a new archiving tool introduced by TESCAN. The dataset size can be reduced for long-term storage and all information can be recovered on-demand in case of renewed interest. TESCAN TIMA is optimized for network storage of its datasets because of the larger data storage capacity of servers compared to local drives, which also allows multiple users to access the data remotely. This goes hand in hand with the support of remote control for the entire data acquisition process. TESCAN also brings a newly extended open-source data format that allows other applications to extract, process and report AM data. This offers the ability to link TIMA data to large databases feeding plant performance dashboards or geometallurgical models. The traditional tabular particle-by-particle or grain-by-grain export process is preserved and can be customized with scripts to include user-defined particle/grain properties.Keywords: Tescan, electron microscopy, mineralogy, SEM, automated mineralogy, database, TESCAN TIMA, open format, archiving, big data
Procedia PDF Downloads 11024813 Design of a Standard Weather Data Acquisition Device for the Federal University of Technology, Akure Nigeria
Authors: Isaac Kayode Ogunlade
Abstract:
Data acquisition (DAQ) is the process by which physical phenomena from the real world are transformed into an electrical signal(s) that are measured and converted into a digital format for processing, analysis, and storage by a computer. The DAQ is designed using PIC18F4550 microcontroller, communicating with Personal Computer (PC) through USB (Universal Serial Bus). The research deployed initial knowledge of data acquisition system and embedded system to develop a weather data acquisition device using LM35 sensor to measure weather parameters and the use of Artificial Intelligence(Artificial Neural Network - ANN)and statistical approach(Autoregressive Integrated Moving Average – ARIMA) to predict precipitation (rainfall). The device is placed by a standard device in the Department of Meteorology, Federal University of Technology, Akure (FUTA) to know the performance evaluation of the device. Both devices (standard and designed) were subjected to 180 days with the same atmospheric condition for data mining (temperature, relative humidity, and pressure). The acquired data is trained in MATLAB R2012b environment using ANN, and ARIMAto predict precipitation (rainfall). Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Correction Square (R2), and Mean Percentage Error (MPE) was deplored as standardize evaluation to know the performance of the models in the prediction of precipitation. The results from the working of the developed device show that the device has an efficiency of 96% and is also compatible with Personal Computer (PC) and laptops. The simulation result for acquired data shows that ANN models precipitation (rainfall) prediction for two months (May and June 2017) revealed a disparity error of 1.59%; while ARIMA is 2.63%, respectively. The device will be useful in research, practical laboratories, and industrial environments.Keywords: data acquisition system, design device, weather development, predict precipitation and (FUTA) standard device
Procedia PDF Downloads 9224812 AI-Based Technologies in International Arbitration: An Exploratory Study on the Practicability of Applying AI Tools in International Arbitration
Authors: Annabelle Onyefulu-Kingston
Abstract:
One of the major purposes of AI today is to evaluate and analyze millions of micro and macro data in order to determine what is relevant in a particular case and proffer it in an adequate manner. Microdata, as far as it relates to AI in international arbitration, is the millions of key issues specifically mentioned by either one or both parties or by their counsels, arbitrators, or arbitral tribunals in arbitral proceedings. This can be qualifications of expert witness and admissibility of evidence, amongst others. Macro data, on the other hand, refers to data derived from the resolution of the dispute and, consequently, the final and binding award. A notable example of this includes the rationale of the award and specific and general damages awarded, amongst others. This paper aims to critically evaluate and analyze the possibility of technological inclusion in international arbitration. This research will be imploring the qualitative method by evaluating existing literature on the consequence of applying AI to both micro and macro data in international arbitration, and how this can be of assistance to parties, counsels, and arbitrators.Keywords: AI-based technologies, algorithms, arbitrators, international arbitration
Procedia PDF Downloads 9624811 A Virtual Grid Based Energy Efficient Data Gathering Scheme for Heterogeneous Sensor Networks
Authors: Siddhartha Chauhan, Nitin Kumar Kotania
Abstract:
Traditional Wireless Sensor Networks (WSNs) generally use static sinks to collect data from the sensor nodes via multiple forwarding. Therefore, network suffers with some problems like long message relay time, bottle neck problem which reduces the performance of the network. Many approaches have been proposed to prevent this problem with the help of mobile sink to collect the data from the sensor nodes, but these approaches still suffer from the buffer overflow problem due to limited memory size of sensor nodes. This paper proposes an energy efficient scheme for data gathering which overcomes the buffer overflow problem. The proposed scheme creates virtual grid structure of heterogeneous nodes. Scheme has been designed for sensor nodes having variable sensing rate. Every node finds out its buffer overflow time and on the basis of this cluster heads are elected. A controlled traversing approach is used by the proposed scheme in order to transmit data to sink. The effectiveness of the proposed scheme is verified by simulation.Keywords: buffer overflow problem, mobile sink, virtual grid, wireless sensor networks
Procedia PDF Downloads 39124810 Analysis of ECGs Survey Data by Applying Clustering Algorithm
Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif
Abstract:
As Indo-pak has been the victim of heart diseases since many decades. Many surveys showed that percentage of cardiac patients is increasing in Pakistan day by day, and special attention is needed to pay on this issue. The framework is proposed for performing detailed analysis of ECG survey data which is conducted for measuring the prevalence of heart diseases statistics in Pakistan. The ECG survey data is evaluated or filtered by using automated Minnesota codes and only those ECGs are used for further analysis which is fulfilling the standardized conditions mentioned in the Minnesota codes. Then feature selection is performed by applying proposed algorithm based on discernibility matrix, for selecting relevant features from the database. Clustering is performed for exposing natural clusters from the ECG survey data by applying spectral clustering algorithm using fuzzy c means algorithm. The hidden patterns and interesting relationships which have been exposed after this analysis are useful for further detailed analysis and for many other multiple purposes.Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix
Procedia PDF Downloads 35124809 The Impact of Motivation on Employee Performance in South Korea
Authors: Atabong Awung Lekeazem
Abstract:
The purpose of this paper is to identify the impact or role of incentives on employee’s performance with a particular emphasis on Korean workers. The process involves defining and explaining the different types of motivation. In defining them, we also bring out the difference between the two major types of motivations. The second phase of the paper shall involve gathering data/information from a sample population and then analyzing the data. In the analysis, we shall get to see the almost similar mentality or value which Koreans attach to motivation, which a slide different view coming only from top management personnel. The last phase shall have us presenting the data and coming to a conclusion from which possible knowledge on how managers and potential managers can ignite the best out of their employees.Keywords: motivation, employee’s performance, Korean workers, business information systems
Procedia PDF Downloads 41424808 Improved Classification Procedure for Imbalanced and Overlapped Situations
Authors: Hankyu Lee, Seoung Bum Kim
Abstract:
The issue with imbalance and overlapping in the class distribution becomes important in various applications of data mining. The imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., major class) heavily exceeds the number of observations of the other class (i.e., minor class). Overlapped dataset is the case where many observations are shared together between the two classes. Imbalanced and overlapped data can be frequently found in many real examples including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is the challenging issue because this situation degrades the performance of most of the standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting data space into three parts: nonoverlapping, light overlapping, and severe overlapping and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experiments study was conducted to examine the properties of the proposed method and compared it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through the experiment with real data.Keywords: classification, imbalanced data with class overlap, split data space, support vector machine
Procedia PDF Downloads 30824807 Profile of the Renal Failure Patients under Haemodialysis at B. P. Koirala Institute of Health Sciences Nepal
Authors: Ram Sharan Mehta, Sanjeev Sharma
Abstract:
Introduction: Haemodialysis (HD) is a mechanical process of removing waste products from the blood and replacing essential substances in patients with renal failure. First artificial kidney developed in Netherlands in 1943 AD First successful treatment of CRF reported in 1960AD, life-saving treatment begins for CRF in 1972 AD. In 1973 AD Medicare took over financial responsibility for many clients and after that method become popular. BP Koirala institute of health science is the only center outside the Kathmandu, where HD service is available. In BPKIHS PD started in Jan.1998, HD started in August 2002 till September 2003 about 278 patients received HD. Day by day the number of HD patients is increasing in BPKIHS as with institutional growth. No such type of study was conducted in past hence there is lack of valid & reliable baseline data. Hence, the investigators were interested to conduct the study on " Profile of the Renal Failure patients under Haemodialysis at B.P. Koirala Institute of Health Sciences Nepal". Objectives: The objectives of the study were: to find out the Socio-demographic characteristics of the patients, to explore the knowledge of the patients regarding disease process and Haemodialysis and to identify the problems encountered by the patients. Methods: It is a hospital-based exploratory study. The population of the study was the clients under HD and the sampling method was purposive. Fifty-four patients undergone HD during the period of 17 July 2012 to 16 July 2013 of complete one year were included in the study. Structured interview schedule was used for collect data after obtaining validity and reliability. Results: Total 54 subjects had undergone for HD, having age range of 5-75 years and majority of them were male (74%) and Hindu (93 %). Thirty-one percent illiterate, 28% had agriculture their occupation, 80% of them were from very poor community, and about 30% subjects were unaware about the disease they suffering. Majority of subjects reported that they had no complications during dialysis (61%), where as 20% reported nausea and vomiting, 9% Hypotension, 4% headache and 2%chest pain during dialysis. Conclusions: CRF leading to HD is a long battle for patients, required to make major and continuous adjustment, both physiologically and psychologically. The study suggests that non-compliance with HD regimen were common. The socio-demographic and knowledge profile will help in the management and early prevention of disease and evaluate aspects that will influence care and patients can select mode of treatment themselves properly.Keywords: profile, haemodialysis, Nepal, patients, treatment
Procedia PDF Downloads 37524806 Mapping of Geological Structures Using Aerial Photography
Authors: Ankit Sharma, Mudit Sachan, Anurag Prakash
Abstract:
Rapid growth in data acquisition technologies through drones, have led to advances and interests in collecting high-resolution images of geological fields. Being advantageous in capturing high volume of data in short flights, a number of challenges have to overcome for efficient analysis of this data, especially while data acquisition, image interpretation and processing. We introduce a method that allows effective mapping of geological fields using photogrammetric data of surfaces, drainage area, water bodies etc, which will be captured by airborne vehicles like UAVs, we are not taking satellite images because of problems in adequate resolution, time when it is captured may be 1 yr back, availability problem, difficult to capture exact image, then night vision etc. This method includes advanced automated image interpretation technology and human data interaction to model structures and. First Geological structures will be detected from the primary photographic dataset and the equivalent three dimensional structures would then be identified by digital elevation model. We can calculate dip and its direction by using the above information. The structural map will be generated by adopting a specified methodology starting from choosing the appropriate camera, camera’s mounting system, UAVs design ( based on the area and application), Challenge in air borne systems like Errors in image orientation, payload problem, mosaicing and geo referencing and registering of different images to applying DEM. The paper shows the potential of using our method for accurate and efficient modeling of geological structures, capture particularly from remote, of inaccessible and hazardous sites.Keywords: digital elevation model, mapping, photogrammetric data analysis, geological structures
Procedia PDF Downloads 68624805 Using Optical Character Recognition to Manage the Unstructured Disaster Data into Smart Disaster Management System
Authors: Dong Seop Lee, Byung Sik Kim
Abstract:
In the 4th Industrial Revolution, various intelligent technologies have been developed in many fields. These artificial intelligence technologies are applied in various services, including disaster management. Disaster information management does not just support disaster work, but it is also the foundation of smart disaster management. Furthermore, it gets historical disaster information using artificial intelligence technology. Disaster information is one of important elements of entire disaster cycle. Disaster information management refers to the act of managing and processing electronic data about disaster cycle from its’ occurrence to progress, response, and plan. However, information about status control, response, recovery from natural and social disaster events, etc. is mainly managed in the structured and unstructured form of reports. Those exist as handouts or hard-copies of reports. Such unstructured form of data is often lost or destroyed due to inefficient management. It is necessary to manage unstructured data for disaster information. In this paper, the Optical Character Recognition approach is used to convert handout, hard-copies, images or reports, which is printed or generated by scanners, etc. into electronic documents. Following that, the converted disaster data is organized into the disaster code system as disaster information. Those data are stored in the disaster database system. Gathering and creating disaster information based on Optical Character Recognition for unstructured data is important element as realm of the smart disaster management. In this paper, Korean characters were improved to over 90% character recognition rate by using upgraded OCR. In the case of character recognition, the recognition rate depends on the fonts, size, and special symbols of character. We improved it through the machine learning algorithm. These converted structured data is managed in a standardized disaster information form connected with the disaster code system. The disaster code system is covered that the structured information is stored and retrieve on entire disaster cycle such as historical disaster progress, damages, response, and recovery. The expected effect of this research will be able to apply it to smart disaster management and decision making by combining artificial intelligence technologies and historical big data.Keywords: disaster information management, unstructured data, optical character recognition, machine learning
Procedia PDF Downloads 12924804 Awareness and Utilization of Social Network Tools among Agricultural Science Students in Colleges of Education in Ogun State, Nigeria
Authors: Adebowale Olukayode Efunnowo
Abstract:
This study was carried out to assess the awareness and utilization of Social Network Tools (SNTs) among agricultural science students in Colleges of Education in Ogun State, Nigeria. Simple random sampling techniques were used to select 280 respondents from the study area. Descriptive statistics was used to describe the objectives while Pearson Product Moment Correlation was used to test the hypothesis. The result showed that the majority (71.8%) of the respondents were single, with a mean age of 20 years. Almost all (95.7%) the respondents were aware of Facebook and 2go as a Social Network Tools (SNTs) while 85.0% of the respondents were not aware of Blackplanet, LinkedIn, MyHeritage and Bebo. Many (41.1%) of the respondents had views that using SNTs can enhance extensive literature survey, increase internet browsing potential, promote teaching proficiency, and update on outcomes of researches. However, 51.4% of the respondents perceived that SNTs usage as what is meant for the lecturers/adults only while 16.1% considered it as mainly used by internet fraudsters. Findings revealed that about 50.0% of the respondents browsed Facebook and 2go daily while more than 80% of the respondents used Blackplanet, MyHeritage, Skyrock, Bebo, LinkedIn and My YearBook as the need arise. Major constraints to the awareness and utilization of SNTs were high cost and poor quality of ICTs facilities (77.1%), epileptic power supply (75.0%), inadequate telecommunication infrastructure (71.1%), low technical know-how (62.9%) and inadequate computer knowledge (61.1%). The result of PPMC analysis showed that there was an inverse relationship between constraints and utilization of SNTs at p < 0.05. It can be concluded that constraints affect efficient and effective utilization of SNTs in the study area. It is hereby recommended that management of colleges of education and agricultural institutes should provide good internet connectivity, computer facilities, and alternative power supply in order to increase the awareness and utilization of SNTs among students.Keywords: awareness, utilization, social network tools, constraints, students
Procedia PDF Downloads 35224803 Predicting Seoul Bus Ridership Using Artificial Neural Network Algorithm with Smartcard Data
Authors: Hosuk Shin, Young-Hyun Seo, Eunhak Lee, Seung-Young Kho
Abstract:
Currently, in Seoul, users have the privilege to avoid riding crowded buses with the installation of Bus Information System (BIS). BIS has three levels of on-board bus ridership level information (spacious, normal, and crowded). However, there are flaws in the system due to it being real time which could provide incomplete information to the user. For example, a bus comes to the station, and on the BIS it shows that the bus is crowded, but on the stop that the user is waiting many people get off, which would mean that this station the information should show as normal or spacious. To fix this problem, this study predicts the bus ridership level using smart card data to provide more accurate information about the passenger ridership level on the bus. An Artificial Neural Network (ANN) is an interconnected group of nodes, that was created based on the human brain. Forecasting has been one of the major applications of ANN due to the data-driven self-adaptive methods of the algorithm itself. According to the results, the ANN algorithm was stable and robust with somewhat small error ratio, so the results were rational and reasonable.Keywords: smartcard data, ANN, bus, ridership
Procedia PDF Downloads 16724802 Improving Temporal Correlations in Empirical Orthogonal Function Expansions for Data Interpolating Empirical Orthogonal Function Algorithm
Authors: Ping Bo, Meng Yunshan
Abstract:
Satellite-derived sea surface temperature (SST) is a key parameter for many operational and scientific applications. However, the disadvantage of SST data is a high percentage of missing data which is mainly caused by cloud coverage. Data Interpolating Empirical Orthogonal Function (DINEOF) algorithm is an EOF-based technique for reconstructing the missing data and has been widely used in oceanographic field. The reconstruction of SST images within a long time series using DINEOF can cause large discontinuities and one solution for this problem is to filter the temporal covariance matrix to reduce the spurious variability. Based on the previous researches, an algorithm is presented in this paper to improve the temporal correlations in EOF expansion. Similar with the previous researches, a filter, such as Laplacian filter, is implemented on the temporal covariance matrix, but the temporal relationship between two consecutive images which is used in the filter is considered in the presented algorithm, for example, two images in the same season are more likely correlated than those in the different seasons, hence the latter one is less weighted in the filter. The presented approach is tested for the monthly nighttime 4-km Advanced Very High Resolution Radiometer (AVHRR) Pathfinder SST for the long-term period spanning from 1989 to 2006. The results obtained from the presented algorithm are compared to those from the original DINEOF algorithm without filtering and from the DINEOF algorithm with filtering but without taking temporal relationship into account.Keywords: data interpolating empirical orthogonal function, image reconstruction, sea surface temperature, temporal filter
Procedia PDF Downloads 32424801 Sparse Unmixing of Hyperspectral Data by Exploiting Joint-Sparsity and Rank-Deficiency
Authors: Fanqiang Kong, Chending Bian
Abstract:
In this work, we exploit two assumed properties of the abundances of the observed signatures (endmembers) in order to reconstruct the abundances from hyperspectral data. Joint-sparsity is the first property of the abundances, which assumes the adjacent pixels can be expressed as different linear combinations of same materials. The second property is rank-deficiency where the number of endmembers participating in hyperspectral data is very small compared with the dimensionality of spectral library, which means that the abundances matrix of the endmembers is a low-rank matrix. These assumptions lead to an optimization problem for the sparse unmixing model that requires minimizing a combined l2,p-norm and nuclear norm. We propose a variable splitting and augmented Lagrangian algorithm to solve the optimization problem. Experimental evaluation carried out on synthetic and real hyperspectral data shows that the proposed method outperforms the state-of-the-art algorithms with a better spectral unmixing accuracy.Keywords: hyperspectral unmixing, joint-sparse, low-rank representation, abundance estimation
Procedia PDF Downloads 26124800 Electronic Physical Activity Record (EPAR): Key for Data Driven Physical Activity Healthcare Services
Authors: Rishi Kanth Saripalle
Abstract:
Medical experts highly recommend to include physical activity in everyone’s daily routine irrespective of gender or age as it helps to improve various medical issues or curb potential issues. Simultaneously, experts are also diligently trying to provide various healthcare services (interventions, plans, exercise routines, etc.) for promoting healthy living and increasing physical activity in one’s ever increasing hectic schedules. With the introduction of wearables, individuals are able to keep track, analyze, and visualize their daily physical activities. However, there seems to be no common agreed standard for representing, gathering, aggregating and analyzing an individual’s physical activity data from disparate multiple sources (exercise pans, multiple wearables, etc.). This issue makes it highly impractical to develop any data-driven physical activity applications and healthcare programs. Further, the inability to integrate the physical activity data into an individual’s Electronic Health Record to provide a wholistic image of that individual’s health is still eluding the experts. This article has identified three primary reasons for this potential issue. First, there is no agreed standard, both structure and semantic, for representing and sharing physical activity data across disparate systems. Second, various organizations (e.g., LA fitness, Gold’s Gym, etc.) and research backed interventions and programs still primarily rely on paper or unstructured format (such as text or notes) to keep track of the data generated from physical activities. Finally, most of the wearable devices operate in silos. This article identifies the underlying problem, explores the idea of reusing existing standards, and identifies the essential modules required to move forward.Keywords: electronic physical activity record, physical activity in EHR EIM, tracking physical activity data, physical activity data standards
Procedia PDF Downloads 28224799 Developing Pavement Structural Deterioration Curves
Authors: Gregory Kelly, Gary Chai, Sittampalam Manoharan, Deborah Delaney
Abstract:
A Structural Number (SN) can be calculated for a road pavement from the properties and thicknesses of the surface, base course, sub-base, and subgrade. Historically, the cost of collecting structural data has been very high. Data were initially collected using Benkelman Beams and now by Falling Weight Deflectometer (FWD). The structural strength of pavements weakens over time due to environmental and traffic loading factors, but due to a lack of data, no structural deterioration curve for pavements has been implemented in a Pavement Management System (PMS). International Roughness Index (IRI) is a measure of the road longitudinal profile and has been used as a proxy for a pavement’s structural integrity. This paper offers two conceptual methods to develop Pavement Structural Deterioration Curves (PSDC). Firstly, structural data are grouped in sets by design Equivalent Standard Axles (ESA). An ‘Initial’ SN (ISN), Intermediate SN’s (SNI) and a Terminal SN (TSN), are used to develop the curves. Using FWD data, the ISN is the SN after the pavement is rehabilitated (Financial Accounting ‘Modern Equivalent’). Intermediate SNIs, are SNs other than the ISN and TSN. The TSN was defined as the SN of the pavement when it was approved for pavement rehabilitation. The second method is to use Traffic Speed Deflectometer data (TSD). The road network already divided into road blocks, is grouped by traffic loading. For each traffic loading group, road blocks that have had a recent pavement rehabilitation, are used to calculate the ISN and those planned for pavement rehabilitation to calculate the TSN. The remaining SNs are used to complete the age-based or if available, historical traffic loading-based SNI’s.Keywords: conceptual, pavement structural number, pavement structural deterioration curve, pavement management system
Procedia PDF Downloads 544