Search results for: data privacy
24024 Bayesian Analysis of Topp-Leone Generalized Exponential Distribution
Authors: Najrullah Khan, Athar Ali Khan
Abstract:
The Topp-Leone distribution was introduced by Topp and Leone in 1955. In this paper, an attempt has been made to fit the Topp-Leone generalized exponential (TPGE) distribution. A real survival data set is used for illustration. Implementation is done using R and JAGS, and R and JAGS code is provided to implement the censoring mechanism using both optimization and simulation tools. The main aim of this paper is to describe and illustrate the Bayesian modelling approach to the analysis of survival data. Emphasis is placed on the modeling of data and the interpretation of the results. Crucial to this is an understanding of the nature of the incomplete or 'censored' data encountered. Analytic approximation and simulation tools are covered here, but most of the emphasis is on Markov chain Monte Carlo methods, including the independent Metropolis algorithm, which are currently the most popular technique. For analytic approximation, among the various optimization algorithms, the trust region method is found to be the best. The TPGE model is also used to analyze the lifetime data in the Bayesian paradigm. Results are evaluated on the above-mentioned real survival data set. The analytic approximation and simulation methods are implemented using software packages. Our findings indicate that simulation tools provide better results than those obtained by asymptotic approximation.
Keywords: Bayesian Inference, JAGS, Laplace Approximation, LaplacesDemon, posterior, R Software, simulation
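The independent Metropolis step mentioned in this abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' R and JAGS code, and it assumes a plain exponential lifetime model with right censoring and a Gamma prior in place of the TPGE model. The function names, the prior settings, and the exponential proposal are all hypothetical choices made for the example.

```python
import math
import random


def log_posterior(rate, times, observed, prior_shape=1.0, prior_rate=1.0):
    """Unnormalised log posterior of an exponential rate under right censoring."""
    if rate <= 0:
        return -math.inf
    events = sum(observed)   # each observed failure contributes log(rate)
    exposure = sum(times)    # every subject, censored or not, contributes -rate * time
    log_lik = events * math.log(rate) - rate * exposure
    log_prior = (prior_shape - 1.0) * math.log(rate) - prior_rate * rate
    return log_lik + log_prior


def independence_metropolis(times, observed, n_iter=20000, burn_in=2000,
                            proposal_rate=5.0, seed=0):
    """Sample the rate using a fixed Exponential(proposal_rate) proposal (assumed)."""
    rng = random.Random(seed)

    def log_weight(rate):
        # Importance weight: log target minus log proposal density.
        log_q = math.log(proposal_rate) - proposal_rate * rate
        return log_posterior(rate, times, observed) - log_q

    current = 1.0 / proposal_rate
    current_w = log_weight(current)
    samples = []
    for i in range(n_iter):
        # The candidate does not depend on the current state.
        candidate = rng.expovariate(proposal_rate)
        candidate_w = log_weight(candidate)
        log_u = math.log(1.0 - rng.random())  # 1 - U lies in (0, 1], so log is safe
        if log_u < candidate_w - current_w:
            current, current_w = candidate, candidate_w
        if i >= burn_in:
            samples.append(current)
    return samples
```

With a Gamma(1, 1) prior this toy posterior is itself a Gamma distribution, so the sample mean can be checked against the closed form (prior shape plus number of events, divided by prior rate plus total exposure).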
Procedia PDF Downloads 534
24023 Machine Learning Application in Shovel Maintenance
Authors: Amir Taghizadeh Vahed, Adithya Thaduri
Abstract:
Shovels are the main components in the mining transportation system. The productivity of a mine depends on the availability of its shovels because of their high capital and operating costs. The unplanned failure or shutdown of a shovel results in higher repair costs, increased downtime, and increased indirect costs (e.g., loss of production and damage to the company’s reputation). To mitigate these failures, predictive maintenance based on failure prediction can be a useful approach. Modern mining machinery, including shovels, automatically collects huge datasets consisting of reliability and maintenance data. However, the gathered datasets are useless until the information and knowledge in the data are extracted. Machine learning and data mining, which have played a major role in recent studies, have been used for the knowledge discovery process. In this study, data mining and machine learning approaches are implemented to detect not only anomalies but also patterns in a dataset, and to support further detection of failures.
Keywords: maintenance, machine learning, shovel, conditional based monitoring
Procedia PDF Downloads 216
24022 Standard Languages for Creating a Database to Display Financial Statements on a Web Application
Authors: Vladimir Simovic, Matija Varga, Predrag Oreski
Abstract:
XHTML and XBRL are the standard languages for creating a database for the purpose of displaying financial statements on web applications. Today, XBRL is one of the most popular languages for business reporting. A large number of countries recognize the role of the XBRL language in financial reporting and the benefits that the reporting format provides in the collection, analysis, preparation, publication, and exchange of data (information), which is the positive side of this language. Here we present the advantages and opportunities that a company may gain by using the XBRL format for business reporting. This paper also presents XBRL and other languages that are used for creating the database, such as XML, XHTML, etc. The role of the AJAX complex model and technology in the exchange of financial data between the web client and web server will be explained in detail. The basic layers of the network for data exchange via the web will also be mentioned.
Keywords: XHTML, XBRL, XML, JavaScript, AJAX technology, data exchange
Procedia PDF Downloads 393
24021 Analyze and Visualize Eye-Tracking Data
Authors: Aymen Sekhri, Emmanuel Kwabena Frimpong, Bolaji Mubarak Ayeyemi, Aleksi Hirvonen, Matias Hirvonen, Tedros Tesfay Andemichael
Abstract:
Fixation identification, which involves isolating and identifying fixations and saccades in eye-tracking protocols, is an important aspect of eye-movement data processing that can have a big impact on higher-level analyses. However, fixation identification techniques are frequently discussed informally and rarely compared in any meaningful way. In this work, we implement fixation detection and analysis with two state-of-the-art algorithms. The first is the velocity-threshold fixation algorithm, which identifies fixations based on a threshold value. The second is U'n'Eye, a deep neural network algorithm for eye movement detection. The goal of this project is to analyze and visualize eye-tracking data from a provided eye gaze dataset. The data was collected in a scenario in which individuals were shown photos and asked whether or not they recognized them. The results of the two fixation detection approaches are contrasted and visualized in this paper.
Keywords: human-computer interaction, eye-tracking, CNN, fixations, saccades
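As an illustration of the velocity-threshold idea described in this abstract, the sketch below labels consecutive gaze samples as belonging to a fixation when the point-to-point velocity stays under a threshold, and then collapses each run into one fixation. It is a generic sketch, not the authors' implementation; the sample format `(t, x, y)`, the function names, and the threshold units are assumptions made for the example.

```python
import math


def _summarise(group):
    """Collapse a run of (t, x, y) samples into (start_t, end_t, centroid_x, centroid_y)."""
    xs = [s[1] for s in group]
    ys = [s[2] for s in group]
    return (group[0][0], group[-1][0], sum(xs) / len(xs), sum(ys) / len(ys))


def ivt_fixations(samples, velocity_threshold):
    """Group gaze samples into fixations using a point-to-point velocity threshold.

    samples: list of (t, x, y) tuples sorted by time.
    velocity_threshold: distance units per time unit; slower movement counts as fixation.
    """
    fixations = []
    group = []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur[0] - prev[0]
        dist = math.hypot(cur[1] - prev[1], cur[2] - prev[2])
        velocity = dist / dt if dt > 0 else math.inf
        if velocity < velocity_threshold:
            if not group:
                group.append(prev)   # first sample of a new fixation
            group.append(cur)
        elif group:
            # A fast movement (saccade) closes the current fixation.
            fixations.append(_summarise(group))
            group = []
    if group:
        fixations.append(_summarise(group))
    return fixations
```

For example, six samples that dwell near (100, 100) and then jump to near (300, 300) would yield two fixations separated by one saccade. In practice the threshold is usually expressed in degrees of visual angle per second, which requires knowing the viewing geometry.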
Procedia PDF Downloads 132
24020 Modelling the Education Supply Chain with Network Data Envelopment Analysis
Authors: Sourour Ramzi, Claudia Sarrico
Abstract:
Little has been done on network DEA in education, and nobody has attempted to model the whole education supply chain using network DEA. As such, the contribution of the present paper is to propose a model for measuring the efficiency of education supply chains using network DEA. First, we use a general survey of data envelopment analysis (DEA) to establish the emergent themes for research in DEA, and focus on the theme of network DEA. Second, we use a survey on two-stage DEA models and network DEA to describe the state of the art in network DEA, particularly as applied to supply chain management. Third, we use a survey on DEA applications to identify the most influential papers on DEA education applications, in order to establish the state of the art on applications of DEA in education in general, and on applications of network DEA to education in particular. Finally, we propose a model for measuring the performance of education supply chains of different education systems (countries or states within a country, for instance). We then apply this model to some empirical data.
Keywords: supply chain, education, data envelopment analysis, network DEA
Procedia PDF Downloads 366
24019 Secure Transmission Scheme in Device-to-Device Multicast Communications
Authors: Bangwon Seo
Abstract:
In this paper, we consider multicast device-to-device (D2D) direct communication systems in cellular networks. In multicast D2D communications, nearby mobile devices exchange their data directly without going through a base station, and a D2D transmitter sends its data to multiple D2D receivers that compose a D2D multicast group. We consider a wiretap channel in which an eavesdropper attempts to overhear the transmitted data of the D2D transmitter. We propose a secure transmission scheme for D2D multicast communications in cellular networks. In order to prevent the eavesdropper from overhearing the transmitted data of the D2D transmitter, a precoding vector is employed at the D2D transmitter in the proposed scheme. We perform computer simulations to evaluate the performance of the proposed scheme. Through the simulations, we show that the secrecy rate performance can be improved by selecting an appropriate precoding vector.
Keywords: device-to-device communications, wiretap channel, secure transmission, precoding
Procedia PDF Downloads 291
24018 Post Pandemic Mobility Analysis through Indexing and Sharding in MongoDB: Performance Optimization and Insights
Authors: Karan Vishavjit, Aakash Lakra, Shafaq Khan
Abstract:
The COVID-19 pandemic has pushed healthcare professionals to use big data analytics as a vital tool for tracking and evaluating the effects of contagious viruses. To analyze huge datasets effectively, efficient NoSQL databases are needed. This research integrates several datasets, which cuts down on query processing time, creates predictive visual artifacts, and makes it possible to analyze post-COVID-19 health and well-being outcomes and to evaluate the effectiveness of government efforts during the pandemic. We recommend applying sharding and indexing technologies to improve query effectiveness and scalability as the dataset expands. Effective data retrieval and analysis are made possible by spreading the datasets across a sharded database and indexing the individual shards. The key goal is to analyze the connections between governmental activities, poverty levels, and post-pandemic well-being. We want to evaluate the effectiveness of governmental initiatives to improve health and lower poverty levels, and we do this by utilising advanced data analysis and visualisations. The findings provide relevant data that supports the advancement of UN sustainable objectives, future pandemic preparation, and evidence-based decision-making. This study shows how Big Data and NoSQL databases may be used to address problems with global health.
Keywords: big data, COVID-19, health, indexing, NoSQL, sharding, scalability, well being
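The combination of sharding and per-shard indexing described in this abstract can be illustrated with a small in-memory toy. This is a conceptual sketch only: it does not use MongoDB or its drivers, and the class name, the hashed routing rule, and the example field names (`country`, `year`) are hypothetical. It shows why a query that includes the shard key touches one shard, while a query without it must fan out to every shard and rely on each shard's local index.

```python
import hashlib
from collections import defaultdict


class ShardedCollection:
    """Toy hashed sharding with one secondary index per shard (illustrative only)."""

    def __init__(self, n_shards, shard_key, indexed_field):
        self.n_shards = n_shards
        self.shard_key = shard_key
        self.indexed_field = indexed_field
        self.shards = [[] for _ in range(n_shards)]
        # Each shard keeps its own index: field value -> positions in that shard.
        self.indexes = [defaultdict(list) for _ in range(n_shards)]

    def _shard_for(self, key_value):
        # A stable hash, so routing is the same on every run.
        digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
        return int(digest, 16) % self.n_shards

    def insert(self, doc):
        shard_id = self._shard_for(doc[self.shard_key])
        position = len(self.shards[shard_id])
        self.shards[shard_id].append(doc)
        self.indexes[shard_id][doc[self.indexed_field]].append(position)

    def find(self, field_value, shard_key_value=None):
        """Look up by the indexed field; target one shard if the shard key is known."""
        if shard_key_value is not None:
            targets = [self._shard_for(shard_key_value)]
        else:
            targets = range(self.n_shards)   # scatter-gather across all shards
        results = []
        for shard_id in targets:
            for position in self.indexes[shard_id].get(field_value, []):
                results.append(self.shards[shard_id][position])
        return results
```

In this toy, `find(2021, shard_key_value="CA")` consults a single shard's index, whereas `find(2021)` visits every shard. A real deployment would also need to consider shard key cardinality and balancing, which the sketch ignores.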
Procedia PDF Downloads 68
24017 Prediction of Anticancer Potential of Curcumin Nanoparticles by Means of Quasi-Qsar Analysis Using Monte Carlo Method
Authors: Ruchika Goyal, Ashwani Kumar, Sandeep Jain
Abstract:
The experimental data for anticancer potential of curcumin nanoparticles was calculated by means of eclectic data. The optimal descriptors were examined using the Monte Carlo method based CORAL SEA software. The statistical quality of the model is as follows: n = 14, R² = 0.6809, Q² = 0.5943, s = 0.175, MAE = 0.114, F = 26 (sub-training set); n = 5, R² = 0.9529, Q² = 0.7982, s = 0.086, MAE = 0.068, F = 61, Av Rm² = 0.7601, ∆R²m = 0.0840, k = 0.9856 and kk = 1.0146 (test set); and n = 5, R² = 0.6075 (validation set). This data can be used to build predictive QSAR models for anticancer activity.
Keywords: anticancer potential, curcumin, model, nanoparticles, optimal descriptors, QSAR
Procedia PDF Downloads 315
24016 The Lived Experiences of South African Female Offenders and the Possible Links to Recidivism Due to their Exclusion from Educational Rehabilitation Programmes
Authors: Jessica Leigh Thornton
Abstract:
The South African Constitution outlines provisions for every detainee and sentenced prisoner in relation to the human rights recognized in the country since 1994; but currently, across the country, prisons have yet to meet many of these criteria. Consequently, prisoners’ day-to-day lives are marked by an extreme lack of privacy, high rates of infection, poor nutrition, and deleterious living conditions, which steadily erode their mental and physical capacities rather than rehabilitating them so that they can effectively reintegrate into society. Even more so, policy reform, advocacy, security, and rehabilitation programs continue to be based on research and theories that were developed to explain the experiences of men, while female offenders are seen as the “special category” of inmates. Yet, the experiences of women and their pathways to incarceration are remarkably different from those of male offenders. Consequently, little is known about the profile, nature, contributing factors, and experiences of female offenders, which has impeded a comprehensive and integrated understanding of the subject of female criminality. The number of women globally in correctional centers has more than doubled over the past fifteen years (these increases vary from prison to prison and country to country). Yet, female offenders have largely been ignored in research, even though the minority status of female offenders is a phenomenon that is not peculiar to South Africa, as the number of women incarcerated has increased by 68% within the decade. Within South Africa, there have been minimal studies conducted on the gendered experience of offenders. While some studies have explored the pathways to female offending, gender-sensitive correctional programming for women that responds to their needs has been overlooked. 
This often leads to a neglect of the needs of female offenders, not only in terms of program and service delivery to this minority group but also from a research perspective. In response, the aim of the proposed research is twofold. Firstly, the lived experiences of female offenders and their views of rehabilitation and reintegration will be explored. Secondly, the various pathways into and out of recidivism amongst female offenders will be investigated with regard to their inclusion in educational rehabilitation.
Keywords: female incarceration, educational rehabilitation, exclusion, experiences of female offenders
Procedia PDF Downloads 271
24015 Static vs. Stream Mining Trajectories Similarity Measures
Authors: Musaab Riyadh, Norwati Mustapha, Dina Riyadh
Abstract:
Trajectory similarity can be defined as the cost of transforming one trajectory into another based on a certain similarity method. It is the core of numerous mining tasks such as clustering, classification, and indexing. Various approaches have been suggested to measure similarity based on the geometric and dynamic properties of a trajectory, the overlap between trajectory segments, and the confined area between entire trajectories. In this article, these approaches are evaluated on computational cost, memory usage, accuracy, and the amount of data needed in advance, in order to determine their suitability for stream mining applications. The evaluation results show that, because of the high-speed generation of data, stream mining applications favor similarity methods that have low computational cost and memory usage, require a single scan of the data, and are free of mathematical complexity.
Keywords: global distance measure, local distance measure, semantic trajectory, spatial dimension, stream data mining
Procedia PDF Downloads 392
24014 A Qualitative Study Identifying the Complexities of Early Childhood Professionals' Use and Production of Data
Authors: Sara Bonetti
Abstract:
The use of quantitative data to support policies and justify investments has become imperative in many fields including the field of education. However, the topic of data literacy has only marginally touched the early care and education (ECE) field. In California, within the ECE workforce, there is a group of professionals working in policy and advocacy that use quantitative data regularly and whose educational and professional experiences have been neglected by existing research. This study aimed at analyzing these experiences in accessing, using, and producing quantitative data. This study utilized semi-structured interviews to capture the differences in educational and professional backgrounds, policy contexts, and power relations. The participants were three key professionals from county-level organizations and one working at a State Department to allow for a broader perspective at systems level. The study followed Núñez’s multilevel model of intersectionality. The key in Núñez’s model is the intersection of multiple levels of analysis and influence, from the individual to the system level, and the identification of institutional power dynamics that perpetuate the marginalization of certain groups within society. In a similar manner, this study looked at the dynamic interaction of different influences at individual, organizational, and system levels that might intersect and affect ECE professionals’ experiences with quantitative data. At the individual level, an important element identified was the participants’ educational background, as it was possible to observe a relationship between that and their positionality, both with respect to working with data and also with respect to their power within an organization and at the policy table. 
For example, those with a background in child development were aware of how their formal education failed to train them in the skills that are necessary to work in policy and advocacy, and especially to work with quantitative data, compared to those with a background in administration and/or business. At the organizational level, the interviews showed a connection between the participants’ position within the organization, their organization’s position with respect to others, and their degree of access to quantitative data. This in turn affected their sense of empowerment and agency in dealing with data, such as shaping what data is collected and available. These differences were reflected in the interviewees’ perceptions of and expectations for the ECE workforce. For example, one of the interviewees pointed out that many ECE professionals happen to use data out of the necessity of the moment. This lack of intentionality is a cause of, and at the same time translates into, missed training opportunities. Another interviewee pointed out issues related to the professionalism of the ECE workforce by remarking on the inadequacy of ECE students’ training in working with data. In conclusion, Núñez’s model helped in understanding the different elements that affect ECE professionals’ experiences with quantitative data. In particular, it was clear that these professionals are not being provided with the necessary support and that we are not being intentional in creating data literacy skills for them, despite what is asked of them and their work.
Keywords: data literacy, early childhood professionals, intersectionality, quantitative data
Procedia PDF Downloads 252
24013 Data and Spatial Analysis for Economy and Education of 28 E.U. Member-States for 2014
Authors: Alexiou Dimitra, Fragkaki Maria
Abstract:
The objective of the paper is to study geographic, economic, and educational variables and their contribution to determining the position of each member-state among the EU-28 countries, based on the values of seven variables as given by Eurostat. The data analysis methods of Multiple Factorial Correspondence Analysis (MFCA), Principal Component Analysis, and Factor Analysis have been used. The cross-tabulation tables of data consist of the values of seven variables for the 28 countries for 2014. The data are manipulated using the CHIC Analysis V 1.1 software package. The results of this program using MFCA and Ascending Hierarchical Classification are given in arithmetic and graphical form. For comparison, the Factor procedure of the statistical package IBM SPSS 20 has been applied to the same data. The numerical and graphical results, presented in tables and graphs, demonstrate the agreement between the two methods. The most important result is the study of the relation between the 28 countries and the position of each country in groups or clouds, which are formed according to the values of the corresponding variables.
Keywords: Multiple Factorial Correspondence Analysis, Principal Component Analysis, Factor Analysis, E.U.-28 countries, Statistical package IBM SPSS 20, CHIC Analysis V 1.1 Software, Eurostat.eu Statistics
Procedia PDF Downloads 509
24012 Deployment of Electronic Healthcare Records and Development of Big Data Analytics Capabilities in the Healthcare Industry: A Systematic Literature Review
Authors: Tigabu Dagne Akal
Abstract:
Electronic health records (EHRs) can help to store, maintain, and appropriately handle patient histories for proper treatment and decision-making. Merging EHRs with big data analytics (BDA) capabilities enables healthcare stakeholders to provide effective and efficient treatments for chronic diseases. Though there are huge opportunities and efforts in the deployment of EMRs and the development of BDA, there are challenges in addressing the resources and organizational capabilities that are required to achieve the competitive advantage and sustainability of EHRs and BDA. The resource-based view (RBV), information system (IS), and non-IS theories should be extended to examine the organizational capabilities and resources that are required for successful data analytics in the healthcare industry. The main purpose of this study is to develop a conceptual framework for the development of healthcare BDA capabilities, based on past works, that researchers can extend. As the research methodology, a research question was formulated to guide the search strategy, and the study selection was made at the end. Based on the study selection, the conceptual framework for the development of BDA capabilities in healthcare settings was formulated.
Keywords: EHR, EMR, Big data, Big data analytics, resource-based view
Procedia PDF Downloads 130
24011 Development of a Spatial Data for Renal Registry in Nigeria Health Sector
Authors: Adekunle Kolawole Ojo, Idowu Peter Adebayo, Egwuche Sylvester O.
Abstract:
Chronic Kidney Disease (CKD) is a significant cause of morbidity and mortality across developed and developing nations and is associated with increased risk. There are no existing electronic means of capturing and monitoring CKD in Nigeria. The work is aimed at developing a spatial data model that can be used to implement renal registries required for tracking and monitoring the spatial distribution of renal diseases by public health officers and patients. In this study, we have developed a spatial data model for a functional renal registry.
Keywords: renal registry, health informatics, chronic kidney disease, interface
Procedia PDF Downloads 209
24010 Environmental Evaluation of Two Kind of Drug Production (Syrup and Pomade Form) Using Life Cycle Assessment Methodology
Authors: H. Aksas, S. Boughrara, K. Louhab
Abstract:
The goal of this study was to use life cycle assessment (LCA) methodology to assess the environmental impact of pharmaceutical products (four kinds in syrup form and three kinds in pomade form) produced by a leading manufacturer in Algeria, the SAIDAL Company. The impacts generated were evaluated using SimaPro 7.1, with the CML92 method for the syrup form and EPD 2007 for the pomade form. All evaluated impacts were compared with one another, and the compound contributing to each impact was determined in each case. The data needed to conduct the Life Cycle Inventory (LCI) came from this factory: theoretical data were collected from the responsible technicians and engineers of the company, and practical data resulted from assays of the pharmaceutical liquid carried out at the university laboratories. These data cover the different raw materials imported from European and Asian countries that are needed to formulate the drugs. The energy used as input comes from Algerian resources. The outputs are the results of the analysis of the factory's effluents in their different forms (liquid, solid, and gas). All these data (inputs and outputs) represent the ecobalance.
Keywords: pharmaceutical product, drug residues, LCA methodology, environmental impacts
Procedia PDF Downloads 245
24009 Multi Cloud Storage Systems for Resource Constrained Mobile Devices: Comparison and Analysis
Authors: Rajeev Kumar Bedi, Jaswinder Singh, Sunil Kumar Gupta
Abstract:
Cloud storage is a model of online data storage in which data is stored in a virtualized pool of servers hosted by third parties (CSPs) and located in different geographical locations. Cloud storage has revolutionized the way users access their data online: anywhere, anytime, and from any device, such as a tablet, mobile phone, or laptop. Single cloud storage systems suffer from many issues, such as vendor lock-in, frequent service outages, data loss, and performance problems. To evade these issues, the concept of multi cloud storage was introduced. Many multi cloud storage systems for mobile devices exist in the market. In this article, we compare four multi cloud storage systems for mobile devices (Otixo, Unclouded, Cloud Fuze, and Clouds) and evaluate their performance on the basis of CPU usage, battery consumption, time consumption, and data usage on three mobile devices (Nexus 5, Moto G, and Nexus 7 tablet) using a Wi-Fi network. Finally, open research challenges and future scope are discussed.
Keywords: cloud storage, multi cloud storage, vendor lock-in, mobile devices, mobile cloud computing
Procedia PDF Downloads 405
24008 Preparation of Wireless Networks and Security; Challenges in Efficient Accession of Encrypted Data in Healthcare
Authors: M. Zayoud, S. Oueida, S. Ionescu, P. AbiChar
Abstract:
Background: A wireless sensor network comprises diverse information technology tools and is widely applied in a range of domains, including military surveillance, weather forecasting, and earthquake forecasting. Wireless sensor networks are continually being strengthened, yet security issues usually emerge during professional application. Thus, the essential technological tools need to be assessed for secure aggregation of data. Moreover, such practices have to be incorporated into healthcare practices, where they serve the mutual interest. Objective: Aggregation of encrypted data has been assessed through a homomorphic stream cipher to assure its effectiveness and to provide optimum solutions to the field of healthcare. Methods: An experimental design was used, which employed a newly developed cipher along with CPU-constrained devices. Modular additions were also employed to evaluate the nature of the aggregated data. The processes of the homomorphic stream cipher are highlighted through different sensors and modular additions. Results: The homomorphic stream cipher has been recognized as a simple and secure process that allows efficient aggregation of encrypted data. In addition, the application has led to the improvement of healthcare practices. Statistical values can be easily computed through aggregation on the basis of the selected cipher. The variance, mean, and standard deviation of the sensed data have also been computed with the selected tool. Conclusion: It can be concluded that the homomorphic stream cipher can be an ideal tool for appropriate aggregation of data. It can also provide solutions for the healthcare sector.
Keywords: aggregation, cipher, homomorphic stream, encryption
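The idea of aggregating encrypted readings with modular additions can be sketched as below. This is a generic additively homomorphic construction, not the cipher evaluated in the paper: each sensor adds a keystream value to its reading modulo a fixed modulus, an aggregator adds ciphertexts without decrypting them, and the sink removes the combined keystream. The use of HMAC-SHA256 as the keystream generator, the modulus, the nonces, and all names are assumptions made for the example.

```python
import hashlib
import hmac

MODULUS = 2 ** 32  # must exceed the largest possible aggregate (assumed bound)


def keystream(secret, nonce):
    """Derive a per-round keystream value from a sensor's secret key and a nonce."""
    digest = hmac.new(secret, nonce, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % MODULUS


def encrypt(value, secret, nonce):
    """Sensor side: mask the reading with one modular addition."""
    return (value + keystream(secret, nonce)) % MODULUS


def aggregate(ciphertexts):
    """Aggregator side: add ciphertexts without knowing any key."""
    return sum(ciphertexts) % MODULUS


def decrypt_sum(aggregate_ct, secrets, nonce):
    """Sink side: subtract the sum of all keystreams to recover the plain sum."""
    total_key = sum(keystream(s, nonce) for s in secrets) % MODULUS
    return (aggregate_ct - total_key) % MODULUS


def mean_and_variance(readings, secrets):
    """Recover mean and variance from an encrypted sum and an encrypted sum of squares."""
    n = len(readings)
    sum_ct = aggregate(encrypt(r, s, b"round-1/sum") for r, s in zip(readings, secrets))
    sq_ct = aggregate(encrypt(r * r, s, b"round-1/sq") for r, s in zip(readings, secrets))
    total = decrypt_sum(sum_ct, secrets, b"round-1/sum")
    total_sq = decrypt_sum(sq_ct, secrets, b"round-1/sq")
    mean = total / n
    return mean, total_sq / n - mean * mean
```

Sending an encrypted square alongside each reading is what lets the sink compute the variance and standard deviation from aggregates alone. A fresh nonce is needed for every round, since reusing a keystream value would leak differences between readings.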
Procedia PDF Downloads 259
24007 The Relationship between Emotional Intelligence and Leadership Performance
Authors: Omar Al Ali
Abstract:
The current study aimed to explore the relationships between emotional intelligence, cognitive ability, and leaders' performance. Data were collected from 260 senior managers from the UAE. The results showed a significant relationship between emotional intelligence and leadership performance as measured by the annual internal evaluations of each participant (r = .42, p < .01). Regression analysis revealed that both variables, namely emotional intelligence (beta = .31, p < .01) and cognitive ability (beta = .29, p < .01), predicted leadership competencies and together explained 26% of their variance. The data suggest that EI and cognitive ability are significantly correlated with leadership performance. In-depth implications of the present findings for human resource development theory and practice are discussed.
Keywords: emotional intelligence, cognitive ability, leadership, performance
Procedia PDF Downloads 475
24006 Comparison of Irradiance Decomposition and Energy Production Methods in a Solar Photovoltaic System
Authors: Tisciane Perpetuo e Oliveira, Dante Inga Narvaez, Marcelo Gradella Villalva
Abstract:
Installations of solar photovoltaic systems have increased considerably in the last decade. Monitoring of meteorological data (solar irradiance, air temperature, wind velocity, etc.) has therefore become important for predicting the solar energy production potential of a given geographical area. In this sense, the present work compares two computational tools that are capable of estimating the energy generation of a photovoltaic system through correlation analyses of solar radiation data: the PVsyst software and an algorithm based on the PVlib package implemented in MATLAB. To achieve this objective, it was necessary to obtain solar radiation data (measured and from a solarimetric database), to analyze the decomposition of global solar irradiance into direct normal and horizontal diffuse components, and to analyze the modeling of the devices of a photovoltaic system (solar modules and inverters) for energy production calculations. Simulated results were compared with experimental data in order to evaluate the performance of the studied methods. Errors in the estimation of energy production were less than 30% for the MATLAB algorithm and less than 20% for the PVsyst software.
Keywords: energy production, meteorological data, irradiance decomposition, solar photovoltaic system
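The decomposition discussed in this abstract rests on the standard closure relation between the three irradiance components: global horizontal irradiance equals direct normal irradiance times the cosine of the solar zenith angle, plus diffuse horizontal irradiance. The sketch below only illustrates that relation; it is not the decomposition model used by PVsyst or PVlib, and the function names and the low-sun cutoff `min_cos_zenith` are assumptions made for the example.

```python
import math


def global_horizontal(dni, dhi, zenith_deg):
    """GHI = DNI * cos(zenith) + DHI, with no direct contribution below the horizon."""
    cos_zenith = max(math.cos(math.radians(zenith_deg)), 0.0)
    return dni * cos_zenith + dhi


def direct_normal(ghi, dhi, zenith_deg, min_cos_zenith=0.05):
    """Invert the closure relation to recover DNI once the diffuse part is known.

    min_cos_zenith is an assumed guard: near sunrise and sunset the division
    by a tiny cosine would amplify measurement noise, so DNI is set to zero.
    """
    cos_zenith = math.cos(math.radians(zenith_deg))
    if cos_zenith < min_cos_zenith:
        return 0.0
    return max(ghi - dhi, 0.0) / cos_zenith
```

A decomposition model supplies the missing piece, an estimate of the diffuse fraction from the measured global irradiance, after which `direct_normal` yields the beam component. All irradiances here are in W/m² and the zenith angle is in degrees.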
Procedia PDF Downloads 140
24005 Social Media Data Analysis for Personality Modelling and Learning Styles Prediction Using Educational Data Mining
Authors: Srushti Patil, Preethi Baligar, Gopalkrishna Joshi, Gururaj N. Bhadri
Abstract:
In designing learning environments, the instructional strategies can be tailored to suit the learning style of an individual to ensure effective learning. In this study, the information shared on social media like Facebook is used to predict the learning style of a learner. Previous research studies have shown that Facebook data can be used to predict user personality. Users with a particular personality exhibit an inherent pattern in their digital footprint on Facebook. The proposed work aims to correlate the users’ personality, predicted from Facebook data, with their learning styles, predicted through questionnaires. For Millennial learners, Facebook has become a primary means for information sharing and interaction with peers. Thus, it can serve as a rich bed for research and direct the design of learning environments. The authors conducted this study in an undergraduate freshman engineering course. Data from 320 freshman Facebook users was collected. The same users also participated in the learning style and personality prediction survey. Kolb’s Learning Style questionnaire and the Big 5 Personality Inventory were adopted for the survey. The users agreed to participate in this research and signed individual consent forms. A specific page was created on Facebook to collect user data such as personal details, status updates, comments, demographic characteristics, and egocentric network parameters. This data was captured by an application written in Python. The data captured from Facebook was subjected to text analysis using the Linguistic Inquiry and Word Count dictionary. An analysis of the data collected from the questionnaires reveals individual student personality and learning style. The results obtained from the analysis of Facebook, learning style, and personality data were then fed into an automatic classifier that was trained using data mining techniques such as rule-based classifiers and decision trees. 
This helps to predict the user personality and learning styles by analysing the common patterns. Rule-based classifiers applied to text analysis help to categorize Facebook data into positive, negative, and neutral. In total, two models were trained: one to predict the personality from Facebook data, and another to predict the learning styles from the personalities. The results show that the classifier model has high accuracy, which makes the proposed method a reliable one for predicting user personality and learning styles.
Keywords: educational data mining, Facebook, learning styles, personality traits
Procedia PDF Downloads 229
24004 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity
Authors: Shaan Khosla, Jon Krohn
Abstract:
In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies to a candidate, and to compile a set of similar candidates for any given candidate. While these models are useful to create, validating their accuracy in a recommendation context is difficult due to data sparsity. In this report, we use network graph data to generate useful representations for candidates and vacancies. We use candidates and vacancies as network nodes and designate a bi-directional link between them based on the candidate interviewing for the vacancy. After applying node2vec, the resulting embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.
Keywords: AI, machine learning, NLP, recruiting
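The graph construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the interview records and node names are made up, and the walks are plain uniform random walks, which correspond to node2vec with its return and in-out parameters both set to 1. The walks would normally be passed to a skip-gram model to obtain embeddings, a step left out here.

```python
import random
from collections import defaultdict

# Hypothetical interview records: (candidate, vacancy) pairs.
interviews = [
    ("cand_a", "vac_1"),
    ("cand_a", "vac_2"),
    ("cand_b", "vac_1"),
    ("cand_c", "vac_2"),
    ("cand_c", "vac_3"),
]


def build_graph(pairs):
    """Build a bi-directional adjacency map from interview pairs."""
    graph = defaultdict(set)
    for candidate, vacancy in pairs:
        graph[candidate].add(vacancy)
        graph[vacancy].add(candidate)
    return graph


def random_walk(graph, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbours = sorted(graph[walk[-1]])
        if not neighbours:
            break
        walk.append(rng.choice(neighbours))
    return walk


graph = build_graph(interviews)
rng = random.Random(0)  # fixed seed so the walks are reproducible
# Three walks of five nodes from every candidate and vacancy node.
walks = [random_walk(graph, node, 5, rng) for node in sorted(graph) for _ in range(3)]
```

Because links exist only between a candidate and a vacancy, every walk alternates between the two node types, which is how candidates who interviewed for the same vacancies end up in similar contexts.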
Procedia PDF Downloads 83
24003 A Web Service-Based Framework for Mining E-Learning Data
Authors: Felermino D. M. A. Ali, S. C. Ng
Abstract:
E-learning is an evolutionary form of distance learning that has improved over time as new technologies have emerged. Today, efforts are still being made to integrate emerging technologies into E-learning systems in order to make them better. Among these advancements, Educational Data Mining (EDM) is gaining increasing popularity due to its wide application in improving the teaching-learning process in online practices. However, even though EDM promises to bring many benefits to the educational industry in general and E-learning environments in particular, its principal drawback is the lack of easy-to-use tools. Current EDM tools usually require users to have additional technical expertise to perform EDM tasks effectively. In response to these limitations, this study designs and implements an EDM application framework that aims at automating and simplifying the development of EDM in E-learning environments. The application framework introduces a Service-Oriented Architecture (SOA) that hides the complexity of technical details and enables users to perform EDM in an automated fashion. The framework was designed based on the principles of abstraction, extensibility, and interoperability. The framework implementation is made up of three major modules. The first module provides an abstraction for data gathering, which was done by extending the Moodle LMS (Learning Management System) source code. The second module provides data mining methods and techniques as services; this was done by converting the Weka API into a set of Web services. The third module acts as an intermediary between the first two modules; it contains a user-friendly interface that allows users to dynamically locate data provider services and run knowledge discovery tasks on data mining services. An experiment was conducted to evaluate the overhead of the proposed framework through a combination of simulation and implementation.
The experiments have shown that the overhead introduced by the SOA mechanism is relatively small; therefore, it has been concluded that a service-oriented architecture can be effectively used to facilitate educational data mining in E-learning environments.
Keywords: educational data mining, e-learning, distributed data mining, moodle, service-oriented architecture, Weka
Procedia PDF Downloads 235
24002 Mathematics Bridging Theory and Applications for a Data-Driven World
Authors: Zahid Ullah, Atlas Khan
Abstract:
In today's data-driven world, the role of mathematics in bridging the gap between theory and applications is becoming increasingly vital. This abstract highlights the significance of mathematics as a powerful tool for analyzing, interpreting, and extracting meaningful insights from vast amounts of data. By integrating mathematical principles with real-world applications, researchers can unlock the full potential of data-driven decision-making processes. This abstract delves into the various ways mathematics acts as a bridge connecting theoretical frameworks to practical applications. It explores the utilization of mathematical models, algorithms, and statistical techniques to uncover hidden patterns, trends, and correlations within complex datasets. Furthermore, it investigates the role of mathematics in enhancing predictive modeling, optimization, and risk assessment methodologies for improved decision-making in diverse fields such as finance, healthcare, engineering, and social sciences. The abstract also emphasizes the need for interdisciplinary collaboration between mathematicians, statisticians, computer scientists, and domain experts to tackle the challenges posed by the data-driven landscape. By fostering synergies between these disciplines, novel approaches can be developed to address complex problems and make data-driven insights accessible and actionable. Moreover, this abstract underscores the importance of robust mathematical foundations for ensuring the reliability and validity of data analysis. Rigorous mathematical frameworks not only provide a solid basis for understanding and interpreting results but also contribute to the development of innovative methodologies and techniques. In summary, this abstract advocates for the pivotal role of mathematics in bridging theory and applications in a data-driven world. 
By harnessing mathematical principles, researchers can unlock the transformative potential of data analysis, paving the way for evidence-based decision-making, optimized processes, and innovative solutions to the challenges of our rapidly evolving society.
Keywords: mathematics, bridging theory and applications, data-driven world, mathematical models
Procedia PDF Downloads 75
24001 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction
Authors: Qais M. Yousef, Yasmeen A. Alshaer
Abstract:
Over the last few years, the amount of data available worldwide has increased rapidly. This has come with the emergence of recent concepts, such as big data and the Internet of Things, which have furnished a suitable solution for the availability of data all over the world. However, managing this massive amount of data remains a challenge due to the large variety of data types and their distribution. Therefore, locating the required file, particularly on the first attempt, has become a difficult task because many different files distributed on the web have similar names. Consequently, the accuracy and speed of search have been negatively affected. This work presents a method that uses electroencephalography (EEG) signals to locate files based on their contents. Building on the concept of natural mind wave processing, this work analyzes the mind wave signals of different people, extracts their most appropriate features using a multi-objective metaheuristic algorithm, and then classifies them using an artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find files based on their contents using human thoughts only. Implementing this approach and testing it on real people demonstrated its ability to find the desired files accurately within a noticeably shorter time and to retrieve them as the first choice for the user.
Keywords: artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization
Procedia PDF Downloads 175
24000 IoT Based Approach to Healthcare System for a Quadriplegic Patient Using EEG
Authors: R. Gautam, P. Sastha Kanagasabai, G. N. Rathna
Abstract:
The proposed healthcare system enables quadriplegic patients, people with severe motor disabilities, to send commands to electronic devices and monitor their vitals. The growth of brain-computer interfaces (BCI) has led to rapid development in assistive systems for the disabled, known as 'assistive domotics'. A brain-computer interface is capable of reading the brainwaves of an individual and analysing them to obtain meaningful data. This processed data can be used to help people with speech disorders, and sometimes people with limited locomotion, to communicate. In this project, an Emotiv EPOC headset is used to obtain the electroencephalogram (EEG). The obtained data is processed to communicate pre-defined commands over the internet to the desired mobile phone user. Other vital information, such as heartbeat, blood pressure, ECG and body temperature, is monitored and uploaded to the server. Data analytics enables physicians to scan databases for a specific illness. The data is processed on an Intel Edison system on chip (SoC). Patient metrics are displayed via the Intel IoT Analytics cloud service.
Keywords: brain computer interface, Intel Edison, Emotiv EPOC, IoT analytics, electroencephalogram
Procedia PDF Downloads 184
23999 Searchable Encryption in Cloud Storage
Authors: Ren Junn Hwang, Chung-Chien Lu, Jain-Shing Wu
Abstract:
Cloud outsourced storage is one of the important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, it is difficult for the cloud server to retrieve exactly the target file from among the encrypted files. This study proposes a protocol for performing multi-keyword searches over encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning the search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong letters does not exceed a given threshold, the user can still retrieve the target files from the cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.
Keywords: fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption
Procedia PDF Downloads 381
23998 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model
Authors: Fatemah A. Alqallaf, Debasis Kundu
Abstract:
The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions and non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and computing them involves solving a four-dimensional optimization problem. To avoid that, we propose an EM algorithm, which involves solving only one non-linear equation at each 'E'-step. Hence, the implementation of the proposed EM algorithm is very straightforward in practice. Extensive simulation experiments and the analysis of one data set have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data. One data set has been analyzed to show the effectiveness of the proposed model.
Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators
Procedia PDF Downloads 141
23997 Blind Data Hiding Technique Using Interpolation of Subsampled Images
Authors: Singara Singh Kasana, Pankaj Garg
Abstract:
In this paper, a blind data hiding technique based on interpolation of subsampled versions of a cover image is proposed. A subsampled image is taken as a reference image, and an interpolated image is generated from this reference image. The difference between the original cover image and the interpolated image is then used to embed secret data. Comparisons with existing interpolation-based techniques show that the proposed technique provides higher embedding capacity and marked images of better visual quality. Moreover, the performance of the proposed technique is more stable for different images.
Keywords: interpolation, image subsampling, PSNR, SIM
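The three quantities named in this abstract (reference image, interpolated image, difference) can be illustrated with a short Python sketch. The subsampling factor of 2, the neighbour-averaging interpolation, and the sample pixel values are assumptions made for illustration only. The abstract does not say how the differences are turned into embedded bits, so the sketch stops at the difference image.

```python
# Hypothetical 4x4 grayscale cover image (values chosen for illustration).
cover = [
    [10, 12, 14, 16],
    [20, 22, 24, 26],
    [30, 32, 34, 36],
    [40, 42, 44, 46],
]


def subsample(img):
    """Keep every second pixel in both directions (assumed factor of 2)."""
    return [row[::2] for row in img[::2]]


def interpolate(ref, height, width):
    """Upscale `ref` back to height x width by averaging the nearest
    reference pixels; sampled positions keep their original value."""
    out = []
    for r in range(height):
        r0 = r // 2
        r1 = min((r + 1) // 2, len(ref) - 1)
        row = []
        for c in range(width):
            c0 = c // 2
            c1 = min((c + 1) // 2, len(ref[0]) - 1)
            total = ref[r0][c0] + ref[r0][c1] + ref[r1][c0] + ref[r1][c1]
            row.append(total // 4)
        out.append(row)
    return out


reference = subsample(cover)
interpolated = interpolate(reference, len(cover), len(cover[0]))
# Per-pixel difference between the cover and its interpolated version.
difference = [
    [cover[r][c] - interpolated[r][c] for c in range(len(cover[0]))]
    for r in range(len(cover))
]
```

In this sketch the difference is zero at every sampled position, so only the non-reference pixels carry a difference value.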
Procedia PDF Downloads 577
23996 Active Contours for Image Segmentation Based on Complex Domain Approach
Authors: Sajid Hussain
Abstract:
A complex domain approach for image segmentation based on active contours has been designed, in which the contour deforms step by step to partition an image into numerous expedient regions. A novel region-based trigonometric complex pressure force function is proposed, which propagates around the region of interest using image forces. The signed trigonometric force function controls the propagation of the active contour, and the active contour stops accurately on the exact edges of the object. The proposed model makes the level set function binary and uses a Gaussian smoothing kernel to adjust it and avoid the re-initialization procedure. The working principle of the proposed model is as follows: the real image data is transformed into complex data by multiplying the image data by iota (i), and the average of iota (i) times the horizontal and vertical components of the gradient of the image data is inserted into the proposed model to capture the complex gradient of the image data. A simple finite difference technique has been used to implement the proposed model. The efficiency and robustness of the proposed model have been verified and compared with other state-of-the-art models.
Keywords: image segmentation, active contour, level set, Mumford and Shah model
Procedia PDF Downloads 112
23995 Discerning Divergent Nodes in Social Networks
Authors: Mehran Asadi, Afrand Agah
Abstract:
In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules that can be applied to classification trees. In this research, we used an online social network dataset and all of its attributes (e.g., node features, labels, etc.) to determine what constitutes an above-average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we consider overfitting and describe different approaches for the evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them to different groups based on their properties and similarities. In data mining, recursive partitioning is used to probe the structure of a data set, which allows us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions and with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see whether any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the resulting classification methods and utilized decision trees (with k-fold cross-validation to prune the tree). The method applied to this dataset is decision trees.
A decision tree is a non-parametric classification method that uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.
Keywords: online social networks, data mining, social cloud computing, interaction and collaboration
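The correlation screening step mentioned in this abstract can be sketched as follows. The authors report using R; this Python version is only an illustration, and the predictor values, the `degree` column, and the 0.9 threshold are all made up for the example.

```python
import math
from itertools import combinations

# Hypothetical node-level predictors; the values are invented for illustration.
predictors = {
    "density": [0.10, 0.20, 0.30, 0.40, 0.50],
    "transitivity": [0.12, 0.19, 0.33, 0.41, 0.48],
    "degree": [5.0, 3.0, 8.0, 2.0, 6.0],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def highly_correlated(data, threshold=0.9):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    pairs = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return pairs


flagged = highly_correlated(predictors)
```

With these invented values, only the density and transitivity pair exceeds the threshold, mirroring the kind of finding the abstract reports.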
Procedia PDF Downloads 154