Search results for: open data
25031 Data Mining Approach: Classification Model Evaluation
Authors: Lubabatu Sada Sodangi
Abstract:
The rapid growth in exchange and accessibility of information via the internet makes many organisations acquire data on their own operation. The aim of data mining is to analyse the different behaviour of a dataset using observation. Although, the subset of the dataset being analysed may not display all the behaviours and relationships of the entire data and, therefore, may not represent other parts that exist in the dataset. There is a range of techniques used in data mining to determine the hidden or unknown information in datasets. In this paper, the performance of two algorithms Chi-Square Automatic Interaction Detection (CHAID) and multilayer perceptron (MLP) would be matched using an Adult dataset to find out the percentage of an/the adults that earn > 50k and those that earn <= 50k per year. The two algorithms were studied and compared using IBM SPSS statistics software. The result for CHAID shows that the most important predictors are relationship and education. The algorithm shows that those are married (husband) and have qualification: Bachelor, Masters, Doctorate or Prof-school whose their age is > 41<57 earn > 50k. Also, multilayer perceptron displays marital status and capital gain as the most important predictors of the income. It also shows that individuals that their capital gain is less than 6,849 and are single, separated or widow, earn <= 50K, whereas individuals with their capital gain is > 6,849, work > 35 hrs/wk, and > 27yrs their income will be > 50k. By comparing the two algorithms, it is observed that both algorithms are reliable but there is strong reliability in CHAID which clearly shows that relation and education contribute to the prediction as displayed in the data visualisation.Keywords: data mining, CHAID, multi-layer perceptron, SPSS, Adult dataset
Procedia PDF Downloads 37825030 Developing an Information Model of Manufacturing Process for Sustainability
Authors: Jae Hyun Lee
Abstract:
Manufacturing companies use life-cycle inventory databases to analyze sustainability of their manufacturing processes. Life cycle inventory data provides reference data which may not be accurate for a specific company. Collecting accurate data of manufacturing processes for a specific company requires enormous time and efforts. An information model of typical manufacturing processes can reduce time and efforts to get appropriate reference data for a specific company. This paper shows an attempt to build an abstract information model which can be used to develop information models for specific manufacturing processes.Keywords: process information model, sustainability, OWL, manufacturing
Procedia PDF Downloads 43025029 An Interpretable Data-Driven Approach for the Stratification of the Cardiorespiratory Fitness
Authors: D.Mendes, J. Henriques, P. Carvalho, T. Rocha, S. Paredes, R. Cabiddu, R. Trimer, R. Mendes, A. Borghi-Silva, L. Kaminsky, E. Ashley, R. Arena, J. Myers
Abstract:
The continued exploration of clinically relevant predictive models continues to be an important pursuit. Cardiorespiratory fitness (CRF) portends clinical vital information and as such its accurate prediction is of high importance. Therefore, the aim of the current study was to develop a data-driven model, based on computational intelligence techniques and, in particular, clustering approaches, to predict CRF. Two prediction models were implemented and compared: 1) the traditional Wasserman/Hansen Equations; and 2) an interpretable clustering approach. Data used for this analysis were from the 'FRIEND - Fitness Registry and the Importance of Exercise: The National Data Base'; in the present study a subset of 10690 apparently healthy individuals were utilized. The accuracy of the models was performed through the computation of sensitivity, specificity, and geometric mean values. The results show the superiority of the clustering approach in the accurate estimation of CRF (i.e., maximal oxygen consumption).Keywords: cardiorespiratory fitness, data-driven models, knowledge extraction, machine learning
Procedia PDF Downloads 28625028 The Optimization of Sexual Health Resource Information and Services for Persons with Spinal Cord Injury
Authors: Nasrin Nejatbakhsh, Anita Kaiser, Sander Hitzig, Colleen McGillivray
Abstract:
Following spinal cord injury (SCI), many individuals experience anxiety in adjusting to their lives, and its impacts on their sexuality. Research has demonstrated that regaining sexual function is a very high priority for individuals with SCI. Despite this, sexual health is one of the least likely areas of focus in rehabilitating individuals with SCI. There is currently a considerable gap in appropriate education and resources that address sexual health concerns and needs of people with spinal cord injury. Furthermore, the determinants of sexual health in individuals with SCI are poorly understood and thus poorly addressed. The purpose of this study was to improve current practices by informing a service delivery model that rehabilitation centers can adopt for appropriate delivery of their services. Methodology: We utilized qualitative methods in the form of a semi-structured interview containing open-ended questions to assess 1) sexual health concerns, 2) helpful strategies in current resources, 3) unhelpful strategies in current resources, and 4) Barriers to obtaining sexual health information. In addition to the interviews, participants completed surveys to identify socio-demographic factors. Data gathered was coded and evaluated for emerging themes and subthemes through a ‘code-recode’ technique. Results: We have identified several robust themes that are important for SCI sexual health resource development. Through analysis of these themes and their subthemes, several important concepts have emerged that could provide agencies with helpful strategies for providing sexual health resources. Some of the important considerations are that services be; anonymous, accessible, frequent, affordable, mandatory, casual and supported by peers. Implications: By incorporating the perspectives of individuals with SCI, the finding from this study can be used to develop appropriate sexual health services and improve access to information through tailored needs based program development.Keywords: spinal cord injury, sexual health, determinants of health, resource development
Procedia PDF Downloads 25125027 Appraisal of Road Transport Infrastructure and Commercial Activities in Ede, Osun State Nigeria
Authors: Rafiu Babatunde Ibrahim, Richard Oluseyi Taiwo, Abiodun Toheeb Akintunde
Abstract:
The relationship between road transport infrastructure and commercial activities in Nigeria has been a topical issue and identified as one of the crucial components for economic development in the country. This study examines road transport infrastructure and commercial activities along selected roads in Ede, Nigeria. The study assesses the characteristics of the selected roads, the condition of road infrastructure, the degree of road network connectivity, maintenance culture for the road infrastructure as well as commercial activities along identified roads in the study area. Stratified Sampling Techniques were used to partition the study area into core, Intermediate and Suburb Township zones. Roads were also classified into Major, Distributor and Access Roads. Field observation and measurement, as well as a questionnaire, were used to obtain primary data from 246 systematically sampled respondents along the roads selected, and they were analyzed using descriptive and inferential statistics. The study revealed that most of the roads were characterized by an incidence of potholes. A total of 448 potholes were observed, where Olowoibida Road accounted for (19.0%), Federal Polytechnic Road (17.4%), and Back to Land Road (16.3%). The majority of the selected roads have no street lights and are of open drainage systems. Also, the condition of road surfaces was observed to be deteriorating. Road network connectivity of the study area was found to be poorly connected with 11% using the alpha index and 40% of Gamma index. It was found that the tailoring business (39) is predominant on major roads and Distributor Roads, while petty trading (35) is dominant on the access road. Results of correlation analysis (r = 0.242) show that there is a low positive correlation between road infrastructure and commercial activities; the significant relationships have indeed explained how important it is in influencing commercial activities across the study area. The study concluded by emphasizing the need for the provision of more roads and proper maintenance of the existing ones. This will no doubt improve the commercial activities along the roads in the study area.Keywords: road transport, infrastructure, commercial activities, maintenance culture
Procedia PDF Downloads 3625026 Centrality and Patent Impact: Coupled Network Analysis of Artificial Intelligence Patents Based on Co-Cited Scientific Papers
Authors: Xingyu Gao, Qiang Wu, Yuanyuan Liu, Yue Yang
Abstract:
In the era of the knowledge economy, the relationship between scientific knowledge and patents has garnered significant attention. Understanding the intricate interplay between the foundations of science and technological innovation has emerged as a pivotal challenge for both researchers and policymakers. This study establishes a coupled network of artificial intelligence patents based on co-cited scientific papers. Leveraging centrality metrics from network analysis offers a fresh perspective on understanding the influence of information flow and knowledge sharing within the network on patent impact. The study initially obtained patent numbers for 446,890 granted US AI patents from the United States Patent and Trademark Office’s artificial intelligence patent database for the years 2002-2020. Subsequently, specific information regarding these patents was acquired using the Lens patent retrieval platform. Additionally, a search and deduplication process was performed on scientific non-patent references (SNPRs) using the Web of Science database, resulting in the selection of 184,603 patents that cited 37,467 unique SNPRs. Finally, this study constructs a coupled network comprising 59,379 artificial intelligence patents by utilizing scientific papers co-cited in patent backward citations. In this network, nodes represent patents, and if patents reference the same scientific papers, connections are established between them, serving as edges within the network. Nodes and edges collectively constitute the patent coupling network. Structural characteristics such as node degree centrality, betweenness centrality, and closeness centrality are employed to assess the scientific connections between patents, while citation count is utilized as a quantitative metric for patent influence. Finally, a negative binomial model is employed to test the nonlinear relationship between these network structural features and patent influence. The research findings indicate that network structural features such as node degree centrality, betweenness centrality, and closeness centrality exhibit inverted U-shaped relationships with patent influence. Specifically, as these centrality metrics increase, patent influence initially shows an upward trend, but once these features reach a certain threshold, patent influence starts to decline. This discovery suggests that moderate network centrality is beneficial for enhancing patent influence, while excessively high centrality may have a detrimental effect on patent influence. This finding offers crucial insights for policymakers, emphasizing the importance of encouraging moderate knowledge flow and sharing to promote innovation when formulating technology policies. It suggests that in certain situations, data sharing and integration can contribute to innovation. Consequently, policymakers can take measures to promote data-sharing policies, such as open data initiatives, to facilitate the flow of knowledge and the generation of innovation. Additionally, governments and relevant agencies can achieve broader knowledge dissemination by supporting collaborative research projects, adjusting intellectual property policies to enhance flexibility, or nurturing technology entrepreneurship ecosystems.Keywords: centrality, patent coupling network, patent influence, social network analysis
Procedia PDF Downloads 5425025 Dissecting Big Trajectory Data to Analyse Road Network Travel Efficiency
Authors: Rania Alshikhe, Vinita Jindal
Abstract:
Digital innovation has played a crucial role in managing smart transportation. For this, big trajectory data collected from traveling vehicles, such as taxis through installed global positioning system (GPS)-enabled devices can be utilized. It offers an unprecedented opportunity to trace the movements of vehicles in fine spatiotemporal granularity. This paper aims to explore big trajectory data to measure the travel efficiency of road networks using the proposed statistical travel efficiency measure (STEM) across an entire city. Further, it identifies the cause of low travel efficiency by proposed least square approximation network-based causality exploration (LANCE). Finally, the resulting data analysis reveals the causes of low travel efficiency, along with the road segments that need to be optimized to improve the traffic conditions and thus minimize the average travel time from given point A to point B in the road network. Obtained results show that our proposed approach outperforms the baseline algorithms for measuring the travel efficiency of the road network.Keywords: GPS trajectory, road network, taxi trips, digital map, big data, STEM, LANCE
Procedia PDF Downloads 15725024 Mitigating Supply Chain Risk for Sustainability Using Big Data Knowledge: Evidence from the Manufacturing Supply Chain
Authors: Mani Venkatesh, Catarina Delgado, Purvishkumar Patel
Abstract:
The sustainable supply chain is gaining popularity among practitioners because of increased environmental degradation and stakeholder awareness. On the other hand supply chain, risk management is very crucial for the practitioners as it potentially disrupts supply chain operations. Prediction and addressing the risk caused by social issues in the supply chain is paramount importance to the sustainable enterprise. More recently, the usage of Big data analytics for forecasting business trends has been gaining momentum among professionals. The aim of the research is to explore the application of big data, predictive analytics in successfully mitigating supply chain social risk and demonstrate how such mitigation can help in achieving sustainability (environmental, economic & social). The method involves the identification and validation of social issues in the supply chain by an expert panel and survey. Later, we used a case study to illustrate the application of big data in the successful identification and mitigation of social issues in the supply chain. Our result shows that the company can predict various social issues through big data, predictive analytics and mitigate the social risk. We also discuss the implication of this research to the body of knowledge and practice.Keywords: big data, sustainability, supply chain social sustainability, social risk, case study
Procedia PDF Downloads 40825023 Improving the Analytical Power of Dynamic DEA Models, by the Consideration of the Shape of the Distribution of Inputs/Outputs Data: A Linear Piecewise Decomposition Approach
Authors: Elias K. Maragos, Petros E. Maravelakis
Abstract:
In Dynamic Data Envelopment Analysis (DDEA), which is a subfield of Data Envelopment Analysis (DEA), the productivity of Decision Making Units (DMUs) is considered in relation to time. In this case, as it is accepted by the most of the researchers, there are outputs, which are produced by a DMU to be used as inputs in a future time. Those outputs are known as intermediates. The common models, in DDEA, do not take into account the shape of the distribution of those inputs, outputs or intermediates data, assuming that the distribution of the virtual value of them does not deviate from linearity. This weakness causes the limitation of the accuracy of the analytical power of the traditional DDEA models. In this paper, the authors, using the concept of piecewise linear inputs and outputs, propose an extended DDEA model. The proposed model increases the flexibility of the traditional DDEA models and improves the measurement of the dynamic performance of DMUs.Keywords: Dynamic Data Envelopment Analysis, DDEA, piecewise linear inputs, piecewise linear outputs
Procedia PDF Downloads 16225022 Enhancement in the Absorption Efficiency of Gaas/Inas Nanowire Solar Cells through a Decrease in Light Reflection
Authors: Latef M. Ali, Farah A. Abed
Abstract:
In this paper, the effect of the Barium fluoride (BaF2) layer on the absorption efficiency of GaAs/InAs nanowire solar cells was investigated using the finite difference time domain (FDTD) method. By inserting the BaF2 as antireflection with the dominant size of 10 nm to fill the space between the shells of wires on the Si (111) substrate. The absorption is significantly improved due to the strong reabsorption of light reflected at the shells and compared with the reference cells. The present simulation leads to a higher absorption efficiency (Qabs) and reaches a value of 97%, and the external quantum efficiencies (EQEs) above 92% are observed. The current density (Jsc) increases by 0.22 mA/cm2 and the open-circuit voltage (Voc) is enhanced by 0.11 mV.Keywords: nanowire solar cells, absorption efficiency, photovoltaic, band structures, fdtd simulation
Procedia PDF Downloads 7425021 A Proposal of Advanced Key Performance Indicators for Assessing Six Performances of Construction Projects
Authors: Wi Sung Yoo, Seung Woo Lee, Youn Kyoung Hur, Sung Hwan Kim
Abstract:
Large-scale construction projects are continuously increasing, and the need for tools to monitor and evaluate the project success is emphasized. At the construction industry level, there are limitations in deriving performance evaluation factors that reflect the diversity of construction sites and systems that can objectively evaluate and manage performance. Additionally, there are difficulties in integrating structured and unstructured data generated at construction sites and deriving improvements. In this study, we propose the Key Performance Indicators (KPIs) to enable performance evaluation that reflects the increased diversity of construction sites and the unstructured data generated, and present a model for measuring performance by the derived indicators. The comprehensive performance of a unit construction site is assessed based on 6 areas (Time, Cost, Quality, Safety, Environment, Productivity) and 26 indicators. We collect performance indicator information from 30 construction sites that meet legal standards and have been successfully performed. And We apply data augmentation and optimization techniques into establishing measurement standards for each indicator. In other words, the KPI for construction site performance evaluation presented in this study provides standards for evaluating performance in six areas using institutional requirement data and document data. This can be expanded to establish a performance evaluation system considering the scale and type of construction project. Also, they are expected to be used as a comprehensive indicator of the construction industry and used as basic data for tracking competitiveness at the national level and establishing policies.Keywords: key performance indicator, performance measurement, structured and unstructured data, data augmentation
Procedia PDF Downloads 4325020 Oscillating Water Column Wave Energy Converter with Deep Water Reactance
Authors: William C. Alexander
Abstract:
The oscillating water column (OSC) wave energy converter (WEC) with deep water reactance (DWR) consists of a large hollow sphere filled with seawater at the base, referred to as the ‘stabilizer’, a hollow cylinder at the top of the device, with a said cylinder having a bottom open to the sea and a sealed top save for an orifice which leads to an air turbine, and a long, narrow rod connecting said stabilizer with said cylinder. A small amount of ballast at the bottom of the stabilizer and a small amount of floatation in the cylinder keeps the device upright in the sea. The floatation is set such that the mean water level is nominally halfway up the cylinder. The entire device is loosely moored to the seabed to keep it from drifting away. In the presence of ocean waves, seawater will move up and down within the cylinder, producing the ‘oscillating water column’. This gives rise to air pressure within the cylinder alternating between positive and negative gauge pressure, which in turn causes air to alternately leave and enter the cylinder through said top-cover situated orifice. An air turbine situated within or immediately adjacent to said orifice converts the oscillating airflow into electric power for transport to shore or elsewhere by electric power cable. Said oscillating air pressure produces large up and down forces on the cylinder. Said large forces are opposed through the rod to the large mass of water retained within the stabilizer, which is located deep enough to be mostly free of any wave influence and which provides the deepwater reactance. The cylinder and stabilizer form a spring-mass system which has a vertical (heave) resonant frequency. The diameter of the cylinder largely determines the power rating of the device, while the size (and water mass within) of the stabilizer determines said resonant frequency. Said frequency is chosen to be on the lower end of the wave frequency spectrum to maximize the average power output of the device over a large span of time (such as a year). The upper portion of the device (the cylinder) moves laterally (surge) with the waves. This motion is accommodated with minimal loading on the said rod by having the stabilizer shaped like a sphere, allowing the entire device to rotate about the center of the stabilizer without rotating the seawater within the stabilizer. A full-scale device of this type may have the following dimensions. The cylinder may be 16 meters in diameter and 30 meters high, the stabilizer 25 meters in diameter, and the rod 55 meters long. Simulations predict that this will produce 1,400 kW in waves of 3.5-meter height and 12 second period, with a relatively flat power curve between 5 and 16 second wave periods, as will be suitable for an open-ocean location. This is nominally 10 times higher power than similar-sized WEC spar buoys as reported in the literature, and the device is projected to have only 5% of the mass per unit power of other OWC converters.Keywords: oscillating water column, wave energy converter, spar bouy, stabilizer
Procedia PDF Downloads 10725019 Analysis on Urban Form and Evolution Mechanism of High-Density City: Case Study of Hong Kong
Authors: Yuan Zhang
Abstract:
Along with large population and great demands for urban development, Hong Kong serves as a typical high-density city with multiple altitudes, advanced three-dimensional traffic system, rich city open space, etc. This paper contributes to analyzing its complex urban form and evolution mechanism from three aspects of view, separately as time, space and buildings. Taking both horizontal and vertical dimension into consideration, this paper provides a perspective to explore the fascinating process of growing and space folding in the urban form of high-density city, also as a research reference for related high-density urban design.Keywords: evolution mechanism, high-density city, Hong Kong, urban form
Procedia PDF Downloads 40425018 Comparative Forensic Analysis of Lipsticks Using Thin Layer Chromatography and Gas Chromatography
Authors: M. O. Ezegbogu, H. B. Osadolor
Abstract:
Lipsticks constitute a significant source of transfer evidence, and can, therefore, provide corroborative or inclusionary evidence in criminal investigation. This study aimed to determine the uniqueness and persistence of different lipstick smears using Thin Layer Chromatography (TLC), and Gas Chromatography with a Flame Ionisation Detector (GC-FID). In this study, we analysed lipstick smears retrieved from tea cups exposed to the environment for up to four weeks. The n-alkane content of each sample was determined using GC-FID, while TLC was used to determine the number of bands, and retention factor of each band per smear. This study shows that TLC gives more consistent results over a 4-week period than GC-FID. It also proposes a maximum exposure time of two weeks for the analysis of lipsticks left in the open using GC-FID. Finally, we conclude that neither TLC nor GC-FID can distinguish lipstick evidence recovered from hypothetical crime scenes.Keywords: forensic science, chromatography, identification, lipstick
Procedia PDF Downloads 18725017 A Fuzzy TOPSIS Based Model for Safety Risk Assessment of Operational Flight Data
Authors: N. Borjalilu, P. Rabiei, A. Enjoo
Abstract:
Flight Data Monitoring (FDM) program assists an operator in aviation industries to identify, quantify, assess and address operational safety risks, in order to improve safety of flight operations. FDM is a powerful tool for an aircraft operator integrated into the operator’s Safety Management System (SMS), allowing to detect, confirm, and assess safety issues and to check the effectiveness of corrective actions, associated with human errors. This article proposes a model for safety risk assessment level of flight data in a different aspect of event focus based on fuzzy set values. It permits to evaluate the operational safety level from the point of view of flight activities. The main advantages of this method are proposed qualitative safety analysis of flight data. This research applies the opinions of the aviation experts through a number of questionnaires Related to flight data in four categories of occurrence that can take place during an accident or an incident such as: Runway Excursions (RE), Controlled Flight Into Terrain (CFIT), Mid-Air Collision (MAC), Loss of Control in Flight (LOC-I). By weighting each one (by F-TOPSIS) and applying it to the number of risks of the event, the safety risk of each related events can be obtained.Keywords: F-topsis, fuzzy set, flight data monitoring (FDM), flight safety
Procedia PDF Downloads 16825016 From Modeling of Data Structures towards Automatic Programs Generating
Authors: Valentin P. Velikov
Abstract:
Automatic program generation saves time, human resources, and allows receiving syntactically clear and logically correct modules. The 4-th generation programming languages are related to drawing the data and the processes of the subject area, as well as, to obtain a frame of the respective information system. The application can be separated in interface and business logic. That means, for an interactive generation of the needed system to be used an already existing toolkit or to be created a new one.Keywords: computer science, graphical user interface, user dialog interface, dialog frames, data modeling, subject area modeling
Procedia PDF Downloads 30625015 Optimized Weight Selection of Control Data Based on Quotient Space of Multi-Geometric Features
Authors: Bo Wang
Abstract:
The geometric processing of multi-source remote sensing data using control data of different scale and different accuracy is an important research direction of multi-platform system for earth observation. In the existing block bundle adjustment methods, as the controlling information in the adjustment system, the approach using single observation scale and precision is unable to screen out the control information and to give reasonable and effective corresponding weights, which reduces the convergence and adjustment reliability of the results. Referring to the relevant theory and technology of quotient space, in this project, several subjects are researched. Multi-layer quotient space of multi-geometric features is constructed to describe and filter control data. Normalized granularity merging mechanism of multi-layer control information is studied and based on the normalized scale factor, the strategy to optimize the weight selection of control data which is less relevant to the adjustment system can be realized. At the same time, geometric positioning experiment is conducted using multi-source remote sensing data, aerial images, and multiclass control data to verify the theoretical research results. This research is expected to break through the cliché of the single scale and single accuracy control data in the adjustment process and expand the theory and technology of photogrammetry. Thus the problem to process multi-source remote sensing data will be solved both theoretically and practically.Keywords: multi-source image geometric process, high precision geometric positioning, quotient space of multi-geometric features, optimized weight selection
Procedia PDF Downloads 28525014 Consortium Blockchain-based Model for Data Management Applications in the Healthcare Sector
Authors: Teo Hao Jing, Shane Ho Ken Wae, Lee Jin Yu, Burra Venkata Durga Kumar
Abstract:
Current distributed healthcare systems face the challenge of interoperability of health data. Storing electronic health records (EHR) in local databases causes them to be fragmented. This problem is aggravated as patients visit multiple healthcare providers in their lifetime. Existing solutions are unable to solve this issue and have caused burdens to healthcare specialists and patients alike. Blockchain technology was found to be able to increase the interoperability of health data by implementing digital access rules, enabling uniformed patient identity, and providing data aggregation. Consortium blockchain was found to have high read throughputs, is more trustworthy, more secure against external disruptions and accommodates transactions without fees. Therefore, this paper proposes a blockchain-based model for data management applications. In this model, a consortium blockchain is implemented by using a delegated proof of stake (DPoS) as its consensus mechanism. This blockchain allows collaboration between users from different organizations such as hospitals and medical bureaus. Patients serve as the owner of their information, where users from other parties require authorization from the patient to view their information. Hospitals upload the hash value of patients’ generated data to the blockchain, whereas the encrypted information is stored in a distributed cloud storage.Keywords: blockchain technology, data management applications, healthcare, interoperability, delegated proof of stake
Procedia PDF Downloads 13825013 Finding the Free Stream Velocity Using Flow Generated Sound
Authors: Saeed Hosseini, Ali Reza Tahavvor
Abstract:
Sound processing is one the subjects that newly attracts a lot of researchers. It is efficient and usually less expensive than other methods. In this paper the flow generated sound is used to estimate the flow speed of free flows. Many sound samples are gathered. After analyzing the data, a parameter named wave power is chosen. For all samples, the wave power is calculated and averaged for each flow speed. A curve is fitted to the averaged data and a correlation between the wave power and flow speed is founded. Test data are used to validate the method and errors for all test data were under 10 percent. The speed of the flow can be estimated by calculating the wave power of the flow generated sound and using the proposed correlation.Keywords: the flow generated sound, free stream, sound processing, speed, wave power
Procedia PDF Downloads 41525012 Applying Big Data Analysis to Efficiently Exploit the Vast Unconventional Tight Oil Reserves
Authors: Shengnan Chen, Shuhua Wang
Abstract:
Successful production of hydrocarbon from unconventional tight oil reserves has changed the energy landscape in North America. The oil contained within these reservoirs typically will not flow to the wellbore at economic rates without assistance from advanced horizontal well and multi-stage hydraulic fracturing. Efficient and economic development of these reserves is a priority of society, government, and industry, especially under the current low oil prices. Meanwhile, society needs technological and process innovations to enhance oil recovery while concurrently reducing environmental impacts. Recently, big data analysis and artificial intelligence become very popular, developing data-driven insights for better designs and decisions in various engineering disciplines. However, the application of data mining in petroleum engineering is still in its infancy. The objective of this research aims to apply intelligent data analysis and data-driven models to exploit unconventional oil reserves both efficiently and economically. More specifically, a comprehensive database including the reservoir geological data, reservoir geophysical data, well completion data and production data for thousands of wells is firstly established to discover the valuable insights and knowledge related to tight oil reserves development. Several data analysis methods are introduced to analysis such a huge dataset. For example, K-means clustering is used to partition all observations into clusters; principle component analysis is applied to emphasize the variation and bring out strong patterns in the dataset, making the big data easy to explore and visualize; exploratory factor analysis (EFA) is used to identify the complex interrelationships between well completion data and well production data. Different data mining techniques, such as artificial neural network, fuzzy logic, and machine learning technique are then summarized, and appropriate ones are selected to analyze the database based on the prediction accuracy, model robustness, and reproducibility. Advanced knowledge and patterned are finally recognized and integrated into a modified self-adaptive differential evolution optimization workflow to enhance the oil recovery and maximize the net present value (NPV) of the unconventional oil resources. This research will advance the knowledge in the development of unconventional oil reserves and bridge the gap between the big data and performance optimizations in these formations. The newly developed data-driven optimization workflow is a powerful approach to guide field operation, which leads to better designs, higher oil recovery and economic return of future wells in the unconventional oil reserves.Keywords: big data, artificial intelligence, enhance oil recovery, unconventional oil reserves
Procedia PDF Downloads 28325011 Efficiency of DMUs in Presence of New Inputs and Outputs in DEA
Authors: Esmat Noroozi, Elahe Sarfi, Farha Hosseinzadeh Lotfi
Abstract:
Examining the impacts of data modification is considered as sensitivity analysis. A lot of studies have considered the data modification of inputs and outputs in DEA. The issues which has not heretofore been considered in DEA sensitivity analysis is modification in the number of inputs and (or) outputs and determining the impacts of this modification in the status of efficiency of DMUs. This paper is going to present systems that show the impacts of adding one or multiple inputs or outputs on the status of efficiency of DMUs and furthermore a model is presented for recognizing the minimum number of inputs and (or) outputs from among specified inputs and outputs which can be added whereas an inefficient DMU will become efficient. Finally the presented systems and model have been utilized for a set of real data and the results have been reported.Keywords: data envelopment analysis, efficiency, sensitivity analysis, input, out put
Procedia PDF Downloads 45025010 Credit Card Fraud Detection with Ensemble Model: A Meta-Heuristic Approach
Authors: Gong Zhilin, Jing Yang, Jian Yin
Abstract:
The purpose of this paper is to develop a novel system for credit card fraud detection based on sequential modeling of data using hybrid deep learning models. The projected model encapsulates five major phases are pre-processing, imbalance-data handling, feature extraction, optimal feature selection, and fraud detection with an ensemble classifier. The collected raw data (input) is pre-processed to enhance the quality of the data through alleviation of the missing data, noisy data as well as null values. The pre-processed data are class imbalanced in nature, and therefore they are handled effectively with the K-means clustering-based SMOTE model. From the balanced class data, the most relevant features like improved Principal Component Analysis (PCA), statistical features (mean, median, standard deviation) and higher-order statistical features (skewness and kurtosis). Among the extracted features, the most optimal features are selected with the Self-improved Arithmetic Optimization Algorithm (SI-AOA). This SI-AOA model is the conceptual improvement of the standard Arithmetic Optimization Algorithm. The deep learning models like Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and optimized Quantum Deep Neural Network (QDNN). The LSTM and CNN are trained with the extracted optimal features. The outcomes from LSTM and CNN will enter as input to optimized QDNN that provides the final detection outcome. Since the QDNN is the ultimate detector, its weight function is fine-tuned with the Self-improved Arithmetic Optimization Algorithm (SI-AOA).Keywords: credit card, data mining, fraud detection, money transactions
Procedia PDF Downloads 13125009 Modelling Affordable Waste Management Solutions for India
Authors: Pradip Baishya, D. K. Mahanta
Abstract:
Rapid and unplanned urbanisation in most cities of India has progressively increased the problem of managing municipal waste in the past few years. With insufficient infrastructure and funds, Municipalities in most cities are struggling to cope with the pace of waste generated. Open dumping is widely in practice as a cheaper option. Scientific disposal of waste in such a large scale with the elements of segregation, recycling, landfill, and incineration involves sophisticated and expensive plants. In an effort to finding affordable and simple solutions to address this burning issue of waste disposal, a semi-mechanized plant has been designed underlying the concept of a zero waste community. The fabrication work of the waste management unit is carried out by local skills from locally available materials. A resident colony in the city of Guwahati has been chosen, which is seen as a typical representative of most cities in India in terms of size and key issues surrounding waste management. Scientific management and disposal of waste on site is carried out on the principle of reduce, reuse and recycle from segregation to compositing. It is a local community participatory model, which involves all stakeholders in the process namely rag pickers, residents, municipality and local industry. Studies were conducted to testify the plant as revenue earning self-sustaining model in the long term. Current working efficiency of plant for segregation was found to be 1kg per minute. Identifying bottlenecks in the success of the model, data on efficiency of the plant, economics of its fabrication were part of the study. Similar satellite waste management plants could potentially be a solution to supplement the waste management system of municipalities of similar sized cities in India or South East Asia with similar issues surrounding waste disposal.Keywords: affordable, rag pickers, recycle, reduce, reuse, segregation, zero waste
Procedia PDF Downloads 30525008 WebAppShield: An Approach Exploiting Machine Learning to Detect SQLi Attacks in an Application Layer in Run-time
Authors: Ahmed Abdulla Ashlam, Atta Badii, Frederic Stahl
Abstract:
In recent years, SQL injection attacks have been identified as being prevalent against web applications. They affect network security and user data, which leads to a considerable loss of money and data every year. This paper presents the use of classification algorithms in machine learning using a method to classify the login data filtering inputs into "SQLi" or "Non-SQLi,” thus increasing the reliability and accuracy of results in terms of deciding whether an operation is an attack or a valid operation. A method Web-App auto-generated twin data structure replication. Shielding against SQLi attacks (WebAppShield) that verifies all users and prevents attackers (SQLi attacks) from entering and or accessing the database, which the machine learning module predicts as "Non-SQLi" has been developed. A special login form has been developed with a special instance of data validation; this verification process secures the web application from its early stages. The system has been tested and validated, up to 99% of SQLi attacks have been prevented.Keywords: SQL injection, attacks, web application, accuracy, database
Procedia PDF Downloads 15125007 From Theory to Practice: Harnessing Mathematical and Statistical Sciences in Data Analytics
Authors: Zahid Ullah, Atlas Khan
Abstract:
The rapid growth of data in diverse domains has created an urgent need for effective utilization of mathematical and statistical sciences in data analytics. This abstract explores the journey from theory to practice, emphasizing the importance of harnessing mathematical and statistical innovations to unlock the full potential of data analytics. Drawing on a comprehensive review of existing literature and research, this study investigates the fundamental theories and principles underpinning mathematical and statistical sciences in the context of data analytics. It delves into key mathematical concepts such as optimization, probability theory, statistical modeling, and machine learning algorithms, highlighting their significance in analyzing and extracting insights from complex datasets. Moreover, this abstract sheds light on the practical applications of mathematical and statistical sciences in real-world data analytics scenarios. Through case studies and examples, it showcases how mathematical and statistical innovations are being applied to tackle challenges in various fields such as finance, healthcare, marketing, and social sciences. These applications demonstrate the transformative power of mathematical and statistical sciences in data-driven decision-making. The abstract also emphasizes the importance of interdisciplinary collaboration, as it recognizes the synergy between mathematical and statistical sciences and other domains such as computer science, information technology, and domain-specific knowledge. Collaborative efforts enable the development of innovative methodologies and tools that bridge the gap between theory and practice, ultimately enhancing the effectiveness of data analytics. Furthermore, ethical considerations surrounding data analytics, including privacy, bias, and fairness, are addressed within the abstract. It underscores the need for responsible and transparent practices in data analytics, and highlights the role of mathematical and statistical sciences in ensuring ethical data handling and analysis. In conclusion, this abstract highlights the journey from theory to practice in harnessing mathematical and statistical sciences in data analytics. It showcases the practical applications of these sciences, the importance of interdisciplinary collaboration, and the need for ethical considerations. By bridging the gap between theory and practice, mathematical and statistical sciences contribute to unlocking the full potential of data analytics, empowering organizations and decision-makers with valuable insights for informed decision-making.Keywords: data analytics, mathematical sciences, optimization, machine learning, interdisciplinary collaboration, practical applications
Procedia PDF Downloads 9325006 Regression for Doubly Inflated Multivariate Poisson Distributions
Authors: Ishapathik Das, Sumen Sen, N. Rao Chaganty, Pooja Sengupta
Abstract:
Dependent multivariate count data occur in several research studies. These data can be modeled by a multivariate Poisson or Negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells are much larger than other cells, then the copula based multivariate Poisson (or Negative binomial) distribution may not fit well and it is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher compared to the other cells, and develop a doubly inflated multivariate Poisson distribution function using multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. For illustrating the proposed methodologies, we present a real data containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation to estimate unknown parameters of the models.Keywords: copula, Gaussian copula, multivariate distributions, inflated distributios
Procedia PDF Downloads 15625005 An Exploratory Research of Human Character Analysis Based on Smart Watch Data: Distinguish the Drinking State from Normal State
Authors: Lu Zhao, Yanrong Kang, Lili Guo, Yuan Long, Guidong Xing
Abstract:
Smart watches, as a handy device with rich functionality, has become one of the most popular wearable devices all over the world. Among the various function, the most basic is health monitoring. The monitoring data can be provided as an effective evidence or a clue for the detection of crime cases. For instance, the step counting data can help to determine whether the watch wearer was quiet or moving during the given time period. There is, however, still quite few research on the analysis of human character based on these data. The purpose of this research is to analyze the health monitoring data to distinguish the drinking state from normal state. The analysis result may play a role in cases involving drinking, such as drunk driving. The experiment mainly focused on finding the figures of smart watch health monitoring data that change with drinking and figuring up the change scope. The chosen subjects are mostly in their 20s, each of whom had been wearing the same smart watch for a week. Each subject drank for several times during the week, and noted down the begin and end time point of the drinking. The researcher, then, extracted and analyzed the health monitoring data from the watch. According to the descriptive statistics analysis, it can be found that the heart rate change when drinking. The average heart rate is about 10% higher than normal, the coefficient of variation is less than about 30% of the normal state. Though more research is needed to be carried out, this experiment and analysis provide a thought of the application of the data from smart watches.Keywords: character analysis, descriptive statistics analysis, drink state, heart rate, smart watch
Procedia PDF Downloads 16725004 Experimental Analyses of Thermoelectric Generator Behavior Using Two Types of Thermoelectric Modules for Marine Application
Authors: A. Nour Eddine, D. Chalet, L. Aixala, P. Chessé, X. Faure, N. Hatat
Abstract:
Thermal power technology such as the TEG (Thermo-Electric Generator) arouses significant attention worldwide for waste heat recovery. Despite the potential benefits of marine application due to the permanent heat sink from sea water, no significant studies on this application were to be found. In this study, a test rig has been designed and built to test the performance of the TEG on engine operating points. The TEG device is built from commercially available materials for the sake of possible economical application. Two types of commercial TEM (thermo electric module) have been studied separately on the test rig. The engine data were extracted from a commercial Diesel engine since it shares the same principle in terms of engine efficiency and exhaust with the marine Diesel engine. An open circuit water cooling system is used to replicate the sea water cold source. The characterization tests showed that the silicium-germanium alloys TEM proved a remarkable reliability on all engine operating points, with no significant deterioration of performance even under sever variation in the hot source conditions. The performance of the bismuth-telluride alloys was 100% better than the first type of TEM but it showed a deterioration in power generation when the air temperature exceeds 300 °C. The temperature distribution on the heat exchange surfaces revealed no useful combination of these two types of TEM with this tube length, since the surface temperature difference between both ends is no more than 10 °C. This study exposed the perspective of use of TEG technology for marine engine exhaust heat recovery. Although the results suggested non-sufficient power generation from the low cost commercial TEM used, it provides valuable information about TEG device optimization, including the design of heat exchanger and the types of thermo-electric materials.Keywords: internal combustion engine application, Seebeck, thermo-electricity, waste heat recovery
Procedia PDF Downloads 24425003 Study Regarding Effect of Isolation on Social Behaviour in Mice
Authors: Ritu Shitak
Abstract:
Humans are social mammals, of the primate order. Our biology, behaviour, and pathologies are unique to us. In our desire to understand, reduce solitary confinement one source of information is the many reports of social isolation of other social mammals, especially primates. A behavioural study was conducted in the department of pharmacology at Indira Gandhi Medical College, Shimla in Himachal Pradesh province in India using white albino mice. Different behavioural parameters were observed by using open field, tail suspension, tests for aggressive behaviour and social interactions and the effect of isolation was studied. The results were evaluated and the standard statistics were applied. The said study was done to establish facts that isolation itself impairs social behaviour and can lead to alcohol dependence as well as related drug dependence.Keywords: social isolation, albino mice, drug dependence, isolation on social behaviour
Procedia PDF Downloads 47225002 Pilot Study of Determining the Impact of Surface Subsidence at The Intersection of Cave Mining with the Surface Using an Electrical Impedance Tomography
Authors: Ariungerel Jargal
Abstract:
: Cave mining is a bulk underground mining method, which allows large low-grade deposits to be mined underground. This method involves undermining the orebody to make it collapse under its own weight into a series of chambers from which the ore extracted. It is a useful technique to extend the life of large deposits previously mined by open pits, and it is a method increasingly proposed for new mines around the world. We plan to conduct a feasibility study using Electrical impedance tomography (EIT) technology to show how much subsidence there is at the intersection with the cave mining surface. EIT is an imaging technique which uses electrical measurements at electrodes attached on the body surface to yield a cross-sectional image of conductivity changes within the object. EIT has been developed in several different applications areas as a simpler, cheaper alternative to many other imaging methods. A low frequency current is injected between pairs of electrodes while voltage measurements are collected at all other electrode pairs. In the difference EIT, images are reconstructed of the change in conductivity distribution (σ) between the acquisition of the two sets of measurements. Image reconstruction in EIT requires the solution of an ill-conditioned nonlinear inverse problem on noisy data, typically requiring make simpler assumptions or regularization. It is noted that the ratio of current to voltage represents a complex value according to Ohm’s law, and that it is theoretically possible to re-express EIT. The results of the experiment were presented on the simulation, and it was concluded that it is possible to conduct further real experiments. Drill a certain number of holes in the top wall of the cave to attach the electrodes, flow a current through them, and measure and acquire the potential through these electrodes. Appropriate values should be selected depending on the distance between the holes, the frequency and duration of the measurements, the surface characteristics and the size of the study area using an EIT device.Keywords: impedance tomography, cave mining, soil, EIT device
Procedia PDF Downloads 126