Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1312

Search results for: custom dataset

382 Composite Approach to Extremism and Terrorism Web Content Classification

Authors: Kolade Olawande Owoeye, George Weir

Abstract:

Terrorist and extremist activities on the internet are becoming the most significant threats to national security because of their potential dangers. In response to this challenge, law enforcement and security authorities are actively implementing comprehensive measures to counter the use of the internet for terrorism. To achieve these measures, there is a need for intelligence gathering via the internet. This includes real-time monitoring of potential websites that are used by extremist groups for recruitment and information dissemination, among other operations. However, with billions of active webpages, real-time monitoring of all webpages becomes almost impossible. To narrow down the search domain, there is a need for efficient webpage classification techniques. This research proposes a new approach, the SentiPosit-based method, which combines features of the Posit-based method and the SentiStrength-based method for classification of terrorism and extremism webpages. The experiment was carried out on 7500 webpages obtained through TENE-webcrawler by the International Cyber Crime Research Centre (ICCRC). The webpages were manually grouped into three classes, ‘pro-extremist’, ‘anti-extremist’ and ‘neutral’, with 2500 webpages in each category. A supervised learning algorithm was then applied to the classified dataset in order to build the model. The results obtained were compared with existing classification methods using prediction accuracy and runtime. It was observed that our proposed hybrid approach produced better classification accuracy than existing approaches within a reasonable runtime.

Keywords: sentiposit, classification, extremism, terrorism

Procedia PDF Downloads 258

381 Multi-Temporal Mapping of Built-up Areas Using Daytime and Nighttime Satellite Images Based on Google Earth Engine Platform

Authors: S. Hutasavi, D. Chen

Abstract:

The built-up area is a significant proxy for measuring regional economic growth and reflects the Gross Provincial Product (GPP). However, an up-to-date and reliable database of built-up areas is not always available, especially in developing countries. Cloud-based geospatial analysis platforms such as Google Earth Engine (GEE) give those countries the accessibility and computational power to generate built-up data. Therefore, this study aims to extract the built-up areas in the Eastern Economic Corridor (EEC), Thailand, using daytime and nighttime satellite imagery based on GEE facilities. The normalized indices were generated from the Landsat 8 surface reflectance dataset, including the Normalized Difference Built-up Index (NDBI), Built-up Index (BUI), and Modified Built-up Index (MBUI). These indices were applied to identify built-up areas in the EEC. The results show that MBUI performs better than BUI and NDBI, with the highest accuracy of 0.85 and Kappa of 0.82. Moreover, the overall accuracy of classification improved from 79% to 90%, and the error in total built-up area decreased from 29% to 0.7%, after night-time light data from the Visible Infrared Imaging Radiometer Suite (VIIRS) Day/Night Band (DNB) were incorporated. The results suggest that MBUI with night-time light imagery is appropriate for built-up area extraction and can be utilized for further study of the socioeconomic impacts of regional development policy over the EEC region.
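The index arithmetic named in this abstract can be sketched as follows. This is a minimal illustration and not the authors' GEE workflow: it assumes the commonly cited definitions NDBI = (SWIR1 - NIR) / (SWIR1 + NIR), NDVI = (NIR - Red) / (NIR + Red) and BUI = NDBI - NDVI, the reflectance values are made up, and MBUI is omitted because the abstract does not define it.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndbi(swir1: float, nir: float) -> float:
    """Normalized Difference Built-up Index (assumed definition)."""
    return normalized_difference(swir1, nir)


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def bui(swir1: float, nir: float, red: float) -> float:
    """Built-up Index, assumed here to be NDBI minus NDVI."""
    return ndbi(swir1, nir) - ndvi(nir, red)


# Hypothetical surface-reflectance values, for illustration only.
urban_pixel = {"red": 0.22, "nir": 0.24, "swir1": 0.32}
forest_pixel = {"red": 0.05, "nir": 0.40, "swir1": 0.15}

urban_bui = bui(urban_pixel["swir1"], urban_pixel["nir"], urban_pixel["red"])
forest_bui = bui(forest_pixel["swir1"], forest_pixel["nir"], forest_pixel["red"])
print(f"urban BUI = {urban_bui:.3f}, forest BUI = {forest_bui:.3f}")
```

Under these assumed definitions, a built-up pixel scores higher than a vegetated one, which is what makes a threshold on the index usable for extraction.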

Keywords: built-up area extraction, google earth engine, adaptive thresholding method, rapid mapping

Procedia PDF Downloads 109

380 Automatic Reporting System for Transcriptome Indel Identification and Annotation Based on Snapshot of Next-Generation Sequencing Reads Alignment

Authors: Shuo Mu, Guangzhi Jiang, Jinsa Chen

Abstract:

The analysis of Indels in RNA sequencing of clinical samples is easily affected by sequencing experiment errors and software selection. To improve the efficiency and accuracy of analysis, we developed an automatic reporting system for Indel recognition and annotation based on image snapshots of transcriptome read alignments. The system includes sequence local assembly and realignment, target point snapshot, and image-based recognition processes. We integrated high-confidence Indel datasets from several known databases as a training set to improve the accuracy of image processing and added a bioinformatics processing module to annotate and filter Indel artifacts. The system then automatically generates a report that includes data quality levels and image results. Sanger sequencing verification of the reference Indel mutations of cell line NA12878 showed that the process can achieve 83% sensitivity and 96% specificity. Analysis of the collected clinical samples showed that the interpretation accuracy of the process was equivalent to that of manual inspection, and the processing efficiency improved significantly. This work shows the feasibility of accurate Indel analysis of clinical next-generation sequencing (NGS) transcriptomes. The result may be useful in future RNA studies of clinical samples with microsatellite instability in immunotherapy.
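For readers less familiar with the two reported metrics, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The counts are hypothetical and chosen only so that the ratios match the reported percentages; they are not the study's validation data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real Indels that the pipeline calls (true positive rate)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-Indel sites that the pipeline correctly rejects."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only.
tp, fn = 83, 17
tn, fp = 96, 4

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```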

Keywords: automatic reporting, indel, next-generation sequencing, NGS, transcriptome

Procedia PDF Downloads 161

379 Machine Learning Models for the Prediction of Heating and Cooling Loads of a Residential Building

Authors: Aaditya U. Jhamb

Abstract:

Amid the energy crisis that many countries are battling, energy-efficient buildings are the subject of extensive research because of growing worries about energy consumption and its effects on the environment. The paper explores 8 factors that help determine the energy efficiency of a building (relative compactness, surface area, wall area, roof area, overall height, orientation, glazing area, and glazing area distribution), using a dataset provided by Tsanas and Xifara. The dataset comprises 768 different residential building models and was used to predict heating and cooling loads with a low mean squared error. By optimizing these characteristics, machine learning algorithms may assess and properly forecast a building's heating and cooling loads, lowering energy usage while increasing the quality of people's lives. The paper first determined which input variable was most closely associated with the output loads and then studied the magnitude of the correlation between the input factors and the two output variables using various statistical methods of analysis. The most conclusive model was the Decision Tree Regressor, which had a mean squared error of 0.258, whilst the least definitive model was the Isotonic Regressor, which had a mean squared error of 21.68. This paper also investigated the KNN Regressor and Linear Regression, which had mean squared errors of 3.349 and 18.141, respectively. In conclusion, given the 8 input variables, the model was able to predict the heating and cooling loads of a residential building accurately and precisely.
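The models in this abstract are ranked by mean squared error (MSE). The sketch below shows how that metric is computed; the load values are hypothetical and are not taken from the Tsanas and Xifara dataset.

```python
def mean_squared_error(y_true: list[float], y_pred: list[float]) -> float:
    """Average of the squared differences between actual and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical heating loads, for illustration only.
actual_loads = [15.5, 20.75, 28.25]
predicted_loads = [16.0, 20.25, 28.75]

print(mean_squared_error(actual_loads, predicted_loads))
```

Because the errors are squared, a few large misses dominate the score, which is one reason the gap between the best (0.258) and worst (21.68) models looks so wide.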

Keywords: energy efficient buildings, heating load, cooling load, machine learning models

Procedia PDF Downloads 78

378 Mapping Thermal Properties Using Resistivity, Lithology and Thermal Conductivity Measurements

Authors: Riccardo Pasquali, Keith Harlin, Mark Muller

Abstract:

The ShallowTherm project is focussed on developing and applying a methodology for extrapolating relatively sparsely sampled thermal conductivity measurements across Ireland using mapped Litho-Electrical (LE) units. The primary data used consist of electrical resistivities derived from the Geological Survey Ireland Tellus airborne electromagnetic dataset, GIS-based maps of Irish geology, and rock thermal conductivities derived from both the current Irish Ground Thermal Properties (IGTP) database and a new programme of sampling and laboratory measurement. The workflow has been developed across three case-study areas that sample a range of different calcareous, arenaceous, argillaceous, and volcanic lithologies. Statistical analysis of resistivity data from individual geological formations has been assessed and integrated with detailed lithological descriptions to define distinct LE units. Thermal conductivity measurements from core and hand samples have been acquired for every geological formation within each study area. The variability and consistency of thermal conductivity measurements within each LE unit is examined with the aim of defining a characteristic thermal conductivity (or range of thermal conductivities) for each LE unit. Mapping of LE units, coupled with characteristic thermal conductivities, provides a method of defining thermal conductivity properties at a regional scale and facilitating the design of ground source heat pump closed-loop collectors.

Keywords: thermal conductivity, ground source heat pumps, resistivity, heat exchange, shallow geothermal, Ireland

Procedia PDF Downloads 161

377 Predictive Pathogen Biology: Genome-Based Prediction of Pathogenic Potential and Countermeasures Targets

Authors: Debjit Ray

Abstract:

Horizontal gene transfer (HGT) and recombination lead to the emergence of bacterial antibiotic resistance and pathogenic traits. By comparing a large number of fully sequenced genomes across a species or genus, it is possible to identify HGT events, define the phylogenetic range of HGT, and find potential sources of new resistance genes. In-depth comparative phylogenomics can also identify subtle genome or plasmid structural changes or mutations associated with phenotypic changes. Comparative phylogenomics requires accurately sequenced, complete, and properly annotated genomes of the organism. Assembling closed genomes requires additional mate-pair reads or “long read” sequencing data to accompany short-read paired-end data. To bring down the cost and time required to produce assembled genomes and annotate genome features that inform drug resistance and pathogenicity, we are analyzing the genome assembly performance of data from the Illumina NextSeq, which has faster throughput than the Illumina HiSeq (~1-2 days versus ~1 week), and shorter reads (150bp paired-end versus 300bp paired-end) but higher capacity (150-400M reads per run versus ~5-15M) compared to the Illumina MiSeq. Bioinformatics improvements are also needed to make rapid, routine production of complete genomes a reality. Modern assemblers such as SPAdes 3.6.0 running on a standard Linux blade are capable, in a few hours, of converting mixes of reads from different library preps into high-quality assemblies with only a few gaps. Remaining breaks in scaffolds, which are generally due to repeats (e.g., rRNA genes), are addressed by our gap closure software, which avoids custom PCR or targeted sequencing. Our goal is to improve the understanding of the emergence of pathogenesis using sequencing, comparative genomics, and machine learning analysis of ~1000 pathogen genomes. Machine learning algorithms will be used to digest the diverse features (change in virulence genes, recombination, horizontal gene transfer, patient diagnostics). Temporal data and evolutionary models can thus determine whether a particular isolate is likely to have originated from the environment (could it have evolved from previous isolates?). This can be useful for comparing differences in virulence along or across the tree. More intriguingly, it can test whether there is a direction to virulence strength. This would open new avenues in the prediction of uncharacterized clinical bugs, multidrug resistance evolution, and pathogen emergence.

Keywords: genomics, pathogens, genome assembly, superbugs

Procedia PDF Downloads 183

376 A Review of Effective Gene Selection Methods for Cancer Classification Using Microarray Gene Expression Profile

Authors: Hala Alshamlan, Ghada Badr, Yousef Alohali

Abstract:

Cancer is one of the most dreadful diseases and causes a considerable death rate in humans. DNA microarray-based gene expression profiling has emerged as an efficient technique for cancer classification, as well as for diagnosis, prognosis, and treatment purposes. In recent years, DNA microarray techniques have gained more attention in both scientific and industrial fields. It is important to determine the informative genes that cause cancer in order to improve early cancer diagnosis and to give effective chemotherapy treatment. To gain deep insight into the cancer classification problem, it is necessary to take a closer look at the proposed gene selection methods. We believe that gene selection should be an integral preprocessing step for cancer classification. Furthermore, finding an accurate gene selection method is a very significant issue in cancer classification because it reduces the dimensionality of the microarray dataset and selects informative genes. In this paper, we classify and review the state-of-the-art gene selection methods. We proceed by evaluating the performance of each gene selection approach based on its classification accuracy and number of informative genes. In our evaluation, we use four benchmark microarray datasets for cancer diagnosis (leukemia, colon, lung, and prostate). In addition, we compare the performance of the gene selection methods to identify the one that can select a small set of marker genes while ensuring high cancer classification accuracy. To the best of our knowledge, this is the first attempt to compare gene selection approaches for cancer classification using microarray gene expression profiles.

Keywords: gene selection, feature selection, cancer classification, microarray, gene expression profile

Procedia PDF Downloads 432

375 Additive Manufacturing with Ceramic Filler

Authors: Irsa Wolfram, Boruch Lorenz

Abstract:

Innovative solutions with additive manufacturing applying material extrusion for functional parts necessitate innovative filaments with persistent quality. Uniform homogeneity and a consistent dispersion of particles embedded in filaments generally require multiple cycles of extrusion or well-prepared primal matter by injection molding, kneader machines, or mixing equipment. These technologies commit to dedicated equipment that is rarely at the disposal of production laboratories unfamiliar with research in polymer materials. This stands in contrast to laboratories that investigate complex material topics and technology science to leverage the potential of 3-D printing. Consequently, scientific studies in labs are often constrained to the compositions and concentrations of fillers offered on the market. Therefore, we introduce a prototypal laboratory methodology scalable to tailored primal matter for extruding ceramic composite filaments with fused filament fabrication (FFF) technology. A desktop single-screw extruder serves as the core device for the experiments. Custom-made filaments encapsulate the ceramic fillers and serve, together with polylactide (PLA), a thermoplastic polyester, as primal matter that is processed in the melting area of the extruder, preserving the defined concentration of the fillers. Validated results demonstrate that this approach enables continuously produced and uniform composite filaments with consistent homogeneity. The filament is 3-D printable with controllable dimensions, which is a prerequisite for any scalable application. Additionally, digital microscopy confirms the steady dispersion of the ceramic particles in the composite filament. This permits a 2D reconstruction of the planar distribution of the embedded ceramic particles in the PLA matrices. The innovation of the introduced method lies in the smart simplicity of preparing the composite primal matter. It circumvents the inconvenience of numerous extrusion operations and expensive laboratory equipment. Nevertheless, it delivers consistent filaments of controlled, predictable, and reproducible filler concentration, which is the prerequisite for any industrial application. The introduced prototypal laboratory methodology seems applicable to other polymer matrices and suitable for further utilitarian particle types beyond ceramic fillers. This inaugurates a roadmap for supplementary laboratory development of peculiar composite filaments, providing value for industries and societies. This low-threshold entry to sophisticated preparation of composite filaments, enabling businesses to create their own dedicated filaments, will support the mutual efforts to extend 3D printing to new functional devices.

Keywords: additive manufacturing, ceramic composites, complex filament, industrial application

Procedia PDF Downloads 91

374 Landslide Vulnerability Assessment in Context with Indian Himalayan

Authors: Neha Gupta

Abstract:

Landslide vulnerability is considered a crucial parameter for the assessment of landslide risk. The term vulnerability is defined as the degree of damage to elements at risk across different dimensions, i.e., physical, social, economic, and environmental. The Himalaya region is very prone to multiple hazards such as floods, forest fires, earthquakes, and landslides. The increases in fatality rates and in losses of infrastructure and economy due to landslides in the Himalaya region motivate the assessment of vulnerability. This study presents a methodology to measure a combination of vulnerability dimensions, i.e., social vulnerability, physical vulnerability, and environmental vulnerability, in one framework. A combined assessment of these vulnerabilities has rarely been carried out, and no such approach has been applied in the Indian scenario. The methodology was applied in an area of east Sikkim Himalaya, India. The physical vulnerability comprises a building footprint layer extracted from remote sensing data and Google Earth imagery. The social vulnerability was assessed using population density based on land use. The land use map was derived from a high-resolution satellite image, and for the environmental vulnerability assessment, NDVI, forest, agricultural land, and distance from the river were assessed from remote sensing and DEM. The classes of social, physical, and environmental vulnerability were normalized on a scale of 0 (no loss) to 1 (loss) to obtain a homogeneous dataset. Multi-Criteria Analysis (MCA) was then used to assign individual weights to each dimension and integrate them into one frame. The final vulnerability was further classified into four classes from very low to very high.
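The weighting and classification step described in this abstract can be sketched as a weighted sum followed by binning. Everything specific here is an assumption made for illustration: the weights, the equal-interval class breaks, and the two middle class names are hypothetical, since the abstract gives only the 0 to 1 scale and the "very low" to "very high" range.

```python
from bisect import bisect_right

# Hypothetical MCA weights; they must sum to 1.
WEIGHTS = {"physical": 0.4, "social": 0.3, "environmental": 0.3}

# Hypothetical equal-interval breaks and labels for the four classes.
BREAKS = [0.25, 0.5, 0.75]
LABELS = ["very low", "low", "high", "very high"]


def combined_vulnerability(scores: dict[str, float]) -> float:
    """Weighted sum of dimension scores that are already normalized to [0, 1]."""
    if abs(sum(WEIGHTS.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} score must lie in [0, 1]")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def classify(score: float) -> str:
    """Map a combined score to one of the four vulnerability classes."""
    return LABELS[bisect_right(BREAKS, score)]


# Hypothetical map cell, for illustration only.
cell = {"physical": 0.9, "social": 0.8, "environmental": 0.6}
total = combined_vulnerability(cell)
print(f"{total:.2f} -> {classify(total)}")
```

Normalizing every dimension to the same scale first is what lets a single set of weights express relative importance rather than differences in units.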

Keywords: landslide, multi-criteria analysis, MCA, physical vulnerability, social vulnerability

Procedia PDF Downloads 288

373 Rural Water Supply Services in India: Developing a Composite Summary Score

Authors: Mimi Roy, Sriroop Chaudhuri

Abstract:

Sustainable water supply is among the basic needs for human development, especially in the rural areas of developing nations, where safe water supply and basic sanitation infrastructure are direly needed. In light of the above, we propose a simple methodology to develop a composite water sustainability index (WSI) to assess the collective performance of the existing rural water supply services (RWSS) in India over time. The WSI will be computed by summarizing the details of all the different varieties of water supply schemes presently available in India, comprising 40 liters per capita per day (lpcd), 55 lpcd, and piped water supply (PWS) per household. The WSI will be computed annually, between 2010 and 2016, to elucidate changes in holistic RWSS performance. Results will be integrated within a robust geospatial framework to identify the ‘hotspots’ (states/districts) that have persistent issues with adequate RWSS coverage and warrant spatially optimized policy reforms in the future to address sustainable human development. The dataset will be obtained from the National Rural Drinking Water Program (NRDWP), operating under the aegis of the Ministry of Drinking Water and Sanitation (MoDWS), at state/district/block levels to offer the authorities a cross-sectional view of RWSS at different levels of the administrative hierarchy. Owing to its simple design, complemented by spatio-temporal cartograms, similar approaches can also be adopted in other parts of the world where RWSS need a thorough appraisal.

Keywords: rural water supply services, piped water supply, sustainability, composite index, spatial, drinking water

Procedia PDF Downloads 283

372 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records

Authors: Sara ElElimy, Samir Moustafa

Abstract:

Mobile network operators face many challenges in the digital era, especially with high demands from customers. Mobile network operators are considered a source of big data, and traditional techniques are not effective in the new era of big data, the Internet of Things (IoT), and 5G; as a result, handling different big datasets effectively becomes a vital task for operators with the continuous growth of data and the move from long term evolution (LTE) to 5G. There is therefore an urgent need for effective big data analytics to predict future demands, traffic, and network performance in order to fulfill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and recurrent neural network (RNN) are employed for a data-driven application to mobile network operators. The main framework of the models comprises parameter identification for each model, estimation, prediction, and the final data-driven application of this prediction to business and network performance. These models are applied to the Telecom Italia Big Data Challenge call detail records (CDRs) datasets. Evaluating the performance of these models with well-known evaluation criteria shows that ARIMA (the machine learning-based model) is more accurate as a predictive model on such a dataset than the RNN (the deep learning model).

Keywords: big data analytics, machine learning, CDRs, 5G

Procedia PDF Downloads 123

371 Improving Similarity Search Using Clustered Data

Authors: Deokho Kim, Wonwoo Lee, Jaewoong Lee, Teresa Ng, Gun-Ill Lee, Jiwon Jeong

Abstract:

This paper presents a method for improving object search accuracy using a deep learning model. A major limitation in providing accurate similarity with deep learning is the requirement of a huge amount of data for training pairwise similarity scores (metrics), which is impractical to collect. Thus, similarity scores are usually trained with a relatively small dataset, which comes from a different domain, causing limited accuracy in measuring similarity. For this reason, this paper proposes a deep learning model that can be trained with a significantly smaller amount of data: clustered data in which each cluster contains a set of visually similar images. To measure similarity distance with the proposed method, visual features of two images are extracted from intermediate layers of a convolutional neural network with various pooling methods, and the network is trained with pairwise similarity scores, which are defined as zero for images in the same cluster. The proposed method outperforms the state-of-the-art object similarity scoring techniques in an evaluation on finding exact items. The proposed method achieves an accuracy of 86.5%, compared to 59.9% for the state-of-the-art technique. That is, an exact item can be found among four retrieved images with an accuracy of 86.5%, and the remaining retrieved images may still be similar products. Therefore, the proposed method can reduce the amount of training data by an order of magnitude while providing a reliable similarity metric.
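The reported accuracy is described as finding the exact item among four retrieved images, which reads like a top-4 hit rate. The sketch below shows how such a metric could be computed. It assumes that interpretation, and the item identifiers are hypothetical.

```python
def top_k_hit_rate(retrievals: list[list[str]], targets: list[str], k: int = 4) -> float:
    """Fraction of queries whose exact target appears in the first k results."""
    if not targets or len(retrievals) != len(targets):
        raise ValueError("need one ranked result list per target")
    hits = sum(1 for ranked, target in zip(retrievals, targets) if target in ranked[:k])
    return hits / len(targets)


# Hypothetical ranked search results for three queries.
ranked_results = [
    ["shoe_17", "shoe_03", "shoe_44", "shoe_08", "shoe_21"],
    ["bag_02", "bag_19", "bag_07", "bag_11", "bag_05"],
    ["hat_09", "hat_01", "hat_30", "hat_12", "hat_04"],
]
exact_items = ["shoe_44", "bag_05", "hat_09"]

# bag_05 is ranked fifth, so it counts as a miss at k = 4.
print(top_k_hit_rate(ranked_results, exact_items, k=4))
```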

Keywords: visual search, deep learning, convolutional neural network, machine learning

Procedia PDF Downloads 198

370 The Research on Diesel Bus Emissions in Ulaanbaatar City: Mongolia

Authors: Tsetsegmaa A., Bayarsuren B., Altantsetseg Ts.

Abstract:

To make the best decision on reducing harmful emissions from buses, we need to have a clear understanding of the current state of their actual emissions. The emissions from city buses running on high sulfur fuel, particularly particulate matter (PM) and nitrogen oxides (NOx) from the exhaust gases of conventional diesel engines, have been studied and measured with and without diesel particulate filter (DPF) in Ulaanbaatar city. The study was conducted by using the PEMS (Portable Emissions Measurement System) and gravimetric method in real traffic conditions. The obtained data were used to determine the actual emission rates and to evaluate the effectiveness of the selected particulate filters. Actual road and daily PM emissions from city buses were determined during the warm and cold seasons. A bus with an average daily mileage of 242 km was found to emit 166.155 g of PM into the city's atmosphere on average per day, with 141.3 g in summer and 175.8 g in winter. The actual PM of the city bus is 0.6866 g/km. The concentration of NOx in the exhaust gas averages 1410.94 ppm. The use of DPF reduced the exhaust gas opacity of 24 buses by an average of 97% and filtered a total of 340.4 kg of soot from these buses over a period of six months. Retrofitting an old conventional diesel engine with cassette-type silicon carbide (SiC) DPF, despite the laboriousness of cleaning, can significantly reduce particulate matter emissions. Innovation: First comprehensive road PM and NOx emission dataset and actual road emissions from public buses have been identified. PM and NOx mathematical model equations have been estimated as a function of the bus technical speed and engine revolution with and without DPF.
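The per-kilometre figure in this abstract follows directly from the daily totals, as the short check below shows. The per-bus monthly soot average is a derived illustration that assumes the 340.4 kg was spread evenly over the 24 buses and six months; the abstract does not state it.

```python
DAILY_MILEAGE_KM = 242      # average daily mileage of one bus
DAILY_PM_G = 166.155        # average daily PM emitted by one bus, in grams

pm_per_km = DAILY_PM_G / DAILY_MILEAGE_KM
print(f"PM emission factor: {pm_per_km:.4f} g/km")

TOTAL_SOOT_KG = 340.4       # soot filtered by the DPFs in total
BUS_COUNT = 24
MONTHS = 6

# Assumes an even split across buses and months (not stated in the abstract).
soot_per_bus_month_kg = TOTAL_SOOT_KG / (BUS_COUNT * MONTHS)
print(f"Average soot captured: {soot_per_bus_month_kg:.2f} kg per bus per month")
```

The first result reproduces the 0.6866 g/km reported in the abstract.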

Keywords: conventional diesel, silicon carbide, real-time onboard measurements, particulate matter, diesel retrofit, fuel sulphur

Procedia PDF Downloads 141

369 Convolutional Neural Networks-Optimized Text Recognition with Binary Embeddings for Arabic Expiry Date Recognition

Authors: Mohamed Lotfy, Ghada Soliman

Abstract:

Recognizing Arabic dot-matrix digits is a challenging problem due to the unique characteristics of dot-matrix fonts, such as irregular dot spacing and varying dot sizes. This paper presents an approach for recognizing Arabic digits printed in dot-matrix format. The proposed model is based on Convolutional Neural Networks (CNN) that take the dot matrix as input and generate embeddings that are rounded to generate binary representations of the digits. The binary embeddings are then used to perform Optical Character Recognition (OCR) on the digit images. To overcome the challenge of the limited availability of dotted Arabic expiration date images, we developed a TrueType Font (TTF) for generating synthetic images of Arabic dot-matrix characters. The model was trained on a synthetic dataset of 3287 images and tested on 658 synthetic images, representing realistic expiration dates from 2019 to 2027 in the format yyyy/mm/dd. Our model achieved an accuracy of 98.94% on expiry date recognition in Arabic dot-matrix format using fewer parameters and fewer computational resources than traditional CNN-based models. By investigating and presenting our findings comprehensively, we aim to contribute substantially to the field of OCR and pave the way for advancements in Arabic dot-matrix character recognition. Our proposed approach is not limited to Arabic dot-matrix digit recognition but can also be extended to text recognition tasks, such as text classification and sentiment analysis.
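The idea of rounding an embedding to a binary code and reading a digit from it can be sketched as below. This is only an illustration of the general technique: the 4-bit plain binary codebook, the 0.5 threshold, and the example activations are all assumptions, since the abstract does not specify the embedding length or the encoding.

```python
def binarize(embedding: list[float], threshold: float = 0.5) -> tuple[int, ...]:
    """Round each activation to 0 or 1 using a fixed threshold."""
    return tuple(1 if value >= threshold else 0 for value in embedding)


# Hypothetical codebook: each digit 0-9 is its own 4-bit binary representation.
CODEBOOK = {tuple(int(bit) for bit in format(digit, "04b")): digit for digit in range(10)}


def decode_digit(embedding: list[float]):
    """Return the digit whose code matches the binarized embedding, or None."""
    return CODEBOOK.get(binarize(embedding))


# Hypothetical network outputs, for illustration only.
print(decode_digit([0.1, 0.9, 0.2, 0.8]))   # bits 0101
print(decode_digit([0.9, 0.9, 0.9, 0.9]))   # bits 1111, not a valid digit code
```

A compact binary target of this kind needs far fewer output units than a one-hot layer over all classes, which is consistent with the abstract's claim of using fewer parameters.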

Keywords: computer vision, pattern recognition, optical character recognition, deep learning

Procedia PDF Downloads 68

368 Probabilistic Models to Evaluate Seismic Liquefaction In Gravelly Soil Using Dynamic Penetration Test and Shear Wave Velocity

Authors: Nima Pirhadi, Shao Yong Bo, Xusheng Wan, Jianguo Lu, Jilei Hu

Abstract:

Gravels and gravelly soils are commonly assumed to be non-liquefiable because of their high conductivity and small modulus. However, the occurrence of liquefaction in some historical earthquakes, especially recent ones such as the 2008 Wenchuan (Mw = 7.9), 2014 Cephalonia, Greece (Mw = 6.1), and 2016 Kaikoura, New Zealand (Mw = 7.8) earthquakes, has made it essential to consider risk assessment and hazard analysis of seismic gravelly soil liquefaction. Due to the limitations in sampling and laboratory testing of this type of soil, in situ tests and site exploration of case histories are the most accepted procedures. Of all in situ tests, the dynamic penetration test (DPT), which is well known as the Chinese dynamic penetration test, and the shear wave velocity (Vs) test have demonstrated high performance in evaluating seismic gravelly soil liquefaction. However, the lack of a sufficient number of case histories is an essential limitation for developing new models. This study first investigates recent earthquakes that caused liquefaction in gravelly soils to collect new data. Then, it adds these data to the dataset available in the literature to extend it and finally develops new models to assess seismic gravelly soil liquefaction. To validate the presented models, their results are compared with other available models. The results show the reasonable performance of the proposed models and the critical effect of gravel content (GC, %) on the assessment.

Keywords: liquefaction, gravel, dynamic penetration test, shear wave velocity

Procedia PDF Downloads 191

367 Automatic Classification of Lung Diseases from CT Images

Authors: Abobaker Mohammed Qasem Farhan, Shangming Yang, Mohammed Al-Nehari

Abstract:

Pneumonia is a kind of lung disease that creates congestion in the chest. Such pneumonic conditions can lead to loss of life when the congestion is severe. Pneumonic lung disease is caused by viral pneumonia, bacterial pneumonia, or Covid-19 induced pneumonia. The early prediction and classification of such lung diseases help to reduce the mortality rate. In this paper, we propose an automatic Computer-Aided Diagnosis (CAD) system using a deep learning approach. The proposed CAD system takes raw computerized tomography (CT) scans of the patient's chest as input and automatically predicts the disease classification. We designed the Hybrid Deep Learning Algorithm (HDLA) to improve accuracy and reduce processing requirements. The raw CT scans are first pre-processed to enhance their quality for further analysis. We then apply a hybrid model that consists of automatic feature extraction and classification. We propose a robust 2D Convolutional Neural Network (CNN) model to extract features automatically from the pre-processed CT image. This CNN model assures feature learning with extremely effective 1D feature extraction for each input CT image. The outcome of the 2D CNN model is then normalized using the Min-Max technique. The second step of the proposed hybrid model is training and classification using different classifiers. The simulation outcomes on a publicly available dataset demonstrate the robustness and efficiency of the proposed model compared to state-of-the-art algorithms.
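The Min-Max step mentioned in this abstract rescales a feature vector to the range 0 to 1. A minimal sketch follows. The feature values are hypothetical, and mapping a constant vector to zeros is a design choice made here, not something the abstract states.

```python
def min_max_normalize(values: list[float]) -> list[float]:
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    if not values:
        raise ValueError("values must be non-empty")
    low, high = min(values), max(values)
    if high == low:
        # Constant vector: avoid division by zero (assumed convention).
        return [0.0] * len(values)
    return [(value - low) / (high - low) for value in values]


# Hypothetical 1D feature vector from a CNN, for illustration only.
features = [2.0, 4.0, 6.0, 3.0]
print(min_max_normalize(features))
```

Putting all features on a common scale keeps classifiers that are sensitive to magnitude from being dominated by a few large activations.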

Keywords: CT scan, Covid-19, deep learning, image processing, lung disease classification

Procedia PDF Downloads 130

366 Performance Study of Classification Algorithms for Consumer Online Shopping Attitudes and Behavior Using Data Mining

Authors: Rana Alaa El-Deen Ahmed, M. Elemam Shehab, Shereen Morsy, Nermeen Mekawie

Abstract:

With the growing popularity and acceptance of e-commerce platforms, users face an ever-increasing burden in choosing the right product from the large number of online offers. Thus, users need techniques for personalization and shopping guidance. For a pleasant and successful shopping experience, users need to know easily, and with high confidence, which products to buy. Since selling a wide variety of products has become easier with the popularity of online stores, online retailers are able to sell more products than a physical store. The disadvantage is that customers might not find the products they need. Recommender systems, which are used in some e-commerce websites, help customers find the products they are searching for. A recommender system learns from information about customers and products and provides appropriate personalized recommendations to help customers find the needed product. In this paper, eleven classification algorithms are comparatively tested to find the classifier that best fits consumer online shopping attitudes and behavior in the experimental dataset. The WEKA knowledge analysis tool, an open-source data mining workbench, was used to compare conventional classifiers and identify the best one. The results obtained with WEKA show that the decision table and filtered classifier give the highest accuracy, while classification via clustering and simple CART give the lowest.

Keywords: classification, data mining, machine learning, online shopping, WEKA

Procedia PDF Downloads 337

365 Power Production Performance of Different Wave Energy Converters in the Southwestern Black Sea

Authors: Ajab G. Majidi, Bilal Bingölbali, Adem Akpınar

Abstract:

This study investigates the amount of energy (the economic wave energy potential) that can be obtained from existing wave energy converters in the high wave energy potential region of the Black Sea, and their performance at different depths in the region. The data needed for this purpose were obtained using the calibrated nested layered SWAN wave modeling program version 41.01AB, which was forced with Climate Forecast System Reanalysis (CFSR) winds from 1979 to 2009. The wave dataset at a time interval of 2 hours was accumulated for a sub-grid domain around Karaburun beach in Arnavutkoy, a district of Istanbul. The annual sea state characteristic matrices for five different depths along a line perpendicular to the coastline were calculated for the 31 years. According to the power matrices of different wave energy converter systems and the characteristic matrices for each possible installation depth, the probability distribution tables of the specified mean wave period or wave energy period and significant wave height were calculated. Then, using the relationship between these distribution tables and the present wave climate, the energy that the wave energy converter systems can produce at each depth was determined. Thus, the economically feasible potential of the relevant coastal zone was revealed, and the effect of different depths on energy converter systems is presented. The Oceantic at 50, 75 and 100 m depths and the Oyster at 5 and 25 m depths present the best performance. Over the 31-year period, 1998 was the most dynamic year and 1989 the least.
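
The combination of a converter's power matrix with a sea-state occurrence table, as described in the abstract, can be sketched as a weighted sum. This is a generic illustration, not the authors' code: the bin layout, the function name `annual_energy_kwh`, and every number below are hypothetical.

```python
HOURS_PER_YEAR = 8760


def annual_energy_kwh(occurrence, power_matrix_kw, hours=HOURS_PER_YEAR):
    """Annual energy from an occurrence table and a power matrix.

    occurrence[i][j]      : probability of the sea state in wave-height bin i
                            and wave-period bin j (entries must sum to 1)
    power_matrix_kw[i][j] : converter output in kW for that same sea state
    Both tables are assumed to have the same shape.
    """
    total_prob = sum(sum(row) for row in occurrence)
    if abs(total_prob - 1.0) > 1e-6:
        raise ValueError("occurrence probabilities must sum to 1")
    # Mean power is the probability-weighted sum over all sea-state bins.
    mean_power_kw = sum(
        p * w
        for occ_row, pow_row in zip(occurrence, power_matrix_kw)
        for p, w in zip(occ_row, pow_row)
    )
    return mean_power_kw * hours


# Hypothetical 2 x 2 tables (rows: wave-height bins, columns: period bins).
occurrence = [[0.5, 0.25], [0.125, 0.125]]
power_matrix_kw = [[10.0, 20.0], [40.0, 80.0]]
energy = annual_energy_kwh(occurrence, power_matrix_kw)
```

For these made-up tables the mean power is 25 kW, giving 219,000 kWh per year. Repeating the calculation with the occurrence table of each depth and the power matrix of each converter yields the kind of depth-by-converter comparison the abstract reports.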

Keywords: annual power production, Black Sea, efficiency, power production performance, wave energy converter

Procedia PDF Downloads 120

364 Assessment of the Impact of Trawling Activities on Marine Bottoms of Moroccan Atlantic

Authors: Rachida Houssa, Hassan Rhinane, Fadoumo Ali Malouw, Amina Oulmaalem

Abstract:

Since the early 1970s, the Moroccan Atlantic has been subjected to the pressure of bottom trawling, one of the fishing techniques most destructive to the seabed: it is nonselective, wreaks havoc on fishing catches, and is responsible for more than half of all fish discards around the world. The present paper aims to map and assess the impact of bottom trawling along the Moroccan Atlantic coast. For this purpose, a dataset of thirty years, between 1962 and 1999, from foreign fishing vessels using bottom trawling, has been used and integrated into a GIS. To estimate the extent and the importance of the geographical distribution of the trawling effort, the Moroccan Atlantic area was divided into a grid of cells of 25 km2 (5x5 km). This grid was joined to the trawling effort data, creating a new entity with a table containing the spatial overlay of the grid with the polygons of swept surfaces. This mapping model made it possible to quantify the fishing effort over time and to generate a trace indicative of trawling effort on the seabed. For a given year, a grid cell may have a swept area equal to 0 (never touched by the trawl), 25 km2 (the trawled area equals the cell size), or, for example, 100 km2, indicating that for this year the swept surface is four times the cell area. The results show that the total cumulative sum of trawled area is approximately 28,738,326 km2, scattered throughout the Atlantic coast. 95% of the overall trawling effort is located in the southern zone, between 29°N and 20°30'N. Nearly 5% of the trawling effort is located in the northern coastal region, north of 33°N. The center area between 33°N and 29°N is the least swept by Russian commercial vessels because the majority of this region is rocky and non-trawlable.
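
The per-cell reading of swept area described in the abstract amounts to a simple ratio. The sketch below uses the 25 km2 cell size and the three example areas from the abstract; the cell names and the function `sweep_ratio` are hypothetical.

```python
CELL_AREA_KM2 = 25.0  # 5 km x 5 km grid cell, as stated in the abstract


def sweep_ratio(swept_area_km2, cell_area_km2=CELL_AREA_KM2):
    """How many times a cell's surface was covered by trawling in one year."""
    if swept_area_km2 < 0:
        raise ValueError("swept area cannot be negative")
    return swept_area_km2 / cell_area_km2


# Yearly swept areas for three hypothetical cells (values from the abstract).
yearly_swept_km2 = {"cell_a": 0.0, "cell_b": 25.0, "cell_c": 100.0}
ratios = {cell: sweep_ratio(area) for cell, area in yearly_swept_km2.items()}
```

A ratio of 0 marks an untouched cell, 1 a cell swept once over its whole surface, and 4 a cell whose swept surface is four times its area.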

Keywords: GIS, Moroccan Atlantic Ocean, seabed, trawling

Procedia PDF Downloads 315

363 Comparative Study Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBCs) are the most common type of blood cell and are the most intensively studied in cell biology. A lack of RBCs is a condition in which the hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs affect the exchange of oxygen. This paper presents a comparative study of various techniques for classifying RBCs as normal or abnormal (anemic) using WEKA. WEKA is an open-source tool consisting of different machine learning algorithms for data mining applications. The algorithms tested are the Radial Basis Function neural network, the Support Vector Machine, and the K-Nearest Neighbors algorithm. Two sets of combined features were utilized for the classification of blood cell images. The first set, consisting exclusively of geometrical features, was used to identify whether the tested blood cell is spherical or non-spherical. The second set, consisting mainly of textural features, was used to recognize the types of the spherical cells. We provide an evaluation based on applying these classification methods to our RBC image dataset, which was obtained from Serdang Hospital-Malaysia, and measuring the accuracy of the test results. The best classification rates achieved are 97%, 98%, and 79% for the Support Vector Machine, the Radial Basis Function neural network, and the K-Nearest Neighbors algorithm, respectively.

Keywords: K-nearest neighbors algorithm, radial basis function neural network, red blood cells, support vector machine

Procedia PDF Downloads 392

362 Crashworthiness Optimization of an Automotive Front Bumper in Composite Material

Authors: S. Boria

Abstract:

In recent years, thanks to the development of specific optimization tools, it has become possible to improve the crashworthiness of an automotive body structure from the beginning of the design stage. It is well known that finite element codes can help the designer to investigate the crashing performance of structures under dynamic impact. Therefore, by coupling nonlinear mathematical programming procedures and statistical techniques with FE simulations, it is possible to optimize the design with a reduced number of analytical evaluations. In engineering applications, many optimization methods that are based on statistical techniques and utilize estimated models, called meta-models, are quickly spreading. A meta-model is an approximation of a detailed simulation model based on a dataset of inputs identified by the design of experiments (DOE); the number of simulations needed to build it depends on the number of variables. Among the various types of meta-modeling techniques, the Kriging method seems to be excellent in accuracy, robustness and efficiency compared to the others when applied to crashworthiness optimization. Therefore, such a meta-model was used in this work in order to improve the structural optimization of a composite bumper for a racing car subjected to frontal impact. The specific energy absorption is the objective function to maximize, and the geometrical parameters, subject to some design constraints, are the design variables. LS-DYNA codes were interfaced with the LS-OPT tool in order to find the optimized solution through the use of a domain reduction strategy. With the use of the Kriging meta-model, the crashworthiness characteristics of the composite bumper were improved.

Keywords: composite material, crashworthiness, finite element analysis, optimization

Procedia PDF Downloads 240

361 Text Emotion Recognition by Multi-Head Attention based Bidirectional LSTM Utilizing Multi-Level Classification

Authors: Vishwanath Pethri Kamath, Jayantha Gowda Sarapanahalli, Vishal Mishra, Siddhesh Balwant Bandgar

Abstract:

Recognition of emotional information is essential in any form of communication. The recent growth of Human-Computer Interaction (HCI) underlines the importance of understanding expressed emotions, which is crucial for improving the system or the interaction itself. In this research work, textual data are used for emotion recognition. Text, being the least expressive of the multimodal resources, poses various challenges, such as capturing contextual information and the sequential nature of language construction. We propose a neural architecture to resolve no fewer than 8 emotions from textual data sources derived from multiple datasets, using Google pre-trained word2vec word embeddings and a multi-head attention-based bidirectional LSTM model with one-vs-all multi-level classification. The emotions targeted in this research are Anger, Disgust, Fear, Guilt, Joy, Sadness, Shame, and Surprise. Textual data from multiple datasets, such as the ISEAR, Go Emotions, and Affect datasets, were used to create the emotions dataset. Overlapping or conflicting data samples were handled through careful preprocessing. Our results show a significant improvement with the modeling architecture, with as much as a 10-point improvement in recognizing some emotions.

Keywords: text emotion recognition, bidirectional LSTM, multi-head attention, multi-level classification, google word2vec word embeddings

Procedia PDF Downloads 160

360 Machine Learning Approach for Stress Detection Using Wireless Physical Activity Tracker

Authors: B. Padmaja, V. V. Rama Prasad, K. V. N. Sunitha, E. Krishna Rao Patro

Abstract:

Stress is a psychological condition that reduces the quality of sleep and affects every facet of life. Constant exposure to stress is detrimental not only to the mind but also to the body. To cope with stress, however, one should first identify it. This paper provides an effective method for detecting cognitive stress levels using data from a physical activity tracker device, Fitbit. This device records people's daily food intake, weight, sleep, heart rate, and physical activities. In this paper, four major stressors, namely physical activities, sleep patterns, working hours and change in heart rate, are used to assess the stress levels of individuals. The main aim of this system is to use a machine learning approach to stress detection with the help of smartphone sensor technology. The effect of each stressor is first evaluated individually using logistic regression, and then a combined model is built and assessed using variants of ordinal logistic regression models, namely logit, probit and complementary log-log. The quality of each model is then evaluated using the Akaike Information Criterion (AIC), and probit is found to be the most suitable model for our dataset. The system was tested and evaluated in a real-time environment using data from adults working in IT and other sectors in India. The novelty of this work lies in the principle that a stress detection system should be as non-invasive as possible for its users.
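
Model selection by AIC, as mentioned in the abstract, follows the standard definition AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L the maximized likelihood; the model with the lowest AIC is preferred. The sketch below uses invented log-likelihoods and parameter counts, not values from the paper, and they are chosen so that probit comes out lowest only to mirror the abstract's conclusion.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) for three link functions.
# These numbers are invented for illustration and are not from the paper.
candidates = {
    "logit": (-120.5, 6),
    "probit": (-118.0, 6),
    "cloglog": (-123.25, 6),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best_model = min(scores, key=scores.get)
```

Because all three hypothetical models have the same number of parameters here, the ranking is driven entirely by the log-likelihood; the penalty term 2k only matters when the candidates differ in complexity.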

Keywords: physical activity tracker, sleep pattern, working hours, heart rate, smartphone sensor

Procedia PDF Downloads 241

359 Hope as a Predictor for Complicated Grief and Anxiety: A Bayesian Structural Equational Modeling Study

Authors: Bo Yan, Amy Y. M. Chow

Abstract:

Bereavement is recognized as a universal, challenging experience. It is important to gather research evidence on protective factors in bereavement. Hope has been considered one of the protective factors in previous coping studies. The present study aims to add knowledge by investigating whether hope at the first month after death predicts psychological symptoms, including complicated grief (CG), anxiety, and depressive symptoms, at the seventh month. The data were collected via one-on-one interview surveys in a longitudinal project with Hong Kong hospice users (sample size 105). Most participants were middle-aged (49 years old on average), female (72%), and had no religious affiliation (58%). Bayesian Structural Equation Modeling (BSEM) analysis was conducted on the longitudinal dataset. The BSEM findings show that hope at the first month of bereavement negatively predicts both CG and anxiety symptoms at the seventh month, but not depressive symptoms. Age and gender are controlled for in the model. The overall model fit is good. The findings suggest assessing hope at the first month of bereavement. Hope at the first month after the loss is identified as an excellent predictor of complicated grief and anxiety symptoms at the seventh month. The result from this sample is clear, so it encourages cross-cultural research to replicate the model and the development of further clinical applications. In particular, early intervention to increase the level of hope has the potential to reduce psychological symptoms and thus to improve bereaved persons' wellbeing in the long run.

Keywords: anxiety, complicated grief, depressive symptoms, hope, structural equational modeling

Procedia PDF Downloads 184

358 NANCY: Combining Adversarial Networks with Cycle-Consistency for Robust Multi-Modal Image Registration

Authors: Mirjana Ruppel, Rajendra Persad, Amit Bahl, Sanja Dogramadzi, Chris Melhuish, Lyndon Smith

Abstract:

Multimodal image registration is a profoundly complex task, which is why deep learning has been widely used to address it in recent years. However, two main challenges remain. First, the lack of ground truth data calls for an unsupervised learning approach, which leads to the second challenge of defining a feasible loss function that can compare two images of different modalities to judge their level of alignment. To avoid this issue altogether, we implement a generative adversarial network consisting of two registration networks G_AB, G_BA and two discrimination networks D_A, D_B connected by spatial transformation layers. G_AB learns to generate a deformation field which registers an image of modality B to an image of modality A. To do that, it uses the feedback of the discriminator D_B, which learns to judge the quality of alignment of the registered image B. G_BA and D_A learn a mapping from modality A to modality B. Additionally, a cycle-consistency loss is implemented. For this, both registration networks are employed twice, resulting in images Â, B̂ which were registered to B̃, Ã, which in turn were registered to the initial image pair A, B. Thus, the resulting and initial images of the same modality can be easily compared. A dataset of liver CT and MRI was used to evaluate the quality of our approach and to compare it against learning-based and non-learning-based registration algorithms. Our approach leads to Dice scores of up to 0.80 ± 0.01 and is therefore comparable to, and slightly more successful than, algorithms like SimpleElastix and VoxelMorph.
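
The Dice score used as the evaluation metric in the abstract measures the overlap between two segmentation masks: Dice = 2|X ∩ Y| / (|X| + |Y|). The sketch below computes it for flat binary masks; the masks are hypothetical, and treating two empty masks as a perfect match is an assumption of this sketch.

```python
def dice_score(mask_a, mask_b):
    """Dice overlap between two equally sized binary masks (flat sequences)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_sum = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if size_sum == 0:
        # Assumption: two empty masks count as perfectly overlapping.
        return 1.0
    return 2.0 * intersection / size_sum


# Hypothetical liver masks: one from the fixed image, one from the warped image.
fixed_mask = [1, 1, 1, 0, 0, 0, 1, 0]
warped_mask = [1, 1, 0, 0, 0, 1, 1, 0]
score = dice_score(fixed_mask, warped_mask)
```

Here each mask has four foreground elements and three of them coincide, so the score is 0.75. A score of 1 means perfect overlap and 0 means none.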

Keywords: cycle consistency, deformable multimodal image registration, deep learning, GAN

Procedia PDF Downloads 109

357 A Gene Selection Algorithm for Microarray Cancer Classification Using an Improved Particle Swarm Optimization

Authors: Arfan Ali Nagra, Tariq Shahzad, Meshal Alharbi, Khalid Masood Khan, Muhammad Mugees Asif, Taher M. Ghazal, Khmaies Ouahada

Abstract:

Gene selection is an essential step in the classification of microarray cancer data. Gene expression cancer data (DNA microarray) facilitates computing the robust and concurrent expression of various genes. Particle swarm optimization (PSO) requires simple operators and fewer parameters for tuning the model in gene selection. The selection of prognostic genes with small redundancy is a great challenge for researchers, as there are a few complications in PSO-based selection methods. In this research, a new variant of PSO (self-inertia weight adaptive PSO) is proposed. In the proposed algorithm, SIW-APSO-ELM is explored to improve gene selection prediction accuracy. This new algorithm balances the exploration and exploitation capabilities of the improved inertia weight adaptive particle swarm optimization. The self-inertia weight adaptive particle swarm optimization (SIW-APSO) is used to search for the solution. The SIW-APSO is updated with an evolutionary process in such a way that each particle iteratively improves its velocity and position. The extreme learning machine (ELM) is designed for the selection procedure. The proposed method has been used to identify a number of genes in the cancer dataset. The classification stage uses ELM, K-centroid nearest neighbor (KCNN), and support vector machine (SVM) classifiers and attains high prediction accuracy compared to state-of-the-art methods on microarray cancer datasets, which shows the effectiveness of the proposed method.
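
The iterative velocity and position update that the abstract refers to can be sketched with the standard PSO rule. The abstract does not specify how the self-inertia weight is adapted, so the sketch substitutes a common linearly decreasing inertia weight as a stand-in. The function names, coefficient values, and sample particle are all assumptions, and positions are kept continuous for simplicity (gene selection would normally map them to a gene subset).

```python
def linear_inertia(iteration, max_iter, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight (a stand-in, not the paper's rule)."""
    return w_start - (w_start - w_end) * (iteration / max_iter)


def pso_step(position, velocity, personal_best, global_best,
             w, r1, r2, c1=2.0, c2=2.0):
    """One standard PSO update for a single particle.

    r1 and r2 are normally drawn uniformly from [0, 1] at every step;
    they are passed in explicitly here so the example is reproducible.
    """
    new_velocity = [
        w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        for x, v, pb, gb in zip(position, velocity, personal_best, global_best)
    ]
    new_position = [x + nv for x, nv in zip(position, new_velocity)]
    return new_position, new_velocity


# Hypothetical two-dimensional particle.
w0 = linear_inertia(iteration=0, max_iter=10)
new_pos, new_vel = pso_step(
    position=[0.0, 0.0], velocity=[1.0, -1.0],
    personal_best=[1.0, 1.0], global_best=[2.0, 2.0],
    w=w0, r1=0.5, r2=0.25,
)
```

A large inertia weight early on favours exploration of the search space, while a small weight later favours exploitation around the best positions found, which is the balance the abstract describes.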

Keywords: microarray cancer, improved PSO, ELM, SVM, evolutionary algorithms

Procedia PDF Downloads 64

356 A Study of High Viscosity Oil-Gas Slug Flow Using Gamma Densitometer

Authors: Y. Baba, A. Archibong-Eso, H. Yeung

Abstract:

Experimental studies of high viscosity oil-gas flows in horizontal pipelines published in the literature indicate that hydrodynamic slug flow is the dominant flow pattern observed. Investigations have shown that hydrodynamic slugging brings about large pressure instabilities that can damage production facilities, making it imperative to study the high viscosity slug flow regime in order to improve the understanding of its flow dynamics. Most slug flow models used in the petroleum industry for the design of pipelines, together with their closure relationships, were formulated based on observations of low viscosity liquid-gas flows. New experimental investigations and data are therefore required to validate these models. In cases where these models underperform, improving upon them or building new predictive models and correlations will also depend on new experimental datasets and a further understanding of the flow dynamics in high viscosity oil-gas flows. In this study, conducted at the Flow Laboratory, Oil and Gas Engineering Centre of Cranfield University, slug flow variables such as pressure gradient, mean liquid holdup, frequency and slug length for oil viscosities ranging from 1.0 – 5.5 Pa.s are experimentally investigated and analysed. The study was carried out in a 0.076 m ID pipe; two fast-sampling gamma densitometers and pressure transducers (differential and point) were used to obtain experimental measurements. Comparison of the measured slug flow parameters with the existing slug flow prediction models available in the literature showed disagreement with the high viscosity experimental data, highlighting the importance of building new predictive models and correlations.

Keywords: gamma densitometer, mean liquid holdup, pressure gradient, slug frequency, slug length

Procedia PDF Downloads 312

355 A Comparative Study for Various Techniques Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBCs) are the most common type of blood cell and are the most intensively studied in cell biology. A lack of RBCs is a condition in which the hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs affect the exchange of oxygen. This paper presents a comparative study of various techniques for classifying red blood cells as normal or abnormal (anemic) using WEKA. WEKA is an open-source tool consisting of different machine learning algorithms for data mining applications. The algorithms tested are the Radial Basis Function neural network, the Support Vector Machine, and the K-Nearest Neighbors algorithm. Two sets of combined features were utilized for the classification of blood cell images. The first set, consisting exclusively of geometrical features, was used to identify whether the tested blood cell is spherical or non-spherical. The second set, consisting mainly of textural features, was used to recognize the types of the spherical cells. We provide an evaluation based on applying these classification methods to our RBC image dataset, which was obtained from Serdang Hospital-Malaysia, and measuring the accuracy of the test results. The best classification rates achieved are 97%, 98%, and 79% for the Support Vector Machine, the Radial Basis Function neural network, and the K-Nearest Neighbors algorithm, respectively.

Keywords: red blood cells, classification, radial basis function neural networks, support vector machine, k-nearest neighbors algorithm

Procedia PDF Downloads 464

354 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection

Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew

Abstract:

Detecting email spam is an important task in the digital era, which calls for effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique provides interpretable explanations for specific classifications of emails to help users understand the model's decision-making process. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows the creation of simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out the key elements that drive classification results, thus reducing the opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve users' understanding of categorization decisions. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.

Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression

Procedia PDF Downloads 28

353 Optimum Drilling States in Down-the-Hole Percussive Drilling: An Experimental Investigation

Authors: Joao Victor Borges Dos Santos, Thomas Richard, Yevhen Kovalyshen

Abstract:

Down-the-hole (DTH) percussive drilling is an excavation method that is widely used in the mining industry due to its high efficiency in fragmenting hard rock formations. A DTH hammer system consists of a fluid-driven (air or water) piston and a drill bit; the reciprocating movement of the piston transmits its kinetic energy to the drill bit by means of stress waves that propagate through the drill bit towards the rock formation. In the percussive drilling literature, the existence of an optimum drilling state (Sweet Spot) is reported in some laboratory and field experimental studies: an optimum rate of penetration is achieved for a specific range of axial thrust (or weight-on-bit), beyond which the rate of penetration decreases. Several authors advance different explanations as possible root causes of the Sweet Spot, but a universal explanation or consensus does not yet exist. The experimental investigation in this work began with drilling experiments conducted at a mining site. A full-scale drilling rig (equipped with a DTH hammer system) was instrumented with high-precision sensors sampled at a very high rate (kHz). Data were collected while two boreholes were being excavated, and an in-depth analysis of the recorded data confirmed that optimum performance can be achieved for specific ranges of input thrust (weight-on-bit). The high sampling rate made it possible to identify the bit penetration at each single impact (of the piston on the drill bit) as well as the impact frequency. These measurements provide a direct method to identify when the hammer does not fire, so that drilling occurs without percussion and the bit advances the borehole by shearing the rock. The second stage of the experimental investigation was conducted in a laboratory environment with custom-built equipment dubbed Woody. Woody allows the drilling of shallow holes a few centimetres deep by successive discrete impacts from a piston.
After each individual impact, the bit angular position is incremented by a fixed amount, the piston is moved back to its initial position at the top of the barrel, and the air pressure and thrust are set back to their pre-set values. The goal is to explore whether the observed optimum drilling state stems from the interaction between the drill bit and the rock (during impact) or is governed by the overall system dynamics (between impacts). The experiments were conducted on samples of Calca Red, with a drill bit of 74 millimetres (outside diameter) and with weight-on-bit ranging from 0.3 kN to 3.7 kN. Results show that under the same piston impact energy and a constant angular displacement of 15 degrees between impacts, the average drill bit rate of penetration is independent of the weight-on-bit, which suggests that the sweet spot is not caused by intrinsic properties of the bit-rock interface.

Keywords: optimum drilling state, experimental investigation, field experiments, laboratory experiments, down-the-hole percussive drilling

Procedia PDF Downloads 74