Search results for: Github
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 20

20 AIPM:An Integrator and Pull Request Matching Model in Github

Authors: Zhifang Liao, Yanbing Li, Li Xu, Yan Zhang, Xiaoping Fan, Jinsong Wu

Abstract:

Pull Request (PR) is the primary method for code contributions from external contributors in GitHub. PR review is an essential part of open source software development for maintaining software quality. Matching a new PR with an appropriate integrator makes the PR review more effective. However, PR and integrator matching is currently done manually in GitHub. To reduce this cost, we present AIPM, a model that predicts a highly relevant integrator for incoming PRs. AIPM uses a topic model to extract topics from the PRs and builds a one-to-one correspondence between topics and integrators. Then, AIPM finds the most suitable integrator according to the maximum entry of the topic-document distribution. On average, AIPM reaches a precision of 60%, and in some projects it reaches a precision of 80%.
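
The matching step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the topic model has already produced a topic distribution for the PR, and the function name, integrator names, and numbers are hypothetical.

```python
from typing import Dict, List


def match_integrator(topic_distribution: List[float],
                     topic_to_integrator: Dict[int, str]) -> str:
    """Return the integrator mapped to the highest-probability topic."""
    if not topic_distribution:
        raise ValueError("topic distribution is empty")
    # Index of the maximum entry of the PR's topic distribution.
    best_topic = max(range(len(topic_distribution)),
                     key=topic_distribution.__getitem__)
    return topic_to_integrator[best_topic]


# Hypothetical values: one PR's distribution over three topics and a
# one-to-one topic-to-integrator mapping.
pr_topics = [0.15, 0.70, 0.15]
integrators = {0: "alice", 1: "bob", 2: "carol"}
chosen = match_integrator(pr_topics, integrators)
```

Here the PR is routed to the integrator paired with topic 1, because that topic carries the largest probability mass.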

Keywords: pull request, integrator matching, Github, open source project, topic model

Procedia PDF Downloads 271
19 Semantic Differences between Bug Labeling of Different Repositories via Machine Learning

Authors: Pooja Khanal, Huaming Zhang

Abstract:

Labeling of issues/bugs, also known as bug classification, plays a vital role in software engineering. Some known labels/classes of bugs are 'User Interface', 'Security', and 'API'. Most of the time, when a reporter reports a bug, they try to assign some predefined label to it. Those issues are reported for a project, and each project is a repository in GitHub/GitLab, which contains multiple issues. There are many software project repositories, ranging from individual projects to commercial projects. The labels assigned in different repositories may depend on various factors such as human instinct, generalization of labels, and the label assignment policy followed by the reporter. While the reporter of an issue may instinctively give that issue one label, another person reporting the same issue may label it differently. As a result, it is not known mathematically whether a label in one repository is similar to or different from the label in another repository. Hence, the primary goal of this research is to find the semantic differences between bug labeling of different repositories via machine learning. Independent optimal classifiers for individual repositories are built first using the text features from the reported issues. The optimal classifiers may include a combination of multiple classifiers stacked together. Then, those classifiers are used to cross-test other repositories, which allows the result to be deduced mathematically. The product of this ongoing research includes a formalized open-source GitHub issues database that is used to deduce the similarity of the labels pertaining to the different repositories.

Keywords: bug classification, bug labels, GitHub issues, semantic differences

Procedia PDF Downloads 170
18 BingleSeq: A User-Friendly R Package for Single-Cell RNA-Seq Data Analysis

Authors: Quan Gu, Daniel Dimitrov

Abstract:

BingleSeq was developed as a Shiny-based, intuitive, and comprehensive application that enables the analysis of single-cell RNA-sequencing count data. This was achieved by incorporating three state-of-the-art software packages for each type of RNA sequencing analysis, alongside functional annotation analysis and a way to assess the overlap of differential expression method results. In its current state, the functionality implemented within BingleSeq is comparable to that of other applications that were also developed to lower the entry requirements for RNA sequencing analyses. BingleSeq is available on GitHub and will be submitted to R/Bioconductor.

Keywords: bioinformatics, functional annotation analysis, single-cell RNA-sequencing, transcriptomics

Procedia PDF Downloads 165
17 Artificial Intelligence for All: Artificial Intelligence Education for K-12

Authors: Yiqiao Yin

Abstract:

Many scholars and educators have dedicated their lives to the K-12 education system, and there has been a rapid growth of attention to implementing technical foundations for Artificial Intelligence education for high school and pre-college students. This paper focuses on the development and use of resources to support K-12 education in Artificial Intelligence (AI). The author and his team have more than three years of experience coaching pre-college students aged 15 to 18. This paper is a culmination of that experience and proposes online tools, software demos, and structured activities for high school students. The paper also addresses a portfolio of AI concepts as well as the expected learning outcomes. All resources are provided with online videos and GitHub repositories for immediate use.

Keywords: K12 education, AI4ALL, pre-college education, pre-college AI

Procedia PDF Downloads 109
16 Detecting Logical Errors in Haskell

Authors: Vanessa Vasconcelos, Mariza A. S. Bigonha

Abstract:

In order to facilitate both processes, this paper presents HaskellFL, a tool that uses fault localization techniques to locate a logical error in Haskell code. The Haskell subset used in this work is sufficiently expressive for those studying functional programming to get immediate help debugging their code and to answer questions about key concepts associated with the functional paradigm. HaskellFL was tested against functional programming assignments submitted by students enrolled in the functional programming class at the Federal University of Minas Gerais and against exercises from the Exercism Haskell track that are publicly available on GitHub. The EXAM score was chosen to evaluate the tool’s effectiveness, and results showed that HaskellFL reduced the effort needed to locate an error in all tested scenarios. Results also showed that the Ochiai method was more effective than Tarantula.
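
For readers unfamiliar with the two ranking methods named above, the sketch below computes the standard Ochiai and Tarantula suspiciousness scores for a single program element from test coverage counts. It is written in Python for illustration and is not taken from HaskellFL; the function and parameter names are assumptions.

```python
import math


def ochiai(ef: int, ep: int, nf: int) -> float:
    """Ochiai score. ef/ep: failed/passed tests that execute the element;
    nf: failed tests that do not execute it."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def tarantula(ef: int, ep: int, nf: int, np_: int) -> float:
    """Tarantula score. np_: passed tests that do not execute the element."""
    fail_ratio = ef / (ef + nf) if (ef + nf) else 0.0
    pass_ratio = ep / (ep + np_) if (ep + np_) else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0


# Hypothetical coverage: the element is executed by both failing tests
# and by one of the four passing tests.
ochiai_score = ochiai(ef=2, ep=1, nf=0)
tarantula_score = tarantula(ef=2, ep=1, nf=0, np_=3)
```

Elements are then ranked by descending score, and a developer inspects them in that order; a metric such as EXAM measures how much of the ranking must be examined before the faulty element is reached.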

Keywords: debug, fault localization, functional programming, Haskell

Procedia PDF Downloads 274
15 Chinese Sentence Level Lip Recognition

Authors: Peng Wang, Tigang Jiang

Abstract:

Computer-based lip reading methods do not transfer universally across languages. At present, research on Chinese lip reading, in terms of both data sets and recognition algorithms, is far from mature. In this paper, we study Chinese lipreading based on machine learning and propose a Chinese sentence-level lip-reading network (CNLipNet) model, which consists of a spatio-temporal convolutional neural network (CNN), a recurrent neural network (RNN), and the Connectionist Temporal Classification (CTC) loss function. This model maps a variable-length sequence of video frames to a Chinese Pinyin sequence and is trained end-to-end. Moreover, we create CNLRS, a Chinese lipreading dataset, which contains 5948 samples and can be shared through GitHub. The evaluation of CNLipNet on this dataset yielded a 41% word correct rate and a 70.6% character correct rate. This result is far superior to that of professional human lip readers, indicating that CNLipNet performs well in lipreading.
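
A model trained with CTC emits one score vector per video frame, so its output must be collapsed into a shorter label sequence. The sketch below shows the common greedy (best-path) decoding rule in Python. It is an illustration only: the abstract does not say which decoder CNLipNet uses, and the blank index and example scores are assumptions.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Pick the best label per frame, merge repeats, then drop blanks."""
    decoded: List[int] = []
    previous = None
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a label only when it differs from the previous frame's
        # label and is not the blank symbol.
        if best != previous and best != blank:
            decoded.append(best)
        previous = best
    return decoded


# Hypothetical per-frame scores over {blank, label 1, label 2}.
frames = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
    [0.9, 0.05, 0.05],
]
labels = ctc_greedy_decode(frames)
```

The blank frame between the two runs of label 1 is what allows the same label to appear twice in a row in the decoded output.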

Keywords: lipreading, machine learning, spatio-temporal, convolutional neural network, recurrent neural network

Procedia PDF Downloads 99
14 Reinforcement Learning for Self Driving Racing Car Games

Authors: Adam Beaunoyer, Cory Beaunoyer, Mohammed Elmorsy, Hanan Saleh

Abstract:

This research aims to create a reinforcement learning agent capable of racing in challenging simulated environments with a low collision count. We present a reinforcement learning agent that can navigate challenging tracks using both a Deep Q-Network (DQN) and a Soft Actor-Critic (SAC) method. A challenging track includes curves, jumps, and varying road widths throughout. The environment used in this research is built from open-source code on GitHub and is based on the 1995 racing game WipeOut. The proposed reinforcement learning agent can navigate challenging tracks rapidly while maintaining a low completion time and collision count. The results show that the SAC model outperforms the DQN model by a large margin. We also propose an alternative multiple-car model that can navigate the track without colliding with other vehicles. The SAC model is the basis for the multiple-car model, which completes laps more quickly than the single-car model but has a higher collision rate with the track wall.
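
As background for the DQN method mentioned above, the sketch below computes the standard one-step learning target that a Deep Q-Network regresses toward. It is a generic Python illustration, not the authors' training code; the reward value, discount factor, and Q-values are hypothetical.

```python
from typing import Sequence


def dqn_target(reward: float,
               next_q_values: Sequence[float],
               done: bool,
               gamma: float = 0.99) -> float:
    """One-step target: r + gamma * max_a' Q(s', a'), or r at episode end."""
    if done:
        # No future return after a terminal state (e.g. the race ended).
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical transition: reward 1.0, two available actions in the
# next state with estimated values 0.5 and 2.0.
target = dqn_target(reward=1.0, next_q_values=[0.5, 2.0], done=False, gamma=0.9)
```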

Keywords: reinforcement learning, soft actor-critic, deep q-network, self-driving cars, artificial intelligence, gaming

Procedia PDF Downloads 11
13 Fast Adjustable Threshold for Uniform Neural Network Quantization

Authors: Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev

Abstract:

Neural network quantization is a highly desired procedure to perform before running neural networks on mobile devices. Quantization without fine-tuning leads to an accuracy drop of the model, whereas commonly used training with quantization is done on the full set of labeled data and is therefore both time- and resource-consuming. Real-life applications require simplification and acceleration of the quantization procedure while maintaining the accuracy of the full-precision neural network, especially for modern mobile neural network architectures like MobileNet-v1, MobileNet-v2, and MNAS. Here we present a method to significantly optimize training with quantization by introducing trained scale factors for the discretization thresholds that are separate for each filter. Using the proposed technique, we quantize modern mobile neural network architectures with a training set of only ∼10% of the total ImageNet 2012 sample. Such a reduction of the training dataset size and the small number of trainable parameters allow the network to be fine-tuned in several hours while maintaining the high accuracy of the quantized model (the accuracy drop was less than 0.5%). Ready-for-use models and code are available in the GitHub repository.
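
To make the idea of a per-filter threshold scale factor concrete, here is a minimal Python sketch of symmetric uniform quantization for one filter. It is an assumption-laden illustration rather than the authors' method: the scale factor `alpha` is passed in as a fixed number, whereas the abstract describes learning it during fine-tuning, and the function name and values are hypothetical.

```python
from typing import List, Tuple


def quantize_filter(weights: List[float],
                    alpha: float = 1.0,
                    bits: int = 8) -> Tuple[List[int], float]:
    """Quantize one filter's weights to signed integers.

    The clipping threshold is alpha * max|w|; alpha stands in for the
    trained scale factor (assumed given here, not learned).
    """
    max_abs = max(abs(w) for w in weights)
    threshold = alpha * max_abs
    if threshold == 0.0:
        return [0] * len(weights), 1.0
    levels = 2 ** (bits - 1) - 1          # 127 for 8 bits
    step = threshold / levels             # width of one quantization bin
    quantized = []
    for w in weights:
        clipped = max(-threshold, min(threshold, w))
        quantized.append(round(clipped / step))
    return quantized, step


# Hypothetical filter weights; alpha < 1 tightens the threshold so that
# outliers are clipped and the remaining range is resolved more finely.
q_full, step_full = quantize_filter([1.0, -1.0, 0.0], alpha=1.0)
q_tight, step_tight = quantize_filter([1.0, -1.0, 0.0], alpha=0.5)
```

A dequantized weight is recovered as the integer value times `step`, which is why a smaller threshold trades clipping error on large weights for finer resolution on small ones.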

Keywords: distillation, machine learning, neural networks, quantization

Procedia PDF Downloads 291
12 BodeACD: Buffer Overflow Vulnerabilities Detecting Based on Abstract Syntax Tree, Control Flow Graph, and Data Dependency Graph

Authors: Xinghang Lv, Tao Peng, Jia Chen, Junping Liu, Xinrong Hu, Ruhan He, Minghua Jiang, Wenli Cao

Abstract:

Buffer overflows are among the most dangerous vulnerabilities, so detecting them effectively is essential. Traditional detection methods are not accurate enough and consume too many resources to cope with today's large and complex code bases. To resolve these problems, we propose a method for Buffer overflow detection based on Abstract syntax tree, Control flow graph, and Data dependency graph (BodeACD) in C/C++ programs with source code. First, BodeACD constructs function samples of buffer overflow that are available on GitHub, then represents them as code representation sequences, which fuse control flow, data dependency, and syntax structure of source code to reduce information loss during code representation. Finally, BodeACD learns vulnerability patterns for vulnerability detection through deep learning. The experimental results show that BodeACD increases precision and recall by 6.3% and 8.5%, respectively, compared with the latest methods, which effectively improves vulnerability detection and reduces the false-positive and false-negative rates.

Keywords: vulnerability detection, abstract syntax tree, control flow graph, data dependency graph, code representation, deep learning

Procedia PDF Downloads 142
11 A Unified Webcam Proctoring Solution on Edge

Authors: Saw Thiha, Jay Rajasekera

Abstract:

A boom in video conferencing has generated millions of hours of video data daily to be analyzed. However, data at this scale is difficult to analyze efficiently, let alone in real time, as online conferences can involve hundreds of people and can last for hours. This paper proposes an efficient online proctoring solution that can analyze online conferences in real time on edge devices such as Android, iOS, and desktops. Since the computation can be done upfront on the devices where online conferences take place, it can scale well without requiring intensive resources such as GPU servers and complex cloud infrastructure. According to the linear models, face orientation does indeed impact the perceived eye openness. Also, the proposed z-score facial landmark standardization was shown to be functional in detecting face orientation and contributed to classifying eye blinks with a single eyelid distance computation while achieving a better F1 score and accuracy than the Eye Aspect Ratio (EAR) threshold method. Last but not least, the authors implemented the solution natively in the MediaPipe framework and open-sourced it along with the reproducible experimental results on GitHub. The solution provides face orientation, eye blink, facial activity, and translation detection out of the box and is highly customizable and extensible.
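
The two quantities compared in this abstract can be illustrated with a short Python sketch: the widely used Eye Aspect Ratio over six eye landmarks, and a generic z-score standardization of a list of values. The abstract does not spell out the authors' exact standardization procedure, so the `z_score` function here is an assumption, as are the landmark coordinates.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def eye_aspect_ratio(eye: Sequence[Point]) -> float:
    """EAR for landmarks p1..p6, where p1 and p4 are the eye corners."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def z_score(values: Sequence[float]) -> List[float]:
    """Standardize values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical landmarks for an open eye, in pixel coordinates.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear = eye_aspect_ratio(open_eye)
standardized = z_score([1.0, 2.0, 3.0])
```

In a threshold-based blink detector, a blink is flagged when the EAR falls below a chosen cutoff; standardizing landmarks first is one way to reduce the influence of face size and position on such distances.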

Keywords: android, desktop, edge computing, blink, face orientation, facial activity and translation, MediaPipe, open source, real-time, video conference, web, iOS, Z score facial landmark standardization

Procedia PDF Downloads 71
10 An Extensible Software Infrastructure for Computer Aided Custom Monitoring of Patients in Smart Homes

Authors: Ritwik Dutta, Marylin Wolf

Abstract:

This paper describes the trade-offs and the design from scratch of a self-contained, easy-to-use health dashboard software system that provides customizable data tracking for patients in smart homes. The system is made up of different software modules and comprises a front-end and a back-end component. Built with HTML, CSS, and JavaScript, the front-end allows adding users, logging into the system, selecting metrics, and specifying health goals. The back-end consists of a NoSQL Mongo database, a Python script, and a SimpleHTTPServer written in Python. The database stores user profiles and health data in JSON format. The Python script makes use of the PyMongo driver library to query the database and displays formatted data as a daily snapshot of user health metrics against target goals. Any number of standard and custom metrics can be added to the system, and corresponding health data can be fed automatically, via sensor APIs or manually, as text or picture data files. A real-time METAR request API permits correlating weather data with patient health, and an advanced query system is implemented to allow trend analysis of selected health metrics over custom time intervals. Available on the GitHub repository system, the project is free to use for academic purposes of learning and experimenting, or practical purposes by building on it.

Keywords: flask, Java, JavaScript, health monitoring, long-term care, Mongo, Python, smart home, software engineering, webserver

Procedia PDF Downloads 364
9 Developing an Automated Protocol for the Wristband Extraction Process Using Opentrons

Authors: Tei Kim, Brooklynn McNeil, Kathryn Dunn, Douglas I. Walker

Abstract:

To better characterize the relationship between complex chemical exposures and disease, our laboratory uses an approach that combines low-cost polydimethylsiloxane (silicone) wristband samplers, which absorb many of the chemicals we are exposed to, with untargeted high-resolution mass spectrometry (HRMS) to characterize thousands of chemicals at a time. In studies with human populations, these wristbands can provide an important measure of our environment; however, there is a need to use this approach in large cohorts to study exposures associated with disease. To facilitate the use of silicone samplers in large-scale population studies, the goal of this research project was to establish automated sample preparation methods that improve throughput, robustness, and scalability of analytical methods for silicone wristbands. Using the Opentrons OT-2 automated liquid handling platform, which provides a low-cost and open-source framework for automated pipetting, we created two separate workflows that translate the manual wristband preparation method to a fully automated protocol that requires minor intervention by the operator. These protocols include a sequence generation step, which defines the location of all plates and labware according to user-specified settings, and a transfer protocol that includes all necessary instrument parameters and instructions for automated solvent extraction of wristband samplers. These protocols were written in Python and uploaded to GitHub for use by others in the research community. Results from this project show it is possible to establish automated and open-source methods for the preparation of silicone wristband samplers to support profiling of many environmental exposures. Ongoing studies include deployment in longitudinal cohort studies to investigate the relationship between personal chemical exposure and disease.
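
A sequence generation step of the kind described above could look roughly like the following Python sketch, which assigns each sample a plate number and well name. It deliberately avoids the Opentrons API and is not the authors' protocol; the 96-well, column-by-column layout and the function name are assumptions.

```python
from typing import List, Tuple

ROWS = "ABCDEFGH"   # assumed standard 96-well plate: 8 rows
COLUMNS = 12        # and 12 columns


def well_sequence(n_samples: int) -> List[Tuple[int, str]]:
    """Assign each sample a (plate number, well name), filling each plate
    column by column (A1, B1, ..., H1, A2, ...)."""
    per_plate = len(ROWS) * COLUMNS
    layout = []
    for i in range(n_samples):
        plate, offset = divmod(i, per_plate)
        column, row = divmod(offset, len(ROWS))
        layout.append((plate + 1, f"{ROWS[row]}{column + 1}"))
    return layout


# Hypothetical run with 97 samples: the last one spills onto a second plate.
layout = well_sequence(97)
```

A transfer protocol would then iterate over such a layout and issue the pipetting commands for each source and destination well.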

Keywords: bioinformatics, automation, opentrons, research

Procedia PDF Downloads 82
8 FlameCens: Visualization of Expressive Deviations in Music Performance

Authors: Y. Trantafyllou, C. Alexandraki

Abstract:

Music interpretation refers to the way musicians shape their performance by deliberately deviating from composers’ intentions, which are commonly communicated via some form of music transcription, such as a music score. For transcribed and non-improvised music, music expression is manifested by introducing subtle deviations in tempo, dynamics, and articulation during the evolution of a performance. This paper presents an application, named FlameCens, which, given two recordings of the same piece of music, presumably performed by different musicians, allows visualising deviations in tempo and dynamics during playback. The application may also compare a certain performance to the music score of that piece (i.e. a MIDI file), which may be thought of as an expression-neutral representation of that piece, hence depicting the expressive cues employed by certain performers. FlameCens uses the Dynamic Time Warping algorithm to compare two audio sequences, based on CENS (Chroma Energy distribution Normalized Statistics) audio features. Expressive deviations are illustrated in a moving flame, which is generated by an animation of particles. The length of the flame is mapped to deviations in dynamics, while the slope of the flame is mapped to tempo deviations, so that a faster tempo tilts the flame to the right and a slower tempo tilts it to the left. A constant slope signifies no tempo deviation. The detected deviations in tempo and dynamics can additionally be recorded in a text file, which allows for offline investigation. Moreover, in the case of monophonic music, the color of the particles is used to convey the pitch of the notes during performance. FlameCens has been implemented in Python and is openly available via GitHub. The application has been experimentally validated for different music genres including classical, contemporary, jazz, and popular music. These experiments revealed that FlameCens can be a valuable tool for music specialists (i.e. musicians or musicologists) to investigate the expressive performance strategies employed by different musicians, as well as for music audiences to enhance their listening experience.
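
The Dynamic Time Warping step named in this abstract can be sketched as follows. This is a textbook Python version for illustration, not the FlameCens code: it aligns two sequences of single numbers, whereas FlameCens compares CENS feature vectors, and the example sequences are hypothetical.

```python
from typing import List, Sequence, Tuple


def dtw(a: Sequence[float],
        b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the DTW cost and alignment path between two non-empty sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    # Backtrack from the end to recover which frames were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n][m], path


# Hypothetical feature sequences: the second performance holds the
# middle value for one extra frame.
distance, alignment = dtw([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
```

The local slope of the alignment path indicates where one performance runs faster or slower than the other, which is the kind of information a tempo-deviation display can draw on.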

Keywords: audio synchronization, computational music analysis, expressive music performance, information visualization

Procedia PDF Downloads 105
7 ROSgeoregistration: Aerial Multi-Spectral Image Simulator for the Robot Operating System

Authors: Andrew R. Willis, Kevin Brink, Kathleen Dipple

Abstract:

This article describes a software package called ROSgeoregistration intended for use with the robot operating system (ROS) and the Gazebo 3D simulation environment. ROSgeoregistration provides tools for the simulation, test, and deployment of aerial georegistration algorithms and is available at github.com/uncc-visionlab/rosgeoregistration. A model creation package is provided which downloads multi-spectral images from the Google Earth Engine database and, if necessary, incorporates these images into a single, possibly very large, reference image. Additionally, a Gazebo plugin that uses the real-time sensor pose and image formation model to generate simulated imagery from the specified reference image is provided, along with related plugins for UAV-relevant data. The novelty of this work is threefold: (1) this is the first system to link the massive multi-spectral imaging database of Google’s Earth Engine to the Gazebo simulator, (2) this is the first example of a system that can simulate geospatially and radiometrically accurate imagery from multiple sensor views of the same terrain region, and (3) integration with other UAS tools creates a new holistic UAS simulation environment to support UAS system and subsystem development where real-world testing would generally be prohibitive. Sensed imagery and ground truth registration information are published to client applications, which can receive imagery synchronously with telemetry from other payload sensors, e.g., IMU, GPS/GNSS, barometer, and windspeed sensor data. To highlight functionality, we demonstrate ROSgeoregistration for simulating Electro-Optical (EO) and Synthetic Aperture Radar (SAR) image sensors and an example use case for developing and evaluating image-based UAS position feedback, i.e., pose for image-based Guidance Navigation and Control (GNC) applications.

Keywords: EO-to-EO, EO-to-SAR, flight simulation, georegistration, image generation, robot operating system, vision-based navigation

Procedia PDF Downloads 80
6 A System Architecture for Hand Gesture Control of Robotic Technology: A Case Study Using a Myo™ Arm Band, DJI Spark™ Drone, and a Staubli™ Robotic Manipulator

Authors: Sebastian van Delden, Matthew Anuszkiewicz, Jayse White, Scott Stolarski

Abstract:

Industrial robotic manipulators have been commonplace in the manufacturing world since the early 1960s, and unmanned aerial vehicles (drones) have only begun to realize their full potential in the service industry and the military. The omnipresence of these technologies in their respective fields will only become more potent in coming years. While these technologies have greatly evolved over the years, the typical approach to human interaction with these robots has not. In the industrial robotics realm, a manipulator is typically jogged around using a teach pendant and programmed using a networked computer or the teach pendant itself via a proprietary software development platform. Drones are typically controlled using a two-handed controller equipped with throttles, buttons, and sticks, an app that can be downloaded to one’s mobile device, or a combination of both. This application-oriented work offers a novel approach to human interaction with both unmanned aerial vehicles and industrial robotic manipulators via hand gestures and movements. Two systems have been implemented, both of which use a Myo™ armband to control either a drone (DJI Spark™) or a robotic arm (Stäubli™ TX40). The methodologies developed by this work present a mapping of armband gestures (fist, finger spread, swing hand in, swing hand out, swing arm left/up/down/right, etc.) to either drone or robot arm movements. The findings of this study present the efficacy and limitations (precision and ergonomic) of hand gesture control of two distinct types of robotic technology. All source code associated with this project will be open sourced and placed on GitHub. In conclusion, this study offers a framework that maps hand and arm gestures to drone and robot arm control. The system has been implemented using current ubiquitous technologies, and these software artifacts will be open sourced for future researchers or practitioners to use in their work.

Keywords: human robot interaction, drones, gestures, robotics

Procedia PDF Downloads 128
5 Ribotaxa: Combined Approaches for Taxonomic Resolution Down to the Species Level from Metagenomics Data Revealing Novelties

Authors: Oshma Chakoory, Sophie Comtet-Marre, Pierre Peyret

Abstract:

Metagenomic classifiers are widely used for the taxonomic profiling of metagenomic data and estimation of taxa relative abundance. Small subunit rRNA genes are nowadays a gold standard for the phylogenetic resolution of complex microbial communities, although the power of this marker comes down to its use as full-length. We benchmarked the performance and accuracy of rRNA-specialized versus general-purpose read mappers, reference-targeted assemblers and taxonomic classifiers. We then built a pipeline called RiboTaxa to generate a highly sensitive and specific metataxonomic approach. Using metagenomics data, RiboTaxa gave the best results compared to other tools (Kraken2, Centrifuge (1), METAXA2 (2), PhyloFlash (3)) with precise taxonomic identification and relative abundance description, giving no false positive detection. Using real datasets from various environments (ocean, soil, human gut) and from different approaches (metagenomics and gene capture by hybridization), RiboTaxa revealed microbial novelties not seen by current bioinformatics analysis opening new biological perspectives in human and environmental health. In a study focused on corals’ health involving 20 metagenomic samples (4), an affiliation of prokaryotes was limited to the family level with Endozoicomonadaceae characterising healthy octocoral tissue. RiboTaxa highlighted 2 species of uncultured Endozoicomonas which were dominant in the healthy tissue. Both species belonged to a genus not yet described, opening new research perspectives on corals’ health. Applied to metagenomics data from a study on human gut and extreme longevity (5), RiboTaxa detected the presence of an uncultured archaeon in semi-supercentenarians (aged 105 to 109 years) highlighting an archaeal genus, not yet described, and 3 uncultured species belonging to the Enorma genus that could be species of interest participating in the longevity process. 
RiboTaxa is user-friendly and rapid, it allows microbiota structure description from any environment, and its results can be easily interpreted. This software is freely available at https://github.com/oschakoory/RiboTaxa under the GNU Affero General Public License 3.0.

Keywords: metagenomics profiling, microbial diversity, SSU rRNA genes, full-length phylogenetic marker

Procedia PDF Downloads 90
4 Shark Detection and Classification with Deep Learning

Authors: Jeremy Jenrette, Z. Y. C. Liu, Pranav Chimote, Edward Fox, Trevor Hastie, Francesco Ferretti

Abstract:

Suitable shark conservation depends on well-informed population assessments. Direct methods such as scientific surveys and fisheries monitoring are adequate for defining population statuses, but species-specific indices of abundance and distribution coming from these sources are rare for most shark species. We can rapidly fill these information gaps by boosting media-based remote monitoring efforts with machine learning and automation. We created a database of shark images by sourcing 24,546 images covering 219 species of sharks from the web application sharkPulse and the social network Instagram. We used object detection to extract shark features and inflate this database to 53,345 images. We packaged object-detection and image classification models into a Shark Detector bundle. We developed the Shark Detector to recognize and classify sharks from videos and images using transfer learning and convolutional neural networks (CNNs). We applied these models to common data-generation approaches of sharks: boosting training datasets, processing baited remote camera footage and online videos, and data-mining Instagram. We examined the accuracy of each model and tested genus and species prediction correctness as a result of training data quantity. The Shark Detector located sharks in baited remote footage and YouTube videos with an average accuracy of 89%, and classified located subjects to the species level with 69% accuracy (n = 8 species). The Shark Detector sorted heterogeneous datasets of images sourced from Instagram with 91% accuracy and classified species with 70% accuracy (n = 17 species). Data-mining Instagram can inflate training datasets and increase the Shark Detector’s accuracy as well as facilitate archiving of historical and novel shark observations. Base accuracy of genus prediction was 68% across 25 genera. The average base accuracy of species prediction within each genus class was 85%. The Shark Detector can classify 45 species.
All data-generation methods were processed without manual interaction. As media-based remote monitoring strives to dominate methods for observing sharks in nature, we developed an open-source Shark Detector to facilitate common identification applications. Prediction accuracy of the software pipeline increases as more images are added to the training dataset. We provide public access to the software on our GitHub page.

Keywords: classification, data mining, Instagram, remote monitoring, sharks

Procedia PDF Downloads 90
3 MigrationR: An R Package for Analyzing Bird Migration Data Based on Satellite Tracking

Authors: Xinhai Li, Huidong Tian, Yumin Guo

Abstract:

Bird migration is a fantastic natural phenomenon. In recent years, the use of GPS transmitters has generated a vast amount of data, and the Movebank platform has made these data publicly accessible. What researchers now need are data analysis tools. Although there are approximately 90 R packages dedicated to animal movement analysis, the capacity for comprehensive processing of bird migration data remains limited. Hence, we introduce a novel package called migrationR. This package enables the calculation of movement speed, direction, changes in direction, flight duration, and daily and annual movement distances. Furthermore, it can pinpoint the starting and ending dates of migration, estimate nest site locations and stopovers, and visualize movement trajectories at various time scales. migrationR distinguishes individuals through NMDS (non-metric multidimensional scaling) coordinates based on movement variables such as speed, flight duration, path tortuosity, and migration timing. A distinctive aspect of the package is the development of a hetero-occurrences species distribution model that takes into account the daily rhythm of individual birds across different landcover types. Habitat use for foraging and roosting differs significantly for many waterbirds. For example, White-naped Cranes at Poyang Lake in China typically forage in croplands and roost in shallow water areas. Both of these occurrence types are of equal importance. Optimal habitats consist of a combination of croplands and shallow waters, whereas suboptimal habitats lack this combination, which forces birds to fly extensively. With migrationR, we conduct species distribution modeling for foraging and roosting separately and utilize the moving distance between croplands and shallow water areas as an index of overall habitat suitability.
This approach offers a more nuanced understanding of the habitat requirements for migratory birds and enhances our ability to analyze and interpret their movement patterns effectively. The functions of migrationR are demonstrated using our own tracking data of 78 White-naped Crane individuals from 2014 to 2023, comprising over one million valid locations in total. migrationR can be installed from a GitHub repository by executing the following command: remotes::install_github("Xinhai-Li/migrationR").
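
The basic movement variables listed in this abstract (distance, speed, direction) can be derived from consecutive GPS fixes with standard spherical formulas. The Python sketch below is an independent illustration and does not reproduce the migrationR API; the function names, the `(lat, lon, unix_seconds)` fix format, and the example coordinates are assumptions.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

Fix = Tuple[float, float, float]  # (latitude deg, longitude deg, unix seconds)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def initial_bearing_deg(lat1: float, lon1: float,
                        lat2: float, lon2: float) -> float:
    """Initial compass bearing from point 1 to point 2 (0 = north, 90 = east)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlambda = math.radians(lon2 - lon1)
    x = math.sin(dlambda) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlambda))
    return math.degrees(math.atan2(x, y)) % 360.0


def speed_kmh(fix_a: Fix, fix_b: Fix) -> float:
    """Average ground speed between two fixes with distinct timestamps."""
    hours = (fix_b[2] - fix_a[2]) / 3600.0
    return haversine_km(fix_a[0], fix_a[1], fix_b[0], fix_b[1]) / hours


# Hypothetical fixes one hour apart, one degree of longitude along the equator.
a: Fix = (0.0, 0.0, 0.0)
b: Fix = (0.0, 1.0, 3600.0)
distance_km = haversine_km(a[0], a[1], b[0], b[1])
bearing = initial_bearing_deg(a[0], a[1], b[0], b[1])
speed = speed_kmh(a, b)
```

Changes in direction and daily or annual distances follow by differencing successive bearings and summing successive distances over the chosen time window.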

Keywords: bird migration, hetero-occurrences species distribution model, migrationR, R package, satellite telemetry

Procedia PDF Downloads 34
2 Scalable CI/CD and Scalable Automation: Assisting in Optimizing Productivity and Fostering Delivery Expansion

Authors: Solanki Ravirajsinh, Kudo Kuniaki, Sharma Ankit, Devi Sherine, Kuboshima Misaki, Tachi Shuntaro

Abstract:

In software development life cycles, the absence of scalable CI/CD significantly impacts organizations, leading to increased overall maintenance costs, prolonged release delivery times, heightened manual efforts, and difficulties in meeting tight deadlines. Implementing CI/CD with standard serverless technologies using cloud services overcomes all the above-mentioned issues and helps organizations improve efficiency and deliver faster without the need to manage server maintenance and capacity. By integrating scalable CI/CD with scalable automation testing, productivity, quality, and agility are enhanced while reducing the need for repetitive work and manual efforts. A scalable CI/CD setup for development built on cloud services like ECS (Container Management Service), AWS Fargate, ECR (to store Docker images with all dependencies), Serverless Computing (serverless virtual machines), Cloud Log (for monitoring errors and logs), Security Groups (for inside/outside access to the application), Docker Containerization (Docker-based images and container techniques), Jenkins (CI/CD build management tool), and code management tools (GitHub, Bitbucket, AWS CodeCommit) can efficiently handle the demands of diverse development environments and accommodate dynamic workloads, increasing efficiency for faster delivery with good quality. CI/CD pipelines encourage collaboration among development, operations, and quality assurance teams by providing a centralized platform for automated testing, deployment, and monitoring. Scalable CI/CD streamlines the development process by automatically fetching the latest code from the repository every time the process starts, building the application based on the branches, testing the application using a scalable automation testing framework, and deploying the builds. Developers can focus more on writing code and less on managing infrastructure, as it scales based on need. 
Serverless CI/CD eliminates the need to manage and maintain traditional CI/CD infrastructure, such as servers and build agents, reducing operational overhead and allowing teams to allocate resources more efficiently. Scalable CI/CD adjusts the application's scale according to usage, thereby alleviating concerns about scalability, maintenance costs, and resource needs. Creating scalable automation testing using cloud services (ECR, ECS Fargate, Docker, EFS, Serverless Computing) helps organizations run more than 500 test cases in parallel, which aids in detecting race conditions and performance issues and reduces execution time. Scalable CI/CD offers flexibility, dynamically adjusting to varying workloads and demands, allowing teams to scale resources up or down as needed. It optimizes costs, since teams pay only for the resources they actually use, and increases reliability. Scalable CI/CD pipelines employ automated testing and validation processes to detect and prevent errors early in the development cycle.
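
The idea of running many test cases in parallel can be sketched in a few lines of Python. This is only an in-process illustration using the standard library's thread pool; it stands in for the container-based setup (ECS Fargate, Docker) described in the abstract and is not the authors' framework. The function name `run_in_parallel` and the convention that a test is a zero-argument callable that raises on failure are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_parallel(test_cases, max_workers=8):
    """Run independent test callables concurrently and collect their outcomes.

    test_cases: dict mapping a test name to a zero-argument callable
    (hypothetical convention: the callable raises an exception on failure).
    Returns a dict mapping each test name to "passed" or "failed: <error>".
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every test up front so the pool can schedule them concurrently.
        futures = {pool.submit(fn): name for name, fn in test_cases.items()}
        for future in as_completed(futures):
            name = futures[future]
            error = future.exception()
            results[name] = "passed" if error is None else f"failed: {error!r}"
    return results
```

In a cloud deployment, each worker would typically be a separate container task rather than a thread, with `max_workers` replaced by whatever concurrency limit the platform allows.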

Keywords: achieve parallel execution, cloud services, scalable automation testing, scalable continuous integration and deployment

Procedia PDF Downloads 10
1 Integrating the Modbus SCADA Communication Protocol with Elliptic Curve Cryptography

Authors: Despoina Chochtoula, Aristidis Ilias, Yannis Stamatiou

Abstract:

Modbus is a protocol that enables communication among devices connected to the same network. This protocol is often deployed to connect sensor and monitoring units to central supervisory servers in Supervisory Control and Data Acquisition, or SCADA, systems. These systems monitor critical infrastructures, such as factories, power generation stations, and nuclear power reactors, in order to detect malfunctions and trigger alerts and corrective actions. However, due to their criticality, SCADA systems are vulnerable to attacks that range from simple eavesdropping on operation parameters, exchanged messages, and valuable infrastructure information to malicious modification of vital infrastructure data with the aim of inflicting damage. Thus, the SCADA research community has been active in strengthening SCADA systems with suitable data protection mechanisms based, to a large extent, on cryptographic methods for data encryption, device authentication, and message integrity protection. However, many SCADA sensors and embedded devices have limited computation power, so the usual public key cryptographic methods are not appropriate because of their high computational requirements. As an alternative, Elliptic Curve Cryptography has been proposed, which requires smaller key sizes and, thus, less demanding cryptographic operations. Until now, however, no such implementation has been proposed in the SCADA literature, to the best of our knowledge. In order to fill this gap, our methodology focused on integrating Modbus, a frequently used SCADA communication protocol, with Elliptic Curve based cryptography and developing a server/client application to demonstrate the proof of concept. For the implementation we deployed two C language libraries, which were suitably modified in order to be successfully integrated: libmodbus (https://github.com/stephane/libmodbus) and ecc-lib (https://www.ceid.upatras.gr/webpages/faculty/zaro/software/ecc-lib/). 
The first library provides a C implementation of the Modbus/TCP protocol while the second one offers the functionality to develop cryptographic protocols based on Elliptic Curve Cryptography. These two libraries were combined, after suitable modifications and enhancements, in order to give a modified version of the Modbus/TCP protocol focusing on the security of the data exchanged among the devices and the supervisory servers. The mechanisms we implemented include key generation, key exchange/sharing, message authentication, data integrity check, and encryption/decryption of data. The key generation and key exchange protocols were implemented with the use of Elliptic Curve Cryptography primitives. The keys established by each device are saved in its local memory, retained during the whole communication session, and used for encrypting and decrypting exchanged messages as well as for certifying entities and the integrity of the messages. Finally, the modified library was compiled for the Android environment in order to run the server application as an Android app. The client program runs on a regular computer. The communication between these two entities is an example of the successful establishment of an Elliptic Curve Cryptography based, secure Modbus wireless communication session between a portable device acting as a supervisor station and a monitoring computer. Our first performance measurements are also very promising and demonstrate the feasibility of embedding Elliptic Curve Cryptography into SCADA systems, filling a gap in the relevant scientific literature.
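
As a rough illustration of the mechanisms listed in this abstract (key exchange with elliptic curve primitives, followed by message authentication), the Python sketch below runs an elliptic-curve Diffie-Hellman exchange on a textbook toy curve and uses the derived key to tag a standard Modbus/TCP "read holding registers" request. It is not the authors' implementation and uses neither libmodbus nor ecc-lib: the curve is far too small to be secure, the fixed private keys are placeholders, and appending an HMAC-SHA256 tag to the frame is an assumed wire format chosen only for illustration.

```python
import hashlib
import hmac
import struct

# Textbook toy curve y^2 = x^3 + 2x + 2 over F_17 with base point G of order 19.
# Illustration only: a real deployment would use a standardized curve.
P, A = 17, 2
G = (5, 1)
N = 19


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k > 0:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def derive_key(shared_point):
    """Hash the shared x-coordinate into a 32-byte symmetric key."""
    return hashlib.sha256(shared_point[0].to_bytes(32, "big")).digest()


def read_holding_registers_frame(transaction_id, unit_id, start_addr, quantity):
    """Modbus/TCP request: MBAP header (7 bytes) + PDU with function code 0x03."""
    # Length field = unit id (1 byte) + PDU (5 bytes) = 6.
    return struct.pack(">HHHBBHH", transaction_id, 0, 6, unit_id, 0x03, start_addr, quantity)


def protect(frame, key):
    """Append an HMAC-SHA256 tag to the frame (assumed format, not a Modbus standard)."""
    return frame + hmac.new(key, frame, hashlib.sha256).digest()


def verify(message, key):
    """Return the frame if its tag is valid, otherwise None."""
    frame, tag = message[:-32], message[-32:]
    expected = hmac.new(key, frame, hashlib.sha256).digest()
    return frame if hmac.compare_digest(tag, expected) else None
```

In use, each side would pick a random private scalar below `N` (for example with the `secrets` module), send `scalar_mult(private, G)` to its peer, and pass `scalar_mult(private, peer_public)` to `derive_key`; both sides then hold the same key for `protect` and `verify`. Encryption of the payload, which the abstract also mentions, is left out of this sketch.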

Keywords: elliptic curve cryptography, ICT security, modbus protocol, SCADA, TCP/IP protocol

Procedia PDF Downloads 228