Search results for: Sharifa Alshahrani
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 17

17 Comics Scanlation and Publishing Houses Translation

Authors: Sharifa Alshahrani

Abstract:

Comics are multimodal texts in which meaning is created by taking in all modes of expression at once. They combine two semiotic modes, the verbal and the visual, to make meaning, and both modes can be socially and culturally shaped. Comics translation therefore cannot treat comics as monomodal texts by translating only the verbal mode inside or outside the speech balloons, because cultural differences are encoded in the visual mode as well. With the development of the internet and editing software, comics translation is no longer confined to publishing houses and official translation: scanlation, or fan translation, has taken the initiative in translating comics, driven by fans' emotional attachment to the culture and genre. Scanlation is carried out by volunteer fans who translate out of passion. However, quality is one of the debated issues in scanlation and fan translation. This study will investigate how the dynamic multimodal relationship in comics is exploited and interpreted in translation by exploring the translation strategies and procedures adopted by publishing houses and scanlators when translating comics into Arabic, using three analytical frameworks: a cultural references model, a multimodal relation model, and translation strategies and procedures models.

Keywords: comics, multimodality, translation, scanlation

Procedia PDF Downloads 181
16 Formulation and Evaluation of Piroxicam Hydrotropic Starch Gel

Authors: Mohammed Ghazwani, Shyma Ali Alshahrani, Zahra Abdu Yousef, Taif Torki Asiri, Ghofran Abdur Rahman, Asma Ali Alshahrani, Umme Hani

Abstract:

Background and introduction: Piroxicam is a nonsteroidal anti-inflammatory drug, characterized by low solubility and high permeability, that is used to reduce pain, swelling, and joint stiffness from arthritis. Hydrotropes are a class of compounds that increase the aqueous solubility of poorly soluble solutes. Aim: The objective of the present study was to formulate and optimize a Piroxicam hydrotropic starch gel for topical application, using sodium salicylate and sodium benzoate as hydrotropic salts, together with potato starch. Materials and methods: The prepared Piroxicam hydrotropic starch gels were characterized for physicochemical parameters including drug content, pH, tube extrudability, spreadability, and gel strength; all formulations were subjected to in vitro diffusion studies for six hours in 100 ml phosphate buffer (pH 7.4). Results: All formulations were white and opaque in appearance and had good homogeneity. The pH of the formulations was between 6.9 and 7.9. Drug content ranged from 96.8%-99.4.5%. Spreadability plays an important role in patient compliance and helps in the uniform application of gel to the skin, as gels should spread easily; F4 showed a spreadability of 2.4 cm, the highest among all formulations. In vitro diffusion, extrudability, and gel strength were also good for F4 in comparison with the other formulations; hence F4 was selected as the optimized formulation. Conclusion: Isolated potato starch was successfully employed to prepare the gel. The hydrotropic salt sodium salicylate increased the solubility of Piroxicam and resulted in a stable gel, whereas the gel prepared using sodium benzoate changed color from white to light yellow one week after preparation. Hydrotropic potato starch gel is proposed as a suitable vehicle for the topical delivery of Piroxicam.

Keywords: Piroxicam, potato starch, hydrotropic salts, hydrotropic starch gel

Procedia PDF Downloads 99
15 Out-of-Plane Bending Properties of Out-of-Autoclave Thermosetting Prepregs during Forming Processes

Authors: Hassan A. Alshahrani, Mehdi H. Hojjati

Abstract:

In order to predict and model wrinkling which is caused by out of plane deformation due to compressive loading in the plane of the material during composite prepregs forming, it is necessary to quantitatively understand the relative magnitude of the bending stiffness. This study aims to examine the bending properties of out-of-autoclave (OOA) thermosetting prepreg under vertical cantilever test condition. A direct method for characterizing the bending behavior of composite prepregs was developed. The results from direct measurement were compared with results derived from an image-processing procedure that analyses the captured image during the vertical bending test. A numerical simulation was performed using ABAQUS to confirm the bending stiffness value.

Keywords: Bending stiffness, out-of-autoclave prepreg, forming process, numerical simulation.

Procedia PDF Downloads 262
14 Using Machine Learning Techniques for Autism Spectrum Disorder Analysis and Detection in Children

Authors: Norah Mohammed Alshahrani, Abdulaziz Almaleh

Abstract:

Autism Spectrum Disorder (ASD) is a condition related to brain development that affects how a person perceives and communicates with others, resulting in difficulties with social interaction and communication, and its prevalence is constantly growing. Early recognition of ASD allows children to lead safe and healthy lives and helps doctors with accurate diagnosis and management of the condition. It is therefore crucial to develop a method that detects ASD in children with high accuracy. In this paper, ASD datasets of toddlers and children were analyzed. We employed the following machine learning techniques to explore ASD: Random Forest (RF), Decision Tree (DT), Naïve Bayes (NB), and Support Vector Machine (SVM). Feature selection was then used to reduce the number of attributes in the ASD datasets while preserving model performance. The best result was provided by the Support Vector Machine (SVM), achieving 0.98% in the toddler dataset and 0.99% in the children dataset.
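
The abstract does not say which feature selection method was used. As a minimal sketch of one common filter-style approach, assuming binary screening answers and a binary label, the snippet below ranks features by information gain and keeps the top k. The function names and the toy rows are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(column, labels):
    """Reduction in label entropy obtained by splitting on one feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def select_top_k(rows, labels, k):
    """Return the indices of the k features with the highest information gain."""
    n_features = len(rows[0])
    gains = [
        (information_gain([row[j] for row in rows], labels), j)
        for j in range(n_features)
    ]
    # Highest gain first; ties broken by the lower feature index.
    gains.sort(key=lambda g: (-g[0], g[1]))
    return sorted(j for _, j in gains[:k])


# Hypothetical toy data: 4 children, 3 binary screening answers each.
rows = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
labels = [1, 1, 0, 0]  # assumed coding: 1 = ASD traits, 0 = none

selected = select_top_k(rows, labels, k=1)
```

In this toy set only the first answer tracks the label, so it is the one retained; a classifier such as an SVM would then be trained on the reduced columns.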

Keywords: autism spectrum disorder, machine learning, feature selection, support vector machine

Procedia PDF Downloads 107
13 Upgrading of Problem-Based Learning with Educational Multimedia to the Undergraduate Students

Authors: Sharifa Alduraibi, Abir El Sadik, Ahmed Elzainy, Alaa Alduraibi, Ahmed Alsolai

Abstract:

Introduction: Problem-based learning (PBL) is an active, student-centered educational modality that depends on students' interest and requires continuous motivation to improve their engagement. The new era of information technology has facilitated the use of educational multimedia, such as videos, soundtracks, and photographs, to promote students' learning. The aim of the present study was to introduce multimedia-enriched PBL scenarios for the first time in the College of Medicine, Qassim University, as an incentive for better student engagement. In addition, students' performance and satisfaction were evaluated. Methodology: Two multimedia-enhanced PBL scenarios were implemented for third-year students in the urinary system block. Radiological images (plain CT scan, X-ray of the abdomen, and renal nuclear scan), correlated with their pathological gross photographs, were added to the scenarios. One week before the first sessions, pre-recorded orientation videos were provided to PBL tutors to clarify the multimedia incorporated in the scenarios. Two other traditional PBL scenarios, without multimedia demonstrating the pathological and radiological findings, were also designed. Results and Discussion: The formative assessment results at the end of the two PBL modalities were compared. The comparison revealed a significant increase in students' engagement, critical thinking, and practical reasoning skills during the multimedia-enhanced sessions. The students' perception survey showed great satisfaction with the new strategy. Conclusion: The current work suggests that multimedia created a technology-based teaching strategy that inspired students' self-directed thinking and promoted their overall achievement.

Keywords: multimedia, pathology and radiology images, problem-based learning, videos

Procedia PDF Downloads 119
12 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum

Authors: Abdulrahman Sumayli, Saad M. AlShahrani

Abstract:

For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improving process efficiency. We investigated the computation of H2 solubility in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were the inputs to the models, while hydrogen solubility was the sole response. Specifically, we employed three models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models were optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks and compared their performance on several metrics. Our results show that the WOA-SVR model achieves the best performance overall, with an RMSE of 1.38 × 10⁻² and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. In addition, the solubility of hydrogen in the four heavy oil fractions is estimated over temperatures of 150 °C to 350 °C and pressures of 1.2 MPa to 10.8 MPa.
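
The two metrics reported above can be computed as follows. This is a minimal sketch using the standard definitions of RMSE and R-squared; the measured and predicted values are hypothetical placeholders, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between measurements and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical solubility values (arbitrary units), for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

error = rmse(measured, predicted)
fit = r_squared(measured, predicted)
```

A lower RMSE and an R-squared closer to 1 both indicate predictions that track the measurements more closely, which is the basis on which the models are compared.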

Keywords: temperature, pressure variations, machine learning, oil treatment

Procedia PDF Downloads 35
11 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing, i.e. expressing the same concept in alternative ways by using different words or phrases, is one of the important tasks in natural language processing. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, and Information Extraction. To obtain pairs of sentences that are paraphrases, we create a system that automatically extracts paraphrases from a corpus built from different sources of news articles, since these are likely to contain paraphrases when they report the same event on the same day. Simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment techniques (e.g. Dynamic Time Warping (DTW)) for extracting paraphrases have been applied to English. However, the performance of these approaches could be affected when they are applied to another language, such as Arabic, because of phenomena that are not present in English, such as free word order, zero copula, and pro-dropping. Thus, if we can analyse how the existing algorithms for English fail for Arabic, we can find a solution for Arabic. The results are promising.
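
The TF-IDF and cosine similarity baseline mentioned above can be sketched in a few lines. This is an illustrative version only: it assumes whitespace tokenization and a smoothed IDF, uses English placeholder sentences, and is not the authors' system. Arabic text would need proper tokenization and morphological processing, which is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Build one sparse TF-IDF vector (term -> weight) per sentence."""
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: in how many sentences each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder sentences, invented for illustration.
sentences = [
    "the minister visited the city on monday",
    "the minister travelled to the city on monday",
    "oil prices fell sharply this week",
]
vectors = tfidf_vectors(sentences)
sim_paraphrase = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

Because the vectors are bags of words, this baseline ignores word order entirely and only rewards shared surface tokens, which gives some intuition for why language-specific phenomena can change its behaviour.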

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 349
10 The Use of Continuous Improvement Methods to Empower the Osh MS With Leading Key Performance Indicators

Authors: Maha Rashid Al-Azib, Almuzn Qasem Alqathradi, Amal Munir Alshahrani, Bilqis Mohammed Assiri, Ali Almuflih

Abstract:

The Occupational Safety and Health Management System (OSH MS) in one of the largest Saudi companies has incurred extensive direct and indirect expenses over the last 10 years due to a lack of proactive leading indicators and effective safety leadership procedures. Because no previous studies have addressed this safety department in the company, this research was conducted. We used a mixed-method approach comprising a literature review and expert input, followed by a qualitative questionnaire provided by the Institute for Work and Health to determine the level of the company's occupational safety and health management system out of three levels (Compliance, Improvement, Continuous Learning); the company was found to be at the Continuous Learning level. The Deming cycle was then employed to create a set of proactive leading indicators, which were analyzed using the SMART method to ensure their effectiveness and suitability for the company. The objective of this research is to provide a set of proactive indicators that contribute to an efficient occupational safety and health management system with fewer accidents and, consequently, lower expenses. Finally, we provided the company with a prototype of an app, designed around our final results, to support decision-making processes.

Keywords: proactive leading indicators, OSH MS, safety leadership, accidents reduction

Procedia PDF Downloads 47
9 Exploring the Techniques of Achieving Structural Electrical Continuity for Gas Plant Facilities

Authors: Abdulmohsen Alghadeer, Fahad Al Mahashir, Loai Al Owa, Najim Alshahrani

Abstract:

Electrical continuity of steel structure members is an essential condition to ensure equipotential and ultimately to protect personnel and assets in industrial facilities. The steel structure is electrically connected to provide a low resistance path to earth through equipotential bonding to prevent sparks and fires in the event of fault currents and avoid malfunction of the plant with detrimental consequences to the local and global environment. The oil and gas industry is commonly establishing steel structure electrical continuity by bare surface connection of coated steel members. This paper presents information pertaining to a real case of exploring and applying different techniques to achieve the electrical continuity in erecting steel structures at a gas plant facility. A project was supplied with fully coated steel members even at the surface connection members that cause electrical discontinuity. This was observed while a considerable number of steel members had already been received at the job site and erected. This made the resolution of the case to use different techniques such as bolt tightening and torqueing, chemical paint stripping and single point jumpers. These techniques are studied with comparative analysis related to their applicability, workability, time and cost advantages and disadvantages.

Keywords: coated Steel, electrical continuity, equipotential bonding, galvanized steel, gas plant facility, lightning protection, steel structure

Procedia PDF Downloads 89
8 Investigating Self-Confidence Influence on English as a Foreign Language Student English Language Proficiency Level

Authors: Ali A. Alshahrani

Abstract:

This study aims to identify Saudi English as a Foreign Language (EFL) students' perspectives towards using the English language in their studies. The study explores students' self-confidence and its association with their actual performance in English courses in their different academic programs. A multimodal methodology was used to fulfill the research purpose and answer the research questions. A 25-item survey questionnaire and final examination grades were used to collect data. Two hundred forty-one students agreed to participate in the study. They completed the questionnaire and agreed to release their final grades as part of the collected data. The data were coded and analyzed using SPSS software. The findings indicated a significant difference in students' performance in English courses between participants' academic programs. Students' self-confidence in their English language skills, on the other hand, was not significantly different between academic programs. Data analysis also revealed no correlation between students' self-confidence level and their language skills and performance. The study raises further questions about other vital factors, such as course instructors' views of the materials, the views of faculty members in the target department, family belief in the usefulness of the program, and the views of potential employers. These views and beliefs shape the student's preparation process and, therefore, should be explored further.

Keywords: English language intensive program, language proficiency, performance, self-confidence

Procedia PDF Downloads 90
7 Perceived Risks in Business-to-Consumer Online Contracts: An Empirical Study in Saudi Arabia

Authors: Shaya Alshahrani

Abstract:

Perceived risks play a major role in consumer intentions, behaviors, attitudes, and decisions about online shopping in the KSA. This paper investigates the influence of six perceived risk dimensions on Saudi consumers: product risk, information risk, financial risk, privacy and security risk, delivery risk, and terms and conditions risk empirically. To ensure the success of this study, a random survey was distributed to reflect the consumers’ perceived risk and to enable the generalization of the results. Data were collected from 323 respondents in the Kingdom of Saudi Arabia (KSA): 50 who had never shopped online and 273 who had done so. The results indicated that all six risks influenced the respondents’ perceptions of online shopping. The non-online shoppers perceived financial and delivery risks as the most significant barriers to online shopping. This was followed closely by performance, information, and privacy and security risks. Terms and conditions were perceived as less significant. The online consumers considered delivery and performance risks to be the most significant influences on internet shopping. This was followed closely by information and terms and conditions. Financial and privacy and security risks were perceived as less significant. This paper argues that introducing adequate legal solutions to addressing related problems arising from this study is an urgent need. This may enhance consumer trust in the KSA online market, increase consumers’ intentions regarding online shopping, and improve consumer protection.

Keywords: perceived risk, online contracts, Saudi Arabia, consumer protection

Procedia PDF Downloads 113
6 Cerebral Venous Thrombosis at High Altitude: A Rare Presentation by Sub-Arachnoid Hemorrhage

Authors: Eman G. Alayad, Mazen G. Aleyad, Mohammed Alshahrani, Ibrahim Alnaami

Abstract:

Introduction: Cerebral venous thrombosis (CVT) is a rare type of cerebrovascular disease that can occur at any age. Patients with CVT commonly present with headache, focal neurological deficit, decreased level of consciousness, and seizures. Many etiologic risk factors have been reported for CVT, high altitude and oral contraceptive pills among them. Case Presentation: A 37-year-old woman living in Abha city in the southeastern area of Saudi Arabia (about 10,000 feet, or 3,000 m, above sea level) presented with acute onset of severe diffuse headache and generalized tonic-clonic convulsions, followed by loss of consciousness. She had been on contraceptive pills for the last 3 years and had no significant medical or surgical history. Brain CT revealed subarachnoid hemorrhage, and MRI showed thrombosis in the transverse sinus. There were no vascular malformations such as aneurysm, arteriovenous malformation (AVM), or dural arteriovenous fistula. CVT with subarachnoid hemorrhage was our final diagnosis, based on clinical presentation and radiographic findings. Discussion: One report found evidence of cortical SAH in 10 of 233 patients with CVT, and others found that 3% of SAH was caused by CVT, indicating that the presence of cortical SAH without involvement of the basal cisterns may be an early sign of underlying CVT. What is more interesting in this case, however, is the relationship of high altitude with CVT and SAH, which has not been described previously. Conclusion: High-altitude climbing per se has been described as a risk factor for the development of CVT, though its occurrence is probably rare. Whether the etiology is primary, due to a high-altitude-induced hypercoagulable state of unknown origin, or due to cerebrovascular disturbances, further investigation is needed, especially given this unusual presentation with subarachnoid hemorrhage.

Keywords: cerebral venous thrombosis, high-altitude, subarachnoid hemorrhage, stroke

Procedia PDF Downloads 194
5 Determinants of Consultation Time at a Family Medicine Center

Authors: Ali Alshahrani, Adel Almaai, Saad Garni

Abstract:

Aim of the study: To explore the duration and determinants of consultation time at a family medicine center. Methodology: This study was conducted at the Family Medicine Center in Ahad Rafidah City, in the southwestern part of Saudi Arabia, on the working days of March 2013. A total of 459 patients were included. A checklist was designed for this study, and trained nurses helped in filling it in. It covered the patient's age, sex, diagnosis, type of visit, referral and its type, psychological problems, and additional work-up, as well as the number of daily bookings, the physician's experience, and consultation time. Results: More than half of the patients (58.39%) had a consultation of less than 10 minutes (mean ± SD: 12.73 ± 9.22 minutes). Patients treated by physicians with the shortest experience (i.e., ≤5 years) had the longest consultation time, while those treated by physicians with the longest experience (i.e., >10 years) had the shortest (13.94±10.99 versus 10.79±7.28 minutes, p=0.011). Regarding diagnosis, patients with chronic diseases had the longest consultation time (p<0.001). Patients who did not need referral had significantly shorter consultation times than those who had routine or urgent referrals (11.91±8.42, 14.60±9.03, and 22.42±14.81 minutes, respectively, p<0.001). Patients with associated psychological problems needed significantly longer consultation times than those without (20.06±13.32 versus 12.45±8.93 minutes, p<0.001). Conclusions: The average consultation time at Ahad Rafidah Family Medicine Center is approximately 13 minutes. Less-experienced physicians tend to spend longer with patients. Referred patients, those with psychological problems, and those with chronic diseases tend to have longer consultation times. Recommendations: Family physicians should be encouraged to keep to their optimal consultation time.
Booking an adequate number of patients per shift would allow the family physician to provide enough consultation time for each patient.

Keywords: consultation, quality, medicine, clinics

Procedia PDF Downloads 258
4 Blockchain-Based Decentralized Architecture for Secure Medical Records Management

Authors: Saeed M. Alshahrani

Abstract:

This research integrated blockchain technology to reform medical records management in healthcare informatics. It aimed to resolve the limitations of centralized systems by establishing a secure, decentralized, and user-centric platform. The system was architected with a three-tiered structure, integrating advanced cryptographic methodologies, consensus algorithms, and the HL7 Fast Healthcare Interoperability Resources (FHIR) standard to ensure data security, transaction validity, and semantic interoperability. The research has implications for healthcare delivery, patient care, legal compliance, operational efficiency, and academic advancement in the blockchain technology and healthcare IT sectors. The methodology adopted in this research comprised a preliminary feasibility study, a literature review, design and development, cryptographic algorithm integration, data modeling, and system testing. The research employed a permissioned blockchain with a Practical Byzantine Fault Tolerance (PBFT) consensus algorithm and Ethereum-based smart contracts. It integrated advanced cryptographic algorithms, role-based access control, multi-factor authentication, and RESTful APIs to ensure security, regulate access, authenticate user identities, and facilitate seamless data exchange between the blockchain and legacy healthcare systems. The research contributed to the development of a secure, interoperable, and decentralized system for managing medical records, addressing the limitations of the centralized systems that were in place. Future work will optimize the system further, explore additional blockchain use cases in healthcare, and expand adoption of the system globally, contributing to the evolution of global healthcare practices and policies.
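
The tamper-evidence property that motivates a blockchain for medical records can be illustrated with a plain hash chain. The sketch below is a simplification under stated assumptions: it uses SHA-256 from Python's standard library, a single in-memory list, and hypothetical record fields. It does not implement PBFT consensus, smart contracts, access control, or FHIR, and it is not the system described in the abstract.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a block that commits to the hash of the previous block."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    expected_prev = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["prev_hash"], block["payload"])
        if block["hash"] != recomputed:
            return False
        expected_prev = block["hash"]
    return True


# Hypothetical audit entries; the field names are invented for illustration.
chain = []
append_block(chain, {"record_id": "r-001", "action": "create"})
append_block(chain, {"record_id": "r-001", "action": "update"})
valid_before = is_valid(chain)

chain[0]["payload"]["action"] = "delete"  # simulate tampering with history
valid_after = is_valid(chain)
```

Editing an earlier entry changes its recomputed hash, so validation fails unless every later block is rewritten as well; in a permissioned network, the consensus layer is what prevents a single party from doing that.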

Keywords: healthcare informatics, blockchain, medical records management, decentralized architecture, data security, cryptographic algorithms

Procedia PDF Downloads 29
3 The Mediation Impact of Demographic and Clinical Characteristics on the Relationship between Trunk Control and Quality of Life among the Sub-Acute Stroke Population: A Cross-Sectional Study

Authors: Kumar Gular, Viswanathan S., Mastour Saeed Alshahrani, Ravi Shankar Reddy, Jaya Shanker Tedla, Snehil Dixit, Ajay Prasad Gautam, Venkata Nagaraj Kakaraparthi, Devika Rani Sangadala

Abstract:

Background: Despite trunk control’s significant contribution to improving various functional activity components, the independent effect of trunk performance on quality of life is yet to be estimated in stroke survivors. Ascertaining the correlation between trunk control and self-reported quality of life while evaluating the effect of demographic and clinical characteristics on their relationship will guide concerned healthcare professionals in designing ideal rehabilitation protocols during the late sub-acute stroke stage of recovery. The aims of the present research were to (1) investigate the associations of trunk performance with self-rated quality of life and (2) evaluate if age, body mass index (BMI), and clinical characteristics mediate the relationship between trunk motor performance and perceived quality of life in the sub-acute stroke population. Methods: Trunk motor functions and quality of life among the late sub-acute stroke population aged 57.53 ± 6.42 years were evaluated through the trunk Impairment Scale (TIS) and Stroke specific quality of life (SSQOL) questionnaire, respectively. Pearson correlation coefficients and mediation analysis were performed to elucidate the relationship of trunk motor function with quality of life and determine the mediation impact of demographic and clinical characteristics on their association, respectively. Results: The current study observed significant correlations between trunk motor functions (TIS) and quality of life (SSQOL) with r=0.68 (p<0.001). Age, BMI, and type of stroke were detected as potential mediating factors in the association between trunk performance and quality of life. Conclusion: Validated associations between trunk motor functions and perceived quality of life among the late sub-acute stroke population emphasize the importance of comprehensive evaluation of trunk control. 
Rehabilitation specialists should focus on appropriate strategies to enhance trunk performance anticipating the potential effects of age, BMI, and type of stroke to improve health-related quality of life in stroke survivors.
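
For readers unfamiliar with the statistic reported in the results, the Pearson correlation coefficient can be computed as below. This is a minimal sketch of the standard formula; the scores are hypothetical and are not data from the study, and the mediation analysis itself is not shown.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five participants, for illustration only.
tis_scores = [10, 12, 15, 18, 20]         # Trunk Impairment Scale
ssqol_scores = [120, 150, 140, 190, 200]  # Stroke Specific Quality of Life

r = pearson_r(tis_scores, ssqol_scores)
```

Values near +1 indicate that higher trunk scores go with higher quality-of-life scores, values near 0 indicate no linear association, and values near -1 indicate an inverse relationship.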

Keywords: sub-acute stroke, quality of life, functional independence, trunk control

Procedia PDF Downloads 37
2 A Geoprocessing Tool for Early Civil Work Notification to Optimize Fiber Optic Cable Installation Cost

Authors: Hussain Adnan Alsalman, Khalid Alhajri, Humoud Alrashidi, Abdulkareem Almakrami, Badie Alguwaisem, Said Alshahrani, Abdullah Alrowaished

Abstract:

Most of the cost of installing a new fiber optic cable (FOC) is attributed to civil work, specifically trenching. In many cases, information technology (IT) departments receive project proposals in their eReview system, but not all projects are visible to everyone. Additionally, if there is no IT scope in a proposed project, it is unlikely to be visible to IT, and it is sometimes too late to add IT scope after project budgets have been finalized. Finally, the eReview system is a repository of PDF files for each project, which commits the reviewer to manual work and limits automation potential. This paper details a solution that addresses the late notification of the eReview system by integrating IT sites GIS data (site locations) with land use permit (LUP) data (civil work activity). An LUP request is the first step before securing the required land usage authorizations, and no detailed designs exist for any relevant project before an LUP request is approved. To address the manual nature of the eReview system, the solution relies on the fact that both the LUP system and the IT data use ArcGIS Desktop, which enables the creation of a geoprocessing tool with either Python or ModelBuilder to automate finding and evaluating potentially usable LUP requests that reduce trenching between two sites in need of a new FOC. To achieve this, a weekly dump was taken from the LUP system production data and loaded manually into ArcMap Desktop. A custom tool was then developed in ModelBuilder, taking as input a two-column table containing all the pairs of sites in need of new fiber connectivity. The tool iterates over all rows of this table, taking one pair of sites at a time and finding potential LUPs between them that satisfy the provided search radius. If a group of LUPs is found, an iterator goes through each LUP to find the civil work required between the two sites and the LUP polyline feature, as well as the distance along the line, which would count as cost avoidance if an IT scope had been added.
Finally, the tool exports an Excel file named after the pair of sites, containing one row for each LUP that met the search radius, with trenching and pulling information and cost. As a result, multiple projects have been identified: historical, missed-opportunity, and proposed projects. For the proposed project, the savings were about 75% ($750,000) for installing a new fiber along the Euclidean distance between the Abqaiq GOSP2 and GOSP3 DCOs. In conclusion, the current tool setup identifies opportunities to bundle civil work for a single project at a time between two sites. More work is needed to allow the bundling of multiple projects between two sites to achieve even more avoidance of both capital cost and carbon footprint.
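
The iteration described above can be approximated in plain Python. This sketch makes several assumptions that are not in the abstract: planar coordinates in arbitrary units, an LUP counted as a candidate only when every vertex of its polyline lies within the search radius of the straight line between the two sites, and hypothetical site and LUP names. It does not use ArcGIS or ModelBuilder and leaves out the cost calculation and Excel export.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def polyline_length(points):
    """Total length along a polyline given as a list of (x, y) vertices."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def candidate_lups(site_a, site_b, lups, radius):
    """LUP polylines lying within `radius` of the straight line between two sites."""
    hits = []
    for name, points in lups.items():
        if all(point_segment_distance(p, site_a, site_b) <= radius for p in points):
            hits.append((name, polyline_length(points)))
    return hits


# Hypothetical sites, site pairs, and LUP polylines.
sites = {"SITE-A": (0.0, 0.0), "SITE-B": (10.0, 0.0)}
site_pairs = [("SITE-A", "SITE-B")]
lups = {
    "LUP-1": [(2.0, 1.0), (6.0, 1.0)],  # runs close to the A-B line
    "LUP-2": [(2.0, 8.0), (6.0, 8.0)],  # too far away
}

results = {
    pair: candidate_lups(sites[pair[0]], sites[pair[1]], lups, radius=2.0)
    for pair in site_pairs
}
```

Each hit carries the length of the LUP polyline, which stands in for the trench distance that could be shared if IT scope were added to that civil work.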

Keywords: GIS, fiber optic cable installation optimization, eliminate redundant civil work, reduce carbon footprint for fiber optic cable installation

Procedia PDF Downloads 191
1 Partial Least Square Regression for High-Dimensional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

This research focuses on the investigation of partial least squares (PLS) methodology to deal with high-dimensional correlated data. Current developments in technology have enabled experiments to produce data that are characterized by, first, the number of variables that far exceeds the number of observations and, second, variables that are substantially correlated between them. These types of data are commonly found in, first, chemometrics, where absorbance levels of chemical samples are recorded across hundreds of wavelengths in a calibration of a near-infrared (NIR) spectrometer. Second, they are also common to be found in genomics where copy number alterations (CNA) are recorded across thousands of genomic regions from cancer patients. In our study, we investigated key areas to address these challenges. Firstly, we tackled the issue of three main PLS algorithms having potentially different interpretations of relevant quantities. We unified these interpretations by identifying scenarios where all three algorithms yield the same estimates. Secondly, we explored the phenomenon of unusual negative shrinkage factors encountered during PLS model fitting. Unlike ridge regression or principal component regression, where shrinkage factors range between zero and one, PLS can exhibit factors greater than one or even negative, hence more aptly termed ‘filter factors’ rather than ‘shrinkage factors’. This characteristic allows PLS to effectively handle high-dimensional data by applying shrinkage to estimates. To our knowledge, there has been no previous meaningful investigation on the negative filter factors (NFF) in PLS. In this research we present a novel result whereby we identify the condition for NFF to happen and investigate characteristics of the data that are associated with NFF to get an insight. Lastly, the main challenge of the application of PLS is in the interpretation of weights associated with the predictors. 
With hundreds or thousands of predictors, every predictor variable has a non-zero weight. However, we expect that only some predictor variables contribute to the association with the outcome variable. We therefore resort to sparse estimation of the predictor weights, where some weights are estimated as zero and the others are non-zero. A standard lasso estimation has a weakness in dealing with correlated variables, as it arbitrarily picks one variable within a correlation block. A novel approach is needed to consider the dependencies between predictor variables in estimating the weights. We propose a new method in which a new penalty function is introduced in the likelihood function associated with the estimation of the weights. The penalty function is a combination of a lasso penalty, which imposes sparsity, and a penalty based on the Cauchy distribution with a smoother matrix, which takes into account dependencies between genomic regions. The results show that the estimates of the weights are sparse: many weights are estimated as zero, and the non-zero estimates are grouped and exhibit smoothness within groups. The interpretation of genomic regions becomes easy, and the identification of important regions for each component can be done simultaneously with prediction in a single modeling framework. We also investigated the relation between PLS and graphical modeling, using the information in the weights to construct the graph, without success. High-dimensional data, where the number of predictors (p) exceeds the number of observations (n), are widely used in many applications of regression analysis. Ordinary least squares (OLS) regression, the most well-known method for regression problems, performs poorly with high-dimensional and highly correlated data. Previous studies have shown that there is an association between copy number alterations (CNA) in some key genes and disease phenotypes.
Moreover, with high-dimensional data such as gene expression data in bioinformatics and biology, it is very important to classify the samples into groups, such as tumor types. However, standard regression and classification methods fail in these cases because the predictor matrix is singular and so cannot be inverted. Hence, regularised methods such as shrinkage methods and dimension reduction methods are needed. One of the most frequently suggested methods in the literature is partial least squares (PLS) regression for linear regression and classification.
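
The contrast between shrinkage factors and filter factors can be made concrete with the standard spectral representation of these estimators. The notation below is an assumption introduced for illustration and does not appear in the abstract: $\lambda_i$ are the eigenvalues of $X^\top X$ with singular vectors $u_i, v_i$ of $X$, $\kappa$ is the ridge penalty, $m$ is the number of components, and $\theta_j$ are the Ritz values arising in PLS.

```latex
% Common form: each estimator rescales the i-th principal direction by f_i
\hat{\beta} \;=\; \sum_{i} f_i \, \frac{u_i^{\top} y}{\sqrt{\lambda_i}} \, v_i

% Ridge regression: factors always lie strictly between 0 and 1
f_i^{\text{ridge}} \;=\; \frac{\lambda_i}{\lambda_i + \kappa}

% Principal component regression with m components: factors are 0 or 1
f_i^{\text{PCR}} \;=\; \begin{cases} 1, & i \le m \\ 0, & i > m \end{cases}

% PLS with m components: factors depend on the Ritz values \theta_j
f_i^{\text{PLS}} \;=\; 1 - \prod_{j=1}^{m} \left( 1 - \frac{\lambda_i}{\theta_j} \right)
```

Because a ratio $\lambda_i / \theta_j$ can exceed one, the product in the PLS expression can be negative or larger than one, so $f_i^{\text{PLS}}$ can exceed one or fall below zero, which is why the term "filter factors" fits better than "shrinkage factors".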

Keywords: negative filter factors, partial least square regression, high-dimensional data, biostatistics, bioinformatics

Procedia PDF Downloads 5