This retrospective pilot study was conducted on plasma from patients diagnosed with RA, consisting of a group that developed CVEs and an age- and sex-matched RA control group that was considered healthy with respect to CVEs.
Patients were recruited using Reade’s patient medical records and biobank database, which consists of data from subjects who had previously participated in other RA trials and signed an informed consent form.
First, we searched the medical history and medical letters for a recorded CVE with a precise date. To verify each event, we checked the pharmacy medication history to confirm that the patient had actually received medication and that the date was correct. We searched for P2Y12 inhibitors, such as Ticagrelor, Persantin, and Clopidogrel (which are used for the treatment of myocardial infarction or cerebral infarction). This search yielded 279 patients. The biobank database was then searched for samples from these 279 patients taken at least 6 months before and 3–6 months after the event, which left 10 patients who met all of the above criteria. We then matched these 10 RA patients with a cardiovascular event by age and sex to 10 control RA patients who did not have an event. The medical records of the control group were reviewed in the same way and showed no evidence of a cardiovascular event, which was further verified by medication use.
All blood samples were drawn into conventional tubes containing ethylenediaminetetraacetic acid (EDTA)/K2EDTA and were stored in the biobank at -80 °C until further processing. The study was conducted in accordance with the Declaration of Helsinki and approved by the local medical ethics committee of Slotervaart Hospital. All patients provided written informed consent.
Plasma depletion (top 14 most abundant proteins)
Depletion of highly abundant plasma proteins is essential for increasing total protein coverage prior to LC‒MS/MS analysis [28]. Liquid chromatography (Äkta Explorer 10S, GE, Mijdrecht, The Netherlands) was performed with a Multi Affinity Removal Column (Human-14HC 4.6 × 50 mm, Agilent Technologies, Santa Clara, US). The column contains immobilized antibodies that capture the 14 most abundant plasma proteins, which together account for 90–95% of the total protein mass. Prior to injection, the system was rinsed with the equilibration/load/wash buffer (buffer A, pH 7.0) (Agilent Technologies, Santa Clara, US). Each plasma sample was diluted 1:4 with buffer A, and 100 µL of the diluted sample was injected at a flow rate of 0.125 mL/min of buffer A. The flow-through (containing the depleted plasma) was collected in fractions in a 96-well plate (250 µL/well). Next, a switch to buffer B (pH 3.0) (Agilent Technologies, Santa Clara, US), combined with an increase in flow rate to 1 mL/min, released the 14 captured high-abundance proteins from the column (Supplemental Fig. 1). The wells containing the flow-through proteins were pooled (4 wells per run, 2 runs per sample) for further processing.
Fraction concentration and trypsin gold digestion
After concentration, the sample was subjected to protein digestion using the Rapigest SF method with trypsin gold. A total of 50 µg of protein was pipetted into a 0.5 mL Eppendorf low-binding tube and brought to a final volume of 116 µL with ULC/MS grade water (Biosolve BV, Valkenswaard, the Netherlands). Subsequently, 15 µL of 10% acetonitrile (ACN) (Biosolve BV, Valkenswaard, the Netherlands), 15 µL of 1% Rapigest SF Surfactant (Waters, Milford, US) diluted in 50 mM ammonium bicarbonate (Sigma Aldrich, Saint Louis, US), and 1.5 µL of 0.5 M dithiothreitol (DTT) (Sigma Aldrich, Saint Louis, US) were added. Next, 3 µL of 1 mg/mL alcohol dehydrogenase (ADH) (yeast; Sigma Aldrich, Saint Louis, US) dissolved in saline was added as the internal standard.
This mixture was then incubated at 60 °C for 30 min to denature the proteins and allow DTT to reduce the disulfide bonds. Next, 1.5 µL of 1 M iodoacetamide (IAA) (Sigma Aldrich, Saint Louis, US) was added, and the mixture was incubated in the dark at room temperature for 30 min to alkylate the free cysteine residues and thereby prevent reformation of the disulfide bonds. Then, 1.0 µL of trypsin gold (Promega, Madison, US) was added to digest the protein content at 37 °C overnight, after which 7.9 µL of 10% trifluoroacetic acid (TFA) (Sigma Aldrich, Saint Louis, US) was added to lower the pH, thereby inactivating the trypsin. Finally, all samples were centrifuged at 14,750×g for 2 min, and the supernatant containing the peptide mixture was pipetted into a new 0.5 mL low-binding tube for further purification.
C18 pipette-based solid phase extraction (SPE) and peptide quantification
To activate the pipette-based C18 column (Agilent Bond Elut OMIX Pipette-based SPE, Agilent Technologies, Santa Clara, US), 100 µL of 50% acetonitrile (ACN) (Biosolve, Valkenswaard, the Netherlands) was pipetted up and down and then discarded into a waste tube. After activation, the C18 tip was rinsed three times with 100 µL of 0.1% LC‒MS grade formic acid (FA) (Thermo Fisher Scientific, Waltham, US). Then, 100 µL of sample mixture was loaded onto the C18 tip and eluted into another low-binding tube. After loading, the C18 tip was washed three times, and the purified peptides were eluted into a clean low-binding tube with 100 µL of 75% ACN. Quantification of all eluted peptides was performed using the Pierce™ Quantitative Colorimetric Peptide Assay (Thermo Scientific, Rockford, US) in a 96-well plate. Subsequently, all samples were freeze-dried overnight and resolubilized in 10% ACN/0.1% trifluoroacetic acid (TFA) (Sigma Aldrich, Saint Louis, US)/89.9% ULC/MS grade water (Biosolve BV, Valkenswaard, the Netherlands) in the volume required to reach a final peptide concentration of 100 ng/µL.
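The resolubilization volume follows directly from the quantified peptide mass and the 100 ng/µL target. The short Python sketch below illustrates that arithmetic only; the function name `resolubilization_volume_ul` and the 12.5 µg example value are hypothetical and do not come from the study.

```python
def resolubilization_volume_ul(peptide_mass_ug, target_ng_per_ul=100.0):
    """Return the solvent volume (in µL) that gives the target peptide concentration.

    peptide_mass_ug  -- total peptide mass recovered after SPE, in µg
    target_ng_per_ul -- desired final concentration (100 ng/µL in the text)
    """
    if peptide_mass_ug < 0 or target_ng_per_ul <= 0:
        raise ValueError("mass must be >= 0 and target concentration must be > 0")
    # Convert µg to ng, then divide by the target concentration in ng/µL.
    return peptide_mass_ug * 1000.0 / target_ng_per_ul


# Hypothetical example: 12.5 µg of peptide quantified by the colorimetric assay.
volume = resolubilization_volume_ul(12.5)
print(f"Resolubilize in {volume} µL")  # prints "Resolubilize in 125.0 µL"
```

At 100 ng/µL, a 2 µL injection corresponds to 200 ng of peptide.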
LC‒MS/MS analysis
Mass spectrometric analysis was carried out on a TIMS-TOF Pro (Bruker, Bremen, Germany) instrument equipped with an Ultimate 3000 nanoRSLC UHPLC system (Thermo Scientific, Germering, Germany) [29]. Specifically, a total of 200 ng (2 µL) of peptide per sample was injected onto a C18 column (75 µm × 250 mm, 1.6 µm particle size; Aurora, IonOpticks, Fitzroy, Australia) heated to 50 °C.
The sample was loaded at 400 nL/min for 2 min in 3% solvent B and separated using a multistep gradient: 6% solvent B for 55 min, 21% solvent B for 21 min, 31% solvent B for 12 min, 42.5% solvent B for 3 min and 99% solvent B for 7 min (solvent A: 0.1% formic acid in water, solvent B: 0.1% formic acid in acetonitrile). Analysis of the eluted peptides was performed using a time-of-flight mass spectrometer with a collision energy of 20–59 eV. The precursor scan ranged from 100 to 1700 m/z, and the TIMS range was 0.6–1.6 V·s/cm² in PASEF mode [29].
To monitor the accuracy of the mass spectrometer, a quality control (QC) sample was prepared by pooling 2 µL of all patient samples in one vial. After every 10 sample injections, the system was injected with this QC sample, allowing any changes in the absolute peak intensities to be observed over the time it took for the system to measure the 60 patient samples and 7 QC samples, thereby controlling for technical variation.
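As an illustration of such an injection sequence, the Python sketch below builds a run order with a QC injection after every 10 samples. To arrive at 7 QC injections for 60 samples, it assumes one additional QC injection at the start of the sequence; that leading QC, the function name `build_run_order`, and the sample labels are assumptions made for this example and are not stated in the study.

```python
def build_run_order(sample_ids, qc_every=10, qc_label="QC", leading_qc=True):
    """Interleave QC injections into a list of sample injections.

    A QC injection is placed after every `qc_every` samples. If `leading_qc`
    is True (an assumption of this sketch), the sequence also starts with a QC.
    """
    order = []
    if leading_qc:
        order.append(qc_label)
    for count, sample_id in enumerate(sample_ids, start=1):
        order.append(sample_id)
        if count % qc_every == 0:
            order.append(qc_label)
    return order


# Hypothetical labels for the 60 patient samples.
samples = [f"S{i:02d}" for i in range(1, 61)]
run_order = build_run_order(samples)
print(len(run_order), run_order.count("QC"))  # prints "67 7"
```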
Normalization, data analysis and statistics
Validation of the dataset is critical prior to analysis. A quality control sample was included in the TIMS-TOF analysis to correct for machine bias, and a spike-in protein (ADH) was added to each sample to correct for inter- and intra-peak variation. The intensities of the QC samples were tracked over time (Supplemental Fig. 2). The trend in QC intensity over time was modeled with a trend line that was manually set to a second-order polynomial function, which allowed normalization of the entire dataset (Supplemental Fig. 3).
Spectral analysis was performed using SearchGUI (version 4.1.1) and PeptideShaker (version 2.2.0). Mass spectrometry protein identification was performed using MaxQuant software (version 1.6.14.0) and the Human Proteome Database from UniProt (© 2002–2023 UniProt consortium), which was used in combination with reversed decoy protein sequences for false discovery rate (FDR) estimation of protein identification [30].
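A QC-based drift correction of this kind can be sketched as follows. This Python example is not the authors' code (their normalization was done in R, see supplementary file 1); it is a minimal sketch that assumes the correction divides each intensity by the fitted second-order trend at its injection position and rescales to the mean QC intensity. The function names `fit_quadratic` and `drift_correct` and all numbers are hypothetical.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c).

    Needs at least three distinct x values, otherwise the system is singular.
    """
    # Normal equations (X^T X) coef = X^T y with design columns [x^2, x, 1].
    rows = [[xi * xi, xi, 1.0] for xi in x]
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting on the 3x4 augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        rest = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - rest) / m[i][i]
    return tuple(coef)


def drift_correct(intensities, positions, qc_positions, qc_intensities):
    """Divide each intensity by the QC trend at its position, rescaled to the mean QC level."""
    a, b, c = fit_quadratic(qc_positions, qc_intensities)
    reference = sum(qc_intensities) / len(qc_intensities)
    corrected = []
    for value, pos in zip(intensities, positions):
        trend = a * pos * pos + b * pos + c
        if trend <= 0:
            raise ValueError("fitted trend must be positive to normalize")
        corrected.append(value * reference / trend)
    return corrected


# Hypothetical QC intensities drifting downward over the injection sequence.
qc_positions = [0, 11, 22, 33, 44, 55, 66]
qc_intensities = [1000.0 - 2.0 * p + 0.01 * p * p for p in qc_positions]
corrected_qc = drift_correct(qc_intensities, qc_positions, qc_positions, qc_intensities)
print(corrected_qc)  # after correction the QC intensities are (nearly) constant
```

Rescaling to the mean QC intensity keeps the corrected values on the original intensity scale; dividing by the trend alone would give unitless ratios instead.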
The settings in MaxQuant were as follows: trypsin/P digestion enzyme, allowing for a maximum of 2 missed cleavages. The variable modifications were set to oxidation (M), and the fixed modifications were set to carbamidomethyl (C). Matching between runs was enabled with a matching time window of 12 s and a matching ion mobility window of 0.05 indices as the default setting for the TIMS-DDA. For label-free quantification, both iBAQ and LFQ were enabled.
The resulting intensities were normalized in RStudio (version 2021.09.0 Build 351) with R (version 4.1.1) by applying the calculated function to the intensity matrix. Spike-in protein normalization was also performed in RStudio. The code used to normalize these files can be found in supplementary file 1. Plot visualization and t-tests were performed using Perseus (version 2.0.11). The Benjamini-Hochberg FDR method was used to correct for multiple testing, with the FDR threshold set to 0.05. Pathway analysis was performed using the pathway analysis tool of the Reactome database [31].
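For readers unfamiliar with the Benjamini-Hochberg procedure, the following Python sketch shows the standard step-up adjustment. It illustrates the method in general and is not the Perseus implementation; the function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted p-values, rejection flags) in the input order.

    The adjusted value for the p-value of rank r (1 = smallest) out of m tests
    is min over ranks >= r of p * m / rank, capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum enforces monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


# Hypothetical p-values from six per-protein t-tests.
p_values = [0.01, 0.03, 0.02, 0.005, 0.5, 0.045]
adjusted, rejected = benjamini_hochberg(p_values, alpha=0.05)
for p, q, keep in zip(p_values, adjusted, rejected):
    print(f"p={p:.3f}  adjusted={q:.3f}  significant={keep}")
```

In this example the test with p = 0.045 falls below 0.05 before correction but not after it (adjusted value 0.054), which is the situation the correction is designed to catch.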
Machine learning analysis
The eXtreme Gradient Boosting (XGBoost) algorithm was used to select the panel of proteins that best discriminated between the control group and the group that experienced a CVD event. To reduce the complexity of the analysis, a filtering step (using the ANOVA F value) was applied to select the top 50 most important proteins. A stability selection process was then applied to prevent overfitting and ensure the robustness of the results. The complete dataset was randomly divided into 20 different subsets, each consisting of 85% of the data. Within each random subset, leave-one-out cross-validation was applied: the training set included all samples but one, and the sample that was left out formed the test set. The hyperparameters of the XGBoost classifier were optimized via a randomized search with three-fold cross-validation. For each random search, 10 random parameter settings were tested. The training set consisted of 70% of the data, and the validation set contained the remaining 30%. The splits were made using a stratified shuffle split. The final performance of the model was evaluated using the area under the curve (AUC) metric. The importance of each feature in the model was determined by calculating the mean decrease in impurity, with the most important feature scaled to 100% and the importance of all other features expressed relative to it. All of these steps were performed using Python (v3.7.7) and the scikit-learn package (v0.23.1) (program code available on request). Pathway analysis was performed using Gene Ontology and Reactome applications [32].
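The resampling structure described above can be sketched in plain Python. This is not the authors' program code, and it omits the classifier and hyperparameter search entirely; it only illustrates the ANOVA F filter, the random 85% subsets with leave-one-out splits, and the scaling of importances to the top feature. All function names, the toy matrix, the 20-sample size, and the importance values are hypothetical; in scikit-learn, `f_classif` with `SelectKBest` provides the filtering step.

```python
import random
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic of one feature across the label groups."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_k_features(matrix, labels, names, k):
    """Rank the columns of `matrix` by F value and keep the k highest."""
    scores = {
        name: anova_f([row[j] for row in matrix], labels)
        for j, name in enumerate(names)
    }
    return sorted(names, key=lambda name: scores[name], reverse=True)[:k]


def random_subsets(n_samples, n_subsets=20, fraction=0.85, seed=0):
    """Draw `n_subsets` random index subsets, each holding `fraction` of the samples."""
    rng = random.Random(seed)
    size = round(n_samples * fraction)
    return [sorted(rng.sample(range(n_samples), size)) for _ in range(n_subsets)]


def leave_one_out(indices):
    """Yield (train_indices, test_indices) pairs, holding out one sample at a time."""
    for held_out in indices:
        yield [i for i in indices if i != held_out], [held_out]


def scale_importances(importances):
    """Express importances relative to the most important feature (= 100%)."""
    top = max(importances.values())
    return {name: 100.0 * value / top for name, value in importances.items()}


# Toy data: 6 samples (rows) x 3 proteins (columns); labels 0 = control, 1 = event.
names = ["P1", "P2", "P3"]
matrix = [[1, 5, 1], [2, 1, 1], [3, 3, 2], [7, 2, 2], [8, 6, 3], [9, 4, 3]]
labels = [0, 0, 0, 1, 1, 1]
selected = top_k_features(matrix, labels, names, k=2)

subsets = random_subsets(n_samples=20)  # hypothetical sample count
splits = list(leave_one_out(subsets[0]))
relative = scale_importances({"P1": 0.5, "P3": 0.25})  # hypothetical importances
print(selected, len(subsets), len(splits), relative)
```

A classifier would be trained on each training split and its held-out predictions collected to compute the AUC; that part depends on the XGBoost configuration and is left out here.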