Diversity and Impact of Rare Variants in Genes Encoding the Platelet G Protein-Coupled Receptors

Summary Platelet responses to activating agonists are influenced by common population variants within or near G protein-coupled receptor (GPCR) genes that affect receptor activity. However, the impact of rare GPCR gene variants is unknown. We describe the rare single nucleotide variants (SNVs) in the coding and splice regions of 18 GPCR genes in 7,595 exomes from the 1,000-genomes and Exome Sequencing Project databases and in 31 cases with inherited platelet function disorders (IPFDs). In the population databases, the GPCR gene target regions contained 740 SNVs (318 synonymous, 410 missense, 7 stop gain and 6 splice region) of which 70% had global minor allele frequency (MAF) < 0.05%. Functional annotation using six computational algorithms, experimental evidence and structural data identified 156/740 (21%) SNVs as potentially damaging to GPCR function, most commonly in regions encoding the transmembrane and C-terminal intracellular receptor domains. In 31 index cases with IPFDs (Gi-pathway defect n=15; secretion defect n=11; thromboxane pathway defect n=3 and complex defect n=2) there were 256 SNVs in the target regions of 15 stimulatory platelet GPCRs (34 unique; 12 with MAF<1% and 22 with MAF ≥ 1%). These included rare variants predicting R122H, P258T and V207A substitutions in the P2Y12 receptor that were annotated as potentially damaging, but only partially explained the platelet function defects in each case. Our data highlight that potentially damaging variants in platelet GPCR genes have low individual frequencies, but are collectively abundant in the population. Potentially damaging variants are also present in pedigrees with IPFDs and may contribute to complex laboratory phenotypes.


Introduction
G protein-coupled receptors (GPCRs) are seven transmembrane domain proteins that mediate signal transduction from a wide range of extracellular stimuli. GPCRs are expressed widely in haematopoietic and vascular tissues, including platelets, in which they mediate activation signals from agonists such as thrombin (protease activated receptors [PAR] 1 and 4), thromboxane A2 (thromboxane A2 receptor [TP]), epinephrine (α2A-adrenoreceptor) and ADP (P2Y12 and P2Y1 receptors). Platelets also express Gs-coupled GPCRs such as the prostacyclin (IP1), adenosine 2A (A2A) and prostaglandin D2 (DP1) receptors, which mediate inhibitory signals from prostacyclin, adenosine and PGD2 respectively, to suppress platelet activation.
Platelet GPCR activity varies between individuals within the population, in part because of common genetic sequence variants (minor allele frequency [MAF] ≥ 1.0 %) near or within GPCR genes. Examples include the variant rs1472122 (downstream of the P2Y12 gene P2RY12), which affects ADP-induced platelet fibrinogen binding and P-selectin exposure (1) and the variant rs4311994 (downstream of the α2A-adrenoreceptor gene ADRA2A), which affects epinephrine-induced platelet aggregation (2). Similar associations have been demonstrated between common variants in the PAR1 (F2R), PAR4 (F2RL3) and the TP receptor (TBXA2R) genes and function of the corresponding GPCRs (3)(4)(5). Some common variants also influence susceptibility to cardiovascular disease and responses to anti-platelet drugs (4)(5)(6). Since the common GPCR gene variants lie exclusively in noncoding regions, these effects are most likely caused by changes in receptor expression, and not altered receptor function (4,7).
Although the evidence linking common variants near platelet GPCR genes and GPCR activity is compelling, the individual effect size of common variants is small (2). For other genes, rare (MAF < 1 %) single nucleotide variants (SNVs), with large individual effect size, provide a greater source of inter-individual genetic variation than common variants (8)(9)(10). However, for platelet GPCR genes, descriptions of rare variants affecting platelet function are restricted to SNVs in P2RY12 and TBXA2R in isolated pedigrees with inherited platelet function disorders (IPFD) (11)(12)(13)(14)(15)(16)(17). It is likely that the impact of rare GPCR gene variants in the population is much greater than implied from these limited descriptions, but this has not been confirmed by systematic analysis. In order to assess the population diversity and impact of rare SNVs in platelet GPCR genes, we have surveyed and annotated coding and splice region SNVs in public databases of 7595 individuals and in 31 cases with IPFD of unknown genetic basis.

Materials and methods

G protein-coupled receptors in human platelets
Class A GPCRs that were listed in the International Union of Basic and Clinical Pharmacology (IUPHAR) GPCR Database (Suppl. Table 1, available online at www.thrombosis-online.com) were selected for analysis if present in the Proteomics Identifications Database (PRIDE), the PlateletWeb resource (Suppl. Table 1, available online at www.thrombosis-online.com) and in the human platelet transcriptome with > 1.0 reads per kilobase of exon model per million mapped reads (18).

GPCR gene variations in population datasets
We identified coding sequence and splice region (from 3 exonic to 8 intronic nucleotides flanking the exon-intron boundaries) SNVs in the GPCR gene shortlist in the April 2012 Integrated Variant Set release of the 1,000 Genomes project and the NHLBI Exome Sequencing Project (ESP) dataset release number ESP6500, accessed through Ensembl Variation 74 (H. sapiens Short Variation GRCh37.p13 dataset) using the BioMart tool (Suppl. Table 1, available online at www.thrombosis-online.com). Nucleotide variations were annotated to the consensus coding sequence (CCDS) database transcript of each platelet GPCR.
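The splice region definition above (3 exonic to 8 intronic nucleotides flanking each exon-intron boundary) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function names, the 1-based inclusive coordinates and the example exons are all hypothetical, and the sketch assumes that the outer ends of the first and last exon are not splice sites.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive coordinates (assumption)


def splice_region_windows(exons: List[Interval],
                          exonic: int = 3,
                          intronic: int = 8) -> List[Interval]:
    """Return one window per internal exon-intron boundary.

    Each window covers `exonic` bases inside the exon and `intronic`
    bases in the adjacent intron. `exons` must be sorted, non-overlapping
    and belong to a single transcript.
    """
    windows = []
    last = len(exons) - 1
    for i, (start, end) in enumerate(exons):
        if i > 0:  # boundary with the intron on the lower-coordinate side
            windows.append((start - intronic, start + exonic - 1))
        if i < last:  # boundary with the intron on the higher-coordinate side
            windows.append((end - exonic + 1, end + intronic))
    return windows


def in_splice_region(pos: int, windows: List[Interval]) -> bool:
    """True if a variant position falls inside any splice region window."""
    return any(lo <= pos <= hi for lo, hi in windows)


# Hypothetical two-exon transcript, for illustration only.
example_windows = splice_region_windows([(100, 200), (300, 400)])
```

For a gene whose coding sequence lies in a single exon, the function returns no windows, so only coding variants would be retained.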

GPCR gene variations in inherited platelet function disorders
Genomic DNA was isolated from peripheral venous blood from a representative sub-group of 31 unrelated cases with IPFD recruited at UK Haemophilia Comprehensive Care Centres to the Genotyping and Phenotyping of Platelets (GAPP) study (ISRCTN 77951167, UK REC 06/MRE07/36) according to previously reported eligibility criteria (19). For all cases, platelet function was evaluated using light transmission aggregation and ATP secretion assays using nine agonists at least two weeks after exposure to 19 drugs known to affect platelet function (19,20). Genomic DNA was enriched for the target GPCR genes using either a custom-made bait library for platelet genes (21) or the Agilent SureSelect All Exon 50Mb kit (Agilent Technologies, Wokingham, UK). Sequence data were captured using an Illumina HiSeq 2000 analyser (Illumina Inc., San Diego, CA, USA). Sequence reads were mapped to the reference genome GRCh37.p11, Feb 2009 and SNVs were annotated to the consensus CCDS records using the ANNOVAR tool (Suppl. Table 1, available online at www.thrombosis-online.com). Since the IPFD cases all showed reduced platelet responses, we analysed stimulatory platelet GPCRs and excluded the Gs-coupled inhibitory GPCRs (the IP1, DP1 and A2A receptors). All potentially damaging SNVs were confirmed by PCR amplification of individual exons and direct cycle sequencing.

Functional annotation of GPCR gene variants using computational algorithms
SNVs that were identified in population databases and in cases with IPFD were classified according to sequence ontology terminology used in Ensembl release 74 (Suppl. Table 1, available online at www.thrombosis-online.com). The likely pathogenicity of each SNV was determined using the MAPP, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP prediction tools on the PredictSNP server (Suppl. Table 1, available online at www.thrombosis-online.com). SNVs were classified as potentially damaging if identified as 'damaging' by the PredictSNP meta-analysis tool with a consensus likelihood of > 0.5 (22). Splice region variants were analysed using the Human SpliceFinder tool (Suppl. Table 1, available online at www.thrombosis-online.com) and were classified as potentially damaging if the difference between the splice site prediction scores of the wild-type and variant sequences exceeded 30 % of the wild-type score (23).
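The two computational decision rules described above can be written out explicitly. The sketch below only encodes the thresholds stated in the text. The function names and argument layout are hypothetical and do not reflect the actual output format of PredictSNP or Human SpliceFinder, and the use of an absolute score difference for splice variants is an assumption.

```python
def damaging_missense(consensus_call: str, consensus_likelihood: float) -> bool:
    """Missense rule: consensus call is 'damaging' with likelihood > 0.5."""
    return consensus_call == "damaging" and consensus_likelihood > 0.5


def damaging_splice(wild_type_score: float, variant_score: float) -> bool:
    """Splice rule: score difference exceeds 30 % of the wild-type score.

    The absolute difference is used here (assumption), so both loss and
    gain of predicted splice site strength would be flagged.
    """
    if wild_type_score <= 0:
        raise ValueError("wild-type score must be positive")
    return abs(wild_type_score - variant_score) > 0.30 * wild_type_score
```

With these rules, a hypothetical splice variant that lowers a score from 80 to 50 would be flagged (difference 30, threshold 24), whereas a drop from 80 to 70 would not.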

Manual functional annotation of GPCR gene variants
Missense SNVs were also annotated using a manual strategy in which variants were classified as potentially damaging if any of the following criteria were met:
• The substituted amino-acid was within a functional GPCR sequence motif identified in UniProt (Suppl. Tables 1 and 2, available online at www.thrombosis-online.com).
• The substituted amino-acid, expressed in Ballesteros-Weinstein nomenclature (24), contributed to inter-helical interactions, the ligand binding pocket or to the G-protein binding sites in the consensus Class A GPCR structure (25).
• There was published experimental evidence of a change in GPCR function from site-directed mutagenesis in a heterologous system, determined from the GPCRDB resource (Suppl. Table 1, available online at www.thrombosis-online.com).
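For readers unfamiliar with Ballesteros-Weinstein nomenclature, the short sketch below shows how a position label is derived: the most conserved residue in transmembrane helix X is labelled X.50, and other residues in that helix are numbered by their sequence offset from it. The function name is hypothetical. The example assumes that P2Y12 R122 is the arginine of the DRY motif, which is position 3.50 in Class A GPCRs, and that the motif residues are contiguous.

```python
def ballesteros_weinstein(helix: int, residue: int, reference_residue: int) -> str:
    """Label a residue relative to the helix reference position X.50.

    `reference_residue` is the sequence number of the most conserved
    residue in the helix for the receptor of interest.
    """
    return f"{helix}.{50 + residue - reference_residue}"


# Assumed reference: P2Y12 R122 as the DRY-motif arginine (3.50).
dry_labels = [ballesteros_weinstein(3, n, 122) for n in (121, 122, 123)]
```

Expressing substitutions this way allows residues from different receptors to be compared against the same positions in the consensus Class A structure.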

Analysis of the P2Y12 R122H and V207A variants in transfected cells
R122H and V207A HA-tagged human P2Y12 constructs were generated by site-directed mutagenesis (Eurofins MWG Operon, Ebersberg, Germany) and were transfected into either HEK293 or 1321N1 cells according to previously described methods (17). Cell surface P2Y12 expression in the transfected cells was determined by enzyme-linked immunosorbent assay (ELISA) and by immunofluorescence microscopy using murine anti-HA antibody (HA-11) as described previously (17). P2Y12 receptor function was measured by incubating the transfected cells with 1 µM forskolin (Sigma-Aldrich, Gillingham, UK) to increase basal cAMP levels. The cells were then incubated with ADP at concentrations from 50 µM to 10 nM before residual cAMP concentrations were determined in cell lysates by ELISA (Sigma-Aldrich cAMP Enzyme Immunoassay Kit, Gillingham, UK).

Jones et al. Variation in platelet G protein-coupled receptors

Identification of GPCRs in human platelets
Using the IUPHAR database, we identified 18 Class A GPCRs with robust evidence of expression in human platelets at transcript and protein levels. The coding regions of the 18 GPCR genes had a median length of 1121 bp (interquartile range [IQR] 1043-1248) and a median GC content of 56.4 % (IQR 49.0-64.6; ▶ Table 1).
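Summary statistics of this kind can be reproduced from coding sequences with the Python standard library alone. The sketch below is illustrative only: the function names are hypothetical, and the quartile convention (the "inclusive" method of `statistics.quantiles`) is an assumption, since the text does not state which method was used.

```python
from statistics import quantiles
from typing import Sequence, Tuple


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def median_iqr(values: Sequence[float]) -> Tuple[float, Tuple[float, float]]:
    """Return the median and the (Q1, Q3) interquartile range."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)
```

Applying `gc_content` to each coding sequence and passing the results to `median_iqr` would give a per-gene-set summary in the same form as the figures quoted above.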

Predicting the functional impact of GPCR gene variants
We used both computational and manual annotation to assess whether missense SNVs in the GPCR gene target regions were potentially damaging to GPCR function. Computational annotation using the PredictSNP server enabled meta-analysis of predictions from six tools that utilise trained decision (PhD-SNP, Polyphen-2 and SNAP), evolutionary conservation (SIFT), physicochemical (MAPP) and expert rule (Polyphen-1) algorithms to generate a consensus likelihood of pathogenicity for each SNV (22). Using this strategy, 122 (30 %) of the 410 missense SNVs in the GPCR gene target regions were classified as potentially damaging (Suppl. Table 3, available online at www.thrombosis-online.com). None of the six splice region SNVs were predicted by computation to disrupt transcript splicing.
Our manual annotation strategy classified missense SNVs as potentially damaging if the predicted amino-acid substitution affected a functional GPCR sequence motif or a critical residue in the consensus or specific GPCR crystal structures or if previous experimental mutagenesis of the residue caused loss of receptor function. This identified 60 (15 %) of the 410 missense SNVs in the GPCR gene target regions as potentially damaging (Suppl. Table 3, available online at www.thrombosis-online.com). Seven stop-gain SNVs were also classified as potentially damaging by manual annotation since they predicted protein truncation.
The total number of all classes of SNV that were classified as potentially damaging by either computational or manual annotation was 156 (21 % of all SNVs; Suppl. Table 3, available online at www.thrombosis-online.com). Forty missense SNVs were classi-

Figure caption: A) The total number of unique SNVs in each GPCR gene found in the population datasets, subdivided according to whether missense, synonymous or stop-gain/splice region. B) The total number of unique SNVs in the population datasets that were classified as potentially damaging, subdivided according to whether missense or stop-gain.

Distribution of damaging missense GPCR gene variants
The 149 potentially damaging missense SNVs were represented in all of the 18 selected GPCR genes (▶ Figure 1 B) and predicted amino-acid substitutions that were more common in the TM domains and C-terminal intracellular region (CT) than other regions (▶ Figure 2). Twenty-five SNVs predicted amino-acid substitutions at sites shown in the consensus Class A GPCR structure to contribute to inter-helical interactions between the TM domains.
A further 15 were in regions implicated in G protein interactions, five were in the helical regions of consensus ligand binding pockets, three were located in D/NPXXY motifs and one was in an E/DRY motif (▶ Table 2).

Characteristics of patients with inherited platelet function disorders
The IPFD collection comprised 31 unrelated cases (11 males and 20 females; age range 6-82 years) with abnormal platelet function determined by light transmission aggregation and ATP release assays (19,20). The collection comprised cases in which the main laboratory defect was within the Gi-pathway (n=15), secretion pathway (n=11) and thromboxane synthesis pathway (n=3) according to previous diagnostic criteria (19). Two cases showed complex defects that could not be classified. This collection was selected as a representative sub-group of a larger collection of 111 previously reported cases with inherited platelet function disorders enrolled into the UK GAPP study and showed a similar distribution of pathway defects to the group as a whole (19).

GPCR gene variations in cases with inherited platelet disorders
Among the 31 cases with IPFD, we identified 256 SNVs in the target regions of the genes encoding the stimulatory platelet GPCRs PAR1, P2Y12, TPα, LPA5, CXCR4, PAR4, P2Y1, α2A-adrenoceptor, CCR4, V1A receptor, PAF receptor, FPR1, EP3 receptor, succinate receptor and 5-HT2A receptor. These comprised 38 individual SNVs of which 22 (57.9 %) were synonymous and 16 (42.1 %) were missense. There were no stop-gain or splice region SNVs. Thirty-four unique SNVs were present in the 1,000 genomes and ESP population datasets (12 with global MAF < 1 % and 22 with global MAF ≥ 1 %) and four were undocumented. Using an identical strategy to the analysis of the population datasets, we classified three heterozygous missense SNVs as potentially damaging in the IPFD cases, all within P2RY12 (▶ Table 3). A wider analysis of variants identified in other platelet genes did not identify any single candidate variants that could completely account for the platelet phenotype of each case.
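The frequency classes used above can be expressed as a simple rule on the global MAF. The sketch below is a minimal illustration with a hypothetical function name; it treats a variant absent from the population datasets as undocumented and uses the 1 % threshold stated in the text.

```python
from typing import Optional


def frequency_class(global_maf: Optional[float]) -> str:
    """Classify a variant by global minor allele frequency (as a fraction).

    None means the variant was not found in the population datasets.
    """
    if global_maf is None:
        return "undocumented"
    return "rare" if global_maf < 0.01 else "common"


# Counts reported in the text: 12 rare, 22 common and 4 undocumented.
reported_counts = {"rare": 12, "common": 22, "undocumented": 4}
total_unique = sum(reported_counts.values())
```

The three reported classes sum to the 38 individual SNVs described in the text.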

Characteristics of cases with P2Y12 variants
The P2Y12 R122H variant was identified in a female index case 1.1 (▶ Figure 3 A) with a history of prolonged bleeding from minor wounds and after a vaginal delivery. There was no abnormal bleeding after two other vaginal deliveries or after tonsillectomy. The P2Y12 P258T variant was identified in an unrelated male index case 2.1 (▶ Figure 3 D) who had experienced recurrent gastro-intestinal bleeding throughout adulthood but had no other bleeding symptoms. Platelets from case 1.1 and from case 2.1 showed normal shape change but reduced aggregation responses to 10-100 µM ADP compared to healthy controls, and aggregation that was reversible with 10 µM ADP (▶ Figure 3 B and E), indicating selective loss of P2Y12 function. Compared to control subjects, platelets from both cases also showed reduced aggregation responses to 3-30 µM epinephrine and 1 µg/ml collagen, but not to 3 µg/ml collagen, findings that are consistent with loss of P2Y12 function. However, there were also reduced aggregation responses to 0.5-1 mM arachidonic acid in case 1.1, and reduced responses to ristocetin 1.25-1.5 mg/ml and a markedly reduced response to high concentration (100 µM) epinephrine in case 2.1. The latter findings indicate that cases 1.1 and 2.1 have distinct and complex aggregation phenotypes, neither of which can be completely explained by loss of P2Y12 function. Platelets from other pedigree members 1.2 and 2.2, analysed in parallel with the respective index cases, also showed reduced aggregation responses to ADP compared with controls (▶ Figure 3 C and F). Cases 1.2 and 2.2 were subsequently shown to harbour the R122H and P258T variations respectively, but neither had abnormal bleeding symptoms.
The P2Y12 V207A variant was identified in an asymptomatic female index case 3.1 (▶ Figure 3 G) who also harboured a P2Y12 SNV on the same allele that predicted a T223R substitution and was classified as benign. Platelets from case 3.1 showed normal platelet shape change but reduced aggregation responses to 5-20 µM ADP (▶ Figure 3 H) that were within the reference interval of responses determined from a panel of 30 locally recruited healthy controls, but fell within the lowest 10th percentile of control responses. This is consistent with reduced P2Y12 function that was less pronounced than in index cases 1.1 and 2.1. Platelets from case 3.1 also showed reduced aggregation responses to 0.5-1 mM arachidonic acid, suggesting an additional platelet defect. A pedigree member 3.2 with wild-type P2Y12 showed platelet aggregation responses to ADP that were similar to control subjects (▶ Figure 3 I).

Analysis of the P2Y12 R122H and V207A variants in HEK-293 cells
Since the R122H and V207A substitutions had not been previously associated with P2Y12 receptor deficiency in humans, we examined the phenotype of these substituted P2Y12 receptors in transfected cells. Expression of P2Y12 R122H and V207A was observed predominantly at the cell surface by immunofluorescence microscopy (data not shown). When cell-surface expression was quantified by ELISA, the normalised expression levels of the substituted receptors were almost identical to that of wild-type receptor (mean ± SEM: R122H 113 % ± 14.5 % and V207A 97.5 % ± 7.9 %; ▶ Figure 4 A), indicating that neither substitution significantly affected P2Y12 receptor trafficking. When P2Y12 receptor function was tested by measuring the ability of ADP to reduce cellular cAMP levels, the substituted P2Y12 receptors showed less reduction in cAMP at ADP concentrations of 1 µM to 10 nM compared to P2Y12 wild-type (p=0.013 for R122H and p=0.019 for V207A; one-way ANOVA; ▶ Figure 4 B). These data indicate that both substitutions reduce P2Y12 function, with a weaker effect from the P2Y12 V207A substitution, consistent with the less marked platelet aggregation defect.

Discussion
We have reported the results of a unique survey of coding and splice region SNVs in 18 platelet GPCR genes from 7,595 exomes in the 1,000 genomes and ESP databases and from 31 cases with IPFD. Our main findings were that: (i) in the population databases, the GPCR gene target regions contained potentially damaging SNVs that were individually rare, but collectively numerous; (ii) the potentially damaging SNVs were diverse and were predicted to alter GPCR activity through several mechanisms; and (iii) a representative collection of cases with IPFD also had SNVs in platelet GPCR genes, including potentially damaging variants affecting P2Y12 in three cases. Our strategy for identifying potentially damaging SNVs was based on computational annotation using six bioinformatic tools with different methodologies (22), complemented by manual annotation using resources that are unique to GPCRs. These included the GPCRDB database that catalogues previous GPCR mutagenesis experiments, the high resolution structures for the PAR1, A2A and P2Y12 receptors (26-28) and the consensus structure for Class A GPCRs (25) that provides structural data for GPCRs with unsolved crystal structures. Combined computational and manual annotation has provided a valuable insight into the diversity and impact of rare variants in human GPCR genes. However, our analysis has focussed on missense rather than synonymous coding region SNVs. Since 6 % of all synonymous SNVs in the ESP exome dataset were computed to be potentially damaging, primarily through codon usage effects (10,29), our analysis is likely to have underestimated the overall burden of GPCR gene variation.
Within the ESP and 1,000 genomes databases, we found 740 SNVs in the GPCR gene target regions, of which 56 % were missense and 70 % had a global MAF < 0.05 % or were singleton records. These characteristics are similar to the entire ESP exome dataset comprising > 500,000 SNVs, of which 58 % are missense and 72 % are present in only three alleles or less (10), indicating an exome-wide abundance of rare missense variants. Our prediction that 21 % of SNVs in the GPCR gene target regions were potentially damaging, is also similar to exome-wide estimates of 17 % determined by computation (10). One noteworthy finding from our survey is that the platelet GPCR genes contained a median of 41.5 SNVs per coding region, compared with 24 SNVs per coding region exome-wide (10). The high variation rate in GPCR genes cannot be explained by differences in the length of coding region because the GPCR genes had median coding length 1121 bp, similar to the exome median of 1,100 bp (30). However, this difference could be related to GC content (31) which was 56.4 % in platelet GPCR genes compared to 51 % exome-wide (32). Consistent with this, the GPCR genes F2RL3 and PTGIR with high GC content, had more SNVs than others with lower GC content, although this trend was inconsistent across all platelet GPCRs. We also showed that SNVs classified as potentially damaging were more common in gene regions encoding GPCR TM helices and CT intracellular regions compared to other areas. This reflects the essential roles of the TM helices in maintaining GPCR tertiary structure and defining the ligand binding and G-protein interaction sites (25) and the CT intracellular regions in regulating GPCR signalling and trafficking (33).
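The length-adjusted comparison in this paragraph can be made explicit with a small calculation using only the figures quoted above. This is a rough sketch under a stated simplification: it divides a median SNV count by a median coding length, which is not the same as the median of per-gene densities, so the result is indicative only. The function name is hypothetical.

```python
def snvs_per_kb(snv_count: float, coding_length_bp: float) -> float:
    """SNVs per kilobase of coding sequence."""
    return 1000.0 * snv_count / coding_length_bp


# Figures quoted in the text.
gpcr_density = snvs_per_kb(41.5, 1121)   # platelet GPCR genes
exome_density = snvs_per_kb(24, 1100)    # exome-wide
density_ratio = gpcr_density / exome_density
```

On these figures the platelet GPCR genes carry roughly 37 SNVs per kb compared with about 22 per kb exome-wide, a ratio of about 1.7, which supports the statement that coding length alone does not account for the difference.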
Consistent with the population databases, the 31 cases with IPFD also harboured rare missense SNVs in genes encoding the stimulatory platelet GPCRs. Although these were represented in all of the target GPCR genes, the three variants that were predicted to be potentially damaging were exclusively in P2RY12 and occurred as heterozygous traits.
These included the P2Y12 P258T substitution, which occurs adjacent to Y259 in TM6 that is required for ligand binding (28). This substitution was identified and characterised previously in an unrelated IPFD pedigree who also displayed reduced platelet responses to ADP (34), identical to the phenotype in the P2Y12 P258T pedigree in our study. Since these independent data provide good evidence that the P258T variation is causally related to loss of P2Y12 receptor function, we performed no further characterisation.
The other observed P2Y12 variants had not been previously reported. These included an SNV predicting an R122H substitution within the P2Y12 DRY motif, which has multiple postulated roles in regulating receptor conformation, G-protein interactions and receptor trafficking (35). This substitution also occurs at a residue affected by a different substitution (P2Y12 R122C) in a previously reported IPFD pedigree with P2Y12 dysfunction (17). The P2Y12 V207A substitution affects a residue not previously associated with an IPFD, but which is adjacent to C208 in TM5, which has multiple interactions with TM3 and is, thereby, required for receptor structural integrity (25). Consistent with these significant structural predictions, we confirmed that the P2Y12 V207A and R122H substitutions were responsible for loss of P2Y12 receptor function by demonstrating diminished ADP-mediated reduction in cytoplasmic cAMP levels in transfected cells, which is a sensitive and highly specific measure of P2Y12 function (36). Our demonstration that cell surface expression of the substituted P2Y12 receptors was the same as wild-type suggests that both V207A and R122H disrupt function by impairing ligand binding, receptor activation or signal transduction, rather than by affecting receptor trafficking.

Figure caption fragment: The pedigree of index case 2.1, who was heterozygous for the P2Y12 P258T substitution (half black shading) showing pedigree member 2.2, also with the [...]
It is noteworthy that in the IPFD cases in this series, the R122H and P258T substitutions, and to a less pronounced extent the V207A substitution, were associated with reduced platelet responses to ADP consistent with impaired P2Y12 function. Platelet responses to low concentrations of other activating agonists were also reduced, in keeping with impaired P2Y12-mediated positive feedback from ADP released via dense granules, and similar to previous IPFD cases with loss-of-function P2Y12 variants (11,34,37). Despite this platelet phenotype, there was an inconsistent relationship between the heterozygous P2Y12 variants and abnormal bleeding, suggesting that partial loss of P2Y12 function alone is insufficient to affect haemostasis. It is also noteworthy that in all three index cases with potentially damaging P2Y12 variants, there were abnormal responses to other activating agonists that may not be explained solely by loss of P2Y12 function because of the magnitude of the other defects. Our data do not allow us to exclude a dominant negative effect from the observed heterozygous P2Y12 variants, or a further non-coding P2Y12 variant in trans that reduces expression of the other allele. However, a more plausible explanation is that in addition to a variant affecting P2Y12, the IPFD cases also harboured loss-of-function variants in other platelet genes that contributed to bleeding and to the complex laboratory phenotypes. The concept that some IPFD have complex heritability is supported by previous descriptions of pedigrees with phenotypes that are the composite effect of independent variants affecting P2Y12 and PAR1 (17) and P2Y12 and von Willebrand factor (11).
We speculate that the apparent over-representation of P2Y12 defects in our case series and in these previous reports, may reflect that clinical diagnostic LTA agonist panels have greater sensitivity for defective stimulatory GPCRs in non-redundant feedback pathways such as P2Y12, than other GPCRs, thereby introducing a selection effect.
Through a systematic analysis of platelet GPCR genes, we have highlighted the burden of rare SNVs in the general population and in selected patients with IPFDs. The variety and burden of potentially damaging SNVs in the healthy population recruited into the 1,000 genomes and ESP databases highlight the incomplete penetrance of these variants. This suggests the possibility that mild IPFDs are more frequent in the general population, but may commonly go unnoticed until a challenge, such as childbirth, surgery or initiation of antiplatelet therapy, is applied. The spread of bleeding manifestations even in patients diagnosed with IPFDs and harbouring identical SNVs also highlights the challenges of applying genetic screening with platelet function testing approaches to large populations. Rare variants affecting GPCR function may be solely responsible for the phenotypes of some isolated pedigrees with IPFD, but could also contribute to complex defects caused by variants in other platelet genes.