The ycdU gene is annotated in MG1655 as a putative membrane protein and has motifs associated with inner membrane localisation, including 8 transmembrane helices, but, like ymdE, it has not been functionally characterised. One MPEC genome encodes a truncated copy of ycdU near a contig boundary, but in this case we were able to identify the short remaining portion of the gene in another contig. Both ymdE and ycdU are encoded close to the biofilm-associated polysaccharide synthesis locus pgaABCD, and so we speculated that the presence of ymdE and ycdU in the specific MPEC core genome may indicate that the pga genes are also commonly associated with MPEC. To test this, we extracted the nucleotide sequences of the genes surrounding ymdE and ycdU from MG1655, and profiled the distribution of these genes in phylogroup A and in MPEC (Fig. 5). The data in Fig. 5 reveal some interesting aspects of the distribution of these genes in the E. coli phylogroup A population. Firstly, it is clear that a discrete region comprising the genes pgaABCD, ycdT, insF1E1 and ymdE-ycdU is absent from around ten to twenty percent of E. coli from phylogroup A. This contrasts with the core flanking genes, efeB/phoH and ghrA/ycdXYZ, which are found in almost all phylogroup A genomes. In this way, pgaABCD, ycdT, insF1E1 and ymdE-ycdU represent a region of genomic heterogeneity, with not all strains encoding these genes. Within this region of heterogeneity, and in addition to ymdE-ycdU, the pgaABCD genes also appear to be more common in MPEC than in phylogroup A. However, the pga genes are clearly not ubiquitous in MPEC.
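The profiling step described above amounts to counting, for each gene, the fraction of genomes in each group that carry it. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `gene_prevalence`, the genome identifiers and the presence/absence calls are all hypothetical toy values, and the two groups are treated as separate labels purely for illustration.

```python
from collections import defaultdict


def gene_prevalence(presence, groups):
    """Return {group: {gene: percentage of genomes in the group carrying the gene}}.

    presence: {genome_id: set of gene names detected in that genome}
    groups:   {genome_id: group label}
    """
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for genome, genes in presence.items():
        group = groups[genome]
        totals[group] += 1
        for gene in genes:
            counts[group][gene] += 1

    # Report every gene seen in any genome, so absent genes show up as 0.0.
    all_genes = sorted({gene for genes in presence.values() for gene in genes})
    return {
        group: {
            gene: round(100.0 * counts[group][gene] / totals[group], 1)
            for gene in all_genes
        }
        for group in totals
    }


# Hypothetical toy data: NOT taken from the study.
presence = {
    "genome_1": {"pgaD", "ymdE", "ycdU", "efeB"},
    "genome_2": {"ymdE", "ycdU", "efeB"},
    "genome_3": {"efeB"},
    "genome_4": {"pgaD", "ymdE", "ycdU", "efeB"},
}
groups = {
    "genome_1": "MPEC",
    "genome_2": "MPEC",
    "genome_3": "phylogroup A",
    "genome_4": "phylogroup A",
}

prevalence = gene_prevalence(presence, groups)
for group, per_gene in prevalence.items():
    print(group, per_gene)
```

In this toy input the flanking gene efeB is present in every genome, while ymdE, ycdU and pgaD are each missing from some genomes, mirroring the kind of heterogeneity discussed in the text.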
Interestingly, in both phylogroup A and MPEC, there is a trend for the pga genes to decrease in abundance compared with the downstream gene, so that whilst pgaD can be found in approximately 90% of all phylogroup

Scientific Reports | 6:30115 | DOI: 10.1038/srep
www.nature.com/scientificreports/

Gene in MG1655 | Gene name | % in MPEC | % in phy A | Product
b1028 | ymdE | 98.5 | 79.9 | undefined product
b1029 | ycdU | 98.5 | 82.0 | putative inner membrane protein
b1384 | feaR | 100 | 82.1 | transcriptional activator for tynA and feaB
b1385 | feaB | 100 | 82.2 | phenylacetaldehyde dehydrogenase
b1393 | paaF | 100 | 79.2 | 2,3-dehydroadipyl-CoA hydratase
b1394 | paaG | 100 | 76.7 | 1,2-epoxyphenylacetyl-CoA isomerase, oxepin-CoA-forming
b1395 | paaH | 98.5 | 76.0 | 3-hydroxyadipyl-CoA dehydrogenase, NAD+-dependent
b1396 | paaI | 98.5 | 76.6 | hydroxyphenylacetyl-CoA thioesterase
b1397 | paaJ | 98.5 | 76.6 | 3-oxoadipyl-CoA/3-oxo-5,6-dehydrosuberyl-CoA thiolase
b1398 | paaK | 98.5 | 76.7 | phenylacetyl-CoA ligase
b1399 | paaX | 100 | 79.2 | transcriptional repressor of phenylacetic acid degradation paa operon, phenylacetyl-CoA inducer
b1400 | paaY | 100 | 80.5 | thioesterase required for phenylacetic acid degradation; trimeric; phenylacetate regulatory and detoxification protein; hexapeptide repeat protein
b4287 | fecE | 98.5 | 66.6 | ferric citrate ABC transporter ATPase
b4288 | fecD | 100 | 67.9 | ferric citrate ABC transporter permease
b4289 | fecC | 100 | 67.9 | ferric citrate ABC transporter permease
b4290 | fecB | 100 | 68.1 | ferric citrate ABC transporter periplasmic binding protein
b4291 | fecA | 100 | 68.1 | TonB-dependent outer membrane ferric citrate transporter and signal transducer; ferric citrate extracellular receptor; FecR-interacting protein
b4292 | fecR | 100 | 68.1 | anti-sigma transmembrane signal transducer for ferric citrate transport; periplasmic FecA-bound ferric citrate sensor and cytoplasmic FecI ECF sigma factor activator
b4293 | fecI | 100 | 68. | RNA polymerase sigma-19 factor, fec ope
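The percentages in the table above can be compared directly by taking the difference, in percentage points, between the MPEC and phylogroup A columns. The sketch below does this for a handful of rows whose values are complete in the table; the function name `prevalence_gap` is hypothetical, and a simple difference is used here only as an illustration, not as the statistical comparison performed in the study.

```python
# (gene, % in MPEC, % in phylogroup A), copied from complete rows of the table.
rows = [
    ("ymdE", 98.5, 79.9),
    ("ycdU", 98.5, 82.0),
    ("feaR", 100.0, 82.1),
    ("paaF", 100.0, 79.2),
    ("fecE", 98.5, 66.6),
]


def prevalence_gap(rows):
    """Return {gene: MPEC minus phylogroup A prevalence, in percentage points}."""
    return {gene: round(mpec - phy_a, 1) for gene, mpec, phy_a in rows}


gaps = prevalence_gap(rows)

# List genes from the largest to the smallest gap.
for gene, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{gene}: {gap:+.1f} percentage points")
```

Among the rows included here, fecE shows the largest gap and ycdU the smallest, although every listed gene is more prevalent in MPEC than in phylogroup A.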