© 2003 Kluwer Academic Publishers. Printed in the Netherlands.
Regular paper
Cyanobacterial signature genes
Kirt A. Martin1, Janet L. Siefert2, Sailaja Yerrapragada1, Yue Lu1, Thomas Z. McNeill1,3, Pedro A. Moreno1, George M. Weinstock3, William R. Widger1 & George E. Fox1,∗
1Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001, USA; 2Department of Statistics, Rice University, Houston, TX 77251-1892, USA; 3Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA; ∗Author for correspondence (e-mail: fox@uh.edu; fax: +1-713-743-8351)
Received 28 June 2002; accepted in revised form 7 November 2002
Keywords: comparative genomics, cyanobacteria, signature genes
Abstract
A comparison of 8 cyanobacterial genomes reveals that there are 181 shared genes that do not have obvious orthologs in other bacteria. These signature genes define aspects of the genotype that are uniquely cyanobacterial. Approximately 25% of these genes have been associated with some function. These signature genes may or may not be involved in photosynthesis, but in many cases they likely are. In addition, several examples of widely conserved gene order involving two or more signature genes were observed. This suggests there may be regulatory processes that have been preserved throughout the long history of the cyanobacterial phenotype. The results presented here will be especially useful because they identify which of the many genes of unassigned function are likely to be of the greatest interest.
Introduction
Although claims for the earliest fossilized cyanobacteria at 3.5 Ga (Schopf and Packer 1987) have been seriously questioned (Brasier et al. 2002), there is strong agreement that membrane biomarkers in well-preserved sediments reveal the presence of cyanobacteria at 2.7 Ga (Brocks et al. 1999). It is therefore of interest to studies of the early Earth to understand what shared properties these early cyanobacteria likely had. The obvious example is the ability to carry out oxygenic photosynthesis, which is widely regarded as the cause of the rise of oxygen in the Archaean atmosphere at 2.2 Ga (Knoll 1999; Catling et al. 2001). There may, however, be other shared properties of the cyanobacteria that would have had a more subtle impact on the early Earth and which may persist even to this day.

The ability to sequence whole genomes makes it possible to examine the distribution of genes in a very detailed way. The completion of the Synechocystis 6803 genome (Kaneko et al. 2001) made the full complement of genes carried by a cyanobacterium available for the first time. In order to determine which of these genes, if any, might contribute to a unique shared cyanobacterial genotype, we compared this initial cyanobacterial genome to seven other genomes that are now available. Based on rRNA comparisons, these eight genomes represent five major lineages at the crown of the cyanobacterial tree (Turner et al. 1999) (Figure 1). Their common ancestor would predate the rise of the heterocyst, which has been dated at 2.1 Ga by geologic evidence (Amard and Bertrand-Sarfati 1997). An inter-comparison of the genomic content of these eight genomes defines a working set of signature genes (Table 1). This signature set contains 181 genes that were initially among the almost 1000 putative open reading frames found in the Synechocystis 6803 genome (Kaneko et al. 2001).
Materials and methods
The gene content of eight completely sequenced cyanobacterial genomes was examined. The genomes
Table 1. The 181 cyanobacterial signature gene set derived from the set of core genes. Column headings indicate the cyanobacterial species used in this study, and the entries in each column give the gene name of the likely ortholog detected by the analysis reported here. The gene designators are those used by the individual genome projects as of October of 2002. In some cases, e.g. Nostoc, the name consists of a contig followed by the gene number on that contig. In some cases the designator, e.g. sll0558, also indicates a relative direction of transcription. The signature genes are tabulated in the physical order in which they occur in Synechocystis sp. PCC 6803. This allows the identification of putative operons containing two or more signature genes by simply scanning the table. Signature genes included in these putative operons are tabulated in bold.

Synechocystis sp. PCC 6803    Anabaena sp. PCC 7120
sll0558    all1826
sll1399    alr0301
sll1398    all0801
slr1495    alr4075
slr1122    alr1700
slr0728    alr5367
slr0730    alr4016
slr0731    alr4017
slr0732    alr0613
slr0734    all4252
slr0737    all0329
sll0226    all4289
slr0250    alr4067
ssr0390    asr4775
sll1321    all0011
slr1780    all0216
ssr2998    asl4482
slr1796    all2716
slr1800    alr5279
sll1656    all3977/all4113
sll1654    alr4132
sll1194    alr1216
slr1306    alr4172
sll0933    all0748
ssl1972    asl4547
ssr1789    asl2354
slr1596    all1673
slr1599    alr3954
slr1600    alr3603
sll1507    alr4178
sll1752    all4871
slr2032    all3013
slr2034    alr3844
ssr3451    asr3845
smr0006    asr3846
slr2049    alr0617
sll1934    alr1356
slr1908    alr2231
slr1915    alr2980
slr1918    all3259
ssr1698    asl4369
slr1034    all4779
slr1384    all4892
slr1699    alr1085
ssr2843    asl5079
N.punctiforme
P.marinusMED4458.29765459.441200452.291406507.14341474.201139492.301594412.471670412.481671434.12–507.391877485.8331504.113600477.40–468.781390502.192412504.12646466.401078466.40858468.70715357.8542501.187160441.31167501.1221196472.341418–
1961507.12398454.431631429.131975344.22126465.5318856231530439.16281493.502047493.512048493.52204962388507.250318–
910423.49514505.361241498.71467501.531036–
1958464.53977431.22925
P.marinusSynechococcusMIT9313sp.WH81022886821210281052350119212774384293915152420278620403564204327744462347109538452764474158810791219802412201456–1752130557028301656252582126953096221925431395879119831802919352721179081941545298833882131927255315822361111033301814241311602045229620462297204722984752221–1224262226122891511914155380527163462629210389727183536777
1720
TrichodesmiumT.elongatus
erythraeum55.6548tlr159412.1088tlr032510.339tlr049332.4372tlr124010.349tll219170.7692tll2101100.511tll1109100.512tll110894.8798tlr243446.5725tll205316.1883tll172414.1557tll138813.1355tll15721.151tsr227369.7448tlr042951.6329tll149920.2700tsr0524–
tlr142110.455tll039950.6270tll03962.2628tlr22698.2226tll240940.5311tll143517.2115tlr1208–
tsr080450.6231tsr19162.2572tll074838.4925tll014629.3820tll13001.53tll17177.759tlr080615.1731tlr12498.8173tll16958.8174tsr15418.8175tsr15421.138tll169957.6669tlr11406.6850tlr23243.4146tll20442.2455tlr2348–
tll211322.3033tlr028913.1241tll16324.5055tll247611.782
tlr1507
Table1.ContinuedSynechosystisAnabaenasp.sp.PCC6803PCC7120slr1702all4574sll1578alr0529sll1577alr0528sll1926all2971sll1071all2545ssl3451asl2557sll1797all0938slr1896alr4351slr1900alr4979slr1195alr2014slr1206all2750sll1968alr3655ssl3712all7022sll1737all4804slr1834alr5154slr1841all7614slr1535all0967ssr2595asl0514sll1632alr3857slr1220alr1129sll1271alr2231sll0860alr2431sll1317all2452ssl2598asl0846smr0009all4665ssl3379all0404ssl3364asl2850sll2013all4333sll2002alr4888slr2144all0476sll1702alr2762sll0854all3378sll0851alr4291sll1162alr1535slr1263alr1370slr1990alr1215sll1902all4902sll1586all0462ssl0461asl3112sll0247all4001sll1979all4118ssl2781asr2932slr1506all0969sll1418all3076sll1414all0646slr0816alr3417slr1342alr3454sll1898
all0949
N.punctiforme
P.marinusMED4508.79779313.22053313.12053474.14–498.172939494.691295501.621857463.411555399.18592458.411242506.114–430.13766509.3511673493.571231456.7374373.16910463.50–397.9462502.591657479.571923–
636489.37773506.871061501.108118501.1381133505.201255372.10138423.22387507.621023494.32154492.31993352.13378496.21563405.5222453.531632441.11946362.221009463.651192489.111–363.3–506.2162130409.17–445.501506493.231528378.2881507.98596504.121317493.1191113
P.marinusSynechococcusMIT9313sp.WH810229593357222746212745114135319992591060902289101133921960878187091515620232266288783356820542351109818787341722260932353731011880126754837182206654260929483345350067118232948374293099225436652669877285534396082216254127524241885742340219732536583336619431394878278146011793173185513032809197392689636529221500241233221806989251353670317491221343
2904
TrichodesmiumT.elongatus
erythraeum13.1258tlr22214.5118tlr19584.5134tlr195743.5529tlr245128.3708tlr02491.258tsl242870.7710tll15623.4123tlr19342.2487tll091616.1844tlr020856.6664tll192913.1249tlr146110.338tlr18491.110tlr107312.1150tlr07316.6851tll170610.402tll1668–
tsl220816.1948tlr013623.3115tll08536.6850tlr23243.4005tll031529.3867tlr09601.235tsl138662.7131tsr138716.1906tsr10872.2613tsr182014.1574tll171112.1104tlr198634.4556tlr20159.8694tll10923.4011tll227414.1466tlr163113.1245tlr2008–
tlr240218.2224tll02003.4058tlr02073.4041tll237536.4704tsl246857.6686tlr105077.7948tsl08661.74tsr096848.5830tll171715.1653tlr207526.3452tlr113412.1036tll007720.2820tll243087.8443
tll1894
Table1.ContinuedSynechosystisAnabaenasp.sp.PCC6803PCC7120slr1978alr0484sll1390alr4100slr1470alr3414sll1382all2919sll1376alr1909sll1372alr0296slr1273alr2060slr1287alr0786slr1530alr3399slr1160all4869slr1677alr4005ssr2831asr4319slr0941all2381slr0954alr1121slr1623all1732slr1636alr4066slr1638all5165slr1645all1258slr1649all5339sll0427all3854slr0443alr3855sll0272alr3297slr0280all4343sll0258all0259sll0359alr0946ssr0657asl3656slr1926all4664slr1946all4180slr1949all4162ssl0563asr3463sll0295alr3863slr0169all3257sll0169all2707sll0157all5037sll0350alr1278slr0376all1871ssl1417asr0062slr0013alr4373sll0208alr5283sll0199all0258slr0438alr3231sll0071all2549ssr0109asl0272sll0456all3908slr0630alr3419sll0609asl4507sll0608all4508slr0418
alr4674
N.punctiforme
P.marinusMED4468.812119486.282118507.951584477.921382476.851050484.941109454.611016478.98422506.251978443.26122504.190225452.3568506.63195477.5753434.471973477.391310498.72982429.91033378.29–502.159133502.58816428.21812423.121599497.551510501.1821072430.311404464.21134366.10–501.118823507.20309372.201808464.261171493.84533457.381459492.172026506.229402472.161952498.167–468.711162497.561000502.3761454.381609–
1992508.91999473.961834455.491389455.481388382.161248
P.marinusSynechococcusMIT9313sp.WH810225451573298733873534701230122634896583372892940333523214672134932182818151041327138426921731262210075021299251078120273035463456263810274718571306198522333005340219702845357920692777686104734983752931117331653037169417371209189977044929972677308235511020201422582003349209488829623358428298627614331035–175512262170959276043229415209792409772392927
3320
TrichodesmiumT.elongatus
erythraeum15.1804tlr028421.2881tll040412.1033tll06015.6163tlr165620.2798tll189123.3109tlr140269.7432tlr07571.31tlr067226.3534tll16585.6050tll01358.8130tll155056.6628tsl156768.7417tll033688.8459tlr04281.67tll044711.756tll11339.8638tll202420.2791tll246421.2953tlr21561.186tll04441.149tll105929.3858tlr0472–
tll185032.4411tll128514.1452tll115013.1378tsl025362.7132tlr086337.4809tll229239.4981tll10523.4159tsl10137.7572tlr16912.2608tlr031121.2839tlr075815.1659tlr243322.2992tlr107524.3302tlr18771.137tsr148313.1261tlr072913.1351tll13137.7632–
10.405tsr1840–
tll041844.5634tsr158420.2779tlr07421.206tlr087220.2811tsr228420.2810tlr143735.4689
tll0771
Table1.ContinuedSynechosystissp.PCC6803sll0372ssl0353ssl0352slr0204slr0208slr0906sll0544slr0575sll0832sll0827ssr1425sll0822slr0503ssr1041sll0584slr0116ssl0546sll0288slr0304sll0286sll0662sll0661ssl1263slr0022sll0031slr0042sll0509sll1340slr1459slr0651sll0621slr1557slr1579slr1655slr1660slr1177sll1109slr0589slr0590slr0598
Anabaenasp.PCC7120all2849asl0940asr0654alr2465all1363all0138alr3444alr3596alr4394alr3827asr3137all2080alr0942asr2378alr4170alr3707asr3457alr3455alr3097alr0113alr0045alr0044asr0043alr5129alr2308alr2231alr4466asl4395all2327all4101alr3122all1338all4042all0107all4118alr3362alr3101alr3980alr3874alr2454
N.punctiforme
P.marinusMED41262243111989154680184010372023–18410071926156582512967977967467403402401807180791078812020532011771586–3771974–86232718441125
P.marinusMIT93131837103173921651449379614523463201234541710351837223407–1062378437828321183201200199299918978992973182622251236153201692195921301550283438403313382
Synechococcussp.WH8102129513261213951124327082366631225562325962815221058916969127052703171627681554155315523396767283733741289852631211728762579283092635631660275823722914
Trichodesmiumerythraeum2.261466.730526.35169.854784.835112.115334.46098.821554.649149.594924.327237.478366.731668.739215.178035.46952.24772.247538.487747.58217.75407.75397.753813.133752.64076.685080.82281.24040.525821.288419.227916.18383.4044118.97377.794816.190111.8161.18953.644129.3790
T.elongatus
472.78501.60356.8382.21379.13494.8505.39479.98493.20479.30475.30388.32501.179506.212353.4493.86504.124504.122478.108320.6506.101506.100506.99435.38481.68362.17472.37493.128481.91486.93469.12448.41479.48423.1374.6507.131509.283357.7432.18506.182tlr1952tlr1577tlr0636tll0488tlr0320tlr1530tlr1573tll0792tlr0651tlr1856tsl2457tll2172tlr2014tsl1557tll1063tll2308tlr2018tlr2016tll1363tlr1682tlr2140tlr2139tsr2138tlr0653tlr2308tlr2324tll0991tsl2214tlr2034tlr0351tlr0052tlr0610tll1913tlr2404tlr1444tlr0353tll1867tll0625tlr2271tll0958
were Synechocystis sp. PCC 6803 (3.6 Mb) (Kaneko et al. 2001), Anabaena PCC 7120 (7.2 Mb), and Thermosynechococcus elongatus BP-1 (2.6 Mb), available at http://www.kazusa.or.jp/cyano/cyano.html, and Synechococcus WH8102 (2.72 Mb), Prochlorococcus marinus MED4 (1.6 Mb), Prochlorococcus marinus MIT9313 (2.4 Mb), Nostoc punctiforme (9.2 Mb), and Trichodesmium erythraeum IMS101 (6.5 Mb), available at http://jgi.doe.gov/JGI_microbial/html/index.html. Some of these sequences are currently in draft
Figure 1. A representative phylogenetic tree of cyanobacteria showing the position of strains whose complete genome has been sequenced and used in this study. The cyanobacterial 16S rRNA tree was constructed from 1063 unambiguously aligned nucleotides under the Kimura 2-parameter model using the Neighbor Joining tree-making algorithm in BioEdit (Hall 1999). The branches are designated with order-level nomenclature (Turner et al. 1999, with modifications). Bold names indicate the position of the strains whose genomic sequences were publicly available in October of 2002.
form. At the time this work was undertaken, the Synechocystis sp. PCC 6803 genome was by far the best annotated among these and, as such, was used as a reference sequence for much of the work reported here. Except for Anabaena PCC 7120, each genome had previously been inter-compared with the other seven genomes and the results have been posted on the respective genome sites. The genes found to be common to at least seven cyanobacterial genomes were extracted and assembled into individual sequence files using the BioEdit platform (Hall 1999). Multigene sequence alignment was performed using CLUSTAL W in BioEdit (Higgins et al. 1994) and the results were examined to verify that the genes extracted from the various genomes were likely homologs or orthologs.

Each of the conserved genes was next compared against the NCBI protein database by use of BLASTP. BLASTP tables were individually examined for score and corresponding organism. Proteins with E-values < 10^-10 to species other than chloroplasts or chloroplast-containing eukaryotes were culled from the list. Because the NCBI site does not include data from the genomes of several photosynthetic bacteria, we separately examined the residual genes for affinity with genes in the genomes of Chlorobium tepidum, Rhodobacter sphaeroides, Rhodopseudomonas palustris, Rhodospirillum rubrum and Chloroflexus aurantiacus with an expectation cutoff value E < 10^-6.

We also briefly examined the 181-gene signature set for putative cyanobacterial-specific operons. The signature genes in Table 1 were arranged according to their position in the Synechocystis PCC 6803 genome. Because the gene names in the other genomes typically relate to location in the genome, we were able to quickly screen the table for sets of signature genes that were in close proximity to one another in all the genomes. The likely operons detected are shown as bold entries in Table 1.
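The culling step described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the `Hit` record, the taxon labels in `ALLOWED_GROUPS`, and the gene names are hypothetical, hits to other cyanobacteria are assumed not to disqualify a gene, and the default cutoff of 1e-10 reflects our reading of the threshold in the text. The same function could be reused for the separate photosynthetic-bacteria screen by passing `cutoff=1e-6`.

```python
from dataclasses import dataclass

# Hypothetical taxon labels; hits to these groups do not disqualify a gene.
ALLOWED_GROUPS = {"cyanobacteria", "chloroplast", "plastid_eukaryote"}


@dataclass(frozen=True)
class Hit:
    query: str     # core gene used as the BLASTP query
    group: str     # taxon label of the subject sequence (hypothetical labels)
    evalue: float  # BLASTP expectation value


def cull(core_genes, hits, cutoff=1e-10):
    """Return the core genes that have no hit below `cutoff` outside ALLOWED_GROUPS."""
    disqualified = {
        h.query
        for h in hits
        if h.evalue < cutoff and h.group not in ALLOWED_GROUPS
    }
    return [g for g in core_genes if g not in disqualified]


# Made-up hits for three placeholder genes (not data from the paper).
hits = [
    Hit("geneA", "chloroplast", 1e-40),     # allowed group, gene is kept
    Hit("geneB", "proteobacteria", 1e-25),  # strong outside hit, gene is culled
    Hit("geneC", "proteobacteria", 1e-3),   # weak outside hit, gene is kept
]
survivors = cull(["geneA", "geneB", "geneC"], hits)
print(survivors)
```

Tightening or loosening `cutoff` changes which genes survive, which is one reason the resulting signature set is somewhat sensitive to the threshold chosen.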
Results
We herein report the results of the genomic comparisons for eight cyanobacteria. The inter-comparison allowed us to identify hundreds of genes that are shared by at least seven of the genomes and therefore comprise the core (Makarova et al. 1999) of the cyanobacterial genome. Of these core genes, 181 have no obvious homolog or ortholog in non-cyanobacterial bacterial genomes (Table 1). Only 43 of these 181 signature genes have been associated with any specific functional role according to the recently revised annotation of the Synechocystis PCC 6803 genome at the Cyanobase website (http://www.kazusa.or.jp/cyano/cyano.html). For the reader's convenience, these genes are separately listed with their genetic nomenclature and an indication of the function they are associated with in Table 2. Not surprisingly, 34 of these, including many of the Photosystem I and Photosystem II subunits, are directly or indirectly involved in photosynthesis. The remaining 9 known genes are involved in other functions that may not be directly related
to photosynthesis. The overwhelming majority of the signature genes (138, or 76.2%) remain annotated as hypothetical genes in Synechocystis PCC 6803. Since equivalent genes are found in at least seven of the eight organisms, it is clear that these are actual genes of unassigned function. These hypothetical genes include 16 genes designated as 'ycf' that are frequently found in chloroplasts. These are ycf21, 23, 33, 34, 35, 36, 41, 49, 51, 52, 53, 54, 58, 60, 66, and 83.
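The core criterion (a likely ortholog in at least seven of the eight genomes) and the counts quoted above can be checked with a minimal Python sketch. The presence table, genome labels, and gene names are placeholders, and counting the reference genome itself toward the seven is an assumption; only the totals 181 and 43 come from the text.

```python
def core_genes(presence, min_genomes=7):
    """Return genes whose likely orthologs occur in at least `min_genomes` genomes.

    `presence` maps a reference gene name to the set of genomes (the reference
    genome included, by assumption) in which a likely ortholog was detected.
    """
    return sorted(g for g, genomes in presence.items() if len(genomes) >= min_genomes)


# Placeholder presence/absence data for eight genomes labelled G1..G8.
GENOMES = [f"G{i}" for i in range(1, 9)]
presence = {
    "geneA": set(GENOMES),      # found in all eight genomes
    "geneB": set(GENOMES[:7]),  # missing from one genome, still core
    "geneC": set(GENOMES[:5]),  # found in only five genomes, not core
}
core = core_genes(presence)
print(core)

# Arithmetic behind the reported breakdown of the signature set.
signature_total = 181
assigned = 43
unassigned = signature_total - assigned
percent_unassigned = round(100 * unassigned / signature_total, 1)
print(unassigned, percent_unassigned)
```

The last two values reproduce the 138 genes and 76.2% given in the text.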
Our screen of the signature set for genes of conserved proximity revealed six putative operons (Table 1). In two cases, nothing was known regarding the function of the genes. The other putative operons are: (a) a cluster of three Photosystem II genes consisting of ycf48, psbE and psbF; (b) a phycocyanin cluster containing cpcA and cpcB; (c) a cluster of three cell division associated proteins including the two septum site-determining proteins minC and minE and the non-signature gene minD; and (d) two genes, sll0608 and sll0609, that include a putative homolog of transcription factor devT.
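The proximity screen was done by eye on Table 1, but the idea can be sketched in Python. This is a hedged illustration, not the procedure actually used: the `position` parser, the `max_gap` of 3, and the function names are assumptions. The designators are read from consecutive rows of Table 1 for four of the eight genomes, so the threshold `min_genomes` is lowered to 4 in this example and the values should be treated as illustrative.

```python
import re


def position(designator):
    """Parse a gene designator into (contig, index), or None if unparsable.

    Handles 'contig.gene' names (e.g. '412.47'), three-letter names with a
    number (e.g. 'slr0730'), and bare gene numbers. Genes without a contig
    are placed on a single placeholder contig called 'chr'.
    """
    if not designator:
        return None
    m = re.fullmatch(r"(\d+)\.(\d+)", designator)
    if m:
        return (m.group(1), int(m.group(2)))
    m = re.fullmatch(r"(?:[a-z]{3})?(\d+)", designator)
    if m:
        return ("chr", int(m.group(1)))
    return None


def conserved_neighbors(row_a, row_b, max_gap=3, min_genomes=7):
    """True if the two genes lie within `max_gap` in at least `min_genomes` genomes."""
    count = 0
    for genome, name_a in row_a.items():
        pa, pb = position(name_a), position(row_b.get(genome))
        if pa and pb and pa[0] == pb[0] and abs(pa[1] - pb[1]) <= max_gap:
            count += 1
    return count >= min_genomes


# Three consecutive Table 1 rows, restricted to four genomes for illustration.
row6 = {"Synechocystis": "slr0728", "Anabaena": "alr5367",
        "Trichodesmium": "70.7692", "T. elongatus": "tll2101"}
row7 = {"Synechocystis": "slr0730", "Anabaena": "alr4016",
        "Trichodesmium": "100.511", "T. elongatus": "tll1109"}
row8 = {"Synechocystis": "slr0731", "Anabaena": "alr4017",
        "Trichodesmium": "100.512", "T. elongatus": "tll1108"}

pair_7_8 = conserved_neighbors(row7, row8, min_genomes=4)
pair_6_7 = conserved_neighbors(row6, row7, min_genomes=4)
print(pair_7_8, pair_6_7)
```

A screen of this kind only flags candidates; whether neighboring genes are actually co-transcribed would still need experimental support.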
Discussion
The comparison of eight cyanobacterial genomes allowed us to identify 181 genes that are shared by at least seven of them. These genes do not have obvious homologs or orthologs in other bacterial genomes, whether photosynthetic or not. Together, these synapomorphic genes likely account for the unique shared characteristics of the cyanobacterial phenotype and are therefore a characteristic signature (Graham et al. 2000) of the group. The relative proportion of the genes in the cyanobacterial signature set ranges from 2.6% of the total number of coding regions in the case of the large Nostoc punctiforme genome to 11.4% for the much smaller Prochlorococcus marinus MED4 genome. Since the list contains genes conserved primarily between cyanobacteria and chloroplasts, it would not be expected to include genes acquired by lateral transfer, unless such events occurred before the branching of cyanobacteria.
This first approximation of a cyanobacterial signature set will likely be subject to modification as further data emerge. The addition of cyanobacterial genomes from currently unrepresented branches may on the one hand cause some genes to be relegated to being signatures of subgroups of the cyanobacteria. An example might be genes associated with thylakoid membranes,
Table 2. Listing of 43 signature genes that have been associated with some function. This subset of signature genes is loosely grouped according to function. The table indicates the usual genetic nomenclature for each gene. The brief annotation comments were taken from the 2002 annotation of the Synechocystis sp. PCC 6803 genome, which is available at the Cyanobase website http://www.kazusa.or.jp/cyano/cyano.html.

PCC 6803 gene name   Locus   Genetic comment
slr1834              psaA    P700 apoprotein subunit Ia
ssl0563              psaC    Photosystem I subunit VII
slr0737              psaD    Photosystem I subunit II
ssr2831              psaE    Photosystem I subunit IV
ssr0390              psaK    Photosystem I subunit X
slr1655              psaL    Photosystem I subunit XI
sll0226              ycf4    Photosystem I assembly related protein
slr0906              psbB    Photosystem II core light harvesting protein
sll0851              psbC    Photosystem II CP43 protein
ssr3451              psbE    Cytochrome b559 alpha subunit
smr0006              psbF    Cytochrome b559 beta subunit
ssl2598              psbH    Photosystem II PsbH protein
smr0009              psbN    Photosystem II PsbN protein
sll0427              psbO    Photosystem II manganese-stabilizing polypeptide
sll1194              psbU    Photosystem II 12 kDa extrinsic protein
sll0258              psbV    Cytochrome c550
sll1398              psbW    Photosystem II reaction center W protein (psb13, ycf79)
slr1645              psbZ    Photosystem II 11 kD protein
slr2034              ycf48   Photosystem II stability/assembly factor
sll1418              –       Similar to photosystem II oxygen-evolving complex 23K protein PsbP
sll1317              petA    Apocytochrome f, component of cytochrome b6/f complex
sll0199              petE    Plastocyanin
sll0621              ccdA    Putative c-type cytochrome biogenesis protein CcdA
sll1578              cpcA    Phycocyanin alpha subunit
sll1577              cpcB    Phycocyanin beta subunit
slr0116              –       Phycocyanobilin:ferredoxin oxidoreductase
sll1382              –       Ferredoxin, petF-like protein
slr1459              apcF    Phycobilisome core component
ssr2595              hliB    High light-inducible polypeptide HliB
ssr1789              hliD    CAB/ELIP/HLIP-related protein HliD
slr1596              pxcA    Cytoplasmic membrane protein; light-induced proton extrusion
sll1968              pmgA    Photomixotrophic growth related protein, PmgA
sll0247              isiA    Iron-stress chlorophyll-binding protein
ssl3364              cp12    CP12 polypeptide
slr1841              –       Probable porin; major outer membrane protein
sll1271              –       Probable porin; major outer membrane protein
slr0042              –       Probable porin; major outer membrane protein
sll1321              atp1    ATP synthase protein I
sll1908              serA    D-3-phosphoglycerate dehydrogenase
slr0418              –       Putative transcription factor DevT homolog
sll0169              –       Cell division protein Ftn2 homolog
ssl0546              minE    Septum site-determining protein MinE
sll0288              minC    Septum site-determining protein MinC
as these membranes are not present in Gloeobacter. On the other hand, some of the genomes used in this study are still undergoing analysis by their annotators, and it is possible that equivalent genes have been overlooked in some cases, resulting in an incomplete signature set. Finally, it should be appreciated that what constitutes the presence of an equivalent gene in other organisms is somewhat subjective. Not only will workers disagree on appropriate choices for BLASTP cutoff values, but the program itself may give different values depending on the size of the database and type of filtering used. In addition, in some cases only a portion of a gene, e.g., a domain, may be shared.
Regardless of uncertainties in precisely defining the signature set, it is clear that one can expect such sets to exist for at least some other groups or subgroups of related organisms as well. This is especially true for groupings that have characteristic properties, such as the production of an endospore, that involve multiple genes. The large number of genes in the cyanobacterial signature set thus provides further evidence that it may eventually be possible to determine phylogeny by gene content (Ochman and Bergthorsson 1995; Fitz-Gibbon and House 1999; Snel et al. 1999) for at least some groupings in the tree of life. This is important because bacterial genomes are dynamic, and are subject to repeated events of gene acquisition and deletion (Doolittle 1999; Jain 1999). In order to unravel these events, one needs to know what defines the essence of any genome. It may also be possible to use signature sets to construct an internal history of the group under study, if lateral transfer of the signature genes is uncommon within the group.

Of the 181 signature genes, 46 are included in a set of 434 genes that have been proposed as likely candidates for interdomain horizontal gene transfer (Koonin et al. 2001). Forty-five of the forty-six have BLAST-derived best hits to either Arabidopsis genes or various chloroplast or cyanelle genomes. The remaining gene, sll0031, had an archaeal best hit (Koonin et al. 2001). This putative homology is to a region internal to sll0031 that contains a ferredoxin-type motif rather than the whole gene. The inter-domain horizontal transfer that is being detected (Koonin et al. 2001) in the 45 genes is interpreted to reflect an endosymbiotic event between cyanobacteria and eukaryotic organisms that led to the formation of the chloroplast.
The fact that the phylogenetic signal from these genes is still sufficient to define them as orthologs in extant cyanobacteria testifies to a continuing important biochemical role. This result further demonstrates that
these forty-six genes were in fact widely distributed among cyanobacteria at the time when chloroplasts came into existence. The usual assumption would be that the remaining 135 were lost following the endosymbiotic event, but they may have been transferred to the nucleus and not yet detected except in Arabidopsis. Alternatively, if the original endosymbiotic events occurred before 2.1 Ga, it is not impossible that some of these genes were simply not yet present at this earlier stage.
Although gene order has been shown to be relatively unstable (Mushegian and Koonin 1996; Siefert et al. 1997; Itoh et al. 1999), proteins that function together in a pathway or structural complex are nevertheless likely to evolve in a correlated fashion (Pellegrini et al. 1999; Huynen et al. 2000). Several of the signature genes were found to be in close proximity to another signature gene in at least seven of the eight genomes. There were six of these putative operons containing at least two signature genes. We also observed cases in which a single signature gene was repeatedly associated with genes that are found in some other photosynthetic bacteria but not non-photosynthetic bacteria. At this stage, there is an unknown number of examples of conserved gene order involving one signature gene and one or more non-signature genes. The analysis performed here would not have detected putative operons of this type. Regardless of the numbers of these, the existence of several examples of a conserved operon-like architecture among the signature genes suggests the possible existence of regulatory systems that have been shared by all cyanobacteria for over two billion years.
Two cyanobacterial signature genes, minC and minE, which encode septum site-determining proteins, could impact the coordination of nitrogen fixation and an oxygen-evolving complex. Although Synechocystis PCC 6803 does not fix nitrogen, at least two of the cyanobacterial species in this study are known to do so. In the case of sheathless, unicellular cyanobacteria, there is evidence for multiple gains and/or losses of nitrogen-fixing ability (Turner 1997). It may well be that the cyanobacterial core includes much of the underlying machinery needed for nitrogen fixation, allowing relatively rapid evolution of that capability. Thus, some groups of ancient cyanobacteria might have been able to evolve nitrogen fixation and coordinate it with photosynthesis in response to ecosystem challenges during the Archaean.
The most interesting aspect of the signature set is of course the large number of genes that have not
been assigned a function. This may simply reflect incompleteness in efforts to correlate functional studies with the genomic results. However, if one accepts the finding at face value, it clearly suggests these organisms have more shared characteristics than has been appreciated to date. In particular, either there are far more genes associated with the cyanobacterial photosynthetic processes than previously thought, or cyanobacteria possess pathways and/or other biochemical activities that are largely unknown. Given the amount of effort that has been focused on understanding photosynthesis in cyanobacteria, it is unlikely that many of these unassigned genes are directly involved in that process. It is far more likely that they are carrying out key supporting roles, such as coordinating the various activities associated with photosynthesis.

There is little direct evidence at this stage as to what these uncharacterized gene products do. Since operons frequently consist of genes that are functionally related, a more general study of conserved neighboring genes will likely provide clues as to functional roles in some cases. Likewise, examination of the putative proteins from a structural perspective might allow one to recognize characteristic features such as the ability to span membranes. In the end, the assignment of function to the unknown genes will require detailed studies of biochemical function.
One especially relevant biochemical study was a recent DNA microarray analysis of the expression patterns of all PCC 6803 genes during acclimation to high light (Hihara et al. 2001). Although the ability to observe low-expression genes was limited to some extent by ribosomal RNA contamination, more than 160 genes were classified as having one of six characteristic response patterns. Three signature genes of known function, psbO, psbV, and apcF, were initially repressed and then later increased. It is notable, however, that no other signature gene exhibited an identifiable response to the change in light intensity. Since the characteristic cyanobacterial genes are apparently not involved, one can likely expect that acclimation to light may differ considerably in the various cyanobacterial lineages. Another example of relevant data is the set of two-dimensional protein gel separations available at the Cyanobase site (http://www.kazusa.or.jp/cyano/cyano.html). Fourteen distinct protein products were found in the thylakoid membrane fraction. Four of these are products of signature genes of known function, psaC, psaE, and psbO. In addition, three signature genes of unknown function, slr1623, ssl0352 and ssr2998, are found in this fraction.
At this stage, one can only speculate on the function of the proteins encoded by the unassigned signature genes. However, instead of focusing attention on the almost 1000 hypothetical genes seen in the Synechocystis 6803 genome, the results presented here will allow photosynthesis researchers to target efforts to fewer than 140 genes that are likely to be of considerable interest.
Acknowledgements
This work was supported in part by funding from: the NIH National Human Genome Research Institute to K.M. (F31-HG00186), an NSF Postdoctoral Bioinformatics grant (9974214) and an NSF LExEn grant (0085562) to J.L.S., and grants from the National Aeronautics and Space Administration Exobiology Program (NAG5-8140 and NAG5-12366), the Institute of Space Systems Operations and the Shell Scholars Program at the University of Houston to G.E.F.
References
Amard B and Bertrand-Sarfati J (1997) Microfossils in 2000 Ma old cherty stromatolites of the Franceville Group, Gabon. Precambrian Res 81: 197–221
Brasier MD, Green OR, Jephcoat AP, Kleppe AK, van Kranendonk MJ, Lindsay JF, Steele A and Grassineau NV (2002) Questioning the evidence for Earth's oldest fossils. Nature 416: 76–81
Brocks JJ, Logan GA, Buick R and Summons RE (1999) Archean molecular fossils and the early rise of eukaryotes. Science 285: 1033–1036
Catling DC, Zahnle KJ and McKay C (2001) Biogenic methane, hydrogen escape, and the irreversible oxidation of early Earth. Science 293: 839–843
Doolittle WF (1999) Lateral genomics. Trends Cell Biol 9: M5–M8
Fitz-Gibbon ST and House CH (1999) Whole genome-based phylogenetic analysis of free-living microorganisms. Nucleic Acids Res 27: 4218–4222
Gaasterland T and Ragan MA (1998) Constructing multigenome views of whole microbial genomes. Microb Comp Genomics 3: 177–192
Graham DE, Overbeek R, Olsen GJ and Woese CR (2000) An archaeal genomic signature. Proc Natl Acad Sci USA 97: 3304–3308
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98
Hihara Y, Kamei A, Kanehisa M, Kaplan A and Ikeuchi M (2001) DNA microarray analysis of cyanobacterial gene expression during acclimation to high light. Plant Cell 13: 793–806
Higgins D, Thompson J, Gibson T, Thompson JD, Higgins DG and Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673–4680
Huynen M, Snel B, Lathe W and Bork P (2000) Predicting protein function by genomic context: quantitative evaluation and qualitative inferences. Genome Res 10: 1204–1210
Itoh T, Takemoto K, Mori H and Gojobori T (1999) Evolutionary instability of operon structures disclosed by sequence comparisons of complete microbial genomes. Mol Biol Evol 16: 332–346
Jain KK (1999) Strategies and technologies in functional genomics. Drug Discov Today 4: 50–53
Kaneko T, Nakamura Y, Wolk CP, Kuritz T, Sasamoto S, Watanabe A, Iriguchi M, Ishikawa A, Kawashima K, Kimura T, Kishida Y, Kohara M, Matsumoto M, Matsuno A, Muraki A, Nakazaki N, Shimpo S, Sugimoto M, Takazawa M, Yamada M, Yasuda M and Tabata S (2001) Complete genomic sequence of the filamentous nitrogen-fixing cyanobacterium Anabaena sp. strain PCC 7120 (supplement). DNA Res 8: 227–253
Kasting JF and Siefert JL (2001) Biogeochemistry. The nitrogen fix. Nature 412: 26–27
Knoll AH (1999) Paleontology: a new molecular window on early life. Science 285: 1025–1026
Koonin EV, Makarova KS and Aravind L (2001) Horizontal gene transfer in prokaryotes: quantification and classification. Annu Rev Microbiol 55: 709–742
Makarova KS, Aravind L, Galperin MY, Grishin NV, Tatusov RL, Wolf YI and Koonin EV (1999) Comparative genomics of the Archaea (Euryarchaeota): evolution of conserved protein families, the stable core, and the variable shell. Genome Res 9: 608–628
Mushegian AR and Koonin EV (1996) Gene order is not conserved in bacterial evolution. Trends Genet 12: 289–290
Ochman H and Bergthorsson U (1995) Genome evolution in enteric bacteria. Curr Opin Genet Dev 5: 734–738
Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D and Yeates TO (1999) Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc Natl Acad Sci USA 96: 4285–4288
Schopf JW and Packer BM (1987) Early Archean (3.3-billion to 3.5-billion-year-old) microfossils from Warrawoona Group, Australia. Science 237: 70–73
Siefert JL, Martin KA, Abdi F, Widger WR and Fox GE (1997) Conserved gene clusters in bacterial genomes provide further support for the primacy of RNA. J Mol Evol 45: 467–472
Snel B, Bork P and Huynen MA (1999) Genome phylogeny based on gene content. Nat Genet 21: 108–110
Turner S, Pryer KM, Miao VP and Palmer JD (1999) Investigating deep phylogenetic relationships among cyanobacteria and plastids by small subunit rRNA sequence analysis. J Euk Microbiol 46: 327–338