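Protein domain

A protein domain is a part of a protein's sequence and structure that can evolve, function, and exist independently of the rest of the chain. Each domain forms a compact three-dimensional structure and can often fold and be stabilized on its own. Many proteins consist of several domains, and a single domain may appear in many evolutionarily related proteins. Domains vary in length from about 25 residues to more than 500. The shortest domains, such as zinc fingers, are stabilized by metal ions or disulfide bonds. Domains often form functional units, such as the calcium-binding EF-hand domain of calmodulin. Because domains are independently stable, they can be swapped between proteins by genetic engineering to make chimeras.

Pyruvate kinase, a protein made up of three domains (PDB: 1pkn).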

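Background

The concept of the protein domain was proposed in 1973 by D. B. Wetlaufer, following X-ray crystallographic studies of lysozyme (Phillips, 1966) and papain (Drenth et al., 1968) and limited-proteolysis studies of immunoglobulins (Porter, 1973; Edelman, 1973). Wetlaufer defined domains as stable units of protein structure that can fold autonomously. Domains have been described as units of (1) compact structure^[1], (2) independent evolution and function^[2], and (3) independent folding (Wetlaufer, 1973).

Domains can combine with other domains to form multidomain proteins (Chothia, 1992). In a multidomain protein, each domain may carry out its own function independently or work in concert with its neighbors. Domains can also serve as modules for building large assemblies such as virus capsids or muscle fibers, or provide the catalytic or binding sites of enzymes and regulatory proteins.

For example, pyruvate kinase contains an all-β regulatory domain, an α/β substrate-binding domain, and an α/β nucleotide-binding domain, connected by polypeptide linkers (George and Heringa, 2002a). Each of these domains belongs to a different domain family.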

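The central α/β-barrel substrate-binding domain is one of the most common enzyme folds^[3]. It occurs in many different enzyme families that catalyze completely unrelated reactions (Hegyi and Gerstein, 1999). The fold is usually called the TIM barrel after triose phosphate isomerase, the first such structure to be solved. The CATH domain database classifies it into 26 homologous families (Orengo et al., 1997). The TIM barrel is built from a sequence of β-α-β motifs and is closed by hydrogen bonding between its first and last strands. Its evolutionary origin is debated: one view holds that it spread from a single ancestral enzyme^[4], another that it arose by convergent evolution (Lesk et al., 1989).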

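The TIM barrel in pyruvate kinase is "discontinuous", meaning that more than one segment of the polypeptide chain is needed to form the domain. This is thought to result from the insertion of one domain into another during the protein's evolution. About a quarter of the domains in known structures are discontinuous (Jones et al., 1998; Holm and Sander, 1994).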
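The idea of a discontinuous domain can be made concrete by storing a domain as a list of residue ranges rather than a single range. The following is a minimal sketch; the class name, field names, and residue numbers are hypothetical and are not taken from the pyruvate kinase structure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Domain:
    """A structural domain described by inclusive residue ranges."""
    name: str
    segments: List[Tuple[int, int]]  # (first_residue, last_residue)

    def is_discontinuous(self) -> bool:
        # More than one chain segment is needed to build the domain.
        return len(self.segments) > 1

    def length(self) -> int:
        # Total number of residues over all segments.
        return sum(end - start + 1 for start, end in self.segments)


# Hypothetical example: a barrel interrupted by an inserted domain.
barrel = Domain("barrel", [(40, 110), (220, 390)])
inserted = Domain("inserted", [(111, 219)])
```

Here `barrel` is discontinuous because the inserted domain splits it into two segments, while `inserted` is continuous.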

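Two covalently linked domains have a structural and functional advantage over the same structures associated non-covalently, because the linked form is more stable (Ghelis and Yon, 1979). Covalent linkage also helps stabilize catalytic intermediates and fixes the stoichiometric ratio of the activities involved (Ostermeier and Benkovic, 2000).

Domains as units of protein structure

→ See "Protein structure" for details.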

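Primary structure

The primary structure of a protein determines its specific three-dimensional conformation^[5]. The most important factor governing the three-dimensional structure is the distribution of polar and non-polar amino acids^[6]. Folding is driven by the burial of hydrophobic side chains in the interior of the molecule.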
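One common way to inspect the distribution of polar and non-polar residues along a sequence is a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle hydropathy scale, which is not mentioned in the text and is swapped in here purely for illustration; the function name and the window size are assumptions.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```

Windows with high positive means are candidates for buried, core-forming stretches, while strongly negative windows are more likely to be exposed. This is only a rough indicator and does not by itself identify domains.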

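Sequence alignment is an important tool for identifying domains.

Secondary structure

A typical protein has a core of hydrophobic residues surrounded by hydrophilic residues. Because the peptide bond itself is polar, in a hydrophobic environment it is neutralized by hydrogen bonding. This gives rise to regions of the polypeptide with regular three-dimensional conformations, called secondary structure. The two main types of secondary structure are the α-helix and the β-sheet.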

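Secondary structure motifs

Some simple combinations of secondary structure elements occur in many proteins and are called supersecondary structures or motifs. For example, the β-hairpin consists of two antiparallel β-strands joined by a small loop. It occurs both as an isolated ribbon and as part of more complex β-sheets. Another supersecondary structure is the β-α-β motif, which is frequently found connecting two parallel β-strands. The central α-helix connects the C-terminus of the first strand to the N-terminus of the second.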

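Tertiary structure

Several motifs pack together to form a compact, local, semi-independent unit called a domain^[1]. The overall three-dimensional arrangement of the whole polypeptide chain, by contrast, is called the protein's tertiary structure. Domains are the fundamental units of tertiary structure; each contains a hydrophobic core built from secondary structure elements connected by loops. Packing is generally tighter in the interior of a protein than at the exterior, giving a solid-like core and a fluid-like surface^[7]. Indeed, core residues are well conserved, whereas loop residues are less conserved unless they are involved in the protein's function. Protein tertiary structure can be divided into four classes according to the secondary structure content of the domain^[8].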

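All-α domains have a core built exclusively from α-helices. Most are small folds.
All-β domains have a core of antiparallel β-sheets. The strands can be arranged in various patterns, of which the Greek key is a common one^[9].
α+β domains are a mixture of all-α and all-β motifs. Because of overlap with the other classes, assignment to this class is difficult, and it is not used in the CATH domain database^[10].
α/β domains contain β-α-β motifs. Their secondary structures are arranged in layers or barrels.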
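As a rough illustration of classifying by secondary structure content, the sketch below takes a per-residue string in which `H` marks helix and `E` marks strand (a DSSP-like convention assumed here) and assigns a coarse class. The function name and the 15% threshold are hypothetical. Content alone cannot separate α+β from α/β, which depends on how the strands are connected, so both are reported as one mixed class.

```python
def secondary_structure_class(ss, min_fraction=0.15):
    """Assign a coarse class from a per-residue secondary structure string."""
    if not ss:
        raise ValueError("empty secondary structure string")
    helix = ss.count("H") / len(ss)
    strand = ss.count("E") / len(ss)
    has_helix = helix >= min_fraction    # hypothetical cut-off
    has_strand = strand >= min_fraction  # hypothetical cut-off
    if has_helix and has_strand:
        return "mixed alpha and beta"
    if has_helix:
        return "all-alpha"
    if has_strand:
        return "all-beta"
    return "few secondary structures"
```

A full classification would also need the strand topology, for example whether β-α-β units with parallel strands are present.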

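Structural alignment is an important tool for determining the class of a domain.

Domain size

Domains are limited in size^[11]. Individual domains range from 36 residues in E-selectin to 692 residues in lipoxygenase-1^[12]. However, more than 90% of domains have 200 residues or fewer, and the average is about 100 residues^[13]. Very short domains of 40 residues or fewer are stabilized by metal ions or disulfide bonds. Large domains of more than 300 residues often contain several hydrophobic cores^[14].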

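Relationship between primary and tertiary structure

Modules

"Nature is a tinkerer, not an inventor"^[15], as the saying goes. New sequences are more often adapted from existing sequences than invented from scratch. Domains are the material that nature commonly uses to generate new sequences, and in this role they are called modules. Many domains are found in all three domains of life: archaea, bacteria, and eukaryotes. Domains that occur widely in diverse proteins include those of extracellular proteins involved in clotting, fibrinolysis, and complement, and those of cell-surface adhesion molecules and cytokine receptors^[16].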

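Protein families

Studies of molecular evolution gave rise to the concept of protein families, groups of proteins with similar sequences and structures. However, sequence similarity can be extremely low even between proteins that share the same structure. Protein structures may be similar because the proteins diverged from a common ancestor, but it is not uncommon for similarity to arise because certain arrangements of secondary structure are particularly stable, or because unrelated proteins happened to converge on the same structure during evolution. At the time of writing, more than 45,000 experimentally determined three-dimensional protein structures had been deposited in the Protein Data Bank^[17]. This set contains many identical or nearly identical structures. Ideally, all proteins would be classified into families that reflect their evolutionary relationships. Because structural comparison works best at the domain level, many algorithms have been developed to classify proteins by the three-dimensional structure of their domains.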

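Superfolds

The CATH domain database classifies domains into about 800 fold families. Ten of these are highly populated and are called superfolds. A superfold is defined as a fold containing at least three structures with no significant sequence similarity^[18]. The most populated is the α/β-barrel superfold described above.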

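Multidomain proteins

The majority of proteins, about two-thirds in unicellular organisms and 80% in metazoa, are multidomain proteins. These arose through gene duplication^[19]. Many of the domains in multidomain proteins were once found in independent proteins. Many domains in eukaryotic multidomain proteins are also found in prokaryotic proteins^[20]. For example, vertebrates have a multidomain protein carrying the GAR synthetase, AIR synthetase, and GAR transformylase functions (GARs-AIRs-GARt). In insects the protein has the form GARs-(AIRs)2-GARt, in yeast GARt is separate from a GARs-AIRs protein, and in bacteria all three are separate proteins^[21].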
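The architecture notation used above, such as GARs-(AIRs)2-GARt, can be expanded into an ordered list of domains for comparison between species. The following is a minimal sketch; the function name is hypothetical, and the parser only handles the simple hyphen-separated form with an optional repeat count shown in the text.

```python
import re


def parse_architecture(architecture):
    """Expand a string like 'GARs-(AIRs)2-GARt' into a list of domain names."""
    domains = []
    for token in architecture.split("-"):
        repeat = re.fullmatch(r"\((\w+)\)(\d+)", token)
        if repeat:
            # '(AIRs)2' means the AIRs domain occurs twice in tandem.
            domains.extend([repeat.group(1)] * int(repeat.group(2)))
        else:
            domains.append(token)
    return domains


vertebrate = parse_architecture("GARs-AIRs-GARt")
insect = parse_architecture("GARs-(AIRs)2-GARt")
```

With this representation, the insect protein differs from the vertebrate one only by the tandem duplication of the AIRs domain.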

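Origin

Multidomain proteins are thought to have emerged under selective pressure to create new functions. Many proteins have diverged from common ancestors by acquiring additional domains. Modules have frequently moved within and between organisms through genetic shuffling mechanisms such as the following.

Transposition of modules, including horizontal transfer between species^[22]
Rearrangements such as insertions, translocations, deletions, and duplications
Homologous recombination
Errors by DNA polymerase during replication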

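Spread

Mechanisms such as these promote the spread of domains. Domains that are readily accommodated by many kinds of protein are found across a wide range of proteins and species. For example, the ABC transporter domain is found in all organisms and forms one of the largest domain families^[21]. Other domains found across a wide range of organisms include those of metabolic enzymes and of proteins involved in translation.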

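Types of organization

The simplest multidomain organization is a single domain repeated in tandem^[23]. Titin, a giant muscle protein of about 30,000 residues, is built from more than 120 fibronectin-III-type and Ig-type domains^[24]. In the serine proteases, gene duplication produced an enzyme with two β-barrel domains. The two copies have diverged so widely that little sequence similarity remains between them. The active site lies in a cleft between the two β-barrel domains, and the residues required for activity are contributed by both. Genetically engineered mutants of the serine protease chymotrypsin retained some protease activity even when their active-site residues were removed^[25].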

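Connectivity

Modules often show different connectivity, as illustrated by the kinesins and the ABC transporters. The kinesin motor domain can lie at either end of a polypeptide chain that also contains a coiled-coil region and a cargo domain^[26]. ABC transporters are built from up to four domains drawn from two unrelated modules, the ATP-binding cassette and an integral membrane module, arranged in various combinations.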

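Domain insertion

Domains are not only rearranged; there are also many examples of one domain inserted into another. Sequence or structural similarity to other domains shows that homologs of the inserted domain and of its parent domain can exist independently. An example is the "fingers" domain inserted into the "palm" domain of the Pol I family polymerases^[27].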

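Domains as autonomous folding units

Folding

→ See "Protein folding" for details.

History

Protein folding remains an unsolved problem. Since the work of Christian Anfinsen in the early 1960s^[5], the mechanism by which a polypeptide rapidly and spontaneously folds into its stable conformation has been posed as a major open question. Many experimental studies have addressed it, but the principles of protein folding still rest largely on those early studies. Anfinsen showed that the native state of a protein is thermodynamically stable, with the conformation at a minimum of free energy.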

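The folding process

Folding is a search of conformational space for the form in which the protein is biologically functional. Levinthal's paradox, proposed by Cyrus Levinthal, states that if an average-sized protein sampled every possible conformation before finding the one with the lowest energy, the process would take more than a billion years^[28]. Proteins typically fold within 0.1 to 1000 seconds, however, which suggests that folding follows a pathway^[29].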
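The scale of Levinthal's argument can be checked with a back-of-the-envelope calculation. The sketch below assumes a 100-residue chain, 3 conformations per residue, and 10^13 conformations sampled per second; these are commonly used illustrative figures, not values stated in the text, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def levinthal_years(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Years needed to visit every conformation once by exhaustive search."""
    conformations = states_per_residue ** n_residues
    seconds = conformations / samples_per_second
    return seconds / SECONDS_PER_YEAR


estimate = levinthal_years(100)
```

Under these assumptions the estimate is on the order of 10^27 years, far beyond the billion-year figure quoted above, so an exhaustive random search is ruled out whatever the precise parameters.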

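Experimental and theoretical advances have led folding to be viewed in terms of energy landscapes^[30], with folding kinetics seen as the progressive organization of partially folded structures. This view is summed up by the term "folding funnel": an unfolded protein has many accessible conformations, a folded protein has few, and both energy and entropy decrease as tertiary structure forms. As the chain folds, its internal free energy falls and the number of accessible conformations shrinks progressively until a single native state is reached.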

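Order of folding events

Many experiments have shown that protein folding begins with the formation of secondary structure and proceeds to the formation of three-dimensional structure, driven mainly by the hydrophobic effect^[31]. In the early stages of folding, certain regions of the polypeptide spontaneously form secondary structure elements stabilized by the hydrophobic effect^[32]. Secondary and tertiary structure then form cooperatively and concurrently^[33], a process known as nucleation-condensation^[34]. Because secondary structure elements can be combined in many ways, many folding pathways are possible, but only the combination that arranges the elements correctly yields the native state.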

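Advantage of domains in protein folding

When a large protein is organized into structural domains, each domain can fold autonomously, which narrows the combinatorial search and speeds up folding of the whole chain. Furthermore, given the distribution of hydrophobic residues along protein chains^[35], domain formation lets a large protein bury its hydrophobic residues while keeping hydrophilic residues at the surface^[36]. However, the role of inter-domain interactions in folding and in the energetics of stabilization differs from protein to protein. In T4 lysozyme, the influence of one domain on the other is so strong that the whole molecule is resistant to proteolysis. In this case folding is sequential: the C-terminal domain must fold at an early stage, and the rest of the protein requires the folded C-terminal domain in order to fold^[37].

It has also been reported that an isolated domain folds faster than the same domain integrated within the whole protein^[38]. The rate-limiting step in folding is thought to be the pairing of the folded domains^[14]. This may be because the domains are not yet folded entirely correctly, or because the small adjustments needed for their interaction, such as removing water molecules from the domain surface, are energetically unfavorable^[39].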

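Domains and quaternary structure

Quaternary structure

→ See "Quaternary structure" for details.

Many proteins have a quaternary structure, in which several polypeptide chains, called subunits, associate into an oligomeric molecule. Hemoglobin, for example, consists of two α subunits and two β subunits. Each of the four chains has an all-α globin fold with a heme pocket.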

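Domain swapping

Domain swapping is a mechanism for forming oligomers^[40]. In domain swapping, a secondary or tertiary structure element of a monomeric protein is replaced by the same element from another copy of the protein. The swapped element can range from a single secondary structure element to a whole structural domain. Domain swapping is also a model for the evolution of functional adaptation by oligomerization, as in oligomeric enzymes whose active sites lie at subunit interfaces^[41].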

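Domains and protein flexibility

The presence of multiple domains gives a protein flexibility and mobility. One of the largest domain motions observed is the swiveling mechanism of pyruvate phosphate dikinase. The phosphohistidine domain swivels between the other two domains to carry a phosphate group from the active site of the nucleotide-binding domain to the phosphoenolpyruvate/pyruvate domain^[42]. The phosphate group moves about 45 Å, and the domain rotates by about 100°. Domain motions are important for the following^[43].

Catalysis
Regulation of activity
Transport of metabolites
Formation of protein assemblies
Cellular locomotion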

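In enzymes, the closure of one domain onto another captures the substrate. Such motions can be observed by comparing three-dimensional crystal structures determined in different environments, or by nuclear magnetic resonance spectroscopy. A detailed analysis by Gerstein et al. classified domain motions into two basic types, hinge and shear. It also showed that only a small part of the chain, the inter-domain linker, undergoes a large conformational change when the domains rearrange^[44].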
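When two structures of the same protein are compared, a rigid-body domain motion is often summarized by the rotation that superimposes the domain in one structure onto the other. Given that 3x3 rotation matrix, the rotation angle follows from its trace. The sketch below assumes the matrix has already been obtained by some superposition method, which is not shown; the function names are hypothetical.

```python
import math


def rotation_angle_degrees(rotation):
    """Rotation angle of a 3x3 rotation matrix, from trace = 1 + 2*cos(angle)."""
    trace = rotation[0][0] + rotation[1][1] + rotation[2][2]
    cosine = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard against rounding
    return math.degrees(math.acos(cosine))


def rotation_about_z(angle_degrees):
    """Build a test rotation matrix about the z axis."""
    t = math.radians(angle_degrees)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]


angle = rotation_angle_degrees(rotation_about_z(100.0))
```

The test matrix here is synthetic; a rotation of about 100° is used only because it matches the magnitude quoted for pyruvate phosphate dikinase.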

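Hinges formed by secondary structures

A study by Hayward^[45] found that the termini of α-helices and β-sheets form hinges in many cases. In many hinges, two secondary structure elements act like the hinges of a door, allowing an opening and closing motion. α-Helices that keep their hydrogen bonding during this motion store "elastic energy" that drives domain closure and allows rapid capture of the substrate^[45].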

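Helical to extended conformation

Interconversion between helical and extended conformations at a domain boundary is not uncommon. In calmodulin, the torsion angles change for five residues in the middle of the α-helix that links the domains. The helix splits into two smaller, almost perpendicular helices separated by four residues of extended chain^[46].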

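Shear motions

Shear motions involve small sliding movements at domain interfaces. Proteins that display shear motions often have a layered architecture of stacked secondary structures. The inter-domain linker serves to keep the domains in close proximity.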

References

[1] Richardson, J. S. (1981). "The anatomy and taxonomy of protein structure". Adv Protein Chem, 34:167-339.
[2] Bork, P. (1991). "Shuffled domains in extracellular proteins". FEBS Lett, 286:47-54.
[3] Banner, D. W., Bloomer, A. C., Petsko, G. A., Phillips, D. C., Pogson, C. I., Wilson, I. A., Corran, P. H., Furth, A. J., Milman, J. D., Offord, R. E., Priddle, J. D., and Waley, S. G. (1975). "Structure of chicken muscle triose phosphate isomerase determined crystallographically at 2.5 angstrom resolution using amino acid sequence data". Nature, 255:609-614.
[4] Copley, R. R. and Bork, P. (2000). "Homology among (betaalpha)(8) barrels: implications for the evolution of metabolic pathways". J Mol Biol, 303:627-641.
[5] Anfinsen, C. B., Haber, E., Sela, M., and White, Jr, F. H. (1961). "The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain". Proc Natl Acad Sci U S A, 47:1309-1314.
[6] Cordes, M. H., Davidson, A. R., and Sauer, R. T. (1996). "Sequence space, folding and protein design". Curr Opin Struct Biol, 6:3-10.
[7] Zhou, Y., Vitkup, D., and Karplus, M. (1999). "Native proteins are surface-molten solids: application of the Lindemann criterion for the solid versus liquid state". J Mol Biol, 285:1371-1375.
[8] Levitt and Chothia, 1976
[9] Hutchinson and Thornton, 1993
[10] Orengo et al., 1997
[11] Savageau, 1986
[12] Jones et al., 1998
[13] Siddiqui and Barton, 1995
[14] Garel, 1992
[15] Jacob, 1977
[16] Campbell and Downing, 1994
[17] http://www.pdb.org/
[18] Orengo et al., 1994
[19] Apic, G., Gough, J., and Teichmann, S. A. (2001). "Domain combinations in archaeal, eubacterial and eukaryotic proteomes". J Mol Biol, 310:311-325.
[20] Davidson et al., 1993
[21] Henikoff et al., 1997
[22] Bork, P. and Doolittle, R. F. (1992). "Proposed acquisition of an animal protein domain by bacteria". Proc Natl Acad Sci U S A, 89:8990-8994.
[23] Heringa, 1998
[24] Politou, A. S., Gautel, M., Improta, S., Vangelista, L., and Pastore, A. (1996). "The elastic I-band region of titin is assembled in a 'modular' fashion by weakly interacting Ig-like domains". J Mol Biol, 255:604-616.
[25] McLachlan, 1979
[26] Moore and Endow, 1996
[27] Russell, 1994
[28] Levinthal, 1968
[29] Dill, 1999
[30] Leopold et al., 1992; Dill and Chan, 1997
[31] Dobson, C. M. and Karplus, M. (1999). "The fundamentals of protein folding: bringing together theory and experiment". Curr Opin Struct Biol, 9:92-101.
[32] Dyson et al., 1992; Yang and Honig, 1995b; Yang and Honig, 1995a
[33] Kim and Baldwin, 1990
[34] Fersht, 1997
[35] White and Jacobs, 1990
[36] George and Heringa, 2002b; George et al., 2005
[37] Desmadril, M. and Yon, J. M. (1981). "Existence of intermediates in the refolding of T4 lysozyme at pH 7.4". Biochem Biophys Res Commun, 101:563-569.
[38] Teale and Benjamin, 1977
[39] Creighton, T. E. (1983). Proteins: Structures and molecular properties. Freeman, New York. Second edition.
[40] Bennett, M. J., Schlunegger, M. P., and Eisenberg, D. (1995). "3D domain swapping: a mechanism for oligomer assembly". Protein Sci, 4:2455-2468.
[41] Heringa and Taylor, 1997
[42] Herzberg et al., 1996
[43] Gerstein et al., 1994
[44] Janin, J. and Wodak, S. J. (1983). "Structural domains in proteins and their role in the dynamics of protein function". Prog Biophys Mol Biol, 42:21-78.
[45] Hayward, 1999
[46] Meador et al., 1992; Ikura et al., 1992

Key papers

Bastian, H. C. (1872). The beginnings of life: being some account of the nature, modes of origin and transformation of lower organisms. Macmillan and Co., England.
Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., and Bourne, P. E. (2000). "The Protein Data Bank". Nucleic Acids Res, 28:235-242.
Branden, C.-I. and Tooze, J. (1991). Introduction to protein structure. Garland, New York.
Campbell, I. D. and Downing, A. K. (1994). "Building protein structure and function from modular units". Trends Biotech, 12:168-172.
Chothia, C. (1992). "Proteins. One thousand families for the molecular biologist". Nature, 357:543-544.
Das, S. and Smith, T. F. (2000). "Identifying nature's protein Lego set". Adv Protein Chem, 54:159-183.
Davidson, J. N., Chen, K. C., Jamison, R. S., Musmanno, L. A., and Kern, C. B. (1993). "The evolutionary history of the first three enzymes in pyrimidine biosynthesis". Bioessays, 15:157-164.
Dietmann, S., Park, J., Notredame, C., Heger, A., Lappe, M., and Holm, L. (2001). "A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3". Nucleic Acids Res, 29:55-57.
Dill, K. A. and Chan, H. S. (1997). "From Levinthal to pathways to funnels". Nat Struct Biol, 4:10-19.
Dill, K. A. (1999). "Polymer principles and protein folding". Protein Sci, 8:1166-1180.
Drenth, J., Jansonius, J. N., Koekoek, R., Swen, H. M., and Wolthers, B. G. (1968). "Structure of papain". Nature, 218:929-932.
Dyson, H. J., Sayre, J. R., Merutka, G., Shin, H. C., Lerner, R. A., and Wright, P. E. (1992). "Folding of peptide fragments comprising the complete sequence of proteins. Models for initiation of protein folding. II. Plastocyanin". J Mol Biol, 226:819-835.
Edelman, G. M. (1973). "Antibody structure and molecular immunology". Science, 180:830-840.
Fersht, A. R. (1997). "Nucleation mechanisms in protein folding". Curr Opin Struct Biol, 7:3-9.
Garel, J. (1992). "Folding of large proteins: Multidomain and multisubunit proteins". In Creighton, T., editor, Protein Folding, pages 405-454. W.H. Freeman and Company, New York, first edition.
George, D. G., Hunt, L. T., and Barker, W. C. (1996). "PIR-international protein sequence database". Methods Enzymol, 266:41-59.
George, R. A. (2002). "Predicting Structural Domains in Proteins". Thesis, University College London.
George, R. A. and Heringa, J. (2002a). "An analysis of protein domain linkers: their classification and role in protein folding". Protein Eng, 15:871-879.
George, R. A. and Heringa, J. (2002b). "SnapDRAGON - a method to delineate protein structural domains from sequence data". J Mol Biol, 316:839-851.
George, R. A., Lin, K., and Heringa, J. (2005). "Scooby-domain: prediction of globular domains in protein sequence". Nucleic Acids Res, 33:W160-W163.
Gerstein, M., Lesk, A. M., and Chothia, C. (1994). "Structural mechanisms for domain movements in proteins". Biochemistry, 33:6739-6749.
Ghelis, C. and Yon, J. M. (1979). "Conformational coupling between structural units. A decisive step in the functional structure formation". C R Seances Acad Sci D, 289:197-199.
Go, M. (1981). "Correlation of DNA exonic regions with protein structural units in haemoglobin". Nature, 291:90-92.
Hadley, C. and Jones, D. T. (1999). "A systematic comparison of protein structure classifications SCOP, CATH and FSSP". Struct Fold Des, 7:1099-1112.
Hayward, S. (1999). "Structural principles governing domain motions in proteins". Proteins, 36:425-435.
Hegyi, H. and Gerstein, M. (1999). "The relationship between protein structure and function: a comprehensive survey with application to the yeast genome". J Mol Biol, 288:147-164.
Henikoff, S., Greene, E. A., Pietrokovski, S., Bork, P., Attwood, T. K., and Hood, L. (1997). "Gene families: the taxonomy of protein paralogs and chimeras". Science, 278:609-614.
Heringa, J. and Argos, P. (1991). "Side-chain clusters in protein structures and their role in protein folding". J Mol Biol, 220:151-171.
Heringa, J. (1998). "Detection of internal repeats: how common are they". Curr Opin Struct Biol, 8:338-345.
Heringa, J. and Taylor, W. R. (1997). "Three-dimensional domain duplication, swapping and stealing". Curr Opin Struct Biol, 7:416-421.
Herzberg, O., Chen, C. C., Kapadia, G., McGuire, M., Carroll, L. J., Noh, S. J., and Dunaway-Mariano, D. (1996). "Swiveling-domain mechanism for enzymatic phosphotransfer between remote reaction sites". Proc Natl Acad Sci U S A, 93:2652-2657.
Holm, L. and Sander, C. (1994). "Parser for protein folding units". Proteins, 19:256-268.
Holm, L. and Sander, C. (1997). "Dali/FSSP classification of three-dimensional protein folds". Nucleic Acids Res, 25:231-234.
Honig, B. (1999). "Protein folding: from the Levinthal paradox to structure prediction". J Mol Biol, 293:283-293.
Hutchinson, E. G. and Thornton, J. M. (1993). "The Greek key motif - extraction, classification and analysis". Protein Eng, 6:233-245.
Ikura, M., Clore, G. M., Gronenborn, A. M., Zhu, G., Klee, C. B., and Bax, A. (1992). "Solution structure of a calmodulin-target peptide complex by multidimensional NMR". Science, 256:632-638.
Islam, S. A., Luo, J., and Sternberg, M. J. E. (1995). "Identification and analysis of domains in proteins". Prot Eng, 8:513-525.
Jacob, F. (1977). "Evolution and tinkering". Science, 196:1161-1166.
Jones, S., Stewart, M., Michie, A., Swindells, M. B., Orengo, C., and Thornton, J. M. (1998). "Domain assignment for protein structures using a consensus approach: characterization and analysis". Protein Sci, 7:233-242.
Kim, P. S. and Baldwin, R. L. (1990). "Intermediates in the folding reactions of small proteins". Annu Rev Biochem, 59:631-660.
Larsen, T. M., Laughlin, L. T., Holden, H. M., Rayment, I., and Reed, G. H. (1994). "Structure of rabbit muscle pyruvate kinase complexed with Mn2+, K+, and pyruvate". Biochemistry, 33:6301-6309.
Leopold, P. E., Montal, M., and Onuchic, J. N. (1992). "Protein folding funnels: a kinetic approach to the sequence-structure relationship". Proc Natl Acad Sci U S A, 89:8721-8725.
Lesk, A. M., Branden, C. I., and Chothia, C. (1989). "Structural principles of alpha/beta barrel proteins: the packing of the interior of the sheet". Proteins, 5:139-148.
Levinthal, C. (1968). "Are there pathways for protein folding?" J Chim Phys, 65:44-45.
Levitt, M. and Chothia, C. (1976). "Structural patterns in globular proteins". Nature, 261:552-558.
McLachlan, A. D. (1979). "Gene duplications in the structural evolution of chymotrypsin". J Mol Biol, 128:49-79.
Meador, W. E., Means, A. R., and Quiocho, F. A. (1992). "Target enzyme recognition by calmodulin: 2.4A structure of a calmodulin-peptide complex". Science, 257:1251-1255.
Moore, J. D. and Endow, S. A. (1996). "Kinesin proteins: a phylum of motors for microtubule-based motility". Bioessays, 18:207-219.
Murvai, J., Vlahovicek, K., Barta, E., Cataletto, B., and Pongor, S. (2000). "The SBASE protein domain library, release 7.0: a collection of annotated protein sequence segments". Nucleic Acids Res, 28:260-262.
Murzin, A. G., Brenner, S. E., Hubbard, T., and Chothia, C. (1995). "SCOP: a structural classification of proteins database for the investigation of sequences and structures". J Mol Biol, 247:536-540.
Nissen, P., Hansen, J., Ban, N., Moore, P. B., and Steitz, T. A. (2000). "The structural basis of ribosome activity in peptide bond synthesis". Science, 289:920-930.
Orengo, C. A., Jones, D. T., and Thornton, J. M. (1994). "Protein superfamilies and domain superfolds". Nature, 372:631-634.
Orengo, C. A., Michie, A. D., Jones, S., Jones, D. T., Swindells, M. B., and Thornton, J. M. (1997). "CATH - a hierarchic classification of protein domain structures". Structure, 5:1093-1108.
Ostermeier, M. and Benkovic, S. J. (2000). "Evolution of protein function by domain swapping". Adv Protein Chem, 55:29-77.
Phillips, D. C. (1966). "The three-dimensional structure of an enzyme molecule". Sci Am, 215:78-90.
Porter, R. R. (1973). "Structural studies of immunoglobulins". Science, 180:713-716.
Rashin, A. (1985). "Location of domains in globular proteins". Methods Enzymol, 115:420-440.
Rose, G. D. (1979). "Hierarchic organization of domains in globular proteins". J Mol Biol, 134:447-470.
Rossmann, M. G., Moras, D., and Olsen, K. W. (1974). "Chemical and biological evolution of nucleotide binding proteins". Nature, 250:194-199.
Russell, R. B. (1994). "Domain insertion". Protein Eng, 7:1407-1410.
Savageau, M. A. (1986). "Proteins of Escherichia coli come in sizes that are multiples of 14 kDa: domain concepts and evolutionary implications". Proc Natl Acad Sci U S A, 83:1198-1202.
Schultz, J., Copley, R. R., Doerks, T., Ponting, C. P., and Bork, P. (2000). "SMART: a web-based tool for the study of genetically mobile domains". Nucleic Acids Res, 28:231-234.
Siddiqui, A. S. and Barton, G. J. (1995). "Continuous and discontinuous domains - an algorithm for the automatic generation of reliable protein domain definitions". Protein Sci, 4:872-884.
Siddiqui, A. S., Dengler, U., and Barton, G. J. (2001). "3Dee: a database of protein structural domains". Bioinformatics, 17:200-201.
Sowdhamini, R. and Blundell, T. (1995). "An automatic method involving cluster analysis of secondary structures for the identification of domains in proteins". Protein Sci, 4:506-520.
Srinivasarao, G. Y., Yeh, L. S., Marzec, C. R., Orcutt, B. C., Barker, W. C., and Pfeiffer, F. (1999). "Database of protein sequence alignments: PIR-ALN". Nucleic Acids Res, 27:284-285.
Swindells, M. B. (1995). "A procedure for detecting structural domains in proteins". Protein Sci, 4:103-112.
Tatusov, R. L., Natale, D. A., Garkavtsev, I. V., Tatusova, T. A., Shankavaram, U. T., Rao, B. S., Kiryutin, B., Galperin, M. Y., Fedorova, N. D., and Koonin, E. V. (2001). "The COG database: new developments in phylogenetic classification of proteins from complete genomes". Nucleic Acids Res, 29:22-28.
Taylor, W. R. and Orengo, C. A. (1989). "Protein structure alignment". J Mol Biol, 208:1-22.
Taylor, W. R. (1999). "Protein structure domain identification". Protein Eng, 12:203-216.
Teale, J. M. and Benjamin, D. C. (1977). "Antibody as immunological probe for studying refolding of bovine serum albumin. Refolding within each domain". J Biol Chem, 252:4521-4526.
Tsai, C. J. and Nussinov, R. (1997). "Hydrophobic folding units derived from dissimilar monomer structures and their interactions". Protein Sci, 6:24-42.
Wetlaufer, D. B. (1973). "Nucleation, rapid folding, and globular intrachain regions in proteins". Proc Natl Acad Sci U S A, 70:697-701.
White, S. H. and Jacobs, R. E. (1990). "Statistical distribution of hydrophobic residues along the length of protein chains. Implications for protein folding and evolution". Biophys J, 57:911-921.
Yang, A. S. and Honig, B. (1995a). "Free energy determinants of secondary structure formation: I. alpha-Helices". J Mol Biol, 252:351-365.
Yang, A. S. and Honig, B. (1995b). "Free energy determinants of secondary structure formation: II. Antiparallel beta-sheets". J Mol Biol, 252:366-376.
