Talk:WikiJournal Preprints/Non-canonical base pairing

From Wikiversity

WikiJournal Preprints
Open access • Publication charge free • Public peer review

WikiJournal User Group is a publishing group of open-access, free-to-publish, Wikipedia-integrated academic journals.

Article information

Authors: Dhananjay Bhattacharyya, Abhijit Mitra

Non canonical base pairing, 2020. Wikidata Q39049436


Plagiarism check

Pass. Report from WMF copyvios tool: violation unlikely (3.8% confidence). Only trivial similarities found. T.Shafee(Evo﹠Evo)talk 02:33, 27 February 2019 (UTC)

Review 1

Review by Marta Szachniuk
This review was submitted on , and refers to this previous version of the article

I suggest rejecting the article. It is clear that the authors of the paper are not specialists in the subject they present. The paper is poorly written. The article collects generally known information published in the literature. However, the content is not fully organized. The examples are probably randomly selected, and may confuse the reader (they are not unified, no captions). Many classical concepts like RNA folding, RNA architecture, RNA 3D motifs, recurrent motifs are missing. There is nothing about the graphical representation of non-canonical base pairs (pictograms proposed by Leontis and Westhof) and such representation is currently widely used. The role of non-canonical base pairs in RNA folding is not considered.

More details follow:

1. Introduction
Here the authors write that the 3D structures of DNA and RNA have complementary base pairs. A nomenclature for bases is used here. In the context of nucleic acids, the nucleoside symbols A, G, T/U and C are used instead. In nucleic acids, we do not deal with bases alone, but with nucleotides. Ade:Thy (Ade:Ura in RNA) or Gua:Cyt is ok, but this nomenclature is mostly used for QM calculations (as far as the bases are concerned). I would replace Ade:Gua Trans H:S with AG trans Hoogsteen/Sugar (tHS).


We have now used the base nomenclature scheme adopted in Wikipedia. The different nomenclature schemes are elaborated in Table 1.
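For readers following this exchange, the three-letter shorthand suggested by the reviewer (e.g. tHS for trans Hoogsteen/Sugar) can be expanded mechanically. The following is only a minimal Python sketch: the function names `expand_lw` and `all_families` are hypothetical and are not taken from the article or from any annotation tool, and the sketch assumes the usual convention of one orientation letter (c or t) followed by two edge letters (W, H or S).

```python
# Illustrative sketch only; names are hypothetical, not from the article.
# Assumes the shorthand is one orientation letter followed by two edge letters.
EDGES = {"W": "Watson-Crick", "H": "Hoogsteen", "S": "Sugar"}
ORIENTATIONS = {"c": "cis", "t": "trans"}


def expand_lw(code: str) -> str:
    """Expand a three-letter shorthand such as 'tHS' into words."""
    if len(code) != 3:
        raise ValueError(f"expected three characters, got {code!r}")
    orientation, edge1, edge2 = code[0].lower(), code[1].upper(), code[2].upper()
    if orientation not in ORIENTATIONS or edge1 not in EDGES or edge2 not in EDGES:
        raise ValueError(f"unrecognised shorthand {code!r}")
    return f"{ORIENTATIONS[orientation]} {EDGES[edge1]}/{EDGES[edge2]}"


def all_families() -> list[str]:
    """List each orientation combined with each unordered pair of edges."""
    letters = "WHS"
    return [
        o + e1 + e2
        for o in "ct"
        for i, e1 in enumerate(letters)
        for e2 in letters[i:]
    ]


if __name__ == "__main__":
    print(expand_lw("tHS"))      # trans Hoogsteen/Sugar
    print(len(all_families()))   # 12
```

Treating the two edges as an unordered pair gives six edge combinations, and two orientations give twelve geometric families. Which base contributes which edge (for example H/S versus S/H) is deliberately ignored in this sketch.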

2. History
The authors first discuss the Watson-Crick base pairing. Then they write about the discovery of Hoogsteen pairs but no reference is given to the latter! I would still expect the authors to give examples where such a Hoogsteen base pair was first discovered. I guess the authors intended to do so because they give two examples for DNA: TATA box (there are many newer works, e.g. from Al-Hashimi lab) and sixteenth-nucleotide repeats of TTAGGG in telomere DNA, which creates quadruplexes (1987). These examples are ok, but they are not clearly explained. No reference to quadruplexes is given. It should be clearly stated where the Hoogsteen base pair was first discovered and in what sequential and structural context.

Then the authors address the RNA world. Here they give an example of the first crystallographic structure tRNA Phe (again no references!!!). In this paragraph, I would add the discovery of the G-U wobble base pair because it is an important pair for RNA.


We have now discussed the history of the structure determination of A:T and G:C Watson-Crick base pairs, with relevant references. The sections on Hoogsteen base pairs in the DNA double helix and in G-quadruplex structures have also been elaborated.

The picture of nucleoside and nucleotide is illegible. It should focus on the bases because the sugar is the same everywhere. I would show here the WC G-C and A-T/U base pairs, the G-U pair, and an example of a Hoogsteen pair from the TATA box (with quadruplexes on) or something like Figure 9 here doi: 10.1002/a.1258. Then it would be easier to read this introduction.

I have critical remarks about all the figures in the paper. There are no captions, they are not uniform, and on some of them the bases have hydrogens while on others they do not. If the figures are copied from another source then they need to be cited. The figures should be more informative, as befits a review paper. They are of poor quality for readers.


We have redrawn all the figures with better clarity and resolution.

3. Classification of Non-canonical Base-pairs
There is no reference to the basic paper that introduced the classification of non-canonical base pairs: Leontis, N.B. and E. Westhof, Geometric nomenclature and classification of RNA base pairs. RNA, 2001. 7(4): p. 499-512. Many users still use the Saenger classification (e.g. RNA FRABASE). This classification (which was the first one, being the standard for many years) must be mentioned together with a reference. The "Saenger nomenclature" is mentioned but the information is given rather chaotically. It should be clearly stated that we have the Leontis-Westhof nomenclature (currently accepted) and Saenger (older). In this paragraph, I would add information about symbols used to denote the type of base pair.

There is no link to I do not understand in the first sentence '...and are by and large hydrophobic in nature.'


We have added relevant references as suggested.

4. Classification based on isostericity
I would not call this classification, it is not a new classification. I would use "Isostericity matrix" and give the reference to


We differ from the view of the reviewer, as the three ways discussed in this manuscript are distinct ways of clustering all the 156 canonical and non-canonical classes of base pairs, based on distinct sets of attributes. Each of these approaches is useful in distinct contexts. There may be a better term than ‘classification’, but we could not think of one.

5. Classification based on local strand orientation
I wouldn't call this classification again. In the description/classification of base pairs we have such parameters as bond orientation and strand orientation. This paragraph should be placed before isostericity.


The first two methods of classification focus on hydrogen bonding patterns stabilizing the base pairs, and, based on their respective glycosidic bond orientations, on the topology of the interacting faces of the bases. They do not convey any information regarding the local strand orientations associated with the paired bases. This additional information is however relevant for a detailed understanding of the role of the base pairs in the context of the 3D configuration of the sugar-phosphate backbone of long chain functional RNA/DNA structures. Since the first two methods discuss the base pairs as basic local units, and the third method provides additional relevance in the context of the global structures, we feel that our ‘local first and then global’ discussion sequence is more appropriate.

6. Detection of Non-canonical Base-pairs
I wouldn't use the word "Detection" (I associate it more with experimental methods). They should use "identification" instead. In the world literature this is named "identification of base pairs". This chapter deals with methods of identifying base pairs. The authors give two algorithms: Lu & Olson, and Das. I did not understand from the description what the difference between them is (and probably the authors do not understand this difference either). By the way, I do not know Das's method for base pair identification (I have never heard of it). No other methods are cited here, but there are some: 3DNA/DSSR, RNAView, FR3D, RNApdbee, etc. Authors should correct this paragraph. It is incomplete and misleading. If the authors provide examples of methods e.g. 3DNA/DSSR, FR3D and what web servers use them, e.g. 3DNA/DSSR in NDB, FR3D in, then this information will be useful for the reader.


We stand corrected, and thank the reviewer for the suggestion. Accordingly, in the current version, we have used the term ‘Identification’ instead of ‘detection’, wherever applicable. We have also tried to explain the different identification protocols to the best of our abilities and have provided the relevant references for the benefit of interested readers. In addition, we have included several other available methods and related servers, in our discussions.

7. Strengths and stabilities of Non-canonical Base-pairs
I do not understand the first sentence: “As stated above, A:U or A:T base pairs, as proposed by Watson and Crick, are stabilized by two hydrogen bonds, hence all the base pairs detected by the second method are expected to be as stable as the canonical A:U base pair” - What does the second method refer to? To the Bhattacharyya method? Why only base pairs identified by the second method? “Several groups attempted to detect the binding energy” should be changed to “Several groups attempted to estimate/calculate the binding energy”


We have rewritten this section to address the issues raised. We thank the reviewer for helping us to make this section more comprehensive.

8. Examples
Here I would introduce the RNA motif (RNA module) and give an example of non-canonical base pairs in a selected motif, as shown by RNApdbee. It might be GNRA tetraloop, as the authors have it. I would consider moving this paragraph further.


We have discussed at length, with figures, different RNA motifs in which non-canonical base pairs play an important role.

9. Higher Order Structures formed by Non-canonical Base-pairs
The authors give two examples: triplets in DNA triplexes and tetrads in DNA quadruplexes. Perhaps a drawing of a G-tetrad and a quadruplex would be useful here.


We have provided figures of quadruplexes.

10. Non-canonical Base-pairs in Double Helical Regions
First of all the authors give an example of non-canonical base pairs stabilizing tertiary effects in tRNA. Then it is not known in what context the G-U pair is discussed. A GU pair is important but why such an example in this paragraph? (The authors talk about interactions where G:PSU usually occurs in tRNA.)

"symmetric internal bulge like motifs" should be changed to "symmetric internal loop".

I suggest replacing the order of paragraphs in this section, i.e. first duplexes and then tertiary effects.

The text would be more clear if the authors introduced the "RNA motif" and gave some examples. Then the role of non-canonical base pairing would be shown more clearly.


We found that there are Wiki pages on the G:U base pair and hence have largely referred to those. We feel that, with the added discussions on RNA motifs, this aspect has become more comprehensive.

Review 2

Review by anonymous peer reviewer
This review was submitted on , and refers to this previous version of the article

The article provides a general introduction of the non-canonical RNA base pairs, including a historic overview, as well as some discussion of their classification, detection, and stability. It is great to see the experts embracing WikiJournal of Science and improving the coverage of scientific topics.

Major comments

1) While the manuscript does include many relevant citations, it could be significantly enhanced by adding more literature references to support the text. For example, a 3-paragraph History section cites only 1 paper. In another example, the Detection section could be made more comprehensive by including other software used for base pair annotation (MC-Annotate, FR3D, RNAView, and ClaRNA, among others). Additional references would also help avoid numerous judgement statements like “reasonably planar, and quite stable” as different readers may have different views on what should be considered as reasonable.


We have modified the History section with more references. We have also discussed the other methods for identifying base pairs.

2) Would it make sense to provide at least one example of a non-canonical base pair in Figure 1 to help the readers visualise the hydrogen bonds between nucleotides?


We have given four representative figures of non-canonical base pairs which are frequently found in the available structures.

3) To make the article more comprehensive, it could be useful to discuss RNA 3D motifs, such as hairpins, internal loops, and multi-way junctions, as these structures are composed of non-canonical base pairs. In the current version, the only mention of 3D motifs appears to be restricted to internal loops: “Similarly, many non-canonical base pairs, e.g. Ade:Gua tHS (trans Hoogsteen/Sugar edge) or Ade:Ura tHW (trans Hoogsteen/Watson-Crick), Ade:Gua cWW, etc, are seen often within double helical regions giving rise to symmetric internal bulge like motifs.”


We have now discussed several RNA structural motifs in which non-canonical base pairs are involved.

Minor comments

1) “Recent developments, however, reveal that the nucleotide bases are also capable to form large number of various other types of pairing between non-complementary bases”

The statement is not entirely accurate because it suggests that non-canonical base pairs form only between non-complementary bases, whereas there are examples of non-canonical base pairs between complementary bases, such as the trans WC-WC AU base pair. In addition, non-canonical base pairs have been known for quite a while - for example, their comprehensive classification was published by Leontis and Westhof in 2001.


We have modified that section with more inputs on historical developments.

2) “Most often the non-canonical base pairs appear in the RNA structures as isolated contacts between different residues stabilizing the appropriate fold”

What is meant by isolated contacts?


We have substantially modified the related section to remove the ambiguities.

3) “The bases can thus, in principle, be involved in hydrogen bond mediated pairing with other bases involving any one of their three edges.“

Not every edge can always be used to form base pairs, as some base pair combinations are not possible due to steric clashes, such as the GG cis WC-WC base pair.


Our modified manuscript clearly indicates these limitations.

4) “It is expected that a proper understanding of these Non-canonical base pairings would improve the methods of prediction of RNA structure from sequence, hence would be quite useful for human health.” This section could be expanded by mentioning the methods that predict non-coding RNA base pairs based on sequence, such as RNAwolf or RMDetect.


We have now included discussions about how consideration of non-canonical base pairs can improve RNA structure prediction methods.


“As the canonical and even non-canonical base pairs are sufficiently planer,” - “planer” should be “planar”.

“Of course all non-canonical base pairs are not extremely strong and stable” - this sentence needs to be clarified.

“Hence, for most non-canonical base-pairs, some of the parameters (Open, Shear and Stretch) calculated by Curves or 3DNA are calculated as large unusual,” - it’s not clear what is meant by “large unusual”.

The “Competing interests” section contains boilerplate text: “Any conflicts of interest that you would like to declare. Otherwise, a statement that the authors have no competing interest.”

Additional editor comments

Comments by Marshallsumter
These comments were submitted on , and refer to this previous version of the article

Editorial comments from a non-specialist:

  1. All of the links to Wikipedia articles are using [ DNA] rather than [[w:DNA|DNA]], for example.
  2. Figures 1, 3, 4 and 5 do not have a credit.
  3. Note: [w:Non-canonical base pairing] is five sentences long and needs serious expansion. Major expansion would likely mean extensive wikifying of current literature.
  4. Info in [w:Wobble base pair] probably should be included.
  5. Note: a Google Scholar search using "Non-canonical base pairing" for anytime produces "About 1,250 results (0.06 sec)". Since 2015: "About 419 results (0.06 sec)". The submission could be extensively expanded to bring the review closer to the state of the science in an apparently 'hot topic'.
  6. Comment in response to "It is clear that the authors of the paper are not specialists in the subject they present.": On the whole web using "Non-canonical base pairing" and "Dhananjay Bhattacharyya" as search concepts produces "About 41 results (0.42 seconds)". Some appear to be conference presentations. In the WikiJournal preprint, 4 co-authorships and one first authorship occur for "Dhananjay Bhattacharyya". I'd say Dhananjay Bhattacharyya and co-author are experts. A few papers suggest Dhananjay Bhattacharyya is a theoretician focused on aspects of non-canonical base pairing.
  7. Comment in response to "Many classical concepts like RNA folding, RNA architecture, RNA 3D motifs, recurrent motifs are missing." These might make good expansion subtopics.
  8. Comment in response to "The authors first discuss the Watson-Crick base pairing. Then they write about the discovery of Hoogsteen pairs but no reference is given to the latter one!" The reference is in the Wikipedia [w:Hoogsteen base pair] article and I agree should be included in the preprint along with perhaps examples from [w:Hoogsteen base pair].
  9. The remaining comments by reviewer 1 I believe are well-intended.
  10. Overall impression is that reviewer 1 has made some good constructive suggestions which the authors can handle. Using the phrase "rejecting the article" I see as supplying an incentive to improve the Wikipedia article and especially the review for its own sake. If the authors are amenable to expanding the article and wikifying some of their review as needed to improve the Wikipedia entry, it's a win-win for WikiJournal of Science and Wikipedia. A good review for WJS is essential. --Marshallsumter (discusscontribs) 23:23, 26 February 2019 (UTC)

Comments by Thomas Shafee
These comments were submitted on , and refer to this previous version of the article

I would add that information on biological function would also be useful. This would include the role in formation of RNA tertiary structures (perhaps with an example crystal structure) as well as the role in interactions (e.g. in codon recognition during translation).