Dominant group/Two-word terms

From Wikiversity
Space Shuttle is a two-word term. This image shows STS-1 with the Orbiter Columbia at Complex 39A. Credit: NASA.
Completion status: this resource has been started, but most of the work is still to be done.

Dominant group is a two-word term.

Educational level: this is a secondary education resource.

"[T]wo-word glossary items are the most common technical terms".[1] "Human use is supported by published glossaries, on-line glossary reference tools, and authoring environments that use glossaries to enable or enforce terminological consistency."[1]

Educational level: this is a tertiary (university) resource.

"[M]ost technical jargon is not likely to be included in a general-purpose dictionary."[1]

Educational level: this is a research resource.

"Dominant group" is an adjective (A), noun (N) glossary term.

Resource type: this resource is an article.
Resource type: this resource consists of notes.
Subject classification: this is a semantics resource.
Subject classification: this is a terminology resource.

Notation

Notation: let the symbol Def. indicate that a definition is following.

Notation: let the text between [ and ] be a replacement for that portion of a quoted text.

Universals

The learning resource theory of definition helps with definitions, their meanings, and their intents.

Def. evidence that demonstrates that a concept is possible is called proof of concept.

Def. "words which are not found in a dictionary"[1] are called out-of-vocabulary words.

The proof-of-concept structure consists of

  1. background,
  2. procedures,
  3. findings, and
  4. interpretation.[2]

The findings demonstrate a statistically systematic change from the status quo or the control group.

Two-word terms

"The compound two-word term is employed to give more precision than either word alone would have, not being exact synonyms. And each word indicates the sense in which the other is used."[3] Bold added.

Key terms

"A key-word is a single word with high frequency over the set of Web pages, and a key-term is a two-word term with very high frequency."[4]

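The frequency counts behind key-words and key-terms can be sketched as follows. This is a minimal illustration, not the method of the cited authors: the function name, the whitespace tokenisation, and the two sample pages are assumptions made for the example.

```python
from collections import Counter


def word_and_bigram_counts(pages):
    """Count single words and adjacent two-word sequences over all pages."""
    words, bigrams = Counter(), Counter()
    for page in pages:
        tokens = page.lower().split()  # naive whitespace tokenisation (assumption)
        words.update(tokens)
        bigrams.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return words, bigrams


# Hypothetical sample pages, invented for illustration.
pages = [
    "the dominant group sets the agenda",
    "a dominant group and a control group",
]
words, bigrams = word_and_bigram_counts(pages)
print(words.most_common(1))        # [('group', 3)]
print(bigrams["dominant group"])   # 2
```

A candidate key-word or key-term would then be any entry whose count exceeds a chosen threshold; the source does not state what threshold counts as "high" or "very high".
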
Relative synonyms

The relative synonyms of "dominant group" fall into the following set of orderable pairs:

Relative synonyms for "dominant group"[5]

| Synonym for "dominant" | Category number | Category title | Synonym for "group" | Category number | Category title |
|---|---|---|---|---|---|
| "superior" | 36 | SUPERIORITY | "arrangement" | 60 | ARRANGEMENT |
| "influential" | 171 | INFLUENCE | "class" | 61 | CLASSIFICATION |
| "musical note" | 462 | HARMONICS | "assembly" | 74 | ASSEMBLAGE |
| "most important" | 670 | IMPORTANCE | "size" | 194 | SIZE |
| "governing" | 739 | GOVERNMENT | "painting", "grouping" | 572 | ART |
| "master" | 747 | MASTER | "association", "set" | 786 | ASSOCIATION |
| - | - | - | "sect" | 1018 | RELIGIONS, CULTS, SECTS |

'Orderable' means that any synonym from the first category can be paired with any synonym from the second category to form an alternate term for "dominant group"; for example, "superior class", "influential sect", "master assembly", "most important group", and "dominant painting". "Dominant" falls into category 171, and "group" is in category 61. Further, any word whose most common usage falls within these categories may also form an alternate term, such as "ruling group", where "ruling" has its most common usage in category 739, or "dominant party", where "party" is in category 74. "Taxon" and "taxa", like "species", are in category 61. "Society" is in category 786, so "dominant society" is also an alternate term.
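The pairing rule described above amounts to a Cartesian product of the two synonym lists. The sketch below uses a subset of the synonyms from the table of relative synonyms; the variable and function names are illustrative assumptions.

```python
from itertools import product

# A subset of the synonyms listed in the table of relative synonyms.
DOMINANT_SYNONYMS = ["superior", "influential", "most important", "governing", "master"]
GROUP_SYNONYMS = ["arrangement", "class", "assembly", "grouping", "association", "set", "sect"]


def alternate_terms(first_words, second_words):
    """Pair every first-word synonym with every second-word synonym."""
    return [f"{a} {b}" for a, b in product(first_words, second_words)]


terms = alternate_terms(DOMINANT_SYNONYMS, GROUP_SYNONYMS)
print(len(terms))                    # 5 * 7 = 35 candidate terms
print("influential sect" in terms)   # True
```

Adding "dominant" to the first list or "group" to the second yields the mixed forms such as "most important group" and "dominant painting".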

Term filtering

"Two-word terms [are] determined not to be of interest in the context of the whole document collection either because they do not occur frequently enough or because they occur in a constant distribution among different documents [deviation-based approach]."[6]

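A deviation-based filter of the kind quoted above might look like the following sketch. The cited paper's exact procedure is not given here, so the use of a population standard deviation, the function name, and both thresholds are assumptions chosen for illustration.

```python
from statistics import pstdev


def is_interesting(counts, doc_lengths, min_total=5, min_deviation=0.001):
    """Return True if a term survives both filters.

    counts[i]      -- occurrences of the term in document i
    doc_lengths[i] -- total number of terms in document i
    min_total and min_deviation are hypothetical thresholds.
    """
    if sum(counts) < min_total:
        return False  # too rare across the collection
    relative = [c / n for c, n in zip(counts, doc_lengths)]
    # A near-constant relative frequency does not distinguish documents.
    return pstdev(relative) >= min_deviation


doc_lengths = [1000, 1000, 1000, 1000]
print(is_interesting([1, 0, 1, 0], doc_lengths))      # False: only 2 occurrences
print(is_interesting([10, 10, 10, 10], doc_lengths))  # False: constant distribution
print(is_interesting([40, 0, 2, 1], doc_lengths))     # True: frequent and uneven
```
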
Low frequency

Using a relative synonym, or "meta-term", such as "influential classification", may work unless the "scholarly popularity" is too restrictive.

Number of articles on Google Scholar.

| Genus | Number of articles | Two-word term | Popularity in articles | Percentage (%) |
|---|---|---|---|---|
| "group" | 5,910,000 | "control group" | 2,340,000 | 39.6 |
| "group" | 5,910,000 | "social group" | 964,000 | 16.3 |
| "group" | 5,910,000 | "whole group" | 245,000 | 4.15 |
| "species" | 4,630,000 | "dominant species" | 168,000 | 3.63 |
| "group" | 5,910,000 | "dominant group" | 69,000 | 1.17 |
| "genus" | 1,990,000 | "genus species" | 22,600 | 1.14 |
| "wagon" | 370,000 | "red wagon"[7] (suggested non-technical phrase) | 3,610 | 0.976 |
| "chair" | 3,090,000 | "comfortable chair"[7] (suggested non-technical phrase) | 21,400 | 0.693 |
| "genus" | 1,990,000 | "species genus" | 13,500 | 0.678 |
| "type" | 6,200,000 | "dominant type" | 30,500 | 0.492 |
| "group" | 5,910,000 | "influential group" | 13,100 | 0.222 |
| "genus" | 1,990,000 | "dominant genus" | 3,070 | 0.154 |
| "genera" | 2,600,000 | "dominant genera" | 3,950 | 0.152 |
| "classification" | 2,620,000 | "control classification" | 3,800 | 0.145 |
| "support" | 4,140,000 | "content support" (meta-term for "comfortable chair") | 3,190 | 0.077 |
| "vehicle" | 2,520,000 | "red vehicle" (meta-term for "red wagon") | 708 | 0.028 |
| "classification" | 2,620,000 | "dominant classification" | 506 | 0.019 |
| "classification" | 2,620,000 | "influential classification" | 421 | 0.016 |
| "support" | 4,140,000 | "satisfying support" (alternate meta-term) | 285 | 0.0069 |
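The percentage column in the table above is the two-word term's article count divided by the genus word's article count. A short check on three rows, with a function name that is an illustrative assumption:

```python
def popularity_percent(term_articles, genus_articles):
    """Two-word term count as a percentage of the genus-word count."""
    return 100 * term_articles / genus_articles


# (two-word term, term articles, genus articles), copied from the table.
rows = [
    ("control group", 2_340_000, 5_910_000),
    ("dominant group", 69_000, 5_910_000),
    ("genus species", 22_600, 1_990_000),
]
for term, term_articles, genus_articles in rows:
    # Three significant figures, matching the table.
    print(f"{term}: {popularity_percent(term_articles, genus_articles):.3g} %")
# control group: 39.6 %
# dominant group: 1.17 %
# genus species: 1.14 %
```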

As the Google Scholar "popularity" table above suggests, two-word terms that rank below the traditional phrase "genus species" are too restrictive, either in popularity or in non-technicalness. While other synonyms may alter this picture, terms of the form category1 + category2, category1 + "group", and "dominant" + category2 have neither "scholarly" popularity nor technicalness.

The genus term "group" seems to be an adequate starting point.

Constant frequency

As the following table of subject popularity shows, individual subjects, especially those named by one word rather than two, have high popularity in the document collection sampled by the Google Scholar search engine.

Number of articles per subject on Google Scholar.

| Two-word term | Number of articles | Subject | Number of subject articles | Percentage (%) |
|---|---|---|---|---|
| "credit card" | 483,000 | paleontology | 249,000 | 0.298 |
| "dominant group" | 71,200 | anthropology | 1,320,000 | 1.30 |
| "dominant group" | 71,200 | archaeology | 794,000 | 0.558 |
| "dominant group" | 71,200 | art | 4,620,000 | 0.472 |
| "dominant group" | 71,200 | astronomy | 1,780,000 | 0.073 |
| "dominant group" | 71,200 | astrophysics | 730,000 | 0.046 |
| "dominant group" | 71,200 | "atmospheric science" | 101,000 | 0.048 |
| "dominant group" | 71,200 | biology | 4,600,000 | 0.289 |
| "dominant group" | 71,200 | chemistry | 5,130,000 | 0.102 |
| "dominant group" | 71,200 | communication | 4,560,000 | 0.757 |
| "dominant group" | 71,200 | culture | 4,490,000 | 1.23 |
| "dominant group" | 71,200 | economics | 2,630,000 | 1.13 |
| "dominant group" | 71,200 | education | 4,030,000 | 1.16 |
| "dominant group" | 71,200 | ethnicity | 1,340,000 | 2.77 |
| "dominant group" | 71,200 | evolution | 3,930,000 | 0.583 |
| "dominant group" | 71,200 | geography | 1,980,000 | 1.02 |
| "dominant group" | 71,200 | geology | 1,710,000 | 0.249 |
| "dominant group" | 71,200 | history | 3,400,000 | 1.54 |
| "dominant group" | 71,200 | humanities | 1,480,000 | 0.445 |
| "dominant group" | 71,200 | language | 2,900,000 | 1.58 |
| "dominant group" | 71,200 | law | 1,360,000 | 2.61 |
| "dominant group" | 71,200 | literature | 5,180,000 | 0.90 |
| "dominant group" | 71,200 | "materials science" | 1,170,000 | 0.01 |
| "dominant group" | 71,200 | metagenome | 6,930 | 3.61 |
| "dominant group" | 71,200 | music | 2,590,000 | 0.552 |
| "dominant group" | 71,200 | mythology | 385,000 | 1.37 |
| "dominant group" | 71,200 | "of the" | 8,370,000 | 0.855 |
| "dominant group" | 71,200 | paleontology | 249,000 | 0.393 |
| "dominant group" | 71,200 | "performing arts" | 125,000 | 0.608 |
| "dominant group" | 71,200 | philosophy | 2,340,000 | 0.979 |
| "dominant group" | 71,200 | physics | 4,370,000 | 0.096 |
| "dominant group" | 71,200 | "planetary science" | 483,000 | 0.039 |
| "dominant group" | 71,200 | "political science" | 1,270,000 | 0.709 |
| "dominant group" | 71,200 | psychology | 2,510,000 | 1.01 |
| "dominant group" | 71,200 | "red wagon" | 3,690 | 0.136 (all are books) |
| "dominant group" | 71,200 | regions | 5,360,000 | 0.580 |
| "dominant group" | 71,200 | religion | 1,990,000 | 1.40 |
| "dominant group" | 71,200 | semantics | 1,280,000 | 0.441 |
| "dominant group" | 71,200 | sociology | 1,890,000 | 1.40 |
| "dominant group" | 71,200 | technology | 6,250,000 | 0.376 |
| "dominant group" | 71,200 | theology | 1,080,000 | 0.470 |
| "dominant pickle" | 1 | "of the" | 8,370,000 | 0.001 |
| "net income" | 253,000 | paleontology | 249,000 | 0.061 |
| "net income" | 253,000 | technology | 6,250,000 | 1.03 |
“net income” 253,000 technology 6,250,000 1.03

To comment on sampling: "dominant group" alone yields about 71,200 resources. Searching for "of the" followed by "dominant group" returns 71,600, and "dominant group" followed by "of the" yields 71,900. This suggests the total for "dominant group" is 71,550 ± 350, an error of 0.489 %.
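The 71,550 ± 350 figure is the midpoint of the lowest and highest counts, with half the spread taken as the error. A short check of that arithmetic (the variable names are illustrative):

```python
# Result counts reported in the text for the three query variants.
counts = [71_200, 71_600, 71_900]

low, high = min(counts), max(counts)
midpoint = (low + high) / 2       # centre of the observed range
half_range = (high - low) / 2     # the "±" part
relative_error = 100 * half_range / midpoint

print(f"{midpoint:.0f} ± {half_range:.0f}")   # 71550 ± 350
print(f"{relative_error:.3f} %")              # 0.489 %
```

Note that this is the midrange, not the arithmetic mean of the three counts, which would be about 71,567.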

The frequency of "dominant group" in these subject areas ranges from 3.61 to 0.01 per cent, a ratio of ~3.6 × 10^2 between the highest and lowest values. The number of occurrences of "dominant group" in any subject is low, suggesting that it is not relevant but may only be an artifact of author word choice.

In Google scholar searches, "credit card" may appear more often associated with articles than "net income", for example, because of the common occurrence of the sentence "The only accepted payment is by credit card." with regard to purchasing a copy of the article or book.

"Fresh-pack pickles are the dominant pickle products in retail groceries and supermarkets."[8]

Significant variation

The statistical significance approach "is to test whether the variation of the relative frequency of a given term t in the document collection is statistically significant."[6]
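The source does not name the test used. As one possible illustration, the sketch below assumes a chi-square goodness-of-fit test against the null hypothesis that the term's relative frequency is the same in every document. The function name and sample counts are hypothetical.

```python
def chi_square_statistic(counts, doc_lengths):
    """Goodness-of-fit statistic against a constant relative frequency."""
    total_count = sum(counts)
    total_length = sum(doc_lengths)
    statistic = 0.0
    for observed, length in zip(counts, doc_lengths):
        # Expected count if the term were spread evenly by document length.
        expected = total_count * length / total_length
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Critical value of the chi-square distribution for 3 degrees of freedom
# (4 documents - 1) at the 5 % level.
CRITICAL_VALUE_DF3 = 7.815

doc_lengths = [1000, 1000, 1000, 1000]
even = chi_square_statistic([10, 11, 9, 10], doc_lengths)
uneven = chi_square_statistic([30, 2, 5, 3], doc_lengths)
print(even > CRITICAL_VALUE_DF3)     # False: variation is not significant
print(uneven > CRITICAL_VALUE_DF3)   # True: variation is significant
```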

Term relevance

"The notion of term relevance with respect to a document collection is [determined by assigning] each term its score based on maximal tf-idf (term frequency - inverse document frequency, maximal with respect to all the documents in the collection) [information retrieval approach]."[6] For example, "net income" received a score of 17.17, but "big bank" received only 5.39 [which is above the irrelevance cutoff].[6] "Credit card" did not make the cutoff.[6]

Depending on the meaning of "big bank", it may be a relative synonym for "dominant group". "Big" may suggest "important" (one of its synonyms) and a "bank" might be considered an "assemblage" (also a synonym), although the two words taken individually have more popular meanings.
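A maximal tf-idf score of the kind quoted above can be sketched as follows. The cited paper's exact weighting is not given here, so this assumes raw counts for tf and log(N / document frequency) for idf; the toy collection is invented, and the scores will not reproduce the 17.17 or 5.39 reported in the source.

```python
import math


def max_tfidf(term, documents):
    """Maximal tf-idf of `term` over a list of tokenised documents.

    tf  = raw count of the term in a document (assumed variant)
    idf = log(N / number of documents containing the term)
    """
    containing = sum(1 for doc in documents if term in doc)
    if containing == 0:
        return 0.0
    idf = math.log(len(documents) / containing)
    return max(doc.count(term) * idf for doc in documents)


# Toy collection; each two-word term is treated as a single token.
documents = [
    ["net income", "net income", "net income", "credit card"],
    ["credit card", "big bank"],
    ["credit card", "big bank", "big bank"],
    ["credit card"],
]
for term in ("net income", "big bank", "credit card"):
    print(term, round(max_tfidf(term, documents), 3))
# net income 4.159
# big bank 1.386
# credit card 0.0
```

In this toy collection "credit card" appears in every document, so its idf is zero and it scores 0.0. Ubiquity is one way a frequent term can fall below a relevance cutoff.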

References

  1. Youngja Park, Roy J Byrd and Branimir Boguraev (2002). Automatic Glossary Extraction: Beyond Terminology Identification, In: "Proceedings of the Nineteenth International Conference on Computational Linguistics". Morristown, New Jersey. pp. 772-8.
  2. Ginger Lehrman and Ian B Hogue, Sarah Palmer, Cheryl Jennings, Celsa A Spina, Ann Wiegand, Alan L Landay, Robert W Coombs, Douglas D Richman, John W Mellors, John M Coffin, Ronald J Bosch, David M Margolis (August 13, 2005). "Depletion of latent HIV-1 infection in vivo: a proof-of-concept study". Lancet 366 (9485): 549-55. doi:10.1016/S0140-6736(05)67098-5. Retrieved on 2012-05-09. 
  3. Robert I. Coulter (1954). "Typewritten Library Manuscripts are not Printed Publications". Journal of the Patent Office Society 36: 258. Retrieved on 2012-06-21. 
  4. Yongzheng Zhang, Nur Zincir-Heywood, Evangelos Milios (2004). "World Wide Web site summarization". Web Intelligence and Agent Systems 2 (1): 39-53. Retrieved on 2012-06-21. 
  5. Peter Mark Roget (1969). Lester V. Berrey and Gorton Carruth. ed. Roget's International Thesaurus, third edition. New York: Thomas Y. Crowell Company. pp. 1258. 
  6. Ronen Feldman, Moshe Fresko, Yakkov Kinar, Yehuda Lindell, Orly Liphstat, Martin Rajman, Yonatan Schler and Oren Zamir (1998). "Text mining at the term level". Principles of Data Mining and Knowledge Discovery Lecture Notes in Computer Science 1510 (1998): 65-73. doi:10.1007/BFb0094806. Retrieved on 2012-03-05.
  7. Kaldari (September 22, 2011). "Requests for Deletion#Dominant group and subpages". Wikiversity: 1. Retrieved on 2011-10-22.
  8. Rashmi Maruvada (March 13, 2006). Evaluation of the Importance of Enzymatic and Non-enzymatic Softening in Low Salt Cucumber Fermentations. Raleigh, North Carolina, USA: North Carolina State University. http://repository.lib.ncsu.edu/ir/handle/1840.16/1683. Retrieved 2012-03-06. 

Further reading

  • Ronen Feldman, Moshe Fresko, Yakkov Kinar, Yehuda Lindell, Orly Liphstat, Martin Rajman, Yonatan Schler and Oren Zamir (1998). "Text mining at the term level". Principles of Data Mining and Knowledge Discovery Lecture Notes in Computer Science 1510 (1998): 65-73. doi:10.1007/BFb0094806. Retrieved on 2012-03-05. 
  • Youngja Park, Roy J Byrd and Branimir Boguraev (2002). Automatic Glossary Extraction: Beyond Terminology Identification, In: "Proceedings of the Nineteenth International Conference on Computational Linguistics". Morristown, New Jersey. pp. 772-8. 
