Graphing

From Wikiversity
Completion status: this resource is ~50% complete.

Graphing visualises data to facilitate perception and interpretation of distributions and relationships. Graphs should be accompanied by descriptive statistics.

This page provides an overview of graphing steps and principles and types of graphs.

Watch "Science is beautiful", a 5:30 minute Nature Video which explores three different visualisations: Florence Nightingale (health), genome overlaps, ocean currents.

Watch "Is Pivot a turning point for the web?", a 6:25 minute TED talk about a Microsoft technology which enables flexible exploration and zooming in and out of visualised data. This illustrates the power of being able to visualise data as a whole in order to discover patterns and links.

Graphing - How to

“Visualization is any technique for creating images, diagrams, or animations to communicate a message.” - Wikipedia

Steps

Creating effective data visualisations is not easy. Suggested basic steps are:

  1. Identify the purpose of the graph
  2. Select which type of graph to use, based on the variable(s)' level of measurement
  3. Draw an appropriate graph
  4. Modify the graph to be clear, non-distorting, and well-labelled.
  5. Disseminate the graph (e.g., include it in a report)

Principles

"Like good writing, good graphical displays of data communicate ideas with clarity, precision, and efficiency.

Like poor writing, bad graphical displays distort or obscure the data, make it harder to understand or compare, or otherwise thwart the communicative effect which the graph should convey." Michael Friendly – Gallery of Data Visualisation

Try looking at your graph from many angles. Will it make sense to everyone?
  1. Maximise objective display of truth. According to Tufte, the “lie factor” of a graph is the size of the effect shown in the graphic divided by the size of the effect in the data. It should be 1.
  2. Avoid distortion (Tufte)
  3. Avoid excessive use of colour - effective graphs are often monotone.
  4. Clearly label axes and provide a meaningful, descriptive figure caption. Use a legend and/or footnotes as appropriate to provide sufficient information for the graph to be interpretable as a whole without detailed references to accompanying text.
  5. Graphs are subject to the law of parsimony - i.e., they should be as simple as possible while still clearly communicating the data of interest.
  6. The whole of the data is more than the sum of the parts (Gary Flake, 2010)
  7. Show the data (Tufte)
  8. Reveal data at several levels (Tufte)
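
The lie factor in principle 1 can be checked with a few lines of arithmetic. The sketch below assumes Tufte's usual reading, in which the "size of effect" is the relative change between two values; the function names and the example numbers are invented for illustration.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect in the data."""
    shown = effect_size(graphic_first, graphic_second)
    actual = effect_size(data_first, data_second)
    return shown / actual


# Hypothetical case: the data rise from 10 to 15 (a 50% increase),
# but the bars are drawn 2 cm and 6 cm tall (a 200% increase).
print(lie_factor(2.0, 6.0, 10.0, 15.0))  # 4.0
```

A value well above 1 means the graphic exaggerates the change; a value below 1 means it understates it.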

Graph types

The choice of graph will depend on the variables' level of measurement.

Univariate

Graphs of a single variable.

Bar chart

  • Also referred to as bar graphs
  • Used for illustrating frequencies or percentages for categories or the means of different groups or variables.
  • The x-axis shows the categories, groups or variables; the y-axis shows the quantity (frequency, percentage or a statistic such as a mean).
  • Tips for creating bar charts

Incarceration rates worldwide.
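
The frequencies and percentages that a bar chart displays can be tabulated before any drawing is done. The following is a minimal sketch using only the Python standard library; the function names and the survey responses are made up for illustration, and the "chart" is a plain-text stand-in for a real plot.

```python
from collections import Counter


def frequency_table(values):
    """Map each category to (count, percentage), most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, 100.0 * n / total) for cat, n in counts.most_common()}


def text_bar_chart(values, symbol="#"):
    """Draw one row per category, with one symbol per case."""
    lines = []
    for category, (count, percent) in frequency_table(values).items():
        lines.append(f"{category:<8} {symbol * count} {count} ({percent:.0f}%)")
    return "\n".join(lines)


# Hypothetical survey responses (a categorical variable).
responses = ["yes", "no", "yes", "maybe", "yes", "no"]
print(text_bar_chart(responses))
```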

Pie chart

  • Represents percentage data as pie slices (angles).
  • Generally less effective than bar charts or error-bar graphs, so usually best avoided.
  • Can be difficult to compare the relative size of similar-sized slices.
  • Can be difficult to label very small slices

English dialects, 1997.

Error-bar graph

  • Shows means with confidence intervals
  • Alternative to bar chart
Error-bar graph showing mean pulse rates and 95% confidence intervals by exercise level.
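
The values plotted in an error-bar graph (a mean and the ends of its confidence interval) can be computed roughly as follows. This is a sketch under stated assumptions: it uses a normal approximation for the interval, whereas small samples would normally call for a t-distribution and a somewhat wider interval, and the pulse rates are invented rather than taken from the figure above.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    m = mean(values)
    standard_error = stdev(values) / sqrt(len(values))
    # Critical value, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return m, m - z * standard_error, m + z * standard_error


# Hypothetical pulse rates (beats per minute) for two exercise levels.
pulse_by_level = {
    "low": [70, 72, 68, 74, 66],
    "high": [88, 92, 85, 95, 90],
}
for level, rates in pulse_by_level.items():
    m, lower, upper = mean_with_ci(rates)
    print(f"{level}: mean={m:.1f}, 95% CI=({lower:.1f}, {upper:.1f})")
```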

Box plot

  • Also known as the box and whisker plot
  • Plots the median, quartiles, range (whiskers) and outliers

Box plot with interquartile range.
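
The summary values behind a box plot can be computed with the standard library. The sketch below assumes one common convention (median and quartiles for the box, whiskers reaching to the most extreme points within 1.5 times the interquartile range, anything beyond that flagged as an outlier); other software may use different quartile or whisker rules. The function name and data are illustrative only.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return quartiles, whisker ends and outliers (1.5 x IQR rule)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Hypothetical scores with one unusually large value.
print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30]))
```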

Stem and leaf plot

  • Displays stem (e.g., 10s) and leaves (e.g., 1s).
  • Each leaf represents a case
  • The exact data values are retained in a visual display shaped like a histogram
-2 | 4
-1 | 2
-0 | 3
 0 | 4 6 6 
 1 | 6
 2 | 4
 3 | 
 4 | 
 5 | 7
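
A display like the one above can be generated from a list of integers. This is a minimal sketch that uses tens as stems and units as leaves, with a separate "-0" stem for negative values between -9 and -1; the raw data behind the display are not given in the source, so the integers below are simply chosen to reproduce it.

```python
def stem_and_leaf(values):
    """Return stem-and-leaf lines for integers (stems = tens, leaves = units)."""
    negative = sorted(v for v in values if v < 0)
    non_negative = sorted(v for v in values if v >= 0)
    lines = []
    if negative:
        # Negative stems run from the largest magnitude down to -0.
        top = abs(negative[0]) // 10
        for stem in range(top, -1, -1):
            leaves = [str(abs(v) % 10) for v in negative if abs(v) // 10 == stem]
            lines.append(f"{'-' + str(stem):>2} | " + " ".join(leaves))
    if non_negative:
        # Empty stems are kept so that gaps in the data stay visible.
        top = non_negative[-1] // 10
        for stem in range(0, top + 1):
            leaves = [str(v % 10) for v in non_negative if v // 10 == stem]
            lines.append(f"{stem:>2} | " + " ".join(leaves))
    return lines


# Illustrative values, assumed for this example.
data = [-24, -12, -3, 4, 6, 6, 16, 24, 57]
print("\n".join(stem_and_leaf(data)))
```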

Histogram

  • Displays the frequency of occurrence of data within intervals (bins) of a single continuous variable.
Heights of 31 Black Cherry trees.
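
The counting step behind a histogram is to assign each value to an interval and tally the intervals. A minimal sketch, assuming equal-width bins that include their lower edge and exclude their upper edge; the heights below are invented and are not the Black Cherry tree data, and the parameter names are hypothetical.

```python
def histogram_counts(values, start, bin_width, n_bins):
    """Count values in n_bins equal-width bins beginning at start.

    Bin i covers start + i*bin_width up to, but not including,
    start + (i+1)*bin_width. Values outside all bins are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical heights, binned in steps of 5 from 60 to 90.
heights = [63, 65, 70, 72, 74, 75, 76, 77, 80, 81, 85]
for i, count in enumerate(histogram_counts(heights, 60, 5, 6)):
    lower = 60 + i * 5
    print(f"{lower}-{lower + 5}: {'*' * count}")
```

The choice of bin width changes the apparent shape of the distribution, so it is worth trying more than one.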

Bivariate

Graphs of the relation between two variables.

Clustered bar-graph

  • Clustered bar-graphs are used to include an additional independent variable (e.g., gender). The separate groups are represented by different coloured bars.

Column chart example (Adobe Flex).
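
The numbers behind a clustered bar-graph are counts (or another statistic) for every combination of two categorical variables. The sketch below tabulates such counts; the records, the variable names and the function name are assumptions made for the example.

```python
from collections import Counter


def clustered_counts(records):
    """Count cases for each (category, group) pair.

    Returns {category: {group: count}}, so each category is one
    cluster and each group is one bar within the cluster.
    """
    counts = Counter(records)
    categories = sorted({category for category, _ in records})
    groups = sorted({group for _, group in records})
    return {c: {g: counts[(c, g)] for g in groups} for c in categories}


# Hypothetical cases: (exercise level, gender).
records = [
    ("low", "female"), ("low", "male"), ("low", "male"),
    ("high", "female"), ("high", "female"),
]
for category, bars in clustered_counts(records).items():
    print(category, bars)
```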

Scatterplot

  • Plots the relation between two variables on continuous x- and y-axes

Okun's law: quarterly differences.
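
The placement of points in a scatterplot comes down to rescaling each (x, y) pair onto the plotting area. The following is a rough plain-text sketch of that idea, not a substitute for a plotting package; it assumes both variables take more than one distinct value, and the grid size and data are arbitrary.

```python
def text_scatter(xs, ys, width=20, height=10):
    """Return rows of a character grid with one '*' per (x, y) pair."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Rescale each value to a column or row index.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        # Row 0 is printed first, so flip it to put larger y values on top.
        grid[height - 1 - row][col] = "*"
    return ["".join(line) for line in grid]


# Hypothetical paired measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
print("\n".join(text_scatter(xs, ys)))
```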

See also

External links