Data analysis/Methodologies

From Wikiversity

This learning resource provides an overview of available methodologies for data analysis.

Objective

After working through this learning resource, you should be able to select an appropriate methodology for your research design or your application scenario.

Loss of Information and lossless Storage

Identify which loss of data is acceptable in your application scenario without losing the quality of the information you want to extract from the data.

  • Some application scenarios of data analysis improve recognition by removing information from the raw data, especially when it is possible to remove noise from the raw data.
  • Keep in mind that removing information can introduce a bias into the data analysis.
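
The trade-off described in the two points above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended method: the function `moving_average`, the window size and the sample values in `raw` are all assumptions made up for this example. A centered moving average reduces the small fluctuations, but it also flattens a short event that may have been real, which is one way removing information can bias an analysis.

```python
def moving_average(values, window=3):
    """Smooth a sequence with a centered moving average.

    At the edges the window is shortened, so the output has the
    same length as the input.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        neighbourhood = values[lo:hi]
        smoothed.append(sum(neighbourhood) / len(neighbourhood))
    return smoothed


# Hypothetical raw measurements: a signal around 10 with small noise
# and one short event (the value 16 at index 4).
raw = [10, 11, 9, 10, 16, 10, 9, 11, 10]
smoothed = moving_average(raw, window=3)

print(smoothed)
print("raw peak:", max(raw), "smoothed peak:", max(smoothed))
# raw peak: 16 smoothed peak: 12.0
```

Whether the reduced peak is an improvement (noise removed) or a distortion (event understated) depends on the application scenario, which is why the acceptable loss has to be identified first.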


Visual Example of Loss of Information

The following example shows a pencil drawing of an eye.

Drawing of an eye on a sketchpad.

Compare the drawing above with the image below, to which a smoothing filter was applied with the open-source software GIMP.

The raw image of the eye above after a smoothing filter was applied in the open-source software GIMP.

Information is lost when the image is processed in the image processing software: from the smoothed image we are not able to reconstruct the exact lines of the source image. A filter of the kind applied in GIMP removes information.

  • Characterize what type of information that was encoded in the original black-and-white image remains in the second image.
  • Apply this in general to other types of information, e.g. geographical information, text-based information or data sets in your context of data analysis.
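
Why the exact lines cannot be reconstructed can be sketched with a deliberately simplified filter. The code below is not GIMP's algorithm; `block_average` is a hypothetical stand-in that replaces each 2x2 tile of a grayscale image by its mean, and the two tiny "drawings" are invented for the illustration. Two different inputs produce the same filtered output, so the output alone cannot tell us which drawing it came from.

```python
def block_average(image, block=2):
    """Replace each block x block tile of a grayscale image by its mean.

    The image is a list of rows; each row is a list of pixel values.
    """
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            tile = [image[i][j]
                    for i in range(r, min(r + block, rows))
                    for j in range(c, min(c + block, cols))]
            out_row.append(sum(tile) / len(tile))
        result.append(out_row)
    return result


# Two hypothetical 2x4 drawings (0 = black line, 255 = white paper).
# The line runs through different pixels in each drawing.
drawing_a = [[0, 255, 255, 255],
             [255, 0, 255, 255]]
drawing_b = [[255, 0, 255, 255],
             [0, 255, 255, 255]]

filtered_a = block_average(drawing_a)
filtered_b = block_average(drawing_b)

print(filtered_a)  # [[127.5, 255.0]]
print(drawing_a != drawing_b and filtered_a == filtered_b)  # True
```

What remains after filtering is coarse information (a dark region on the left, white paper on the right), while the direction of the line is gone. The same question, which properties survive and which are lost, can be asked of any aggregation step applied to geographical, text-based or tabular data.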


Data preparation

The following methodologies can be used for data preparation as one of the first steps after collecting the data in your project.