Drawing Quick Plots With ggplot2
A picture is worth a thousand words: a complex idea can be expressed as an image, letting the reader absorb large volumes of data visually. ggplot2, a popular R package, draws meaningful graphs from the available data and requires minimal skill to use. It is designed to represent data accurately without making you worry about the complexities of the graph.
The examples here use datasets built into R.
Get Started Creating
ggplot2 is a plotting system in R that uses the grammar of graphics. It provides two important commands:
- qplot(): a quick plot.
- ggplot(): gives finer control over the details of the graph and supports layered graphs.
Let’s review how ggplot2 basically works, starting with an in-depth look at the inbuilt data available in R.
[1] "height" "weight"
This will display all the column names for the data. With the use of the head command, we can view the data as follows:
  height weight
1     58    115
2     59    117
3     60    120
4     61    123
5     62    126
6     63    129
The data set “women” has two columns: height and weight.
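The inspection steps above can be reproduced with two base R calls. This is a minimal sketch: names() and head() are assumed to be the commands behind the output shown, since the original commands are not reproduced in the text.

```r
# "women" ships with base R in the datasets package, so no install is needed.
cols <- names(women)        # column names of the data frame
print(cols)

first_rows <- head(women)   # head() returns the first six rows by default
print(first_rows)
```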
Let’s now pass these parameters into qplot():
The resulting graph plots height against weight. By default, the data is drawn as points.
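A minimal sketch of that first call, assuming ggplot2 is installed. Recent ggplot2 releases mark qplot() as deprecated, so it may print a warning, but it should still draw the plot.

```r
library(ggplot2)

# x = height, y = weight. With both x and y given, qplot() draws points by default.
p_points <- qplot(height, weight, data = women)
print(p_points)
```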
Let’s plot the line graph below by issuing the command:
In ggplot2 the geometric object used to represent the data is called a “geom.”
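The line graph mentioned above can be sketched as follows, assuming the call only differs from the first plot by its geom argument.

```r
library(ggplot2)

# geom = "line" joins the observations with a line instead of drawing points.
p_line <- qplot(height, weight, data = women, geom = "line")
print(p_line)
```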
If we want both representations, points as well as lines, both must be passed to geom, the argument that selects the geometry.
Recall: In R, c() combines several values into a vector.
We are passing two options to geom, so we have to wrap them in c().
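Here is a short sketch of passing two geoms at once. It uses the same form as the Orange example later in this post, applied to the women data.

```r
library(ggplot2)

# A character vector of geoms adds one layer per entry: points first, then lines.
p_both <- qplot(height, weight, data = women, geom = c("point", "line"))
print(p_both)
```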
To change the color of the graph, say from black to red, we have to issue:
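One way to do this is sketched below. Wrapping the color in I() is an assumption about how the call was written. Without I(), qplot() treats "red" as data to map to a color scale rather than as a literal color.

```r
library(ggplot2)

# I("red") passes the color through as-is instead of mapping it to a scale.
p_red <- qplot(height, weight, data = women, color = I("red"))
print(p_red)
```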
That is not the only use of the color option: it also accepts a column of the data.
Passing height to color draws the points along a color gradient by height, and a legend is added beside the graph automatically.
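A minimal sketch of the gradient plot, assuming the column passed to color is height, as the description suggests.

```r
library(ggplot2)

# Mapping a numeric column to color gives a continuous gradient and a legend.
p_gradient <- qplot(height, weight, data = women, color = height)
print(p_gradient)
```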
By adding more options, more legends are included, as indicated below:
Here, there are two legends: one for height, another for weight. Notice that the size of the points changes as the weight increases.
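The two-legend plot can be sketched like this. The size option is an assumption inferred from the description of point size growing with weight.

```r
library(ggplot2)

# color follows height and size follows weight, so each mapping gets its own legend.
p_two_legends <- qplot(height, weight, data = women,
                       color = height, size = weight)
print(p_two_legends)
```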
More Datasets for More Plots
Let’s take a closer look at the data set “iris”, whose column names are given below.
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
Earlier, we used only two columns. Now let’s see the effect of passing a third column, Species, to color:
> qplot(Sepal.Length, Petal.Length, data = iris, geom = "point", color = Species)
Let’s now find the different species available under the iris data set:
[1] "setosa" "versicolor" "virginica"
Specifying “color = Species” draws each species in its own color and adds a legend.
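The two listings above can be reproduced with base R. This is a sketch: names() and levels() are assumed to be the commands used, since the original commands are not shown.

```r
# "iris" also ships with base R.
iris_cols <- names(iris)          # the five column names
print(iris_cols)

# Species is a factor, so levels() lists its distinct values.
species <- levels(iris$Species)
print(species)
```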
More Options to Use
There are many options for a geom. Some work with a single column of data, others with two columns.
Recall: To plot data on a graph we require two coordinates (x,y).
- When only a single column is given, its values are plotted against their frequency (count).
- Two columns represent (x,y) data.
For example, passing only Sepal.Length with a histogram geom plots Sepal.Length against count.
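A minimal sketch of the single-column case, assuming the histogram geom is selected by name.

```r
library(ggplot2)

# Only x is supplied; the histogram geom bins Sepal.Length and counts each bin.
# ggplot2 may print a message about the default number of bins.
p_hist <- qplot(Sepal.Length, data = iris, geom = "histogram")
print(p_hist)
```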
There are many more interesting options. For the histogram of the single column Sepal.Length, the option “fill=Species” colors each bar by species, adding more information to the graph.
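The filled histogram can be sketched by adding the fill option to the same call.

```r
library(ggplot2)

# fill = Species colors the bars by species and adds a legend.
p_hist_fill <- qplot(Sepal.Length, data = iris, geom = "histogram",
                     fill = Species)
print(p_hist_fill)
```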
Let’s explore another data set called Orange. Take a look at the different column names it holds:
To see the data it holds, simply type the below command:
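As with the earlier data sets, a couple of base R calls are enough to inspect Orange. This is a sketch: the exact commands in the original are not shown, so names() and head() are assumed.

```r
# "Orange" ships with base R and records the growth of orange trees.
orange_cols <- names(Orange)
print(orange_cols)

# Show the first few rows rather than the whole data set.
print(head(Orange))
```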
Now let us plot age versus circumference from the Orange data set, colored by Tree:
> qplot(age, circumference, data = Orange, geom = c("point", "line"), color = Tree)
Other Useful Options With qplot
| xlim | Limits for X axis |
| ylim | Limits for Y axis |
| main | Main title for the graph |
| xlab | Label for X axis |
| ylab | Label for Y axis |
For further options, visit the following URLs: http://www.cookbook-r.com/Graphs/ and http://docs.ggplot2.org/current/
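The options in the table can be combined in a single call. The sketch below uses the women data; the limits, title and axis labels are illustrative values chosen for this example and do not come from the original article.

```r
library(ggplot2)

# Axis limits are given as c(lower, upper); titles and labels are plain strings.
p_labeled <- qplot(height, weight, data = women,
                   xlim = c(55, 75), ylim = c(110, 170),
                   main = "Height versus weight",
                   xlab = "Height (in)", ylab = "Weight (lbs)")
print(p_labeled)
```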
Storing the Graphics
After some trial and error you will be able to obtain good graphics. R also provides a way to store graphics:
- Create a file where graphical output can be redirected.
- Type the command for the graphics.
- Close the file and return output to the default device by issuing graphics.off().
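The three steps above can be sketched as follows. The choice of pdf() as the file device and the file name are assumptions made for this example; other devices such as png() follow the same pattern.

```r
library(ggplot2)

# Step 1: open a file device (hypothetical file name, written to a temp directory).
out_file <- file.path(tempdir(), "women_plot.pdf")
pdf(out_file)

# Step 2: draw the plot; print() makes sure it is rendered inside scripts too.
print(qplot(height, weight, data = women))

# Step 3: close all open graphics devices, which finishes writing the file.
graphics.off()
```

After this runs, the plot is stored at the path held in out_file.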
In this tutorial we have just witnessed the beginning of the world of graphics for data analysis with ggplot2.
In the next post we will take a look at how the ggplot() command helps us plot complex graphs.
The code for the exercises can be found here.
Manjusha Joshi is a freelancer of free, open source software for scientific computing. She is a mathematician and member of the Pune Linux user group.
Featured image via Flickr Creative Commons.