Plotting Graphs in R

R is a statistical programming language used for data analysis. Because it is open source, it is rapidly gaining popularity. Here I would like to showcase some of the basic graphs that can be constructed in R.

I have used a sample CSV file with bank-related data for this purpose.

  • Histogram
setwd("C:\\myR")                         # Set the working directory that holds the CSV file
bank1 <- read.csv("bank.csv", sep=";")   # Read the semicolon-separated CSV file
banko <- hist(bank1$age)                 # Plot the histogram and keep the returned object

We can reduce the number of bins, grouping neighbouring frequencies together, to get a broader picture of the distribution as a whole.

banko$breaks # breaks gives the bin boundaries (18 boundaries for 17 bins)
##  [1]  15  20  25  30  35  40  45  50  55  60  65  70  75  80  85  90  95
## [18] 100
banko$counts # counts gives the frequency in each bin
##  [1]  140 1526 5717 9130 7255 5589 4651 3598 2672  291  197  177  126   76
## [15]   33    8    2
banko <- hist(bank1$age, breaks=10, main="Age of Acc_Holders",
              xlab="Age", ylab="Number",
              col="lightgreen", xlim=c(0,100), ylim=c(0,16500))

breaks=10 is only a suggestion. R chooses round ("pretty") break points internally, so the resulting number of bins is close to 10 but not necessarily exactly 10.
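
To see the difference between a suggested and an exact number of bins, it can help to compare a single number with an explicit vector of break points. This is a minimal sketch on simulated ages: the vector ages is a made-up stand-in for bank1$age, not the real data.

```r
# Simulated ages: a hypothetical stand-in for bank1$age
set.seed(42)
ages <- sample(17:98, 1000, replace = TRUE)

# A single number is treated as a suggestion; hist() picks "pretty" break points
h_suggested <- hist(ages, breaks = 10, plot = FALSE)
h_suggested$breaks

# A vector of break points is used exactly as given: 18 boundaries, 17 bins
h_exact <- hist(ages, breaks = seq(15, 100, by = 5), plot = FALSE)
length(h_exact$counts)
```

Passing a vector is the way to go when the bins need to line up with specific age bands.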

It may be a good idea to analyze graphs based on density rather than frequency, because the frequency in each bin changes with the number of bins selected.
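
The effect of bin width on frequency can be sketched as follows. The ages are again simulated (a hypothetical stand-in for bank1$age); the point is only that counts grow with bin width, while densities are scaled so that the total bar area stays at 1.

```r
# Simulated ages: a hypothetical stand-in for bank1$age
set.seed(42)
ages <- sample(17:98, 1000, replace = TRUE)

# The same data binned with 10-year and 5-year bins
h_wide   <- hist(ages, breaks = seq(10, 100, by = 10), plot = FALSE)
h_narrow <- hist(ages, breaks = seq(15, 100, by = 5),  plot = FALSE)

# Counts depend on bin width: each wide bin holds at least as many
# observations as either of the two narrow bins it covers
max(h_wide$counts)
max(h_narrow$counts)

# Densities are scaled so that the bar areas sum to 1 in both cases
area_wide   <- sum(h_wide$density   * diff(h_wide$breaks))
area_narrow <- sum(h_narrow$density * diff(h_narrow$breaks))

# freq = FALSE puts density instead of frequency on the y-axis
hist(ages, breaks = seq(15, 100, by = 5), freq = FALSE,
     main = "Density histogram (simulated ages)", xlab = "Age")
```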

  • Density Plot
d <- density(bank1$age)
plot(d,main="Density plot on Ages")
polygon(d,col="blue",border="Red")

  • Bar Plots

counts <- table(bank1$job)
max(counts)
## [1] 10422
par(mar=c(6.4,4.1,4.1,2.1)) # set the margins
barplot(counts, main="Job Distribution", col="darkblue",
        border="orange", las=2, ylim=c(0,10500))

The margin size can be changed; in this case I have increased the bottom margin to accommodate the x-axis labels. The values in mar are given in the order bottom, left, top, right.

Also, las=2 draws the axis labels perpendicular to the axis, so the x-axis labels appear vertically. By default (las=0) they are drawn parallel to the axis, which is horizontal for the x-axis.
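
Since par() changes stay in effect for later plots, it may be worth saving and restoring the old settings. A minimal sketch, using made-up counts with long category names (hypothetical values, for illustration only):

```r
# Made-up counts with long category names (hypothetical, for illustration only)
counts_demo <- c("blue-collar" = 30, "self-employed" = 12, "entrepreneur" = 8)

# mar is c(bottom, left, top, right), measured in lines of text;
# par() returns the previous settings so they can be restored afterwards
old_par <- par(mar = c(6.4, 4.1, 4.1, 2.1))

# las = 2 draws the labels perpendicular to the axis
barplot(counts_demo, main = "Job Distribution (made-up data)", las = 2)

# Restore the previous margins
par(old_par)
```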

  • Horizontal Bar Plots
par(mar=c(4.1,6.4,4.1,2.1))
barplot(counts, main="Job Distribution", col="darkblue",
        border="orange", horiz=TRUE, las=2, xlim=c(0,10500))

Here we increased the left margin (the second value in mar) to accommodate the y-axis labels.

  • Stacked Bar Plots
counts <- table(bank1$loan, bank1$job)
par(mar=c(5.4,6.2,2,2.1))
barplot(counts, main="Acc Holder Distribution by Job and Loan",
        xlab="job type", col=c("blue","red","yellow"),
        las=2, cex.names=0.8)

  • Grouped Bar Plots
counts <- table(bank1$loan, bank1$marital)

counts
##          
##           divorced married single unknown
##   no          3816   20567   9500      67
##   unknown      121     588    280       1
##   yes          675    3773   1788      12
barplot(counts, main="Marital status and loan taken",
        xlab="Marital status", col=c("darkblue","red","yellow"), beside=TRUE)
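
With three colours per group, the plot is hard to read without a legend. One possible way to add one is the legend.text argument of barplot(). The sketch below rebuilds the counts from the table printed above so that it runs without the CSV file; the legend position and title are my own choices.

```r
# Counts copied from the table printed above (loan status by marital status)
counts <- matrix(c(3816, 20567, 9500, 67,
                    121,   588,  280,  1,
                    675,  3773, 1788, 12),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("no", "unknown", "yes"),
                                 c("divorced", "married", "single", "unknown")))

# legend.text = TRUE builds a legend from the row names of the matrix;
# args.legend passes extra arguments on to legend()
mids <- barplot(counts, main = "Marital status and loan taken",
                xlab = "Marital status",
                col = c("darkblue", "red", "yellow"), beside = TRUE,
                legend.text = TRUE,
                args.legend = list(x = "topright", title = "Loan"))
```

barplot() also returns the bar midpoints (here one row per loan status and one column per marital status), which can be useful for adding text labels later.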