Plotting Graphs in R
Avadhoot
R is a statistical language used for data analysis. Because it is open source, it is rapidly gaining popularity. Here I would like to showcase some of the basic graphs that can be constructed in R.
I have used a sample CSV file with bank-related data for the purpose.
- Histogram
setwd("C:\\myR") # set the working directory that holds the CSV file
bank1 <- read.csv("bank.csv",sep=";") # Read the CSV file
banko<-hist(bank1$age) # Plot the Histogram
The object returned by hist() stores the bins it used. We can reduce the number of bins, clubbing frequencies together, to get a broader picture of the distribution as a whole.
banko$breaks # breaks gives the boundaries (start and end values) of the bins
## [1] 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95
## [18] 100
banko$counts # counts gives the frequency in each bin
## [1] 140 1526 5717 9130 7255 5589 4651 3598 2672 291 197 177 126 76
## [15] 33 8 2
banko <- hist(bank1$age, breaks=10, main="Age of Acc_Holders",
              xlab="Age", ylab="Number",
              col="lightgreen", xlim=c(0,100), ylim=c(0,16500))
breaks=10 asks for approximately 10 bins. The number is only a suggestion: R's internal algorithm rounds the break points to "pretty" values, so the actual number of bins may differ.
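To see that a single number passed to breaks is only a suggestion, here is a minimal sketch comparing it with an explicit vector of break points. The `ages` vector is simulated stand-in data (an assumption, not the bank file), so the snippet runs on its own.

```r
# Simulated stand-in for bank1$age (assumption: not the real data)
set.seed(42)
ages <- round(runif(1000, min = 18, max = 95))

# A single number is a suggestion; R rounds to "pretty" break points
h_suggested <- hist(ages, breaks = 10, plot = FALSE)
length(h_suggested$counts)   # number of bins actually used

# A vector of break points is used exactly as given
h_exact <- hist(ages, breaks = seq(10, 100, by = 10), plot = FALSE)
h_exact$breaks
```

Passing a vector is the way to go when the bin edges themselves matter, for example fixed age bands.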
It may be a good idea to analyze graphs based on density rather than frequency, because the frequency in each bar changes with the number of bins selected, while on the density scale the total area of the bars is always 1.
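As a rough illustration of the density scale, the sketch below draws the same data with two bin widths using freq = FALSE. The `ages` vector is again simulated stand-in data (an assumption), not the bank file.

```r
# Simulated stand-in for bank1$age (assumption: not the real data)
set.seed(42)
ages <- round(runif(1000, min = 18, max = 95))

# freq = FALSE puts density, not counts, on the y-axis
h_wide   <- hist(ages, breaks = seq(10, 100, by = 10), freq = FALSE)
h_narrow <- hist(ages, breaks = seq(10, 100, by = 5),  freq = FALSE)

# Bar areas (density * bin width) sum to 1 whatever the number of bins
sum(h_wide$density   * diff(h_wide$breaks))
sum(h_narrow$density * diff(h_narrow$breaks))
```

The counts in each bar differ between the two histograms, but the density scale stays comparable.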
- Density Plot
d <- density(bank1$age)
plot(d,main="Density plot on Ages")
polygon(d,col="blue",border="Red")
- Bar Plots
counts <- table(bank1$job)
max(counts)
## [1] 10422
par(mar=c(6.4,4.1,4.1,2.1)) # set the margins
barplot(counts, main="Job Distribution", col="darkblue",
        border="orange", las=2, ylim=c(0,10500))
The margin size can be changed; in this case I have increased the bottom margin to accommodate the x-axis labels.
Also, las=2 draws the axis labels perpendicular to the axis, so the x-axis labels appear vertically. The default is parallel to the axis, which is horizontal on the x-axis.
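Since par() changes stay in effect for later plots, it can help to save the old settings and restore them afterwards. A minimal sketch follows; the `counts` vector and its job labels are hypothetical stand-ins for table(bank1$job).

```r
# Hypothetical counts standing in for table(bank1$job)
counts <- c(administrator = 120, "blue-collar" = 95, entrepreneur = 30)

# mar is c(bottom, left, top, right), measured in lines of text
old_par <- par(mar = c(6.4, 4.1, 4.1, 2.1))  # par() returns the old values
barplot(counts, main = "Job Distribution", las = 2)
par(old_par)                                 # restore the previous margins
```

After par(old_par), the next plot is drawn with the margins that were in effect before the change.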
- Horizontal Bar Plots
par(mar=c(4.1,6.4,4.1,2.1))
barplot(counts, main="Job Distribution", col="darkblue",
        border="orange", horiz=TRUE, las=2, xlim=c(0,10500))
Here we increased the left margin (the second value in mar) to accommodate the y-axis labels.
- Stacked Bar Plots
counts <- table(bank1$loan, bank1$job)
par(mar=c(5.4,6.2,2,2.1))
barplot(counts, main="Acc Holder Distribution by Job and Loan", xlab="job type",
        col=c("blue","red","yellow"), las=2, cex.names=0.8)
- Grouped Bar Plots
counts <- table(bank1$loan, bank1$marital)
counts
##
##           divorced married single unknown
##   no          3816   20567   9500      67
##   unknown      121     588    280       1
##   yes          675    3773   1788      12
barplot(counts, main="Marital status and loan taken",
        xlab="Marital status", col=c("darkblue","red","yellow"), beside=TRUE)
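Without a legend, the three colours in each group cannot be matched to the loan categories. One possible way to add one is sketched below, using the counts copied from the table printed above so the snippet runs without the CSV file; the legend title and position are my own choices.

```r
# Counts copied from the table printed above (rows: loan, columns: marital)
counts <- matrix(
  c(3816, 20567, 9500, 67,
     121,   588,  280,  1,
     675,  3773, 1788, 12),
  nrow = 3, byrow = TRUE,
  dimnames = list(loan = c("no", "unknown", "yes"),
                  marital = c("divorced", "married", "single", "unknown"))
)

# legend.text labels the colours; args.legend is passed on to legend()
barplot(counts, main = "Marital status and loan taken",
        xlab = "Marital status",
        col = c("darkblue", "red", "yellow"), beside = TRUE,
        legend.text = rownames(counts),
        args.legend = list(title = "Loan", x = "topright"))
```

The same legend.text argument works for the stacked bar plot, since only beside=TRUE distinguishes the two.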