Plotting Graphs in R
Avadhoot
R is a statistical language used for data analysis. Because it is open source, it is rapidly gaining popularity. I would like to showcase some of the basic graphs that can be constructed in R.
I have used a sample CSV file with bank-related data for this purpose.
- Histogram
setwd("C:\\myR") # Set the working directory that contains the CSV file
bank1 <- read.csv("bank.csv", sep=";") # Read the semicolon-separated CSV file
banko <- hist(bank1$age) # Plot the histogram and keep the returned object
We can reduce the number of bins, merging neighbouring frequencies, to get a broader picture of the distribution as a whole.
banko$breaks # breaks gives the bin boundaries: the start of the first bin through the end of the last
## [1] 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95
## [18] 100
banko$counts # counts gives the frequency in each bin
## [1] 140 1526 5717 9130 7255 5589 4651 3598 2672 291 197 177 126 76
## [15] 33 8 2
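The relationship between the two vectors can be checked directly. The sketch below is only an illustration: it uses simulated ages (an assumption, since bank.csv is not reproduced here) and the names `ages`, `h` and `bins` are made up for the example.

```r
# Simulated ages stand in for bank1$age (assumption: the real data is not shown here)
set.seed(42)
ages <- sample(17:98, size = 1000, replace = TRUE)

# plot = FALSE returns the histogram object without drawing it
h <- hist(ages, plot = FALSE)

# There is always one more break than there are bins
length(h$breaks) == length(h$counts) + 1

# Every observation falls into exactly one bin
sum(h$counts) == length(ages)

# Pair each count with the interval it belongs to
bins <- data.frame(from  = head(h$breaks, -1),
                   to    = tail(h$breaks, -1),
                   count = h$counts)
bins
```

The same check holds for the output above: 18 breaks describe 17 bins.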
banko <- hist(bank1$age, breaks=10, main="Age of Acc_Holders",
              xlab="Age", ylab="Number",
              col="lightgreen", xlim=c(0,100), ylim=c(0,16500))
breaks=10 asks for roughly 10 bins. The number is only a suggestion: R picks "pretty" breakpoints with an internal algorithm, so the actual number of bins may differ.
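To see the difference between a suggested bin count and exact breakpoints, here is a minimal sketch. It again assumes simulated ages in place of bank1$age, and `h10` and `h_exact` are names invented for the example.

```r
# Simulated ages stand in for bank1$age (assumption)
set.seed(42)
ages <- sample(17:98, size = 1000, replace = TRUE)

# A single number is treated as a hint; R rounds to "pretty" breakpoints
h10 <- hist(ages, breaks = 10, plot = FALSE)
length(h10$counts)   # number of bins actually used, may differ from 10

# A vector of breakpoints gives exact control over the bins
h_exact <- hist(ages, breaks = seq(10, 100, by = 10), plot = FALSE)
length(h_exact$counts)   # 9 bins: 10-20, 20-30, ..., 90-100
```

When the bin edges matter, for example fixed age bands, passing the breakpoints explicitly is probably the safer choice.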
It may be a good idea to analyze graphs based on density rather than frequency, because the frequencies change with the number of bins selected.
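A small sketch of why density is more stable than frequency, assuming simulated ages in place of bank1$age (the names `h_wide` and `h_narrow` are made up for the example): the counts per bin shrink as bins get narrower, while the total area of the density bars stays at 1.

```r
# Simulated ages stand in for bank1$age (assumption)
set.seed(42)
ages <- sample(17:98, size = 1000, replace = TRUE)

h_wide   <- hist(ages, breaks = seq(10, 100, by = 10), plot = FALSE)
h_narrow <- hist(ages, breaks = seq(10, 100, by = 5),  plot = FALSE)

# Bar heights measured as counts depend on the bin width
max(h_wide$counts)
max(h_narrow$counts)

# Densities are scaled so that the total bar area is 1 for any binning
area_wide   <- sum(h_wide$density   * diff(h_wide$breaks))
area_narrow <- sum(h_narrow$density * diff(h_narrow$breaks))

# freq = FALSE draws the histogram on the density scale
hist(ages, breaks = seq(10, 100, by = 5), freq = FALSE,
     main = "Age (density scale)", xlab = "Age")
```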
- Density Plot
d <- density(bank1$age)
plot(d,main="Density plot on Ages")
polygon(d,col="blue",border="Red")
- Bar Plots
counts <- table(bank1$job)
max(counts)
## [1] 10422
par(mar=c(6.4,4.1,4.1,2.1)) # set the margins
barplot(counts, main="Job Distribution", col="darkblue",
        border="orange", las=2, ylim=c(0,10500))
The margin size can be changed; in this case I have increased the bottom margin to accommodate the x-axis labels.
Also, las=2 draws the labels perpendicular to the axis, so the x-axis labels appear vertically. By default they are parallel to the axis, which is horizontal here.
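Since par() changes stay in effect for later plots, it can help to save and restore the old settings. The following is a sketch with hypothetical job counts standing in for table(bank1$job); `old_par` is a name chosen for the example.

```r
# Hypothetical job counts stand in for table(bank1$job) (assumption)
counts <- c(admin = 120, "blue-collar" = 95, technician = 70, services = 40)

# mar is given as c(bottom, left, top, right), in lines of text.
# par() returns the previous settings, so they can be restored afterwards.
old_par <- par(mar = c(6.4, 4.1, 4.1, 2.1))

# las = 2 turns the labels perpendicular to the axis
barplot(counts, main = "Job Distribution", col = "darkblue",
        border = "orange", las = 2)

# Put the margins back so later plots are not affected
par(old_par)
```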
- Horizontal Bar Plots
par(mar=c(4.1,6.4,4.1,2.1))
barplot(counts, main="Job Distribution", col="darkblue",
        border="orange", horiz=TRUE, las=2, xlim=c(0,10500))
Here we increased the left margin (the second value of mar) to accommodate the y-axis labels.
- Stacked Bar Plots
counts <- table(bank1$loan, bank1$job)
par(mar=c(5.4,6.2,2,2.1))
barplot(counts, main="Acc Holder Distribution by Job and Loan",
        xlab="job type", col=c("blue","red","yellow"),
        las=2, cex.names=0.8)
- Grouped Bar Plots
counts <- table(bank1$loan, bank1$marital)
counts
##
##           divorced married single unknown
##   no          3816   20567   9500      67
##   unknown      121     588    280       1
##   yes          675    3773   1788      12
barplot(counts, main="marital status and taken loan",
xlab="Marital status",col=c("darkblue","red","yellow"),beside=TRUE)
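Without a legend it is hard to tell which colour belongs to which loan status. One possible way to add one is sketched below. The matrix is rebuilt by hand from the table printed above so that the example runs without bank.csv; the legend title "Loan" and its position are choices made for this example.

```r
# Rebuild the printed loan x marital table by hand (values copied from the output above)
counts <- matrix(c(3816, 20567, 9500, 67,
                    121,   588,  280,  1,
                    675,  3773, 1788, 12),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(loan    = c("no", "unknown", "yes"),
                                 marital = c("divorced", "married", "single", "unknown")))

# legend.text labels each colour; args.legend is passed on to legend()
barplot(counts, main = "Marital status and loan",
        xlab = "Marital status",
        col = c("darkblue", "red", "yellow"), beside = TRUE,
        legend.text = rownames(counts),
        args.legend = list(title = "Loan", x = "topright"))
```

The same two arguments should also work for the stacked version, where beside is left at its default.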