Visualizing Principal Components for Images

[This article was first published on R – Hi! I am Nagdev, and kindly contributed to R-bloggers.]

Principal Component Analysis (PCA) is a great tool for data analysis projects for many reasons. If you have never heard of PCA, in simple words it applies a linear transformation to your features based on their covariance or correlation matrix. I will add a few links below if you want to know more about it. Applications of PCA include dimensionality reduction, feature analysis, data compression, anomaly detection, clustering and many more. The first time I learnt about PCA, I found it confusing and hard to understand. But as I read about its applications in research papers, I got curious and tried them all out. Now I use it as a pre-processing step in most of my projects.
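
To make "a linear transformation based on covariance" concrete, here is a minimal sketch on made-up data. The matrix `X` and its size are assumptions for illustration only and are not part of the image example. It checks that the component variances reported by `prcomp` match the eigenvalues of the covariance matrix, and that the scores are simply the centered data multiplied by the rotation matrix.

```r
set.seed(42)
# Toy data: 100 observations of 3 features, two of them correlated (illustrative only)
X <- matrix(rnorm(300), nrow = 100, ncol = 3)
X[, 2] <- X[, 1] + 0.5 * X[, 2]

pca <- prcomp(X)       # centers the columns by default, does not scale them
eig <- eigen(cov(X))   # eigen-decomposition of the covariance matrix

# Component variances equal the eigenvalues of the covariance matrix
all.equal(pca$sdev^2, eig$values)

# Scores are a linear transformation of the centered data
X_centered <- sweep(X, 2, colMeans(X))
all.equal(pca$x, X_centered %*% pca$rotation, check.attributes = FALSE)
```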

I recently added this topic to my data science curriculum, as PCA has become relevant in data science today. The first time I taught this to my students, 90% of the class had a blank look on their faces. Honestly, it was my own reflection. So I leaned towards demonstrative teaching rather than using slides and talking for an hour, which made the topic a lot easier to understand. I thought I would share this example on my blog to help those in need.

For this example we will use the grey scale image shown below. I will also try to keep the R code in this example as minimal as possible.

Step 1: Image processing

Load the imager library, load the image, and convert the image data into a row x column matrix.

Next, we will visualize our image using the image function. A post on Stack Overflow helped me use the image function the right way.

library(imager)

# load the image and look at the image properties
# (named img so it does not mask the base image() function)
img = load.image("/cloud/project/bwimage.JPG")
img
# Image. Width: 282 pix Height: 220 pix Depth: 1 Colour channels: 3

# convert image data to data frame
image_df = as.data.frame(img)

head(image_df)
#   x y cc     value
# 1 1 1  1 0.9372549
# 2 2 1  1 0.9254902
# 3 3 1  1 0.9254902
# 4 4 1  1 0.9294118
# 5 5 1  1 0.9372549
# 6 6 1  1 0.9372549

# the file has 3 colour channels; keep only the first one so the
# number of values matches the 220 x 282 grid
image_df = image_df[image_df$cc == 1, ]

# convert image into a row (y) by column (x) grid using matrix function
image_mat = matrix(image_df$value, nrow = 220, ncol = 282, byrow = TRUE)

# visualize the image (rotated so that row 1 is drawn at the top)
image(t(apply(image_mat, 2, rev)), col=grey(seq(0,1,length=256)))
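
The `byrow = TRUE` argument and the `t(apply(..., 2, rev))` idiom are easy to get wrong, so here is a small sketch on a made-up 3 x 4 "image". It assumes, as the `head(image_df)` output suggests, that the data frame lists pixels with x varying fastest. The data frame `toy_df` and its values are hypothetical.

```r
# Toy long-format "image": 4 pixels wide (x), 3 pixels high (y)
toy_df <- expand.grid(x = 1:4, y = 1:3)    # x varies fastest, like image_df
toy_df$value <- 10 * toy_df$y + toy_df$x   # each value encodes its own position

# Filling by row puts y on the rows and x on the columns
toy_mat <- matrix(toy_df$value, nrow = 3, ncol = 4, byrow = TRUE)
toy_mat[2, 3]   # pixel at y = 2, x = 3, so 23

# image() draws the first matrix index along the x-axis and starts at the
# bottom left, so the matrix is rotated to keep row 1 at the top of the plot
rotated <- t(apply(toy_mat, 2, rev))
dim(rotated)    # 4 3: first index is x, second index counts up from the bottom row
rotated[1, 3]   # top-left pixel of the original (y = 1, x = 1), so 11
```

Without `byrow = TRUE` the same values would be filled down the columns and the picture would come out scrambled rather than just rotated.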

Step 2: PCA analysis

The next step is to pass the matrix to the principal component function, prcomp. Scaling is very important for PCA, but every column of a grey scale image is already on the same 0 to 1 intensity scale, so I have not scaled the data, to keep things simple (prcomp still centers each column by default). We then plot the variance of each principal component and see that the first 5 account for most of the variance in the data, as shown in the image below.

# pca analysis     
pca_model = prcomp(image_mat)

# plot the scree plot
plot(pca_model)
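
The scree plot can also be read off numerically. The following is a minimal sketch that assumes a synthetic stand-in for `image_mat` (the matrix `toy_image` is made up so the snippet runs without the image file). It computes the share of variance carried by each component and the running total, which is one way to decide how many components to keep.

```r
set.seed(1)
# Hypothetical stand-in for image_mat: smooth structure plus a little noise
rows <- seq(0, 1, length.out = 220)
cols <- seq(0, 1, length.out = 282)
toy_image <- outer(sin(2 * pi * rows), cos(2 * pi * cols)) +
  outer(rows, cols) +
  matrix(rnorm(220 * 282, sd = 0.05), nrow = 220, ncol = 282)

toy_pca <- prcomp(toy_image)

# Share of total variance carried by each component, and the running total
var_explained <- toy_pca$sdev^2 / sum(toy_pca$sdev^2)
cum_explained <- cumsum(var_explained)
round(cum_explained[1:5], 3)

# Smallest number of components reaching 95% of the variance
# (the 95% threshold is an arbitrary choice for illustration)
which(cum_explained >= 0.95)[1]
```

Calling `summary(toy_pca)` reports the same proportions in its importance table.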

Step 3: Reconstruction and visualization

The final step is to visualize the reconstructed image for an increasing number of components. Here we will use every other number of components from 1 to 17 (nine reconstructions in total) and plot them on a 3 x 3 grid.

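To perform the reconstruction, we multiply the scores of, say, the first principal component (pca_model$x) by the transpose of the matching column of the rotation matrix (pca_model$rotation). This produces a matrix with the same dimensions as our image. Because prcomp centers each column by default, the column means stored in pca_model$center have to be added back to return to the original pixel values. Finally, we plot the reconstructed matrix as an image.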
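
Written out, with notation introduced here purely for illustration (it does not appear in the original post): let $Z_k$ hold the first $k$ columns of `pca_model$x`, $W_k$ the first $k$ columns of `pca_model$rotation`, and $\mu$ the column means in `pca_model$center`, which `prcomp` subtracts by default. The rank-$k$ reconstruction is then

```latex
\hat{X}_k = Z_k W_k^{\top} + \mathbf{1}\,\mu^{\top},
\qquad Z_k \in \mathbb{R}^{n \times k},\quad
W_k \in \mathbb{R}^{p \times k},\quad
\mu \in \mathbb{R}^{p}
```

where $n = 220$ rows and $p = 282$ columns for the image used here, and $\mathbf{1}$ is a column of $n$ ones.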

To make this a little easier, I have put the reconstruction and visualization into a function. We then loop over it with lapply to visualize the reconstructed images, as shown below.

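# Reconstruction and plotting
par(mfrow= c(3,3))
recon_fun = function(comp){
  # scores times transposed loadings; drop = FALSE keeps matrices when comp = 1
  recon = pca_model$x[, 1:comp, drop = FALSE] %*% t(pca_model$rotation[, 1:comp, drop = FALSE])
  # prcomp centered the columns, so add the column means back
  recon = sweep(recon, 2, pca_model$center, "+")
  image(t(apply(recon, 2, rev)), col=grey(seq(0,1,length=256)), main = paste0("Principal Components = ", comp))
}

# run reconstruction for 1, 3, ..., 17 components
# (invisible() hides the list of NULLs returned by lapply)
invisible(lapply(seq(1,18, by = 2), recon_fun))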
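
The improvement with more components can also be measured instead of only judged by eye. The following is a minimal sketch on a synthetic matrix (`toy_image` and the helper `reconstruct` are hypothetical names, not part of the original code). It computes the root-mean-square error of the reconstruction for a few values of k.

```r
set.seed(1)
# Hypothetical stand-in for image_mat (any numeric matrix works)
toy_image <- outer(sin(seq(0, 3, length.out = 60)), cos(seq(0, 3, length.out = 80))) +
  matrix(rnorm(60 * 80, sd = 0.05), nrow = 60, ncol = 80)
toy_pca <- prcomp(toy_image)

# Rank-k reconstruction: scores times transposed loadings, plus the column means
reconstruct <- function(pca, k) {
  recon <- pca$x[, 1:k, drop = FALSE] %*% t(pca$rotation[, 1:k, drop = FALSE])
  sweep(recon, 2, pca$center, "+")
}

# Root-mean-square error against the original for a few values of k
rmse <- sapply(c(1, 5, 17, 60), function(k) {
  sqrt(mean((toy_image - reconstruct(toy_pca, k))^2))
})
rmse   # shrinks as k grows; with all 60 components it is numerically zero
```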

As we see in the image above, the more components we use for the reconstruction, the clearer the image gets. In a real-world application we could store just a few components as a compact representation of the image and reconstruct it when needed. We could also feed the reconstructed image to a neural network to enhance its quality. Now you know how dimensionality reduction with PCA works for images. This step-by-step demonstrative approach has definitely helped when teaching my class, and I wish I had been taught this way.
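
As a rough back-of-the-envelope sketch of the storage argument (an illustration, not a measurement of any real file format), the count below compares how many numbers are needed for a 17-component reconstruction with the size of the full 220 x 282 matrix. It assumes the scores, the loadings and the column means are all stored.

```r
# Dimensions of the image matrix used in this post
n_rows <- 220
n_cols <- 282
k <- 17   # number of components kept (the largest value plotted above)

# Numbers needed for a rank-k reconstruction:
# scores (n_rows x k), loadings (n_cols x k) and the column means (n_cols)
stored   <- n_rows * k + n_cols * k + n_cols
original <- n_rows * n_cols

stored              # 8816
original            # 62040
stored / original   # about 0.14
```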

Below are some of the best tutorials on PCA out there.

I have written a few Jupyter notebooks on applications of PCA in anomaly detection and dimensionality reduction on my GitHub page. Feel free to check them out.

Thanks for stopping by and reading this article. Feel free to comment below and share this article with your colleagues. Also, check out my other articles.
