Data Science | Ruchit Patel

COMPUTATIONAL METHODS FOR DATA ANALYSIS

This page collects the small projects that demonstrate the data analysis tools and skills I have learned through my academic work.

There are six final reports in total, covering areas such as data compression, data mining, image processing, principal component analysis, dynamic mode decomposition, reduced-order composition, motion tracking, video processing, machine learning, and neural networks.

Figure: Schematic architecture of compressed dynamic mode decomposition

LANGUAGE CLASSIFICATION

The main objective in choosing this topic is to build a ‘Voice Personal Assistant’ that speaks in exactly the same manner as its user. By ‘exactly the same manner’, I mean a device that has the same accent as you, replies in a voice with a frequency equivalent to yours, and understands your current mood and emotion just from listening to you. This is a very ambitious idea, but I believe that, just as a human can recognize another person's language, accent, gender, and emotion or mood from hearing a few words, an algorithm can also be built that deduces all of these properties just by listening to its user.

Key Words: Machine Learning, Naive Bayes, Gaussian Mixture Model, Support Vector Machine, Spectrogram, Audio Processing, Feature Extraction

View Report (MATLAB code included)

MNIST HANDWRITING RECOGNITION BY NEURAL NETWORK

In this project, a neural network, a modern machine learning algorithm, is implemented on the MNIST dataset of handwritten digits, and the report describes how training accuracy and testing accuracy vary with the number of hidden layers used. Section I introduces neural networks. Section II explains the underlying theory of neural networks. Section III gives a step-by-step explanation of the code used in this report. Section IV presents all the computational results, and Section V summarizes and concludes the report.

Key Words: Machine Learning, Neural Network, Deep Learning, MNIST Data Set, Feature Extraction

View Report (MATLAB code included)

DYNAMIC MODE DECOMPOSITION FOR REAL-TIME BACKGROUND/FOREGROUND SEPARATION IN VIDEO STREAM

This report presents the method of dynamic mode decomposition (DMD) for separating video frames into background (low-rank) and foreground (sparse) components in real time. DMD terms with frequencies near zero are interpreted as the background (low-rank) portion of the given video frames, and terms with frequencies far from zero are interpreted as the sparse foreground components. An approximate low-rank/sparse separation is achieved at the computational cost of just one singular value decomposition and one linear equation solve. The DMD method shown here works robustly in real time without any parameter tuning, which is ideal for video surveillance and recognition applications. The algorithm is also applied to a video that is not perfectly steady, in which the background moves slightly, to see how it performs.
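The separation idea can be sketched in a few lines. The snippet below is an illustration in Python with NumPy, not the MATLAB code from the report. The function name `dmd_background`, the tolerance `tol`, and the synthetic "video" (a static background plus a travelling wave standing in for a moving foreground) are all assumptions made for the example.

```python
import numpy as np

def dmd_background(X, rank, dt=1.0, tol=1e-2):
    """Split frames (one flattened frame per column of X) with DMD.

    Hypothetical helper: modes whose frequency |omega| is below `tol`
    are treated as background; everything else is foreground.
    """
    X1, X2 = X[:, :-1], X[:, 1:]

    # The single SVD of the snapshot matrix, truncated to `rank`.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]

    # Reduced linear operator and its eigendecomposition.
    A_tilde = (U.conj().T @ X2 @ Vh.conj().T) / s
    lam, W = np.linalg.eig(A_tilde)

    # DMD modes and continuous-time frequencies.
    Phi = ((X2 @ Vh.conj().T) / s) @ W
    omega = np.log(lam.astype(complex)) / dt

    # The single linear solve: mode amplitudes from the first frame.
    b = np.linalg.lstsq(Phi.astype(complex), X[:, 0].astype(complex),
                        rcond=None)[0]

    # Keep only the near-zero-frequency modes for the background.
    bg_idx = np.abs(omega) < tol
    t = np.arange(X.shape[1]) * dt
    dynamics = b[bg_idx, None] * np.exp(omega[bg_idx, None] * t[None, :])
    background = np.abs(Phi[:, bg_idx] @ dynamics)

    foreground = X - background
    return background, foreground

# Synthetic data (assumed): 64 "pixels", 40 frames.
x = np.linspace(0.0, 1.0, 64)
frames = np.arange(40)
bg = 1.0 + 0.5 * x                       # static, positive background
wave = 0.3 * np.cos(2 * np.pi * 3 * x[:, None] - 0.7 * frames[None, :])
X = bg[:, None] + wave                   # background + moving pattern

background, foreground = dmd_background(X, rank=3)
```

On this toy data the background has frequency zero and the travelling wave has frequency 0.7 rad per frame, so a small tolerance separates them cleanly. A real video foreground is sparse rather than wave-like, and a suitable rank and tolerance would have to be chosen for the footage at hand.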

Key Words: Video Processing, Dynamic Mode Decomposition, Singular Value Decomposition, Feature Extraction

View Report (MATLAB code included)

MUSIC GENRE CLASSIFICATION

This report covers genre classification using supervised machine learning algorithms. Three tests are performed: (1) band classification among three bands from different genres, (2) band classification among three bands from the same genre, and (3) genre classification among three genres. To build the training data set, an SVD is computed on the real part of the spectrogram of randomly chosen five-second clips from 100 songs per genre. An 80%-20% cross-validation is then performed to check training accuracy, and outside data is fed to MATLAB to check testing accuracy. Many iterations are run to tune parameters, and the results are shown in section five. All the training sets show good cross-validation accuracy (~80-90%), and the testing accuracy is ~70% for test 1, ~55% for test 2, and ~70% for test 3.

Key Words: Machine Learning, Naive Bayes, Gaussian Mixture Model, Support Vector Machine, Spectrogram, Audio Processing, Feature Extraction

View Report (MATLAB code included)

PRINCIPAL COMPONENT ANALYSIS BY EIGEN VALUE DECOMPOSITION

This report presents a principal component analysis (PCA) of simple harmonic motion captured by three cameras from different angles. The main goal of this project is to understand how PCA works and how much data redundancy it can remove. A second major objective is to learn what happens when PCA is performed on noisy data (part 2) and on data with experimental errors (parts 3 and 4). In this report, data redundancy is removed by diagonalizing the covariance matrix through eigendecomposition, and the weights of the eigenvalues are plotted to see how many independent components really exist. Finally, the PCA results for the ideal case, the noisy case, and the inaccurate experiment are compared.
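As a rough illustration of PCA by eigendecomposition of the covariance matrix, the Python/NumPy sketch below builds a synthetic stand-in for the three-camera measurement. The camera angles, noise level, and sampling are assumptions for the example and are not taken from the report, whose own code is in MATLAB.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-dimensional harmonic motion, observed over 500 time samples.
t = np.linspace(0.0, 10.0, 500)
z = np.cos(2.0 * np.pi * 0.5 * t)

# Each hypothetical camera records an (x, y) pair: the same motion
# seen under a different viewing angle, giving 6 measurement rows.
angles = np.deg2rad([10.0, 50.0, 80.0])
rows = []
for a in angles:
    rows.append(np.cos(a) * z)
    rows.append(np.sin(a) * z)
X = np.array(rows) + 0.01 * rng.standard_normal((6, t.size))

# Center each row, then form the 6 x 6 covariance matrix.
Xc = X - X.mean(axis=1, keepdims=True)
C = Xc @ Xc.T / (Xc.shape[1] - 1)

# Diagonalize the symmetric covariance matrix; sort by variance.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of total variance carried by each principal component.
energy = eigvals / eigvals.sum()

# Project onto the principal directions; rows of Y are uncorrelated.
Y = eigvecs.T @ Xc
```

Because all six rows are scaled copies of one underlying motion, nearly all of the variance in `energy` lands on the first component. Raising the noise level spreads the variance across more components, which is the kind of effect the noisy and imperfect cases in the report examine.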

Key Words: Motion Tracking, Video Processing, Principal Component Analysis, Eigen Value Decomposition, Feature Extraction

View Report (MATLAB code included)

PRINCIPAL COMPONENT ANALYSIS AND DATA COMPRESSION OF IMAGES USING SINGULAR VALUE DECOMPOSITION

In this report, an SVD analysis is carried out on two types of image database: cropped (aligned) and uncropped. The analysis shows and verifies that an image can be compressed and reconstructed with very good accuracy by prioritizing the important singular modes computed by the SVD. The report also contains images reconstructed from less data and compares them with the original image. Finally, the SVDs of the cropped/aligned and uncropped images are compared.
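A minimal sketch of compression by truncated SVD is given below in Python with NumPy. It uses a small synthetic grayscale array in place of the image databases from the report, and the helper name `low_rank_approx` and the chosen ranks are assumptions for illustration only.

```python
import numpy as np

def low_rank_approx(image, k):
    """Rebuild `image` from its k largest singular modes (hypothetical helper)."""
    U, s, Vh = np.linalg.svd(image, full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vh[:k, :]
    return approx, s

# Synthetic 64 x 48 "image" (assumed): smooth pattern plus mild noise.
rng = np.random.default_rng(1)
rows, cols = np.mgrid[0:64, 0:48]
image = (np.sin(rows / 8.0) * np.cos(cols / 6.0)
         + 0.05 * rng.standard_normal((64, 48)))

# Relative Frobenius-norm reconstruction error for several ranks.
errors = {}
for k in (1, 5, 20, 48):
    approx, s = low_rank_approx(image, k)
    errors[k] = np.linalg.norm(image - approx) / np.linalg.norm(image)

# Numbers stored by a rank-k approximation versus the full image.
m, n = image.shape
storage = {k: k * (m + n + 1) for k in errors}
full_storage = m * n
```

The error of a rank-k reconstruction equals the root sum of squares of the discarded singular values, so it can only shrink as k grows, while the storage cost grows linearly in k. Plotting `s` is a quick way to judge how many modes are worth keeping.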

Key Words: Data Mining, Image Processing, Principal Component Analysis, Singular Value Decomposition, Feature Extraction

View Report (MATLAB code included)
