Sanyam Kapoor
/
Recent content on Sanyam KapoorHugo -- gohugo.ioMon, 21 May 2018 00:00:00 +0000Policy Gradients in a Nutshell
/machine-learning/policy-gradients-nutshell/
Mon, 21 May 2018 00:00:00 +0000/machine-learning/policy-gradients-nutshell/This article aims to provide a concise yet comprehensive introduction to one of the most important classes of control algorithms in Reinforcement Learning - Policy Gradients. I will discuss these algorithms in progression, arriving at well-known results from the ground up. It is aimed at readers with a reasonable background in Machine Learning. By the end, I hope that you’d be able to attack a vast amount of (if not all) Reinforcement Learning literature.When Does Stochastic Gradient Algorithm Work Well?
/machine-learning/stochastic-gradient-descent-bounds/
Tue, 06 Feb 2018 09:48:58 -0500/machine-learning/stochastic-gradient-descent-bounds/Stochastic Gradient Descent (SGD) has turned out to be a workhorse for most gradient-based supervised learning algorithms today. But why does it work? This post presents an understanding drawn from recent theoretical results [1] that give insight into the properties of this algorithm.
Background Formally, we define the general problem of stochastic optimization given a random variable \( \xi \) as
\[ \underset{\mathbf{w} \in \mathbb{R}^d}{\min} F(\mathbf{w}) = \mathbb{E}[f(\mathbf{w},\xi)] \]
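To make the objective above concrete, here is a minimal sketch under assumptions of my own choosing (none of it comes from the results discussed in the post): a scalar parameter, the toy loss \( f(w,\xi) = \frac{1}{2}(w - \xi)^2 \), and a step size of \( 1/t \). For this loss the minimizer of \( F \) is \( \mathbb{E}[\xi] \), so SGD should drift towards the mean of the samples.

```python
import random


def sgd_mean(samples, w0=0.0):
    """Run SGD on F(w) = E[0.5 * (w - xi)^2], whose minimizer is E[xi].

    The loss and the 1/t step size are illustrative assumptions.
    """
    w = w0
    for t, xi in enumerate(samples, start=1):
        grad = w - xi            # gradient of 0.5 * (w - xi)^2 with respect to w
        w -= (1.0 / t) * grad    # decaying step size 1/t
    return w


rng = random.Random(0)                      # fixed seed for reproducibility
samples = [rng.gauss(3.0, 1.0) for _ in range(10_000)]
w_hat = sgd_mean(samples)
sample_mean = sum(samples) / len(samples)
```

With this particular step size each update reduces to a running average, so `w_hat` agrees with `sample_mean` up to floating-point error. Other losses and step sizes will not have such a clean closed form.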
In supervised learning, this framework is seen in the form of Empirical Risk Minimization (ERM) where the aim is to search the d-dimensional parameter space \( \mathbf{w} \in \mathbb{R}^d \), so that we minimize the expected risk (colloquially known as loss) over the given data.The beauty of Bayesian Learning
/machine-learning/the-beauty-of-bayesian-learning/
Mon, 04 Dec 2017 02:25:07 -0500/machine-learning/the-beauty-of-bayesian-learning/This post has an accompanying Bayesian Learning Demo. See here!
In this post, we will build intuitions behind Bayes theory via an interactive visualization and then realize certain properties of such a probabilistic formulation of Machine Learning. It turns out that most advanced approaches in fact run on top of these very simple foundational concepts and help us model problems quite beautifully.
Probability theory is ubiquitous in Machine Learning and will perhaps be one of the most important tools for future breakthroughs.PyTorch Data Loaders are abstraction done right!
/machine-learning/pytorch-data-loaders/
Sat, 25 Nov 2017 01:59:07 -0400/machine-learning/pytorch-data-loaders/PyTorch is great fun. Seriously! It has only been a few weeks since I started working with it. It already is the least painful thing in the process, which is kind of the point of having such a library.
The first task that any Machine Learning engineer would struggle with is to load and handle data. PyTorch provides an excellent abstraction in the form of torch.utils.data.Dataset. Such dataset classes are handy as they allow treating the dataset as just another (almost) iterator object.Visualizing the Confusion Matrix
/machine-learning/confusion-matrix-visualization/
Sun, 15 Oct 2017 13:33:07 -0400/machine-learning/confusion-matrix-visualization/Confusion Matrix is a matrix built for binary classification problems. It is an important starting tool in understanding how well a binary classifier is performing and provides a whole bunch of metrics to be analysed and compared.
Here, I present an intuitive visualization, since the definitions often get confusing.
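As a rough companion to the visualization, the four cells of a binary confusion matrix and two of the metrics derived from them can be sketched as follows. The function names and the toy labels are hypothetical, chosen only for illustration, and the positive class is assumed to be encoded as 1.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1          # predicted positive, actually positive
        elif truth == 0 and pred == 1:
            fp += 1          # predicted positive, actually negative
        elif truth == 1 and pred == 0:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels, made up for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(tp, fp, fn)
```

For these toy labels the counts come out as 3 true positives, 1 false positive, 1 false negative and 3 true negatives, which gives a precision and a recall of 0.75 each.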
The Confusion Matrix How to read the visualization? Before we go ahead and read the visualization, let us remember the definitions.The magic of Automatic Differentiation
/machine-learning/autograd-magic/
Sun, 08 Oct 2017 08:46:07 -0400/machine-learning/autograd-magic/Any machine learning problem is generally formulated as roughly the following steps
Model the outputs \( Y \) as some function of the input and parameters \( f(X,\theta) \)
Then come up with a loss function \( L \) that quantifies how well the trained model fits the data \( X \)
We solve the following minimization problem at the end
\[ \underset{\theta}{\arg\min} L(X,Y,\theta) \]An Introduction to Epipolar Geometry
/machine-learning/an-introduction-to-epipolar-geometry/
Tue, 08 Aug 2017 16:01:07 +0530/machine-learning/an-introduction-to-epipolar-geometry/In this post we will take a look at how Camera Projections work. A demo at the end will illustrate the important segments of the theory. Prerequisites to understand the material are available in the Readings & References section below.
Epipolar Geometry is the intrinsic projective geometry between two views. This knowledge becomes an interesting piece in the puzzle of estimating the 3D geometry of a given image projection and the estimated 3D model can then be applied to a myriad of meaningful real-world problems.A Primer on Projective Geometry
/machine-learning/a-primer-on-projective-geometry/
Wed, 19 Jul 2017 03:08:04 +0530/machine-learning/a-primer-on-projective-geometry/Projective Geometry is a term used to describe properties of projections of a given geometric shape. When a shape is projected onto \( \Bbb{R}^2 \) (commonly known as the 2D real space), it is called a Planar Projection. This idea can be extended to a shape being projected as \( H : \Bbb{R}^m \to \Bbb{R}^n \) where \( H \) is a homogeneous matrix. To understand the concepts, we’ll use Planar geometry because that is the easiest to visualize.Introduction to RANSAC
/machine-learning/ransac-illustration/
Sat, 24 Jun 2017 22:11:07 +0530/machine-learning/ransac-illustration/The Problem Given a data set and a question to be answered, one generally tends to solve the problem in the following fashion -
Understand the data and its patterns
Hypothesize a model which will fit the data in the most relevant manner
A theme that overarches all such models is that they need to be fast, accurate and robust. The models need to be reasonably fast to provide actionable results in meaningful time, need to be accurate for the user to rely on and need to be robust in scenarios where the data is adversarial.Understanding Nginx behavior with dynamic DNS upstreams
/blog/nginx-dns/
Sat, 13 May 2017 20:59:06 +0530/blog/nginx-dns/Recently, I was fiddling around with AWS Elastic Beanstalk to run our Nginx reverse proxy server at Headout and observed an interesting behavior with regards to DNS resolution in Nginx.
The Setup A fairly simple deployment architecture that I have followed is that each internal service is deployed in its own Auto-Scaling group and load balanced via its own Internal Load Balancer (ELB). Note that the load balancer is internal and does not have a public IP address.Codeforces 651B
/problem-solving/codeforces-651b/
Thu, 04 May 2017 09:04:45 +0530/problem-solving/codeforces-651b/I was recently solving this problem on Codeforces titled Beautiful Paintings. The problem seems easy at first but has interesting approaches - one easier but longer and one slightly harder but shorter.
Approach 1 On the first read, the problem is fairly straightforward and all we need is sorting because that is how we achieve a(i) <= a(i + 1) (mind the equality).
Using this approach, solving the following problemCodeforces 484A
/problem-solving/codeforces-484a/
Wed, 05 Apr 2017 09:04:45 +0530/problem-solving/codeforces-484a/I was recently solving this problem on Codeforces titled Bits. It was a fairly interesting problem considering a few new learnings I had.
Simply put, the aim of the problem is to get the number with the maximum number of set bits (popcount(x)) within the range [l, r].
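One common way to solve this, sketched here as an illustration and not necessarily the approach this post goes on to describe, is a greedy pass: start from l and keep switching on the lowest unset bit for as long as the result stays within r. The function names are hypothetical, and ties are assumed to be broken in favour of the smaller number, which is how I recall the original problem statement.

```python
def max_popcount_in_range(l, r):
    """Greedy sketch: return an x in [l, r] with the most set bits.

    Assumes 0 <= l <= r. Bits are switched on from the least
    significant end, so the smallest such x is returned.
    """
    x = l
    bit = 1
    while (x | bit) <= r:    # turning this bit on keeps x inside the range
        x |= bit
        bit <<= 1
    return x


def brute_force(l, r):
    """Reference answer for small ranges: smallest x with maximal popcount."""
    return max(range(l, r + 1), key=lambda v: (bin(v).count("1"), -v))
```

For example, `max_popcount_in_range(1, 10)` gives 7 (binary 111), and `max_popcount_in_range(2, 4)` gives 3 (binary 11). The brute-force helper is only there to cross-check the greedy result on small inputs.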
Approach The problem can be approached as follows.
For any given number from 1 to n, the maximum number of set bits will be in a number of the form 2^b - 1, or in other words, a number one less than a power of two.Choosing the right keys
/blog/choosing-the-right-keys/
Mon, 03 Apr 2017 01:57:44 +0530/blog/choosing-the-right-keys/Deciding the keys in an SQL table is one of the most important decisions in the lifecycle of the database. The ramifications of this decision will last as long as the database is alive and serving. Therefore, much thought needs to be put in deciding the trade-offs - fast lookups versus complex queries, ad-hoc queries versus analytical queries, so on and so forth.
Why do we need indexes? In databases, indexes are data structures which help achieve fast lookups for data which could either be random or ordered.Before the Flood: A review
/blog/before-the-flood-a-review/
Thu, 03 Nov 2016 12:06:04 +0530/blog/before-the-flood-a-review/I’ve been reading about DiCaprio’s quest to push the agenda of Climate Change every now and then. It always struck me as ironic when I tried hard to look from the perspective of a concerned citizen. DiCaprio is an entertainer and with this altruistic act, he’s just pushing another hackneyed agenda forward from a position of power. Speaking in abstract terms and interviewing policy makers and stakeholders around the world - what good does that do?Setup custom email addresses
/blog/custom-domain-email/
Mon, 17 Oct 2016 13:20:40 +0530/blog/custom-domain-email/It had been a while since I bought my domain and I was wondering if I could switch all my email communication to a custom email address hello@sanyamkapoor.com. Apart from owning the domain, the only requirement which remains is that of an MX (Mail Exchange) server which can route my emails for me. This is pretty handily done via services like GSuite and Outlook but I am not at a stage to shell out a few bucks for a custom email address.Getting started with Compose and WordPress
/blog/docker-wordpress/
Sat, 09 Apr 2016 22:50:29 +0530/blog/docker-wordpress/Quickstart: Docker Compose and WordPress You can use Docker Compose to easily run WordPress in an isolated environment built with Docker containers. This quick-start guide demonstrates how to use Compose to set up and run WordPress. Before starting, you’ll need to have Compose installed.
This tutorial has been taken from my latest pull request to the Docker Compose project. The pull request can be found at https://github.com/docker/compose/pull/3275.
The same document should also be available at https://docs.Building Docker Images on Travis CI
/blog/docker-on-travis/
Thu, 07 Apr 2016 03:40:47 +0530/blog/docker-on-travis/Docker has been the star of the recent times and I have recently been building a lot of Docker images, deploying them to both development and production (luckily yes!).
Travis CI is one of the most popular CI services and I have lately been using it extensively. It is awesome!
Travis CI YAML Here is a simple .travis.yml to get started with Docker builds on Travis.
sudo: required services: - docker env: # IMPORTANT!Microservices is not the solution you think
/blog/microservices-is-not-the-solution/
Wed, 24 Feb 2016 23:50:16 +0530/blog/microservices-is-not-the-solution/<p><strong><em>Microservices</em></strong> have suddenly become the hip thing to do. While
it would be foolish to question the potential of microservices,
what surprises me is why they weren’t popular enough pre-2014. While
microservices solve plenty of problems which otherwise would have bugged a
seasoned developer, to keep things in the larger perspective,
a microservices approach is not the best way to get started as a
potential high performance developer. The path to microservices is
hard, very hard.</p>
<p></p>For better software delivery
/blog/for-better-software-delivery/
Sun, 03 Jan 2016 01:00:09 +0530/blog/for-better-software-delivery/<p>For the past couple years, I have been working as the Dev and the Ops for my team
at <a href="https://storyxpress.co">StoryXpress</a>. I promise you it has been an amazing
learning ride but for all that matters, it has been an exhausting one. We have
been hosting our services on Azure and by no means do I think Azure is developer
friendly. Yes, you heard that right!</p>
<p></p>The Ultimate Guide to Building Cloud Applications - Development
/blog/guides/cloud-apps/2/
Sun, 06 Dec 2015 00:27:09 +0530/blog/guides/cloud-apps/2/<p>Every big project ever dreamt of, always started from that one folder sitting on
a local development machine. While it is highly likely that many projects
never saw the light of day, don’t forget the fact that you are the architect
of a system which could potentially increase the happiness of a million others
(<strong><em>$$$</em></strong>).</p>
<p>Here is a collection of ideas, which will help you make the right decisions from
day one of development.</p>The Ultimate Guide to Building Cloud Applications
/blog/guides/cloud-apps/1/
Sat, 24 Oct 2015 00:27:09 +0530/blog/guides/cloud-apps/1/<p>The landscape of applications is an ever-expanding one from desktop to web to mobile.
With increasing popularity, a torrent of new development and automation tools is
popping up, making the lives of developers blissful. It has also introduced
a new perplexity of choice between a new (flashy) tool because one is excited
and a battle-tested tool that you’ve got experience with (nil for a newbie). Among
all the automations and increasing levels of abstraction, a lot of fundamentals
are being obscured from the developer. What is left for a developer is just one
click of a button and automagically, everything is set up.</p>About
/about/
Mon, 01 Jan 0001 00:00:00 +0000/about/I’m a Master’s student at NYU Courant interested in Reinforcement Learning and Bayesian Learning. I work with Joan Bruna and other researchers in the CILVR lab. Recently, I’ve been investigating techniques to improve efficiency of Monte Carlo sampling algorithms.
In a previous life, I spent 3 years dabbling in the startup space, building StoryXpress with my co-founders from IIT Hyderabad (who by the way are still killing it!).
Have a look at my resume (last updated Feb 6, 2019) for more!Learning Wishlist
/wishlist/
Mon, 01 Jan 0001 00:00:00 +0000/wishlist/I’ve decided to maintain a wishlist of things I’d like to work on whenever my interest is piqued. Clearly, this is a weakly organized scratchpad.
Theory These are works that I’ve only glossed over and never delved deeper. Sometime in the future, I’d like to understand them better.
Gaussian Processes GP priors are unusually interesting. They lead to smooth interpolators with uncertainty estimates. Some parts of the inference are embarrassingly parallelizable.