Occasionally, I blog about things that interest me.
I spent 5 years in the Computer Science BS-MS program. I was a peer mentor for CS111 for 3 years.
I've met a wide range of students in the CS department. I've helped students with assignments, and explained concepts to them over and over again. This blog post explains the patterns for success and failure that I have noticed.
A good performance in the intro courses builds the foundation for your CS knowledge. Without it, you are as useful as a mathematician who cannot add fractions or solve linear equations.
By the time you get through CS111, you should have a clear mental model of how problem solving is accomplished through code. It's easy to get distracted as a 111 student by the complexities of Java, Eclipse, jEdit, your weekly assignments, and course project. This is especially true if you come to 111 with no programming experience. It might feel like all you're doing is staying afloat week after week, assignment after assignment, milestone after milestone. Someone tells you that using Eclipse is good/bad. Someone tells you that you should put curly braces on the same line as a function, as opposed to putting them on their own line. It can be very overwhelming.
If at the end of 111, you are able to take a problem given to you, think about how it's solved, write code to solve it, and debug your code until it works, you have done well. This seems like a hard thing to measure, so I'm going to give you a couple of litmus tests to gauge your performance.
Consider the problem below:
Write a program that prints the numbers from 1 to 100. But for multiples of three print Lemon instead of the number and for the multiples of five print Juice. For numbers which are multiples of both three and five print
Here is what the output of the program looks like from 1 to 20.
This problem should be a breeze for you after 111. If you cannot write code to solve this in 30 minutes, without someone's help, you are in trouble.
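If you want something to check your own attempt against afterwards, here is a minimal sketch in Python. The problem statement above cuts off before naming the word for multiples of both three and five, so LemonJuice is my assumption here, and it is passed in as a parameter so you can change it.

```python
def lemon_juice(n, both_word="LemonJuice"):
    """Return the line to print for n.

    both_word is an assumption: the problem statement does not say
    what to print for multiples of both three and five.
    """
    if n % 15 == 0:          # multiple of both 3 and 5, so check it first
        return both_word
    if n % 3 == 0:
        return "Lemon"
    if n % 5 == 0:
        return "Juice"
    return str(n)


def lemon_juice_lines(limit=100):
    """All output lines from 1 to limit, inclusive."""
    return [lemon_juice(i) for i in range(1, limit + 1)]
```

Calling print("\n".join(lemon_juice_lines())) prints the full run. The only real trap is ordering: the combined case has to be tested before the individual ones.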
Here is another litmus test. If you can solve the first 5 problems on Project Euler without any help, you have learned problem solving. You can start worrying about other things. If you can't do these problems, you need to spend more time learning to problem solve, writing code, and debugging code. You will not get very far in a CS degree until you do this.
You can write code to solve a Project Euler problem, and submit the solution by creating an account. This is also a great way to learn a new programming language (This is how I learned the basics of Python).
Data Structures builds on your understanding of problem solving from 111 and teaches you about algorithms and the clever ways in which programmers organize and manipulate data.
You can ask any upperclassman what they think the most important CS course is, and Data Structures will almost certainly be one of their top 3.
Now, let me tell you the story of Alice. Alice is a hypothetical student who represents 50% of the CS112 roster.
Alice took CS111, did fairly well in it (got better than a B). In 111, Alice often found it hard to listen to the professor in lecture. Every once in a while, she zoned out, and couldn't figure out what Tjang or Sesh was saying. She went home, looked at the weekly assignment, and finished it (perhaps with some help from a friend, TA, or someone at the iLabs). Since every concept in lecture was reinforced in a weekly assignment, she understood most of the material that was covered in 111.
When Alice took 112, she followed a similar path. Tjang/Sesh was lecturing about Heaps, Linked Lists, Graphs, Tail Recursion, Efficiency Analysis of Insertion Sort etc. She went on Facebook on her laptop/got a text from a roommate, subsequently zoned out in class, and quickly lost track of what the professor was saying. When Alice went home, there was a project due in two weeks. Alice worked very hard on the project. She definitely had some issues along the way, but she was able to ask Sesh/TAs/peers/people at the iLabs for help and get it done. She got an 85+ on all the 5 projects in CS112. Alice got around a 50 on both Data Structures exams, ended Data Structures with a C/C+, and moved on.
Alice is the canonical example of a student that does poorly in the Rutgers CS degree, and here's why.
In 111, Alice had projects every week that forced her to learn material. The only thing in 112 that forces you to learn material is the exam, and that happens twice a semester. Sometimes, the first 112 exam is so early in the semester that there's very little material covered on it. This means that the final covers a ton of material, and because Alice has been zoning out of 112 lecture on a regular basis, she does not know why Heapify is O(n), or why you would want to use a min heap as a frontier when implementing Dijkstra's shortest path algorithm.
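To make that last point concrete, here is a small sketch of Dijkstra's algorithm that uses Python's heapq module as the frontier. The graph format (a dict mapping each node to a list of (neighbor, weight) pairs) is my own choice for illustration, not something from the course. The min heap matters because the algorithm repeatedly needs the unexplored node with the smallest known distance, and a heap hands that over in O(log n) instead of scanning every candidate.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source to every reachable node.

    graph: dict mapping node -> list of (neighbor, weight) pairs,
    with non-negative weights.
    """
    dist = {source: 0}
    frontier = [(0, source)]  # min heap ordered by tentative distance
    while frontier:
        d, node = heapq.heappop(frontier)  # closest unexplored node
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(frontier, (candidate, neighbor))
    return dist
```

This version pushes duplicate entries instead of decreasing keys in place, which is a common simplification when the heap has no decrease-key operation.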
In order to avoid Alice's pitfalls, you need to make sure that you understand every concept in Data Structures. Practically every sentence that Sesh or Tjang says in 112 is important. I understand that you are human, and cannot keep perfect concentration in an 80 minute lecture. But you need to be proactive about understanding all the concepts that are covered. This means that if you zoned out in the lecture about AVL Trees, you still need to make sure you learn it. You can go read the textbook after class. You can look up the material online. There are some very good online courses that cover data structures. I strongly recommend the lectures from Stanford's Programming Abstractions course. You can ask upperclassmen to explain the concept to you.
You need to do this before the day of your data structures exam, because the course covers a lot of important material, and you cannot learn all of it in a day.
Prerequisites don't matter.
Stop focusing on grades. Focus on the concepts you're learning.
Your professors haven't written code in 15+ years. They're not going to teach you how to develop software - don't waste your time taking CS431.
Start writing code in your free time to solve any problem that you have. I wrote RUBUS when I was in 112. I used to maintain static web pages for a board game my roommate played with his high school friends. Everyone that's good at programming has put in lots of time into it, and you will need to do the same.
Talk to upperclassmen about ideas you have, and the things that they've built. Hang out in the CAVE, attend USACS events, and go to hackathons. All of this experience will add up, and help you land your first paid programming gig.
When I started taking CS classes, I didn't understand why people got paid to sit at a computer and write for-loops. I thought the mysterious, legendary developers getting internships and part time jobs must know so much more about the web/mobile apps/algorithms. This is a classic case of impostor syndrome, and it creates a mental roadblock until you get paid to write code.
Use your summers for internships or part time work (not for folding clothes). For your first development job, I recommend Student System Administration at OSS, System Administration at LCSR (under Doug Motto), an internship at Too Much Media, or HackNY if you are lucky. You'll want to do a more traditional tech internship at companies like Microsoft, Google or Etsy, but you'll have an easier time getting these once you have some experience. If you hang around CS folks, you'll hear about such opportunities frequently. Apply early and often - you'll have an easier time in October than in April.
The Rutgers CS community, while not perfect, has grown a great deal over the last few years. You've got access to events like HackRU, HackNY, PennApps, and HackTCNJ. You've got access to the CAVE, the Hackerspace, and the Makerspace. USACS, RuMAD and the Rutgers Hackathon Club are active, and full of smart people. People taking the same classes as you have gone on to land amazing jobs, start companies, and sell companies. Start reading Hacker News, and /r/programming.
Don't forget to give back to the community. Teach CS111 recitation. Join the USACS board, and help plan events. Hang out at the CAVE and help underclassmen understand difficult concepts. Help the noobs out at hackathons.
Make friends out of your peers. Impossible looking homework assignments will become easier. You'll spend a silly amount of time working on a CTF challenge, or writing a game. You'll get one letter Github usernames together. After college, they'll help you find jobs and offer you their couches.
Rutgers is a great place to study Computer Science, and I hope your time there will be as memorable as mine was.
If you're like me, you've written loads of web apps, but you rarely set up SSL on them. SSL is a must for any production-grade web application, especially if you're authenticating users or taking personal information from them. Otherwise all the contents of your HTTP requests are being sent in plaintext - user login info / passwords, cookies etc.
Usually, SSL certificates can cost lots of money (Verisign charges over $100 / month), and be annoying to set up. After paying for domains and hosting, this is the last thing you want to shell out money for. Thus, StartSSL Free is a very appealing product because it gives you a free SSL certificate valid for 1 year, that's accepted in all major browsers. I'm using it to serve flipdclass.com over SSL.
www as the subdomain to the certificate. You'll need to create a certificate for each subdomain that you want to access over HTTPS, unless you get a wildcard SSL cert (not available through StartSSL Free).
domain.crt. I would also recommend saving the intermediate and root CA certs, because you'll need them for your webserver setup.
I'll walk you through setting up nginx or apache for SSL.
domain_key.enc, domain.crt, ca.pem, sub.class1.server.ca.pem), probably via scp.
domain_key.enc, so that it can be read by your web server. Without this, your webserver will prompt you for a password every time it is restarted.
Nginx does not have a directive for SSL certificate chains, so you will need to concatenate your certificate with the intermediate and root CA certs.
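The concatenation is just joining the PEM files in order, with your own certificate first. Here is a minimal sketch of that step in Python. The file names in the usage note come from the list above, while the output name chained.crt is a placeholder I made up.

```python
from pathlib import Path


def build_chain(cert_paths, output_path):
    """Concatenate PEM files, in the order given, into output_path.

    The server certificate should come first, followed by the
    intermediate and then the root CA certificate.
    """
    parts = []
    for path in cert_paths:
        text = Path(path).read_text()
        # Make sure each file ends with a newline so that the next
        # BEGIN CERTIFICATE marker starts on its own line.
        parts.append(text if text.endswith("\n") else text + "\n")
    Path(output_path).write_text("".join(parts))
    return output_path
```

For the files in this walkthrough, the call would be build_chain(["domain.crt", "sub.class1.server.ca.pem", "ca.pem"], "chained.crt"). The newline check is there because a certificate file without a trailing newline would otherwise run into the next one and produce a chain the webserver cannot parse.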
Then you can configure your virtual host as follows.
Now, you can reload your webserver, and if you did everything correctly, you should get a successful HTTPS connection to your web app. Make sure that you test your site on a few different browsers, because not all browsers will behave the same way with SSL certificates.
You should also consider configuring your webserver to redirect all traffic to HTTPS, in order to prevent users from leaking their sensitive data by mistake.
SSH is a powerful protocol that lets you access machines remotely and run commands on them. Rutgers has a cluster of linux machines for CS students, and I often run programs on them. Sometimes, I leave a program running for a while, and forget which machine it was on. In this situation, PDSH comes in handy. It lets me run ps aux | grep -i <username> quickly across all the machines.
PDSH lets you run a command in parallel across a bunch of machines. I start by creating a text file with a list of machines I want to shell into:
Let's say I save this file as machines.txt. I can then run a command in parallel across all these machines:
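If you would rather drive this from a script, here is a rough Python sketch that builds and runs the same kind of pdsh invocation used in the examples in this post. The function names are mine, and it assumes pdsh is installed and that you can ssh to every machine in the file.

```python
import subprocess


def pdsh_argv(hosts_file, remote_command):
    """Build the argument list for one pdsh invocation.

    -R ssh   tells pdsh to connect to each machine over ssh
    -w ^FILE reads the target hosts from FILE, one per line
    """
    return ["pdsh", "-R", "ssh", "-w", "^" + hosts_file, remote_command]


def run_on_all(hosts_file, remote_command):
    """Run remote_command on every host and return pdsh's combined output."""
    result = subprocess.run(
        pdsh_argv(hosts_file, remote_command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

For example, run_on_all("machines.txt", "uptime") returns one line per machine, each prefixed by pdsh with the host it came from.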
Here are some things you can do with PDSH that you might find useful
Find all python processes running on these machines.
$ pdsh -R ssh -w ^machines.txt "ps aux | grep -i python"
Kill any processes being run by my user. (Super useful if you forget to log out of a lab machine.)
$ pdsh -R ssh -w ^machines.txt "killall -u `whoami`"
Check a specific log file for errors.
$ pdsh -R ssh -w ^machines.txt "grep -i error /path/to/log"
It's a handy UNIX tool to have in your arsenal when working with lots of machines. Clearly, I am only showing the usage of pdsh in the most basic way. Check out PDSH on Google Code for a more detailed description of everything PDSH can do.
Web Scraping is a super useful technique that lets you get data out of web pages that don't have an API. I often scrape web pages to get structured data out of unstructured web pages, and Python is my language of choice for quick scripts.
In the past, I used Beautiful Soup almost exclusively to do this kind of scraping. BeautifulSoup is a great library for web scraping - it has great docs, and it gets the job done most of the time. I've used it on lots of projects. However, I find that it doesn't fit my workflow.
Let's say I wanted to scrape some data off a web page. I usually inspect the element in the Chrome Dev Console, and guess at a selector that might give me the data I want. Perhaps I guess div.foo li a. I quickly check to see if this works by running this selector in the console $('div.foo li a'), and modify it if it doesn't.
Even after using BeautifulSoup for a while, I find that I have to go back and read the docs to write code that scrapes this selector. I always forget how to select classes in BeautifulSoup's find_all method. I don't remember how to write a CSS attribute selector such as a[href*="foo"]. It doesn't let me write code at the speed of thought.
LXML is a robust library for parsing XML and HTML in Python, and BeautifulSoup can even use it as its underlying parser. I don't know much about lxml, except that I can use CSS Selectors with it very easily, thanks to lxml.cssselect. Look at the example code below to see how easy this is.
As you can see, it's really easy to use CSS Selectors with Python and lxml. Instead of spending time reading BeautifulSoup docs, spend time writing your application.
LXML and CSSSelect are both Python packages that you can install easily via pip. In order to install lxml via pip you will need libxslt. On a standard Ubuntu installation, you can simply do
Having used Linux almost exclusively for the last four years, I miss efficient window management on Macs. Coming from the awesome window manager, I find that OS X does not have good support for a two monitor multiple workspace workflow out of the box. After tinkering with third party software, I believe I've found a good solution for most of my complaints, and have a workflow that I feel productive with. In my experience, this works best with multiple monitors, a standard keyboard (think Dell not Apple), and a three button mouse (I'm not a fan of touchpads or Apple mice).
I really like the use of multiple desktops in my workflow. I usually set up four desktops. I keep Spotify open on the very last one. The middle ones are my "work" desktops that I use for terminals, browsers, IDEs, and documentation. The first one is usually a "distraction workspace" that will have my email, and Adium open. This helps me keep my windows organized, and keep focus when I need to.
In order to set this up, I add additional desktops (up to 4). The easiest way to do this is to open up Mission Control (usually Control-Up), hover over the Desktops, and click the Plus button on the top right.
Once this is done, I would recommend setting up easy keybindings to switch between desktops. To do this, you go to System Preferences > Keyboard > Keyboard Shortcuts > Mission Control. Then you can set up keybindings for Move left a space, Move right a space, ... Switch to Desktop 1-4. I use Ctrl-Alt-Left/Right to move between desktops, and use Command-1/2/3/4 to jump to a desktop.
On OS X, it's sometimes pretty cumbersome to perform window management tasks like moving windows between monitors, and maximizing windows efficiently. This is where Slate comes in. Slate is a configurable third-party window management application that makes these tasks super easy. I will explain how I use Slate day-to-day.
You can install Slate pretty simply by downloading the Slate dmg. After installing and starting Slate, you will want to make sure it's properly configured.
Here is my ~/.slate.js configuration file that describes the keybindings I use with it. Right click on the Slate icon in the topbar > Relaunch and Load Config, to apply configuration changes.
Here is the mapping of keybindings
Feel free to modify this slate configuration to suit your needs. You might find the Slate documentation helpful.
If you use Adium as a chat client on your machine, I recommend setting up a Global Keyboard Shortcut. This allows you to switch the focus to Adium anytime on your machine by pressing the key sequence. It's super handy to instantly switch to Adium when you get an Adium notification.
I set my global keyboard shortcut to Cmd-Shift-/. To use this, you'll have to get rid of the keybinding for the Help Center first. Do this by removing the keybinding in System Preferences > Keyboard > Keyboard Shortcuts > Help Center.
To set the global keyboard shortcut in Adium, go to Preferences > General > Global Shortcut.
Now, you can press Cmd-Shift-/ to switch to Adium, and press Cmd-/ to show/hide your buddy list.
You should already be using Alfred as your primary application launcher. It lets you launch applications with your keyboard really easily, and do much more. It's also way faster than Spotlight.
If you use multiple desktops, sometimes you'll want to create multiple windows of the same application. Alfred will get in your way here, because if you try to launch an application that already has a window open, it will take you to that window instead of opening a new one. This happens to me all the time when I want to create a Chrome window and I already have one open on another desktop.
The easiest way to do this is to go to an existing Chrome window, press Cmd-N to create another window, drag the newly created window, and while dragging the window, press Cmd-1/2/3/4 to take the window to another desktop.
Like I mentioned before, I use Spaces in my development workflow. One thing that I really dislike about Spaces is that when you move between Spaces using Ctrl-Alt-Left/Right, it takes a second to animate the movement. I don't like this because it feels clunky.
You can run this in a terminal to make this animation a lot faster.
Credit to Gerard O'Neill for showing me a lot of this workflow.
Rutgers Open Systems Solutions mirrors a bunch of Linux distributions, and you can use these to download packages quickly when you're on campus. When downloading on the Rutgers campus, your bandwidth is also not throttled, which significantly improves your download speeds.
To add a mirror, open up your
It should look something like this:
[distrib] here is the name of your distribution. This is one of
saucy. You can find a full list on the Ubuntu Wikipedia Page
You are going to add the following lines at the top of the file.
Be sure to replace [distrib] with the name of your distribution. Now, save the file and quit, and run
You should now be downloading packages from the Rutgers mirror. Try to install a package, and make sure that you're making requests to
When building web applications, you sometimes want to retrieve JSON data from APIs and domains that are external to your service. Because of the Same Origin policy in browsers, by default you cannot retrieve data from other domains via AJAX.
Usually to get around this, APIs will have endpoints that support JSON-P. JSON-P is a nifty technique that loads JSON data via <script> tags instead of loading the data via XMLHttpRequests (AJAX). To understand this, let's look at an example.
Let's say you have a service on http://myservice.com/data.json that returns the following JSON.
An application that lives on http://anotherapplication.com cannot access data.json in client-side JS via AJAX because anotherapplication.com and myservice.com are not the same domain.
As the author of myservice.com, you can solve this problem by turning your JSON endpoint into a JSON-P endpoint. To do this you write myservice.com in such a way that hitting http://myservice.com/data.json?callback=procedureName returns the following:
Now, the author of anotherapplication.com can load data.json by adding the following script tag dynamically to the client side DOM.
Now, the function procedureName will get called with the data from data.json. Using this trick does mean that you have to trust http://myservice.com, because any content returned by it can get executed by your client side JS.
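On the server side, the change amounts to wrapping the JSON in a call to whatever function name arrives in the callback parameter. Here is a framework-agnostic sketch in Python. The jsonp_body helper and the sample payload are hypothetical, and the check on the callback name is my own precaution rather than part of the technique itself.

```python
import json
import re

# Only accept callback names that look like plain JavaScript identifiers,
# so a caller cannot smuggle arbitrary script into the response.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_body(data, callback=None):
    """Return the response body for a JSON or JSON-P request.

    With no callback this is plain JSON. With a callback the JSON is
    wrapped in a function call that the browser runs when the
    <script> tag loads.
    """
    body = json.dumps(data)
    if callback is None:
        return body
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid callback name")
    return "{}({});".format(callback, body)


print(jsonp_body({"name": "example"}, "procedureName"))
```

The script prints procedureName({"name": "example"}); which is exactly the kind of body a script tag can load and execute.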
Most web services will support JSON-P if they expect you to retrieve their data on the client side, but some do not.
For services that do not support JSON-P that live on the internet, you can use YQL to proxy the request through Yahoo's servers, and retrieve data in the same way.
Here is a snippet of jQuery code that would normally hit
Here is how you modify it to proxy via Yahoo's servers.
The snippet above will send a request to Yahoo and get back data from myservice.com as a response. This does mean that http://myservice.com needs to live on the open web (not on an internal server), so that Yahoo servers can hit it.
jQuery will automatically add a callback parameter to the request, and give that name to the success function, so that it gets called appropriately.
When building web applications, I used to spend a lot of time deploying them via scp. Then I used Heroku for a few projects, and I really liked that deploying to Heroku was as easy as it could be.
I wanted to have a similar deployment scheme on my own projects that aren't deployed on Heroku.
Since git is a distributed version control system, you can push the code that lives on your machine to another machine very easily via ssh. So your first instinct is to set up a repo in the location that your code needs to be deployed, and push to it via git. This is a good instinct, but by default git will not let you push to the checked-out branch of a working copy. To resolve this, you will create a bare repository on your server, and push to it. You will also set up a git hook to automatically deploy your application when code gets pushed to the bare repository.
Before you start, your codebase needs to be in a git repository. This could be a Github repository that you use for version control. I will assume that your codebase lives in one directory called project on your development machine, which I will refer to as develop.
This codebase will be deployed to your server. I will refer to your server as deploy.
Now, you are going to create a bare git repository on deploy, and you will be able to push to it from develop.
You will now set up your codebase on develop to push to the repos/project.git directory on deploy.
This will push your codebase to the bare repository you just created on deploy. You can verify this by cloning the bare repository if you'd like.
Now that we are pushing to the repos/project.git directory on deploy, let's set up our repository to actually deploy its code. I'll assume that your application gets deployed to /var/www/myproject.com.
The post-receive hook gets called by git right after code gets pushed to a repository (right after git push deploy master). We will make this hook deploy your application to /var/www/myproject.com. Using an editor of your choice, place the following in the post-receive file.
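A git hook can be any executable, so here is one possible sketch of a post-receive hook written in Python. It is not necessarily the hook this post originally used. The absolute path to the bare repository is an assumption (I only know it ends in repos/project.git), and the helper names are mine.

```python
#!/usr/bin/env python3
import subprocess

# Assumed location of the bare repository on deploy; adjust to wherever
# repos/project.git actually lives for your user.
GIT_DIR = "/home/user/repos/project.git"
WORK_TREE = "/var/www/myproject.com"
BRANCH = "master"


def pushed_branches(lines):
    """Parse post-receive input lines of the form '<old> <new> <ref>'."""
    branches = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[2].startswith("refs/heads/"):
            branches.append(parts[2][len("refs/heads/"):])
    return branches


def checkout_argv(git_dir=GIT_DIR, work_tree=WORK_TREE, branch=BRANCH):
    """Command that force-checks-out branch from the bare repo into work_tree."""
    return ["git", "--git-dir=" + git_dir, "--work-tree=" + work_tree,
            "checkout", "-f", branch]


def deploy(lines):
    """Check out BRANCH into WORK_TREE if it was part of this push."""
    if BRANCH not in pushed_branches(lines):
        return 0  # some other branch was pushed; nothing to deploy
    return subprocess.call(checkout_argv())


# When saving this as hooks/post-receive, end the file with:
#     import sys; sys.exit(deploy(sys.stdin))
```

Git feeds the hook one line per updated ref on standard input, which is why deploy takes the lines as an argument. Checking the branch name means that pushing a feature branch does not overwrite what is live.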
Make the hook executable.
Make sure that your user has permissions to write to /var/www/myproject.com. This is it! You can now deploy your code anytime you want by running:
Verify that your code is deployed when you push, and you should never need to use scp to deploy ever again.