As a form of procrastination, I wrote my first scraper with PhantomJS about two months ago. It would visit Hacker News and store some data about each article on the front page: the link, the number of comments, the karma, and the position. I set it to run every 30 minutes on my server and then left it alone.

I did this even though I knew there was an API, but at the time I just wanted to use the awesome scraper that I wrote, and I wanted the data straight from the source. One of my goals in scraping HN was to compute some random (probably useless) statistics from the data. So here they are, thanks to some Python scripts I wrote to make sense of the scraped data. Sorry for the extra 48 requests to the front page every day.
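The per-day numbers below were produced by scripts along these lines. This is a minimal sketch, not the actual scripts: it assumes each scrape snapshot is a `(date, articles)` pair where every article is a dict with `link`, `karma`, and `comments` keys, which is a guess at the storage format. Because the same article shows up in many half-hourly snapshots, the sketch keeps the highest karma seen per link per day before summing.

```python
from collections import defaultdict

def daily_stats(snapshots):
    """Aggregate front-page snapshots into per-day statistics.

    snapshots: iterable of (day, articles) pairs; each article is a dict
    with 'link', 'karma', and 'comments' keys (hypothetical shape).
    """
    # For each day, keep the best-scoring record per link, since the same
    # article appears in many consecutive snapshots of the front page.
    per_day = defaultdict(dict)
    for day, articles in snapshots:
        for article in articles:
            seen = per_day[day].get(article["link"])
            if seen is None or article["karma"] > seen["karma"]:
                per_day[day][article["link"]] = article
    return {
        day: {
            "unique_articles": len(articles),
            "karma": sum(a["karma"] for a in articles.values()),
            "comments": sum(a["comments"] for a in articles.values()),
        }
        for day, articles in per_day.items()
    }
```

Deduplicating by link before summing is what makes "unique articles per day" meaningful; summing raw snapshots would count every article up to 48 times.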

In a Week

[Charts: total karma, total posts, karma per day, unique articles per day, comments per day]

In a day

[Chart: new karma and new posts per day; hovering over the graph switched the background guide lines between new karma and new posts]

Articles with karma between 0 and 50:

Articles with karma between 51 and 100:

Articles with karma between 101 and 200:

Articles with karma over 200:

Top article:
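The karma buckets above come down to a single counting pass. A sketch, assuming the final karma of each unique article is already known (the function name and bucket labels are mine):

```python
def karma_buckets(karmas):
    """Count articles in the ranges 0-50, 51-100, 101-200, and over 200."""
    buckets = {"0-50": 0, "51-100": 0, "101-200": 0, "200+": 0}
    for karma in karmas:
        if karma <= 50:
            buckets["0-50"] += 1
        elif karma <= 100:
            buckets["51-100"] += 1
        elif karma <= 200:
            buckets["101-200"] += 1
        else:
            buckets["200+"] += 1
    return buckets
```

The top article is then just `max(karmas)` over the same deduplicated set.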