hubReports: Project review

hubReports is a side project of mine. It's a site that gathers data from GitHub on a daily basis and presents it via the website. I went live with the latest iteration of it on November 1st 2013 and thought it was time to review how I think it did and where to go next.

The Launch

The project had previously existed under a different domain, still with the hubReports name but with a slightly different focus. After rewriting the project and adjusting its target, I felt ready to go public.

Page views for the first week (all stats mentioned are from Google Analytics) totalled 1,172, followed by 1,444 the next. Not a bad start, but that was with a lot of tweeting, Google+ posts and a blog post. The big factor of the launch was being featured in the "StatusCode" weekly newsletter, issue #46, from Peter Cooper of JavaScript Weekly. Off the back of that I hit 6,161 page views that week and 5,287 the week after.

Initial Response

I didn't receive much in the way of feedback except for a few comments from people on Twitter. It seemed to be liked as there were plenty of tweets that simply shared the site, which also helped pull in more visitors. The lack of more in-depth feedback did make me wonder which parts of the site were working well and which needed the most attention.

Back down to Earth

So far this year, the site gets around 200 to 300 page views a week. Not amazingly busy, but considering this whole project was based on teaching myself new stuff and better techniques, I'm reasonably happy with that number.

What went wrong?

AngularJS and SEO

Early in the project I decided to use AngularJS. There were a few reasons behind this, including some planned features being easier to implement and trying to keep the bandwidth usage to an absolute minimum, mostly because of my hosting budget (zero). Unfortunately, AngularJS-based sites aren't easily indexable by search engines, as they require JavaScript to render. There are a few techniques I've seen that provide server-rendered versions or data to search engines, but I haven't looked into those yet and don't know if any would be out of reach for my available resources.
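To give a rough idea of what one of those techniques could look like, here is a minimal sketch. It assumes the crawler follows Google's AJAX crawling scheme, where a hashbang URL such as /#!/language/javascript is requested as /?_escaped_fragment_=/language/javascript, and that pre-rendered HTML snapshots already exist on disk. The function name, the snapshot directory and the file naming are all made up for illustration and are not how hubReports works today.

```javascript
// Hypothetical helper: maps a crawler request to a pre-rendered snapshot file.
// Assumes hashbang (#!) routes and a directory of HTML snapshots generated
// ahead of time. Returns null for ordinary visitors.
function snapshotPathFor(requestUrl, snapshotDir) {
  const query = requestUrl.split('?')[1] || '';
  for (const pair of query.split('&')) {
    const idx = pair.indexOf('=');
    const key = idx === -1 ? pair : pair.slice(0, idx);
    if (key !== '_escaped_fragment_') continue;

    let route;
    try {
      route = decodeURIComponent(idx === -1 ? '' : pair.slice(idx + 1));
    } catch (err) {
      return null; // malformed encoding: treat it as a normal request
    }

    // Strip leading/trailing slashes and replace anything unsafe for a
    // file name, so "/language/javascript" becomes "language_javascript".
    const name = route
      .replace(/^\/+|\/+$/g, '')
      .replace(/[^A-Za-z0-9_-]+/g, '_') || 'index';
    return snapshotDir + '/' + name + '.html';
  }
  return null;
}
```

In an Express app, a middleware could call this with req.url and send the snapshot file when the result isn't null, falling through to the normal AngularJS page otherwise. Generating the snapshots in the first place is the expensive part, and is where my resource worries come in.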

Not being easy for search engines to index makes the site very difficult to find. The majority of visitors are referred from the StatusCode newsletters, blog posts and tweets.

GitHub weekly

No hard feelings towards GitHub, I couldn't have done hubReports without them, but they went and created a weekly newsletter for trending repositories. That's going to cut into my expected audience a little, although it probably hasn't made much of an impact.

Displaying data

I have all this data, but I'm not entirely sure what the best method for displaying it is. I also have certain features planned (allowing language / repo / user comparisons), but without any feedback I've held back on implementing them, as I'm unsure of the best way to display the data. That hesitation is purely down to my worry about how the existing site is being received.

The missing newsletters

There was a "News generation" feature planned, but I haven't had the chance to work on it enough. It was to be used as the basis of weekly newsletters for each "top / specially selected" language that hubReports monitors. Having those newsletters available would, I believe, increase the regular audience of the site.

Data, data, data...

hubReports gathers all of its data from the GitHub API. While that has the advantage of making my life easier, there has been the odd bug with the API that causes headaches in my data (missing repositories / users) or the failure of a collection for that day. I'd much prefer to be working with the data from the GitHub Archive, but from my initial investigation that isn't quite on the cards. It's a lot of data to process, and I don't have the resources to constantly process the feed for statistics without impacting the hubReports site itself.

What went well


Even though AngularJS made my life difficult with search engines, it made my life easier when it came to putting the site together.

Learning experience

Throwing this project together gave me the chance to work with Node.js, MongoDB, OpenShift hosting, CDNs, AngularJS, Grunt, Express and a few others. While I'd toyed around with some of this stuff already, it was nice to have a set purpose to aim for and apply them to.

The world of tomorrow

Where will hubReports go from here? I've decided to focus on a few areas that I felt let the project down:

  • Comparisons: I'd like to provide a way of selecting another language / repository / user while viewing one, along with an easy way to compare their statistics. This would be more interesting if I throw a dynamic graph in there, so the compared statistics can be seen over time.
  • News generation: For the goal of creating newsletters, I need (and want) more content in them than just tables of trending repositories and a few graphs.
  • Increase search engine friendliness: Some time definitely needs to be spent making the site more indexable. If for whatever reason I find it out of reach for the resources I have available, then so be it.
  • Data retention: I must make this a priority. The database is hosted on the free tier of OpenShift and I'll eventually hit its limit, with the entire site probably grinding to a halt or forever serving out-of-date data. My initial target will be repositories / users that haven't been seen by hubReports for a set period of time. That'll buy me a decent amount of extra time. If that turns out to be a simple enough job, I may also look at how to deal with data retention of statistics, possibly going with an RRD-like solution to prevent data growth by averaging out stats over time (the older data gets, the lower the resolution it is held at).
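To make the RRD-like idea a bit more concrete, here is a minimal sketch of the averaging step. The data shape (an integer day number plus a stars count), the function name and the thresholds are all assumptions for illustration, not the actual hubReports schema. Recent points are kept as they are, while older points are collapsed into one averaged point per bucket.

```javascript
// Hypothetical data shape: { day: <integer day number>, stars: <number> }.
// Points newer than fullResDays are kept untouched; older points are
// averaged into one point per bucket of bucketDays days.
function downsample(points, today, fullResDays, bucketDays) {
  const recent = [];
  const buckets = new Map(); // bucket start day -> { sum, count }

  for (const p of points) {
    if (today - p.day < fullResDays) {
      recent.push(p);
      continue;
    }
    const bucketStart = Math.floor(p.day / bucketDays) * bucketDays;
    const bucket = buckets.get(bucketStart) || { sum: 0, count: 0 };
    bucket.sum += p.stars;
    bucket.count += 1;
    buckets.set(bucketStart, bucket);
  }

  const averaged = [];
  for (const [day, bucket] of buckets) {
    averaged.push({ day: day, stars: bucket.sum / bucket.count });
  }
  // Return everything in chronological order.
  return averaged.concat(recent).sort((a, b) => a.day - b.day);
}
```

Running something like this daily would cap how much old data is held at full resolution. One caveat with the sketch: feeding already-averaged points back through it would average the averages, so a real version would want to store the count alongside each averaged point.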


Any comments on any of this are more than welcome. But if you think the whole project is pointless, it may not be worth saying so: I value the site primarily as a learning experience, so it wouldn't be entirely constructive if you did ;-)


David Boyer

Full-stack web developer from Cardiff, Wales. With a love for JavaScript, especially from within Node.js.