Parsing User Agents

For many years, Gary Keith maintained a set of files as part of the Browser Capabilities Project. They contained thousands of entries mapping regular expressions to browser families, versions, platforms and capabilities, and were used by parsers written in a dozen languages to turn website visitors’ user-agent strings into structured data about their systems. I’m truly thankful for the hard work Gary put into maintaining those files single-handedly for so many years.

At the end of 2012, after many months of warnings about his health and his declining ability to run the project, Gary closed it with no new maintainer. Nobody has stepped up to provide the same level of support he did, and the project is no longer a reliable source of information about browsers and their capabilities.

If, like me, you need to parse user agents for browser and platform information, I recommend the ua-parser project. It’s up to date, has a team of contributors, takes pull requests on GitHub and provides libraries in 10 languages. It’s also much simpler to maintain, with only 1,000 lines of YAML rather than 28,000 lines of INI files.
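To make the regex-to-browser mapping concrete, here is a minimal sketch of the general technique in Python. The rule list, the `Browser` type and `parse_user_agent` are names I made up for illustration; these four patterns are not ua-parser's actual rules and this is not its API, just the same idea at toy scale.

```python
import re
from typing import NamedTuple, Optional


class Browser(NamedTuple):
    family: str
    major: Optional[str]


# Hypothetical, hand-written rules for illustration only; the real rule
# sets are far larger. Order matters: Chrome's user-agent string also
# contains "Safari/", so the Chrome rule has to come before Safari.
RULES = [
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
    ("IE", re.compile(r"MSIE (\d+)")),
    ("Safari", re.compile(r"Version/(\d+).*Safari/")),
]


def parse_user_agent(ua: str) -> Browser:
    """Return the first matching family and its major version."""
    for family, pattern in RULES:
        match = pattern.search(ua)
        if match:
            return Browser(family, match.group(1))
    # Nothing matched: report an unknown browser rather than guessing.
    return Browser("Other", None)


chrome_ua = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36"
)
print(parse_user_agent(chrome_ua))
```

The first-match-wins loop is why rule order matters so much in files like these, and why maintaining thousands of entries by hand is real work.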

I’ll be using ua-parser as the basis of W3Counter’s browser and platform reports going forward.

Summer Workspace

Workspace

Sunroom, wall-to-wall sliding windows, fold-up table. The Surface Pro comes with me when I go out, for server emergencies.

A reader lives a thousand lives before he dies, said Jojen. The man who never reads lives only one.

— George R.R. Martin, A Dance With Dragons

What startups need to know about the “internet sales tax bill”

This week, there is a good possibility the Senate will pass the Marketplace Fairness Act, often called the “internet sales tax bill”. Here’s what this bill will do:

  • Online sellers will be required to collect, report and pay sales taxes in all of the states (once the states meet certain requirements), rather than only the states the seller has a physical presence in.
     
  • Each state that wants “remote sellers” to collect sales tax must establish a single entity to manage tax collection and audits for the entire state. Sellers won’t have to deal with all 5,900+ separate taxing municipalities in the country, just 50.
     
  • Each state must provide a database indicating the types of products and services taxed, and at what rates and with what boundaries. If this is implemented like previous “internet sales tax” proposals, that means the database should map 5- and 9-digit zip codes to tax rates for each category of product or service to be taxed.
     
  • Each state must provide free software for both the calculation of sales taxes due at the time a transaction is being completed, and software to prepare sales tax returns.
     
  • There is a “small seller exemption”. If you collect less than $1,000,000 a year from out-of-state customers (based on the previous year), you will have no new obligations under this bill.
     
  • Many startups are exempt from collecting sales tax in their home states as they only sell services rather than tangible goods. Even if you’re exempt in your home state, you may not be exempt in others. Several states charge sales tax on all services, others charge sales tax on certain categories of service, and any of them could change their taxable classifications as part of opting in to the new systems this bill creates.
     
  • It’s very likely states will be providing APIs for computing sales tax; the bill requires their software be able to provide a tax rate for a specific online transaction as it’s taking place.
     
  • These APIs will all have to follow the same rules for determining whose tax applies to a specific customer: based on the delivery address provided by the customer; if not provided, then based on the customer’s address; if not provided, then the address of the customer’s payment instrument; if not provided, then the tax will be based on the location of the seller.
     
  • The bill encourages a new type of business into existence: “certified software providers”. These new software businesses can become certified by each state in computing and filing sales tax using that state’s APIs.
     
  • The bill creates a benefit for businesses to use these new certified services rather than integrate with all 50 states on their own. A business will not be liable for errors in its tax returns if they were prepared by a certified provider. That means no penalties or fees for mistakes.
     
  • The providers in turn have no liability for errors in the taxes they calculate and returns they prepare if the errors are a result of inaccurate information from a seller (e.g. miscategorized products or services), or inaccurate information from a state (e.g. the state’s API returning the wrong rate).
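For developers, the two rules above that translate most directly into code are the address fallback order and the small seller exemption. Here is a rough sketch of both in Python, as I've summarized them in this post rather than as the bill's legal text. The `Order` type, its field names and both function names are hypothetical, and I've simplified locations down to a state code; a real lookup would need the full address or zip code.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold as summarized in this post: less than $1,000,000 a year
# collected from out-of-state customers, based on the previous year.
SMALL_SELLER_THRESHOLD = 1_000_000


@dataclass
class Order:
    # All field names are illustrative assumptions, not from any real API.
    seller_state: str
    delivery_state: Optional[str] = None
    customer_state: Optional[str] = None
    payment_state: Optional[str] = None


def taxing_state(order: Order) -> str:
    """Pick the state whose tax applies, using the fallback order above."""
    candidates = (
        order.delivery_state,   # 1. delivery address, if provided
        order.customer_state,   # 2. otherwise the customer's address
        order.payment_state,    # 3. otherwise the payment instrument's address
    )
    for state in candidates:
        if state:
            return state
    return order.seller_state   # 4. otherwise the seller's location


def is_small_seller(prior_year_remote_sales: float) -> bool:
    """True if last year's out-of-state sales fall under the exemption."""
    return prior_year_remote_sales < SMALL_SELLER_THRESHOLD


order = Order(seller_state="PA", customer_state="NJ", payment_state="NY")
print(taxing_state(order))
```

Here no delivery address was given, so the customer's address decides and the payment address is never consulted.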
     

There’s little else contained in this relatively straightforward bill. Should it pass, online sellers will eventually be collecting sales tax for most of their US customers.

Since integrating 50 different software packages into every online store, filing 50 different sales tax returns, cutting 50 checks, and getting audited by 50 states is not an appealing idea to most small businesses, they’re almost guaranteed to pay a certified software provider to handle it.

Portal on the Big Screen

J.J. Abrams and Valve are in talks for a Half-Life or Portal film. I’d watch either one; I definitely wanted to see more after watching the live-action teaser based on Portal.

I’m still waiting for the World of Warcraft live-action film that was promised back in 2006. Legendary Pictures only chose a director for it this week, 7 years later, and announced a 2015 release date. The game it’s based on will be 11 years old at that point — no doubt still popular, though, given the franchise is already 19 years old and still 10 million subscribers strong.

PLURALITY

Who Hosts the Y Combinator Startups?

Every couple of months I re-evaluate my hosting choices to ensure they still make sense in terms of cost, stability and service. Curious as to what choices other technology companies are making, I decided to conduct an independent survey of who hosts the Y Combinator-funded startups. Armed with only this spreadsheet listing the over 300 websites, my terminal and nslookup, I compiled this graph:

Graph of Y Combinator Startup Hosts

In all, I found 289 websites that weren’t dead or merged into another company’s website. Almost 3/4 of all sites were hosted by Amazon, Rackspace, Softlayer or Linode.
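If you want to repeat this kind of survey, the tallying step is easy to script. The sketch below is one possible approach in Python, not a record of exactly how I did it: it assumes you have already collected a reverse DNS name for each site (for example from nslookup output), and the suffix table is a small hypothetical sample that would need to be extended and checked for each provider.

```python
from collections import Counter
from typing import Iterable

# Hypothetical sample of reverse-DNS suffixes; extend and verify before
# relying on it. Hosts that match nothing are counted as "Other".
PROVIDER_SUFFIXES = {
    "amazonaws.com": "Amazon",
    "linode.com": "Linode",
    "sl-reverse.com": "SoftLayer",
}


def classify_host(reverse_name: str) -> str:
    """Map a reverse DNS name to a provider by its domain suffix."""
    name = reverse_name.rstrip(".").lower()  # drop the trailing root dot
    for suffix, provider in PROVIDER_SUFFIXES.items():
        if name == suffix or name.endswith("." + suffix):
            return provider
    return "Other"


def tally(reverse_names: Iterable[str]) -> Counter:
    """Count how many hosts fall under each provider."""
    return Counter(classify_host(name) for name in reverse_names)


sample = [
    "ec2-54-0-0-1.compute-1.amazonaws.com.",
    "li123-45.members.linode.com.",
    "ec2-54-0-0-2.compute-1.amazonaws.com.",
    "server.example.org.",
]
print(tally(sample))
```

Matching on a dot-prefixed suffix avoids false positives from unrelated domains that merely end in the same letters.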

The Secret to Finding a Startup Domain

I’m terrible at naming products. Meaningful one- and two-word .com domains that aren’t already owned by someone are virtually nonexistent. Put those two things together and I spent almost 3 entire days trying to find a name for my latest app without coming up with a single viable domain.

Then I found Stylate through a comment at Hacker News.

They exist solely to solve this exact problem — brand packages for startups with brandable names, matching .com domains and logos. Within minutes I found Improvely among their available names and knew it would be better than anything I’d come up with myself, and snapped it up for only $250. Compared to spending another few days searching names on my own, and likely coming up with some unbrandable combination of words, Stylate was a bargain.

Improvely gets its own blog

Improvely now has its own blog, where I’ll be announcing new features and integrations, and occasionally sharing tips for measuring and optimizing conversion rates.

What Hacker News Users Use

Yesterday, the post about the Date Range Picker that I created for Improvely spent almost 24 hours on the front page of Hacker News. That led to 20,910 new unique visits to this blog with, at the peak, over 800 concurrent people with the post open in a browser.

W3Counter

Of course, I have W3Counter collecting analytics on my sites, which made this a great opportunity to discover what kinds of systems Hacker News users use during their work days. I filtered out the visits from other sources and compiled a couple of graphics.

Operating Systems

Nearly half of HN readers are working on Macs, with Windows used by just 27%. 1 in 10 are browsing the web on Linux systems, while the remaining nearly 20% are on mobile devices. Not shown in the chart are the 16 Windows Phone 7 owners and 7 people on Chromebooks.

Browser

Chrome was the browser of choice for 63% of the new visitors. Surprisingly, there were more people reading from an iPad or iPhone than users of Firefox on the desktop, and fewer than 200 of the 21,000 visitors used any version of Internet Explorer.

Displays

Some flavor of MacBook appears to be the most popular system among Hacker News readers, with 1440×900 displays and OS X as the most common combination.

Hi, I'm Dan Grossman, the Philly-area creator of Improvely, W3Counter, and some open source stuff.