Thoughts on W3Counter page view limits

I started writing this as a response to Josh’s comment on my last post, but it’s getting long enough to warrant a new post and, if anyone’s interested, some discussion in the comments here.

Dan, quick question about the new pricing… is the page views limit new? I was planning to upgrade to a paid account when the new version came out anyway (so I could track more than one site)… But 10k pages per day minimum seems low (right now my site is at 4-6k per day, but growing… I’d anticipate crossing 10k/day this year).

Right now the published limit, which isn’t enforced anywhere in the code, is 500,000 a month. That’s roughly 16,700 per day, which is on the same order of magnitude as 10,000. Regardless, it’s still up in the air. I’m not set on what I want to do here.

My ideal would be to provide stats to thousands more free accounts that have at most a few hundred page views per day. That’s probably 99% of all web sites. Each site would add only one query every few seconds to the overall load, its log table would grow by only a few hundred rows a day, which keeps pruning and optimizing fast, and the reports would show weeks of data, since the log table size divided by daily page views would be high.

On the other hand, if someone wants to pay for the better account, I don’t want to deny them that. The issues are that one paying customer can use as many resources as 10 or even 100 smaller sites, their log table grows by many thousands of rows during the day, which makes querying it require more memory, CPU and time as the day goes on, and pruning the table takes longer during nightly maintenance. Worst of all, the reports are less useful for these users, since their log tables may only represent a few days of activity.
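To put rough numbers on that trade-off, here is a minimal sketch of the "log table size divided by daily page views" arithmetic. The 25,000-row cap and the traffic figures are illustrative assumptions, and `days_of_history` is a hypothetical helper, not anything from W3Counter's code.

```python
def days_of_history(log_rows: int, daily_page_views: int) -> float:
    """Days of activity a capped log table covers, assuming one row per page view."""
    if daily_page_views <= 0:
        raise ValueError("daily_page_views must be positive")
    return log_rows / daily_page_views


LOG_ROWS = 25_000  # assumed cap on rows kept per site, for illustration only

# A small site, a mid-sized site, and a site right at a 10,000/day limit.
for views in (300, 5_000, 10_000):
    days = days_of_history(LOG_ROWS, views)
    print(f"{views:>6} views/day -> {days:.1f} days of data")
```

With the same cap, the small site's reports span months while the large site's span only a few days, which is the imbalance described above.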

The obvious-sounding solution to that is to increase the log size for the upgraded plans. Right now when a table hits about 30,000 rows, some of the more complex queries (like the exit pages / bounce rate report) start taking more than a second. That’s significant: locking tables for a few seconds means INSERT queries trying to hit them start queuing up, and should the number of sites in that situation become too large, that backlog could grow large enough that the server never catches up.
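The "never catch up" risk can be illustrated with a toy queue model. This is only a sketch under simple assumptions (a fixed number of INSERTs arriving each second, a fixed number the server can process per unlocked second, and no processing at all while a report holds the lock); the function name and all the numbers are made up for illustration.

```python
def backlog_over_time(arrivals_per_sec, capacity_per_sec, locked_seconds, total_seconds):
    """Return the number of queued INSERTs at the end of each second.

    locked_seconds is the set of second indices during which a slow report
    holds the table lock, so no INSERT can be processed.
    """
    backlog = 0
    history = []
    for t in range(total_seconds):
        backlog += arrivals_per_sec          # new page views keep arriving
        if t not in locked_seconds:
            backlog -= min(backlog, capacity_per_sec)  # drain what we can
        history.append(backlog)
    return history


# One 3-second lock: the queue spikes, then drains because capacity > arrivals.
print(backlog_over_time(50, 80, {0, 1, 2}, 8))

# Locked every other second: effective capacity (40/s) is below arrivals (50/s),
# so the backlog keeps growing.
print(backlog_over_time(50, 80, set(range(0, 10, 2)), 10))
```

The second case is the dangerous one: once locks are frequent enough that average capacity falls below the arrival rate, the queue grows without bound.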

I did have the opportunity with the new version to write the entire data access layer from scratch, including rewriting all the queries and indexes on the tables. In some cases I was able to get a significant performance boost by splitting a complex query into two or more smaller queries. Usually that’s the opposite of what you want, since the trip to and from the database can eclipse the time spent running the query, but these were exceptions to that rule. It’s possible I’ve trimmed down the expensive queries enough that I can increase the log size without causing the lock contention I mentioned. I’ll have to tell the nightly pruning procedure to skip some tables and start testing against larger logs to see.

To sum all this up, I may enforce a higher limit than 10,000 per day, but it likely won’t be more than double that, and larger sites will still see reports based on only a few days of activity.

  • Max

    So even if someone pays for your largest “Premium” package, they can’t surpass 10,000 pageviews per day? I’m guessing 200,000 is out of the question :(

  • Dan

    This has to run on a computer, and computational power is still quite limited in 2007. Even with 4 cores and 4GB of RAM, it’s still just a computer, with one hard disk. How many sites of that size, that need to do multiple database queries on every page view, do you run on one server? W3Counter has to handle thousands of sites without blinking.

  • Mike (http://www.mikehealy.com.au)

    It seems to me you have an application that could very easily scale to multiple servers (as long as the paid accounts can support the cost of course!).

    The stats from siteX don’t really need to live beside those from siteY. They could be completely autonomous. Sure you are probably interested in global stats and averages, but you don’t need those reports to be 100% fresh.

    Why not just replicate your app across N servers? You would need to maintain multiple copies of the code and database, but so does all the distributed software in the world :)

    The servers wouldn’t even need to communicate with each other in a special way. They each just act as the database, file and application server for the accounts they host.

  • Dan

    Yep, that’s certainly doable, and not very hard. The design is set up to do that eventually if it’s necessary. But right now there just aren’t enough paying accounts. It takes a lot of $5 accounts to pay for beefy servers.

    Maybe this is the excuse I’ve been looking for to play around with Amazon Elastic Compute Cloud. I’ll look into it this weekend, and thanks for bringing it up.

  • Dan

    Some follow-up to this issue: I did some testing today on performance against larger log tables. While W3Counter 3 had several reports that took multiple seconds to generate, not one query showed up in my slow query log using W3Counter 4. Looks like I’ve knocked out all the query-related bottlenecks.

    That means I’m now more free to raise the log size, and I plan to do so. Stats for free accounts will still run off the last 25,000 page views, but I’ll be at least doubling that for premium accounts. I’ll also be increasing the daily page view limits; I’ll have chosen the new values by the time the new site goes up tomorrow night.

    Increasing log size does still increase the time it takes for nightly maintenance when the older entries are deleted to maintain the table size, but with a bit of artificial delay between queries (thanks to SLEEP()), that won’t be a serious issue.
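
The nightly pruning idea in the last comment, deleting old rows in small batches with a pause between queries, might look roughly like the sketch below. It is an illustration only: it uses Python's built-in `sqlite3` and `time.sleep()` as stand-ins for MySQL and `SLEEP()`, and the `log` table, its `id` column, and the `prune_log` function are hypothetical names, not W3Counter's actual schema or code.

```python
import sqlite3
import time


def prune_log(conn, max_rows, batch_size=1000, pause_seconds=0.05):
    """Delete the oldest rows in small batches until at most max_rows remain.

    Pausing between batches gives queued INSERTs a chance to run instead of
    waiting behind one long DELETE. Returns the number of rows deleted.
    """
    deleted = 0
    while True:
        (count,) = conn.execute("SELECT COUNT(*) FROM log").fetchone()
        excess = count - max_rows
        if excess <= 0:
            return deleted
        limit = min(batch_size, excess)
        # Lowest ids are assumed to be the oldest page views.
        conn.execute(
            "DELETE FROM log WHERE id IN (SELECT id FROM log ORDER BY id LIMIT ?)",
            (limit,),
        )
        conn.commit()
        deleted += limit
        time.sleep(pause_seconds)  # the artificial delay between queries


# Demo on an in-memory table with 5,000 rows, trimmed down to 2,000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, page TEXT)")
conn.executemany("INSERT INTO log (page) VALUES (?)", [("/",)] * 5000)
removed = prune_log(conn, max_rows=2000, batch_size=500, pause_seconds=0.001)
remaining = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(removed, remaining)
```

Smaller batches and longer pauses make maintenance take longer overall but keep each individual lock short, which is the trade-off the comment describes.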