Urchin development and support will be discontinued by Google as of March 2012. Urchin was log analysis software bought by Google in 2005. Google used it as the base for Google Analytics, and has now announced that it will focus exclusively on Google Analytics. We have since received a number of emails from Urchin users asking whether Piwik could be set up to carry out log analysis in the same way as Urchin, and to import all past logs into a Piwik server.

We are happy to say that we have been developing a powerful, simple-to-use script that analyses your web server log files (Apache, Nginx, IIS, Akamai, etc.) and imports visits and page views into Piwik.

We hope that over the next few months, Piwik will become the best alternative to Urchin and AWStats (and others).

! UPDATE 2012, March 20th !

We have now released the beta version of the script to import and analyze server logs using Piwik.

Find all of the documentation and details on the Server Log Analytics page.

Piwik Features When Used to Import Log Files

Piwik normally uses JavaScript code to track visits and pages. This new script also makes it easy to track visits by importing one or many web server log files into Piwik. Here are some examples of when you might want to use the script:

  • if you are unable to add the JS code to the websites
  • if you wish to import a large amount of historical data at once
  • if you are looking for software that does the same thing as Urchin, AWStats, Webalizer or Webtrends.

Some features of the Piwik log import script include:

  • Great performance: we have successfully tested it tracking several million log lines per day. See the Piwik for high traffic websites checklist.
  • Bot traffic is automatically excluded, which keeps your web analytics reports clean and useful and improves performance.
  • Track using more than one method: Piwik can track some websites with the standard JavaScript code, and other websites could be tracked by importing the access logs. You could, for example, use JavaScript tagging for websites 1 and 3, and log import for sites 2 and 4. We expect these hybrid Piwik servers to become a common configuration among the community.
  • File downloads appearing in the logs will be automatically tracked as “Downloads” in Piwik.
  • Access to all Piwik features: because logs are imported via the Tracking API, all Piwik features will be supported (Goal tracking based on URL, IP Anonymization, Visitor log, etc.)
  • This script will effectively replace Apache2Piwik, as the new tool provides more features and better performance.
  • In later versions we are planning to support log reprocessing, error code tracking, search engine and spam bot tracking, features that use the logs to enhance existing JS-tracked pages, and more (based on user popularity and feedback).

Note: Some reports will have no data because log data is more limited than the data obtained via JavaScript. For example, screen resolutions, supported browser plugins, Custom Variables and Ecommerce Analytics will not work.
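
To make the bot exclusion and download tracking features above more concrete, here is a minimal Python sketch of how a single access log line might be classified. It is not the code of the actual import script: the `classify` function, the `BOT_MARKERS` list and the `DOWNLOAD_EXTENSIONS` list are hypothetical names and values chosen for illustration, and the regular expression assumes the common Apache/Nginx "combined" log format.

```python
import re

# Assumption: logs use the Apache/Nginx "combined" format.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Illustrative values only; a real importer would use much longer lists.
BOT_MARKERS = ("bot", "crawler", "spider")
DOWNLOAD_EXTENSIONS = (".pdf", ".zip", ".gz", ".exe", ".mp3")


def classify(line):
    """Return 'invalid', 'bot', 'download' or 'pageview' for one log line."""
    match = COMBINED.match(line)
    if match is None:
        return "invalid"

    # Bots are recognised here by a substring of the user agent.
    user_agent = match.group("user_agent").lower()
    if any(marker in user_agent for marker in BOT_MARKERS):
        return "bot"

    # Downloads are recognised by the file extension, ignoring the query string.
    path = match.group("path").split("?", 1)[0].lower()
    if path.endswith(DOWNLOAD_EXTENSIONS):
        return "download"

    return "pageview"
```

Lines classified as "bot" would simply be skipped, while "download" lines would be reported to Piwik as downloads rather than page views.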

This script is written in Python and is released for free under the GPL license (just like Piwik!).
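
Because logs are imported via the Tracking API, each parsed log line ultimately becomes one tracking request. The sketch below shows one way such a request's query string could be assembled in Python. The `build_tracking_query` helper and the keys of the `hit` dictionary are hypothetical; the parameter names (`idsite`, `rec`, `url`, `urlref`, `ua`, `cip`, `cdt`, `download`, `token_auth`) reflect our reading of the Tracking API and should be checked against its documentation before use.

```python
from urllib.parse import urlencode


def build_tracking_query(site_id, hit, token_auth):
    """Build the query string for one tracking request from a parsed log hit.

    `hit` is a hypothetical dict with the keys 'url', 'ip' and 'datetime',
    and optionally 'referrer', 'user_agent' and 'is_download'.
    """
    params = {
        "idsite": site_id,
        "rec": 1,
        "url": hit["url"],
        "urlref": hit.get("referrer", ""),
        "ua": hit.get("user_agent", ""),
        # Overriding the visitor IP and the date of the request is what lets
        # historical logs be imported; this needs an authentication token.
        "cip": hit["ip"],
        "cdt": hit["datetime"],
        "token_auth": token_auth,
    }
    if hit.get("is_download"):
        params["download"] = hit["url"]
    return urlencode(params)
```

The resulting string would be appended to the URL of the tracker endpoint on your Piwik server, one request per log line (or batched, depending on the importer).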

Perfect for Web Hosting Companies and Web Agencies, and for One-Off Log Imports

The script will have two modes:

  1. Web Host – web analytics provider user
    This mode is ideal for web hosts, where new websites are often added in the access logs, but the Piwik admin does not wish to manually create each website. The script will automatically detect the Piwik website ID to track based on the URL being parsed: it will look for any Piwik website registered with a URL or “Alias URL” set to this page view host. If a website with the hostname doesn’t exist, a new website is automatically created for this URL.
    A summary is then emailed to the Piwik Super User, so they know which websites were automatically created by the log import script and can create users or assign permissions to view these new websites.
  2. Simple log import for one or a few websites only
    This mode is ideal if you import only a small number of websites, or if you wish to control exactly which websites requests are tracked in.
    When a line contains a URL for a website unknown to Piwik, the script will ignore these page views and will report, at the end of its execution, the list of hostnames that were not matched to any website in Piwik.
    If these unknown URLs turn out to be legitimate page views, you can either create a new website manually, or add an Alias URL to an existing website, so the page URLs are directly tracked in this website the next time you import similar logs.
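
The difference between the two modes comes down to what happens when a hostname is not yet known. The following Python sketch illustrates that logic under stated assumptions: the `SiteResolver` class and its attributes are hypothetical, websites are held in memory rather than looked up in Piwik, and every registered URL is assumed to include its scheme (for example `http://`).

```python
from urllib.parse import urlsplit


class SiteResolver:
    """Map a log line's hostname to a website ID (illustrative only)."""

    def __init__(self, sites, auto_create=False):
        # sites: {website ID: [main URL, alias URL, ...]}
        self.host_to_id = {}
        for site_id, urls in sites.items():
            for url in urls:
                self.host_to_id[urlsplit(url).hostname] = site_id
        self.auto_create = auto_create      # True = web host mode
        self.next_id = max(sites, default=0) + 1
        self.created = {}                   # hostname -> new ID, for the summary email
        self.unmatched = set()              # hostnames reported at the end of the run

    def resolve(self, hostname):
        hostname = hostname.lower()
        site_id = self.host_to_id.get(hostname)
        if site_id is not None:
            return site_id
        if self.auto_create:
            # Mode 1: create a new website for the unknown hostname.
            site_id = self.next_id
            self.next_id += 1
            self.host_to_id[hostname] = site_id
            self.created[hostname] = site_id
            return site_id
        # Mode 2: ignore the page view and remember the hostname.
        self.unmatched.add(hostname)
        return None
```

In web host mode, `created` would feed the summary sent to the Super User; in simple import mode, `unmatched` is the list of hostnames printed once the import finishes.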

Join the Beta Testing Group

To be part of our beta testing group, please email us at hello@piwik.org and mention the testing of the Urchin/AWStats log import script. Please also mention the number of websites you wish to track, how many pages per day they serve, and whether you are willing to test the script and report bugs or feedback.

Featured Sponsor: a Web Host that Tracks Millions of Log Lines with Piwik

This work is sponsored by Alwaysdata, a French web hosting company. They provide Piwik as their web analytics package of choice, deprecating AWStats, for thousands of their users. They have been using Piwik for a few years, and we are finally integrating log import as a key feature of Piwik, as well as ensuring good performance for the script. We want to make it easy for web hosts and large web agencies to use Piwik as their web analytics platform.

Goodbye Urchin + Scale of Google Analytics in 2012

The Google Analytics team has decided to focus on the privately hosted Google Analytics (GA) service and discontinue the log analysis version (Urchin). At Piwik we are quite simply amazed at the scale and reach of Google Analytics in 2012: GA is used by over 55% of all internet websites (source). At least 15 million websites use Google Analytics! (source). In comparison, Piwik is used by 1% of the Internet (cheers!) and 250k+ websites.

Millions of pings (page views) are tracked by GA per SECOND. This is enough to make any software developer speechless. We can only congratulate Google engineers and product designers for the work they are doing to track and aggregate so much data, while allowing users to slice this data in real time across dozens of dimensions. This is an amazing technical milestone. We also hope that Google users’ privacy will be respected and privacy standards will improve in the future.

Regarding the end of Urchin, we at Piwik will do our best to give existing Urchin users a good experience when they move to Piwik to try the leading free software platform. If you are an Urchin user and would like to try Piwik, send us an email at hello@piwik.org describing your current setup. We will help check whether similar functionality is achievable with Piwik and the log import script.

Privacy & Security implications of self hosting your web analytics data

Keeping full control over your customers’ log files and your Piwik database is an important requirement if you are a web agency or a web host providing web analytics to hundreds or thousands of users.

The tips on the Privacy page will help ensure that you make the changes to data collection and data retention required by your Privacy Policy. We also focus on code security and recommend that all Piwik users spend some time securing their Piwik server.

Piwik Also an Alternative to AWStats & Webalizer: Modern UI, Better Performance, and More!

We hope that Piwik will become the leading alternative to Urchin and to AWStats. AWStats was a great tool, but we hope to modernize the world of open source log analysis software and make use of all the great Piwik features and capabilities for data analysis and graphing. Users in 2012 and beyond will need a modern interface to access the data gathered from their web server access logs.

We expect a release in 1-2 months. Stay tuned…

Happy log file importing with Piwik!


Piwik Core Team
