Web Analytics Privacy in Piwik

This guide explains the Privacy implications of tracking your Users' Web Analytics data, and how Piwik can easily be configured to ensure your Users' privacy is respected.

Piwik ensures the Privacy of your Users and Analytics data. When using Piwik, YOU keep control of your data; nobody else does. Your data is stored in your own MySQL database, and logs or report data will never be sent to other servers by Piwik. In March 2011, the Independent Center for Privacy Protection in Germany (ULD) recommended Piwik as Privacy compliant Web Analytics software. Piwik's Privacy compliance is also reflected by the many government agencies that already trust and rely on Piwik (in Europe, Asia, North America, etc.) for self-hosted Web Analytics.

To ensure further security, after you have installed Piwik, we recommend you:

  1. make your Piwik server more secure (link to the security guide) by undertaking a few extra security checks, and
  2. follow the guide below to enable important Privacy features

How to enable Privacy settings in Piwik?

By design, Piwik ensures your Analytics data is only accessible by the Piwik administrator, meaning it is completely secure. This guide will explain how to easily make your favourite Web Analytics tool "Privacy compliant". First, you will need to log in as Super User and click on Settings > Privacy.

Step 1) Automatically anonymize Visitor IPs

By default, Piwik stores the visitor IP address (IPv4 or IPv6 format) in the database for each new visitor. If a visitor has a static IP address, their browsing history could easily be tracked across several days and even across websites tracked within the same Piwik server.

To ensure that you do not store the Visitor IP, which is Personally Identifiable Information (PII), please go to Settings > Privacy to enable IP anonymization, with at least 2 bytes masked from the IP.
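To make "2 bytes masked" concrete, here is a minimal sketch of the idea in Python. It is not Piwik's own implementation, and the function name `anonymize_ip` is our own; it simply zeroes the trailing bytes of an address so that the stored value no longer identifies a single visitor.

```python
import ipaddress


def anonymize_ip(ip: str, masked_bytes: int = 2) -> str:
    """Return `ip` with its last `masked_bytes` bytes set to zero.

    Illustrative only: `anonymize_ip` is a hypothetical helper,
    not a function provided by Piwik.
    """
    addr = ipaddress.ip_address(ip)
    total_bytes = addr.max_prefixlen // 8  # 4 for IPv4, 16 for IPv6
    if not 0 <= masked_bytes <= total_bytes:
        raise ValueError("masked_bytes is out of range for this address")
    # Keep the leading bytes and replace the trailing ones with zeros.
    kept = addr.packed[: total_bytes - masked_bytes]
    return str(ipaddress.ip_address(kept + b"\x00" * masked_bytes))


print(anonymize_ip("192.168.123.45"))     # 2 bytes masked
print(anonymize_ip("192.168.123.45", 1))  # 1 byte masked
```

With 2 bytes masked, every visitor in the same /16 range is stored under the same value, which is why masking at least 2 bytes is the recommended minimum.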

Step 2) Delete old Visitor logs

You can configure Piwik to automatically delete your older logs from the database (click on the link for FAQs). For Privacy reasons, we highly recommend that you keep the detailed Piwik logs for only 3 to 6 months and delete older log data.

Deleting old logs also has one other important advantage: it will free significant database space, which will, in turn, slightly increase performance!

If you run the automatic script as explained in the FAQs, it is safe to delete your old log data and still access all historical reports in Piwik.
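The retention rule itself is simple: anything older than a cutoff date goes. The sketch below illustrates it with Python and an in-memory SQLite database. Everything in it is an assumption made for illustration: the table `visit_log`, the column `visit_time` and the function `delete_logs_older_than` are hypothetical and do not reflect Piwik's real schema (Piwik stores its data in MySQL and has its own built-in log deletion feature, which is what you should use in practice).

```python
import sqlite3
from datetime import datetime, timedelta


def delete_logs_older_than(conn: sqlite3.Connection, now: datetime,
                           keep_days: int = 180) -> int:
    """Delete rows older than `keep_days` and return how many were removed.

    Assumes a hypothetical table visit_log(visit_time TEXT) holding
    timestamps formatted as 'YYYY-MM-DD HH:MM:SS'.
    """
    cutoff = now - timedelta(days=keep_days)
    cur = conn.execute(
        "DELETE FROM visit_log WHERE visit_time < ?",
        (cutoff.strftime("%Y-%m-%d %H:%M:%S"),),
    )
    conn.commit()
    return cur.rowcount


# Small demonstration with two visits, one old and one recent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visit_log (id INTEGER PRIMARY KEY, visit_time TEXT)")
conn.executemany(
    "INSERT INTO visit_log (visit_time) VALUES (?)",
    [("2011-01-01 00:00:00",), ("2011-11-01 00:00:00",)],
)
removed = delete_logs_older_than(conn, now=datetime(2011, 12, 1), keep_days=180)
print(removed, "old visit(s) deleted")
```

Only the detailed visit rows are removed here; the aggregated reports are what the automatic script mentioned above preserves, so make sure it has run before any old logs are deleted.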

Step 3) Include a Web Analytics opt-out feature on your site (using an iframe)

On your website, in your existing Privacy Policy page or in the 'Legal' page, you can add a way for your visitors to "Opt out" of being tracked by your Piwik server. By default, all of your website visitors are tracked, but if they opt out by clicking on the iframe link, a cookie 'piwik_ignore' will be set. All visitors with a piwik_ignore cookie will thereafter not be tracked.
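The opt-out logic boils down to one check on each request: is a piwik_ignore cookie present? Here is a minimal sketch of that check in Python. It is not Piwik's code; the function name `is_opted_out` is our own, and we assume only what is stated above, namely that the presence of the cookie (whatever its value) means the visitor must not be tracked.

```python
from http.cookies import SimpleCookie


def is_opted_out(cookie_header: str) -> bool:
    """Return True if the raw Cookie header contains a piwik_ignore cookie.

    `is_opted_out` is a hypothetical helper written for this example.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    return "piwik_ignore" in cookies


print(is_opted_out("piwik_ignore=1; theme=dark"))
print(is_opted_out("theme=dark"))
```

Because the opt-out is stored in a cookie, it applies per browser: a visitor who clears their cookies or switches browsers will need to opt out again.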

In Settings > Privacy, you will be able to copy the iframe code and paste it into your page.

Here below is the example iframe for this website. You can opt out from being tracked on demo.piwik.org:

Step 4) Optional plugins

  • Maybe you have heard of the initiative Do not track? If you wish to make your Piwik server respect the "Do not track" settings of your users, please install the DoNotTrack plugin.
  • As the Piwik administrator, you might decide that giving access to the Real time & Visitor Log features is not necessary for your Piwik users. In this case, you can disable the Live plugin in Settings > Plugins.
  • If you track many websites with the same Piwik server, all your websites' code will contain the Piwik server URL in the JavaScript code. To prevent other users from finding out all your websites, you can Hide the Piwik Server URL in your JavaScript using this technique (FAQ)
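For the curious, respecting "Do not track" means looking at the DNT request header that the visitor's browser sends: a value of 1 signals that the visitor does not want to be tracked. The Python sketch below shows the principle only. It is not the DoNotTrack plugin's code, and the function name `should_track` is our own.

```python
from typing import Mapping


def should_track(headers: Mapping[str, str]) -> bool:
    """Return False when the request carries the header 'DNT: 1'.

    `should_track` is a hypothetical helper; header names are
    compared case-insensitively, as HTTP requires.
    """
    for name, value in headers.items():
        if name.lower() == "dnt":
            return value.strip() != "1"
    # No DNT header: the visitor expressed no preference.
    return True


print(should_track({"User-Agent": "ExampleBrowser", "DNT": "1"}))
print(should_track({"User-Agent": "ExampleBrowser"}))
```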

Privacy applied to Web Analytics logs: a philosophical choice?

Privacy on the Internet is a major concern for many users, webmasters and companies today. We spend so much time online that having access to our Internet activity log (websites, pages visited, internet searches) can reveal a lot of personal information about ourselves, our life, work, etc.

When you use a Web Analytics tool such as Google Analytics (GA) or Yahoo Analytics, your web analytics data is tracked, stored and owned by the company providing you with the free analytics service. While they provide an excellent service for free, they also re-use the visitors log data tracked on your website to enrich existing profiles for a given user or IP address.

Here are just a few examples of how websites can gather your personal information while you are away from the website:

Google example

If you use GA on your website, Google would not only know all of the IP addresses (and other browser unique identifiers) of visitors to your website (and which pages they looked at on your site), but because most other websites use GA as well (or another Google product), Google would also know, for a given visitor to your site, e.g. the 6 other websites that person visited earlier today and the 367 websites they looked at in the last month. Because more than 50% of all websites on the Internet use GA, Google Adsense or another Google product using tracking beacons, Google is able to build a very accurate picture of most websites visited by any given user.

Facebook example

Facebook Social widgets are already used on more than 15% of all websites & blogs (source w3techs). When you visit a website with a Facebook Like button (or any other FB functionality), your browser will send data (and your IP address) to FB. Recently it was made public that FB creates shadow profiles of logged out users. This means that even if you are logged out or not a FB member, they still keep track of the websites and articles that your IP address (and other browser unique identifiers) was used to look at.

Why is this profiling for marketing purposes considered a "problem" for some Piwik users?

Privacy is becoming increasingly important to us as we spend more of our lives 'connected' on the internet. While Google is providing a multitude of amazing services for free – for which we are very thankful – we are at the same time concerned about where and how our private information is being used. Websites like Google (and increasingly Facebook) are able to build an enormous profile of all websites and pages looked at by most (nearly all) Internet users worldwide (even if they are not Google or FB users).

One of their main goals is to improve the re-marketing of Google Ads and Facebook Ads to Internet users and find the right advertising segments for the right ad. But many Internet users and website operators are growing concerned about what could be termed a Global Internet User Activity Database and its moral implications. You don't need to be a Privacy Junkie to be interested in the challenges and moral implications of gathering so much data on the Internet. We choose not to discuss the details here but recommend you check out the Privacy section of the EFF website to learn more.

Opting out of all data collection & remaining Anonymous online

To opt out of all Web Tracking technologies (including Google Analytics and Piwik), using an extension such as NoScript or Ghostery is the safest way: these browser extensions will disable all the known JavaScript trackers and ensure that your browser does not send a request to external tracking servers. If you wish to browse the Internet without your IP address being tracked at all, please consider using Tor Browser, which will automatically connect you to the secured and anonymous Tor network. If you wish to stay completely anonymous online, please see the PDF guide How to remain anonymous online?

Conclusion

Piwik is the leading self-hosted, privacy compliant, decentralized, modern & Free (GPL License) Web Analytics platform, a building block of a Free and Open Internet. By using Piwik and configuring a few options as explained in this guide, you will ensure that all of your valuable information is private and owned by one person (you!) and that your website also, just as importantly, respects your visitors' privacy.
