Electronic Profiling
Dave Halstead, Helen Ashman,
Computer Science and Information Technology,
University of Nottingham, Nottingham
NG8 1BB, U.K.,
hla@cs.nott.ac.uk
Abstract
In this paper, we review the various practices relating to the collection,
dissemination and use of personal data. Information about people is becoming
increasingly useful to marketing agents, and vast digital collections of personal
information are being collected. In some cases, the data is sold, but in others,
it is freely available. Now that the Web has become the ultimate data dissemination
tool, databases of personal information can be queried very quickly, easily,
and above all, anonymously. Never has the task of an investigator (or a casual
interrogator) been easier.
It is also clear that data collection is increasingly a core activity of private
companies, rather than government organisations. Orwell overshadowed a generation
of readers with the "Big Brother" of 1984, and the spectre of governmental databases
often provokes a Big Brother scare. However, the public is less inclined to view
commercial data collections as a threat to personal privacy or liberties. This
could in part be because such data collections are hidden, and their extent
and use are not publicised. So while identity card proposals, such as the Australia
Card of the mid-1980s, are almost universally condemned, more insidious electronic
profiling such as supermarket loyalty cards meets with indifference, despite
the fact that even more detailed data is available.
We set out to discover how much information could be available on a single
person through publicly accessible data sources. The aim is to highlight the
boundaries of what information can be collected and to consider how this fits
in with the current legal position in the UK. The search can also include information
on a specific company, rather than a person. This is interesting because a company
has the same legal rights as an individual under British law.
1. Introduction
There are many sources of personal information available, each providing small
amounts of information on individuals, but judicious cross-referencing between
the many sources increases how much information is accessible just by using public
data.
The aim of this research is to determine the necessity for a tightening of
UK data protection laws, or, optimally, to reassure us that these laws are indeed
protecting citizens' information. We additionally establish a manual data seeking
process, from which we will create a simple automated system. This system will
query and cross reference specific information servers, outputting information
on the given person or company. The purpose of this part of the work is to consider
means of rendering software of this nature ineffective. The end result
would enable a user to enter the person's/company's details, for instance a
name, a valid email address, or current home address. The possible output may
consist of, but may not be limited to, the following:
- Full name
- Address/es
- Telephone number
- Social security details
- Driving licence details
- Income details
- Education
- Family details (possibly details of individual family members)
- Mail order/online purchases
The work reported here is part of an ongoing programme of research to develop
and evaluate protective mechanisms at technical, corporate policy and legislative
levels. It seeks to identify the boundaries between legitimate use of this information
(such as for credit referencing purposes) and unnecessary and unwelcome probing.
There is also concern about the accuracy of the information being collected, because
in some cases the information is not gathered so much as inferred from other
information1. Apart from the initial background
research, this paper establishes that human-directed and automatic methods can
effectively collate dossiers on individuals and companies.
2. Data Collection
There are two main forms of electronic/digital media that have been identified
as relevant to this study; these can be broken down into how the sources can be
accessed - off-line and on-line.
2.1 Off-Line: CD-ROM Containing Commercial Software Packages
Recently in the press there has been talk of software available that allows free
searches of details similar to those given in telephone directories, such as the
"UK Info Disk" [1]. This allows the user to perform a search
on a person's surname or a company name. The information returned is equivalent
to the entire entry from the phone book. However, the information available is
slightly richer than the phone book, as it gives the initials of all the residents
of the house, rather than just one resident. This suggests that the telephone
directory has been combined with information from another source such as the electoral
register. If this is the case, it would only contain details of other residents
in the house who are on the register and hence are seventeen or over.
The software gives a list of matches, and for each match there is the person's/company's
initials, full address including house number and postcode, and of course the
telephone number. This software is extremely powerful, and arguably useful for
everyday life. Simply knowing some very basic facts about a person can enable
any user to locate their home address, their telephone number, and even who
they live with.
The UK government currently sells the "Register of Electors" to any company
or individual who requests it. The Data Protection Registrar identifies the
large scale usage for non-electoral purposes as follows [2]:
- Credit Referencing - the register provides information to credit reference
agencies, used in the initial construction of individuals' reference files.
The information is combined with County Court Judgement information and information
from Lenders who have extended credit to the individual (used by agencies
such as Equifax and Experian).
- Other Risk Management - the Register is commonly used for basic identity
checks, but the Registrar states that it is also combined with other information
and can be used for vetting insurance claims, checking current/prospective
employees, and even money laundering checks.
- Direct Marketing - the Registrar identifies two main marketing
usages of the register:
- Targeted offers - individuals receive marketing offers because their
details appear on the register. These offers can be more carefully targeted
with careful analysis of the register. A simple example of targeting could
be individuals living on their own in certain areas - using the register,
a marketer can identify the addresses of individuals who live alone in
the identified postal area. This information can be combined with telephone
directories to give the individual's phone number. More complex analysis
of the register, and how it changes from year to year, can enable marketers
to identify targets such as "new movers", or those who fall in a particular
age group.
- "Cleaning" - a marketer who has names and addresses from another source
can check them for accuracy by comparing the names to those on the register,
removing those that do not appear on the assumption that they have moved.
- Law enforcement - the Police use the register to assist in tracing individuals.
Local authorities use it to identify those who may be evading their legal
responsibilities, such as tax avoidance.
- Tracing activities - can be used for any tracing activity from tracing
individuals who have gone away leaving unpaid debts, to tracing those who
have gone away to escape violence.
- Research - register used as a source to draw samples for research.
The information provided in these search CDs is therefore most likely drawn from a
computerised electoral register and telephone directory combined.
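The register analyses described in the list above (people living alone in a postal area, "new movers", and list "cleaning") amount to simple set operations. The following is a minimal sketch of that idea, not a description of any real product: the `Entry` record, the function names and all sample data are assumptions made for illustration, and a real register has its own format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One hypothetical register row; the real register layout is not assumed here."""
    name: str
    address: str
    postcode: str


def living_alone(register, postcode_prefix):
    """Entries that are the only listed elector at their address in a postal area."""
    by_address = {}
    for entry in register:
        by_address.setdefault((entry.address, entry.postcode), []).append(entry)
    return [
        entries[0]
        for entries in by_address.values()
        if len(entries) == 1 and entries[0].postcode.startswith(postcode_prefix)
    ]


def new_movers(last_year, this_year):
    """Entries on this year's register that were absent last year.

    This is only an approximation: it also picks up people who have just
    become old enough to appear on the register.
    """
    previous = set(last_year)
    return [entry for entry in this_year if entry not in previous]


def clean(mailing_list, register):
    """Keep only the (name, address) pairs that still appear on the register."""
    known = {(entry.name, entry.address) for entry in register}
    return [pair for pair in mailing_list if pair in known]
```

Given two yearly snapshots as lists of `Entry` objects, `new_movers(last_year, this_year)` returns the candidate "new mover" entries, and `clean` drops any mailing-list pair that no longer matches a register entry.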
2.2 Online: The Internet
Public opinion, partially determined by the media, sees the Web as a rich source
of information on any conceivable subject. The use of the Web has changed dramatically
in the last year or two, with Internet usage now dominated by commercial Web
sites, rather than academics and the general public wanting their "15 minutes
of fame".
If the majority of the popular sites are commercial, this raises the question
of why there should be any personal information available on the Web. This section
identifies the motivation for constructing profiles of individual users. The
commercial Web companies are constantly looking for more and more ways to make
money.
From outside of the industry, looking at how people make money online, simple
intuition would suggest two routes:
- Selling goods or services online - basically an enhanced mail order company
- Charging to access information they provide, in a similar way to subscription
journals and magazines.
However, many successful Web-based companies do not profit from the goods that
they sell online, and those who run "pay to view" sites make far more money than
merely the sum of their annual subscriptions. This shows that there is more to
their businesses than initially meets the eye. Marketing is the key tool here,
and can be considered in terms of reaching large numbers of people in general
and also targeting specific groups. In the non-e-commerce world, the latter
usually involves expensive and extensive research to identify people who fit the
correct category for the product that is being promoted. The Web, however, has
provided multiple alternative methods of identifying groups of people to target
products at. These techniques, in various forms, provide a substantial component
of a site's revenue, and are the basis for the high stock valuations of new Web
companies - even those with negative price-earnings ratios. The rest of this
section identifies each broad technique in turn, analysing how personal data is
collected and the possible ways in which Web companies exploit it.
2.2.1 Banner advertisements
This is the simplest direct marketing technique. In a similar way to advertisements
in magazines, advertisements for products are shown as "banners" at the top (and
often bottom) of Web pages. By displaying these banners on pages that would be
interesting to somebody who fits the profile of the target customer, the marketer
is targeting the advertisement at the intended section of society. Unlike magazines,
however, many advanced sites use mechanisms to make sure that a visitor sees a
different banner advertisement every time the page is accessed, or at least a
reasonable selection of the advertisements, and this maximises the efficiency of
the advertising space. A common mechanism used to do this is to write a "cookie"
file to the user's computer. DoubleClick, a large Web advertising company, uses this technique.
When they first serve a user with an advert, a unique number is assigned to that
user. This is then recorded in the cookie file on the user's computer. On subsequent
visits the cookie is read and advertisements are served accordingly.
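The cookie mechanism just described can be sketched in a few lines. This is a toy model under stated assumptions, not DoubleClick's implementation: the `AdServer` class, the cookie name `uid` and the banner names are all hypothetical, and the browser's cookie store is modelled as a plain dictionary.

```python
import uuid


class AdServer:
    """Toy ad server that rotates banners per visitor using an ID cookie."""

    def __init__(self, banners):
        self.banners = list(banners)
        self.next_index = {}  # cookie ID -> index of the next banner to show

    def serve(self, cookies):
        """Return (banner, cookies_to_set) for one request.

        `cookies` is the dictionary of cookies the browser sent.
        """
        cookies_to_set = {}
        user_id = cookies.get("uid")
        if user_id is None:
            # First visit: assign a unique number and ask the browser to store it.
            user_id = uuid.uuid4().hex
            cookies_to_set["uid"] = user_id
        index = self.next_index.get(user_id, 0)
        self.next_index[user_id] = (index + 1) % len(self.banners)
        return self.banners[index], cookies_to_set


# Usage: the browser keeps whatever cookies the server asks it to set.
server = AdServer(["banner-a", "banner-b", "banner-c"])
browser_cookies = {}
for _ in range(4):
    banner, to_set = server.serve(browser_cookies)
    browser_cookies.update(to_set)
    print(banner)
```

The point of the sketch is that the only state held on the user's machine is the identifier; everything the server remembers about the visitor is keyed on it.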
2.2.2 Transaction trails
In normal, non-cyberspace life, people leave large numbers of transaction trails.
Every purchase made on a credit card is recorded with details of goods and time
purchased, every withdrawal from a teller machine is recorded with details of
the machine location, many of us have standing orders set up, and so on. Internet
transactions automate the monitoring of our activities and the construction of
highly detailed trails of each person's activities and locations using:
- Logs of emails sent and received.
- Logs of Web pages visited.
- Logs of file transactions using other Internet tools - FTP, Telnet, IRC,
MUD etc.
The data trails identified above can be collected by a user's Internet Service
Provider (ISP), and the value of all this data contributes to the continued success
and high market value of the "free" Internet providers. When a user signs up to
a service provider, they commonly have to provide full name and address details
and are often required to answer some market research questions. When all the
data is combined, potentially a very accurate user profile can be constructed.
Currently it is hard to establish how much of this data is being used, or indeed
sold on, by the ISPs, and for what purposes, but if it is not already being used
for profit, it seems likely that it will be in the not too distant future.
Large advertising companies such as DoubleClick serve advertisements on hundreds
of sites (including freeserve, Disney, Infoseek), and it is possible for them
to construct a profile of the pages a user views (forming similar data to that
of the ISP Web page visiting log), and the frequency with which these pages are
viewed, all by identifying a user from their cookie file. Using this information,
advertising companies can offer highly tailored marketing solutions to their
customers. The difference here between the ISP's log file and the marketer's
cookie-based log is that the marketers can only collect non-personally-identifiable
information (unless a user has registered with the advertising company to receive
tailored advertisements and has volunteered the personal details). DoubleClick
collects the following types of non-personally-identifiable information when
it serves an advertisement [3]:
- IP address - further information can be inferred from this:
- Users' geographic location.
- Size of organisation whose network the user is accessing through.
- Type of Organisation
- Company/Organisation name
- User's domain type (e.g. .com, .edu, .ac.uk, .gov etc.).
- How a user utilises pages visited within a customer's site
Clearly the more accurately a marketer can target advertisements, the more valuable
the advertisement is, and the more money they can charge their customers. To highlight
how seriously this profiling is being taken, consider that DoubleClick has created
a file of 10 million users in a single year and is reportedly adding a further
100,000 new profiles every day [4].
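To make the cross-site profile concrete, the sketch below shows how page views reported from many sites could be grouped under a single cookie identifier. It is illustrative only: the `ProfileStore` class, its method names and the example hostnames are assumptions, and nothing here reflects how any real advertising company stores its data.

```python
from collections import Counter, defaultdict


class ProfileStore:
    """Toy store of non-personally-identifiable browsing data, keyed by cookie ID."""

    def __init__(self):
        self.page_counts = defaultdict(Counter)  # cookie ID -> Counter of (site, page)
        self.ip_addresses = defaultdict(set)     # cookie ID -> IP addresses seen

    def record(self, cookie_id, site, page, ip_address):
        """Log one advertisement request made while `page` on `site` was viewed."""
        self.page_counts[cookie_id][(site, page)] += 1
        self.ip_addresses[cookie_id].add(ip_address)

    def top_sites(self, cookie_id, n=3):
        """Sites this cookie ID was seen on most often, as (site, views) pairs."""
        per_site = Counter()
        for (site, _page), views in self.page_counts[cookie_id].items():
            per_site[site] += views
        return per_site.most_common(n)


store = ProfileStore()
store.record("u1", "news.example", "/sport", "192.0.2.1")
store.record("u1", "news.example", "/sport", "192.0.2.1")
store.record("u1", "shop.example", "/toys", "192.0.2.1")
print(store.top_sites("u1"))
```

No name or address appears anywhere in the store, yet the record of which pages were viewed, and how often, is already a usable marketing profile.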
2.2.3 Email address collecting
Whenever a user subscribes to a newsgroup, posts a message on a bulletin board,
or joins in an Internet relay chat room, they volunteer their email address and
often a name. These message systems are frequently scanned by automatic software
that gathers the email addresses. These addresses are often used by "spammers"
who send unsolicited emails to vast groups of users. By collating the subject
type of the message system that the addresses have been found in, marketers have
yet another way of constructing profiles to associate with the email addresses,
with the hope of being able to target advertisements more efficiently.
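The collation step is simple enough to show on sample data. The sketch below works only on an in-memory list of made-up posts; the function name, the regular expression (a deliberately rough approximation of an email address) and the forum names are assumptions for illustration.

```python
import re
from collections import defaultdict

# Rough pattern for an email address; real address syntax is more permissive.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def collate_interests(posts):
    """Map each address found in the posts to the set of forum topics it appeared in.

    `posts` is an iterable of (forum_topic, message_text) pairs.
    """
    interests = defaultdict(set)
    for topic, text in posts:
        for address in EMAIL_PATTERN.findall(text):
            interests[address.lower()].add(topic)
    return dict(interests)


sample_posts = [
    ("rec.gardening", "From: Alice@example.org\nAny tips on roses?"),
    ("comp.lang.c", "From: bob@example.com\nPointer question"),
    ("rec.cycling", "From: alice@example.org\nNew bike!"),
]
print(collate_interests(sample_posts))
```

Even this crude matching attaches a list of interests to each address, which is all a marketer needs to start targeting.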
Although the majority of marketers claim to only collect non-personally-identifiable
information, some less scrupulous marketers have been exploiting a security
flaw in many email programs. The problem is caused by email programs
allowing HTML-formatted messages to write a cookie file on the user's computer.
These programs include MS Outlook, Netscape Messenger, Eudora, and online HTML
mail services such as Hotmail and Yahoo Mail. Companies send out "spam" emails,
which include code to write a cookie file with a unique identification number.
When the user subsequently visits a site hosting advertisements maintained by
the marketer, the cookie file is found along with the profile cookie file (mentioned
earlier), and the user's email address, which was associated with the new cookie
file, can be associated with the user's profile. This enables the marketer to
send targeted marketing offers by email to the user, rather than just banner
ads. This is a far more invasive means of advertising [5].
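The linking step can be modelled as a join between two tables, one keyed on the identifier written by the email and one keyed on the browsing-profile identifier. The sketch below is a toy model only: the `Marketer` class and the cookie names `mail_id` and `profile_id` are hypothetical, and no real mail client or advertising system is being described.

```python
class Marketer:
    """Toy model of linking an email address to an anonymous browsing profile."""

    def __init__(self):
        self.address_by_mail_id = {}     # ID written by the HTML email -> recipient
        self.address_by_profile_id = {}  # browsing-profile cookie ID -> linked address

    def send_tagged_email(self, address, mail_id):
        """Remember which address each uniquely numbered email was sent to."""
        self.address_by_mail_id[mail_id] = address

    def on_ad_request(self, cookies):
        """Handle an ad request; return the address linked to this profile, if any."""
        profile_id = cookies.get("profile_id")
        address = self.address_by_mail_id.get(cookies.get("mail_id"))
        if profile_id is not None and address is not None:
            # Both cookies arrived together, so the two records can be joined.
            self.address_by_profile_id[profile_id] = address
        return self.address_by_profile_id.get(profile_id)


marketer = Marketer()
marketer.send_tagged_email("carol@example.net", "m-42")
print(marketer.on_ad_request({"profile_id": "p-7"}))
print(marketer.on_ad_request({"profile_id": "p-7", "mail_id": "m-42"}))
```

Once the two cookies have been seen together a single time, the profile stays linked to the address on every later visit, even if the email cookie is no longer sent.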
2.2.4 Volunteering personal details
Many of the most useful "free" services on the Internet require users to forfeit
personal details in order to access the services or areas of their site. Examples
include free e-mail providers, such as Hotmail.com, Yahoo-mail etc, and information
providers, such as FT.com etc. Many also require users to complete a small survey,
asking questions of marital status, job, salary etc - making users feel that
it is a fair exchange for the ability to access the services.
2.2.5 Summary of Information collected
The following graph, based on figures from a report to the Federal Trade Commission2
which studied 100 commercial web sites and their privacy policies, shows clearly
the sort of data that is collected.

2.2.6 Children being targeted
With rapidly increasing numbers of children regularly surfing
the net, especially in the UK, where schools are getting special Internet connection
deals, worrying privacy issues are being raised. Recent surveys have discovered
that some large companies have created sites aimed at children with the specific
intention of gathering information for marketing purposes. The companies manipulate
children into divulging information, often personally identifiable, about themselves
and their parents. To get this information, companies entice children with games
and competitions that require the filling in of a questionnaire before they
can participate. Companies practising this include household
names such as Lego (http://WWW.lego.com) and Crayola (http://WWW.crayola.com).
With the rapid expansion of the web and companies' desire to build
a base of loyal customers for the future (even though initially the customers
are of course the parents!), these marketing techniques are becoming more widespread.
CME's recent survey3
of 75 randomly chosen kids' sites and 80 top commercial children's
sites highlights how extensive these practices are - the results found that
95% of the random sample collect personally-identifiable information from children,
with only 27% displaying a privacy policy! A similar survey by the FCC (US Federal
Communications Commission) found some alarming statistics. Of the 212 web sites
included, almost all (96%) solicited the children's email address, with almost
half (49%) requesting their postal address. The following chart shows a clear
summary of the information solicited by sites in the survey4.

With companies' marketing specifically targeting children, we must
begin to consider whether this is a legitimate use of their position (in providing
sites that attract children) or whether children need protecting. CME's survey
found that less than six percent of the sites they surveyed asked for parental
consent. However even if more asked for consent, it would be extremely difficult
to ascertain whether the consent is genuine - rather than the child forging
consent.
3. Maintaining Privacy
There is no simple way to maintain
privacy whilst online. There are services that help to avoid some of
the data collecting techniques used by marketing companies; these are explained
below. The only real way to maintain your privacy is to be very careful of
what you disclose and to whom. For a lot of people this means inventing a 'pseudo'
identity which they can use whenever personal information is required.
As privacy is rapidly becoming a hot topic, there is a range of
software products and services being developed that aim to help Internet users
hold onto their privacy. Some of the more interesting ones are as follows:
3.1 PermissionTracker
This is a system developed by FollowUp.net, a provider of online,
permission-based marketing for companies such as IBM, KPMG, Priceline.com, and
Intel. From their publicity material, PermissionTracker creates a database of
email addresses and possibly IP details, from users who specify how they will
allow their information to be used. When a web company that has agreed to participate
in the PermissionTracker scheme receives a visit, it checks the PermissionTracker
database to see what privacy rights the user has allowed. Currently it appears
that users who join the scheme can only specify that sites agree:
- Not to sell or share any personal information collected from me on the Internet
with other parties.
- Not to aggregate or combine any personal information collected from me on
the Internet with data from third parties.
There is also an option to opt out of DoubleClick's tracking.
On the PermissionTracker site (http://permissionTracker.com)
there is a list of companies who have agreed to participate and also facilities
to recommend companies who should be approached. At the time of writing it
appears that the only company who has agreed to use the scheme is FollowUp.net,
the creators! However, it is early days yet.
3.2 Zero-Knowledge Systems
Zero-Knowledge Systems offer a system called 'Freedom' (http://www.freedom.net)
that aims to give users total Internet privacy. On the face of it, Freedom
is a small client you launch from the desktop when you want to secure your Internet
activities, be it browsing, mail, newsgroups, telnet, or IRC.
The Freedom system uses three components to conceal your online
identity:
- Nyms - pseudonyms you create and use online; these conceal your personal
information by creating alternative details.
- Encryption - on all outgoing data and messages.
- The Freedom Network - a group of servers that route your Internet
traffic through a series of "privacy-enhancing detours" that further encrypt
your data and strip out all location information for your ISP, leaving only
the 'Nym'.
The Freedom system appears to be very secure: not even Zero-Knowledge
have the ability to open or expose your traffic. The system is double-blind:
although the company have your credit card details (if you purchase online),
they cannot associate them with any individual Nym identity. According to Zero-Knowledge,
as well as the Nym system, Freedom hides the source and destination IP addresses
of your communication and encrypts the data flow (Freedom uses multiple cryptographic
algorithms, such as Blowfish 128-bit encryption, DSA 1,024-bit keys, DH 2,048-bit
keys). Apparently the most an outsider could find coming from a Freedom-equipped
system is an encrypted stream of packets travelling to a Freedom server.
Zero-Knowledge have realised that the Freedom system could be
abused by spammers and have implemented 'spam control' going into and out of
the Freedom Network. The system will refuse to send excessive amounts of mail
from any single Nym.
The Freedom system has had some excellent reviews, which are summed
up with comments like the following, from Patrick Norton of PC Magazine:
"If online privacy is an issue for you, we can't think of a better option,
except for not going online at all."
3.3 SiegeSoft
SiegeSoft (http://www.siegesoft.com) offer three products to increase users'
privacy:
- SiegeSurfer - a web-based proxy that relays pages either through
clear text or through encrypted SSL.
- SiegePipe - a secure, private and anonymous Internet connection.
All of a user's Internet activity appears to come from SiegeSoft.
- SiegeWindowWasher - cleans up any user activity stored on Windows,
such as Internet history, cache, recent documents and log files.
4. Current legal position
Currently the acquisition of personal data in the UK falls under two main pieces
of legislation: the U.K. Data Protection Acts of 1984 (now superseded) and
1998, and the European Union (EU) Data Protection Directive.
The following sections give a summary of the features of this legislation that are
relevant to an electronic profiling system.
4.1 The Data Protection Acts 1984 and 1998 (introduced 1/3/2000)
The 1984 Data Protection Act requires that "Data Users" must register with the
Data Protection Registrar if they are planning on holding or controlling personal
data on a computer. Registering currently costs 75 pounds [7],
and usually lasts for 3 years. When a Data User registers, the following information
is recorded:
- Data User's Name
- Data User's Address
along with a broad description of:
- Those about whom personal data are held.
- The items of data held.
- The purposes for which the data are used.
- The sources from which the information may be obtained.
- The types of organisation to whom the information may be disclosed i.e.
shown or passed on
- Any overseas countries or territories to which the data may be transferred.
There is an online searchable database of all registered users at http://www.dpr.gov.uk/search.html.
The penalty for not registering is a fine of up to 5000 pounds, if prosecuted
in the Magistrates Court, or unlimited if convicted by the Crown Court [8].
The act has eight "Data Protection principles of good information handling"
that registered users must comply with. These are summarised as follows. Personal
data must be:
- Obtained and processed fairly and lawfully.
- Held for lawful purposes described in data users' register entry.
- Used for those purposes, and disclosed only to those people, described
in the register entry.
- Adequate, relevant and not excessive in relation to the purposes for which
they are held
- Accurate and, where necessary, kept up-to-date.
- Held no longer than is necessary for the registered purposes.
- Accessible to the individual concerned who, where appropriate, has the
right to have information about themselves corrected or erased.
- Surrounded by proper security.
The 1998 Data Protection Act, which was due to come fully into effect by 1999,
but which actually came into effect on 1st March 2000 [9],
is based around similar principles to the 1984 act, but takes into account the
requirements of the European Union Directive (discussed below) [10].
4.2 The EU Data Protection Directive
A study into European Data Protection Law summarises the key data principles of
the Directive as [11]:
a series of fair information practices that define obligations and
responsibilities with respect to the processing of personal information ... personal
information should only be collected legitimately for specific purposes (Directive
95/46/EC, at Art. 6(1)(a) and Art. 6(1)(b)).
The report considers the effect of the ease of collecting and processing data
via the Internet on these key "fair practices":
The Internet, however, challenges the establishment of obligations and responsibilities
with respect to the processing of personal information. Under current practices,
the basic principle of a "purpose limitation" for personal information has
become the exception rather than the rule in the on-line environment. Where
paper records themselves had provided a physical barrier of sorts to any further
use, information generated by individuals on the Internet is digital from
the start and available for any number of further kinds of sharing and combination.
Once on-line, the individual generates enormous amounts of personal data and
further use has not been limited to compatible purposes ... for example a large
amount of transactional information is collected by service providers who
make various kinds of further use of these data.
4.3 Regulation of Investigatory Powers bill
On 9th February 2000 a controversial bill, the 'Regulation of Investigatory
Powers bill' (RIP bill), was unveiled in the House of Commons. It is thought to give
the UK authorities more intrusive powers than those of any other western democracy5.
Although the Home Secretary Jack Straw claimed that "the new powers would be
used mainly to track down serious criminals", the legislation enables authorities
to collect huge amounts of data on ordinary citizens. Civil rights experts expect
the RIP bill to be challenged in the European Court of Human Rights if it is
adopted6. Mr Straw insists, however, that "in my view the provisions of the Regulation
of Investigatory Powers Bill are compatible with the Convention Rights"7. The
aspect of the bill that is most alarming from an Internet privacy perspective
is that it requires ISPs to become party to secret surveillance of their
customers, siphoning off Internet traffic into government computers. They can
also be required to provide the authorities with detailed traffic analysis;
this could include every email address and Internet site that an individual
had used, and possibly any correspondence between them. The bill also gives
the authorities access to any encryption keys, with the presumption of guilt
if anyone fails to produce their encryption key.
5. Public opinion
Internet users are showing fear and distrust regarding the loss of personal
privacy associated with the e-commerce industry. A recent study by Equifax and
Harris Associates found that over two-thirds of Internet consumers considered
the privacy concern to be "very" important [12]. The following
findings from similar surveys highlight the magnitude of the public's concerns:
- In a recent survey of Web users 39% of respondents "agree strongly", and 33%
"agree somewhat" that there should be new laws to protect privacy on the Internet.
Interestingly respondents from Europe generally felt less strongly about this,
with 18% of respondents being "neutral" on the issue and 39% "somewhat" agreeing
[13].
- The same survey found that the statement most disagreed with
was that "Content providers have the right to resell user information". 63% of
respondents "Disagree strongly" while another 19% "disagree somewhat". In total
82% of respondents are in disagreement with the reselling of their personal information
[13].
- In a public opinion poll on medical privacy, 75%
of respondents were concerned a "great deal" or "a fair amount" about insurance
companies putting medical information about them into a computer information bank
that others have access to [14].
6. The automated system
In order to establish what information can be uncovered on individuals and
individual companies by an automated system, we researched Websites that reliably
provide information on individuals. The majority of the sites only provide small
quantities of personal data but, through data-matching, these small quantities of
data can contribute to a richer data profile. In order to do the data-matching,
the set of contributing sites needs to be as accurate and complete as possible.
A non-exhaustive list of the sites potentially useful to the automated profiling
system is included in the Appendix. We aim to document the theoretical maximum
of data that can be achieved, and by what combination of these sources. The
process may include an element of recursion because, as more information is
uncovered, earlier sources in the chain may provide more output data given richer
starting data.
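The recursive element can be expressed as a loop that keeps re-querying every source until a full pass adds nothing new. The following is a minimal sketch under stated assumptions rather than the system built for this work: the function `build_profile`, the three stand-in "sources" and all the data they return are invented for illustration, and each real source would be a query to a Web site instead of a local function.

```python
def build_profile(seed, sources, max_rounds=10):
    """Query every source repeatedly until a whole round learns no new field.

    `seed` is the starting data (e.g. just an email address). Each source is a
    callable that takes the current profile dict and returns a dict of fields,
    which may be empty if it cannot match on what is known so far.
    """
    profile = dict(seed)
    for _ in range(max_rounds):
        learned_something = False
        for source in sources:
            for field, value in source(profile).items():
                if field not in profile:
                    profile[field] = value
                    learned_something = True
        if not learned_something:
            break
    return profile


# Hypothetical stand-ins for online directories; all data is made up.
def email_directory(profile):
    if profile.get("email") == "j@example.org":
        return {"name": "J Example", "town": "Nottingham"}
    return {}


def phone_directory(profile):
    if profile.get("name") == "J Example" and profile.get("town") == "Nottingham":
        return {"address": "1 Sample Road", "phone": "00000 000000"}
    return {}


def map_service(profile):
    if "address" in profile:
        return {"map": "map of " + profile["address"]}
    return {}


sources = [map_service, phone_directory, email_directory]
print(build_profile({"email": "j@example.org"}, sources))
```

In this run the map and telephone sources return nothing on the first pass, because only an email address is known; they become productive on later passes once the name, town and address have been filled in by other sources.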
6.1 Design Concept
One of the key aims of this project was to ascertain the feasibility of a software
system that, using only information available via the Web, would uncover personally
identifiable information on a search target, thus highlighting how the Internet
is affecting our privacy. Therefore, although the easiest means of getting basic
details would be using the electoral register (in the UK, or similar databases
elsewhere), this would require locally stored data, and hence would not fulfil this aim.
The basic process that the program follows is similar to that of a human researcher
or 'private detective' using the resources on the web to locate a target. The
resulting software will highlight how easy the process is by giving the search
a clear structure. The following diagram illustrates the design concept:

Each layer in the search can be expanded to include different
resources, and add to the variety of data found, as well as tailoring it for
different locations. The initial steps in the process require user input to
select the most likely target from the list produced. The system could perform
a search on all the possibilities, but this would take a considerable amount
of time, due to the slow Internet connections and frequently slow servers (due
to high traffic) that the system uses.
6.2 Ethical issues of Software
The key aim in constructing this software was to establish the feasibility
of an electronic profiling engine that would invade individuals' privacy. As
the implementation is primarily a feasibility study, it was felt to be
very important that the information returned by the system wasn't 'too' invasive.
However, it should be alarming enough to raise public interest. After much consideration,
information servers were chosen which, hopefully, adhere to these considerations.
The system, given basic information (name details and/or email address), will
return (where possible) the full postal address of the target including postcode (zip-code),
telephone number, email address, and a series of maps showing the location of
the target's house. Although this information is very easy to
find, without any form of authorisation or justification required,
it is all available from a county library and therefore can be seen as not
'too' invasive.
The key question of ethics for these types of program is really what, or how
much, information should be private and therefore unavailable from
any source. In most cases when we volunteer personal information we are not
aware exactly what it will be used for - with most users (understandably) assuming
that the current requirement for the data will be its only use. What is really
needed is a globally defined structure for privacy statements. The current problem
is that the majority of sites do not even have privacy statements, and users
don't necessarily consider how their privacy is being affected. Where
sites do have privacy statements, they are often quite long and written in a
convoluted, non-user-friendly manner. This means that even users who are concerned
about their privacy are put off reading the document.
If a rigid format were
given to privacy statements, it would make the statements easier to understand
and perhaps the public would then expect all sites to have one. Ultimately, features
could be added to web browsers, in a similar way to security certificates, that
would warn the user when they were accessing a site that doesn't possess a privacy
statement or, where a statement exists, if it contradicts the user's privacy
settings. This would allow individual users to specify what purposes they are
happy for any data to be collected for, and would hand control of personal data
back to the individual.
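A browser-side check of this kind could be as simple as comparing the purposes a site declares with the purposes the user has allowed. The sketch below assumes a privacy statement has already been reduced to a set of purpose labels; the `check_site` function and the purpose names are hypothetical, as no such fixed format is defined in this paper.

```python
from typing import List, Optional, Set


def check_site(declared: Optional[Set[str]], allowed: Set[str]) -> List[str]:
    """Return the warnings a browser might show for one site.

    declared: purposes listed in the site's privacy statement,
              or None if the site has no statement at all.
    allowed:  purposes the user is happy for data to be collected for.
    """
    if declared is None:
        return ["site has no privacy statement"]
    # Any declared purpose outside the user's settings is a contradiction.
    return [
        "site collects data for a purpose you have not allowed: " + purpose
        for purpose in sorted(declared - allowed)
    ]


# Hypothetical purpose labels, for illustration only.
user_settings = {"order fulfilment"}

print(check_site(None, user_settings))
print(check_site({"order fulfilment", "marketing"}, user_settings))
print(check_site({"order fulfilment"}, user_settings))
```

A site whose declared purposes all fall within the user's settings produces no warnings, so the user is interrupted only when a statement is missing or conflicts with their preferences.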
In conclusion
The British Government is trying to present a more open structure, giving members
of the public access to documents and papers that had previously not been available.
The main way to access this information is via the Web. This trend, if applied
to other areas of government information holding, could lead to a large number
of databases being made available online. For example, the register
of electors could become an online database along with other information, such
as Driver and Vehicle Licensing Agency information, Social Security details, Police
files etc.
The free availability of large collections of data does not in itself constitute
an infringement of personal privacy. As discussed in [15],
a major problem is the "transliteration" of existing databases and information
searching and retrieval practices to the electronic media, particularly the
Web. The nature and purpose of queries are no longer screened, and the identity
of the agent requesting the information is not required. For legitimate uses,
this is highly appropriate, but it also provides no protection, or audit trail,
against misuse of the information.
Also, it is interesting to note that the greatest threat to personal privacy
does not necessarily come from government agencies, but from the more hidden
activities of commercial data collectors. These commercial databases have far
more detail than any governmental Identity card proposals, as they include the
most minor details of an individual's life, such as what items they purchase
and what brand, time and place of purchase, Web browsing habits and trails,
and many other seemingly trivial pieces of information which become important
when taken in context with other such pieces of information. Orwell's Big Brother
is a reality, but is not necessarily a government organisation.
Hypertext References
[1] UK InfoDisk, Published by I-CD Publishing (UK) Ltd.
[2] "The Sale of the Register of Electors", Submission by the
Data Protection Registrar to the Home Office Working Party on Electoral Procedures,
http://www.dataprotection.gov.uk/reg-elec.htm,
August 1998
[3] http://www.doubleclick.net/company_info/about_doubleclick/privacy/non_identify.htm
[4] Dan Schiller: "Tradesmen launch Assault on the Internet",
Le Monde Diplomatique, March 1997
[5] "The Cookie Leak Security Hole", R.M. Smith, 30/11/99,
http://www.tiac.net/users/smiths/privacy/cookleak.htm
[6] Figures from "Privacy and The Top 100 Web Sites: Report
to the Federal Trade Commission", M.J.Culnan, Ph.D, Georgetown University,
June 1999
[7] "Registration & Data Protection",
http://www.dataprotection.gov.uk/register.htm
[8] "Data Protection - a brief Introduction", Dec 1998,
www.dataprotection.gov.uk/intro.htm
[9] http://www.dataprotection.gov.uk/eurotalk.htm
[10] "Defaults Guidance" - DP Act 1984, E. France - Data Protection
Registrar, www.dataprotection.gov.uk/defaults.htm
[11] "Data Protection Law and On-line Services: Regulatory Responses",
J.R.Reidenberg Prof. Law - Fordham University, P.M.Schwartz Prof. Law - Brooklyn
Law School, http://www.europa.eu.int/comm/dg15/en/media/dataprot/studies/regul.pdf.
[12] "Consumer Privacy Concerns about Internet Marketing", Communications
of the ACM, Vol.41 No.3, March 1998
[13] Graphic, Visualization, & Usability Centre's (GVU) 8th
WWW User Survey, www.gvu.gatech.edu/user_surveys/survey-1997-10/
[14] "Health Information Privacy Survey", Harris Equifax, 1993,
http://www.epic.org/privacy/medical/polls.html
[15] H. Ashman, "Transliteration of Databases and Information
Retrieval Practices to Public Websites", submitted for publication, 2000.
Further reading
Hard copy
- J.H.Ellsworth and M.V.Ellsworth, Marketing on the Internet, Wiley,
1997
- D.Banisar, S.Davies, Privacy and Human Rights 1999: An International
Survey of Privacy Laws & Developments, Electronic Privacy Information
Center, 13/8/1999.
On-line Resources
Authors and dates are provided where known.
- A Guide to Developing Data Protection Codes of Practice on Data Matching,
http://www.dataprotection.gov.uk/match.htm
- Data Protection Act 1984, http://www.hmso.gov.uk/acts/acts1984/1984035.htm,
1984
- Data Protection Act 1998, http://www.hmso.gov.uk/acts/acts1998/19980029.htm,
1998
- Data Protection and the Internet - Guidance on Registration, http://www.dataprotection.gov.uk/internet.htm
- Defaults Guidance - DP Act 1984, http://www.dataprotection.gov.uk/defaults.htm,
6/1999
- DoubleClick Privacy Statement,
http://www.doubleclick.net/company_info/about_doubleclick/privacy
- Privacy Threatened By New Databases, http://www.dataprotection.gov.uk/dbasepr.htm,
15/7/97
- The 15th Annual Report of the Data Protection Registrar,
http://www.dataprotection.gov.uk/99arcontents.htm, 6/1999
- The Eighth Data Protection Principle and Transborder Dataflows,
http://www.dataprotection.gov.uk/transbord.htm
- The Sale of the Register of Electors,
http://www.dataprotection.gov.uk/reg-elec.htm, 8/1998
- Using the Law to Protect your Information,
http://www.dataprotection.gov.uk/intro.htm, 12/1998
- Consumer Privacy Concerns about Internet Marketing, H.Wang, M.K.O.Lee,
C.Wang,
http://www.acm.org/pubs/articles/journals/cacm/1998-41-3/p63-wang/,
3/1998
- You've Got mailing Lists, J.Duffy,
http://www.news.bbc.co.uk/hi/english/uk/newsid_557000/557505.stm,
10/12/1999
- Data Protection Law and On-Line Services: Regulatory Responses, J.R.Reidenberg,
P.M.Schwartz,
http://www.europa.eu.int/comm/dg15/en/media/dataprot/studies/regul.pdf
- Internet Privacy Concerns confirm the case for Intervention, R. Clarke,
http://www.acm.org/pubs/articles/journals/cacm/1999-42-2/p60-clarke/,
2/1999
- Privacy and the Top 100 Web Sites: Report to the Federal Trade Commission,
M.J.Culnan,
http://www.msb.edu/faculty/culnanm/GIPPS/oparpt.pdf, 6/1999
- Recommendation 1/99 on Invisible and Automatic Processing of Personal
data on the Internet Performed by Software and Hardware, P.Hustinx, http://www.europa.eu.int/comm/dg15/en/media/dataprot/wpdocs/wp17en.htm,
23/2/1999
- Information Privacy on the Internet Cyberspace Invades Personal Space,
R.Clarke,
http://www.anu.edu.au/people/Roger.Clarke/DV/Iprivacy.html, 2/5/1998
- Information Technology and Dataveillance, R.Clarke,
http://www.anu.edu.au/people/Roger.Clarke/DV/CACM88.html, 11/1987
- The Cookie Leak Security Hole in HTML Email Messages, R.M.Smith,
http://www.tiac.net/users/smiths/privacy/cookleak.htm, 30/11/1999
- On-line Services and data protection and the Protection of Privacy,
S.Gauthronet, F.Nathan,
http://www.europa.eu.int/comm/dg15/en/media/dataprot/studies/servint.htm,
12/1998
- How anonymous is the Web?, W.Rodger, http://www.usatoday.com/life/cyber/tech/ctg802.htm,
3/12/1999
Appendix
Footnotes
- A particularly worrying example is that of
the Morgan Stanley Dean Witter bank who "collect", among other things,
details about an individual's race, religious beliefs, sexual preferences,
union membership, etc. As this information is never required as part of
the credit application procedures, it is most likely inferred by analysing
the individual's subsequent spending pattern. This is similar to the way
supermarket chains infer such things as marital status, number and age
of dependents etc. using their "loyalty cards" to analyse purchasing
patterns. Morgan Stanley Dean Witter also claim the right to disseminate
this information.
- Figures from "Privacy and the Top 100 Web sites: Report to the Federal Trade Commission", M.J. Culnan, Ph.D. Georgetown University, June 1999.
- "CME Assessment of Data Collection Practices
of Children's Web sites", Center for Media Education, July 1999.
- "Online Marketing to Children", US Federal
Communications Commission (FCC), 1998.
- "Leader: Spies in the Web", The Financial Times, 7/3/2000
- "Ripping into UK Privacy Bill", Karlin Lillington, 23/3/2000, Wired News.
- "On the Cover of the 'Regulation of Investigatory Powers Bill'", 9/2/2000,
http://www.publications.parliament.uk/pa/cm199900/cmbills/064/2000064.htm
Copyright
Dave Halstead and Helen Ashman, © 2000. The author assigns to Southern Cross University
and other educational and non-profit institutions a non-exclusive licence to use
this document for personal use and in courses of instruction provided that the
article is used in full and this copyright statement is reproduced. The author
also grants a non-exclusive licence to Southern Cross University to publish this
document in full on the World Wide Web and on CD-ROM and in printed form with
the conference papers and for the document to be published on mirrors on the World
Wide Web.
AusWeb2K, the Sixth Australian World Wide Web Conference, Rihga Colonial Club Resort, Cairns, 12-17 June 2000 Contact: Norsearch Conference Services +61 2 66 20 3932 (from outside Australia) (02) 6620 3932 (from inside Australia) Fax (02) 6622 1954