W3C: Leading the Web to its Full Potential

Bob Hopgood, Head of Offices, World Wide Web Consortium, W3C UK Office, Rutherford Appleton Laboratory, Chilton, Didcot, Oxon, OX11 0QX. frah@w3.org


Abstract

This paper gives some background information on the World Wide Web Consortium: its origins, objectives, structure and current situation. It then gives an overview of what has been achieved to date and discusses current and future activities, particularly with regard to the family of XML standards and XML applications such as XHTML, SMIL, SVG, and RDF.


Introduction

The World Wide Web Consortium [HREF1] (W3C) was founded in 1994 to lead the web to its full potential by giving organisations a forum in which to develop technical specifications [HREF2] as the foundation for the Web. W3C's process allows everybody to contribute to and benefit from the W3C activities. Only by serving the entire Web community can W3C achieve its objective of leading the Web to its full potential.

W3C now has about 420 Members [HREF3] and 60 full-time staff [HREF4] (the W3C Team). It has developed 22 Recommendations [HREF5] that define the Web, and future work aimed at extending this set is grouped into about 20 separate Activities [HREF6].

The rapid changes taking place in technology, user demands and societal needs require W3C to work at a fast pace. This pace is achieved only through the participation of its Members and others in developing and reviewing specifications, trial implementations, translations and promotion.

The W3C Process

The W3C Process [HREF7] aims to achieve consensus initially within a Working Group but later throughout the Membership and finally around the world. New activities arise from:

If a new area looks promising, an Activity Proposal is sent to the Membership for review and, if there is consensus to proceed, the Activity will start. Work may be divided among several Working Groups, Interest Groups, and Coordination Groups. For example, the XML Activity [HREF10] is carried out by four Working Groups (Core [HREF11], Schema [HREF12], Linking [HREF13], and Query [HREF14]), three Interest Groups (Plenary, Schema, Linking) and one Coordination Group. Unlike other standards bodies, the W3C Team works full-time to coordinate W3C Activities and develop Recommendations.

Within W3C, related Activities [HREF15] are grouped into Domains [HREF16]. Currently W3C has four Domains:

From Specification to Recommendation

W3C specifications undergo a formal process of review, revision, and refinement to build consensus. Documents advance through four stages:

To participate in a Working Group requires a serious commitment with weekly teleconferences and several face-to-face meetings a year. Some Working Groups are restricted to employees of Members, the W3C Team, and invited experts while others adopt a more public forum.

How W3C Works

W3C is primarily funded through the dues paid by its Members although some funding comes from public funds. In return, W3C Members can send staff to participate in Working Groups. Members have a seat on the Advisory Committee, access to Member-confidential information, the right to use the W3C Member logo, and access to W3C news services. The Advisory Committee meets twice a year face-to-face to review the W3C Activities and respond to proposals including a review of proposed Activities and proposed Recommendations. The May Advisory Committee Meetings for 1999 in Toronto and 2000 in Amsterdam have been co-located with the International World Wide Web Conference.

The W3C Team coordinates the work carried out by the Activities and handles the infrastructure required by the Consortium. The Team consists of the W3C Director (Tim Berners-Lee), the Chairman (Jean-Francois Abramatic) and the full-time staff. The Chairman manages the general operation of the Consortium while the Director is the lead architect for the technologies developed at the Consortium and ensures consensus is reached before a specification becomes a Recommendation.

W3C is hosted by the Massachusetts Institute of Technology [HREF17], Laboratory for Computer Science [HREF18] (MIT/LCS) in the United States; the Institut National de Recherche en Informatique et en Automatique [HREF19] (INRIA) in Europe; and the Keio University [HREF20] Shonan Fujisawa Campus in Japan. Most of the W3C Team works at one of these host locations. W3C is not a legal entity, so Members enter into a contractual relationship with the three hosts when they join W3C [HREF21].

The W3C Offices

The W3C Offices [HREF22] are local points of contact in specific countries that help ensure that W3C and its specifications are known in those countries. The Offices work with their regional Web community to promote participation in W3C. The current set are:

Offices are also planned for Australia and Tunisia. Some of the activities performed by the local Offices are:

  1. First Line Support of the local community. That means being aware of what is happening in W3C and making sure that the local community knows the Office exists and is there to help.
  2. Member Relations: providing a link to local W3C Members, possibly acting as a conduit from them to W3C. Some Offices run local meetings of W3C Members and organise Regional/National Web Conferences. For example, the Hong Kong Office will be involved with WWW10 and the Netherlands Office at CWI helped develop the Programme for WWW9. WWW6 was jointly organised by RAL in the UK and the INRIA Host.
  3. Publicity for W3C: the Offices forward W3C Press Releases to local newspapers and journals. In some cases, this includes translating either the summary or the complete Press Release.
  4. Raising Awareness: via pro-active events (seminars, stands, newsletters, etc.).
  5. W3C Mirror Site: each Office runs a local mirror of the W3C site that is regularly updated and ensures good access to the W3C site world-wide. This is often augmented by a W3C Office web site providing local information including translations.
  6. Recruiting W3C Members: W3C needs those companies with a significant investment in the Web to be Members and contribute to the development of the web.
  7. Promote W3C Recommendations: not all countries are aware of the Recommendations that define the protocols that ensure the continued interoperability of the Web.
  8. Work with Local Government: the Offices work with the national government to raise public awareness.

Current Activities

Currently, the four Domains of W3C have the following activities:

It is impossible in a short paper to do full justice to all these activities. What follows should therefore be seen as a taster to encourage the reader to look more closely at the Recommendations being developed by the W3C Members.

Architecture Domain: the Move to XML

The Extensible Markup Language (XML 1.0) became a W3C Recommendation in February 1998. Some of the main points [HREF23] about XML are:

In consequence, associated with XML 1.0, there is a set of related specifications under development:

While the family of XML specifications is large, it should be realised that only XML 1.0, XML Namespaces, DOM Level 1, XPath and XSLT have reached Recommendation so far. At the moment, the user can be confident of defining XML applications, using several together via the namespace mechanism, interacting with them via a standard API, and powerfully transforming XML documents using XSLT. For the remaining specifications, there is still time for them to change. However, the web community has accepted XML with enthusiasm. XML gives the user access to a large and growing community of tools independent of a single vendor.

XML Schemas

Currently, the way to validate an XML document is via the Document Type Definition that is part of XML 1.0. Given a simple application, like marking up an exam paper:

<exam paper="CS203">
<qapair>
<question>Who is the last King of England</question>
<answer>George VI</answer>
</qapair>
<qapair>
<question>How many queens were named Elizabeth</question>
<answer>Two</answer>
</qapair>
</exam>

This would have a DTD something like:

<?xml version="1.0"?>
<!DOCTYPE exam [
<!ELEMENT exam (qapair)* >
<!ATTLIST exam paper CDATA #REQUIRED>
<!ELEMENT qapair (question,answer) >
<!ELEMENT question (#PCDATA) >
<!ELEMENT answer (#PCDATA) >
]>

Note that the DTD has a different syntax from XML as it was based on the notation used by SGML. The DTD provides facilities for defining valid documents and also how to include text and other objects from other files in the XML document via the entity reference mechanism.
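
The rules expressed by the DTD above can be made concrete with a short program. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module; that parser does not validate against a DTD, so the function check_exam (a hypothetical name) hand-codes the three content rules of the exam DTD purely for illustration.

```python
import xml.etree.ElementTree as ET

# The exam document from the text, unchanged.
EXAM_XML = """\
<exam paper="CS203">
<qapair>
<question>Who is the last King of England</question>
<answer>George VI</answer>
</qapair>
<qapair>
<question>How many queens were named Elizabeth</question>
<answer>Two</answer>
</qapair>
</exam>
"""

def check_exam(root):
    """Return a list of violations of the exam DTD's rules (empty if none)."""
    errors = []
    if root.tag != "exam":
        errors.append("root element must be <exam>")
    # <!ATTLIST exam paper CDATA #REQUIRED>
    if "paper" not in root.attrib:
        errors.append("<exam> requires a 'paper' attribute")
    # <!ELEMENT exam (qapair)* >
    for child in root:
        if child.tag != "qapair":
            errors.append(f"unexpected <{child.tag}> inside <exam>")
            continue
        # <!ELEMENT qapair (question,answer) >
        if [c.tag for c in child] != ["question", "answer"]:
            errors.append("<qapair> must contain <question> then <answer>")
    return errors

print(check_exam(ET.fromstring(EXAM_XML)))  # []
```

A validating XML parser would derive these checks from the DTD itself rather than from hand-written code.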

One of the major changes coming through will be the replacement of the validation part of the DTD by an XML Schema. An XML Schema has the advantage of using the XML syntax and will provide stronger datatyping than is currently available. The above DTD would be replaced by something like:

<schema>
  <element name="exam">
    <type>
      <attribute name="paper" type="course">
        <datatype name="course">
          <pattern><lexical>[A-Z]{2}\d{3}
          </lexical></pattern>
        </datatype>
      </attribute>
      <element name="qapair" minOccurs="0" maxOccurs="*">
        <type>
          <element name="question" type="string"/>
          <element name="answer" type="string"/>
        </type>
      </element>
    </type>
  </element>
</schema>

The place where the stronger datatyping can be seen is in the definition of the paper attribute whose value is of type course and where it specifically states that this consists of two alphabetic characters followed by three digits. W3C will provide conversion facilities from DTDs to XML Schemas.
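
The constraint on the paper attribute (two letters followed by three digits) is an ordinary regular expression, so its effect can be tried out directly. The sketch below uses Python's re module; the function name is_course_code is an assumption made for illustration, and the letters are taken to be upper-case ASCII.

```python
import re

# Two upper-case letters followed by three digits, as described in the text.
COURSE_PATTERN = re.compile(r"[A-Z]{2}\d{3}", re.ASCII)

def is_course_code(value: str) -> bool:
    """True if the whole string matches the course pattern."""
    return COURSE_PATTERN.fullmatch(value) is not None

print(is_course_code("CS203"))   # True
print(is_course_code("cs203"))   # False
print(is_course_code("CS2030"))  # False
```

A DTD can only say that the attribute holds character data; a check of this kind is what the stronger datatyping adds.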

Hyperlinks in XML

The aim of XLink, together with XPointer, is to provide advanced hyperlinking and addressing functionality for XML including:

For example, a link in an XML document is no longer limited in the type of element it can be associated with:

<ABC xlink:type="simple" xlink:href="http://www.w3.org/">The W3C</ABC>

This defines a simple link associated with the <ABC> element, with functionality similar to that of the <a href="..."> element in HTML. Extended links in XLink will allow links to connect a number of resources, and the links can be defined away from the documents they refer to.
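
Because the link is carried by attributes rather than by a particular element name, software can find simple links on any element. The following Python sketch illustrates this under stated assumptions: the wrapper element <doc>, the <note> element and the namespace declaration are additions needed to make a complete document, and the namespace URI shown is the one used by the XLink specification as eventually published.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical document wrapping the <ABC> link from the text.
DOC = f"""\
<doc xmlns:xlink="{XLINK_NS}">
  <ABC xlink:type="simple" xlink:href="http://www.w3.org/">The W3C</ABC>
  <note>No link here</note>
</doc>
"""

def simple_links(root):
    """Yield (element name, target URI, link text) for every simple link."""
    for elem in root.iter():
        if elem.get(f"{{{XLINK_NS}}}type") == "simple":
            yield elem.tag, elem.get(f"{{{XLINK_NS}}}href"), elem.text

links = list(simple_links(ET.fromstring(DOC)))
print(links)  # [('ABC', 'http://www.w3.org/', 'The W3C')]
```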

User Interface Domain: XHTML and other XML Applications

XHTML

The W3C User Interface Domain has a broad programme with a major activity aimed at moving the existing Web from HTML to XML via XHTML[HREF38].

HTML currently serves as the lingua franca for most people publishing on the Web. While that is the case today, the future of the Web is with XML. In designing XHTML 1.0, the challenge was to design the next generation language for Web documents without making obsolete what is already on the Web. The answer is to rewrite HTML 4.0 as an XML application. In simple terms that means making the HTML a well-formed XML document where start and end tags are always there and match precisely. Empty elements have to use the correct syntax. For example, the following valid HTML 4 fragment:

<P>This is a list:
<Ul>
<lI>First One
<li>Second One</LI>
<LI>Third One
</ul>
<Hr>
<p>And so on</P>

would need to be changed to:

<p>This is a list:</p>
<ul>
<li>First One</li>
<li>Second One</li>
<li>Third One</li>
</ul>
<hr/>
<p>And so on</p>

W3C's tidy[HREF39] tool will make the necessary changes for you. The benefits of changing to XHTML are:

By migrating to XHTML early, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility. XHTML also has an ambitious roadmap of future activities.
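
The practical meaning of well-formedness is that any XML parser can read the document. As a rough illustration, the Python sketch below feeds the two list fragments shown earlier to the standard library's XML parser; the <body> wrapper and the function name is_well_formed are assumptions added so that each fragment has a single root element.

```python
import xml.etree.ElementTree as ET

HTML_FRAGMENT = """<P>This is a list:
<Ul>
<lI>First One
<li>Second One</LI>
<LI>Third One
</ul>
<Hr>
<p>And so on</P>"""

XHTML_FRAGMENT = """<p>This is a list:</p>
<ul>
<li>First One</li>
<li>Second One</li>
<li>Third One</li>
</ul>
<hr/>
<p>And so on</p>"""

def is_well_formed(fragment: str) -> bool:
    """True if the fragment parses as XML once wrapped in a root element."""
    try:
        ET.fromstring(f"<body>{fragment}</body>")
        return True
    except ET.ParseError:
        return False

print(is_well_formed(HTML_FRAGMENT))   # False
print(is_well_formed(XHTML_FRAGMENT))  # True
```

This checks well-formedness only; validity against the XHTML DTD is a separate, stricter test.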

XHTML 1.1 will take the existing XHTML 1.0 and reformulate it as a set of Modules. Once that has been achieved, the work will proceed in two directions. First will be a set of modules defined as XHTML Basic that will form the base set of modules that devices with limited functionality will be expected to handle. XHTML Basic will include the widely used HTML mark-up tags plus images, forms, and basic tables. It is designed for Web clients that do not support the full set of XHTML features, such as mobile phones, PDAs, pagers, and set-top boxes. Second will be some new improved modules with Form and Event Modules being the first two:

SVG

This summer should see the completion of the following XML applications:

SVG will allow both simple vector graphics and high quality graphics arts rendering to be produced via an XML application. A simple SVG document is:

<svg viewBox="0 0 600 400">
  <g>
    <rect x="100" y="50" width="60" height="30" style="fill:red;stroke:yellow"/>
    <rect x="200" y="50" width="60" height="30" style="fill:url(#radgrad);
      stroke:url(#lingrad)"/>
    <rect x="300" y="50" width="60" height="30" style="stroke:blue; fill:none;
      stroke-width:6; stroke-dasharray:20 5; stroke-linejoin:miter"/>
  </g>
  <g transform="translate(10 10) scale(10)" style="stroke:none; fill:lime">
    <path d="M 0.0 11.2 L 2.0 12.4 L 4.0 12.9 L 6.0 12.6 L 8.0 12.0 L 10.0 11.1
      L 12.0 10.4 L 14.0 10.1 L 16.4 10.6 L 17.0 10.3 L 17.3 8.0 L 17.8 6.0
      L 18.5 3.9 L 20.0 3.0 L 22.0 3.0 L 24.0 4.0 L 26.0 6.1 L 28.0 6.9
      L 29.0 6.8 L 28.8 7.7 L 27.2 8.5 L 25.0 8.5 L 23.0 8.5 L 21.5 8.8
      L 21.1 9.5 L 21.5 11.0 L 22.8 12.0 L 24.1 13.0 L 25.1 14.9 L 25.2 16.4
      L 24.2 18.1 L 22.1 18.9 L 20.0 19.1 L 18.0 19.3 L 16.0 19.2 L 14.0 19.0
      L 12.0 19.0 L 10.0 18.8 L 8.0 18.2 L 6.1 17.9 L 4.2 17.1 L 3.0 15.9
      L 1.3 14.0 L 0.0 11.2 z"/>
  </g>
  <g style="fill:yellow; font-size:28pt">
    <text x="150" y="150">My first text string</text>
  </g>
</svg>

The outer <svg> element establishes the coordinate system for the drawing. Inside are three <g> grouping elements. These can be nested to any depth and allow local CSS properties, SVG attributes and a local coordinate system to be established for the group. The first group uses the initial coordinate system established. The second group will scale all the coordinates by a factor of 10 and translate them in space.
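
The effect of the second group's transform can be worked through numerically. The sketch below is an illustration in Python, not part of SVG itself; the helper names translate, scale and compose are assumptions. It follows the SVG convention that, in a list such as "translate(10 10) scale(10)", the rightmost transform is applied to a point first.

```python
def translate(tx, ty):
    """Return a function that shifts a point by (tx, ty)."""
    return lambda x, y: (x + tx, y + ty)

def scale(s):
    """Return a function that scales a point uniformly by s."""
    return lambda x, y: (x * s, y * s)

def compose(*transforms):
    """Combine transforms listed SVG-style: the rightmost is applied first."""
    def apply(x, y):
        for t in reversed(transforms):
            x, y = t(x, y)
        return x, y
    return apply

# transform="translate(10 10) scale(10)" from the second <g> element
group_transform = compose(translate(10, 10), scale(10))

# The path vertex "L 20.0 3.0" ends up here in the outer coordinate system:
print(group_transform(20.0, 3.0))  # (210.0, 40.0)
```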

The two main SVG drawing primitives are path and text but common paths like lines, polylines, polygons, rectangles, circles and ellipses have shorthand descriptions that effectively get turned into paths. The path element defines a path by a sequence of path data including:

A lowercase command letter implies that the coordinate following is relative rather than absolute. The whole aim is to transmit complex diagrams (maps, engineering data etc) across the internet efficiently. In consequence, path descriptions are not easy to read but are efficient to transmit.
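
The difference between absolute and relative commands can be seen by expanding a path into absolute coordinates. The following Python sketch is deliberately limited to the move (M/m), line (L/l) and close (Z/z) commands used in the example above; curves and the other path commands are omitted, and the function name path_points is an assumption.

```python
import re

# A command letter, or a number such as 12, -3 or 11.2
TOKEN = re.compile(r"[MmLlZz]|-?\d+(?:\.\d+)?")
COMMANDS = set("MmLlZz")

def path_points(d):
    """Return the absolute (x, y) vertices described by simple path data."""
    tokens = TOKEN.findall(d)
    points = []
    x = y = 0.0      # current point
    start = None     # first point of the current subpath
    cmd = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in COMMANDS:
            cmd = tok
            i += 1
            if cmd in "Zz" and start is not None:
                x, y = start             # closing returns to the subpath start
                points.append(start)
            continue
        if cmd is None or cmd in "Zz":
            raise ValueError("coordinates must follow a move or line command")
        nx, ny = float(tok), float(tokens[i + 1])
        if cmd.islower():                # lowercase: offset from current point
            nx, ny = x + nx, y + ny
        x, y = nx, ny
        if cmd in "Mm":
            start = (x, y)
            cmd = "L" if cmd == "M" else "l"  # later pairs are implicit lines
        points.append((x, y))
        i += 2
    return points

print(path_points("M 0 0 L 10 0 l 0 5 z"))
# [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 0.0)]
```

The third vertex is written as "l 0 5", an offset from (10, 0), yet it resolves to the absolute point (10, 5).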

CSS properties are used if appropriate and these have been extended to allow areas to be filled with radial and linear changes of colour as can be seen in the second rectangle.

Several implementations of SVG are available both as stand-alone viewers and as plug-ins. Adobe, IBM, JacKaroo and CSIRO are examples. The SVG Working Group has developed a test suite to check implementations against. Below is an example of one of the basic tests of functionality. Each test has a PNG image to check the SVG against:

Technology and Society Domain: RDF Schema and P3P

RDF Schema

The Resource Description Framework, RDF Model and Syntax[HREF41], for specifying metadata associated with a resource has been a W3C Recommendation since February 1999. A simple example is:

<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:DC="http://purl.org/dc/elements/1.0/">
<Description about="http://www.w3.org/folio.html">
<DC:title>The W3C Folio 1999</DC:title>
<DC:creator>W3C Communications Team</DC:creator>
<DC:date>1999-03-10</DC:date>
<DC:subject>Web development, World Wide Web Consortium, Interoperability of the Web</DC:subject>
</Description>
</RDF>

The metadata consists of a collection of properties called an RDF Description, in this case about the W3C Folio. The <RDF> element declares that this is an RDF expression using the format defined by the RDF Model and Syntax specification. The next line indicates that the Dublin Core RDF vocabulary is to be used and the prefix "DC" will be used for elements and attributes in that namespace. The Description element indicates that the metadata concerns the W3C Folio. The subsequent RDF statements define the metadata (title, creator, date, and subject) and they are part of the Dublin Core RDF vocabulary.
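
Each property in the Description amounts to a statement of the form (resource, property, value). As a minimal sketch, the Python below reads the example with the standard library's XML parser and lists those statements; it is not a general RDF parser, and the function name statements is an assumption.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The RDF example from the text, unchanged apart from indentation.
RDF_XML = """\
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:DC="http://purl.org/dc/elements/1.0/">
  <Description about="http://www.w3.org/folio.html">
    <DC:title>The W3C Folio 1999</DC:title>
    <DC:creator>W3C Communications Team</DC:creator>
    <DC:date>1999-03-10</DC:date>
    <DC:subject>Web development, World Wide Web Consortium, Interoperability of the Web</DC:subject>
  </Description>
</RDF>
"""

def statements(rdf_xml):
    """Return (resource, property URI, value) for each described property."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        resource = desc.get("about")
        for prop in desc:
            # ElementTree names look like "{namespace}local"; join the parts.
            namespace, _, local = prop.tag[1:].partition("}")
            triples.append((resource, namespace + local, prop.text))
    return triples

triples = statements(RDF_XML)
for triple in triples:
    print(triple)
```

The first statement printed is that the resource http://www.w3.org/folio.html has the Dublin Core title "The W3C Folio 1999".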

RDF provides a framework in which industry sectors can develop vocabularies that suit their needs and share these vocabularies with others. To do this, the meaning of the terms must be defined precisely and this is done using an RDF Schema[HREF42]. It defines the meaning, characteristics, and relationships of a set of properties, including constraints on potential values and the inheritance of properties from other schemas. One schema is the Dublin Core used by the library community. The Dublin Core is a set of 15 properties associated with bibliographic information. The RDF Schema specification will become a Recommendation in the near future.

A goal of RDF was to allow the mechanical translation of PICS metadata into RDF form. A recent W3C Note defines one possible mapping of PICS into XML/RDF. This Note was originally going to appear as part of the RDF Schema document but it has been published separately so that it can evolve independently of the RDF Schema specification.

P3P

Many Web sites collect information about users particularly when conducting e-commerce but also as a prerequisite to providing information. Users need to know whether the Web site is trustworthy and what the site plans to do with the information collected. A particular concern is to whom the information will be given. The Platform for Privacy Preferences Project (P3P[HREF43]) defines how a user can be informed of a site's practices. The user, or an agent working for the user, can then decide whether to proceed with a transaction.

A web site might have as its policy that: "it collects clickstream data in HTTP logs and collects first name, age and gender to personalise the responses. The information is not given to any other organisation and it has a third-party that audits its Web site to ensure that it keeps to this policy."

The P3P specification defines a vocabulary of keywords and possible values to encode this kind of information using the Resource Description Framework (RDF) data model. On arriving at a web site, the user or the agent can check that the site's policy is acceptable. Only if the user agrees to the privacy policy will personal data be sent to the web site by the agent. The site's P3P policy needs to be compared with the user's preferences, and appropriate action taken. A browser might, for example, show a warning if the user is about to enter a web site with a privacy policy unacceptable to the user.
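
The comparison an agent makes between a site's policy and the user's preferences can be sketched in a few lines. The Python below is purely illustrative: the class names, field names and data item labels are hypothetical and are not the P3P vocabulary, which is defined by the specification itself. The site policy encodes the example policy quoted earlier.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SitePolicy:
    collected: frozenset          # data items the site says it collects
    shared_with_others: bool      # passed on to other organisations?
    independently_audited: bool   # audited by a third party?

@dataclass(frozen=True)
class UserPreferences:
    willing_to_give: frozenset    # data items the user accepts being collected
    allow_sharing: bool = False
    require_audit: bool = False

def objections(policy, prefs):
    """Return the reasons the policy is unacceptable (empty if acceptable)."""
    problems = []
    extra = policy.collected - prefs.willing_to_give
    if extra:
        problems.append("collects " + ", ".join(sorted(extra)))
    if policy.shared_with_others and not prefs.allow_sharing:
        problems.append("shares data with other organisations")
    if prefs.require_audit and not policy.independently_audited:
        problems.append("is not independently audited")
    return problems

# The example policy: clickstream, first name, age and gender; not shared; audited.
SITE = SitePolicy(frozenset({"clickstream", "first name", "age", "gender"}),
                  shared_with_others=False, independently_audited=True)
USER = UserPreferences(frozenset({"clickstream", "first name"}))

print(objections(SITE, USER))  # ['collects age, gender']
```

An agent could show a warning whenever the returned list is not empty and send personal data only when it is.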

The P3P specification is currently at the Working Draft stage and consists of:

A P3P Preference Exchange Language (APPEL[HREF44]) is also being defined to encode user preferences about privacy. However, P3P can be used without using APPEL.

W3C is hosting an interoperability session in New York on June 21, 2000 to "test drive" P3P and to demonstrate its potential uses and capabilities to a broad audience of software and hardware developers, and Web site operators.

Web Accessibility Domain: First Set of Guidelines Completed

The Accessibility Domain will soon complete its first round of Guidelines aimed at:

  1. Web Content[HREF45]
  2. Authoring Tools[HREF46]
  3. User Agents[HREF47]

The Web Content Guidelines have been in wide use for some time. The Authoring Tools Guidelines appeared this Spring and at least one existing tool satisfies each checkpoint of these guidelines (even though no tool yet satisfies all of them).

The User Agent Guidelines explain to developers how to design user agents that are accessible to people with disabilities. User agents include graphical desktop browsers, multimedia players, text browsers, voice browsers, plug-ins, and other assistive technologies that give full access to Web content. While these guidelines primarily address the accessibility of general-purpose graphical user agents (including communication with assistive technologies), the principles presented apply to other types of user agents as well. Following these principles will make the Web accessible to users with disabilities and will benefit all users.

Here is a flavour of what is in the user agent guidelines:

In all there are seven main guidelines with a great deal of background information plus a priority check list for the tool designer.

IBM, RealNetworks, Sausage, SoftQuad, Amaya and others have all agreed to implement the relevant guidelines in their products. Large companies such as Boeing, Bell Atlantic and Electricité de France have also welcomed the guidelines as valuable in allowing them to use tools that produce accessible output.

In Conclusion

Hopefully, this introduction to the W3C and its activities gives a flavour of what W3C does and how it accomplishes it. The web continues to develop at a great pace and W3C and its Members are the major driving force to lead the web to its full potential.

W3C Recommendations

Below is a complete list of the current W3C Recommendations.

Name | Description | Date
PNG | Portable Network Graphics | October 1996
PICS 1.1: Rating Services and Systems; Label Format and Distribution | Platform for Internet Content Selection: language for describing rating services; formats for labels and their distribution | October 1996
PICS Rules 1.1 | PICS Rules | December 1997
XML 1.0 | Extensible Markup Language | February 1998
CSS2 | Cascading Style Sheets | May 1998
DSig 1.0 | PICS Signed Labels 1.0 | May 1998
SMIL 1.0 | Synchronised Multimedia Integration Language | June 1998
HTTP 1.1 | HyperText Transfer Protocol | September 1998
DOM Level 1 | Document Object Model | October 1998
Namespaces in XML | Defines how XML namespaces can coexist | January 1999
Web CGM Profile | Computer Graphics Metafile Profile for use on the Web | January 1999
RDF Model and Syntax | Resource Description Framework Model and syntax | February 1999
Web Content Accessibility Guidelines | Guidelines for making Web content accessible to people with disabilities | May 1999
Associating Stylesheets with XML documents | Using Processing Instructions to link a style sheet to an XML document | June 1999
MathML 1.01 | Mathematical Markup Language | July 1999
XPath 1.0 | XML Path Language | November 1999
XSLT 1.0 | XSL Transformations | November 1999
HTML 4.01 | HyperText Markup Language | December 1999
XHTML 1.0 | Reformulation of HTML 4.0 in XML | January 2000
Authoring Tool Accessibility Guidelines | Guidelines for editors and systems that create documents for the web | February 2000

New Recommendations Anticipated shortly

Name | Description
DOM Level 2.0 | Document Object Model level 2
RDF Schemas | Resource Description Framework Schemas
User Agent Accessibility Guidelines 1.0 | What browsers, screen readers etc need to support accessibility
MathML 2.0 | A new version of the Mathematical Markup language

Hypertext References

HREF1
http://www.w3.org/
HREF2
http://www.w3.org/TR
HREF3
http://www.w3.org/Member
HREF4
http://www.w3.org/People
HREF5
http://www.w3.org/TR
HREF6
http://www.w3.org/Consortium/Activities
HREF7
http://www.w3.org/Consortium/Process
HREF8
http://www.w3.org/Process/Process-19991111/submission#Submission
HREF9
http://www.w3.org/Submission
HREF10
http://www.w3.org/XML
HREF11
http://www.w3.org/XML/Activity#core-wg
HREF12
http://www.w3.org/XML/Activity#schema-wg
HREF13
http://www.w3.org/XML/Activity#linking-wg
HREF14
http://www.w3.org/XML/Activity#query-wg
HREF15
http://www.w3.org/Consortium/Activities
HREF16
http://www.w3.org/Consortium/#domains
HREF17
http://web.mit.edu/
HREF18
http://www.lcs.mit.edu/
HREF19
http://www.inria.fr/
HREF20
http://www.keio.ac.jp/
HREF21
http://www.w3.org/Consortium/Prospectus/Joining
HREF22
http://www.w3.org/Consortium/Offices/
HREF23
http://www.w3.org/XML/1999/XML-in-10-points
HREF24
http://www.w3.org/TR/REC-xml-names/
HREF25
http://www.w3.org/TR/xpath
HREF26
http://www.w3.org/TR/xslt
HREF27
http://www.w3.org/TR/xsl/
HREF28
http://www.w3.org/DOM/
HREF29
http://www.w3.org/TR/xlink/
HREF30
http://www.w3.org/TR/xptr
HREF31
http://www.w3.org/TR/xmlschema-0/
HREF32
http://www.w3.org/TR/xml-infoset
HREF33
http://www.w3.org/TR/xinclude
HREF34
http://www.w3.org/TR/xmlbase
HREF35
http://www.w3.org/TR/xml-c14n
HREF36
http://www.w3.org/TR/xmlquery-req
HREF37
http://www.w3.org/TR/WD-xml-fragment
HREF38
http://www.w3.org/TR/xhtml1
HREF39
http://www.w3.org/People/Raggett/tidy/
HREF40
http://www.w3.org/TR/SVG
HREF41
http://www.w3.org/TR/REC-rdf-syntax
HREF42
http://www.w3.org/TR/rdf-schema
HREF43
http://www.w3.org/TR/P3P
HREF44
http://www.w3.org/TR/P3P-preferences
HREF45
http://www.w3.org/TR/WAI-WEBCONTENT
HREF46
http://www.w3.org/TR/ATAG10
HREF47
http://www.w3.org/TR/UAAG10


Copyright

Bob Hopgood, (c) 2000. The author assigns to Southern Cross University and other educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The author also grants a non-exclusive licence to Southern Cross University to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the conference papers and for the document to be published on mirrors on the World Wide Web.




AusWeb2K, the Sixth Australian World Wide Web Conference, Rihga Colonial Club Resort, Cairns, 12-17 June 2000 Contact: Norsearch Conference Services +61 2 66 20 3932 (from outside Australia) (02) 6620 3932 (from inside Australia) Fax (02) 6622 1954