Version: 5.1.1
HTML: Parsing Library
The html library provides functions to read HTML documents and
structures to represent them.
Reads (X)HTML from a port, producing an html instance.
If v is not #f, then comments are read and returned. Defaults to #f.
If v is not #f, then the HTML must respect the HTML specification
with regard to which elements are allowed to be the children of
other elements. For example, the top-level "<html>"
element may contain only a "<head>" and a "<body>"
element. Defaults to #f.
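The two settings described above can be adjusted for a single read with parameterize. The following is a minimal sketch, assuming the settings are exported by the html library as the parameters read-html-comments and use-html-spec (these names do not survive in the text above, so treat them as assumptions). The helper parse-keeping-comments and the sample document are hypothetical.

```racket
#lang racket
(require (prefix-in h: html))

;; parse-keeping-comments : string -> html
;; Hypothetical helper: parses str with comments kept and with
;; content-model checking switched off, for this call only.
(define (parse-keeping-comments str)
  (parameterize ([h:read-html-comments #t]  ; assumed parameter name
                 [h:use-html-spec #f])      ; assumed parameter name
    (h:read-xhtml (open-input-string str))))

(define doc
  (parse-keeping-comments
   (string-append
    "<html><head><title>My title</title></head><body>"
    "<!-- a note --><p>Hello world</p>"
    "</body></html>")))
```

Using parameterize rather than setting the values globally keeps the change local, so other reads in the same program still see the defaults.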
1 Example
  (module html-example racket

    ; Some of the symbols in html and xml conflict with
    ; each other and with racket/base language, so we prefix
    ; to avoid namespace conflict.
    (require (prefix-in h: html)
             (prefix-in x: xml))

    (define an-html
      (h:read-xhtml
       (open-input-string
        (string-append
         "<html><head><title>My title</title></head><body>"
         "<p>Hello world</p><p><b>Testing</b>!</p>"
         "</body></html>"))))

    ; extract-pcdata: html-content -> (listof string)
    ; Pulls out the pcdata strings from some-content.
    (define (extract-pcdata some-content)
      (cond [(x:pcdata? some-content)
             (list (x:pcdata-string some-content))]
            [(x:entity? some-content)
             (list)]
            [else
             (extract-pcdata-from-element some-content)]))

    ; extract-pcdata-from-element: html-element -> (listof string)
    ; Pulls out the pcdata strings from an-html-element.
    (define (extract-pcdata-from-element an-html-element)
      (match an-html-element
        [(struct h:html-full (attributes content))
         (apply append (map extract-pcdata content))]

        [(struct h:html-element (attributes))
         '()]))

    (printf "~s\n" (extract-pcdata an-html)))

  > (require 'html-example)
  ("My title" "Hello world" "Testing" "!")
2 HTML Structures
pcdata, entity, and attribute are defined in the xml documentation.
An html-content is either an html-element, a pcdata, or an entity.
The html-full structure extends html-element with one field:
  content : (listof html-content)
Any html tag that may include content also inherits from
html-full without adding any additional fields.
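Because every tag that can hold content is an html-full, a traversal only needs to distinguish html-full from plain html-element. The sketch below illustrates this by counting elements. It assumes the predicates and accessor that Racket struct definitions conventionally generate (html-full?, html-element?, html-full-content); count-elements is a hypothetical helper, not part of the library.

```racket
#lang racket
(require (prefix-in h: html))

(define doc
  (h:read-xhtml
   (open-input-string
    (string-append
     "<html><head><title>My title</title></head><body>"
     "<p>Hello world</p><p><b>Testing</b>!</p>"
     "</body></html>"))))

;; count-elements : html-content -> natural
;; Counts the html-element instances in some-content, nested ones included.
(define (count-elements some-content)
  (cond
    ;; Check html-full first: it is a subtype of html-element.
    [(h:html-full? some-content)
     (add1 (for/sum ([child (in-list (h:html-full-content some-content))])
             (count-elements child)))]
    ;; An element that cannot hold content has no children to visit.
    [(h:html-element? some-content) 1]
    ;; pcdata and entity values are not elements.
    [else 0]))

(printf "~s\n" (count-elements doc))
```

Testing html-full? before html-element? matters here, since an html-full value also satisfies html-element?.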
A Contents-of-html is either
A Contents-of-head is either
A Contents-of-tr is either
A Contents-of-table is either
A Contents-of-fieldset is either
A Contents-of-select is either
A Contents-of-dl is either
A Contents-of-pre is either
A Contents-of-object-applet is either
A Map is
(make-map (listof attribute) (listof Contents-of-map))
A Contents-of-map is either
A Contents-of-a is either
A Contents-of-address is either
A Contents-of-body is either
A G12 is either
A G11 is either
A G10 is either
A G9 is either
A G8 is either
A G7 is either
A G6 is either
A G5 is either
A G4 is either
A G3 is either
A G2 is either