last data update: 2011/10/14, 23:39

Website loading time

during the test: 0.23 s

cable connection (average): 0.37 s

DSL connection (average): 0.5 s

modem (average): 7.7 s

HTTP headers

Information about DNS servers

Name              Type  Class  TTL    Value
cafeconleche.org  MX    IN     10800  5
cafeconleche.org  A     IN     10800  152.19.134.41
cafeconleche.org  SOA   IN     10800  ns0.easydns.com admin.easydns.com 1300355547 10800 3600 1209600 10800
cafeconleche.org  NS    IN     10800  remote1.easydns.com
cafeconleche.org  NS    IN     10800  remote2.easydns.com
cafeconleche.org  NS    IN     10800  ns1.easydns.com
cafeconleche.org  NS    IN     10800  ns2.easydns.com

Received from the first DNS server

Request to the server "cafeconleche.org"
You used the following DNS server:
DNS Name: remote1.easydns.com
DNS Server Address: 64.68.192.10#53
DNS server aliases:

HEADER opcode: QUERY, status: NOERROR, id: 41748
flags: qr aa rd; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 4

QUESTION SECTION:
cafeconleche.org. IN ANY

ANSWER SECTION:
cafeconleche.org. 10800 IN SOA ns0.easydns.com. admin.easydns.com. 1300355547 10800 3600 1209600 10800
cafeconleche.org. 10800 IN A 152.19.134.41
cafeconleche.org. 10800 IN MX 5 .
cafeconleche.org. 10800 IN NS remote1.easydns.com.
cafeconleche.org. 10800 IN NS ns1.easydns.com.
cafeconleche.org. 10800 IN NS ns2.easydns.com.
cafeconleche.org. 10800 IN NS remote2.easydns.com.

ADDITIONAL SECTION:
ns1.easydns.com. 300 IN A 64.68.192.10
ns2.easydns.com. 300 IN A 72.52.2.1
remote1.easydns.com. 300 IN A 64.68.192.10
remote2.easydns.com. 300 IN A 72.52.2.1

Received 266 bytes from address 64.68.192.10#53 in 14 ms

Received from the second DNS server

Request to the server "cafeconleche.org"
You used the following DNS server:
DNS Name: remote2.easydns.com
DNS Server Address: 72.52.2.1#53
DNS server aliases:

HEADER opcode: QUERY, status: NOERROR, id: 57728
flags: qr aa rd; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 4

QUESTION SECTION:
cafeconleche.org. IN ANY

ANSWER SECTION:
cafeconleche.org. 10800 IN SOA ns0.easydns.com. admin.easydns.com. 1300355547 10800 3600 1209600 10800
cafeconleche.org. 10800 IN A 152.19.134.41
cafeconleche.org. 10800 IN MX 5 .
cafeconleche.org. 10800 IN NS ns2.easydns.com.
cafeconleche.org. 10800 IN NS remote1.easydns.com.
cafeconleche.org. 10800 IN NS ns1.easydns.com.
cafeconleche.org. 10800 IN NS remote2.easydns.com.

ADDITIONAL SECTION:
ns1.easydns.com. 300 IN A 64.68.192.10
ns2.easydns.com. 300 IN A 72.52.2.1
remote1.easydns.com. 300 IN A 64.68.192.10
remote2.easydns.com. 300 IN A 72.52.2.1

Received 266 bytes from address 72.52.2.1#53 in 67 ms
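
The two name servers return the same answer records, only in a different order. A minimal Python sketch of how that could be checked programmatically, assuming the records are available as text lines in the `name TTL class type data` layout shown above. The names `Record`, `parse_record` and `parse_answer` are illustrative and not part of any DNS library, and the SOA line is left out for brevity:

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rclass: str
    rtype: str
    rdata: str


def parse_record(line: str) -> Record:
    # The first four fields are whitespace-separated; the rest is record data.
    name, ttl, rclass, rtype, rdata = line.split(None, 4)
    return Record(name, int(ttl), rclass, rtype, rdata)


def parse_answer(text: str) -> frozenset:
    # A set ignores the order in which a server happens to list its records.
    return frozenset(parse_record(line) for line in text.splitlines() if line.strip())


FIRST_SERVER = """\
cafeconleche.org. 10800 IN A 152.19.134.41
cafeconleche.org. 10800 IN MX 5 .
cafeconleche.org. 10800 IN NS remote1.easydns.com.
cafeconleche.org. 10800 IN NS ns1.easydns.com.
cafeconleche.org. 10800 IN NS ns2.easydns.com.
cafeconleche.org. 10800 IN NS remote2.easydns.com.
"""

SECOND_SERVER = """\
cafeconleche.org. 10800 IN A 152.19.134.41
cafeconleche.org. 10800 IN MX 5 .
cafeconleche.org. 10800 IN NS ns2.easydns.com.
cafeconleche.org. 10800 IN NS remote1.easydns.com.
cafeconleche.org. 10800 IN NS ns1.easydns.com.
cafeconleche.org. 10800 IN NS remote2.easydns.com.
"""

servers_agree = parse_answer(FIRST_SERVER) == parse_answer(SECOND_SERVER)
print("servers agree:", servers_agree)
```

Comparing sets rather than lists is the design choice here: record order within a response carries no meaning, so only the content should count.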

Subdomains (the first 50)

Typos (misspellings)

xafeconleche.org
vafeconleche.org
fafeconleche.org
dafeconleche.org
czfeconleche.org
csfeconleche.org
cwfeconleche.org
cqfeconleche.org
cadeconleche.org
caceconleche.org
caveconleche.org
cageconleche.org
cateconleche.org
careconleche.org
cafwconleche.org
cafsconleche.org
cafdconleche.org
cafrconleche.org
caf4conleche.org
caf3conleche.org
cafexonleche.org
cafevonleche.org
cafefonleche.org
cafedonleche.org
cafecinleche.org
cafecknleche.org
cafeclnleche.org
cafecpnleche.org
cafec0nleche.org
cafec9nleche.org
cafecobleche.org
cafecomleche.org
cafecojleche.org
cafecohleche.org
cafeconkeche.org
cafeconpeche.org
cafeconoeche.org
cafeconlwche.org
cafeconlsche.org
cafeconldche.org
cafeconlrche.org
cafeconl4che.org
cafeconl3che.org
cafeconlexhe.org
cafeconlevhe.org
cafeconlefhe.org
cafeconledhe.org
cafeconlecge.org
cafeconlecbe.org
cafeconlecne.org
cafeconlecje.org
cafeconlecue.org
cafeconlecye.org
cafeconlechw.org
cafeconlechs.org
cafeconlechd.org
cafeconlechr.org
cafeconlech4.org
cafeconlech3.org
afeconleche.org
cfeconleche.org
caeconleche.org
cafconleche.org
cafeonleche.org
cafecnleche.org
cafecoleche.org
cafeconeche.org
cafeconlche.org
cafeconlehe.org
cafeconlece.org
cafeconlech.org
acfeconleche.org
cfaeconleche.org
caefconleche.org
cafceonleche.org
cafeocnleche.org
cafecnoleche.org
cafecolneche.org
cafeconelche.org
cafeconlcehe.org
cafeconlehce.org
cafeconleceh.org
ccafeconleche.org
caafeconleche.org
caffeconleche.org
cafeeconleche.org
cafecconleche.org
cafecoonleche.org
cafeconnleche.org
cafeconlleche.org
cafeconleeche.org
cafeconlecche.org
cafeconlechhe.org
cafeconlechee.org
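
The last three groups in the list above appear to follow simple rules: drop one character, swap two neighbouring characters, or double one character. The following Python sketch reproduces only those three groups; it is a guess at the pattern, not the tool's actual generator, and it does not attempt the keyboard-neighbour substitutions that open the list. The function names are hypothetical.

```python
def deletions(label: str) -> list:
    # Drop each character in turn: "cafe" -> "afe", "cfe", "cae", "caf".
    return [label[:i] + label[i + 1:] for i in range(len(label))]


def transpositions(label: str) -> list:
    # Swap each pair of neighbouring characters.
    return [
        label[:i] + label[i + 1] + label[i] + label[i + 2:]
        for i in range(len(label) - 1)
    ]


def doublings(label: str) -> list:
    # Repeat each character once: "cafe" -> "ccafe", "caafe", ...
    return [label[:i + 1] + label[i] + label[i + 1:] for i in range(len(label))]


def typo_domains(domain: str) -> list:
    label, _, tld = domain.partition(".")
    variants = deletions(label) + transpositions(label) + doublings(label)
    # Remove duplicates and the unchanged label while keeping the order.
    unique = [v for v in dict.fromkeys(variants) if v != label]
    return [f"{v}.{tld}" for v in unique]


if __name__ == "__main__":
    for name in typo_domains("cafeconleche.org"):
        print(name)
```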

Location

IP: 152.19.134.41

continent: NA, country: United States (USA), city: Chapel Hill

Website value

rank in the traffic statistics:

There is not enough data to estimate website value.

Basic information

website built using CSS

code weight: 52.32 KB

text-to-code ratio: 60%

title: Cafe con Leche XML News and Resources

description: Cafe con Leche is the preeminent independent source of XML information on the net. Cafe con Leche is neither beholden to specific companies nor to advertisers. At Cafe con Leche you'll find many resources to help you develop your XML skills here including daily news summaries, examples, book reviews, mailing lists and more.

keywords: XML, Cafe con Leche, XML Bible, XML: Extensible Markup Language, XML in a Nutshell, Processing XML with Java, Effective XML

encoding: UTF-8

language: en-US

Website code analysis

one word phrases repeated minimum three times

PhraseQuantity
the115
let96
:=94
to77
XML58
and57
in53
for50
return40
of39
declare30
that29
as27
namespace25
on24
if23
have23
is21
then20
else19
201018
The17
(Permalink)17
this16
","))16
January15
but14
into14
be14
Cafe14
I've14
"1.0";13
version13
xquery13
with13
$date13
can12
out12
XQuery12
work11
time11
it11
new11
web10
was10
(Java)10
not10
some9
I'm9
so9
au9
how9
$tweets9
or9
News9
more9
about9
just8
lot8
site8
now8
atom="http://www.w3.org/2005/Atom";8
my8
Lait8
old8
query8
it's8
function8
up8
$entry8
xmldb="http://exist-db.org/xquery/xmldb";7
In7
Language7
1.07
from7
will7
still7
should7
};7
are7
going6
them6
only6
where6
It6
at6
(XSLT)6
$dd6
"on6
$y16
had6
by6
than6
do6
"("))6
me6
an6
don't6
you6
xs:string6
"list,"))6
eXist6
""5
Java5
back5
one5
other5
I'll5
This5
get5
also5
quotes5
reverse(document("/db/twitter/elharo")/atom:feed/atom:entry)5
(C)5
html="http://www.w3.org/1999/xhtml";5
XOM5
XSL5
think5
very5
(XSL-FO)5
con5
they5
bug4
$year4
Friday,4
$day-of-month4
data4
eXist.4
(:4
XHTML4
")4
:)4
$month4
Markup4
see4
normalize-space(substring-before($string-date,4
these4
Tuesday,4
$month-day4
like4
no4
substring-before($month-day,4
something4
run4
Now4
$num4
files4
Leche4
$id4
may4
error4
Atom4
string($dt)4
probably4
the"))4
substring-before($sourcetext,4
normalize-space(substring-after($string-date,4
the")4
(contains($sourcetext,4
HTML4
$dt4
when4
Wednesday,4
$location4
since4
$day4
$tweet/p4
distinct-values($tweets/date)4
$tweet/date4
xs:string)4
"T"),4
}4
news4
$result4
{substring-before(substring-after($entry/atom:updated/text(),4

4
"+")}4
XSLT4
{substring-before($entry/atom:updated/text(),
4
"T")}4
normalize-space(substring-before($date,4
date4
$tweet4
normalize-space(substring-after($date,4
$string-date4

{$date}

4
"elharo:"),3
Today's3
Thursday,3
SAX3
Resources3
February3
DOM3
"(http://[^s]+)",3
XPath3
Feed3
Level3
there's3
figured3
it,3
3
Schema3
Part3
finished3
April3
documents3
"_",3
replace3
Description3
sites3
default3
simple3
$dt/following-sibling::html:dd[1]3
3
decided3
3
much3
last3
string($dt/@id)3
$quote3
(Eiffel)3
quote3
too3
"3
"--"))3
3
$source3
string($quote/@cite)3
(Python)3
$dd/html:blockquote3
all3
$cite3
There's3
$author3
If3
"March",3
exist3
"February",3
("January",3
Saxon3
$months3
"April",3
And3
Extensible3
"August",3
"July",3
XML:3
"May",3
"June",3
didn't3
has3
yet.3
here3
permalink3
Processing3
Quotes3
strip3
Lists3
(contains($y1,3
Jaxen3
normalize-space(substring-before($y1,3
using3
Effective3
few3
List3
debugger3
3
instead3
AOL3
XLinks3
{substring-after($entry/atom:title/text(),3
Maybe3
XPointers3
RSS3
{$id}3
{$cite}3
{$author}3
Then3
might3
Cafes3
Conferences3
Rusty3
"November",3
"December")3
"September",3
"October",3
Mailing3

two word phrases repeated minimum three times

PhraseQuantity
declare namespace23
2010 (Permalink)17
:= if17
version "1.0";13
",")) let13
"1.0"; declare13
xquery version13
of the12
Cafe au9
on the9
the old8
namespace atom="http://www.w3.org/2005/Atom";8
au Lait8
$entry in8
have to8
for $entry7
declare function7
namespace xmldb="http://exist-db.org/xquery/xmldb";7
lot of7
in the7
return for6
let $dd6
xs:string let6
$dd :=6
as xs:string6
to be6
to work5
Cafe con5
how to5
in reverse(document("/db/twitter/elharo")/atom:feed/atom:entry)5
atom="http://www.w3.org/2005/Atom"; let5
let $tweets5
$tweets :=5
}; declare5
xmldb="http://exist-db.org/xquery/xmldb"; declare5
for $date5
$date in5
namespace html="http://www.w3.org/1999/xhtml";5
time to5
and the5
work on5
normalize-space(substring-before($string-date, ","))4
:= normalize-space(substring-before($string-date,4
:= substring-before($month-day,4
let $month4
$month :=4
") let4
:= string($dt)4
$dt in4
$month-day :=4
$date :=4
for $dt4
let $date4
$day-of-month :=4
string($dt) let4
let $day-of-month4
$day :=4
xs:string) as4
let $day4
let $id4
:= normalize-space(substring-before($date,4
as xs:string)4
(Permalink) I've4
the new4
into the4
to get4
normalize-space(substring-before($date, ","))4
let $string-date4
XML with4
let $year4
$year :=4
normalize-space(substring-after($string-date, ","))4
:= normalize-space(substring-after($string-date,4
$string-date :=4
:= normalize-space(substring-after($date,4
normalize-space(substring-after($date, ","))4
let $month-day4
substring-before($month-day, ")4

{substring-before(substring-after($entry/atom:updated/text(),

4
"T")}

4
{substring-before(substring-after($entry/atom:updated/text(), "T"),4
"T"), "+")}4
distinct-values($tweets/date) return4
in distinct-values($tweets/date)4
{substring-before($entry/atom:updated/text(), "T")}
4
return
{substring-before($entry/atom:updated/text(),
4
the") let4
"on the")4
"list,")) then4
$id :=4
reverse(document("/db/twitter/elharo")/atom:feed/atom:entry) return4
:= for4
that the4
return

{$date}

4
$tweet/p }4
return $tweet/p4
the web4
The XML4
con Leche4
Markup Language4
$date return4
$tweet/date $date4
for $tweet4

{$date}

for
4
$tweet in4
in $tweets4
where $tweet/date4
$tweets where4
if (contains($sourcetext,4
else ""4
:= $dt/following-sibling::html:dd[1]3
old quotes3
html="http://www.w3.org/1999/xhtml"; for3
"+")} 3
"elharo:"), "(http://[^s]+)",3
don't have3
$dt/following-sibling::html:dd[1] let3
"(http://[^s]+)", "3
Friday, January3
:= string($dt/@id)3
$dd/html:blockquote let3
let $cite3
$cite :=3
:= $dd/html:blockquote3
$quote :=3
string($dt/@id) let3
let $quote3
Tuesday, January3
I've decided3
the time3
XML 1.03
figured out3
Thursday, January3
The Cafes3
XML Schema3
some of3
it back3
Schema Part3
all the3
Effective XML3
Mailing Lists3
for the3
XML Mailing3
Processing XML3
with Java3
out how3
Java XML3
but it's3
:= string($quote/@cite)3
string($quote/@cite) let3
if (contains($y1,3
(contains($y1, "("))3
"(")) then3
("January", "February",3
"February", "March",3
permalink :)3
:) let3
"March", "April",3
then normalize-space(substring-before($y1,3
normalize-space(substring-before($y1, "("))3
let $author3
$source :=3
let $months3
$author :=3
:= ("January",3
"(")) else3
else $y13
$y1 let3
strip permalink3
(: strip3
"August", "September",3
"July", "August",3
"June", "July",3
"September", "October",3
"October", "November",3
let $source3
"December") let3
"November", "December")3
"May", "June",3
here on3
"" let3
"April", "May",3
",")) (:3
$y1 :=3
let $y13
the XQuery3
{$id}3
$months :=3
going to3
instead of3
this site3
to the3

three word phrases repeated minimum three times

PhraseQuantity
xquery version "1.0";13
version "1.0"; declare13
"1.0"; declare namespace12
declare namespace atom="http://www.w3.org/2005/Atom";8
Cafe au Lait8
declare namespace xmldb="http://exist-db.org/xquery/xmldb";7
for $entry in7
let $dd :=6
as xs:string let6
xmldb="http://exist-db.org/xquery/xmldb"; declare namespace5
namespace xmldb="http://exist-db.org/xquery/xmldb"; declare5
}; declare function5
for $date in5
let $tweets :=5
declare namespace html="http://www.w3.org/1999/xhtml";5
atom="http://www.w3.org/2005/Atom"; let $tweets5
namespace atom="http://www.w3.org/2005/Atom"; let5
$entry in reverse(document("/db/twitter/elharo")/atom:feed/atom:entry)5
let $id :=4
let $year :=4
"on the") let4
2010 (Permalink) I've4
Cafe con Leche4
let $month-day :=4
:= normalize-space(substring-before($string-date, ","))4
$month-day := normalize-space(substring-before($string-date,4
normalize-space(substring-before($string-date, ",")) let4
$date := string($dt)4
$tweets := for4
:= for $entry4
to work on4
for $dt in4
:= if (contains($sourcetext,4
let $day-of-month :=4
") let $day-of-month4
let $month :=4
let $date :=4
$month := substring-before($month-day,4
:= substring-before($month-day, ")4
substring-before($month-day, ") let4
:= string($dt) let4
:= normalize-space(substring-after($string-date, ","))4
"T")}

{substring-before(substring-after($entry/atom:updated/text(),

4
{substring-before($entry/atom:updated/text(), "T")}

4
return
{substring-before($entry/atom:updated/text(), "T")}
4
reverse(document("/db/twitter/elharo")/atom:feed/atom:entry) return
{substring-before($entry/atom:updated/text(),
4

{substring-before(substring-after($entry/atom:updated/text(), "T"),

4
{substring-before(substring-after($entry/atom:updated/text(), "T"), "+")}4
return

{$date}

for
4
distinct-values($tweets/date) return

{$date}

4
in distinct-values($tweets/date) return4
$date in distinct-values($tweets/date)4
in reverse(document("/db/twitter/elharo")/atom:feed/atom:entry) return4
as xs:string) as4
let $string-date :=4
$string-date := normalize-space(substring-after($date,4
:= normalize-space(substring-after($date, ","))4
normalize-space(substring-after($date, ",")) let4
",")) let $string-date4
normalize-space(substring-before($date, ",")) let4
let $day :=4
$day := normalize-space(substring-before($date,4
:= normalize-space(substring-before($date, ","))4

{$date}

for $tweet
4
return for $date4
where $tweet/date $date4
$tweet/date $date return4
return $tweet/p }4
$date return $tweet/p4
for $tweet in4
$tweet in $tweets4
in $tweets where4
$tweets where $tweet/date4
$id := string($dt/@id)3
string($dt/@id) let $date3
:= string($dt/@id) let3
$quote := $dd/html:blockquote3
XML Mailing Lists3
normalize-space(substring-after($string-date, ",")) (:3
html="http://www.w3.org/1999/xhtml"; for $dt3
let $quote :=3
$dd/html:blockquote let $cite3
:= $dd/html:blockquote let3
$dd := $dt/following-sibling::html:dd[1]3
let $author :=3
namespace html="http://www.w3.org/1999/xhtml"; for3
Processing XML with3
some of the3
string($dt) let $dd3
out how to3
XML with Java3
let $source :=3
:= $dt/following-sibling::html:dd[1] let3
let $cite :=3
$cite := string($quote/@cite)3
:= string($quote/@cite) let3
string($quote/@cite) let $source3
the old quotes3
"March", "April", "May",3
then normalize-space(substring-before($y1, "("))3
"(")) then normalize-space(substring-before($y1,3
(contains($y1, "(")) then3
if (contains($y1, "("))3
normalize-space(substring-before($y1, "(")) else3
"(")) else $y13
with Java XML3
else $y1 let3
",")) let $y13
:= if (contains($y1,3
$year := if3
XML Schema Part3
time to work3
xs:string) as xs:string3
",")) (: strip3
(: strip permalink3
:) let $year3
permalink :) let3
strip permalink :)3
xs:string let $day3
$y1 let $month-day3
"September", "October", "November",3
"August", "September", "October",3
"July", "August", "September",3
"June", "July", "August",3
"October", "November", "December")3
"November", "December") let3
"elharo:"), "(http://[^s]+)", "3
"T"), "+")} 3
"December") let $month3
"May", "June", "July",3
"April", "May", "June",3
$months := ("January",3
let $months :=3
",")) let $months3
:= ("January", "February",3
let $y1 :=3
$y1 := normalize-space(substring-after($string-date,3
"February", "March", "April",3
("January", "February", "March",3
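
The phrase tables in this section look like plain n-gram counts over whitespace-separated, case-sensitive tokens ("The" and "the" are counted separately, and punctuation stays attached to words). A minimal Python sketch under that assumption; `repeated_phrases` is an illustrative name and the tool's real tokenizer is not known:

```python
from collections import Counter


def repeated_phrases(text: str, n: int, minimum: int = 3) -> list:
    # Split on whitespace only, so "(Permalink)" or '"1.0";' stay single tokens.
    words = text.split()
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    counts = Counter(grams)
    # Keep phrases seen at least `minimum` times, most frequent first.
    return [(phrase, count) for phrase, count in counts.most_common() if count >= minimum]


if __name__ == "__main__":
    sample = "declare namespace a declare namespace b declare namespace c"
    print(repeated_phrases(sample, 2))
```

Calling it with `n` set to 1, 2 and 3 would correspond to the one-, two- and three-word tables.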

B tags

U tags

I tags

Cafe con Leche XML News and Resources

Quote of the Day

I remember the early days of the web -- and the last days of CD ROM -- when there was this mainstream consensus that the web and PCs were too durned geeky and difficult and unpredictable for "my mom" (it's amazing how many tech people have an incredibly low opinion of their mothers). If I had a share of AOL for every time someone told me that the web would die because AOL was so easy and the web was full of garbage, I'd have a lot of AOL shares. And they wouldn't be worth much.

--Cory Doctorow
Read the rest in Why I won't buy an iPad (and think you shouldn't, either)

Today's News

I've released XOM 1.2.5, my free-as-in-speech (LGPL) dual streaming/tree-based API for processing XML with Java. 1.2.5 is a very minor release. The only visible change is that Builder.build((Reader) null) now throws a NullPointerException instead of a confusing MalformedURLException. I've also added support for Maven 2, and hope to get the packages uploaded to the central repository in a week or two.

In other news, I have had very little time to work on this site lately. In order to have any time to work on other projects including XOM and Jaxen, I've had to let this site slide. I expect to have more news about that soon.

Also, speaking of Jaxen, I noticed that the website has been a little out of date for a while now because I neglected to update the releases page when 1.1.2 was released in 2008. Consequently, a lot of folks have been missing out on the latest bug fixes and optimizations. If you're still using Jaxen 1.1.1 or earlier, please upgrade when you get a minute. Also, note that the official site is http://jaxen.codehaus.org/. jaxen.org is a domain name spammer. I'm not sure who let that one slide, but we'll have to see about grabbing it back one of these days.
Permalink to Today's News | Recent News | Today's Java News on Cafe au Lait | The Cafes | Older News | E-mail Elliotte Rusty Harold

Recommended Reading

Selected content that might have some relevance or interest for this site's visitors: You can also see previous recommended reading or subscribe to the recommended reading RSS feed if you like.

Recent News

Friday, February 12, 2010 (Permalink)

Yesterday I figured out how to process form input. Today I figured out how to parse strings into nodes in eXist. This is very eXist specific, but briefly:

let $doc := "<html xmlns='http://www.w3.org/1999/xhtml'> <div> foo </div> </html>"
let $list := util:catch('*', (util:parse($doc)), ($util:exception-message))
return $list

I'll need this for posts and comments. There's also a parse-html function but it's based on the flaky NekoHTML instead of the more reliable TagSoup.

Wednesday, February 10, 2010 (Permalink)

I'm slowly continuing to work on the new backend. I've finally gotten indexing to work. It turns out that eXist's namespace handling for index configuration files is broken in 1.4.0, but that should be fixed in the release. I've also managed to get the source built and most of the tests to run so I can contribute patches back. Next up I'm looking into the support for the Atom Publishing Protocol.

Wednesday, February 3, 2010 (Permalink)

I spent a morning debugging a problem that I have now boiled down to this test case. The following query prints 3097:

<html> { let $num := count(collection("/db/quotes")/quote) return $num } </html>

and this query prints 0:

<html xmlns="http://www.w3.org/1999/xhtml"> { let $num := count(collection("/db/quotes")/quote) return $num } </html>

The only difference is the default namespace declaration. In the documents being queried the quote elements are indeed in no namespace. Much to my surprise XQuery has broken the semantics of XPath 1.0 by applying default namespaces to unqualified names in path expressions.
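
The effect described in the February 3 entry can be imitated outside XQuery. The Python sketch below is only an analogy using the standard library's ElementTree, not what eXist does internally: `count_quotes` is a hypothetical helper that resolves the unprefixed name `quote` either in no namespace (the XPath 1.0 reading) or in an assumed default element namespace (the XQuery reading), against a small made-up document whose elements are in no namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Made-up stand-in for the stored documents: elements in no namespace.
DOC = ET.fromstring(
    "<quotes><quote>one</quote><quote>two</quote><quote>three</quote></quotes>"
)


def count_quotes(root, default_element_ns=None):
    # With a default element namespace in force, the bare name "quote"
    # is treated as "{namespace}quote" before matching.
    if default_element_ns:
        name = "{%s}quote" % default_element_ns
    else:
        name = "quote"
    return len(root.findall(name))


print(count_quotes(DOC))            # name resolved in no namespace
print(count_quotes(DOC, XHTML_NS))  # name resolved in the XHTML namespace
```

The first call finds all three elements; the second finds none, because no element in the document carries the XHTML namespace.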
Who thought it would be a good idea to break practice with XSLT, every single XPath implementation on the planet, and years of experience and documentation? There's an argument to be made for default namespaces applying in path expressions, but the time for that argument to be made was 1998. Once the choice was made, the cost of switching was far higher than any incremental improvement you might make. Stare decisis isn't just for the supreme court.

Saturday, January 30, 2010 (Permalink)

XQuery executing for about an hour now. O(N^2) algorithm perhaps? Maybe I should learn about indexes? Or is eXist just hung?

declare namespace xmldb="http://exist-db.org/xquery/xmldb";
declare namespace html="http://www.w3.org/1999/xhtml";
declare namespace xs="http://www.w3.org/2001/XMLSchema";
declare namespace atom="http://www.w3.org/2005/Atom";
for $date in distinct-values( for $updated in collection("/db/news")/atom:entry/atom:updated order by $updated descending return xs:date(xs:dateTime($updated)))
let $entries := collection("/db/news")/atom:entry[xs:date(xs:dateTime(atom:updated)) = $date]
return <div> for $entry in $entries return $entry/atom:title <hr /> </div>

Friday, January 29, 2010 (Permalink)

I've got a lot of the old data loaded into eXist (news and quotes; readings and other pages I still have to think about). I'm now focusing on how to get it back out again and put it in web pages. Once that's done, the remaining piece is setting up some system for putting new data in. It will probably be a fairly simple HTML form, but some sort of markdown support might be nice. Perhaps I can hack something together that will insert paragraphs if there are no existing paragraphs, and otherwise leave the markup alone. I'm also divided on the subject of whether to store the raw text, the XHTML converted text, or both. This will be even more critical when I add comment support.
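
The January 30 query filters the whole collection once for every distinct date, which may be one reason it runs for so long when no index helps; that is a guess, in line with the author's own question. A single-pass alternative can be sketched in Python. This is not the author's method: `group_titles_by_date` and the `(updated, title)` tuple layout are assumptions made for illustration, and the time zone offset is simply dropped when taking the date.

```python
from collections import defaultdict
from datetime import datetime


def group_titles_by_date(entries):
    # entries: iterable of (atom:updated timestamp, title) pairs.
    groups = defaultdict(list)
    for updated, title in entries:
        day = datetime.fromisoformat(updated).date()
        groups[day].append(title)  # each entry is visited exactly once
    # Newest date first, like "order by $updated descending".
    return sorted(groups.items(), reverse=True)


if __name__ == "__main__":
    sample = [
        ("2010-01-29T07:00:31-05:00", "First post"),
        ("2010-01-30T07:00:31-05:00", "Second post"),
        ("2010-01-29T09:15:00-05:00", "Third post"),
    ]
    for day, titles in group_titles_by_date(sample):
        print(day.isoformat(), titles)
```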
Tuesday, January 26, 2010 (Permalink) I've more or less completed the script that converts the old news into Atom entry documents: xquery version "1.0"; declare namespace xmldb="http://exist-db.org/xquery/xmldb"; declare namespace html="http://www.w3.org/1999/xhtml"; declare namespace xs="http://www.w3.org/2001/XMLSchema"; declare namespace atom="http://www.w3.org/2005/Atom"; declare namespace text="http://exist-db.org/xquery/text"; declare function local:leading-zero($n as xs:decimal) as xs:string { let $result := if ($n >= 10) then string($n) else concat("0", string($n)) return $result }; declare function local:parse-date($date as xs:string) as xs:string { let $day := normalize-space(substring-before($date, ",")) let $string-date := normalize-space(substring-after($date, ",")) let $y1 := normalize-space(substring-after($string-date, ",")) (: strip permalink :) let $year := if (contains($y1, "(")) then normalize-space(substring-before($y1, "(")) else $y1 let $month-day := normalize-space(substring-before($string-date, ",")) let $months := ("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December") let $month := substring-before($month-day, " ") let $day-of-month := local:leading-zero(xs:decimal(substring-after($month-day, " "))) let $monthnum := local:leading-zero(index-of($months,$month)) (: I don't necessarily know the time so I'll pick something vaguely plausible. :) return concat($year, "-", $monthnum, "-", $day-of-month, "T07:00:31-05:00") }; declare function local:first-sentence($text as xs:string) as xs:string { let $r0 := normalize-space($text) let $r1 := substring-before($text, '. ') let $penultimate := substring($r1, string-length($r1)-1, 1) let $sentence := if ($penultimate != " " or not(contains($r1, ' '))) then concat($r1, ".") else concat($r1, ". 
", local:first-sentence($r1)) return $sentence }; declare function local:make-id($date as xs:string, $position as xs:integer) as xs:string { let $day := normalize-space(substring-before($date, ",")) let $string-date := normalize-space(substring-after($date, ",")) let $y1 := normalize-space(substring-after($string-date, ",")) (: strip permalink :) let $year := if (contains($y1, "(")) then normalize-space(substring-before($y1, "(")) else $y1 let $month-day := normalize-space(substring-before($string-date, ",")) let $months := ("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December") let $month := substring-before($month-day, " ") let $day-of-month := local:leading-zero(xs:decimal(substring-after($month-day, " "))) let $monthnum := local:leading-zero(index-of($months,$month)) return concat($month, "_", $day-of-month, "_", $year, "_", $position) }; declare function local:permalink-date($date as xs:string) as xs:string { let $day := normalize-space(substring-before($date, ",")) let $string-date := normalize-space(substring-after($date, ",")) let $y1 := normalize-space(substring-after($string-date, ",")) (: strip permalink :) let $year := if (contains($y1, "(")) then normalize-space(substring-before($y1, "(")) else $y1 let $month-day := normalize-space(substring-before($string-date, ",")) let $month := substring-before($month-day, " ") let $day-of-month := xs:decimal(substring-after($month-day, " ")) return concat($year, $month, $day-of-month) }; for $newsyear in (1998 to 2009) return for $dt in doc(concat("file:///Users/elharo/cafe%20con%20Leche/news", $newsyear ,".html"))/html:html/html:body/html:dl/html:dt let $dd := $dt/following-sibling::html:dd[1] let $date := string($dt) let $itemstoday := count($dd/html:div) return for $item at $count in $dd/html:div let $sequence := $itemstoday - $count + 1 let $id := if ($item/@id) then string($item/@id) else local:make-id($date, $sequence) let $published := if 
($item/@class) then string($item/@class) else local:parse-date($date) let $link := concat("http://www.cafeconleche.org/#", $id) let $permalink := if ($item/@id) then concat("http://www.cafeconleche.org/oldnews/news", local:permalink-date($date), ".html#", $item/@id) else concat("http://www.cafeconleche.org/oldnews/news", local:permalink-date($date), ".html") return <atom:entry xml:id="{$id}"> <atom:author> <atom:name>Elliotte Rusty Harold</atom:name> <atom:uri>http://www.elharo.com/</atom:uri> </atom:author> <atom:id>{$link}</atom:id> <atom:title>{local:first-sentence(string($item))}</atom:title> <atom:updated>{$published}</atom:updated> <atom:content type="xhtml" xml:lang="en" xml:base="http://www.cafeconleche.org/" xmlns="http://www.w3.org/1999/xhtml">{$item/node()}</atom:content> <link rel="alternate" href="{$link}"/> <link rel="permalink" href="{$permalink}"/> </atom:entry>

I should probably figure out how to remove some of the duplicate date parsing code, but it's basically a one-off migration script so I may not bother. I think I have enough in place now that I can start setting up the templates for the main index.html page and the quote and news archives. Then I can start exploring the authoring half of the equation.

Monday, January 25, 2010 (Permalink)

I'm beginning to seriously hate the runtime error handling (or lack thereof) in XQuery. It's just too damn hard to debug what's going wrong where compared to Java. You can't see where the bad data is coming from, and there's no try-catch facility to help you out. Now that I think about it, I had very similar problems with Haskell last year. I wonder if this is a common issue with functional languages?

Thursday, January 21, 2010 (Permalink)

I've just about finished importing all the old quotes into eXist. (There was quite a bit of cleanup work going back 12 years. The format changed slowly over time.) Next up is the news. I am wondering if maybe this is backwards.
Perhaps first I should build the forms and backend for posting new content, and then import the old data? After all, it's the new content people are interested in. There's not that much call for breaking XML news from 1998. :-)

Wednesday, January 20, 2010 (Permalink)

Parsing a date in the form "Wednesday, January 20, 2010" in XQuery:

xquery version "1.0";
declare function local:leading-zero($n as xs:decimal) as xs:string {
  let $result := if ($n >= 10) then string($n) else concat("0", string($n))
  return $result
};
declare function local:parse-date($date as xs:string) as element() {
  let $day := normalize-space(substring-before($date, ","))
  let $string-date := normalize-space(substring-after($date, ","))
  let $year := normalize-space(substring-after($string-date, ","))
  let $month-day := normalize-space(substring-before($string-date, ","))
  let $months := ("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")
  let $month := substring-before($month-day, " ")
  let $day-of-month := number(substring-after($month-day, " "))
  return <postdate> <day>{$day}</day> <date>{$year}-{local:leading-zero(index-of($months,$month))}-{local:leading-zero($day-of-month)}</date> </postdate>
};
local:parse-date("Monday, April 27, 2009")

Tuesday, January 19, 2010 (Permalink)

Today I went from merely splitting the quotes files apart into individual quotes to actually storing them back into the database:

xquery version "1.0";
declare namespace xmldb="http://exist-db.org/xquery/xmldb";
declare namespace html="http://www.w3.org/1999/xhtml";
for $dt in doc("/db/quoteshtml/quotes2009.html")/html:html/html:body/html:dl/html:dt
let $id := string($dt/@id)
let $date := string($dt)
let $dd := $dt/following-sibling::html:dd[1]
let $quote := $dd/html:blockquote
let $cite := string($quote/@cite)
let $source := $quote/following-sibling::*
let $sourcetext := normalize-space(substring-after($source, "--"))
let $author := if (contains($sourcetext, "Read the")) then substring-before($sourcetext, "Read") else substring-before($sourcetext, "on the")
let $location := if ($source/html:a) then $source/html:a else substring-after($sourcetext, "on the")
let $quotedate := if (contains($sourcetext, "list,")) then normalize-space(substring-after($sourcetext, "list,")) else ""
let $justlocation := if (contains($location, "list,")) then normalize-space(substring-after(substring-before($sourcetext, ","), "on the")) else $location
let $singlequote := <quote> <id>{$id}</id> <postdate>{$date}</postdate> <content>{$quote}</content> <cite>{$cite}</cite> <author>{$author}</author> <location>{$justlocation}</location> { if ($quotedate) then <quotedate>{$quotedate}</quotedate> else "" } </quote>
let $name := concat("quote_", $id)
let $store-return := xmldb:store("quotes", $name, $singlequote)
return <store-result> <store>{$store-return}</store> <documentname>{$name}</documentname> </store-result>

I suspect the next thing I should do is work on improving the dates somewhat since I'll likely want to sort and query by them. Right now they're human readable but not so easy to process. E.g.

<postdate>Monday, April 27, 2009</postdate>

I should try to turn this into

<postdate> <day>Monday</day> <date>2009-04-27</date> </postdate>

Time to read up on the XQuery date and time functions. Hmm, looks like it's going to be regular expressions after all.

Friday, January 15, 2010 (Permalink)

I've converted all the old quotes archives to well-formed (though not necessarily valid) XHTML and uploaded them into eXist. Now I have to come up with an XQuery that breaks them up into individual quotes. This is proving trickier than expected (and I expected it to be pretty tricky), especially since a lot of the old quotes aren't in perfectly consistent formats. Maybe it's time to try out Oxygen's XQuery debugger since they sent me a freebie? If only the interface weren't such a horror show.
They say they have a debugger but I can't find it, and the buttons they're using in the screencast don't seem to be present in the latest version. In the meantime, can anyone see the syntax error in this code?

xquery version "1.0";
declare namespace xmldb="http://exist-db.org/xquery/xmldb";
declare namespace html="http://www.w3.org/1999/xhtml";
for $dt in doc("/db/quoteshtml/quotes2010.html")/html:html/html:body/html:dl/html:dt
let $id := string($dt/@id)
let $date := string($dt)
let $dd := $dt/following-sibling::html:dd
let $quote := $dd/html:blockquote
let $cite := string($quote/@cite)
let $source := $quote/following-sibling::html:p
let $author := normalize-space(substring-after($source/*[1], "--"))
return <quote> <id>{$id}</id> <date>{$date}</date> <quote>{$quote}</quote> <cite>{$cite}</cite> <source>{$quote}</source> <author>{$author}</author> </quote>

The error message from eXist is "The actual cardinality for parameter 1 does not match the cardinality declared in the function's signature: string($arg as item()?) xs:string. Expected cardinality: zero or one, got 4."

Found the bug: the debugger wasn't very helpful (once I found it--apparently Author and Oxygen are not the same thing), but Saxon had much better error messages than eXist. I needed to change let $dd := $dt/following-sibling::html:dd to let $dd := $dt/following-sibling::html:dd[1]. eXist didn't tell me which line had the problem so I was looking in the wrong place. Saxon pointed me straight to it. Score 1 for Saxon.

Here's the finished script. It works for at least the last couple of years.
I still have to test it out on some of the older files:

xquery version "1.0";
declare namespace xmldb="http://exist-db.org/xquery/xmldb";
declare namespace html="http://www.w3.org/1999/xhtml";
for $dt in doc("/db/quoteshtml/quotes2009.html")/html:html/html:body/html:dl/html:dt
let $id := string($dt/@id)
let $date := string($dt)
let $dd := $dt/following-sibling::html:dd[1]
let $quote := $dd/html:blockquote
let $cite := string($quote/@cite)
let $source := $quote/following-sibling::*
let $sourcetext := normalize-space(substring-after($source, "--"))
let $author := if (contains($sourcetext, "Read the")) then substring-before($sourcetext, "Read") else substring-before($sourcetext, "on the")
let $location := if ($source/html:a) then $source/html:a else substring-after($sourcetext, "on the")
let $quotedate := if (contains($sourcetext, "list,")) then normalize-space(substring-after($sourcetext, "list,")) else ""
let $justlocation := if (contains($location, "list,")) then normalize-space(substring-after(substring-before($sourcetext, ","), "on the")) else $location
return <quote>
    <id>{$id}</id>
    <postdate>{$date}</postdate>
    <quote>{$quote}</quote>
    <cite>{$cite}</cite>
    <author>{$author}</author>
    <location>{$justlocation}</location>
    { if ($quotedate) then <quotedate>{$quotedate}</quotedate> else "" }
  </quote>

Thursday, January 14, 2010 (Permalink)

The XQuery work continues to roll along. I think I've roughly figured out how to configure the server. I found and reported a few more bugs in eXist, none too critical. I now have eXist serving this entire web site on my local box, though I haven't changed the server here on IBiblio yet. That's still Apache and PHP.

The next step is to convert all the static files from the last 12 years (quotes, news, books, conferences, etc.) into smaller documents in the database. For instance, each quote will be its own document. Then I have to rewrite the pages as XQuery "templates" that query the database.
From that point I can add support for new posts, submissions, and comments via a web browser and forms.

Friday, January 8, 2010 (Permalink)

I didn't really like the format of yesterday's Twitter dump so today I opened another can of XQuery ass-kicking to improve it. First, let's group by date:

xquery version "1.0";
declare namespace atom="http://www.w3.org/2005/Atom";
let $tweets := for $entry in reverse(document("/db/twitter/elharo")/atom:feed/atom:entry)
  return <div><date>{substring-before($entry/atom:updated/text(), "T")}</date>
    <p> <span>{substring-before(substring-after($entry/atom:updated/text(), "T"), "+")} UTC</span>
    {substring-after($entry/atom:title/text(), "elharo:")}</p></div>
return for $date in distinct-values($tweets/date)
  return <div><h3>{$date}</h3> { for $tweet in $tweets where $tweet/date = $date return $tweet/p }</div>

Now let's hyperlink the URLs:

xquery version "1.0";
declare namespace atom="http://www.w3.org/2005/Atom";
let $tweets := for $entry in reverse(document("/db/twitter/elharo")/atom:feed/atom:entry)
  return <div><date>{substring-before($entry/atom:updated/text(), "T")}</date>
    <p> <span>{substring-before(substring-after($entry/atom:updated/text(), "T"), "+")} </span>
    {replace(substring-after($entry/atom:title/text(), "elharo:"), "(http://[^\s]+)", "<a href='$1'>$1</a>")}</p></div>
return for $date in distinct-values($tweets/date)
  return <div><h3>{$date}</h3> { for $tweet in $tweets where $tweet/date = $date return $tweet/p }</div>

Let's do the same for @names:

xquery version "1.0";
declare namespace atom="http://www.w3.org/2005/Atom";
let $tweets := for $entry in reverse(document("/db/twitter/elharo")/atom:feed/atom:entry)
  return <div><date>{substring-before($entry/atom:updated/text(), "T")}</date>
    <p> <span>{substring-before(substring-after($entry/atom:updated/text(), "T"), "+")} </span>
    { replace ( replace(substring-after($entry/atom:title/text(), "elharo:"), "(http://[^\s]+)", "<a href='$1'>$1</a>"), " @([a-zA-Z]+)", " <a
href='http://twitter.com/$1'>@$1</a>" ) }</p></div>
return for $date in distinct-values($tweets/date)
  return <div><h3>{$date}</h3> { for $tweet in $tweets where $tweet/date = $date return $tweet/p }</div>

And one more time for hash tags:

xquery version "1.0";
declare namespace atom="http://www.w3.org/2005/Atom";
let $tweets := for $entry in reverse(document("/db/twitter/elharo")/atom:feed/atom:entry)
  return <div><date>{substring-before($entry/atom:updated/text(), "T")}</date>
    <p> <span>{substring-before(substring-after($entry/atom:updated/text(), "T"), "+")} </span>
    { replace ( replace ( replace(substring-after($entry/atom:title/text(), "elharo:"), "(http://[^\s]+)", "<a href='$1'>$1</a>"), " @([a-zA-Z]+)", " <a href='http://twitter.com/$1'>@$1</a>" ), " #([a-zA-Z]+)", " <a href='http://twitter.com/search?q=#$1'>#$1</a>" ) }</p></div>
return for $date in distinct-values($tweets/date)
  return <div><h3>{$date}</h3> { for $tweet in $tweets where $tweet/date = $date return $tweet/p }</div>

And here's the finished result.

Thursday, January 7, 2010 (Permalink)

This morning a simple practice exercise to get my toes wet. First load my Tweets from their Atom feed into eXist:

xquery version "1.0";
declare namespace xmldb="http://exist-db.org/xquery/xmldb";
let $collection := xmldb:create-collection("/db", "twitter")
let $filename := ""
let $URI := xs:anyURI("file:///Users/elharo/backups/elharo_statuses.xml")
let $retcode := xmldb:store($collection, "elharo", $URI)
return $retcode

Then generate HTML of each tweet:

xquery version "1.0";
declare namespace atom="http://www.w3.org/2005/Atom";
for $entry in document("/db/twitter/elharo")/atom:feed/atom:entry
return <p>{$entry/atom:updated/text()} {substring-after($entry/atom:title/text(), "elharo:")}</p>

Can I reverse them so they go forward in time? Yes, easily:

for $entry in reverse(document("/db/twitter/elharo")/atom:feed/atom:entry)

Now how do I dump that to a file? Maybe something like this?
xquery version "1.0";
declare namespace atom="http://www.w3.org/2005/Atom";
let $tweets := <html> {
  for $entry in document("/db/twitter/elharo")/atom:feed/atom:entry
  return <p>{$entry/atom:updated/text()} {substring-after($entry/atom:title/text(), "elharo:")}</p>
} </html>
return xmldb:store("/db/twitter", "/Users/elharo/tmp/tweets.html", $tweets)

Oh damn. Almost, but that puts it back into the database instead of the filesystem. Still, I can now run a query that grabs just that document and copy and paste the result, since there's only one. The first query gave almost 1000 results and the query sandbox only shows one at a time.

Tomorrow: how do I serve that query as a web page?

Wednesday, January 6, 2010 (Permalink)

What I've learned about eXist so far:

I can use virtual hosting to run it, either at Rackspace Cloud, Amazon EC2, or right here on IBiblio, and use Jetty as my web server. However, I probably should proxy it behind Apache anyway.
I can upload files into the repository.
I can execute simple XQueries using the XQuery sandbox.

What I still don't know:

How to address the documents I upload from inside the XQuery sandbox, and in general how to manage and manipulate collections. Partial answer:

xquery version "1.0";
declare namespace xmldb="http://exist-db.org/xquery/xmldb";
for $foo in collection("/db/collectionname
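
The date cleanup floated in the January 15 entry (turning "Monday, April 27, 2009" into a day name plus an ISO date) could be sketched roughly as follows. This is only an illustration, not the script the author ended up with: the function names local:month-number and local:postdate are hypothetical, and it assumes every date follows the "Weekday, Month D, YYYY" pattern with English month names. It leans on tokenize() with regular expressions rather than the built-in date constructors, since those expect input that is already in ISO form.

```xquery
xquery version "1.0";

(: Hypothetical helper, not from the original post:
   maps an English month name to a two-digit month number. :)
declare function local:month-number($name as xs:string) as xs:string {
  let $months := ("January", "February", "March", "April", "May", "June",
                  "July", "August", "September", "October", "November", "December")
  let $n := index-of($months, $name)
  return if ($n lt 10) then concat("0", $n) else string($n)
};

(: Hypothetical helper: splits "Weekday, Month D, YYYY" into a day name
   and an ISO 8601 date. Assumes the input always has that shape. :)
declare function local:postdate($text as xs:string) as element(postdate) {
  (: "Monday, April 27, 2009" becomes ("Monday", "April 27", "2009") :)
  let $parts := tokenize(normalize-space($text), ",\s*")
  (: "April 27" becomes ("April", "27") :)
  let $monthday := tokenize($parts[2], "\s+")
  let $day := $monthday[2]
  (: pad single-digit days so the result is a valid xs:date lexical form :)
  let $dd := if (string-length($day) = 1) then concat("0", $day) else $day
  return
    <postdate>
      <day>{$parts[1]}</day>
      <date>{concat($parts[3], "-", local:month-number($monthday[1]), "-", $dd)}</date>
    </postdate>
};

local:postdate("Monday, April 27, 2009")
```

For the sample input this should yield a postdate element containing <day>Monday</day> and <date>2009-04-27</date>. Keeping the date in ISO form means it can later be cast with xs:date() for sorting and comparison.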

images

file name    alternative text
Cup of
Refactoring HTML
Hosted by IBiblio

headers

H1

Cafe con Leche XML News and Resources

H2

Quote of the Day

Today's News

Recommended Reading

Recent News

H3

Quote of the Day

Today's News

Recommended Reading

Recent News

H4

Validating Parsers

Non-validating Parsers

Online Validators and Syntax Checkers

Formatting Engines

Browsers

Class Libraries

Editors

XML Applications

External Sites

H5

H6

internal links

address    anchor text
Cup of
Permalink to Today's News
Recent News
Older News
E-mail Elliotte Rusty Harold
previous recommended reading
recommended reading RSS feed
Permalink
Permalink
Permalink
Permalink
Permalink
Permalink
I should probably figure out how to remove some of the duplicate date parsing code, but it's basically a one-off migration script so I may not bother. I think I have enough in place now that I can start setting up the templates for the main index.html page and the quote and news archives. Then I can start exploring the authoring half of the equation. Monday, January 25, 2010 (Permalink
Permalink
Permalink
Permalink
Permalink
Permalink
Permalink
Permalink
Permalink
Permalink
Permalink
Older News
XML Books
XML Examples
XML Trade Shows
XML Mailing Lists
XML Quotes
Cafe con Leche RSS Feed
elharo@metalab.unc.edu
Processing XML with Java
XML in a Nutshell
Effective XML
The XML 1.1 Bible
The XML Bible, Gold Edition
XML: Extensible Markup Language
Special Reports
XML Book List
XML Examples
XML Seminar Slides
XML Conferences
XML Mailing Lists
XML Quotes
RSS Feed
Atom Feed
XSLT
XSL-FO
XLinks
XPointers
Schemas
More conferences
SAX Conformance Tests
XQuisitor
Effective XML
StAX
XOM
DOM Level 3
Intro to XML
Processing XML with Java
XML Fundamentals
XQuery
Namespaces
XSLT 2.0 and Beyond
Schemas
DTDs
DOM
SAX
XLinks and XPointers
XSL Transformations
XML: Hype vs. Hope
XInclude
Advanced XML
About this web site
Effective XML
Processing XML with Java
XML In A Nutshell
XML Bible
XML: Extensible Markup Language
XML Conferences and Trade Shows
XML Book List
XML Mailing Lists
Quotes

external links

address    anchor text
XOM 1.2.5
website
releases page
Today's Java News on Cafe au Lait
The Cafes
XQuery date and time functions
XQuery debugger
And here's the finished result
excessive confirmation
Second bug filed
The Cafes
Cafe au Lait
Elliotte Rusty Harold
The Cafes
Mokka mit Schlag
Cafe au Lait
Amazon Plog
Refactoring HTML
The XML FAQ List
The Annotated XML Spec
Balisage
XOM
Jaxen
XInclude
Cafe au Lait
XML 1.0
Errata in XML 1.0
Annotated XML 1.0 specification
XML Namespaces
CSS Level 1
CSS Level 2
HTML 4.0
XHTML 1.0
XSL Formatting Objects
XSL Transformations 1.0
XPath 1.0
XML Schema Part 0: Primer
XML Schema Part 1: Structures
XML Schema Part 2: Datatypes
XLinks
XPointers
DOM
SAX
URLs
URIs
Dublin Core
Unicode
libxml
EXML
Xerces-J
Xerces-C
Xerces-P
MSXML
XML Parser for Java
xmlproc
Larval
SXP
RXP
Crimson
fxp
XML for C++
LTXML
XML parser
KXML
XML Tools
xmltex
XML Engine
XML::Parser
Lark
XP
GNU JAXP
Expat
eXML
Nenie XML
XParse
PyXML
xml.sax
STG XML Validation Form
xml-check
libxslt
Saxon
Xalan
jd.xslt
Sablotron
XT
FOP
xmlroff
XEP
PassiveTeX
Antenna House XSL Formatter
Jade
xslj
docproc
Koala XSL Engine
Opera
Jumbo
Mozilla
Firefox
Safari
Internet Explorer Windows
X-Smiles
TechExplorer
JDOM
XOM
Jaxen
XML Copy Editor
Serna
EditiX
Exchanger
EditML
XMetaL
XML Spy
XML Pro
XMLwriter
Jumbo
Cooktop
XMLmind XML Editor
Ecological Metadata Language (EML)
Itsy Bitsy Teeny Weeny Simple Hypertext DTD (IBTWSH)
Molecular Dynamics Language
Chemical Markup Language
Mathematical Markup Language
MusicXML
ICE
Resource Description Framework
FlixML
Extensible Mail Transport Protocol
Personalized Information Description Language
XHTML
Channel Definition Format
Scalable Vector Graphics
The W3C
xml.com
Microsoft's XML Page
Robin Cover's XML Web Page
The XML Files
OASIS
fr
Cafe au Lait
Hosted by IBiblio