Parser bug found in cwm RDF/XML parser

Tim Berners-Lee

It seems that cwm's RDF/XML parser had a bug from the early days when 
a bunch of RDF used strange or no namespaces for the attributes about= etc.

The offending code is quoted in the IRC snippet below.
The file /sax2rdf.py had the bug: a property attribute xx:about was assumed to be rdf:about even when it was in fact sioc:about. The clause had already been commented as a hack in the code.

That was what was messing up the reading of the tabulator issues list (data wiki version).
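
To illustrate the effect, here is a minimal sketch in modern Python. It is not the actual sax2rdf.py code: the function names `resolve_with_kludge` and `resolve_fixed` are hypothetical, and the SIOC namespace URI is assumed to be the usual `http://rdfs.org/sioc/ns#`. The first function mimics the old clause quoted in the IRC log below, and the second shows the behaviour with that clause commented out.

```python
RDF_NS_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SIOC_NS_URI = "http://rdfs.org/sioc/ns#"  # assumed namespace for sioc:about

# Local names that the old kludge treated as RDF syntax attributes.
KLUDGE_NAMES = "ID about aboutEachPrefix bagID type"


def resolve_with_kludge(ns, ln):
    """Mimic the old clause: force matching local names into the RDF namespace."""
    if ns and KLUDGE_NAMES.find(ln) > 0 and ns != RDF_NS_URI:
        ns = RDF_NS_URI
    return ns, ln


def resolve_fixed(ns, ln):
    """With the clause commented out, the attribute keeps its own namespace."""
    return ns, ln


# sioc:about is silently turned into rdf:about by the kludge ...
print(resolve_with_kludge(SIOC_NS_URI, "about"))
# ... and left alone once the clause is removed.
print(resolve_fixed(SIOC_NS_URI, "about"))
```

Note that a `find(...) > 0` test is a substring check on the position, so in this sketch it skips a name found at index 0 (`ID`) and matches fragments such as `bout`.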

I found I could edit the file in place and did so.

I have edited the cwm source and checked it in, as it seems to run quite a lot of the test suite.
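
A check along the following lines could confirm that the two `about` attributes stay distinct at the SAX level. This is only a sketch using Python's standard `xml.sax` module, not part of cwm's test suite; the sample document, `AttributeCollector` and `collect_attributes` are made up for illustration, and the SIOC namespace URI is an assumption.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

RDF_NS_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SIOC_NS_URI = "http://rdfs.org/sioc/ns#"  # assumed namespace for sioc:about

# Hypothetical sample: one node carrying both rdf:about and sioc:about.
SAMPLE = b"""<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:sioc="http://rdfs.org/sioc/ns#">
  <rdf:Description rdf:about="http://example.org/post"
                   sioc:about="http://example.org/topic"/>
</rdf:RDF>"""


class AttributeCollector(ContentHandler):
    """Record every attribute as (namespace URI, local name, value)."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElementNS(self, name, qname, attrs):
        # In namespace mode, attribute names are (namespace, localname) pairs.
        for ns, ln in attrs.getNames():
            self.seen.append((ns, ln, attrs.getValue((ns, ln))))


def collect_attributes(xml_bytes):
    """Parse xml_bytes with namespace processing on and return the attributes."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = AttributeCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.seen


for ns, ln, value in collect_attributes(SAMPLE):
    print(ns, ln, value)
```

A parser that keeps the namespace reported here, rather than rewriting it based on the local name, would treat `sioc:about` as an ordinary property attribute.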

Tim BL


from irc://irc.freenode.net/swig

You are now known as timbl.
22:33 timbl: Found the problem -- it is with cwm's parser
22:33 timbl: An old kludge
22:33 timbl: # The following section was a kludge to work with presumably old bad RDF
22:33 timbl: # files while RDF was being defined way back when.
22:33 timbl: #            if ns:              # Removed 2010 as this is a kludge which creaks with sioc:about - timbl 2010-07-19
22:33 timbl: #                if string.find("ID about aboutEachPrefix bagID type", ln)>0:
22:33 timbl: #                    if ns != RDF_NS_URI:
22:33 timbl: #                      print ("# Warning -- %s attribute in %s namespace not RDF NS." %
22:33 timbl: #                              name, ln)
22:33 timbl: #                      ns = RDF_NS_URI  # Allowed as per dajobe: ID, bagID, about, resource, parseType or type
22:33 timbl: -----------------------------
22:33 timbl: That whole clause should be commented out
22:34 timbl: in /afs/csail.mit.edu/group/dig/www/data/TAMI/2007/cwmrete/tmswap/sax2rdf.py
22:38 timbl: Have commented the lines out in that file
22:39 timbl: on v slow connection so difficult to test
22:40 timbl: Seems to work better!
22:40 timbl: nn
22:40 timbl: ----------------
22:40 kennyluck: timbl, that was for proof sent along with the Updated triple.
22:40 kennyluck: s/Updated triple/updated triples/
22:40 kennyluck: Sorry if I have messed things up.
22:41 * timbl heads off for the night
22:41 timbl: proof?
22:41 timbl: I notice I get a justify pane
22:41 kennyluck: Yes.
22:41 kennyluck: The system allows you to send SPARUL with proof.
22:42 kennyluck: It's a dirty hack and a try on CWM + Linked Data.
22:42 timbl: Tomorrow .. gtg - the proof is stored in the file an/?
22:42 kennyluck: Anyway, we should maintain the wiki well. I'm sorry I didn't do it right.
22:42 kennyluck: in the file.
22:42 timbl: maybe explain in email to tabulator@
22:42 kennyluck: I think we shouldn't use Algae anymore though.
22:42 timbl: Algae doesn't use cwm
22:43 timbl: Maybe we should use SWobjects
22:43 kennyluck: Yeah, we should have more SPARUL wikis
22:43 timbl: gtg - tomorrow
22:43 kennyluck: implementation
22:43 kennyluck: bye
22:49 kennyluck left