WiktionaryFirstSense

From NLP2RDF-Wiki

Requirements

  • Apache Web Server with PHP
  • SQLite3

Install SQLite3 in Ubuntu

sudo apt-get install sqlite3 libsqlite3-dev php5-sqlite
# restart apache, so the changes are loaded:
sudo service apache2 restart

If the SQLite driver still cannot be enabled in PHP after SQLite3 has been installed successfully, try the following commands:

$ sudo apt-get --purge remove 'php5*'
$ sudo apt-get install php5 php5-sqlite php5-mysql
$ sudo apt-get install php-pear php-apc php5-curl
$ sudo apt-get autoremove

Source Code

  • Repository: https://bitbucket.org/wormsbee/wiktionary-nif
  • Checkout:
$ sudo mkdir your_dir
$ hg clone https://bitbucket.org/wormsbee/wiktionary-nif your_dir

Note: your_dir must be writable

$ sudo chmod 777 your_dir

After the script has run, a file named "token_db.sqlite3" should have been created to store the returned results.

Note: Make sure that the file is writable, so that new data can be inserted continuously

$ sudo chmod 777 your_dir/token_db.sqlite3
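
The two notes above can be checked before a request is sent. The following is a minimal sketch, not part of the repository: the function name sqlite_write_problems is hypothetical, and the check runs as the current user, which may differ from the user the Apache process runs as. SQLite also writes a journal file next to the database, which is why the directory itself is checked.

```python
import os


def sqlite_write_problems(directory, db_name="token_db.sqlite3"):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: it only inspects permissions for the current
    user and never opens or modifies the database.
    """
    if not os.path.isdir(directory):
        return ["directory does not exist: " + directory]

    problems = []
    # The directory must be writable so the database and its journal
    # file can be created there.
    if not os.access(directory, os.W_OK | os.X_OK):
        problems.append("directory is not writable: " + directory)

    db_path = os.path.join(directory, db_name)
    # The database file only exists after the script has run once.
    if os.path.exists(db_path) and not os.access(db_path, os.W_OK):
        problems.append("database file is not writable: " + db_path)
    return problems


if __name__ == "__main__":
    for problem in sqlite_write_problems("your_dir"):
        print(problem)
```

An empty result only says that the current user could write; to test the web server's permissions, the check would have to run under that account.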

Test Cases

Some test cases can be run with these shell scripts:

  • For database:
$ sh ./test-case.sh
  • For Query:
$ sh ./test_their.sh

Online Demo

Example Requests

Additional parameters

The select parameter specifies the regular expression that is passed to the SPARQL query on Wiktionary to filter the returned senses. The default is: select="English-.*-1en$"

 SELECT ?sense FROM <http://en.wiktionary.dbpedia.org/> WHERE {
   <http://wiktionary.dbpedia.org/resource/'.$new_token.'> <http://www.monnet-project.eu/lemon#sense> ?sense .
   FILTER (regex (str(?sense ) , "$select" ))
  } LIMIT 100
 Example 1: 
  select="English-.*-1en$"  
  returns: http://wiktionary.dbpedia.org/resource/${token in text}-English-Noun-1en -or- http://wiktionary.dbpedia.org/resource/${token in text}-English-Verb-1en
 Example 2: 
  select="French-Verb-1en$"  
  returns: http://wiktionary.dbpedia.org/resource/${token in text}-French-Verb-1en
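
To see what a given select value keeps, the FILTER can be imitated locally. This is only a sketch under assumptions: it uses Python's re module, whose syntax is close to but not identical with the regular expressions of SPARQL's regex(), and the function name filter_senses is made up for illustration. The sense URIs are taken from the examples on this page.

```python
import re

DEFAULT_SELECT = r"English-.*-1en$"  # default value documented above


def filter_senses(sense_uris, select=DEFAULT_SELECT):
    """Keep the URIs in which the pattern matches somewhere.

    Like SPARQL's regex(), re.search() is unanchored unless the
    pattern itself uses ^ or $.
    """
    pattern = re.compile(select)
    return [uri for uri in sense_uris if pattern.search(uri)]


if __name__ == "__main__":
    base = "http://wiktionary.dbpedia.org/resource/"
    senses = [
        base + "on-English-Noun-1en",
        base + "on-French-Pronoun-3fr",
        base + "on-English-Preposition-6fr",
    ]
    for uri in filter_senses(senses):
        print(uri)
```

With the default pattern only on-English-Noun-1en is kept: on-English-Preposition-6fr contains "English-" but does not end in "-1en".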


The reset_db parameter (boolean) specifies whether all entries in the table should be cleared before the new ones are inserted.

curl -d reset_db=true -d nif=true -d input-type=text --data-urlencode  prefix="http://example.com/example.txt#" \
--data-urlencode  input="President Obama on Monday will call for a new minimum tax rate for individuals making more than \$1 million a year to 
ensure that they pay at least the same percentage of their earnings as other taxpayers, according to administration officials." \
localhost/wiktionary/firstsense.php

Text only

curl -d nif=true -d input-type=text --data-urlencode  prefix="http://example.com/example.txt#" \
--data-urlencode  input="President Obama on Monday will call for a new minimum tax rate for individuals making more than \$1 million a year to 
ensure that they pay at least the same percentage of their earnings as other taxpayers, according to administration officials." \
localhost/wiktionary/firstsense.php
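
The same request can be prepared from a script. The sketch below uses only the Python standard library and the form fields shown in the curl calls (reset_db, nif, input-type, prefix, input). The http:// scheme in SERVICE_URL and the function name build_request are assumptions. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the service is reachable over plain HTTP on localhost.
SERVICE_URL = "http://localhost/wiktionary/firstsense.php"


def build_request(text, prefix, reset_db=False, url=SERVICE_URL):
    """Build a form-encoded POST request for plain-text input."""
    fields = [
        ("nif", "true"),
        ("input-type", "text"),
        ("prefix", prefix),
        ("input", text),
    ]
    if reset_db:
        fields.insert(0, ("reset_db", "true"))
    # urlencode percent-encodes characters such as "$" and "#".
    body = urlencode(fields).encode("utf-8")
    # Supplying data makes urllib use POST instead of GET.
    return Request(url, data=body)


if __name__ == "__main__":
    req = build_request("more than $1 million",
                        "http://example.com/example.txt#")
    print(req.get_method())
    print(req.data.decode("ascii"))
```

Passing the request to urllib.request.urlopen() would send it. Because the text is a Python string, the "$1" in the input is not subject to shell expansion.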

NIF (text in RDF only)

The NIF web service parses the RDF and looks for the anchorOf property of all "Word" nodes.

curl -d nif=true -d input-type=rdfxml --data-urlencode  prefix="http://example.com/example.txt#" \
--data-urlencode  input="$rdfdata" \
localhost/wiktionary/firstsense.php

Example input

<http://example.com/example.txt#char=0,> 
        "0"      ;
         "243"    ;
      """President Obama on Monday will call for a new minimum tax rate for individuals making more than $1 million a year to 
ensure that they pay at least the same percentage of their earnings as other taxpayers, according to administration officials."""  ;
      <http://example.com/example.txt> .
    rdf:type     ;
    rdf:type     .  
<http://example.com/example.txt#char=0,9> 
        "0"      ;
         "9"    ;
      """President""" ;
    rdf:type      .
 
<http://example.com/example.txt#char=10,5> 
        "10"      ;
         "5"    ;
      """Obama""" ;
    rdf:type      .
<http://example.com/example.txt#char=16,2> 
        "16"      ;
         "2"    ;
      """on""" ;
    rdf:type      .
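
In the word URIs above, the fragment appears to encode the start offset and the length of each word (char=10,5 for "Obama"). The following sketch derives such URIs from a text. It assumes plain whitespace tokenization, which is not necessarily what the service or its clients use (punctuation stays attached to the word), and the function name word_uris is hypothetical.

```python
import re


def word_uris(text, prefix):
    """Yield (uri, begin, length, word) for each whitespace-delimited token.

    Assumption: the fragment has the form char=<begin>,<length>, as the
    example values on this page suggest.
    """
    for match in re.finditer(r"\S+", text):
        begin = match.start()
        length = match.end() - begin
        yield (f"{prefix}char={begin},{length}", begin, length, match.group())


if __name__ == "__main__":
    text = "President Obama on Monday"
    for uri, begin, length, word in word_uris(
            text, "http://example.com/example.txt#"):
        print(uri, word)
```

For the first three words this gives char=0,9, char=10,5 and char=16,2, matching the example input.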

Example output

  1. All triples which are displayed in example input NIF should be included in the output
  2. Below are only the triples that are added
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
<http://example.com/example.txt#char=0,9> 
    itsrdf:disambigIdentRef  <http://wiktionary.dbpedia.org/resource/President-English-Noun-1en> . 
<http://example.com/example.txt#char=10,5> 
    itsrdf:disambigIdentRef  <http://wiktionary.dbpedia.org/resource/Obama-English-Noun-1en> .
<http://example.com/example.txt#char=16,2> 
    itsrdf:disambigIdentRef  <http://wiktionary.dbpedia.org/resource/on-English-Noun-1en> .

If more than one entry is found, only one is used for "itsrdf:disambigIdentRef"; the others should use "ua:candidate".

@prefix ua: <http://nlp2rdf.lod2.eu/schema/unity/unifiedannotation.ttl#> .
<http://example.com/example.txt#char=16,2> 
    itsrdf:disambigIdentRef  <http://wiktionary.dbpedia.org/resource/on-English-Noun-1en> ;
    ua:candidate             <http://wiktionary.dbpedia.org/resource/on-French-Pronoun-3fr> ;
    ua:candidate             <http://wiktionary.dbpedia.org/resource/on-German-Abbreviation-1de> ;
    ua:candidate             <http://wiktionary.dbpedia.org/resource/on-English-Preposition-6fr> ;
    ua:candidate             <http://wiktionary.dbpedia.org/resource/on-Walloon-Pronoun-1fr> ;
    ua:candidate             <http://wiktionary.dbpedia.org/resource/on-Turkmen-Adjective-1fr> .
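
The rule for several entries can be sketched as follows. The page does not say how the service decides which entry becomes the "itsrdf:disambigIdentRef"; this sketch simply takes the first element of the list, which is an assumption, and the function name annotate is hypothetical.

```python
# Prefix declarations needed by the generated triples.
PREFIXES = (
    "@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .\n"
    "@prefix ua: <http://nlp2rdf.lod2.eu/schema/unity/unifiedannotation.ttl#> .\n"
)


def annotate(word_uri, sense_uris):
    """Return Turtle linking word_uri to its senses.

    Assumption: the first sense is the primary one; all remaining
    senses are emitted as ua:candidate.
    """
    if not sense_uris:
        return ""
    pairs = [("itsrdf:disambigIdentRef", sense_uris[0])]
    pairs += [("ua:candidate", uri) for uri in sense_uris[1:]]
    body = " ;\n".join(f"    {pred}  <{obj}>" for pred, obj in pairs)
    return f"<{word_uri}>\n{body} .\n"


if __name__ == "__main__":
    base = "http://wiktionary.dbpedia.org/resource/"
    print(PREFIXES + annotate(
        "http://example.com/example.txt#char=16,2",
        [base + "on-English-Noun-1en", base + "on-French-Pronoun-3fr"],
    ))
```

A word without any matching entry yields an empty string, so no triple is added for it.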
Retrieved from "http://wiki.nlp2rdf.org/index.php?title=WiktionaryFirstSense&oldid=679"