Hardware and Software Research and Development: Spider Setup

Spider is a dataset for evaluating natural-language-to-SQL parsers. The official site is:

https://yale-lily.github.io/spider

To run the evaluation locally, you need to complete a setup that is originally described at:

https://github.com/taoyds/test-suite-sql-eval

The sequence of steps described here is meant to help people who want to build this environment and ran into some problem with the original instructions, which are very good. I believe I have added steps that can help anyone who is just starting to set up environments like this one.

Environment preparation:

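Choose the base directory where you will create the directory structure for the parsers (written as <alguma coisa> in the commands below, meaning any path of your choice), then run: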

mkdir nl2sql

cd nl2sql

mkdir spider-eval

cd spider-eval

git clone https://github.com/taoyds/test-suite-sql-eval
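For readers who prefer to script this step, the commands above can be sketched in Python roughly as follows. The function name `prepare_layout` and its `clone` parameter are illustrative assumptions, not part of the original instructions; the directory names and the repository URL are the ones used above.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/taoyds/test-suite-sql-eval"


def prepare_layout(base_dir, clone=False):
    """Create <base_dir>/nl2sql/spider-eval and return the path where the repo lives.

    Hypothetical helper: it mirrors the mkdir / cd / git clone commands above.
    """
    eval_dir = Path(base_dir) / "nl2sql" / "spider-eval"
    # parents=True covers both mkdir calls; exist_ok=True makes re-runs harmless.
    eval_dir.mkdir(parents=True, exist_ok=True)
    repo_dir = eval_dir / "test-suite-sql-eval"
    if clone and not repo_dir.exists():
        # Requires git on the PATH; raises CalledProcessError if the clone fails.
        subprocess.run(["git", "clone", REPO_URL, str(repo_dir)], check=True)
    return repo_dir
```

Calling `prepare_layout("/some/path", clone=True)` should leave you with the same layout as the manual commands; with `clone=False` it only creates the directories.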

Prepare the conda environment

conda create --name spider-eval python=3.7

conda activate spider-eval

conda install pytorch=1.5 cudatoolkit=10.2 -c pytorch

Download the dataset

cd <alguma coisa>/nl2sql/spider-eval/test-suite-sql-eval

pip install gdown

gdown --id 1mkCx2GOFIqNesD4y8TDAO1yX1QZORP5w

unzip testsuitedatabases.zip

rm -r __MACOSX   # removes a directory that will not be used

rm testsuitedatabases.zip   # removes a large file that is no longer needed
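The unzip and cleanup commands above can also be done from Python with the standard library, which may help on systems without `unzip`. This is only a sketch: the helper name `extract_databases` and its parameters are assumptions made for illustration.

```python
import shutil
import zipfile
from pathlib import Path


def extract_databases(zip_path, target_dir=".", remove_zip=True):
    """Extract the archive, drop the __MACOSX folder, optionally delete the zip.

    Hypothetical helper mirroring: unzip, rm -r __MACOSX, rm <zip>.
    Returns the sorted names of the entries left in target_dir.
    """
    zip_path = Path(zip_path)
    target_dir = Path(target_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    # ignore_errors=True: the archive may not contain a __MACOSX folder at all.
    shutil.rmtree(target_dir / "__MACOSX", ignore_errors=True)
    if remove_zip:
        zip_path.unlink()
    return sorted(p.name for p in target_dir.iterdir())
```

For example, `extract_databases("testsuitedatabases.zip")` run inside the cloned repository would extract in place and remove the archive afterwards.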

Install sqlparse and nltk to run the evaluation:

pip3 install sqlparse

pip3 install nltk

conda install jupyter notebook   # my addition, not in the original instructions

conda install -c conda-forge jupyter_contrib_nbextensions   # my addition, not in the original instructions
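Before running the test, it can be useful to confirm that the packages installed so far are visible from the active environment. The following is a minimal sketch using only the standard library; the function name `missing_modules` is an assumption, and the module names checked (`torch`, `sqlparse`, `nltk`) are the import names of the packages installed above.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Run this inside the activated spider-eval environment.
    print("Python", sys.version.split()[0])
    missing = missing_modules(["torch", "sqlparse", "nltk"])
    print("missing:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, the most likely cause is that the conda environment was not activated before running `pip3 install`.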

Test

python3 evaluation.py --gold=evaluation_examples/gold.txt --pred=evaluation_examples/predict.txt --db=database --etype exec --plug_value
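If you plan to run the evaluation repeatedly (for example from a notebook), the command above can be assembled programmatically. This is a sketch under the assumption that you run it from inside the test-suite-sql-eval directory; `build_eval_command` is a hypothetical helper, and it only uses flags that appear in this post.

```python
import subprocess
import sys


def build_eval_command(gold, pred, db, etype="exec", plug_value=False, keep_distinct=False):
    """Build the argument list for evaluation.py (hypothetical helper)."""
    if etype not in ("exec", "match", "all"):
        raise ValueError("etype must be 'exec', 'match' or 'all'")
    cmd = [
        sys.executable, "evaluation.py",
        f"--gold={gold}", f"--pred={pred}", f"--db={db}",
        "--etype", etype,
    ]
    if plug_value:
        cmd.append("--plug_value")
    if keep_distinct:
        cmd.append("--keep_distinct")
    return cmd


if __name__ == "__main__":
    command = build_eval_command(
        "evaluation_examples/gold.txt",
        "evaluation_examples/predict.txt",
        "database",
        plug_value=True,
    )
    print(" ".join(command))
    # Uncomment to actually run it from inside the cloned repository:
    # subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.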

arguments:

[gold file] gold file where each line is `a gold SQL \t db_id` for Spider, SParC, and CoSQL, and interactions are separated by one empty line for SParC and CoSQL. See an example at evaluation_examples/gold.txt

[predicted file] predicted sql file where each line is a predicted SQL, and interactions are separated by one empty line. See an example at evaluation_examples/predict.txt

[database dir] the directory that contains all the databases and test suites

[table file] table.json file which includes foreign key info of each database.

[evaluation type] "exec" for test suite accuracy (default), "match" for the original exact set match accuracy, and "all" for both

--plug_value whether to plug in the gold value into the predicted query; suitable if your model does not predict values.

--keep_distinct whether to keep distinct keyword during evaluation. default is false.

--progress_bar_for_each_datapoint whether to print progress bar of running test inputs for each datapoint
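To make the gold and predicted file formats described above concrete, here is a small sketch that produces text in those formats. The helper names and the example queries and database ids are made up for illustration; only the layout (one SQL per line, a tab before the db_id in the gold file, one empty line between interactions) comes from the argument descriptions.

```python
def _check_single_line(sql):
    # A newline or tab inside the SQL would break the line-based format.
    if "\n" in sql or "\t" in sql:
        raise ValueError("SQL must be on one line and contain no tab characters")
    return sql


def gold_lines(examples):
    """Spider-style gold text: one 'SQL<TAB>db_id' line per example."""
    return "".join(f"{_check_single_line(sql)}\t{db_id}\n" for sql, db_id in examples)


def gold_interactions(interactions):
    """SParC/CoSQL-style gold text: interactions separated by one empty line."""
    return "\n".join(gold_lines(turns) for turns in interactions)


def pred_lines(queries):
    """Predicted text: one predicted SQL per line."""
    return "".join(_check_single_line(q) + "\n" for q in queries)


if __name__ == "__main__":
    # Illustrative data only; 'demo_db' is a placeholder database id.
    print(gold_lines([("SELECT count(*) FROM t", "demo_db")]), end="")
    print(pred_lines(["SELECT count(*) FROM t"]), end="")
```

The resulting strings can be written to files and passed to the evaluation script through `--gold` and `--pred`.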

Wednesday, May 12, 2021
