Spider is a dataset for evaluating natural-language-to-SQL parsers. The official site is:
https://yale-lily.github.io/spider
To run the evaluation locally, you first need to do the setup originally described at:
https://github.com/taoyds/test-suite-sql-eval
Environment preparation:
Choose the base directory (written as <alguma coisa> below) where the directory structure for the parsers will be created, then run:
mkdir nl2sql
cd nl2sql
mkdir spider-eval
cd spider-eval
git clone https://github.com/taoyds/test-suite-sql-eval
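If you would rather script the directory layout than type the mkdir commands, a minimal Python sketch of the same structure could look like this. The helper name make_workspace and its base_dir parameter are illustrative assumptions, not part of the original setup, and the clone step is still done with git as shown above.

```python
from pathlib import Path


def make_workspace(base_dir):
    """Create <base_dir>/nl2sql/spider-eval and return its path.

    make_workspace is a hypothetical helper that mirrors the mkdir
    commands from the post.
    """
    workspace = Path(base_dir) / "nl2sql" / "spider-eval"
    # parents=True also creates nl2sql; exist_ok=True makes reruns harmless.
    workspace.mkdir(parents=True, exist_ok=True)
    return workspace
```

Calling make_workspace with your chosen base directory returns the spider-eval path, which is where the repository gets cloned.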
Create the conda environment:
conda create --name spider-eval python=3.7
conda activate spider-eval
conda install pytorch=1.5 cudatoolkit=10.2 -c pytorch
Download the dataset
cd <alguma coisa>/nl2sql/spider-eval/test-suite-sql-eval
pip install gdown
gdown --id 1mkCx2GOFIqNesD4y8TDAO1yX1QZORP5w
unzip testsuitedatabases.zip
rm -r __MACOSX  # removes a directory that will not be used
rm testsuitedatabases.zip  # removes the large archive, which is no longer needed
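The unzip and cleanup steps can also be done in one pass from Python. The sketch below is only an illustration under stated assumptions: extract_without_macosx is a hypothetical helper name, and it assumes the unwanted entries are stored under a top-level __MACOSX/ folder inside the archive, as the rm command above suggests.

```python
import zipfile
from pathlib import Path


def extract_without_macosx(zip_path, dest_dir, delete_archive=False):
    """Extract zip_path into dest_dir, skipping __MACOSX metadata entries.

    Hypothetical helper: it stands in for the unzip and rm steps above.
    Returns the list of archive member names that were extracted.
    """
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        # Keep every member except those stored under the __MACOSX folder.
        wanted = [name for name in archive.namelist()
                  if not name.startswith("__MACOSX/")]
        archive.extractall(dest_dir, members=wanted)
    if delete_archive:
        # Same effect as "rm testsuitedatabases.zip".
        zip_path.unlink()
    return wanted
```

For the download above, the call would be extract_without_macosx("testsuitedatabases.zip", ".", delete_archive=True), run from inside the test-suite-sql-eval directory.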
Install sqlparse and nltk to run the evaluation:
pip3 install sqlparse
pip3 install nltk
conda install jupyter notebook  # my own addition
conda install -c conda-forge jupyter_contrib_nbextensions  # my own addition
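Before running the evaluation, it can help to confirm that the packages installed above are visible to the active environment. The following is a small sketch using only the Python standard library; the function name missing_modules is an illustrative assumption.

```python
import importlib.util


def missing_modules(names):
    """Return the names in `names` that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["sqlparse", "nltk"])
    print("missing:", missing if missing else "none")
```

If anything is reported as missing, the spider-eval environment is probably not the active one, or the pip3 install step ran against a different Python.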
python3 evaluation.py --gold=evaluation_examples/gold.txt --pred=evaluation_examples/predict.txt --db=database --etype exec --plug_value
arguments:
[gold file] gold file where each line is `a gold SQL \t db_id` for Spider, SParC, and CoSQL, and interactions are separated by one empty line for SParC and CoSQL. See an example at evaluation_examples/gold.txt
[predicted file] predicted SQL file where each line is a predicted SQL, and interactions are separated by one empty line. See an example at evaluation_examples/predict.txt
[database dir] the directory that contains all the databases and test suites
[table file] table.json file which includes foreign key info of each database.
[evaluation type] "exec" for test suite accuracy (default), "match" for the original exact set match accuracy, and "all" for both
--plug_value whether to plug in the gold value into the predicted query; suitable if your model does not predict values.
--keep_distinct whether to keep the distinct keyword during evaluation. Default is false.
--progress_bar_for_each_datapoint whether to print a progress bar of running test inputs for each datapoint
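To make the file format and the arguments above concrete, here is a hedged Python sketch that writes a single-turn (Spider-style) gold file and assembles the evaluation command as an argument list. The helper names write_gold_file and build_eval_command are hypothetical. The --table flag name is an assumption, since the post lists the table file but does not show its flag; check the repository before relying on it. SParC and CoSQL interactions, which need blank-line separators, are not handled here.

```python
def write_gold_file(path, examples):
    """Write (sql, db_id) pairs, one per line, as tab-separated text."""
    with open(path, "w", encoding="utf-8") as f:
        for sql, db_id in examples:
            # Each line is: gold SQL, a tab character, then the database id.
            f.write(f"{sql}\t{db_id}\n")


def build_eval_command(gold, pred, db, etype="exec", table=None,
                       plug_value=False, keep_distinct=False):
    """Assemble the evaluation.py command line as a list of arguments."""
    cmd = ["python3", "evaluation.py", "--gold", gold, "--pred", pred,
           "--db", db, "--etype", etype]
    if table is not None:
        # Assumption: the table file is passed with a --table flag.
        cmd += ["--table", table]
    if plug_value:
        cmd.append("--plug_value")
    if keep_distinct:
        cmd.append("--keep_distinct")
    return cmd
```

The returned list can be handed to subprocess.run(cmd, check=True) from inside the test-suite-sql-eval directory. Building a list rather than a single string avoids shell quoting problems when paths contain spaces.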