forked from swfarnsworth/n2c2_2018_task2_significance
NLPatVCU/n2c2_2018_task2_significance
Scripts to generate significance testing results for n2c2 2018 task 2,
focused on adverse drug event extraction.
-----------------------------------------------------------------------------
These scripts calculate the statistical significance of differences between
systems for concept extraction and relation extraction using approximate
randomization. Every run in a folder is compared against every other run to
create a significance matrix. Because approximate randomization can take a long
time, the comparisons are run in parallel and the results can be merged upon
completion.
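For readers unfamiliar with approximate randomization, the idea can be sketched
in Python as follows. This is a minimal illustration and not the code in art.py:
the function names, the per-document (tp, fp, fn) input format, and the choice
to swap whole documents are all assumptions made for the example. It tests only
micro-averaged F1; precision and recall would follow the same pattern.

```python
import random


def micro_f1(counts):
    """Micro-averaged F1 from (tp, fp, fn) tuples, one tuple per document."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def approximate_randomization(counts_a, counts_b, trials=1000, seed=0):
    """Estimate a p-value for the difference in micro F1 between two systems.

    counts_a and counts_b are aligned lists (same documents, same order).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    observed = abs(micro_f1(counts_a) - micro_f1(counts_b))
    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(counts_a, counts_b):
            # Under the null hypothesis the system labels are exchangeable,
            # so swap the two outputs for this document with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(micro_f1(shuffled_a) - micro_f1(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)
```

A small p-value suggests the observed difference is unlikely to arise from
random reassignment of outputs between the two systems.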
A brief overview of the run process:
1) createSignificanceTestFiles.py - this script compares team 1 (or run, etc.)
to team 2 and to the gold standard to create the files used in approximate
randomization. These files contain the samples that fall into each of six
groups: produced by team 1 and team 2 and in gold; produced by team 1 and
team 2 and not in gold; team 1 in gold but not team 2; team 2 in gold but not
team 1; team 1 only; and team 2 only. Concepts are matched by lenient span
matching (the code can be modified to use strict matching).
2) Using the created files, approximate randomization is run (art.py), which
calculates the significance of the differences in micro-averaged F1, precision,
and recall between teams 1 and 2.
3) The significance results are output as a nohup file (redirected standard
out). They can be read manually, or the results for all teams can be merged into
a matrix using the collectResults.pl script.
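The grouping in step 1 can be illustrated with a short Python sketch. It is not
the code in createSignificanceTestFiles.py: the (start, end, label) tuple
format, the function names, the bucket names, and the reading of "lenient" as
"same label and overlapping character spans" are assumptions made for this
example.

```python
def overlaps(a, b):
    """Lenient match: same label and the two character spans overlap."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def matches_any(span, others):
    """True if span leniently matches at least one span in others."""
    return any(overlaps(span, other) for other in others)


def partition(gold, team1, team2):
    """Split annotations into six groups (hypothetical helper)."""
    buckets = {
        "both_gold": [],        # in gold, found by team 1 and team 2
        "both_not_gold": [],    # produced by both teams, not in gold
        "team1_gold_only": [],  # in gold, found by team 1 but not team 2
        "team2_gold_only": [],  # in gold, found by team 2 but not team 1
        "team1_only": [],       # produced by team 1 alone, not in gold
        "team2_only": [],       # produced by team 2 alone, not in gold
    }
    for g in gold:
        in1, in2 = matches_any(g, team1), matches_any(g, team2)
        if in1 and in2:
            buckets["both_gold"].append(g)
        elif in1:
            buckets["team1_gold_only"].append(g)
        elif in2:
            buckets["team2_gold_only"].append(g)
    # Predictions that match nothing in gold.
    spurious1 = [p for p in team1 if not matches_any(p, gold)]
    spurious2 = [p for p in team2 if not matches_any(p, gold)]
    for p in spurious1:
        key = "both_not_gold" if matches_any(p, spurious2) else "team1_only"
        buckets[key].append(p)
    for p in spurious2:
        # Shared spurious spans were already recorded from team 1's side.
        if not matches_any(p, spurious1):
            buckets["team2_only"].append(p)
    return buckets
```

Switching to strict matching in this sketch would mean replacing the overlap
test with an exact comparison of start and end offsets.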
-----------------------------------------------------------------------------
To Run:
perl runForAll_parallel.pl
<wait for everything to finish>
perl collectResults.pl
Example results of collectResults.pl (output to significanceMatrix_demo):

          teamA          teamB
  teamA   0.5,0.5,0.5    0.8,0.8,0.8
  teamB   0.2,0.8,0.2    0.5,0.5,0.5

          teamA          teamB
  teamA
  teamB

Each cell of the first matrix holds the significance values for F1, precision,
and recall, in that order. Here the results of the teamA to teamB significance
tests are 0.8, 0.8, and 0.8. None of these values indicates a significant
difference, so the second matrix is empty.
-----------------------------------------------------------------------------
Known Issues:
In collectResults.pl, where results are collected into a significance matrix,
a number written in scientific notation (e.g. 1e-7) is read as 0.
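One way to recognize such values is a number pattern that allows an optional
exponent. The Python sketch below only illustrates the pattern; it is not a
patch for collectResults.pl, and the function name and the assumption that the
input is a bare comma-separated triple are hypothetical. A similar regular
expression could be adapted for the Perl script.

```python
import re

# Integer or decimal mantissa, followed by an optional exponent such as e-7.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def parse_p_values(cell):
    """Return every number in a cell such as '0.8,1e-7,0.05' as a float."""
    return [float(match) for match in NUMBER.findall(cell)]
```

With this pattern, 1e-7 is kept as a small positive value instead of being
truncated to 0 at the letter e.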
-----------------------------------------------------------------------------
Author - Sam Henry, 2019
Contact Me: henryst@vcu.edu
The approximate randomization scripts were modified from code by Alexander Yeh,
and createSignificanceTestFiles.py was modified from the n2c2 2018 task 2
evaluation script by ??.
-----------------------------------------------------------------------------
SOFTWARE COPYRIGHT AND LICENSE
Copyright (C) 2019 Sam Henry
This suite of programs is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 2 of the License, or (at
your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
Note: The text of the GNU General Public License is provided in the file
'GPL.txt' that you should have received with this distribution.