{"id":322,"date":"2017-04-04T12:12:00","date_gmt":"2017-04-04T10:12:00","guid":{"rendered":"http:\/\/spreadsheets.ist.tugraz.at\/?page_id=322"},"modified":"2017-10-20T09:18:08","modified_gmt":"2017-10-20T07:18:08","slug":"corpora-overview","status":"publish","type":"page","link":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/corpora-for-benchmarking\/corpora-overview\/","title":{"rendered":"Corpora Overview"},"content":{"rendered":"<p>This page serves as an overview (and link collection) of existing corpora and specific subsets of them (e.g., known faults, smell annotation, version history).<\/p>\n<h3>Enron<\/h3>\n<ul>\n<li>Original\u00a0<a href=\"http:\/\/www.felienne.com\/archives\/3634\">ENRON<\/a> corpus<\/li>\n<li><a href=\"http:\/\/sccpu2.cse.ust.hk\/venron\/\">VENRON<\/a>\u00a0(Enron enhanced with\u00a0version information)<\/li>\n<li><a href=\"http:\/\/ls13-www.cs.tu-dortmund.de\/homepage\/spreadsheets\/enron-errors.htm\">ENRON errors corpus<\/a>\u00a0(Enron spreadsheets containing faults, <a href=\"http:\/\/ieeexplore.ieee.org\/document\/7739679\/\">PDF<\/a>)<\/li>\n<li><a href=\"https:\/\/wwwdb.inf.tu-dresden.de\/misc\/DeExcelarator\/\">Subset of ENRON<\/a> enhanced with type annotations (Meta-data, headers, attributes, data, derived data)<\/li>\n<\/ul>\n<h3>EUSES<\/h3>\n<ul>\n<li><a href=\"http:\/\/eusesconsortium.org\/\">EUSES<\/a>\u00a0corpus<\/li>\n<li>Subset of <a href=\"http:\/\/sccpu2.cse.ust.hk\/custodes\/\">EUSES with annotated smells<\/a>\u00a0(CUSTODES)<\/li>\n<li><a href=\"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/corpora-for-benchmarking\/euses\/\">Modified EUSES<\/a> (Subset of EUSES with inserted faults and test verdicts)<\/li>\n<li><a href=\"https:\/\/wwwdb.inf.tu-dresden.de\/misc\/DeExcelarator\/\">Subset of EUSES<\/a>\u00a0enhanced with type annotations (Meta-data, headers, attributes, data, derived data)<\/li>\n<\/ul>\n<h3>FUSE<\/h3>\n<ul>\n<li><a href=\"http:\/\/static.barik.net\/fuse\/\">FUSE<\/a>\u00a0corpus<\/li>\n<li><a 
href=\"https:\/\/wwwdb.inf.tu-dresden.de\/misc\/DeExcelarator\/\">Subset of FUSE<\/a>\u00a0enhanced with type annotations (Meta-data, headers, attributes, data, derived data)<\/li>\n<\/ul>\n<h3>Payroll\/Gradebook<\/h3>\n<ul>\n<li>Original Forms3 spreadsheets with inserted faults and test verdicts in a log file (<a href=\"https:\/\/www.computer.org\/csdl\/trans\/ts\/2006\/04\/e0213-abs.html\">PDF<\/a>, authors send corpus on request)<\/li>\n<li><a href=\"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/corpora-for-benchmarking\/payrollgradebook-2\/\">Excel version<\/a><\/li>\n<\/ul>\n<h3>Info1<\/h3>\n<ul>\n<li><a href=\"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/corpora-for-benchmarking\/info1\/\">Corpus<\/a> with real faults and simulated test verdicts<\/li>\n<\/ul>\n<h3>Integer<\/h3>\n<ul>\n<li><a href=\"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/corpora-for-benchmarking\/integer-corpus\/\">Collection<\/a> of spreadsheets with inserted faults and test verdicts. All spreadsheets of this corpus comprise only integer values.<\/li>\n<\/ul>\n<h3>Hawaii Kooker<\/h3>\n<ul>\n<li>Collection of faulty spreadsheets created by undergraduate business students (<a href=\"https:\/\/arxiv.org\/ftp\/arxiv\/papers\/1009\/1009.2785.pdf\">PDF<\/a>, authors send corpus on request)<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This page serves as an overview (and link collection) of existing corpora and specific subsets of them (e.g., known faults, smell annotation, version history). 
Enron Original\u00a0ENRON corpus VENRON\u00a0(Enron enhanced with\u00a0version information) ENRON errors corpus\u00a0(Enron spreadsheets containing faults, PDF) Subset of ENRON enhanced with type annotations (Meta-data, headers, attributes, data, derived data) EUSES EUSES\u00a0corpus Subset of EUSES &hellip; <a href=\"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/corpora-for-benchmarking\/corpora-overview\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Corpora Overview<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":5,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-322","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages\/322","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/comments?post=322"}],"version-history":[{"count":12,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages\/322\/revisions"}],"predecessor-version":[{"id":364,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages\/322\/revisions\/364"}],"up":[{"embeddable":true,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages\/5"}],"wp:attachment":[{"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/media?parent=322"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}