{"id":147,"date":"2015-07-22T15:32:36","date_gmt":"2015-07-22T13:32:36","guid":{"rendered":"http:\/\/spreadsheets.ist.tugraz.at\/?page_id=147"},"modified":"2016-05-10T08:44:21","modified_gmt":"2016-05-10T06:44:21","slug":"corpora-comparison","status":"publish","type":"page","link":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/corpora-for-benchmarking\/corpora-comparison\/","title":{"rendered":"Corpora Comparison"},"content":{"rendered":"<p><script src=\"https:\/\/cdn.mathjax.org\/mathjax\/latest\/MathJax.js?config=TeX-AMS-MML_HTMLorMML\" type=\"text\/javascript\">\/\/ <![CDATA[\n\n\/\/ ]]><\/script><\/p>\n<p>The following table provides a rough comparison of the corpora. The modified <em>EUSES<\/em> corpus stands out for the diversity of its spreadsheets and its high numbers of base spreadsheets and faulty versions. The <em>Payroll\/Gradebook<\/em> corpus is the only corpus that comes with authentic testing decisions, i.e., testing decisions provided by users. All other corpora have artificially created testing decisions. The main advantage of the <em>Info1<\/em> corpus is that it comes with real faults. 
The <em>Integer<\/em> corpus is useful for evaluating tools that cannot handle floating-point numbers.<\/p>\n<table>\n<tbody>\n<tr>\n<th>Corpus<\/th>\n<th style=\"text-align: center;\">EUSES<\/th>\n<th style=\"text-align: center;\">Info1<\/th>\n<th style=\"text-align: center;\">Integer<\/th>\n<th style=\"text-align: center;\">P\/G*<\/th>\n<\/tr>\n<tr>\n<th>Number of base spreadsheets<\/th>\n<td style=\"text-align: right;\"><span style=\"color: #08bf63;\"><strong>184<\/strong><\/span><\/td>\n<td style=\"text-align: right;\"><span style=\"color: #ff0000;\"><strong>2<\/strong><\/span><\/td>\n<td style=\"text-align: right;\">33<\/td>\n<td style=\"text-align: right;\"><span style=\"color: #ff0000;\"><strong>2<\/strong><\/span><\/td>\n<\/tr>\n<tr>\n<th>Number of faulty versions<\/th>\n<td style=\"text-align: right;\"><span style=\"color: #08bf63;\"><strong>576<\/strong><\/span><\/td>\n<td style=\"text-align: right;\">119<\/td>\n<td style=\"text-align: right;\">231<\/td>\n<td style=\"text-align: right;\">349<\/td>\n<\/tr>\n<tr>\n<th>Spreadsheet diversity<\/th>\n<td style=\"text-align: center;\"><span style=\"color: #08bf63;\"><strong>large<\/strong><\/span><\/td>\n<td style=\"text-align: center;\"><strong><span style=\"color: #ff0000;\">small<\/span><\/strong><\/td>\n<td style=\"text-align: center;\">medium<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #ff0000;\"><strong>small<\/strong><\/span><\/td>\n<\/tr>\n<tr>\n<th>Spreadsheet origin<\/th>\n<td style=\"text-align: center;\"><span style=\"color: #08bf63;\"><strong>authentic<\/strong><\/span><\/td>\n<td style=\"text-align: center;\">exercise<\/td>\n<td style=\"text-align: center;\">mixed<\/td>\n<td style=\"text-align: center;\">laboratory<\/td>\n<\/tr>\n<tr>\n<th>Fault origin<\/th>\n<td style=\"text-align: center;\">injected<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #08bf63;\"><strong>real<\/strong><\/span><\/td>\n<td style=\"text-align: center;\">injected<\/td>\n<td 
style=\"text-align: center;\">injected<\/td>\n<\/tr>\n<tr>\n<th>Fault complexity<\/th>\n<td style=\"text-align: center;\"><strong><span style=\"color: #ff0000;\">single**<\/span><\/strong><\/td>\n<td style=\"text-align: center;\">multiple<\/td>\n<td style=\"text-align: center;\">multiple<\/td>\n<td style=\"text-align: center;\">multiple<\/td>\n<\/tr>\n<tr>\n<th>Testing decisions origin<\/th>\n<td style=\"text-align: center;\">artificial<\/td>\n<td style=\"text-align: center;\">artificial<\/td>\n<td style=\"text-align: center;\">artificial<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #08bf63;\"><strong>user-provided<\/strong><\/span><\/td>\n<\/tr>\n<tr>\n<th>Testing decisions quality<\/th>\n<td style=\"text-align: center;\">always correct<\/td>\n<td style=\"text-align: center;\">always correct<\/td>\n<td style=\"text-align: center;\">always correct<\/td>\n<td style=\"text-align: center;\">wrong classifications possible<\/td>\n<\/tr>\n<tr>\n<th>Testing decision area<\/th>\n<td style=\"text-align: center;\">result cells<\/td>\n<td style=\"text-align: center;\">formula cells<\/td>\n<td style=\"text-align: center;\">result cells<\/td>\n<td style=\"text-align: center;\">arbitrary cells<\/td>\n<\/tr>\n<tr>\n<th>Domain<\/th>\n<td style=\"text-align: center;\">Real<\/td>\n<td style=\"text-align: center;\">Real<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #08bf63;\"><strong>Integer<\/strong><\/span><\/td>\n<td style=\"text-align: center;\">Real<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>*P\/G = Payroll\/Gradebook<br \/>\n** An additional version of the EUSES corpus with double and triple faults has been added.<\/p>\n<p>The following table provides a quantitative comparison of three of the corpora. The numbers of input and formula cells are good indicators of spreadsheet size. While the <em>Payroll\/Gradebook<\/em> spreadsheets are very small, the <em>EUSES<\/em> spreadsheets have between 6 and more 
than 10,000 formula cells. Even the smallest <em>Info1<\/em> spreadsheet has 501 formula cells.<br \/>\nA high percentage of copied formulas indicates that grouping techniques, which treat similar cells as a unit, are well suited to a corpus. Both <em>EUSES<\/em> and <em>Info1<\/em> have high percentages of copied formulas. The number of <em>IF<\/em>s is a rough indicator for the success of dynamic techniques, in which the concrete evaluation of conditions is important. <em>Info1<\/em> contains the largest number of <em>IF<\/em> statements.<br \/>\nThe average number of operators per formula cell indicates the complexity of the spreadsheet. The high number of provided testing decisions in the <em>EUSES<\/em> corpus originates from the fact that they are automatically generated by comparing the result cells of a faulty spreadsheet with those of the correct spreadsheet. A user would never provide such a high number of testing decisions.<\/p>\n<table>\n<tbody>\n<tr>\n<th colspan=\"2\">Feature<\/th>\n<th style=\"text-align: center;\">EUSES<\/th>\n<th style=\"text-align: center;\">Info1<\/th>\n<p><!--\n\n\n<th style=\"text-align: center;\">Integer<\/th>\n\n\n--><\/p>\n<th style=\"text-align: center;\">P\/G*<\/th>\n<\/tr>\n<tr>\n<th rowspan=\"6\">Number of formula cells<\/th>\n<td>Min<\/td>\n<td style=\"text-align: right;\">6<\/td>\n<td style=\"text-align: right;\"><strong><span style=\"color: #08bf63;\">501<\/span><\/strong><\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">10<\/td>\n<\/tr>\n<tr>\n<td>Q1<\/td>\n<td style=\"text-align: right;\">40<\/td>\n<td style=\"text-align: right;\">580<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">12<\/td>\n<\/tr>\n<tr>\n<td>Median<\/td>\n<td style=\"text-align: right;\">111.5<\/td>\n<td style=\"text-align: right;\"><strong><span style=\"color: 
#08bf63;\">2131<\/span><\/strong><\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">18<\/td>\n<\/tr>\n<tr>\n<td>Q3<\/td>\n<td style=\"text-align: right;\">305.25<\/td>\n<td style=\"text-align: right;\">2245.5<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">19<\/td>\n<\/tr>\n<tr>\n<td>Max<\/td>\n<td style=\"text-align: right;\">10316<\/td>\n<td style=\"text-align: right;\">3157<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">19<\/td>\n<\/tr>\n<tr>\n<td>Avg<\/td>\n<td style=\"text-align: right;\">353.95<\/td>\n<td style=\"text-align: right;\"><strong><span style=\"color: #08bf63;\">1466.22<\/span><\/strong><\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">15.03<\/td>\n<\/tr>\n<tr>\n<th rowspan=\"6\">Number of input cells<\/th>\n<td>Min<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<td style=\"text-align: right;\">10<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">5<\/td>\n<\/tr>\n<tr>\n<td>Q1<\/td>\n<td style=\"text-align: right;\">45<\/td>\n<td style=\"text-align: right;\">13<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">5<\/td>\n<\/tr>\n<tr>\n<td>Median<\/td>\n<td style=\"text-align: right;\">129<\/td>\n<td style=\"text-align: right;\">21<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">6<\/td>\n<\/tr>\n<tr>\n<td>Q3<\/td>\n<td style=\"text-align: right;\">430<\/td>\n<td style=\"text-align: right;\">60<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">9<\/td>\n<\/tr>\n<tr>\n<td>Max<\/td>\n<td style=\"text-align: right;\">24067<\/td>\n<td style=\"text-align: 
right;\">733<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">9<\/td>\n<\/tr>\n<tr>\n<td>Avg<\/td>\n<td style=\"text-align: right;\">601.54<\/td>\n<td style=\"text-align: right;\">90.67<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">7.17<\/td>\n<\/tr>\n<tr>\n<th rowspan=\"6\">Number of unique formulas**<\/th>\n<td>Min<\/td>\n<td style=\"text-align: right;\">2<\/td>\n<td style=\"text-align: right;\">13<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">10<\/td>\n<\/tr>\n<tr>\n<td>Q1<\/td>\n<td style=\"text-align: right;\">6<\/td>\n<td style=\"text-align: right;\">21<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">12<\/td>\n<\/tr>\n<tr>\n<td>Median<\/td>\n<td style=\"text-align: right;\">11<\/td>\n<td style=\"text-align: right;\">23<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">18<\/td>\n<\/tr>\n<tr>\n<td>Q3<\/td>\n<td style=\"text-align: right;\">26<\/td>\n<td style=\"text-align: right;\">28<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">19<\/td>\n<\/tr>\n<tr>\n<td>Max<\/td>\n<td style=\"text-align: right;\">895<\/td>\n<td style=\"text-align: right;\">90<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">19<\/td>\n<\/tr>\n<tr>\n<td>Avg<\/td>\n<td style=\"text-align: right;\">24.19<\/td>\n<td style=\"text-align: right;\">25.91<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">15.02<\/td>\n<\/tr>\n<tr>\n<th rowspan=\"6\">% Copied formulas***<\/th>\n<td>Min<\/td>\n<td style=\"text-align: right;\">0<\/td>\n<td style=\"text-align: right;\">84<\/td>\n<p><!--\n\n\n<td style=\"text-align: 
right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">0<\/td>\n<\/tr>\n<tr>\n<td>Q1<\/td>\n<td style=\"text-align: right;\">75<\/td>\n<td style=\"text-align: right;\">96<\/td>\n<td style=\"text-align: right;\">0<\/td>\n<\/tr>\n<tr>\n<td>Median<\/td>\n<td style=\"text-align: right;\">89<\/td>\n<td style=\"text-align: right;\">98<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">0<\/td>\n<\/tr>\n<tr>\n<td>Q3<\/td>\n<td style=\"text-align: right;\">95<\/td>\n<td style=\"text-align: right;\">99<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">0<\/td>\n<\/tr>\n<tr>\n<td>Max<\/td>\n<td style=\"text-align: right;\">100<\/td>\n<td style=\"text-align: right;\">99<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">5<\/td>\n<\/tr>\n<tr>\n<td>Avg<\/td>\n<td style=\"text-align: right;\">82<\/td>\n<td style=\"text-align: right;\"><span style=\"color: #08bf63;\"><strong>97<\/strong><\/span><\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\"><strong><span style=\"color: #ff0000;\">0<\/span><\/strong><\/td>\n<\/tr>\n<tr>\n<th colspan=\"2\">Number of spreadsheets with <em>IF<\/em>s<\/th>\n<td style=\"text-align: right;\">109<\/td>\n<td style=\"text-align: right;\">112<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">349<\/td>\n<\/tr>\n<tr>\n<th colspan=\"2\">% of spreadsheets with <em>IF<\/em>s<\/th>\n<td style=\"text-align: right;\"><strong><span style=\"color: #ff0000;\">19<\/span><\/strong><\/td>\n<td style=\"text-align: right;\">94<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\"><strong><span style=\"color: #08bf63;\">100<\/span><\/strong><\/td>\n<\/tr>\n<tr>\n<th rowspan=\"6\">Of those, number of<em> 
IF<\/em>s<\/th>\n<td>Min<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<td style=\"text-align: right;\">142<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">7<\/td>\n<\/tr>\n<tr>\n<td>Q1<\/td>\n<td style=\"text-align: right;\">22<\/td>\n<td style=\"text-align: right;\">143<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">7<\/td>\n<\/tr>\n<tr>\n<td>Median<\/td>\n<td style=\"text-align: right;\">54<\/td>\n<td style=\"text-align: right;\">532.5<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">9<\/td>\n<\/tr>\n<tr>\n<td>Q3<\/td>\n<td style=\"text-align: right;\">136<\/td>\n<td style=\"text-align: right;\">1616<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">9<\/td>\n<\/tr>\n<tr>\n<td>Max<\/td>\n<td style=\"text-align: right;\">7839<\/td>\n<td style=\"text-align: right;\">3234<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">9<\/td>\n<\/tr>\n<tr>\n<td>Avg<\/td>\n<td style=\"text-align: right;\">371.44<\/td>\n<td style=\"text-align: right;\"><span style=\"color: #08bf63;\"><strong>1023.28<\/strong><\/span><\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">8.01<\/td>\n<\/tr>\n<tr>\n<th rowspan=\"6\">Average number of operators per formula cell<\/th>\n<td>Min<\/td>\n<td style=\"text-align: right;\">0.34<\/td>\n<td style=\"text-align: right;\">2.54<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">2<\/td>\n<\/tr>\n<tr>\n<td>Q1<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<td style=\"text-align: right;\">3.33<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: 
right;\">2.21<\/td>\n<\/tr>\n<tr>\n<td>Median<\/td>\n<td style=\"text-align: right;\">1.48<\/td>\n<td style=\"text-align: right;\">3.87<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">2.33<\/td>\n<\/tr>\n<tr>\n<td>Q3<\/td>\n<td style=\"text-align: right;\">2<\/td>\n<td style=\"text-align: right;\">5.92<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">3.17<\/td>\n<\/tr>\n<tr>\n<td>Max<\/td>\n<td style=\"text-align: right;\">17<\/td>\n<td style=\"text-align: right;\">9.25<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">3.9<\/td>\n<\/tr>\n<tr>\n<td>Avg<\/td>\n<td style=\"text-align: right;\">1.89<\/td>\n<td style=\"text-align: right;\"><strong><span style=\"color: #08bf63;\">4.47<\/span><\/strong><\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">2.72<\/td>\n<\/tr>\n<tr>\n<th rowspan=\"6\">Number of positive testing decisions per test set<\/th>\n<td>Min<\/td>\n<td style=\"text-align: right;\">0<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">0<\/td>\n<\/tr>\n<tr>\n<td>Q1<\/td>\n<td style=\"text-align: right;\">9<\/td>\n<td style=\"text-align: right;\">6<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">2<\/td>\n<\/tr>\n<tr>\n<td>Median<\/td>\n<td style=\"text-align: right;\"><strong><span style=\"color: #ff0000;\">28<\/span><\/strong><\/td>\n<td style=\"text-align: right;\">8<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">4<\/td>\n<\/tr>\n<tr>\n<td>Q3<\/td>\n<td style=\"text-align: right;\"><span style=\"color: #ff0000;\"><strong>92<\/strong><\/span><\/td>\n<td style=\"text-align: 
right;\">10<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">7<\/td>\n<\/tr>\n<tr>\n<td>Max<\/td>\n<td style=\"text-align: right;\"><strong><span style=\"color: #ff0000;\">2962<\/span><\/strong><\/td>\n<td style=\"text-align: right;\">17<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">17<\/td>\n<\/tr>\n<tr>\n<td>Avg<\/td>\n<td style=\"text-align: right;\">79.81<\/td>\n<td style=\"text-align: right;\">8.08<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">4.76<\/td>\n<\/tr>\n<tr>\n<th rowspan=\"6\">Number of negative testing decisions per test set<\/th>\n<td>Min<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">1<\/td>\n<\/tr>\n<tr>\n<td>Q1<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<td style=\"text-align: right;\">2<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">1<\/td>\n<\/tr>\n<tr>\n<td>Median<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<td style=\"text-align: right;\">3<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">1<\/td>\n<\/tr>\n<tr>\n<td>Q3<\/td>\n<td style=\"text-align: right;\">2<\/td>\n<td style=\"text-align: right;\">4<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">2<\/td>\n<\/tr>\n<tr>\n<td>Max<\/td>\n<td style=\"text-align: right;\"><strong><span style=\"color: #ff0000;\">72<\/span><\/strong><\/td>\n<td style=\"text-align: right;\">10<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">14<\/td>\n<\/tr>\n<tr>\n<td>Avg<\/td>\n<td style=\"text-align: right;\">2.86<\/td>\n<td 
style=\"text-align: right;\">3.18<\/td>\n<p><!--\n\n\n<td style=\"text-align: right;\">?<\/td>\n\n\n--><\/p>\n<td style=\"text-align: right;\">1.77<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>*P\/G = Payroll\/Gradebook<\/p>\n<p>** The set of unique formulas is a subset of all formula cells \\(C_{unique}\\subseteq C_{formula}\\) such that \\( \\forall c, c' \\in C_{unique}, c \\neq c': \\ell(c)\\neq \\ell(c') \\), where \\(\\ell(c)\\) is the formula in R1C1 notation of cell \\(c\\).<\/p>\n<p>*** Percentage of copied formulas: \\( P_C=\\left(1-\\frac{|C_{unique}|}{|C_{formula}|}\\right)\\cdot 100 \\)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The following table provides a rough comparison of the corpora. The modified EUSES corpus stands out for the diversity of its spreadsheets and its high numbers of base spreadsheets and faulty versions. The Payroll\/Gradebook corpus is the only corpus that comes with authentic testing decisions, i.e., testing decisions provided by users. All other corpora &hellip; <a href=\"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/corpora-for-benchmarking\/corpora-comparison\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Corpora Comparison<\/span> <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":5,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-147","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages\/147","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/comments?post=147"}],"version-history":[{"count":56,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages\/147\/revisions"}],"predecessor-version":[{"id":226,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages\/147\/revisions\/226"}],"up":[{"embeddable":true,"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/pages\/5"}],"wp:attachment":[{"href":"https:\/\/spreadsheets.sai.tugraz.at\/index.php\/wp-json\/wp\/v2\/media?parent=147"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}