Week 5, 2026 | Submission Ready

January 25 - February 1, 2026 | StickForStats JSS Manuscript Finalization & Scientific Integrity Audit

StickForStats JSS Manuscript Ready for Submission

41-Page Paper | Scientific Integrity Audit Complete | arXiv + JSS Packages Prepared

"StickForStats: A Statistical Analysis Platform with Automatic Assumption Validation"
Vishal Bharti & Debojyoti Chakraborty | CSIR-IGIB

Manuscript PDF

Download the final compiled manuscript (41 pages, compiled January 27, 2026):

Download JSS Manuscript PDF (614 KB)

Week Focus: Scientific Integrity Audit & Submission Preparation

Objective

This week's primary focus was conducting a rigorous scientific integrity audit of the JSS manuscript, removing all unverified claims, re-validating case studies with real datasets, and preparing complete submission packages for both the Journal of Statistical Software (JSS) and arXiv.

Scientific Integrity Audit (Critical)

A comprehensive scientific integrity audit was performed on the manuscript. Every numerical claim, every cross-validation reference, and every case study result was independently verified.

Unverified Claims Removed

The following previously unverified cross-validation references were removed from the manuscript or replaced with reproducible checks:

  • R 4.3.1 comparison — Claimed validation against R but no script existed. Replaced with actual R validation script (validate_against_R.R, 264 lines)
  • G*Power 3.1 comparison — Referenced without reproducible verification. Removed unverifiable precision claims
  • Mathematica comparison — High-precision claims without verification. Replaced with mpmath-based validation (verifiable)

Case Study 2: Wine Quality — Corrected with Real UCI Data

The Wine Quality case study was re-run on the real dataset from the UCI Machine Learning Repository, which is now downloaded by the validation script and included in the replication package. All reported values were corrected:

Before (Unverified)

  • p-value: 1.5e-152
  • Spearman rho: 0.444
  • Quality scale: 1-10

After (Verified with Real Data)

  • p-value: 2.83e-91
  • Spearman rho: 0.479
  • Quality scale: 3-9
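
Spearman's rho is the Pearson correlation computed on ranks, with tied values sharing the mean of their ranks. Ties matter for data like this because quality scores are integers. The following is a minimal standard-library sketch of that definition, not the project's validation script; the function names and the toy data are illustrative assumptions, and the actual Wine Quality columns are not reproduced here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): a predictor and an integer score with ties.
predictor = [9.4, 9.8, 10.5, 11.2, 12.0, 12.8]
score = [5, 5, 6, 6, 7, 6]
print(round(spearman_rho(predictor, score), 3))
```

Working on ranks makes the statistic insensitive to any monotonic rescaling of either variable, which is why it suits an ordinal quality score.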

Case Study 3: Meta-Analysis — Reproducibility Fix

A fixed random seed (seed=561) was added to the simulation. The generated data now matches the stated results exactly, and the simulation can be independently reproduced.
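
A fixed seed makes a simulated dataset repeatable: the same seed always yields the same studies, so every downstream estimate can be regenerated exactly. The sketch below illustrates this with a small simulation feeding a DerSimonian-Laird random-effects estimate. Only seed=561 comes from this report; the simulation design (number of studies, effect size, standard-error range) and all function names are assumptions for illustration, not the manuscript's actual setup.

```python
import math
import random


def simulate_studies(k, true_effect, tau, seed):
    """Return k (effect, variance) pairs drawn from a seeded generator."""
    rng = random.Random(seed)  # local generator, independent of global state
    studies = []
    for _ in range(k):
        se = rng.uniform(0.1, 0.4)           # assumed standard-error range
        theta = rng.gauss(true_effect, tau)  # study-specific true effect
        studies.append((rng.gauss(theta, se), se ** 2))
    return studies


def dersimonian_laird(studies):
    """Random-effects pooled estimate, its standard error, and tau^2."""
    effects = [y for y, _ in studies]
    weights = [1.0 / v for _, v in studies]  # inverse-variance weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for _, v in studies]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


first = dersimonian_laird(simulate_studies(10, 0.5, 0.2, seed=561))
second = dersimonian_laird(simulate_studies(10, 0.5, 0.2, seed=561))
print(first == second)  # the two runs are identical because the seed is fixed
```

Using a local `random.Random` instance rather than the module-level functions keeps the simulation reproducible even if other code draws random numbers in between.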

All 29 Claims Verified

  • T-test: Matches SciPy to 15+ decimal places
  • ANOVA: Matches SciPy to 14+ decimal places
  • Correlation: Matches to 16+ decimal places
  • Meta-analysis: Exact match with DerSimonian-Laird
  • Power analysis: Within 1% of G*Power
  • All 29 bibliography citations verified
  • 8 Guardian validators confirmed in codebase
  • 58 interactive lessons counted in codebase
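
One way to quantify a statement such as "15+ decimal agreement" is to recompute the statistic at higher precision and count how many decimal places the ordinary float result shares with that reference. The sketch below does this for a pooled-variance two-sample t statistic, using Python's standard decimal module at 50 significant digits as a stand-in for the SciPy and mpmath comparisons described in this report. The function names and sample values are illustrative assumptions.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # 50 significant digits for the reference computation


def t_stat_float(a, b):
    """Pooled-variance two-sample t statistic in ordinary float arithmetic."""
    na, nb = len(a), len(b)
    ma, mb = math.fsum(a) / na, math.fsum(b) / nb
    ssa = math.fsum((x - ma) ** 2 for x in a)
    ssb = math.fsum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))


def t_stat_decimal(a, b):
    """Same statistic in 50-digit decimal arithmetic (reference value)."""
    # Decimal(float) converts exactly, so both versions see identical inputs.
    a = [Decimal(x) for x in a]
    b = [Decimal(x) for x in b]
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)
    return (ma - mb) / (sp2 * (Decimal(1) / na + Decimal(1) / nb)).sqrt()


def agreeing_decimals(value, reference):
    """Number of decimal places on which a float agrees with the reference."""
    diff = abs(Decimal(value) - reference)
    if diff == 0:
        return getcontext().prec
    return max(0, math.floor(-diff.log10()))


group_a = [5.1, 4.9, 4.7, 4.6, 5.0]  # hypothetical sample values
group_b = [7.0, 6.4, 6.9, 5.5, 6.5]
t_float = t_stat_float(group_a, group_b)
t_ref = t_stat_decimal(group_a, group_b)
print(t_float, agreeing_decimals(t_float, t_ref))
```

Because a double carries roughly 15 to 17 significant digits, agreement at this level means the two implementations differ only by floating-point rounding.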

New Verification & Replication Scripts

Script                            | Lines | Purpose
MASTER_VERIFICATION.py            | 120   | Master script that runs all verification tests
validate_wine_quality_REAL.py     | 254   | Downloads and validates against real UCI Wine Quality dataset
validate_against_R.R              | 264   | R cross-validation script for independent verification
verify_case_studies_FINAL.py      | 225   | Comprehensive verification of all case study claims
additional_real_data_analysis.py  | ~200  | Additional real datasets: mtcars, ToothGrowth, PlantGrowth
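
A master verification script typically collects independent checks, runs each one, and reports which passed. The following is a minimal sketch of that pattern. It is not the contents of MASTER_VERIFICATION.py; the check names and bodies are placeholders chosen for illustration.

```python
from typing import Callable, List, Tuple

Check = Tuple[str, Callable[[], None]]


def run_checks(checks: List[Check]) -> List[Tuple[str, bool, str]]:
    """Run every check; a check signals failure by raising AssertionError."""
    results = []
    for name, check in checks:
        try:
            check()
            results.append((name, True, ""))
        except AssertionError as exc:
            # Keep going so one failure does not hide the others.
            results.append((name, False, str(exc)))
    return results


def summarize(results: List[Tuple[str, bool, str]]) -> str:
    """One-line summary of how many checks passed."""
    passed = sum(1 for _, ok, _ in results if ok)
    return f"{passed}/{len(results)} checks passed"


def check_mean() -> None:
    # Placeholder: recompute a value and compare it with the stated one.
    assert sum([2, 4, 6]) / 3 == 4.0, "mean mismatch"


def check_known_failure() -> None:
    # Deliberately failing placeholder, to show how failures are reported.
    assert 1 + 1 == 3, "arithmetic mismatch"


results = run_checks([("mean", check_mean), ("broken", check_known_failure)])
for name, ok, message in results:
    print(f"{'PASS' if ok else 'FAIL'}  {name}  {message}")
print(summarize(results))
```

Collecting results instead of stopping at the first failure gives a complete picture of which claims still need attention.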

Test Suite Coverage: 93 Automated Tests

  • Backend: 38 tests (Guardian integration + middleware)
  • Frontend: 55 tests (Guardian components + hooks)
  • New test files created: test_guardian_integration.py (483 lines), test_guardian_middleware.py (263 lines), GuardianComponents.test.jsx (373 lines), useGuardianReport.test.js (363 lines)

Guardian UI Components Added

New React components were built to surface Guardian validation results directly in the analysis UI:

Component                    | Lines | Purpose
ConfidenceGauge.jsx          | 162   | Visual gauge showing Guardian confidence score
GuardianBadge.jsx            | 115   | Status badge (Pass/Warning/Fail) for assumption checks
GuardianReportDisplay.jsx    | 366   | Full Guardian report with violation details
ViolationCard.jsx            | 208   | Individual violation detail card with remediation
useGuardianReport.js (hook)  | 190   | React hook for fetching and managing Guardian state

All 5 analysis modules (T-Test, ANOVA, Correlation/Regression, Hypothesis Testing, Non-Parametric Tests) were updated to integrate Guardian reporting.

Final Manuscript Structure (41 Pages)

Paper Sections

  1. Introduction — Reproducibility crisis context, StickForStats contribution
  2. Related Work — Statistical software comparison (JASP, jamovi, SPSS, R)
  3. System Architecture — Django + React + Guardian pipeline
  4. The Guardian System — 8 validators (normality, variance homogeneity, independence, linearity, outliers, sample size, multicollinearity, publication bias)
  5. AI Statistical Advisor — StickAI, Methods Generator
  6. Paper Parser — Manuscript analysis for reporting errors
  7. Code Examples — Comprehensive API usage
  8. High-Precision Computing — 50-decimal precision with mpmath
  9. Validation and Testing — 93 automated tests
  10. Case Studies — Real data: Iris, Wine Quality, meta-analysis, mtcars, ToothGrowth, PlantGrowth
  11. Discussion — Limitations, future work
  12. Conclusion
  13. References — 29 verified citations

Key Contribution: The Design Contract

StickForStats enforces a "Design Contract": no statistical result is presented without Guardian context. This is the core novelty: assumption validation is mandatory and automatic, and it can be bypassed only by explicitly enabling Expert Mode.
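
The contract can be pictured as a single gate that every result must pass through before display. The sketch below is a hypothetical illustration of that idea in plain Python. The class and function names (`GuardianReport`, `StatResult`, `present`, `DesignContractError`) are assumptions made for this example and are not taken from the StickForStats codebase.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class DesignContractError(Exception):
    """Raised when a result would be shown without Guardian context."""


@dataclass(frozen=True)
class GuardianReport:
    """Hypothetical report: the names of any violated assumptions."""
    violations: Tuple[str, ...] = ()

    @property
    def passed(self) -> bool:
        return not self.violations


@dataclass(frozen=True)
class StatResult:
    statistic: float
    p_value: float


def present(result: StatResult,
            report: Optional[GuardianReport] = None,
            expert_mode: bool = False) -> dict:
    """Return a displayable payload, refusing results that lack a report."""
    if report is None and not expert_mode:
        raise DesignContractError(
            "A result cannot be presented without a Guardian report "
            "unless Expert Mode is explicitly enabled."
        )
    return {"result": result, "guardian": report, "expert_mode": expert_mode}


result = StatResult(statistic=2.1, p_value=0.04)
payload = present(result, GuardianReport(violations=("normality",)))
print(payload["guardian"].passed)
```

Placing the check in the only function that produces displayable output means a caller cannot forget it; the override has to be requested by name.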

Submission Packages Prepared

JSS Package

  • Manuscript PDF & LaTeX source (1,639 lines)
  • Cover letter (COVER_LETTER_JSS.pdf)
  • Bibliography (29 references)
  • JSS style files (jss.cls, jss.bst)
  • Complete replication package
  • Submission checklist

arXiv Package

  • Identical source files to JSS
  • Primary category: stat.CO
  • Cross-list: cs.SE
  • Complete abstract and metadata
  • Figures directory with diagrams

Author Information

  • Vishal Bharti (First & Corresponding Author) — CSIR-IGIB — ORCID: 0009-0003-1431-4457
  • Debojyoti Chakraborty (Corresponding Author) — CSIR-IGIB & AcSIR — ORCID: 0000-0003-1460-7594

Limitations Disclosed in Paper

The manuscript transparently acknowledges the following limitations:

  1. No user study validating Guardian effectiveness
  2. Threshold values (0.7, 0.05) are conventional, not proven optimal
  3. Only 8 validators implemented, not exhaustive
  4. Does not cover machine learning assumptions
  5. Meta-analysis data simulated (not from real published studies), but reproducible with seed
  6. User can override Guardian warnings in Expert Mode

Key Commits This Week

Commit   | Date   | Description                                                    | Impact
b54a953  | Jan 27 | Scientific integrity audit - verified data and R validation    | +13,176 / -352 lines across 31 files
c1c7c29  | Jan 27 | Complete submission package with cover letters and author info | +715 / -119 lines across 7 files
0aca9b6  | Jan 27 | Add complete arXiv submission package                          | Full arXiv-ready package
0cc7b0b  | Jan 27 | Add cover letter in Markdown format for easy copying           | COVER_LETTER_JSS.md
4d6132a  | Jan 27 | Standardize GitHub URLs to stickforstats_new                   | URL consistency fixes

Technology Stack

  • LaTeX (JSS class)
  • Python (verification)
  • R (cross-validation)
  • React 18
  • Django REST
  • SciPy
  • mpmath
  • Material-UI 5
  • Jest + Testing Library

Next Steps

Immediate Priorities

  1. PI Review: Final review by Dr. Chakraborty before submission
  2. Submit to arXiv: Upload preprint for public visibility and timestamp
  3. Submit to JSS: Submit via the JSS editorial system with the cover letter

Post-Submission

  • Monitor arXiv listing and update if needed
  • Prepare responses to potential reviewer feedback
  • Continue development on v2 features (expanded Guardian validators)
  • Consider supplementary materials (video demo, extended documentation)

Week Summary

Session Metrics

  • Manuscript: 41 pages, 1,639 lines LaTeX, 29 references
  • Integrity Fixes: 3 unverified claims removed, 3 case studies re-validated
  • New Code: ~13,900 lines added (tests, verification scripts, Guardian UI)
  • Test Coverage: 93 automated tests (38 backend + 55 frontend)
  • Replication Scripts: 5 new verification scripts (~1,063 lines)
  • Guardian Components: 4 new React components and 1 hook (1,041 lines)
  • Real Datasets Used: Fisher's Iris, UCI Wine Quality, mtcars, ToothGrowth, PlantGrowth
  • Files Modified: 31 files in integrity audit commit alone

Key Achievements

  • Complete scientific integrity audit — every claim independently verified
  • Wine Quality case study corrected with real UCI data
  • R cross-validation script created (264 lines) for independent verification
  • 93-test suite documented and all passing
  • Both JSS and arXiv submission packages complete and ready
  • Cover letter prepared for JSS editors
  • ORCID identifiers added for both authors
  • Guardian UI components built for frontend integration