StickForStats

Python, Streamlit, Statistics, Data Visualization, Education

Project Overview

StickForStats is a modular statistical toolkit that helps researchers with minimal statistical background apply statistical inference correctly. Built with Python and Streamlit, the platform provides an interface for running statistical analyses with guidance on test selection, assumption verification, and result interpretation.

The project originated from a need identified during laboratory collaborations at CSIR-IGIB, where many researchers struggled with choosing appropriate statistical methods for their experiments. StickForStats makes advanced statistical approaches more accessible while maintaining mathematical rigor.

Intelligent Test Selection

Recommends appropriate statistical tests based on data characteristics and research questions

Assumption Verification

Automatically checks test assumptions and suggests alternatives when assumptions are violated

Interactive Visualizations

Generates publication-ready visualizations with customizable options

Educational Component

Explains statistical concepts and test selection logic for better understanding
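
To make the test selection and assumption verification features above concrete, here is a minimal sketch of one possible decision rule for comparing two groups. It is not the platform's actual logic: the function name `compare_two_groups`, the 0.05 threshold, and the choice of Shapiro-Wilk and Levene checks are assumptions made for illustration, using SciPy (listed in the technical stack).

```python
import numpy as np
from scipy import stats


def compare_two_groups(a, b, alpha=0.05):
    """Pick a two-sample test after checking normality and equal variance.

    Hypothetical helper for illustration; not part of StickForStats.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Shapiro-Wilk: a small p-value is evidence against normality.
    normal = all(stats.shapiro(group).pvalue > alpha for group in (a, b))

    if not normal:
        # Rank-based alternative that does not assume normality.
        result = stats.mannwhitneyu(a, b, alternative="two-sided")
        return {"test": "Mann-Whitney U", "p_value": float(result.pvalue)}

    # Levene's test: a small p-value is evidence of unequal variances.
    equal_var = bool(stats.levene(a, b).pvalue > alpha)
    result = stats.ttest_ind(a, b, equal_var=equal_var)
    name = "Student's t-test" if equal_var else "Welch's t-test"
    return {"test": name, "p_value": float(result.pvalue)}
```

The returned dictionary names the test that was chosen, so an interface could show the user why a particular test was recommended alongside its p-value.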

Core Modules

Confidence Intervals Explorer

Interactive tutorial and visualization tool for understanding confidence intervals through simulations and practical applications. Helps researchers understand the nuances of statistical inference through direct manipulation of parameters.
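
The kind of simulation this module describes can be sketched with the standard library alone. The example below estimates how often a z-interval for the mean (known standard deviation) contains the true mean. The function name `ci_coverage` and all default parameter values are assumptions chosen for illustration, not the module's actual code.

```python
import random
from statistics import NormalDist, fmean


def ci_coverage(mu=10.0, sigma=2.0, n=30, level=0.95, n_sims=2000, seed=1):
    """Fraction of simulated z-intervals (known sigma) that contain mu."""
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(n_sims):
        sample_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval sample_mean +/- half_width covers mu in this case.
        if abs(sample_mean - mu) <= half_width:
            hits += 1
    return hits / n_sims
```

Calling `ci_coverage(level=0.95)` should return a value close to 0.95, which illustrates that the confidence level describes the long-run behavior of the procedure rather than any single interval.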

Statistical Quality Control

Comprehensive tools for monitoring and analyzing process quality, including control charts, process capability analysis, and measurement systems analysis. Particularly useful for biotechnology lab quality control.
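
As an illustration of the control chart calculations mentioned above, the sketch below computes limits for an individuals (I) chart from moving ranges. The function names are hypothetical, and the choice of an individuals chart is an assumption; the module may use other chart types.

```python
from statistics import fmean


def individuals_chart_limits(values):
    """Lower limit, center line, and upper limit for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = fmean(values)
    # d2 = 1.128 for moving ranges of span 2 converts the mean range to sigma.
    sigma_hat = fmean(moving_ranges) / 1.128
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


def out_of_control(values):
    """Indices of measurements that fall outside the 3-sigma limits."""
    lcl, _, ucl = individuals_chart_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]
```

For a series of repeated lab measurements, `out_of_control` returns the positions that would be flagged on the chart for follow-up.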

Principal Component Analysis

Interactive PCA tool with detailed visualizations, step-by-step guides, and interpretation guidance. Features biplots and scree plots that users can manipulate directly.
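
The quantities behind a scree plot and a biplot can be sketched in a few lines. The technical stack lists scikit-learn for PCA; the example below instead uses a plain NumPy singular value decomposition so the computation is visible. The function name `pca_summary` is hypothetical.

```python
import numpy as np


def pca_summary(X, n_components=2):
    """Return scores, loadings, and explained-variance ratios via SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)  # bar heights for a scree plot
    loadings = Vt[:n_components].T  # feature directions for a biplot
    scores = Xc @ loadings  # sample coordinates for a biplot
    return scores, loadings, explained
```

The `explained` array is sorted from largest to smallest, so plotting it in order gives the scree plot, while `scores` and `loadings` supply the points and arrows of a biplot.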

Probability Distributions

Educational tool for exploring statistical distributions and their properties with interactive visualizations and practical examples from biological sciences.
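
A small example of the sort of distribution exploration described above: the binomial probability mass function applied to a made-up genetics scenario. The scenario (8 offspring, each showing a recessive phenotype with probability 0.25) is a hypothetical illustration, not an example taken from the module.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: 8 offspring, each recessive with probability 0.25.
n_offspring, p_recessive = 8, 0.25
distribution = [
    binomial_pmf(k, n_offspring, p_recessive) for k in range(n_offspring + 1)
]
# The mean of a binomial distribution equals n * p.
expected_count = sum(k * pk for k, pk in enumerate(distribution))
```

Plotting `distribution` against the counts 0 through 8 gives the bar chart a learner would see when adjusting `n` and `p` interactively.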

Development Journey

April 2024 - Present

Platform Development

Architectural refactoring from individual Streamlit modules into an integrated Flask-based web application with a unified project structure, API endpoints, and an authentication system.

October 2023 - March 2024

Module Development

Creation of individual statistical modules (Confidence Intervals, PCA, Statistical Quality Control, Probability Distributions) with standardized interfaces and educational components. Focus on intuitive design and sound statistical guidance.

June 2023 - September 2023

Research & Prototyping

Initial research on statistical education needs among researchers, technology selection, and prototyping of core concepts. Surveyed researchers at CSIR-IGIB to identify key statistical pain points.

RAG System Implementation

The latest addition to StickForStats is a Retrieval Augmented Generation (RAG) system for contextual AI assistance in statistical analysis. This system provides intelligent guidance tailored to each user's specific analysis context.

Vector Store

Vector store for similarity search over SentenceTransformers text embeddings, optimized for statistical terminology and concepts.

Completion: 100%
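
The retrieval step of a vector store can be illustrated without any external libraries. In the sketch below, short hand-written vectors stand in for real SentenceTransformers embeddings, and a linear scan stands in for a FAISS index; both substitutions, along with the function names, are assumptions made to keep the example self-contained.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query, items, k=2):
    """Rank (label, vector) pairs by cosine similarity to the query vector."""
    q = normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(vec))), label)
        for label, vec in items
    ]
    return [label for _, label in sorted(scored, reverse=True)[:k]]


# Toy 3-dimensional "embeddings"; real ones would have hundreds of dimensions.
toy_items = [
    ("t-test", [0.9, 0.1, 0.0]),
    ("ANOVA", [0.7, 0.3, 0.1]),
    ("PCA", [0.0, 0.2, 0.9]),
]
```

With these toy vectors, a query pointing along the first axis ranks the two hypothesis tests ahead of PCA, which mirrors how a question about comparing group means would surface related knowledge items.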

Knowledge Base

Knowledge items covering the platform's statistical domains, organized by module-component relationships for contextual suggestions.

Completion: 90%
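
One way a JSON knowledge base with module-component relationships could be laid out is sketched below. The module keys, component keys, and item titles are all hypothetical; the actual schema is not described in this document.

```python
import json

# Hypothetical layout: module -> component -> list of knowledge item titles.
KNOWLEDGE_BASE_JSON = """
{
  "pca": {
    "scree_plot": ["How to read a scree plot", "Choosing the number of components"],
    "biplot": ["Interpreting loadings in a biplot"]
  },
  "sqc": {
    "control_chart": ["Reading control limits"]
  }
}
"""


def items_for(kb, module, component=None):
    """Return items for one component, or for the whole module if none is given."""
    components = kb.get(module, {})
    if component is not None:
        return list(components.get(component, []))
    return [item for items in components.values() for item in items]


kb = json.loads(KNOWLEDGE_BASE_JSON)
```

A lookup such as `items_for(kb, "pca", "biplot")` narrows suggestions to the component the user is viewing, while omitting the component returns everything for the module.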

Context Tracker

System to monitor user activity and provide relevant assistance based on current module and actions, enabling intelligent content discovery.

Completion: 85%
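
A context tracker of this kind could be as simple as a small object that remembers the active module and the last few actions. The class below is a minimal sketch under that assumption; the class name, field names, and the rule of clearing history on a module switch are illustrative choices, not the platform's implementation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ContextTracker:
    """Remembers the active module and the most recent user actions."""

    max_actions: int = 5
    module: str = ""
    actions: deque = field(default_factory=deque)

    def record(self, module, action):
        if module != self.module:
            # Switching modules starts a fresh action history.
            self.module = module
            self.actions.clear()
        self.actions.append(action)
        # Keep only the most recent max_actions entries.
        while len(self.actions) > self.max_actions:
            self.actions.popleft()

    def context(self):
        """Snapshot that a retrieval step could use to pick suggestions."""
        return {"module": self.module, "recent_actions": list(self.actions)}
```

The snapshot returned by `context()` is the kind of input a retrieval step would combine with the user's question to select relevant knowledge items.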

Subscription Model

Tiered access approach (Basic, Premium, Enterprise) with secure API key management for premium features.

Completion: 95%
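
The sketch below shows one common pattern for API key handling with tiered access: give the user the plaintext key once, store only a hash, and compare in constant time. The tier names come from the description above; the feature names and function names are hypothetical, and this is not a description of the platform's actual key management.

```python
import hashlib
import hmac
import secrets

# Tier names are from the subscription model; feature names are hypothetical.
TIER_FEATURES = {
    "basic": {"core_modules"},
    "premium": {"core_modules", "rag_assistant"},
    "enterprise": {"core_modules", "rag_assistant", "team_workspaces"},
}


def issue_api_key():
    """Return (plaintext key for the user, SHA-256 digest to store)."""
    key = secrets.token_urlsafe(32)
    return key, hashlib.sha256(key.encode()).hexdigest()


def verify_api_key(presented, stored_digest):
    """Check a presented key against the stored digest in constant time."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(digest, stored_digest)


def can_use(tier, feature):
    """True if the given tier includes the feature; unknown tiers get nothing."""
    return feature in TIER_FEATURES.get(tier, set())
```

Storing only the digest means a leaked database does not expose usable keys, and the tier lookup keeps feature gating in one place.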

Technical Details

Backend

  • Python 3.9+ - Core language for statistical implementations
  • Flask - Web framework for integrated platform
  • SciPy & NumPy - Scientific computing and array operations
  • Pandas - Data manipulation and analysis
  • scikit-learn - Machine learning algorithms for PCA and clustering
  • statsmodels - Advanced statistical modeling

Frontend

  • Streamlit - Initial UI framework for individual modules
  • JavaScript - Enhanced interactive visualizations in integrated platform
  • Plotly - Interactive data visualization
  • Bootstrap - Responsive design components
  • Flask Templates - Server-side rendering

RAG System

  • SentenceTransformers - Text embeddings for similarity search
  • FAISS - Vector similarity search
  • Custom Knowledge Base - JSON-based hierarchical structure
  • Session Management - Flask sessions for context persistence

Deployment

  • Streamlit Cloud - Initial module deployment
  • Docker - Containerization for integrated platform
  • RESTful API - Stateless service architecture
  • Environment Variables - Configuration management
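
Configuration through environment variables, as listed under Deployment, might look like the following minimal sketch. The variable names (`STICKFORSTATS_SECRET_KEY`, `STICKFORSTATS_DEBUG`, `STICKFORSTATS_PORT`) and the default values are assumptions for illustration only.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    secret_key: str
    debug: bool
    port: int


def load_settings(env=os.environ):
    """Read settings from a mapping of environment variables.

    Variable names are hypothetical. Fails fast if the secret key is missing.
    """
    secret_key = env.get("STICKFORSTATS_SECRET_KEY")
    if not secret_key:
        raise RuntimeError("STICKFORSTATS_SECRET_KEY is not set")
    return Settings(
        secret_key=secret_key,
        debug=env.get("STICKFORSTATS_DEBUG", "0") == "1",
        port=int(env.get("STICKFORSTATS_PORT", "5000")),
    )
```

Passing the mapping as a parameter keeps the loader easy to test, and the same container image can then run in different environments by changing only the variables supplied to it.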

Future Development

StickForStats is under active development. Planned work includes:

Short-term Goals (May-June 2025)

  • Expand RAG knowledge base with specialized content
  • Develop interactive visualizations for complex concepts
  • Create biotech-specific case studies and examples
  • Implement adaptive learning pathways

Medium-term Goals (June-July 2025)

  • Build community platform for sharing analyses
  • Develop educational partnerships with institutions
  • Create plugin system for community contributions
  • Integrate with popular data science tools

Long-term Vision (August 2025+)

  • Scale platform for enterprise use
  • Launch certified training program
  • Develop industry-specific modules
  • Create comprehensive documentation system

Interested in StickForStats?

Whether you're interested in using StickForStats for your research, collaborating on its development, or have questions about the platform, I'd love to hear from you.